Bgd 5296 marge latest kubeflow spark operator #19

Closed

sigmarkarl commented Jun 5, 2024

Merge spark operator upstream master to keep up to date

Jira Ticket

BGD-5296

Checklist:

  • I have filled relevant self assessment (NodeJS, Frontend, Backend)
  • I have run ESLint on my changes and fixed all warnings and errors (NodeJS & Frontend Services)
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have validated all the requirements in the Jira task were answered
  • I have all necessary approvals for the design/mini design of this task
  • I have approved the API changes and granular permission patterns (documentation subtask) (For public services only)

antonipp and others added 30 commits March 17, 2023 08:55
Added new RBAC permissions needed by default for leader election for the coordination/v1 API.
Required after upgrade to golang:1.19.2.
In k8s.io/[email protected]/tools/leaderelection/resourcelock/interface.go:166 `configMapsResourceLock` was removed and should be replaced by `ConfigMapsLeasesResourceLock`.
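
For illustration, the kind of RBAC rule this commit describes might look like the sketch below. Lease objects live in the `coordination.k8s.io` API group. The Role name and namespace are assumptions made for the example, and the rule set actually shipped in the chart may differ.

```yaml
# Sketch only: the name and namespace are hypothetical, not taken from the chart.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: spark-operator-leader-election   # assumed name
  namespace: spark-operator              # assumed namespace
rules:
# A Lease-based lock reads, creates and updates a single Lease object.
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "create", "update"]
```

A Role like this only takes effect once a RoleBinding ties it to the operator's service account.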
* Added support for setting extra commonLabels

* Added support for podLabels on cleanup and init job

* Fixed templating errors

* Added documentation
Used to run Analytics Jobs and ETL pipelines along with AI/ML jobs.
* add sidecars for operator

* bumping chart version
This notes the controller intended to be used with the operator-managed ingress resources.

When setting up my own cluster I also tried https://docs.nginx.com/nginx-ingress-controller/ and https://kubernetes-sigs.github.io/aws-load-balancer-controller/ but the path format generated by the operator won't work with those.
* Add envFrom to operator deployment

Useful when env vars are used for auth when downloading `spark.archives` from S3.

* Fix over-indenting
Resolves kubeflow#1344

Spark 3.4 supports IPv6:
- apache/spark#36868

So I want to make the operator support IPv6.

I can confirm that this can submit a Spark job in an IPv6-only environment.

However, the following environment variables must be added to the operator:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-on-k8s-spark-operator
spec:
  template:
    spec:
      containers:
      - name: spark-operator
        env:
        - name: _JAVA_OPTIONS
          value: "-Djava.net.preferIPv6Addresses=true"
        - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
          value: "true"

```
…necessary for Kubernetes Pod Security Standards Restricted profile. (kubeflow#1768)

https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted
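
As a rough sketch of what the Restricted profile expects from a container, the fragment below sets the fields that profile checks. The commit title is truncated here, so this is not necessarily the exact set of fields the commit added; it only illustrates the profile linked above.

```yaml
# Container-level securityContext satisfying the Restricted profile.
# Illustrative only; the fields actually changed by the commit are not shown in this log.
securityContext:
  allowPrivilegeEscalation: false   # must be explicitly false
  runAsNonRoot: true                # container may not run as UID 0
  capabilities:
    drop: ["ALL"]                   # all capabilities dropped
  seccompProfile:
    type: RuntimeDefault            # RuntimeDefault or Localhost is required
```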

* Fixed pre-commit jobs.
  `build-helm-chart` and `integration-test` were failing with:
  `Run manusa/[email protected]
   Error: Unsupported OS, action only works in Ubuntu 18 or 20`
…lcano on OCP (kubeflow#1724)

* fix: fix issue kubeflow#1723 about spark-operator not working with volcano on OCP

Signed-off-by: disaster37 <[email protected]>

* Update volcano_scheduler.go

---------

Signed-off-by: disaster37 <[email protected]>
…and service name (kubeflow#1778)

* Installing operator using kustomize and custom namespace and service name

* update quick start guide with suggested changes.
sigmarkarl requested review from a team as code owners June 5, 2024 10:46
ChenYi015 and others added 23 commits June 5, 2024 14:39
…beflow#2043)

* feat: give an option to set the priority class for spark-operator pod
Signed-off-by: Praveen Gajulapalli <[email protected]>

* feat: bumped up helm chart version
Signed-off-by: Praveen Gajulapalli <[email protected]>

* fix: fixed issue with position of priorityClassName
Signed-off-by: Praveen Gajulapalli <[email protected]>
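
The "position of priorityClassName" fix above most likely refers to where the field sits in the manifest: in the Kubernetes API it belongs to the pod spec, not to an individual container. A minimal sketch, assuming a hypothetical PriorityClass named `spark-operator-critical` and an abbreviated Deployment:

```yaml
# Hypothetical PriorityClass; the name and value are assumptions for the example.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: spark-operator-critical
value: 1000000
globalDefault: false
description: "Priority for the Spark operator pod (example only)."
---
# Abbreviated Deployment: selector, labels and image are omitted for brevity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-operator
spec:
  template:
    spec:
      priorityClassName: spark-operator-critical   # pod-level field, sibling of containers
      containers:
      - name: spark-operator
```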
* feat: add support for setting objectSelector on webhook

Signed-off-by: Cian Gallagher <[email protected]>

* feat: update objectSelector to match expressions

Signed-off-by: Cian Gallagher <[email protected]>

* chore: use out of the box label parser

Signed-off-by: Cian Gallagher <[email protected]>

* chore: update chart version

Signed-off-by: Cian Gallagher <[email protected]>

* chore: update app version

Signed-off-by: Cian Gallagher <[email protected]>

* fix: use parseSelector

Signed-off-by: Cian Gallagher <[email protected]>

* ci: update minikube action to latest release

Signed-off-by: Cian Gallagher <[email protected]>

* revert: undo ci changes. create separate pr

Signed-off-by: Cian Gallagher <[email protected]>

* Trigger CI

Signed-off-by: Cian Gallagher <[email protected]>

* chore: update chart version & docs following previous merge

Signed-off-by: Cian Gallagher <[email protected]>

* docs: update docs

Signed-off-by: Cian Gallagher <[email protected]>

---------

Signed-off-by: Cian Gallagher <[email protected]>
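
To show what an `objectSelector` with match expressions does, here is a hedged sketch of a webhook configuration that only intercepts pods carrying a given label. The webhook and service names are hypothetical, and the `spark-role` label is used as an example selector; the selector the chart actually renders depends on the values supplied.

```yaml
# Sketch only: names and the service reference are assumptions, not the chart's output.
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: spark-operator-webhook            # hypothetical
webhooks:
- name: webhook.sparkoperator.example     # hypothetical
  admissionReviewVersions: ["v1"]
  sideEffects: None
  clientConfig:
    service:
      name: spark-operator-webhook        # hypothetical
      namespace: spark-operator           # hypothetical
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  # Only objects whose labels match are sent to the webhook.
  objectSelector:
    matchExpressions:
    - key: spark-role
      operator: In
      values: ["driver", "executor"]
```

Narrowing the selector keeps unrelated pods out of the webhook's path, so a webhook outage does not block their creation.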
* Add CNCF Code of Conduct

Signed-off-by: Yi Chen <[email protected]>

* Update contributing guide

Signed-off-by: Yi Chen <[email protected]>

* Redirect links to kubeflow website

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
* Update docs

Signed-off-by: Yi Chen <[email protected]>

* Remove docs and update README

Signed-off-by: Yi Chen <[email protected]>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>

* PR comments

Signed-off-by: Carlos Sánchez Páez <[email protected]>

---------

Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
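
For readers unfamiliar with the resource, a PodDisruptionBudget for the operator could look roughly like this. The resource name and label selector are assumptions for the example; the chart's template may use different labels and values.

```yaml
# Sketch only: the name and labels are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: spark-operator
spec:
  minAvailable: 1          # keep at least one operator pod during voluntary disruptions
  selector:
    matchLabels:
      app.kubernetes.io/name: spark-operator
```

With a single replica, `minAvailable: 1` blocks node drains, so a budget like this is mainly useful when the operator runs with more than one replica.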
* Add workflow for closing stale issues and PRs

Signed-off-by: Yi Chen <[email protected]>

* Add job permissions

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
…kubeflow#2046)

* Update .gitignore

Signed-off-by: Yi Chen <[email protected]>

* Update .dockerignore

Signed-off-by: Yi Chen <[email protected]>

* Update Makefile

Signed-off-by: Yi Chen <[email protected]>

* Update the process to generate api docs

Signed-off-by: Yi Chen <[email protected]>

* Update the workflow to generate api docs

Signed-off-by: Yi Chen <[email protected]>

* Use controller-gen to generate CRD and deep copy related methods

Signed-off-by: Yi Chen <[email protected]>

* Update helm chart CRDs

Signed-off-by: Yi Chen <[email protected]>

* Update workflow for building spark operator

Signed-off-by: Yi Chen <[email protected]>

* Update README.md

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
* Update README and documentation (kubeflow#2047)

* Update docs

Signed-off-by: Yi Chen <[email protected]>

* Remove docs and update README

Signed-off-by: Yi Chen <[email protected]>

* Add link to monthly community meeting

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
Signed-off-by: jbhalodia-slack <[email protected]>

* Add PodDisruptionBudget to chart (kubeflow#2078)

* Add PodDisruptionBudget to chart

Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>

* PR comments

Signed-off-by: Carlos Sánchez Páez <[email protected]>

---------

Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: jbhalodia-slack <[email protected]>

* Set topologySpreadConstraints

Signed-off-by: jbhalodia-slack <[email protected]>

* Update README and increase patch version

Signed-off-by: jbhalodia-slack <[email protected]>

* Revert replicaCount change

Signed-off-by: jbhalodia-slack <[email protected]>

* Update README after master merger

Signed-off-by: jbhalodia-slack <[email protected]>

* Update README

Signed-off-by: jbhalodia-slack <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
Signed-off-by: jbhalodia-slack <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Signed-off-by: Carlos Sánchez Páez <[email protected]>
Co-authored-by: Yi Chen <[email protected]>
Co-authored-by: Carlos Sánchez Páez <[email protected]>
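
The `topologySpreadConstraints` setting mentioned in the commit above maps to a standard pod spec field. A minimal sketch, assuming two replicas and a hypothetical label selector (the chart's actual values keys and labels are not shown in this log):

```yaml
# Abbreviated Deployment fragment; the labels and replica count are assumptions.
spec:
  replicas: 2
  template:
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                                  # allow at most one pod of difference between zones
        topologyKey: topology.kubernetes.io/zone    # spread across availability zones
        whenUnsatisfiable: ScheduleAnyway           # prefer spreading, but do not block scheduling
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: spark-operator
```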
* Update .helmignore

Signed-off-by: Yi Chen <[email protected]>

* Add release docs

Signed-off-by: Yi Chen <[email protected]>

* Update release workflow

Signed-off-by: Yi Chen <[email protected]>

* Update integration test workflow

Signed-off-by: Yi Chen <[email protected]>

* Add workflow for pushing tag when VERSION file changes

Signed-off-by: Yi Chen <[email protected]>

* Update

Signed-off-by: Yi Chen <[email protected]>

* Remove the leading 'v' from chart version

Signed-off-by: Yi Chen <[email protected]>

* Update docker image tags

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
* Use controller-runtime to reconstruct spark operator

Signed-off-by: Yi Chen <[email protected]>

* Update helm charts

Signed-off-by: Yi Chen <[email protected]>

* Update examples

Signed-off-by: Yi Chen <[email protected]>

---------

Signed-off-by: Yi Chen <[email protected]>
alextarasov-spot deleted the BGD-5296-marge-latest-kubeflow-spark-operator branch September 17, 2024 10:44