[SYSTEMDS-2926] AWS scripts update for EMR-7.0.0 #2003

Merged on Feb 25, 2024 (4 commits)

lachezar-n
Contributor

The changes fix some general issues:

  • creating and referencing the S3 buckets
  • launching the cluster without an assigned subnet (bad practice and a potential security vulnerability)

The changes also update the EMR version to the most recent release, emr-7.0.0:

  • updating the configurations
  • replacing Ganglia with AmazonCloudWatchAgent
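
As a rough illustration of the points above, a cluster launch with an explicit subnet, an S3 log location, and AmazonCloudWatchAgent might look like the sketch below. This is not the script from this PR: the subnet ID, bucket name, cluster name, and instance settings are hypothetical placeholders, and the function only prints the command (a dry run) rather than calling AWS.

```shell
#!/usr/bin/env bash
# Hypothetical placeholders; none of these values come from the PR.
RELEASE_LABEL="emr-7.0.0"
SUBNET_ID="subnet-0123456789abcdef0"   # placeholder subnet ID
LOG_BUCKET="my-systemds-logs"          # placeholder bucket name

# Print the create-cluster command instead of running it (dry run).
build_create_cluster_cmd() {
  printf '%s ' aws emr create-cluster \
    --name "systemds-cluster" \
    --release-label "$RELEASE_LABEL" \
    --applications Name=Spark Name=AmazonCloudWatchAgent \
    --ec2-attributes "SubnetId=$SUBNET_ID" \
    --log-uri "s3://$LOG_BUCKET/logs/" \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles
  printf '\n'
}

build_create_cluster_cmd
```

Printing the command first makes it easy to review the subnet and bucket references before anything is created.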

While testing the script against the current repo version, I observed the following bug: launching SystemDS in execution mode "spark" throws an IllegalCallerException.
Running the command spark-submit target/SystemDS.jar -f path/to/hello.dml -exec spark -stats -explain produces the following console output:

...
--MAIN PROGRAM
----GENERIC (lines 1-1) [recompile=false]
------CP print Hello World.SCALAR.STRING.true _Var0.SCALAR.STRING 8
------CP rmvar _Var0


An Error Occurred : 
   IllegalCallerException -- java.lang.ref is not open to unnamed module @4eba373c

@j143
Contributor

j143 commented Feb 25, 2024

Hi @lachezar-n, welcome to the SystemDS project ✨

@j143 j143 self-requested a review February 25, 2024 11:52
@j143 j143 changed the title AWS scripts update for emr-7.0.0 [SYSTEMDS-2926] AWS scripts update for emr-7.0.0 Feb 25, 2024
@j143 j143 changed the title [SYSTEMDS-2926] AWS scripts update for emr-7.0.0 [SYSTEMDS-2926] AWS scripts update for EMR-7.0.0 Feb 25, 2024
@j143 j143 merged commit eb29b2d into apache:main Feb 25, 2024
42 checks passed
@j143
Contributor

j143 commented Feb 25, 2024

LGTM. Thanks

@j143
Contributor

j143 commented Feb 25, 2024

Hi @lachezar-n, could you share more details about the error you are facing?

Please make sure to use Java 11 and Spark 3.x.

@j143 j143 removed their request for review February 25, 2024 13:14
@lachezar-n
Contributor Author

Hi @j143, thanks for the note about the execution. Indeed, I didn't know that Java 11 is required.
EMR uses Java 8 by default, so I changed the shell script to enforce Java 11 on the Amazon cluster. I will post the changes as a new pull request so that run_systemds_script.sh can execute successfully.
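
One way a launch script could guard against the wrong Java version before calling spark-submit is sketched below. This is not the content of run_systemds_script.sh or of the follow-up pull request; the function names java_major_version, detect_java_version, and require_java_11 are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
# Extract the Java major version from a version string such as
# "1.8.0_392" (-> 8), "11.0.21" (-> 11) or "17.0.9" (-> 17).
java_major_version() {
  local version="$1"
  case "$version" in
    1.*) version="${version#1.}" ;;   # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  echo "${version%%[._-]*}"           # keep the part before the first '.', '_' or '-'
}

# Read the quoted version string printed by the `java` found on PATH.
detect_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Return non-zero unless the given version string is Java 11.
require_java_11() {
  local major
  major="$(java_major_version "$1")"
  if [ "$major" != "11" ]; then
    echo "Expected Java 11 but found Java $major" >&2
    return 1
  fi
}
```

A script could then run require_java_11 "$(detect_java_version)" and stop early with a clear message instead of failing later inside the Spark job.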
