Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support dynamic class loading #2442

Merged
merged 1 commit into from
Jun 8, 2021
Merged

Conversation

ziyilin
Copy link
Collaborator

@ziyilin ziyilin commented May 8, 2020

Overview
This patch supports java.lang.ClassLoader.defineClass(String name, byte[] b, int off, int len) based dynamic class loading in native image. It is separated from the previous PR (#2323).
The basic idea for supporting dynamic class loading is dumping classes generated at run time in a beforehand run with native-image-agent to a specified directory which is added to the classpath for building native image.
This is implemented in 3 steps:

  1. Intercept java.lang.ClassLoader.postDefineClass by native-image-agent to dump the dynamically generated class to file system.
  2. User should make sure the generated classes' names are fixed through Hotspot version runtime and native-image version runtime.
  3. Substitute relevant classes in SubstrateVM to load the previously dumped classes at native image runtime.

Features:

  • Add new agent option -agentlib:native-image-agent=dynamic-class-dump-dir= to specify where to dump the dynamically generated classes, the default value is "dynClass" if not specified.
  • Both dynamically generated class and its stack trace are dumped.
  • Dynamically generated class's name could be decided at runtime (e.g. runtime sequence number as post-fix) or null when java.lang.ClassLoader.defineClass is invoked. It is the user's job to make sure the dynamic generated class' name is constant through Agent runtime and native image runtime. The patch has supported the null name scenario.
  • Different classloaders can dynamically define same name class at runtime. SubstrateVM doesn't support multiple classloaders yet. This patch will report error at Agent runtime if same name classes are defined by multiple classloaders.
  • All unsupported dynamic class loading usages shall be reported at the end of Agent run.

Limitations:

  • This patch does not support defineClass with specified ProtectionDomain.

  • It is possible the jar on classpath has a different signature file from dynamically generated class and fail the check in java.lang.ClassLoader.checkCerts(String name, CodeSource cs) at native-image build time. Current solution is to manually delete jar's signature file before building. This limitation can be shown by test: testWithSignedJar.zip

  • It is possible the dynamically generated class contents are changing from run to run. We make a class contents check at runtime based on the byte array's hash code to assure the runtime defined class is the same as the previously prepared one. But this check excludes SourceFile attribute because it often varies from runs but doesn't affect runtime behavior. This limitation can be shown by test: testChangedClass.zip

Tests:
This patch can be tested by
testDynamicLoading.zip

@ziyilin ziyilin force-pushed the dynamicClassLoading branch 8 times, most recently from 61af582 to 397df00 Compare May 15, 2020 06:07
@ziyilin ziyilin force-pushed the dynamicClassLoading branch from 397df00 to 1b2203d Compare May 18, 2020 12:31
@ziyilin ziyilin force-pushed the dynamicClassLoading branch 7 times, most recently from 40eeed9 to c9b96ff Compare June 17, 2020 11:37
@ziyilin ziyilin force-pushed the dynamicClassLoading branch 4 times, most recently from 8c088b7 to 5371bc8 Compare June 22, 2020 06:47
@peter-hofer
Copy link
Member

Thank you for your contribution @ziyilin ! Before I go into details, I'd like to discuss the overall design (and I'd also appreciate input from @christianwimmer).

The current design, as I understand it, dumps class files for defined classes directly and immediately to a separate directory, bypassing the mechanism for writing trace events and generating configuration files. I understand the tracing approach involves some additional overhead, but it would provide better opportunities for tracking dynamically loaded classes and their metadata (like the hash code) and loading them during the image build. The current approach requires manual changes to the class path and uses a fake Class.forName reflection event to carry metadata and trigger loading the class.
What I'm considering is to write a trace event containing the Base64-encoded class data along with any metadata. From this, trace processing can write a JSON file with the class metadata, including its fully-qualified name, which might not be supported as a file name on each platform. This file can be written to the configuration directory so that the configuration is self-contained and users don't have to maintain a separate directory. Considering the class file data, I see three options:

  1. Store raw .class files in a subdirectory of the configuration directory and use mangled class names or generated identifiers for file names. The class files are referenced from the (single) JSON file.
  2. Store a single JSON file for each class in a subdirectory, containing its Base64-encoded class data and metadata, using a mangled class name. This would make it easier to delete individual undesirable classes (and avoid potential issues with binary files in version control, across platforms, etc.).
  3. Store all metadata and Base64-encoded class data of dynamically loaded classes in a single JSON file. This would be in line what we currently do for other purposes, but probably be impractically large.
    Options 1 and 2 have the advantage that the class data can be immediately written after trace processing rather than staying resident in memory until the application finishes.
    The approach in general would keep all information in a single trace stream, and all processed information in a single configuration directory, simplifying the build and retaining the current workflows of users.

@ziyilin
Copy link
Collaborator Author

ziyilin commented Aug 20, 2020

The original design was for quick implementation to support the business requirement. I agree all things should be kept together in a single system. The option 1 seems good to me. In option 1, the json configuration file acts as an index, and the raw classes can be stored in a subdirectory independently. The configuration file is small so it's feasible for human reading and checking, the independently stored class files are also more flexible for future extension.

@vjovanov
Copy link
Member

Hi @ziyilin, I have made a rebase of your PR to test it out with our benchmarks. It works in all of the cases there. The link to the fixed-up version is here:

https://github.com/vjovanov/graal/tree/dynamicClassLoading

I am running the tests now and will keep you posted on the results.

Amazing work!

@ziyilin
Copy link
Collaborator Author

ziyilin commented Apr 7, 2021

@vjovanov Is there any review updates?

@peter-hofer
Copy link
Member

@ziyilin I'm looking into it.

if (isDynamicallyGenerated) {
if (!justAdded) {
unsupportedExceptions.add("Class " + definedClassName + " has been defined before. Multiple definitions are not supported.\n" +
ClassLoaderDefineClassSupport.getStackTrace(jni).toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not an issue that needs to be handled or reported in the agent, but should be reported during the image build or at image runtime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsupportedExceptions and reportExceptions are unused now.

Comment on lines 139 to 138
if (featureName.equals("dynamicClassLoading") && location instanceof URL) {
dynamicClassLoadingDirs.add((URL) location);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clean and the static field will not work when there is more than one image build in the process (server mode). You might need to add a method to collect the locations and return them so you can use them in the feature's afterRegistration method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the dynamicClassLoadingDirs. The original way is to fetch the dumped class file from all possible locations setting by -H:DynamicClassesConfigurationFiles= and -H:DynamicClassesConfigurationResources= , but the class file should be located in the dynClasses directory under the configuration file's directory. So we only need to follow the location of the currently parsed dynamicClass configuration file to look for the dumped class files.

@ziyilin ziyilin force-pushed the dynamicClassLoading branch 6 times, most recently from 132bac9 to 74f6442 Compare April 14, 2021 03:26
Support dynamic class loading for native image
@ziyilin ziyilin force-pushed the dynamicClassLoading branch from 74f6442 to a13778c Compare April 14, 2021 04:47
@ziyilin
Copy link
Collaborator Author

ziyilin commented Apr 15, 2021

@peter-hofer I have dealed with all comments, please take a look. Thanks.

@peter-hofer
Copy link
Member

@ziyilin , thanks, I will.

@peter-hofer
Copy link
Member

@ziyilin : looks good now, thanks again for your contribution. I'll take care of it from here 🙂

@ziyilin
Copy link
Collaborator Author

ziyilin commented Apr 19, 2021

@peter-hofer Great! Thank you for your reviewing.

@dzou
Copy link

dzou commented Aug 25, 2021

Is there any guidance on how to use this feature? Seems like the option was renamed from dynamic-class-dump-dir to something else?

Error log:

# Created at 2021-08-25T14:11:54.207
native-image-agent: unknown option: 'dynamic-class-dump-dir=target/tmp_classes'.

# Created at 2021-08-25T14:11:54.208
native-image-agent: Example usage: -agentlib:native-image-agent=config-output-dir=/path/to/config-dir/

@dthommes
Copy link

Is it possible, that the option now is named -dynamic-classes-output as seen in the code here (?):

a13778c#diff-573485114a4aca2da43e2c66f93f17a7621b9fe3601acb8ac3d2fb35c99c4456R200

I did not try it. Would be great to have some documentation :-)

@peter-hofer
Copy link
Member

peter-hofer commented Nov 3, 2021

@dzou @dthommes we renamed this feature predefined classes because there really is not much dynamic about this mechanism at image runtime. You can enable tracking in the agent with option experimental-class-define-support=true, then any classes not loaded by a built-in loader will be stored in the subdirectory agent-extracted-predefined-classes of the configuration directory and added to predefined-classes-config.json. You will likely need to manually pick those classes which should actually be predefined during the image build, and exclude others. There are other restrictions, a class cannot be initialized at image build time and at runtime, cannot be "loaded" more than once in the entire VM (isolate).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants