Native Image Committer Community Meeting 2022-04-07 #4472

christianwimmer · 2022-04-04T21:57:20Z

List of all past and upcoming meetings: #3933

New and Noteworthy

Compatibility improvements
[GR-37582] Default to running image-builder on module path. #4468
015b140 Remove the option to do the reference handling in a regular Java thread. Reference handling is now always in a separate thread, or must be manually invoked
[GR-37414] Add support for Unsafe.getLoadAverage. #4420
[GR-30347] [GR-20166] Remove OperatingSystemMXBean substitutions. #4383
[GR-34529] InheritableThreadLocal support for native image #4394
Tracing agent
[GR-37572] Introduce agent experimental options documentation. #4429
1e40cb7 Generate conditional configuration using the native-image-configure tool
[GR-37224] Support for Unsafe.allocateInstance in NI #4393
[GR-27173] Automatic conditional configuration generation. #4270
Image build time / image size
[GR-36815] Run reachability handlers concurrently during analysis. #4471 Can require code changes in existing code because callbacks that were previously serial are now run in parallel
54db64f Reduce size of Graal graphs that are alive for the whole AOT compilation stage
[GR-37411] Fix method inlining and quickbuild mode. #4391
[GR-36431] Use reflection metadata for all objects #4222
JFR
3fc931b Working on stack sampling events, this is the first PR with some infrastructure for it
JFR: emit a large event if fails to write as an small event #4469
[GR-37311] Implement JFR ExecuteVMOperation event. #4426

Conditional Configuration and Tracing Agent Improvements That Generate It

The purpose of conditional configuration is to reduce the footprint of the generated image.
Without conditional configuration, the configuration for a large class path pulls unnecessary elements into the image.

Conditional Configuration

Each configuration entry, regardless of the configuration type, can contain a special condition property.
During the image build, such an entry is only processed if the condition evaluates to true.
Currently, the only supported condition is typeReachable, which asks "is class X reachable in the image?".

Let's look at the following example reflect-config.json:

[
{
    "condition": {
        "typeReachable":"com.acme.MyLogger"
    },
    "name":"org.labx.ShreddingLogger",
    "allDeclaredConstructors": true
},
{
    "condition": {
        "typeReachable":"com.acme.MyLogger"
    },
    "name":"org.labx.FileLogger",
    "allDeclaredConstructors": true
}
]

The two configuration entries include logger implementations only if our facade class, com.acme.MyLogger, is included in the image.
Such configuration is especially useful for libraries with a large number of third-party dependencies that are only used for certain features.
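
To make the semantics concrete, here is a minimal sketch of how a typeReachable condition could be evaluated. It is a toy model rather than Native Image code: ConfigEntry, ConditionalConfigDemo and activeEntries are hypothetical names, and the set of reachable types is simply passed in.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model of one reflect-config.json entry: the class to register
// and the type named in its "typeReachable" condition (null = no condition).
final class ConfigEntry {
    final String name;
    final String typeReachable;

    ConfigEntry(String name, String typeReachable) {
        this.name = name;
        this.typeReachable = typeReachable;
    }
}

final class ConditionalConfigDemo {
    // Returns the names of the entries whose condition holds for the given reachable types.
    static List<String> activeEntries(List<ConfigEntry> entries, Set<String> reachableTypes) {
        return entries.stream()
                .filter(e -> e.typeReachable == null || reachableTypes.contains(e.typeReachable))
                .map(e -> e.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<ConfigEntry> config = List.of(
                new ConfigEntry("org.labx.ShreddingLogger", "com.acme.MyLogger"),
                new ConfigEntry("org.labx.FileLogger", "com.acme.MyLogger"));
        // Facade reachable: both loggers are processed.
        System.out.println(activeEntries(config, Set.of("com.acme.MyLogger")));
        // Facade not reachable: neither logger is processed.
        System.out.println(activeEntries(config, Set.of("com.acme.Other")));
    }
}
```

In the real build the reachable set is not known up front; it grows during analysis, which is why the next section relies on callbacks.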

Conditional Configuration Under the Hood

Native Image provides a handy mechanism called reachability handlers.
A reachability handler is a callback function you register through the Feature API along with reflection elements (classes, fields, methods, constructors).
If any of the elements become reachable, your callback is invoked.

Consider the following example:

import org.graalvm.nativeimage.hosted.Feature;

public class MagicFeature implements Feature {

    @Override
    public void beforeAnalysis(BeforeAnalysisAccess access) {
        access.registerReachabilityHandler(MagicFeature::callback, access.findClassByName("com.acme.MyLogger"));
    }

    private static void callback(DuringAnalysisAccess access) {
        System.out.println("Logging is enabled and ready for liftoff!");
    }

}

Our callback function will only be invoked if com.acme.MyLogger is deemed reachable.
At a glance, this looks exactly like something we would need for implementing conditional configuration!
A note on the handlers: Native Image executes these callbacks during analysis and, as of recently, may execute them in parallel.
This matters if several handlers touch shared state, for example by modifying a shared map: such state must be thread-safe.
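
As a rough illustration of the thread-safety point, the sketch below simulates callbacks being invoked in parallel and updating a shared map. It does not use the Feature API: ParallelHandlerDemo, callback and runHandlers are hypothetical names, and a thread pool stands in for the image builder.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ParallelHandlerDemo {
    // Shared state touched by several callbacks. ConcurrentHashMap.merge is atomic
    // per key, so concurrent updates are not lost as they could be with a HashMap.
    static final Map<String, Integer> SEEN = new ConcurrentHashMap<>();

    // Stand-in for a reachability callback (a real one receives a DuringAnalysisAccess).
    static void callback(String reachedType) {
        SEEN.merge(reachedType, 1, Integer::sum);
    }

    // Simulates the builder invoking the callback many times from several threads
    // and returns how many invocations were recorded.
    static int runHandlers(int invocations) {
        SEEN.clear();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < invocations; i++) {
            pool.submit(() -> callback("com.acme.MyLogger"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callbacks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return SEEN.getOrDefault("com.acme.MyLogger", 0);
    }

    public static void main(String[] args) {
        System.out.println(runHandlers(1000)); // prints 1000
    }
}
```

The same idea applies to real handlers: prefer concurrent collections or explicit synchronization for anything that more than one callback can reach.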

Conditional Configuration Generation with the Tracing Agent

Configuration with Origins

One of the functionalities we've added to the agent is tracing configuration with origins.
In this mode, the agent will break down the entries to the context they originated from (the call stack) in tree form.
To turn this mode on, add experimental-configuration-with-origins to the agent option string.
Example output from a netty test run:

└── io.netty.buffer.AbstractByteBufTest#testSliceReadGatheringByteChannelMultipleThreads()
    └── io.netty.buffer.AbstractByteBufTest#testReadGatheringByteChannelMultipleThreads(boolean)
        └── io.netty.buffer.AbstractReferenceCountedByteBuf#release()
            └── io.netty.buffer.AbstractReferenceCountedByteBuf#handleRelease(boolean)
                └── io.netty.buffer.UnpooledDirectByteBuf#deallocate()
                    └── io.netty.buffer.UnpooledDirectByteBuf#freeDirect(java.nio.ByteBuffer)
                        └── io.netty.util.internal.PlatformDependent#freeDirectBuffer(java.nio.ByteBuffer)
                            └── io.netty.util.internal.CleanerJava9#freeDirectBuffer(java.nio.ByteBuffer)
                                └── java.lang.reflect.Method#invoke(java.lang.Object,java.lang.Object[]) - [ {   "name":"sun.misc.Unsafe",   "methods":[{"name":"invokeCleaner","parameterTypes":["java.nio.ByteBuffer"] }] } ]

So, how does this tie in with generating conditional configuration?

How the Tracing Agent Generates Conditional Configuration

We use the origin information of the entries to deduce conditions.
Directly using the caller method's class to generate the condition wouldn't work in cases where utility methods are involved (e.g., wrappers around Class.forName).

To overcome this, we've developed the following algorithm:

1. Group all calls of a method from the call tree into a list.
2. For each method with more than one call in the tree, iterate over its call list and find configuration common to all calls. Iterate over the list again, this time propagating any non-common configuration to its caller.
3. Repeat step 2 while there are changes.
4. For each call node that contains configuration entries, generate these entries with the method's class as the condition.
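
The steps above can be sketched as follows. This is one possible reading of the heuristic and not the agent's actual implementation: CallNode, ConditionDeduction and deduce are hypothetical names, and configuration entries are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical call-tree node: one call of className#method, plus the
// configuration entries traced directly under that call.
final class CallNode {
    final String className;
    final String method;
    final CallNode caller;
    final Set<String> config = new HashSet<>();
    final List<CallNode> callees = new ArrayList<>();

    CallNode(String className, String method, CallNode caller) {
        this.className = className;
        this.method = method;
        this.caller = caller;
        if (caller != null) caller.callees.add(this);
    }
}

final class ConditionDeduction {
    // Step 1: group all calls of the same method into one list.
    static void group(CallNode node, Map<String, List<CallNode>> calls) {
        calls.computeIfAbsent(node.className + "#" + node.method, k -> new ArrayList<>()).add(node);
        for (CallNode callee : node.callees) group(callee, calls);
    }

    // Returns a map from condition class to the configuration entries guarded by it.
    static Map<String, Set<String>> deduce(CallNode root) {
        Map<String, List<CallNode>> calls = new HashMap<>();
        group(root, calls);
        boolean changed = true;
        while (changed) { // Step 3: repeat while something was propagated.
            changed = false;
            for (List<CallNode> sameMethod : calls.values()) {
                if (sameMethod.size() < 2) continue;
                // Step 2a: configuration common to every call of this method.
                Set<String> common = new HashSet<>(sameMethod.get(0).config);
                for (CallNode call : sameMethod) common.retainAll(call.config);
                // Step 2b: move everything else up to the respective caller.
                for (CallNode call : sameMethod) {
                    if (call.caller == null) continue;
                    Set<String> nonCommon = new HashSet<>(call.config);
                    nonCommon.removeAll(common);
                    if (nonCommon.isEmpty()) continue;
                    call.caller.config.addAll(nonCommon);
                    call.config.retainAll(common);
                    changed = true;
                }
            }
        }
        // Step 4: the class of each node that still holds entries becomes their condition.
        Map<String, Set<String>> byCondition = new TreeMap<>();
        for (List<CallNode> sameMethod : calls.values()) {
            for (CallNode call : sameMethod) {
                if (!call.config.isEmpty()) {
                    byCondition.computeIfAbsent(call.className, k -> new TreeSet<>()).addAll(call.config);
                }
            }
        }
        return byCondition;
    }

    public static void main(String[] args) {
        CallNode root = new CallNode("com.acme.Main", "main", null);
        CallNode a = new CallNode("com.acme.A", "init", root);
        CallNode b = new CallNode("com.acme.B", "init", root);
        // The same utility method produces different entries depending on its caller.
        new CallNode("com.acme.Util", "load", a).config.add("org.labx.FileLogger");
        new CallNode("com.acme.Util", "load", b).config.add("org.labx.ShreddingLogger");
        System.out.println(deduce(root));
    }
}
```

In this toy run the two calls of com.acme.Util#load share no configuration, so each entry moves to its caller and ends up conditional on com.acme.A or com.acme.B instead of on the utility class.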

The rationale behind this approach is: if a method generates different configuration entries depending on the caller, then the caller is in a way "responsible" for generating these entries.
Of course, as this is a heuristic, it will not work well in all cases.
If a method's entries vary due to, for example, an external file, we would still propagate them to the callers whenever the method appears in the call tree more than once.

As this is mainly targeted towards library authors, this brings us to the next problem - how do we ensure that only our library classes appear in conditions?
Take the above logger example: our com.acme.MyLogger facade might really be directly invoking a framework factory that then reflectively accesses the loggers' constructors.
Furthermore, let's say the org.labx.FileLogger creates a proxy in its constructor - from our library's perspective, such a proxy should only be generated if com.acme.MyLogger is reachable.
The solution here is to only keep "user code" classes in the call tree. A filter file is used to create a filter that includes only the user's classes.
In our logger example, we could use the following filter:

{
    "rules" : [
        {"includeClasses": "com.acme.**"}
    ]
}

With this filter, only classes in the com.acme package and its subpackages will end up in the call tree.
The format of this filter file is the same as the other agent filter files - see the agent docs.
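
For intuition, here is a simplified sketch of how such rules could be matched against class names. It rests on assumptions taken from my reading of the agent docs: a ".**" suffix covers a package and its subpackages, a ".*" suffix covers a single package, and later rules override earlier ones. FilterRule and UserCodeFilter are hypothetical names, not the agent's classes.

```java
import java.util.List;

// Hypothetical model of one filter rule: includeClasses (include = true)
// or excludeClasses (include = false) with its pattern.
final class FilterRule {
    final boolean include;
    final String pattern;

    FilterRule(boolean include, String pattern) {
        this.include = include;
        this.pattern = pattern;
    }

    boolean matches(String className) {
        if (pattern.endsWith(".**")) {
            // "com.acme.**" -> prefix "com.acme." : the package and all subpackages.
            return className.startsWith(pattern.substring(0, pattern.length() - 2));
        }
        if (pattern.endsWith(".*")) {
            // "com.acme.*" -> prefix "com.acme." with no further package separator.
            String prefix = pattern.substring(0, pattern.length() - 1);
            return className.startsWith(prefix) && className.indexOf('.', prefix.length()) < 0;
        }
        return className.equals(pattern);
    }
}

final class UserCodeFilter {
    // Assumption: a class is not user code until some rule includes it;
    // the last matching rule wins.
    static boolean isUserCode(List<FilterRule> rules, String className) {
        boolean included = false;
        for (FilterRule rule : rules) {
            if (rule.matches(className)) included = rule.include;
        }
        return included;
    }

    public static void main(String[] args) {
        List<FilterRule> rules = List.of(new FilterRule(true, "com.acme.**"));
        System.out.println(isUserCode(rules, "com.acme.MyLogger"));   // true
        System.out.println(isUserCode(rules, "org.labx.FileLogger")); // false
    }
}
```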

An Example Agent Run

To enable this mode, simply add experimental-conditional-config-filter-file=<path-to-a-filter-file> to the agent option string. For example: -agentlib:native-image-agent=config-output-dir=config/,experimental-conditional-config-filter-file=my-filter.json.

Using the native-image-configure Tool to Generate Conditional Configuration From Multiple Agent Runs

If your library has good unit test coverage, you may want to run these tests with the agent to generate a configuration for your library.
These tests can be executed in multiple different JVMs, exercising different code paths.
In these and other scenarios, we can use the agent to collect and save configuration with metadata.
The output of these runs is then ingested by native-image-configure, which creates a merged call tree that is then used to generate conditional configuration.

A Brief Introduction to native-image-configure

We've recently started shipping native-image-configure with the GraalVM native-image component.
It is a handy tool that allows us to manipulate existing configuration.
For example, you can merge multiple sets of configuration with: native-image-configure generate --input-dir=<config_dir_one> --input-dir=<config_dir_two> --output-dir=<merged_config_dir>

Back to the conditional configuration topic - we can instruct the agent to output the configuration with metadata by adding experimental-conditional-config-part to the agent option string. Note that we do not pass in a filter file at this point; filtering is handled by native-image-configure in the next step.
Once all of the agent runs finish, we can generate conditional configuration with: native-image-configure generate-conditional --user-code-filter=<path-to-filter-file> --input-dir=<agent_output_dir_one> --input-dir=<agent_output_dir_two> --output-dir=<conditional_config_output_dir>

Integration with Native Build Tools

To simplify this process further, we plan on supporting this mode in the native build tools.
You can read more about this effort here: graalvm/native-build-tools#206

Tracing Agent Development Basics

The tracing agent is a great use case of Native Image.
It leverages JVMTI (Java Virtual Machine Tool Interface) and JNI to gather information from the running JVM.

JVMTI - TLDR

JVMTI is used for writing agents in a low level language (e.g. C).
It allows native code to receive callbacks from and call into the JVM.
The core concepts of JVMTI are:

Events: these are basically callbacks from the JVM when interesting things happen.
For example, the class file load hook event is triggered when the JVM obtains class data, but before this data is processed by the JVM.
Some events allow us to tweak the event data - in the case of the class file load hook event, we can change the bytecode of the class before it is loaded.
Event callbacks are set through SetEventCallbacks.
Events must be explicitly enabled by calling SetEventNotificationMode.
Functions: these are the calls that native code makes into the JVM.
Capabilities: parts of JVMTI are optional, and a JVM need not implement them.
Other parts of JVMTI would incur runtime cost even if they are never used.
For these reasons and more, functions and events may only be used if their required capabilities are enabled.
For example, the CompiledMethodLoad event can only be used if the can_generate_compiled_method_load_events capability is enabled.

For our purposes, the JVMTI data structures are mapped into Java classes using our C API.
For example, our representation of the jvmtiEventCallbacks C struct is com.oracle.svm.jvmtiagentbase.jvmti.JvmtiEventCallbacks and looks like:

@CStruct("jvmtiEventCallbacks")
@CContext(JvmtiDirectives.class)
public interface JvmtiEventCallbacks extends PointerBase {
    @CField
    void setVMStart(CFunctionPointer callback);

    @CField
    void setVMInit(CFunctionPointer callback);
    ...
}

Breakpoints, Breakpoints Everywhere!

At its core, the tracing agent works by setting breakpoints on interesting methods, much like the ones set by your favorite IDE when debugging.
Breakpoints are specified in com.oracle.svm.agent.BreakpointInterceptor#BREAKPOINT_SPECIFICATIONS and their specification consists of a Java class, method and signature, along with a breakpoint handler.
The agent will then install a breakpoint at the start of the target method.
Once the breakpoint is hit, your handler will be invoked.

Usually, the handler will extract some data from the breakpoint event and the environment and forward this data to the underlying com.oracle.svm.agent.tracing.core.Tracer with a call to com.oracle.svm.agent.BreakpointInterceptor#traceBreakpoint.
The tracer then dispatches the data to an appropriate com.oracle.svm.configure.trace.TraceProcessor.
The processor will then peek into the data and delegate it to an appropriate configuration processor.
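
The flow from breakpoint to tracer can be pictured with the toy model below. It does not use JVMTI or the agent's classes: BreakpointDispatchDemo, SPECS, TRACE and onMethodEntry are hypothetical names, and method entry is simulated by an ordinary call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

final class BreakpointDispatchDemo {
    // Hypothetical stand-in for the tracer: collects trace entries as strings.
    static final List<String> TRACE = new ArrayList<>();

    // Toy breakpoint specifications: "class#method" plus JVM signature -> handler.
    // The handler receives the caller's class name and the call arguments.
    static final Map<String, BiConsumer<String, Object[]>> SPECS = new HashMap<>();

    static {
        SPECS.put("java.lang.Class#forName(Ljava/lang/String;)Ljava/lang/Class;",
                (caller, args) -> TRACE.add("forName(" + args[0] + ") called from " + caller));
    }

    // Simulates the JVM reaching the start of a method: if a breakpoint is
    // installed there, its handler extracts data and forwards it to the tracer.
    static void onMethodEntry(String target, String caller, Object... args) {
        BiConsumer<String, Object[]> handler = SPECS.get(target);
        if (handler != null) {
            handler.accept(caller, args);
        }
    }

    public static void main(String[] args) {
        onMethodEntry("java.lang.Class#forName(Ljava/lang/String;)Ljava/lang/Class;",
                "com.acme.MyLogger", "org.labx.FileLogger");
        // No breakpoint is registered for this method, so nothing is traced.
        onMethodEntry("java.lang.String#length()I", "com.acme.MyLogger");
        System.out.println(TRACE);
    }
}
```

The real agent additionally routes each traced entry through a TraceProcessor to the matching configuration processor; that last hop is left out here.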

Well, Not Exactly Everywhere - How Do We Handle JNI?

Things are a little bit different with JNI - native code can still access, for example, a field through GetFieldID, without ever passing through a Java method we could set a breakpoint on.
com.oracle.svm.agent.JniCallInterceptor is in charge of intercepting these entries.
The JVM uses the JNI function table to figure out which function to call when a JNI method is invoked from native code.
JniCallInterceptor works by changing this table in com.oracle.svm.agent.JniCallInterceptor#onVMStart, replacing interesting JNI functions with our own wrappers.
These wrappers will delegate calls to their wrapped method, along with forwarding interesting bits to the tracer via com.oracle.svm.agent.JniCallInterceptor#traceCall.
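
The wrapping technique itself is easy to model in plain Java. The sketch below is only an analogy for the JNI function table: FunctionTableDemo, newTable and intercept are hypothetical names, and a map of lambdas stands in for the table of native function pointers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

final class FunctionTableDemo {
    // Hypothetical stand-in for the tracer.
    static final List<String> TRACE = new ArrayList<>();

    // Toy "function table": function name -> implementation taking (class, member).
    static Map<String, BinaryOperator<String>> newTable() {
        Map<String, BinaryOperator<String>> table = new HashMap<>();
        table.put("GetFieldID", (clazz, name) -> "field:" + clazz + "." + name);
        table.put("GetMethodID", (clazz, name) -> "method:" + clazz + "." + name);
        return table;
    }

    // Replaces one table slot with a wrapper that delegates to the original
    // function and forwards the interesting bits to the tracer.
    static void intercept(Map<String, BinaryOperator<String>> table, String function) {
        BinaryOperator<String> original = table.get(function);
        if (original == null) {
            return; // nothing to wrap
        }
        table.put(function, (clazz, member) -> {
            String result = original.apply(clazz, member);
            TRACE.add(function + " " + clazz + " " + member);
            return result;
        });
    }

    public static void main(String[] args) {
        Map<String, BinaryOperator<String>> table = newTable();
        intercept(table, "GetFieldID");
        // Callers still go through the table and get the same result as before.
        System.out.println(table.get("GetFieldID").apply("com.acme.MyLogger", "level"));
        System.out.println(TRACE);
    }
}
```

Because callers only ever look functions up in the table, they do not notice the swap, which is what makes this kind of interception transparent.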

Possible deep dive topics for next meeting

Please send suggestions, or "upvote" a suggestion, by adding a comment to this discussion.

How to write "AOT friendly" Java code
