Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to use their own cuda contexts and streams in JIT mode #6345

Merged
merged 10 commits into from
Oct 28, 2021

Conversation

abadams
Copy link
Member

@abadams abadams commented Oct 23, 2021

By exposing the acquire_cuda_context and friends methods in the same way as other JIT runtime overrides, we can let people replace them by swapping in their own handlers for those methods (see the new test)

This PR is a bit of a can-kick, because it's a piecemeal function-pointer-based exposure of three new runtime methods, rather than a single coherent way to replace any parts of the runtime you want. It should solve the problem in the short term, however.

Builds on #6344

Copy link
Contributor

@steven-johnson steven-johnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have something like jit_handlers().reset() to set all the handlers back to their defaults?

src/JITModule.h Outdated
@@ -28,16 +28,130 @@ struct JITUserContext;

/** A set of custom overrides of runtime functions */
struct JITHandlers {
/** Set the function called to print messages from the runtime.
* If you are compiling statically, you can also just define your
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments about "if you are compiling statically" are arguably confusing and/or misplaced here. I'd suggest replacing everything after the first sentence with something like "(Note that this applies only when jitting code; if you are doing ahead-of-time compilation, see [README_foo.md or wherever we document this, rather than trying to replicate that documentation here])".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (in underlying branch)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have something like jit_handlers().reset() to set all the handlers back to their defaults?

It's just a struct, so you can say my_func.jit_handlers() = JITHandlers{} to reset it. Not sure if it's worth adding a method.

@@ -65,6 +65,23 @@ extern uintptr_t halide_cuda_get_device_ptr(void *user_context, struct halide_bu
* driver. See halide_reuse_device_allocations. */
extern int halide_cuda_release_unused_device_allocations(void *user_context);

// These typedefs treat both a CUcontext and a CUstream as a void *,
// to avoid dependencies on cuda headers.
typedef int (*halide_cuda_acquire_context_t)(void *, // user_context
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Completely orthogonal to this PR, but IMHO we should consider migrating typedef bar foo; to using foo = bar; as I think it reads easier and is easier to search for)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this header is supposed to compile in C mode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and/or with janky legacy toolchains

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhhh right

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Do we actually compile/run any tests in plain-C mode? If not, we should add one)

These can come up if a JITUserContext is passed to something like
copy_to_device before getting fully populated by passing it to a call to
realize.
and reuse the runtime's name resolution mechanism instead
This change means we'll only ever create one built-in cuda context in
this circumstance.
@abadams
Copy link
Member Author

abadams commented Oct 26, 2021

Turns out the gpu_object_lifetime tests were creating both a cuda runtime module and a cuda-debug runtime module, and were thus creating two cuda contexts. The latest changes dedup this and add some comments explaining things.

@abadams abadams merged commit 1c7388a into master Oct 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants