Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor pyawkward.cpp both for compilation speed and so that arrays can be dynamically loaded by dependent Python modules. #116

Merged
merged 28 commits into from
Feb 11, 2020

Conversation

jpivarski
Copy link
Member

No description provided.

…rrays can be dynamically loaded by dependent Python modules.
@jpivarski jpivarski self-assigned this Feb 10, 2020
@jpivarski jpivarski linked an issue Feb 10, 2020 that may be closed by this pull request
@jpivarski
Copy link
Member Author

Compilation of all sources from scratch (12 core machine):

real	1m47.566s
user	1m44.996s
sys	0m4.160s

Linking only (no files touched):

real	0m28.955s
user	0m28.361s
sys	0m1.152s

1/3 of the compilation time is spent only making the awkward1.layout extension module. It's unclear that refactoring this into several extension modules would make an impact on full compilations (depends on whether linking is nonlinear or making separate .so files can be done in parallel—linking one is single-threaded), but it would definitely make an impact on partial recompilations, in which only the classes in one of the smaller .so files are touched.

Also, calling types, fillables, etc. "layouts" is an abuse of terminology. Only the Content subclasses are layouts.

@jpivarski
Copy link
Member Author

jpivarski commented Feb 11, 2020

I'm away from my big computer right now, but I don't think this solved the compilation speed issue. Nevertheless, it was a necessary refactoring.

@jpivarski jpivarski changed the title [WIP] Refactor pyawkward.cpp both for compilation speed and so that arrays can be dynamically loaded by dependent Python modules. Refactor pyawkward.cpp both for compilation speed and so that arrays can be dynamically loaded by dependent Python modules. Feb 11, 2020
@jpivarski
Copy link
Member Author

@glass-ships and @lukasheinrich:

You both want to create projects that depend on Awkward. While I understand that @lukasheinrich isn't currently considering round-tripping the data through Python, you should both be aware that I've tested compiling a dependent Python module against Awkward, and it works in MacOS and Linux. (I haven't figured out a CMake issue in Windows, yet.)

Here's an example, which I should also link from the README.

https://github.com/scikit-hep/awkward-1.0/tree/feature/PR116-refactor-pyawkward/dependent-project

The dependent project has either the static or dynamic libraries as a compile-time dependency. (The example shows it for static: just remove the -static in CMakeLists.txt.) It can then use std::shared_ptr<ak::Content> as an argument or a return type from its functions, and when the dependent project is compiled as a Python module with pybind11, pybind11 links those C++ objects to the right Python objects.

Some caveats:

  • The Windows issue I referred to, but it would probably just involve having access to a Windows machine or help from a Windows expert.
  • You're compiling against a particular version of Awkward, so your dependent project will be pinned to that version.
  • Fortunately or unfortunately, depending on your point of view, you're also compiling against a particular version of pybind11. I've asked about this on the pybind11 Gitter, and it sounds like there are intervals of compatibility, which might or might not be related to semantic version number. The reason this is "fortunate" is because it gives you a nice Python exception, rather than a random segfault deep in an Awkward routine because the C++ layout is different. When I release a version of Awkward, it should always include the pybind11 version number in the release notes so that dependent projects compiling against Awkward X.Y know that they need pybind11 version Z.

Can the linking be purely dynamic? I'm not sure. I would consider that a valuable goal, but I don't know if it can be achieved.

@henryiii may also have some good ideas about this.

… name I liked better (because it's like Numba's 'types', and it's qualified so not to be confused with the standard library module).
@jpivarski jpivarski merged commit e9f6ba0 into master Feb 11, 2020
@jpivarski jpivarski deleted the feature/PR116-refactor-pyawkward branch February 11, 2020 04:37
@jpivarski
Copy link
Member Author

To confirm that this refactoring didn't do a thing for compilation time, here's compilation of all sources from scratch (same 12 core machine).

real	2m1.092s
user	1m58.334s
sys	0m5.153s

It's about 10 seconds longer.

Linking only (no files touched):

real	0m31.471s
user	0m30.691s
sys	0m1.359s

It's about 2 seconds longer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor pyawkward.cpp
1 participant