Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There is a lot of cloning #1551

Open
zrho opened this issue Oct 3, 2024 · 0 comments
Open

There is a lot of cloning #1551

zrho opened this issue Oct 3, 2024 · 0 comments
Labels
bug Something isn't working perf Performance issue

Comments

@zrho
Copy link
Contributor

zrho commented Oct 3, 2024

I have benchmarked the roundtrip of a circuit via the path core -> model -> capnp -> model -> core. I was surprised about how slow it was; way slower than the JSON export. So I profiled to see where all the time goes. This is the export:
Screenshot 2024-10-03 at 18 09 41
Most of the time is spent in OpType::port_kind, which takes so long since it copies a lot of stuff:

pub fn port_kind(&self, port: impl Into<Port>) -> Option<EdgeKind> {
    let signature = self.dataflow_signature().unwrap_or_default();      // <-- this deep copies he signature
    let port: Port = port.into();
    let dir = port.direction();
    let port_count = signature.port_count(dir);

    // Dataflow ports
    if port.index() < port_count {
        return signature.port_type(port).cloned().map(EdgeKind::Value);    // <-- this deep copies the relevant type again
    }

    // ....
}

Looking at the import, it's similar:
Screenshot 2024-10-03 at 18 05 46

The first bump is mostly due to splitting strings into the extension and the name part. The remaining two bumps come from OpType::value_port_count, which spends most of its time copying data.

pub fn value_port_count(&self, dir: portgraph::Direction) -> usize {
     self.dataflow_signature() // <-- copy all of the signature just to count ports
         .map(|sig| sig.port_count(dir))
         .unwrap_or(0)
}

The hugr-core data structures are designed in a way that you don't have to worry about ownership, which is not a bad thing. But there is a lot of deep copying, allocating and freeing going on, even just to look at something. There is a massive performance improvement to be had by being more careful here, not just for import or export, but for everything the compiler does that ever looks at types.

The most immediate and pragmatic approach that comes to my mind is to slap a lot of Arcs on as many types as possible, ideally in a way that minimally affects the API. (There are other approaches but they are too radical for now). What do you think?

@zrho zrho added the perf Performance issue label Oct 3, 2024
github-merge-queue bot pushed a commit that referenced this issue Oct 11, 2024
…1557)

A binary serialisation format for `hugr-model` based on capnproto.

- Export from `hugr-core` to `hugr-model` now deduplicates terms on the
fly.
- Removed types on ports and instead used the signature. This allows for
more deduplication and also provides a natural place for extension set
annotations which was missing before.
- Added the `NodeId` of the node that defines a parameter to the indexed
variant of a `LocalRef` so that deduplication does not accidentally
merge two local variables that refer to different parameters.

Serializing the same 2000 2-qubit gate + 1000 1-qubit gate circuit by
going from `core` via the `model` takes 5.5682 ms per roundtrip on Apple
M3. Meanwhile the JSON roundtrip baseline is 10.760 ms. A significant
portion of the time is spent on needless copies of signatures since
`export` and `import` have to interact with the `hugr-core` API (#1551),
so there is still a lot of potential to make this faster yet.
@aborgna-q aborgna-q added the bug Something isn't working label Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working perf Performance issue
Projects
None yet
Development

No branches or pull requests

2 participants