ggml : cgraph export/import/eval example + GPU support #108

Merged
ggerganov merged 27 commits into master from cgraph-export on May 29, 2023

ggerganov
Owner

@ggerganov ggerganov commented Apr 24, 2023

This is the first step towards full GPU and custom hardware inference support (see ggerganov/llama.cpp#915)

The idea is to be able to export the ggml computation graphs (ggml_cgraph) into standalone .ggml files.
These files can later be imported by a separate application and evaluated on whatever hardware or framework is available (CUDA, Metal, WebGPU, etc.). The computation graph contains everything necessary to perform the inference:

  • model weights
  • operations
  • work buffers
  • sizes + offsets
  • memory layout
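The "sizes + offsets" and "memory layout" entries can be illustrated with the NE*/NB* columns of the graph dump in the sample run below: for the contiguous f32 tensors shown there, each byte stride is the previous stride multiplied by the previous dimension. The following is a minimal sketch of that relation, not ggml's actual code; `tensor_desc`, `set_contiguous_strides` and `tensor_nbytes` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMS 4

/* Hypothetical descriptor (not ggml's struct): element counts and byte strides. */
struct tensor_desc {
    int64_t ne[MAX_DIMS]; /* number of elements in each dimension */
    size_t  nb[MAX_DIMS]; /* stride in bytes for each dimension   */
};

/* Fill in the strides of a contiguous tensor:
 * nb[0] is the element size, nb[i] = nb[i-1] * ne[i-1]. */
static void set_contiguous_strides(struct tensor_desc *t, size_t elem_size) {
    assert(t != NULL && elem_size > 0);
    t->nb[0] = elem_size;
    for (int i = 1; i < MAX_DIMS; i++) {
        t->nb[i] = t->nb[i - 1] * (size_t) t->ne[i - 1];
    }
}

/* Total size in bytes of a contiguous tensor. */
static size_t tensor_nbytes(const struct tensor_desc *t) {
    return t->nb[MAX_DIMS - 1] * (size_t) t->ne[MAX_DIMS - 1];
}
```

With ne = {784, 500, 1, 1} and a 4-byte element, this gives nb = {4, 3136, 1568000, 1568000} and a total of 1568000 bytes, which matches the fc1_weight row in the dump.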

As an example, we export the MNIST computation graph from the mnist example into the file mnist.ggml:

$ ./bin/mnist ./models/mnist/ggml-model-f32.bin ../examples/mnist/models/mnist/t10k-images.idx3-ubyte

Next, using the mnist-cpu tool, we load the graph and re-evaluate it on the CPU using ggml_graph_compute():

$ ./bin/mnist-cpu ./mnist.ggml ../examples/mnist/models/mnist/t10k-images.idx3-ubyte

Or we can run it on the Apple Silicon GPU using Metal:

$ ./bin/mnist-mtl ./mnist.ggml ../examples/mnist/models/mnist/t10k-images.idx3-ubyte

Here is a sample run:

$ ./bin/mnist ./models/mnist/ggml-model-f32.bin ../examples/mnist/models/mnist/t10k-images.idx3-ubyte

mnist_model_load: loading model from './models/mnist/ggml-model-f32.bin'
mnist_model_load: ggml ctx size =   1.52 MB
main: loaded model in     1.02 ms
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ * * * * * * _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ * * * * * * * * _ * * * _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ * _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ * * * * * * * _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ * * * _ _ _ _ * * * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ * * * * _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ * * * * * * * * * _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ggml_graph_dump_dot: dot -Tpng mnist.dot -o mnist.dot.png && open mnist.dot.png

magic            67676d6c
version                 1
leafs                   5
nodes                   6
eval                 6120

TYPE   OP              NDIMS      NE0      NE1      NE2      NE3              NB0              NB1              NB2              NB3             DATA             NAME
f32    NONE                2      500       10        1        1                4             2000            20000            20000      0x1201877d0       fc2_weight
f32    NONE                2      784      500        1        1                4             3136          1568000          1568000      0x120008100       fc1_weight
f32    NONE                1      784        1        1        1                4             3136             3136             3136      0x11e809f00            input
f32    NONE                1      500        1        1        1                4             2000             2000             2000      0x120186f00         fc1_bias
f32    NONE                1       10        1        1        1                4               40               40               40      0x12018c6f0         fc2_bias

ARG    TYPE   OP              NDIMS      NE0      NE1      NE2      NE3              NB0              NB1              NB2              NB3   NTASKS             DATA             NAME
DST    f32    MUL_MAT             1      500        1        1        1                4             2000             2000             2000        1      0x11e80ac40           node_0
SRC0   f32    NONE                2      784      500        1        1                4             3136          1568000          1568000        0      0x120008100       fc1_weight
SRC1   f32    NONE                1      784        1        1        1                4             3136             3136             3136        0      0x11e809f00            input

DST    f32    ADD                 1      500        1        1        1                4             2000             2000             2000        1      0x11e80b510           node_1
SRC0   f32    MUL_MAT             1      500        1        1        1                4             2000             2000             2000        1      0x11e80ac40           node_0
SRC1   f32    NONE                1      500        1        1        1                4             2000             2000             2000        0      0x120186f00         fc1_bias

DST    f32    RELU                1      500        1        1        1                4             2000             2000             2000        1      0x11e80bde0           node_2
SRC0   f32    ADD                 1      500        1        1        1                4             2000             2000             2000        1      0x11e80b510           node_1

DST    f32    MUL_MAT             1       10        1        1        1                4               40               40               40        1      0x11e80c6b0           node_3
SRC0   f32    NONE                2      500       10        1        1                4             2000            20000            20000        0      0x1201877d0       fc2_weight
SRC1   f32    RELU                1      500        1        1        1                4             2000             2000             2000        1      0x11e80bde0           node_2

DST    f32    ADD                 1       10        1        1        1                4               40               40               40        1      0x11e80c7e0           node_4
SRC0   f32    MUL_MAT             1       10        1        1        1                4               40               40               40        1      0x11e80c6b0           node_3
SRC1   f32    NONE                1       10        1        1        1                4               40               40               40        0      0x12018c6f0         fc2_bias

DST    f32    SOFT_MAX            1       10        1        1        1                4               40               40               40        1      0x11e80c910            probs
SRC0   f32    ADD                 1       10        1        1        1                4               40               40               40        1      0x11e80c7e0           node_4


mnist_eval: exported compute graph to 'mnist.ggml'
main: predicted digit is 3
$ dot -Tpng mnist.dot -o mnist.dot.png && open mnist.dot.png

[image: computation graph rendered from mnist.dot]
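The `magic` value printed in the header of the dump above, `67676d6c`, reads as the ASCII string "ggml" when split into four bytes, most significant first. Below is a small C sketch of that decoding. The names `MAGIC_SKETCH`, `magic_to_string` and `magic_matches` are hypothetical, and the sketch only decodes the printed value; the on-disk byte order of the .ggml file is not shown in this thread and is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The value printed as "magic" in the dump; the macro name is hypothetical. */
#define MAGIC_SKETCH 0x67676d6cu

/* Unpack a 32-bit magic into 4 characters, most significant byte first,
 * and NUL-terminate the result. */
static void magic_to_string(uint32_t magic, char out[5]) {
    assert(out != NULL);
    out[0] = (char) ((magic >> 24) & 0xffu);
    out[1] = (char) ((magic >> 16) & 0xffu);
    out[2] = (char) ((magic >>  8) & 0xffu);
    out[3] = (char) ( magic        & 0xffu);
    out[4] = '\0';
}

/* Returns 1 if the value equals the expected magic, 0 otherwise. */
static int magic_matches(uint32_t magic) {
    return magic == MAGIC_SKETCH;
}
```

A loader could run a check like `magic_matches` on the first header field before trusting the rest of the file.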

CPU (via ggml)

$ ./bin/mnist-cpu ./mnist.ggml ../examples/mnist/models/mnist/t10k-images.idx3-ubyte
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * * * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * * * * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ * * * * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ * * * * _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ * * * * * _ _ _ _ _ 
_ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ * * _ _ _ * * _ _ _ _ 
_ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ * _ _ _ _ _ * * _ _ _ 
_ _ _ _ _ _ * * _ _ _ _ _ _ _ _ * _ _ _ _ _ _ * _ _ _ _ 
_ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ 
_ _ _ _ _ _ _ * * * _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ 
_ _ _ _ _ _ _ _ _ * * * * * * * * _ _ _ * * * _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ * * * * * * * * * * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ggml_graph_import: loaded leaf 0: '      fc2_weight',   2 dims,     20000 bytes
ggml_graph_import: loaded leaf 1: '      fc1_weight',   2 dims,   1568000 bytes
ggml_graph_import: loaded leaf 2: '           input',   1 dims,      3136 bytes
ggml_graph_import: loaded leaf 3: '        fc1_bias',   1 dims,      2000 bytes
ggml_graph_import: loaded leaf 4: '        fc2_bias',   1 dims,        40 bytes
ggml_graph_import: loaded node 0: '          node_0',   1 dims,      2000 bytes
ggml_graph_import: loaded node 1: '          node_1',   1 dims,      2000 bytes
ggml_graph_import: loaded node 2: '          node_2',   1 dims,      2000 bytes
ggml_graph_import: loaded node 3: '          node_3',   1 dims,        40 bytes
ggml_graph_import: loaded node 4: '          node_4',   1 dims,        40 bytes
ggml_graph_import: loaded node 5: '           probs',   1 dims,        40 bytes
main: predicted digit is 6

Metal

$ ./bin/mnist-mtl ./mnist.ggml ../examples/mnist/models/mnist/t10k-images.idx3-ubyte 

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ * * * * * * * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ * * * _ _ _ _ _ _ * * _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * _ _ _ _ _ _ * * * * _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ * * _ _ _ _ * * * _ * _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ * * * * * _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ * _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

ggml_cgraph_import: loaded leaf 0: '      fc2_weight',   2 dims,     20000 bytes
ggml_cgraph_import: loaded leaf 1: '      fc1_weight',   2 dims,   1568000 bytes
ggml_cgraph_import: loaded leaf 2: '           input',   1 dims,      3136 bytes
ggml_cgraph_import: loaded leaf 3: '        fc1_bias',   1 dims,      2000 bytes
ggml_cgraph_import: loaded leaf 4: '        fc2_bias',   1 dims,        40 bytes
ggml_cgraph_import: loaded node 0: '          node_0',   1 dims,      2000 bytes
ggml_cgraph_import: loaded node 1: '          node_1',   1 dims,      2000 bytes
ggml_cgraph_import: loaded node 2: '          node_2',   1 dims,      2000 bytes
ggml_cgraph_import: loaded node 3: '          node_3',   1 dims,        40 bytes
ggml_cgraph_import: loaded node 4: '          node_4',   1 dims,        40 bytes
ggml_cgraph_import: loaded node 5: '           probs',   1 dims,        40 bytes
mnist_mtl_init: allocating
mnist_mtl_init: using MPS
mnist_mtl_init: allocated data buffer, size = 1594896
mnist_mtl_init: allocated eval buffer, size = 9120
mnist_mtl_init: allocated results buffer, size = 40
mnist_mtl_eval: evaluating
mnist_mtl_eval: encoding node   0, op =  MUL_MAT
mnist_mtl_get_buffer: data tensor '      fc1_weight', offs =    20512, size =  1568000
mnist_mtl_get_buffer: data tensor '           input', offs =  1588628, size =     3136
mnist_mtl_get_buffer: eval tensor '          node_0', offs =     1536, size =     2000
mnist_mtl_eval: encoding node   1, op =      ADD
mnist_mtl_get_buffer: eval tensor '          node_0', offs =     1536, size =     2000
mnist_mtl_get_buffer: data tensor '        fc1_bias', offs =  1591880, size =     2000
mnist_mtl_get_buffer: eval tensor '          node_1', offs =     3792, size =     2000
mnist_mtl_eval: encoding node   2, op =     RELU
mnist_mtl_get_buffer: eval tensor '          node_1', offs =     3792, size =     2000
mnist_mtl_get_buffer: eval tensor '          node_2', offs =     6048, size =     2000
mnist_mtl_eval: encoding node   3, op =  MUL_MAT
mnist_mtl_get_buffer: data tensor '      fc2_weight', offs =      396, size =    20000
mnist_mtl_get_buffer: eval tensor '          node_2', offs =     6048, size =     2000
mnist_mtl_get_buffer: eval tensor '          node_3', offs =     8304, size =       40
mnist_mtl_eval: encoding node   4, op =      ADD
mnist_mtl_get_buffer: eval tensor '          node_3', offs =     8304, size =       40
mnist_mtl_get_buffer: data tensor '        fc2_bias', offs =  1593996, size =       40
mnist_mtl_get_buffer: eval tensor '          node_4', offs =     8608, size =       40
mnist_mtl_eval: encoding node   5, op = SOFT_MAX
mnist_mtl_get_buffer: eval tensor '          node_4', offs =     8608, size =       40
mnist_mtl_get_buffer: eval tensor '           probs', offs =     8912, size =       40
mnist_mtl_get_buffer: eval tensor '           probs', offs =     8912, size =       40
mnist_mtl_eval: time elapsed = 0.001637
mnist_mtl_eval: probs[ 0] = 0.000000
mnist_mtl_eval: probs[ 1] = 0.000000
mnist_mtl_eval: probs[ 2] = 0.000000
mnist_mtl_eval: probs[ 3] = 0.000000
mnist_mtl_eval: probs[ 4] = 0.000000
mnist_mtl_eval: probs[ 5] = 0.000000
mnist_mtl_eval: probs[ 6] = 0.000000
mnist_mtl_eval: probs[ 7] = 0.000000
mnist_mtl_eval: probs[ 8] = 0.000000
mnist_mtl_eval: probs[ 9] = 1.000000
mnist_mtl_free: deallocating
main: predicted digit is 9

@emidoots

@ggerganov I'm a bit curious/interested in this approach; I like that you are trying to separate ggml and the GPU implementation layer like this.

I'd be keen to make a quick attempt at executing the ggml graph output you have here using WebGPU from Zig; but I'm not sure exactly how to piece that output together (or even read it, necessarily) - so I wonder if you'd consider adding a C example or something that executes it on the CPU and validates the results it gets, so I could better understand how it works?

@ggerganov
Owner Author

@slimsag Will try to prioritise this soon and finalize the export format + a CPU and/or Metal example

@JohnnyOpcode

Netron supports many formats of exported graphs already. I think GGML could be easily added.

https://github.com/lutzroeder/netron

@ggerganov ggerganov force-pushed the cgraph-export branch 4 times, most recently from 6264c52 to eed3eac Compare May 24, 2023 10:09
@ggerganov
Owner Author

Bit of slow progress here, but I think it is starting to work out
Hopefully will have a working prototype over the weekend

@Sslithercode

I've been waiting for this for months. Nothing has been as easy to use as llama.cpp.

@ggerganov ggerganov changed the title ggml : cgraph export brainstorming ggml : cgraph export/import/eval example May 27, 2023
@ggerganov ggerganov marked this pull request as ready for review May 27, 2023 13:10
@ggerganov
Owner Author

ggerganov commented May 27, 2023

Ok, I'm finally at the interesting part. I have the ggml compute graph exported together with all tensor data and work buffers. Now I have to map this to the GPU and implement the operators. For MNIST we have just 4 operators:

  • F32 add
  • F32 mul mat
  • F32 RELU
  • F32 SOFT_MAX

Regarding the memory mapping, it looks like I need to use MTLHeap to map the ggml contexts and then create the MTLBuffers corresponding to the compute graph tensors as views of the heap(s) using newBufferWithLength:options:offset:
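The "views of the heap(s)" idea comes down to pointer arithmetic: for each tensor, find the memory region that contains its data pointer and take the byte offset from the region's base, which is the kind of `offs`/`size` pair printed by `mnist_mtl_get_buffer` in the log above. The sketch below shows only that lookup in plain C. `mem_region`, `buffer_view` and `find_view` are hypothetical names, and no Metal API is called; on the Metal side, the resulting offset and size would be what gets passed to `newBufferWithLength:options:offset:`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one context's memory (e.g. "data" or "eval"). */
struct mem_region {
    const char    *name;
    const uint8_t *base;
    size_t         size;
};

/* Hypothetical result: which region holds the tensor, and where. */
struct buffer_view {
    int    region; /* index into the regions array */
    size_t offs;   /* byte offset from the region base */
    size_t size;   /* tensor size in bytes */
};

/* Locate a tensor's data inside one of the regions.
 * Returns 0 on success, -1 if the tensor does not fit inside any region. */
static int find_view(const struct mem_region *regions, int n_regions,
                     const void *data, size_t size, struct buffer_view *out) {
    assert(regions != NULL && out != NULL);
    const uintptr_t p = (uintptr_t) data;
    for (int i = 0; i < n_regions; i++) {
        const uintptr_t lo = (uintptr_t) regions[i].base;
        const uintptr_t hi = lo + regions[i].size;
        /* the whole [p, p + size) range must lie inside the region */
        if (p >= lo && p < hi && size <= (size_t) (hi - p)) {
            out->region = i;
            out->offs   = (size_t) (p - lo);
            out->size   = size;
            return 0;
        }
    }
    return -1;
}
```

Keeping the lookup separate from buffer creation means the same offsets could be reused by another backend that supports sub-allocating from one large buffer.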

Everything should go into a single MTLCommandBuffer

@philipturner

> Everything should go into a single MTLCommandBuffer

Even though that command buffer takes multiple milliseconds, it won't cause a UI hitch. The Apple GPU can execute two separate command buffers concurrently from different MTLCommandQueues. The only stipulation is that no single kernel invocation within the command buffer takes more than 16 ms. I recommend using Metal Frame Capture if possible (a bit buggy though).

@ggerganov ggerganov changed the title ggml : cgraph export/import/eval example ggml : cgraph export/import/eval example + GPU support May 28, 2023
@ggerganov
Owner Author

ggerganov commented May 28, 2023

This is now working as expected and can serve as a proof-of-concept for offloading a ggml compute graph to be evaluated on the GPU via Metal (or some other framework, like CUDA). There are still many things to be careful about and it's easy to mess things up, but I think with time I will be able to make it easier to work with.

Before merging this, I will move the new import / export functions to the core ggml library (currently, they are in common).

After merging, the next step will be to implement LLaMA inference with the same approach.
This will involve implementing the missing matrix-vector multiplication kernels, RoPE kernel, Norm kernel + solving the "dynamic shape" problem where some of the tensor shapes depend on the number of input / predicted tokens.

Review comment on examples/mnist/main-mtl.m (outdated, resolved)
@ggerganov ggerganov merged commit 3b697a2 into master May 29, 2023
@ggerganov ggerganov deleted the cgraph-export branch May 29, 2023 16:28
5 participants