SGX startup is slow due to quadratic TOML processing #2593
Comments
Yup, because we're using the wrong TOML constructs for this: we should use arrays, not dictionaries (which are slow, and the keys make no sense here). But this will be resolved when we completely fix #2076. |
Using TOML tables instead of TOML arrays also blocks my other PR: #2484. I started working on this transition. Here is the idea:
|
@dimakuv What about |
I have a branch in my local repo, I'll publish it after #2607 is merged. |
There are those other efforts related to partial manifests and HSM signing, and I'm not sure what the manifest structure should look like. In the case of partial manifests (i.e. the situation when you don't have all the trusted/protected files on your machine and you rely on externally provided hashes), wouldn't you want something like the following?
sgx.trusted_files = [
    { 'path' = '/q/werty', 'sha256' = 'deadbeef' },
]
# or maybe
[[sgx.trusted_files]]
path = '/asdf/zxcv'
sha256 = 'abcd'
Managing parallel arrays, while certainly possible to get right, might be more error-prone. |
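To make the parallel-array concern concrete, here is a minimal Python sketch. It works on plain lists and dicts shaped the way a TOML parser would return them, not on Graphene's actual manifest code. The helper names `pair_parallel` and `pair_records` are hypothetical; the `path` and `sha256` keys are taken from the snippet above.

```python
# Parallel-array layout: entry i of `paths` must line up with entry i of `hashes`.
paths = ["/q/werty", "/asdf/zxcv"]
hashes = ["deadbeef"]  # one hash was forgotten; nothing in the layout prevents this


def pair_parallel(paths, hashes):
    """Pair paths with hashes, refusing arrays that have drifted apart."""
    if len(paths) != len(hashes):
        raise ValueError(
            f"{len(paths)} paths but {len(hashes)} hashes; arrays are out of sync"
        )
    return list(zip(paths, hashes))


# Array-of-tables layout: each record carries its own hash, so the two
# fields of one file cannot be separated by an editing mistake elsewhere.
records = [
    {"path": "/q/werty", "sha256": "deadbeef"},
    {"path": "/asdf/zxcv", "sha256": "abcd"},
]


def pair_records(records):
    """Pair paths with hashes from self-contained records."""
    return [(r["path"], r["sha256"]) for r in records]
```

Note that a bare `zip(paths, hashes)` would silently drop `/asdf/zxcv` here, which is the kind of mistake the explicit length check in `pair_parallel` has to guard against and the record layout avoids by construction.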
Definitely doable, though I wouldn't consider it important. Forcing users to use an "array of two-field tables" sounds much more complicated than my current "array of file paths":
Anyway, my points are:
I am not aware of such scenarios. Can this really happen for |
Yes, there are at least two scenarios for
So we need to have a possibility of "partially finalised" manifest and to merge several manifests in various stages of finalisation. From this POV it's not internal anymore, unless you want some manifests that look like manifests, still unsigned, but you'd better not touch them by hand. If you'd like to preserve simplicity of an array of strings, |
I like this idea, it preserves simplicity for usual use-cases, but doesn't block more complicated ones. |
Ok, let me implement Woju's approach. |
So I tried this:
And got Python TOML error:
So yeah, Python's TOML parser doesn't support mixed arrays: uiri/toml#270. Actually, looking at this GitHub repo, the project seems to be dying? There has been no commit activity in the last couple of months (I think since January 2021). But this workaround works:
|
Oh nice, our C TOML parser doesn't support mixed arrays:
Well, the latest version supports it: cktan/tomlc99#51. I will update our TOML C parser to this latest version then. |
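For illustration, here is a rough Python sketch of how a mixed array could be consumed once the parser accepts it. It assumes each entry is either a plain path string or a table with `path` and `sha256` keys, as in the earlier snippet; the exact format adopted in the branch is not shown in this thread. The function name `normalize_trusted_files` is hypothetical.

```python
def normalize_trusted_files(entries):
    """Turn a mixed list of strings and tables into uniform records.

    Assumption: a bare string means "hash to be computed later", so its
    sha256 is recorded as None; a table may carry a precomputed hash.
    """
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Simple case: just a path, no externally provided hash.
            normalized.append({"path": entry, "sha256": None})
        elif isinstance(entry, dict):
            if "path" not in entry:
                raise ValueError(f"trusted file table without 'path': {entry!r}")
            normalized.append(
                {"path": entry["path"], "sha256": entry.get("sha256")}
            )
        else:
            raise TypeError(f"unsupported trusted file entry: {entry!r}")
    return normalized
```

With this shape, the usual case stays an array of strings, and only entries that need an external hash have to be written as tables.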
Ok, I implemented everything in my local branch. My Python SGX manifest is similar to Pawel's in terms of number of Python-internal files:
Old times:
New times:
About 5x improvement (looking at |
Description of the problem
Graphene-SGX startup is slow for manifests that have a lot of trusted_files.
Steps to reproduce
On an Ubuntu 18.04 machine, current master branch (c602e56). Try to run the Python example with graphene-sgx python -c "print('hello')".
Expected results
This should be relatively quick.
Actual results
The command takes 10 seconds:
It looks like python.manifest.sgx contains a lot of files (all of /usr/lib/python3, /lib/x86_64-linux-gnu, /usr/lib/x86_64-linux-gnu):
Stopping in GDB shows that time is spent in this loop:
It looks like toml_raw_in does a linear traversal of the whole trusted_files table.
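To show why a linear traversal per lookup adds up to quadratic work, here is a small Python model. It is only a sketch of the access pattern described above, not the actual Graphene or tomlc99 code; `linear_lookup`, `load_via_table`, and `load_via_array` are hypothetical names, and the table is modelled as a list of key/value pairs that is scanned from the start on every lookup.

```python
def linear_lookup(table, key, counter):
    """Scan (key, value) pairs from the start, counting key comparisons."""
    for k, v in table:
        counter[0] += 1
        if k == key:
            return v
    return None


def load_via_table(n):
    """Fetch each of n entries by key: one linear scan per entry."""
    table = [(f"file{i}", f"/lib/f{i}") for i in range(n)]
    counter = [0]
    for i in range(n):
        linear_lookup(table, f"file{i}", counter)
    return counter[0]  # 1 + 2 + ... + n = n * (n + 1) / 2 comparisons


def load_via_array(n):
    """Walk an array of n entries by index: one visit per entry."""
    array = [f"/lib/f{i}" for i in range(n)]
    visits = 0
    for _ in array:
        visits += 1
    return visits
```

In this model, 1000 entries cost 500500 comparisons through the table and 1000 visits through the array, and doubling the number of entries roughly quadruples the table cost.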