-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
read: cache abbreviations at offset 0 #628
Conversation
This would work perfectly for us, we wouldn’t even need any code changes vs the normal gimli calls. Drawbacks as you said is a Mutex, plus the additional public field on |
Maybe you could use an |
Perhaps what I'll do is use |
Using |
Another option might be to make it generic over traits from
We do need to support multi-threaded still. The |
I think the locking question boils down to doing all this lazily. In my workaround I just pre-parse abbrevs at offset 0 at the very beginning when constructing the |
Right, but do you think that interior mutability a problem? If we didn't need to support
Can you expand on how that does not make it easy? The current API design is because we don't have a strong commitment to stability, and having public members is meant to make things more flexible for users. For example, I think your workaround is only possible because of the public members? |
Indeed, the fields being public was one reason why I was able to work around this in the first place. I was just thinking that because of public members and struct literals, you don’t have a constructor to do work in. So you would either have to do lazy parsing with internal mutability, or put the burden on the user to correctly initialize the cache ahead of time. Which I found out I might have done wrong and was too overeager to just parse at offset 0 and propagate EOF errors for objects that have no abbrevs at all. I don’t think there is anything wrong with that interior mutability, and I think its a good tradeoff to have locking there. |
BTW, I blogged about this issue here: https://swatinem.de/blog/abbreviations/ |
@bjorn3 Are you aware of any examples of using |
Unfortunately it doesn't appear to support |
40534df
to
de3b572
Compare
This optimises for the case where there is a single set of abbreviations that is used for all units.
Still looking into the code, but: I would normally use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me with some minor comments below.
src/read/lazy.rs
Outdated
if !value_ptr.is_null() { | ||
// SAFETY: all writes to `self.value` are pointers obtained from `Arc::into_raw`. | ||
unsafe { | ||
Arc::from_raw(value_ptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using Arc::from_raw
or Box::from_raw
specifically just to run the Drop
implementation, I like to wrap it in drop()
for clarity:
Arc::from_raw(value_ptr); | |
drop(Arc::from_raw(value_ptr)); |
src/read/lazy.rs
Outdated
Ordering::Release, | ||
Ordering::Relaxed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think some kind of explanation of why these orderings are chosen is called for here.
When I see SeqCst
I feel okay (as far as I feel okay about atomics code in general!) and when I see AcqRel
I know to double check that basically we are dealing with one memory location and not relying on orderings between operations on different memory locations, and when I see anything else I basically know that I am out of my depth. Sometimes I feel better if there is a link to some comment in std
's implementation that explains why weaker orderings are okay in this specific case, and how the code I am actually reading just does the same thing std
is doing. But yeah I definitely don't have the hubris required to say I understand justifications for weaker orderings. Mostly I just try to make sure that whoever wrote the code actually really understands what they're doing and has a good justification, and the way I do that is by asking them to write a comment :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I've improved my understanding and decided that Relaxed
is wrong here, and I've added comments.
src/read/lazy.rs
Outdated
// Only written once with a value obtained from `Arc<T>::into_raw`. | ||
value: AtomicPtr<T>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment about how this is holding a ref count for us, so we can always safely load it and return a Arc::clone
of it without fear of someone else racing in, decrementing its ref count to zero, and dropping it while we are still managing it / giving out clones?
This optimises for the case where there is a single set of abbreviations that is used for all units.
Closes #626
Closes #627