Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency issue with the UnitValueAbbreviationLookup #919

Closed
lipchev opened this issue Apr 12, 2021 · 11 comments · Fixed by #1159
Closed

Concurrency issue with the UnitValueAbbreviationLookup #919

lipchev opened this issue Apr 12, 2021 · 11 comments · Fixed by #1159
Assignees
Labels

Comments

@lipchev
Copy link
Collaborator

lipchev commented Apr 12, 2021

Describe the bug
There is a possible concurrency issue with adding the missing abbreviation here (likely applicable to the overloads as well).

To Reproduce

  1. Add nuget UnitsNet 4.9.0 to a .NET 4.0 project
  2. In parallel, try to get the Unit for a random string (representing the missing abbreviation)
  3. Index out of bounds exception would eventually be thrown due to the concurrent assignment to the lookup dictionary (I get this error occasionally when building the model objects in parallel- at one spot I need to determine the concentration type from the abbreviation and thus hit the missing-lower-case-abbreviation-assignment)

Expected behavior
The GetMethod simply returns the empty list (I don't know if it's worth storing every single string that is passed in the method).

As for the actual mapping vs getting - it is very unlikely that someone would actually MapUnitAbbreviations while at same time parsing in a another thread, using the same AbbreviationCache, but I could see a use case where abbreviations are added asynchronously so I don't know I guess we could use a Concurrent collection instead..

See, now that I think about it- even if mapping inside a static constructor (as I do) there is always the possibility for somebody (unknowingly) launching an asynchronous operation that does any string-related operation that crashes the whole thing because it is reading some abbreviation while my InitializeCustomAbbreviations is still executing.

@lipchev lipchev added the bug label Apr 12, 2021
@angularsen
Copy link
Owner

angularsen commented Apr 18, 2021

ConcurrentDictionary will probably solve the problem of concurrent access to our library, but as you say, the application will have to ensure it initializes all the desired writes before it attempts any reads. Typically on startup.

@lipchev
Copy link
Collaborator Author

lipchev commented Apr 18, 2021

Right, there are 3 options that I can see (yours being the second):

  1. Ignore all missing abbreviations- returning an empty list each time: client-side synchronization for read/write operations: e.g. on startup
  2. Store the empty list for the missing abbreviation in ConcurrentDictionary : fixes the current issue, but still requires synchronization for read/write operations, necessary cause of the List<int> that's in the mix
  3. Making it all thread-safe- also replacing the List with something that's thread safe or given that currently we're doing a copy-on-read anyways, a simple lock on SyncRoot should be enough (however this begs the question why were we doing Distinc().ToList() in the first place- if we could have instead used a simple SortedSet)

I personally don't think we should store the passed-in string at all: this would leave us with the following tradeoff - lock-on-read-and-across-the-board-thread-safety-guarantee vs no-thread-safety-guarantees (+ the code size difference)

In summary, if we say that late-bound mappings on UnitAbbreviationCache.Default should be avoided (especially ones setting the default abbreviation), then I'd say that option 1 is the way to go (+ maybe using a SortedSet)
What do you think?

@lipchev
Copy link
Collaborator Author

lipchev commented Apr 18, 2021

Correction: I was thinking of LinkedHashSet when talking about SortedSet, but anyways- I believe even the HashSet preserves the insert order (until you start removing stuff).

PS: Ok, that's wouldn't be an ideal solution cause you'd have to re-create the whole set when setting a default abbreviation, but still better than doing Distinct().ToList() every time..

PS2 Is there any actual need for Distinct().ToList() : it seems to me like the Add method already discards any duplicates

@angularsen
Copy link
Owner

Gotcha. No, I agree, it seems like a flaw that we store the passed in abbreviation if there is a miss.
I like option 1.

@lipchev lipchev self-assigned this Apr 29, 2021
@lipchev lipchev linked a pull request May 1, 2021 that will close this issue
@stale
Copy link

stale bot commented Jun 29, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jun 29, 2021
@stale stale bot closed this as completed Jul 8, 2021
@Jagailo
Copy link
Contributor

Jagailo commented Nov 17, 2022

Faced this problem.
I'm using the library at different endpoints on the server.
When calling endpoints at the same time after starting the server, I get an exception.

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.

image

image

If I call the endpoints one by one, then everything works fine.
It can be fixed?

@angularsen
Copy link
Owner

@Jagailo Not sure why this effort stopped, but it seems fixable.

@lipchev Your PR #924 seems almost complete. Option 1) seems implemented, but I agree we should probably also add some thread-safety either with a lock or some concurrent collection type.
Do you want to resume that PR?

@angularsen angularsen reopened this Nov 17, 2022
@stale stale bot removed wontfix labels Nov 17, 2022
@Jagailo
Copy link
Contributor

Jagailo commented Nov 18, 2022

Not sure why this effort stopped, but it seems fixable.

I think there were no more reports about this bug, because my case is quite specific.
(How many people are looking for units of all scopes at once, and also in several threads on the server?..)

If you are wonder, I'll clarify my case detailly.
So, I needed to get a list of all possible units with only an abbreviation. I just used the UnitParser.Default.TryParse() method in a loop through all the enums in the library. And there was this exception at parallel calls.

To bypass this limitation, I reproduced the TryParse() method using reflection. Empirically, I found out that 2 calls cannot share the UnitValueAbbreviationLookup object. So I just locked it.

image

@angularsen
Copy link
Owner

Thanks for the details.

At any rate, there are two things going wrong here:

  1. There should be no writes on TryParse, this seems like a bug
  2. There should be a lock or some concurrency checking, since we do support writes to add custom abbreviations

angularsen added a commit that referenced this issue Dec 2, 2022
Fixes #919
Duplicates #924 

Do not update collection on get.

Co-authored-by: lipchev <[email protected]>
angularsen added a commit that referenced this issue Dec 3, 2022
Refs #919 

Use `ConcurrentDictionary` for lookups.
Use `IReadOnlyList` and treat list instances as immutable.
Reduce `ToList()` allocations.
Use `Lazy` to compute all abbreviations.
@angularsen
Copy link
Owner

@Jagailo Two separate fixes for concurrency issues are now merged. Would you mind giving it a spin and report back if it helps for you?

cc @lipchev

Release UnitsNet/5.0.0-rc003 · angularsen/UnitsNet

@Jagailo
Copy link
Contributor

Jagailo commented Dec 7, 2022

I tried on version 5.0.0-rc004, everything works fine 👍
Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants