Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace map with unordered_map #484

Merged
merged 1 commit into from
Jul 13, 2023
Merged

Replace map with unordered_map #484

merged 1 commit into from
Jul 13, 2023

Conversation

stephomi
Copy link
Contributor

@stephomi stephomi commented Jul 8, 2023

Context: #483
Case study: subtract a 150k verts mesh on a 1.5M verts one (with UVs).

The edge map is one of the bottleneck when doing boolean on high poly.

Total time of GetMeshGL():

- Time
std::map (current) 2555 ms
std::unordered_map 1985 ms
phmap::flat_hash_map 1450 ms
(without UV) 1000 ms

phmap is header-only and from https://github.com/greg7mdp/parallel-hashmap.
For now I didn't include it, but the performance is not negligeable.
Should I do it? (git submodule? just copying the include directory?)

@codecov
Copy link

codecov bot commented Jul 8, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.03 🎉

Comparison is base (ba5fead) 90.31% compared to head (28a25bd) 90.35%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #484      +/-   ##
==========================================
+ Coverage   90.31%   90.35%   +0.03%     
==========================================
  Files          35       35              
  Lines        4389     4396       +7     
==========================================
+ Hits         3964     3972       +8     
+ Misses        425      424       -1     
Impacted Files Coverage Δ
src/manifold/src/boolean_result.cpp 97.49% <100.00%> (+0.02%) ⬆️
src/manifold/src/manifold.cpp 95.34% <100.00%> (+0.43%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@elalish
Copy link
Owner

elalish commented Jul 9, 2023

Thanks for the perf data! What device are you testing on and which backend are you using? Would you mind adding a comparison with the hash table we already use? #483 (comment)

First, because I'd like to benchmark its performance anyway, and second, because it'd be nice to use the same hash table everywhere if we find a favorite.

@stephomi
Copy link
Contributor Author

stephomi commented Jul 10, 2023

I made another branch to test.
If you want to test, you can simply check it out, everything is included.
(to switch between method, there is a #define at the top of hashtable.h).

Device is MacBookAir M1, backend is TBB but it doesn't matter since the two sections of the code I changed are both serial.
It's the same test (1.5M and 150k subtract).

- manifold.cpp boolean_result.cpp
std::map (master) 670ms 782ms
std::unordered_map 227ms 437ms
phmap::flat_hash_map 120ms 153ms
hashtable.h 165ms 200ms
hashtable.h + identity 50ms -
vector<anker::svector<>> 50ms 74ms
(without UV) 0ms 0ms

For now a simple vector<vector>> linear search gave the best result (hashing mesh edges).
Maybe it's possible to improve the table result by tweaking the hash, but the vector solution is likely the best one (speed and memory). <- with the assumption that most vertex have ~6 vertex neighbor, of course if a vertex has 20k neighbor... an hash would be a safer solution


Not sure if I understand how hashtable.h is working but it's not a drop-in replacement to map/unordered_map.
I templated it (very quick-n-dirty way) to handle arbirtrary key, not just u64.
Also the hashtable is not dynamic and you need to provide a correct size at first.
Not sure what's the expected value but when I gave the exact number of entries, it crashed afterwards.
So I'm allocating twice the number of entries. Edit: ah that's the *2 here.

@elalish
Copy link
Owner

elalish commented Jul 10, 2023

Indeed, given the high likelihood of degree-6 verts, I think your vector of vectors approach makes a lot of sense. I'm reading about the small vector lib, but I don't quite understand why it's an improvement over a std::vector? If std::vector gives about the same perf, let's just use that (I figure the compilers have lots of experience optimizing std::vector).

@pca006132
Copy link
Collaborator

I think those small vector libraries usually flatten the vector and perform some small size optimizations, so to reduce the number of pointer chasing when accessing an element (at the expense of more expensive move etc.). I thought I modified map with unordered_map a few months ago but it seems that this is reintroduced for other parts of the code, have to say that the naming of std::map is unfortunate...

It seems to me that using vector of vectors make sense in this case. I just got back home and is now dealing with jet lag, will look at it in more details tmr.

@stephomi stephomi force-pushed the nomad branch 2 times, most recently from 930925e to 4d1e82b Compare July 11, 2023 08:27
@stephomi
Copy link
Contributor Author

stephomi commented Jul 11, 2023

I figure the compilers have lots of experience optimizing std::vector

It's not a compiler thing but purely a malloc thing.
In my test I got ~1.5M elements, so it's at least 1.5M malloc (vs a few malloc for the svo approach).
It's ok if the malloc is smart enough to give region close together.

There's a slight gain with svo, but I'm fine with std::vector<std::vector though!
I mean getMeshGL went from 2555ms -> 1173ms just because of the extra prop (UV)!
Without the extra prop, it was 1050ms so I won't nitpick about ~20ms.

- manifold.cpp boolean_result.cpp
std::vector<> 67ms 80ms
anker::svector<> 50ms 74ms

Fyi boolean_result.cpp, here are the average size on my test (only 5 malloc, the rest is in-place)

size == 0 (2900)
size == 1 (1464930)
size == 2 (4932)
size == 3 (1734)
size == 4 (5)
size >= 5 (0)

For manifold.cpp I can reduce it down to 40ms (using only std::vector<int>) because we visit only one of prop neighbor.
Maybe double check that this assumption is correct.

I updated the PR. If you want svector, just tell me and I'll update it.

@stephomi stephomi force-pushed the nomad branch 4 times, most recently from 577413f to 8e35908 Compare July 11, 2023 09:54
Copy link
Collaborator

@pca006132 pca006132 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I think the gain from svector is pretty small, so it may not be worth it to introduce new dependency unless we can find cases in which the gain is very significant. The current performance with nested vector looks pretty good already.

@hrgdavor
Copy link

This looks significant. How moch perf gains are we looking here?

Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thank you for a significant speed up when handling properties! And also for not introducing an extra dependency. One fix and I think it's ready.

if (oldNumProp > 0) {
std::get<1>(key) = vert;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are something I just changed in #480 that I think we want to keep. The fact the tests still pass implies I need to add a test. This is to keep extra properties from being created when a mesh without properties is attached a mesh with properties.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok but maybe I'll need to revert to hashmap for manifold.cpp as it breaks most assumptions.
I'll check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way would be to use plain array for small size hashmap and hashmap for larger size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine, I can still use the std::vector<std::vector approach.
I made the previous assumption that one prop could belong to only 1 logical vertex but it's not the case (although gets flattened in the end).

@pca006132
Copy link
Collaborator

This looks significant. How moch perf gains are we looking here?

Seems pretty large when your mesh uses UV mapping.

@stephomi
Copy link
Contributor Author

stephomi commented Jul 12, 2023

This looks significant. How moch perf gains are we looking here?

Seems pretty large when your mesh uses UV mapping.

To clarify, it's not necassirly UV, but simply the presence of extra properties (mergeFromVert could still be empty)
I mentioned UV because I don't provide any other properties to save memory (avoiding potential crash on mobile during memory peak usage).

And yes the difference is big, to sum up:

- GetMeshGL(prop=3) GetMeshGL(prop>3)
master 1100ms 2639ms(!)
PR 1100ms 1270ms

The manifold creation itself takes 780ms (prop=3) 850ms (prop>3).

@stephomi stephomi requested review from elalish and pca006132 July 12, 2023 18:20
Copy link
Owner

@elalish elalish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

using Entry = std::pair<glm::ivec3, int>;
int idMissProp = outR.NumVert();
std::vector<std::vector<Entry>> propIdx(outR.NumVert() + 1);
outR.meshRelation_.properties.reserve(outR.NumVert() * numProp * 1.1);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is * 1.1 for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed it.

@elalish elalish merged commit ac2cfd8 into elalish:master Jul 13, 2023
@elalish elalish mentioned this pull request Nov 3, 2023
cartesian-theatrics pushed a commit to SovereignShop/manifold that referenced this pull request Mar 11, 2024
Replace map usage with vector
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants