-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Geojson optimization #444
Geojson optimization #444
Conversation
I haven't built all ngen tests locally yet, so I didn't see these failing until the PR went up. I'll dig into these a little bit and see what is happening. |
@@ -277,7 +274,7 @@ namespace geojson { | |||
} | |||
} | |||
else if (child.first == "id") { | |||
id = child.second.data(); | |||
id = std::move(child.second.data()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that std::move
might end up ripping stuff off off child
here. It's hard to tell here without having example docs in front of me, but it looks like there's a chance that these std::move
operations might modify the tree and bubble up the updates, harming reuse. There might need to be some comments highlighting this so it's easier to identify while skimming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once processed and recorded does the state of that node in the ptree really matter? The reference in the tree should still be valid if I understand the semantics of move correctly, but even if it isn't does it matter in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I'm not sure - if the ptree
isn't used afterwards, no one cares, but std::move
stands to "replace" values (I'm assuming arrays?) when moving. Strings, for example, stand to be converted to empty strings after movement (new location still gets the correct value, though). This specific line isn't a good example; I just picked one of the moves to put a comment about moves. I think the biggest risk is when moving stuff into vectors. I'm not the expert, so I can easily be missing 90% of what's going on. That's sort of the reason I suggested adding comments instead of actually changing code. If someone comes along further down the line (most likely a goober like me) and expects the state of the tree to remain the same they might be in for a rude awakening.
std::move
was a really good idea, so I'm not disputing its use.
How computationally expensive would it be to wrap the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only concern is the addition of a raw pointer to JSONProperty, I think a shared pointer or unique pointer should be usable, depending on if ownership is being taken of the data.
2a6f30c
to
c62c4ed
Compare
1fd2673
to
ae4ac20
Compare
Finally, tests passing! I also pushed a couple clean up commits and some additional |
This is really just a view to backing storage that is managed via stl containers in the This pointer is never allocated to anything, and the only way to set it is via the constructor of the struct by passing the reference it should point to when the struct is created. |
After looking through #443, it became clear that a large part of the initialization overhead in
FeatureCollection
was due to an excessive amount of copies of strings (and some other data). This PR utilizedstd::move
to reduce the amount of copies of data that go through the system. Also, I took the opportunity to finish implementing theJSONProperty
using theboost::variant
for all data types.This work when used on the partition generator with the optimization in #443 was able to reduce the runtime of the conus run to about 8 minutes.
I noticed a point of diminishing returns working on this code, and investigated briefely. Turns out, on my machine, I'll likely never get better than about 8 minutes on the conus run. It literally takes about 5 minutes just to DESTRUCT the property tree as the stack unrolls. This is because I was pushing ~56GB RAM out of my 16GB phsical memory, so the swapping was intense.
To test this idea a bit, I replaced all geometries in the
catchment_data.geosjon
with a simplePoint
geometry, instead of aMultiPolygon
. I then ran the partitioning using that input, and completed the entire thing in 1 minute and 20 seconds!Additions
Removals
Changes
std::move
JSONProperty
using theboost::variant
that was introduced to utilizeList
parameters inngen
Testing
Notes
JSONProperty
and the variousFeature
types.Todos
Checklist
Target Environment support