-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch deserialization to an iterator-based interface to avoid allocations #462
Labels
API-breaking
This might introduce incompatible API changes
area/deserialization
cpp-rust-driver-p0
Functionality required by cpp-rust-driver
performance
Improves performance of existing features
Milestone
Comments
This was referenced Dec 9, 2022
6 tasks
@avelanarius - is this complete for 0.8? If not, please re-target to 0.8.1 or 0.9. |
8 tasks
8 tasks
This was referenced Mar 12, 2024
Ref: #571 |
This was referenced Apr 11, 2024
wprzytula
added
area/deserialization
cpp-rust-driver-p0
Functionality required by cpp-rust-driver
labels
Jul 9, 2024
This was referenced Aug 8, 2024
5 tasks
4 tasks
This was referenced Oct 11, 2024
5 tasks
This was referenced Nov 10, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
API-breaking
This might introduce incompatible API changes
area/deserialization
cpp-rust-driver-p0
Functionality required by cpp-rust-driver
performance
Improves performance of existing features
Currently, after receiving a
Rows
response from the database, the driver unconditionally parses the row data into an equivalent ofVec<Vec<Option<CqlValue>>
. This results in at leastN + 1
allocations (whereN
is the number of rows) and possibly more if there are some compositeCqlValue
s in the result such as collections or UDTs.The most common way to interpret the data - I suppose it is the most common way, at least - is to convert the rows into some typed representation before consuming them. For example, if the data is fetched from a table which has three
int
columns, then they can be converted to(i32, i32, i32)
. Note that this representation contains all data in-line without any allocation, soVec<Option<CqlValue>>
is quite a costly intermediate representation.I propose to cut the middleman out and adjust our deserialization framework to avoid allocations. Instead of eagerly materializing all rows into a vector, let's keep them in an unserialized form and introduce an iterator interface to avoid allocations.
Here are some ideas on how the new interface could look like:
We should also introduce a new trait which allows deserializing values directly from a byte slice:
The
ColType
describes the type of the column and may aid in deserialization. It will be absolutely necessary if we implement the deserialization trait forCqlValue
, but may be useful forUDTIterator
as well. It is meant as a placeholder here, we can probably look over the existing types in the driver and choose something.For existing types which implement
FromCqlValue
, we can try introducing an automatic implimpl<T> WipName<'static> for T where T: FromCqlValue
. This is a good starting point for this task. Because of that I think it won't be possible to defineFromCqlValue
andWipName
on the same type, but I don't think it is a big problem. After some time, we can consider deprecating and removingFromCqlValue
.We should introduce procedural macros for
WipName
as we already have forFromCqlValue
.Doing all of this requires to introduce a breaking change by modifying
QueryResult
. However, I suspect that not many users are usingQueryResult::rows
directly and rather they are using typed API - in its case, I think it is possible to preserve the API as it is without compromising on the allocation counts, just change how it works under the hood. For those which really need the oldVec<Vec<Option<CqlValue>>
, we can introduce helper methods so that they convert the data on demand.As this is a change which will break much of the existing code, we should make it possible for users to keep using the old way of deserializing things. It should be possible to convert the raw rows into
Vec<Vec<Option<CqlValue>>
and then impose types using theFromCqlValue
and its friends. I think it is OK if using the old way requires some adjustments when upgrading, e.g. requiring users to add some method calls, provided that such changes are easy to make. After some time, we can deprecate those methods as well.Tasks
The text was updated successfully, but these errors were encountered: