Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshot preemption on OnDbChange #829

Closed
adiholden opened this issue Feb 20, 2023 · 6 comments
Closed

Snapshot preemption on OnDbChange #829

adiholden opened this issue Feb 20, 2023 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@adiholden
Copy link
Collaborator

Currently when calling OnDbChange we might preempt if when calling push data to channel.
preemption in this flow can cause bugs because we hold iterator and call registered funcitons from db_slice with the iterator.
If we preempt the iterator might become invalid

@adiholden adiholden added the bug Something isn't working label Feb 20, 2023
@adiholden adiholden self-assigned this Feb 20, 2023
@adiholden
Copy link
Collaborator Author

adiholden commented Feb 20, 2023

My suggested solution:
Today we preepmt on this flow if the serializer is not the default one, in this flow we must create a temporary serializer and after the entry is serialized we push it to channel.
We dont use the default serializer in case that the entry db index is different than the db index of the default serializer.
What I suggest is adding the logic of the db index serializing to the serializer, so if we want to serialize entry different from the last entry serialized we will serialize the SELECTDB and the db index.
This logic in the consume channel can than be dropped.
After this change we will not need to call push to channel from the on OnDbChange callback

@romange @dranikpg

@romange
Copy link
Collaborator

romange commented Feb 20, 2023

Does replication_test write into multiple datasets? Does it cover this case?

@romange
Copy link
Collaborator

romange commented Feb 20, 2023

What I suggest is adding the logic of the db index serializing to the serializer, so if we want to serialize entry different from the last entry serialized we will serialize the SELECTDB and the db index.

you are suggesting that the logic inside RdbSaver::Impl::ConsumeChannel won't have anything related to db_index and RDB_OPCODE_SELECTDB and will just write blobs?

@dranikpg
Copy link
Contributor

This would save us preempting and quite easily, I agree.

The bigger issue I see is that we don't provide memory capped back pressure neither on same-db changes nor with the new approach. We have no upper bound on the buffer size, but in reality we should have one. Reordering the cb calls with the latest changes only makes it more likely.

@adiholden
Copy link
Collaborator Author

@romange yes

@dranikpg
Copy link
Contributor

Does replication_test write into multiple datasets? Does it cover this case?

Yes, the numbers are fairly small, but the callback was running often (last time I checked when writing the seeder) https://github.com/dragonflydb/dragonfly/blob/main/tests/dragonfly/replication_test.py#L27

adiholden added a commit that referenced this issue Feb 26, 2023
* bug(snapshot) : Do not preempt inside OnDbChange

Signed-off-by: adi_holden <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants