Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TLA Spec for ordered replication in Ambry with Replica Tokens #2

Open
wants to merge 4 commits into
base: replication
Choose a base branch
from

Conversation

ankagrawal
Copy link
Owner

@ankagrawal ankagrawal commented May 3, 2023

This Spec fixes the limitation in AmbryReplication spec that the replica local log is not ordered, and that replication doesn't make progress via replica token.

ankagrawal added 2 commits May 3, 2023 11:27
Each replica tracks a replicaToken which is the offset upto which is it up-to-date with its peers.
A replica only sends back only message from its log on recieving a replication request.
@ankagrawal ankagrawal changed the title [WIP] TLA Spec for ordered replication in Ambry with Replica Tokens TLA Spec for ordered replication in Ambry with Replica Tokens Sep 26, 2023
* Ambry generates the blobid for each user requests.
* In this spec, everytime new blob is written to Ambry, Ambry increments a zero-based counter and uses that as the blobid.
*)
BlobIds == 0 .. NumBlobs-1

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird nit: Any reason why these must be zero based?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. Just easier to think as a software engineer.

\/ RecvPut
\/ \E blobid \in BlobIds:
\/ RecvUpdateTtl(blobid)
\* \/ \E blobid \in BlobIds:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need these?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will uncomment.

/\ blobIdCtr <= NumBlobs \* To Remove
/\ \E r1, r2 \in Replicas:
/\ r1 # r2
/\ orderedStoredMessagesPerReplica' = [r \in DOMAIN orderedStoredMessagesPerReplica |->

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little strange to me, but maybe it's because I don't understand it? It seems like we're setting orderedStoredMessagesPerReplica' to be some element of orderedStoredMessagesPerReplica by setting it equal to orderedStoredMessagesPerReplica[r]. Shouldn't it be something like orderedStoredMessagesPerReplica' = [orderedStoredMessagesPerReplica EXCEPT ....]?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh wait. I see. You've effectively done what EXCEPT does with r \in DOMAIN... Ok, in that case I think this works.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, other question.

\E r1, r2 \in Replicas. It seems that we do the operation on two Replicas at a time for all these Recv functions. Why is that? AFAICT, it looks like we do symmetrical operations for both replicas.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do the put operation on two replicas at a time because thats the quorum for each operation. Each operation is sent to two replicas and is successful only if applied to two replicas.

But you have a good point. There is no guarantee for the operation to be atomic, and the above implementation makes it atomic. Let me try to fix it and update.

/\ inflightReplicationResponse[r] # {}
/\ \E msg \in inflightReplicationResponse[r]:
/\ \E message \in msg.messages:
/\ IF message.blobmessage = UPDATE_TTL THEN

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe pull some of these out to UPDATE_TTL/PUT/DELETE just for readabilities sake if possible?

RecvReplicationResponse(r) ==
/\ inflightReplicationResponse[r] # {}
/\ \E msg \in inflightReplicationResponse[r]:
/\ \E message \in msg.messages:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any need for ordering? Are all messages downloaded in parallel in prod?

ELSE
orderedStoredMessagesPerReplica[rr]
]
/\ \E message \in msg.messages:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So. This \E message \in msg.messages may refer to a different message then the one picked out in the above clause that looks identical. Is that ok? My initial thought looking at this was that we could combine these three for message update/offset update/token update. Buuuut.... if it's actually desirable they diverge, then maybe this is the way to go?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. I will fix this. I totally missed the fact that \E can pick any entry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants