Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SeqNum field to advertisement specification #30

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

gammazero
Copy link
Contributor

No description provided.

@gammazero gammazero requested a review from masih February 21, 2024 21:04
Copy link
Member

@masih masih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine thanks @gammazero 👍

Do you think we need a separate doc for this?

There are a few edge cases to expand on, e.g.:

  • what should new providers do.
  • what should existing providers do.
  • what does it mean to reset the sequence number of allowed at all.
  • how it would impact the advertisement signature and its verification.
  • how should indexers interpret its presence, validation, etc.

A separate doc would also make it clear what's current and what's being added, similar to how IPFS IPIP process functions.

@gammazero
Copy link
Contributor Author

gammazero commented Feb 22, 2024

Do you think we need a separate doc for this?

Does not seem necessary, since this only adds information and does not change protocol behavior. I will add a new subsection to the doc, and then we can decide if a separate doc is warranted.

There are a few edge cases to expand on, e.g.:

  • what should new providers do.

Use sequence numbers. (added to spec)

  • what should existing providers do.

Use sequence numbers (added to spec)

  • what does it mean to reset the sequence number of allowed at all.

Reset means set back to 0, allowed when there are 9007199254740992 advertisements on the chain. (clarified in new subsection)

  • how it would impact the advertisement signature and its verification.

If present then it is included in the data that the advertisement signature is calculated over. (in doc)

  • how should indexers interpret its presence, validation, etc.

Indexers are not required to validate that SeqNum is monotonically increasing, but an incorrect sequence number may result in incorrect indexing information or may be ignored resulting in some indexing features being unavailable for that chain's provider. If an indexer depends on the sequence number for more than display and reporting purposes, it can choose not index advertisement chains that do not have a monotonically increasing SeqNum.

A separate doc would also make it clear what's current and what's being added, similar to how IPFS IPIP process functions.

No behavior change, only additional info, so seems like new doc not needed. However, if an IPIP process is the way we want to handle all protocol updates, then we can do it that way.

@gammazero gammazero requested a review from masih February 22, 2024 17:46
IPNI.md Outdated

If `SeqNum` is present then it is included in the data that the advertisement signature is calculated over.

Indexers are not required to validate that `SeqNum` is monotonically increasing, but an incorrect sequence number may result in incorrect indexing information or may be ignored resulting in some indexing features being unavailable for that chain's provider. If an indexer depends on the sequence number for more than display and reporting purposes, it can choose not index advertisement chains that do not have a monotonically increasing `SeqNum`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too forgiving; For sequence number to be useful it should provide guarantees to the observer.

Copy link
Contributor Author

@gammazero gammazero Feb 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the above the above policy of tolerance, I see these ways to deal with bad sequence:

  1. Ignore all ads until an ad has the next expected sequence is put on the chain. For example:
    1 2 9 4 5 6 3 4 5 6... Here 9 4 5 6 are ignored and indexing is resumed at 4
  2. Ignore all ads that do not match the expected sequence. Example:
    Expect: 1 2 3 4 5 6 7 8 ...
    Have: 1 2 9 4 5 6 3 4 ...
    Here 9 is ignored, then 3 4 and all that follow are ignored until the sequence lines up with expected.
  3. Ignore all ads until sequence number is reset to 0.
  4. Stop indexing chain and require a new chain be created.

The first two can prevent different indexers from reaching consistency, since depending on where they start indexing, the errors may look different. Also, either of these strategies can cause the remainder of the chain to be ignored until the appropriate correction is put into an advertisement.

Number 3 also ignores the chain until a specific correction (reset to 0) is put into an advertisement. This also makes sequence completely optional by allowing a zero value that resets sequence with each ad.

Number 4 seems unnecessarily harsh and wasteful, and may not be something that providers will be willing to even do.

So, that leaves us with the original strategy of tolerance. A wrong sequence may cause some unreliable distance computations when computing the distance of ads that span one or more bad sequence numbers. Fortunately, after enough ads with monotonically increasing sequence numbers are added to the chain, these errors will disappear as they will be sufficiently in the past that all but very far behind indexers will even see them. Tolerance leads to a self-correcting system that will allow different indexers to reach consistency.

Do you have other suggestions about what is the correct way to handle an incorrect sequence number, or want to choose anything presented here?

Copy link
Member

@masih masih Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to the list Option 5:

  • No. 4 on providers side in that once they decide to use sequence numbers they should stick to it and increment it correctly.
  • Do height calculation at indexers as starting point.

Tolerance leads to a self-correcting system

...with the right incentive. I am not sure just now what that is.

Another question in my head is:

  • how different would incorrect sequence number be compared to incorrect signature? Providers don't even think about the signature. They just follow the protocol. What specific failure scenario we are thinking of that makes sequence number require tolerance from system design perspective.

Maybe it's worth fleshing those failure scenarios out.

@masih
Copy link
Member

masih commented Feb 23, 2024

Thank you for including the PR comments in the document.

Indexers are not required to validate that SeqNum is monotonically increasing

My sense is that flexing on this significantly reduces the effectiveness of Sequence numbers.

IPIP process

It seems like we are moving towards a world where there are different versions of IPNI spec, and indexers/providers that comply with different versions. This is not a blocker for now but I think we should think about introducing some kind of versioning to allow a user to infer what they can expect from an indexer instance. For example, assuming we make the validation of sequence number strict then another indexer can rely on it to concretely infer their lag in ingestion. They can also infer that some other indexer that does not support the newer spec cannot be relied upon to offer such insight. This allows backward compatibility while at the same time offer meaningful usefulness in exchange for protocol changes.

@@ -300,6 +307,17 @@ Specified protocols are expected to be ordered in increasing order.

If the `Metadata` field is not specified, the advertisement is treated as address update only.

#### SeqNum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like it'd be useful to return this in /ad/head as well, right (could also be by enabling returning just the root advertisement block either in the same or a separate call)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants