Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-48772][SS][SQL] State Data Source Change Feed Reader Mode #47188
[SPARK-48772][SS][SQL] State Data Source Change Feed Reader Mode #47188
Changes from 1 commit
1ade442
98bf8ec
fb890ae
db45c6f
1926e5e
24c0351
d4a4b80
42552ac
24db837
adde991
d3ca86c
5199c56
ce75133
84dcf15
22a086b
c797d0b
5921479
e5674cf
c012e1a
ff0cd43
2ad7590
43420f6
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly saying, 3rd party state store providers can implement their own format of delta/changelog files. We need to define an interface for change data reader, and have a built-in implementation of the interface which works for both HDFS and RocksDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think users can extend
StateStoreChangeDataReader
in StateStoreChangelog.scala to implement their own change data reader. There are built-in implementation examples in both providers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But StateStoreChangeDataReader is only helpful when they have very similar implementation with the built-in one, right? If they have totally different approach of maintaining changelog, they are going to reimplement everything and it is not clear what needs to be implemented.
An interface is to declare the spec. Whenever we design pluggable functionality, please be sure to define the spec and describe the spec as interface. Don't make others struggle with understanding spec from actual implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point now. I would say to make the interface easier for users to implement, why don't we use the superclass of
StateStoreChangeDataReader
which isNextIterator
as the interface. It has two well defined extendible methodgetNext
andclose
and it is also stable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry one last small suggestion - Could we please put the information of tuple in the method doc? Now it's not self-explained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now this needs to be also available to the interface method doc.