-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Store file io props to allow re-build it #802
Merged
Merged
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we really expose this? Why not just derive
Serialize
/Deserialize
forFileIOBuilder
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike
TableMetadata
,FileIOBuilder
does not have a predefined struct layout. ImplementingSerialize
orDeserialize
forFileIOBuilder
could lead to users relying on it directly. By exposinginto_parts
, we give users the flexibility to make their own decisions. This approach also allows us to modify the API and adjust the internal layout in the future by changing the API directly without unintentionally breaking users' applications.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the goal of the
scheme_str
? If we want to have a specific impl, we could also pass that into the configuration itself. This is of course bound to a language, for examplepy-io-impl
and in Java it is justio-impl
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me.
I exposed the necessary elements as they are to build
FileIO
for now. I can address this in another PR that removes the requirement forscheme
to buildFileIO
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's exactly what I want. But user are not supposed to depend on the serialized format, but only the ser/de api provided by this library. The pros of this approach is that we don't need to expose internals fields to user, but only apis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Providing ser/de is a contract I don't want to establish with users. It implies a promise that users can deserialize the file I/O back at any time after serialization. Once we expose ser/de, users will inevitably try to use it in ways we never anticipated, such as storing it in a metadata service or transferring it between clusters.
There should be a much broader discussion about how we should handle serialization/deserialization across our project. Similar issues could occur in #774. However, I don't want to initiate that discussion here.
In my opinion, making an API-breaking change is much better than offering serialization/deserialization but breaking it unexpectedly.