-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loading table from metadata file directly. #246
Comments
Hi @liurenjie1024 , pub fn from_metadata(
metadata: TableMetadata,
table_ident: TableIdent,
file_io: FileIO,
) -> Self {
Self::builder()
.metadata(metadata)
.identifier(table_ident)
.file_io(file_io)
.build()
} or more ergonomic like the Python version - something like this (that's the way I have implemented this): async fn from_metadata(metadata_file_location: &str, file_io: FileIO) -> anyhow::Result<Table> {
let metadata_file = file_io.new_input(metadata_file_location)?;
let mut metadata_file_reader = metadata_file.reader().await?;
let mut metadata_file_content = String::new();
metadata_file_reader
.read_to_string(&mut metadata_file_content)
.await?;
let table_metadata = serde_json::from_str::<TableMetadata>(&metadata_file_content)?;
Ok(Table::builder()
.metadata(table_metadata)
.identifier(TableIdent::from_strs(["static_ns", "static_table"]).unwrap())
.file_io(file_io)
.build())
} either way would be happy to hear what you think |
Hi, @a-agmon I think both method could be provided and the second version could be implemented as a simple wrapper as the first one. What I want to add is that instead of implement it directly on pub struct StaticTable(Table);
impl StaticTable {
pub async fn from_metadata(metadata_file_location: &str, file_io: FileIO) -> anyhow::Result<Self> {
...
}
pub fn scan(&self) -> TableScanBuilder<'_> {
TableScanBuilder::new(self.0)
}
pub fn metadata(&self) -> TableMetadataRef {...}
} The reason we need a |
Thanks @liurenjie1024 |
Hi @liurenjie1024 , |
Hi, @a-agmon I've reviewed the pr and it looks great! I've left small comments to improve it. |
From the discussion: https://apache-iceberg.slack.com/archives/C05HTENMJG4/p1710392414486259
Should simply returning an error for modification operations be acceptable? The current design requires users to write separate logic for |
Yeah, that's one approach. But The motivation behind Is that we want to do it in a type safe way without runtime check. |
Do you think it's a good idea to provide both ways? |
It's kind of weird to me to have both of them. Is there any concrete use case? |
Database like databend supports loading iceberg table from both catalog or paths of a storage services. For example: CREATE TABLE <table_name>
ENGINE = Iceberg
LOCATION = 's3://<path_to_table>'
CONNECTION_NAME = '<connection_name>' If we provides enum IcebergTable {
Table(iceberg::Table), // build from catalog
StaticTable(iceberg::StaticTable) // build from a path of table metadata
} |
Perhaps we can also implement |
Make sense. Will following code work for you?
|
The issue I encountered is that the |
Good point, how about allowing access inner Table as unsafe? |
If that still not works for you, I agree that we provide both apis. |
Providing a Alought I still stick with my idea that By the way, |
How about this:
|
Maybe we can also incorporate something like this at this stage. like adding a readonly property that will default to false, while the StaticTable sets it as true. /// Table represents a table in the catalog.
#[derive(TypedBuilder, Debug)]
pub struct Table {
file_io: FileIO,
#[builder(default, setter(strip_option, into))]
metadata_location: Option<String>,
#[builder(setter(into))]
metadata: TableMetadataRef,
identifier: TableIdent,
#[builder(default = false)]
readonly: bool,
} I tried to sketch it here for example. I can add this if that seems right. |
Looks reasonably to me, thanks for this! |
Great! PR is now also updated with changes. |
Close by #259 Feel free to open when if necessary. |
This feature is quite similar to StaticTable in python api, which allows user to load table without catalog.
The text was updated successfully, but these errors were encountered: