-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime schema declaration #165
Conversation
* fix point query at sst * code fmt
… the storage layer and support multiple storage (tonbo-io#163) * refactor: use [fusio](https://github.com/tonbo-io/fusio) to implement the storage layer and support multiple storage * chore: remove println macros codegen * fix: blocking of LevelStream in `open_file` under Unix system * refactor: use fusio on Benchmark * fix: benchmark/writes.rs file not found * test: add unit tests for serdes * chore: update read & write for fusio * fix: example datafusion projection error * fix: `Fs::list` may not sorted * ci: add exmaples check * feat: enable SQL query support in datafusion example (tonbo-io#169) Added support for executing SQL queries using DataFusion's parser and physical plan execution. This enhancement allows querying the "music" table with SQL statements, improving flexibility and functionality. * chore: resolve review --------- Co-authored-by: Xwg <[email protected]>
…untime-schema # Conflicts: # src/compaction/mod.rs # src/inmem/immutable.rs # src/inmem/mutable.rs # src/lib.rs # src/ondisk/scan.rs # src/option.rs # src/record/mod.rs # src/record/runtime/column.rs # src/record/runtime/mod.rs # src/record/runtime/record.rs # src/record/runtime/record_ref.rs # src/stream/package.rs # src/transaction.rs
@@ -69,6 +69,23 @@ jobs: | |||
command: fmt | |||
args: -- --check | |||
|
|||
exmaples: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flume = { version = "0.11", features = ["async"] } | ||
fusio = { git = "https://github.com/tonbo-io/fusio.git", package = "fusio", rev = "317b1b0621b297f52145b41b90506632f2dc7a1d", features = ["tokio", "dyn"] } | ||
fusio-parquet = { git = "https://github.com/tonbo-io/fusio.git", package = "fusio-parquet", rev = "317b1b0621b297f52145b41b90506632f2dc7a1d" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently the latest is: ab6a8073b966f9c7d7ca61c32646b7d45dbc4f31
src/inmem/immutable.rs
Outdated
where | ||
A: ArrowArrays, | ||
A::Record: Send, | ||
{ | ||
fn from(mutable: SkipMap<Timestamped<<A::Record as Record>::Key>, Option<A::Record>>) -> Self { | ||
fn from( | ||
mutable: ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming variables separately is more readable
(mutable, instance): (
SkipMap<Timestamped<<A::Record as Record>::Key>, Option<A::Record>>,
&RecordInstance,
)
src/lib.rs
Outdated
let instance = | ||
RecordInstance::Runtime(DynRecord::empty_record(column_descs, primary_index)); | ||
|
||
let db = Self::build(option, executor, instance, manager).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the variable db
can be removed
fixed_projection.append(&mut projection); | ||
fixed_projection.dedup(); | ||
|
||
let mask = ProjectionMask::roots( | ||
&arrow_to_parquet_schema(R::arrow_schema()).unwrap(), | ||
&arrow_to_parquet_schema(&self.schema.record_instance.arrow_schema::<R>()).unwrap(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is no need to call this method self.schema.record_instance.arrow_schema::<R>()
, use primary_key_index
directly
src/record/mod.rs
Outdated
} | ||
|
||
pub(crate) fn size(&self) -> usize { | ||
todo!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the lack of size()
implementation should cause the data to be unable to trigger Compaction (the threshold will not be reached)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is not used, I will delete it
src/record/mod.rs
Outdated
{ | ||
match self { | ||
RecordInstance::Normal => R::primary_key_path(), | ||
RecordInstance::Runtime(_) => todo!(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing implementation
let datatype = col.datatype; | ||
let name = col.name.to_string(); | ||
let value = match datatype { | ||
Datatype::Int8 => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think macros can be used to simplify this to simply implement most types.
type Ref<'r> = DynRecordRef<'r>; | ||
|
||
fn primary_key_index() -> usize { | ||
unreachable!("This method is not used.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this method be used to implement another trait such as StaticRecord
instead of unreachable!
?
support runtime schema