GitHub - GlobalWebIndex/storage-partitioner: Abstraction over storages with partitioned data

storage-partitioner

"net.globalwebindex" %% "storage-partitioner-api" % "x.y.z"
"net.globalwebindex" %% "storage-partitioner-all" % "x.y.z"
"net.globalwebindex" %% "storage-partitioner-s3" % "x.y.z"
"net.globalwebindex" %% "storage-partitioner-gcs" % "x.y.z"
"net.globalwebindex" %% "storage-partitioner-cql" % "x.y.z"
"net.globalwebindex" %% "storage-partitioner-druid" % "x.y.z"

This project primarily targets storages such as file systems, S3, and FTP that:

do not have built-in partitioning the way databases do
cannot be searched easily, so you want to narrow down the area that has to be searched the hard way

Even columnar databases need some form of partition management, because they persist denormalized data and tracking partition state is not easy.

For such storages, partitioning must be implemented on the client side, and that is what this library helps with. Currently only time series data is supported, and implementations are provided for S3, Druid, and Cassandra or ScyllaDB.
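To make the idea of client-side time partitioning concrete, here is a minimal sketch that derives partition keys for a time range at a given granularity. All names in it (`Granularity`, `Hourly`, `Daily`, `TimePartitions.keys`) and the `yyyy/MM/dd/HH` key layout are assumptions made for illustration; they are not the storage-partitioner API.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.time.temporal.ChronoUnit

// Hypothetical types for illustration only; not the storage-partitioner API.
sealed trait Granularity {
  def unit: ChronoUnit
  def pattern: String
}
case object Hourly extends Granularity {
  val unit: ChronoUnit = ChronoUnit.HOURS
  val pattern: String  = "yyyy/MM/dd/HH"
}
case object Daily extends Granularity {
  val unit: ChronoUnit = ChronoUnit.DAYS
  val pattern: String  = "yyyy/MM/dd"
}

object TimePartitions {
  // Lists the partition keys that cover the half-open range [from, until).
  // The start is truncated to the granularity so a partial first bucket is included.
  def keys(from: LocalDateTime, until: LocalDateTime, g: Granularity): Vector[String] = {
    val fmt   = DateTimeFormatter.ofPattern(g.pattern)
    val start = from.truncatedTo(g.unit)
    Iterator
      .iterate(start)(_.plus(1, g.unit))
      .takeWhile(_.isBefore(until))
      .map(t => fmt.format(t))
      .toVector
  }
}
```

A client could prepend a bucket or directory name to each key and then read or list only those locations instead of scanning the whole storage.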

When building an ETL pipeline that extracts and loads data with the same partitioning across various storage types, the user should be able to focus on Transform rather than on Extract and Load.
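The following sketch shows one way this division of labour could look: when source and sink share the same partition keys, Extract and Load reduce to a generic loop and the caller supplies only the transform. `PartitionedStorage`, `InMemoryStorage` and `Etl.run` are hypothetical names invented for this example, and the in-memory class merely stands in for a real backend; none of this is the library's actual interface.

```scala
import scala.collection.mutable

// Hypothetical interface for illustration only; not the storage-partitioner API.
trait PartitionedStorage[A] {
  def read(partition: String): Vector[A]
  def write(partition: String, rows: Vector[A]): Unit
}

// In-memory stand-in for a real backend such as S3 or Cassandra.
final class InMemoryStorage[A] extends PartitionedStorage[A] {
  private val data = mutable.Map.empty[String, Vector[A]]
  def read(partition: String): Vector[A] = data.getOrElse(partition, Vector.empty)
  def write(partition: String, rows: Vector[A]): Unit = data.update(partition, rows)
}

object Etl {
  // Reads each partition from the source, applies the transform to every row,
  // and writes the result to the sink under the same partition key.
  def run[A, B](
      source: PartitionedStorage[A],
      sink: PartitionedStorage[B],
      partitions: Seq[String]
  )(transform: A => B): Unit =
    partitions.foreach(p => sink.write(p, source.read(p).map(transform)))
}
```

For example, `Etl.run(source, sink, Seq("2020/01/01"))(_.toString)` would copy one daily partition while converting every row to a string.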

Note that:

This library is very much a work in progress; adding one more storage could lead to heavy API changes.
This kind of "integration by abstraction" might seem a bit wrong, and storage "Sinks" and "Sources" might make better sense. For time series data, however, once partitioning and granularity are taken into account, such generic Sinks and Sources would be very hard to implement. The library might still move in that direction later on.
