trino Public
Forked from trinodb/trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Java Apache License 2.0 Updated Oct 13, 2024
imap_tools Public
Forked from ikvk/imap_tools
Work with email by IMAP
Python Apache License 2.0 Updated Sep 22, 2023
sqlmesh Public
Forked from TobikoData/sqlmesh
SQLMesh is a DataOps framework that brings the benefits of DevOps to data teams. It enables data scientists, analysts, and engineers to efficiently run and deploy data transformations written in SQ…
Python Apache License 2.0 Updated Aug 4, 2023
rich Public
Forked from Textualize/rich
Rich is a Python library for rich text and beautiful formatting in the terminal.
Python MIT License Updated Sep 12, 2022
pennsylvania-vaccines Public
Forked from jherrm/pennsylvania-vaccines
This is a centralized repository for the Pennsylvania Vaccine Updates bots.
JavaScript MIT License Updated Feb 14, 2021
spark Public
Forked from apache/spark
Mirror of Apache Spark
Scala Apache License 2.0 Updated Jul 19, 2020
tap-framework Public
Forked from dbt-labs/tap-framework
A framework for rapidly prototyping new Singer taps
Python Updated Dec 3, 2019
tap-shopify Public
Forked from singer-io/tap-shopify
Singer.io tap for extracting Shopify data
Python GNU Affero General Public License v3.0 Updated Oct 14, 2019
getting-started Public
Forked from singer-io/getting-started
This repository is a getting started guide to Singer.
Makefile Updated Sep 28, 2019
singer-python Public
Forked from singer-io/singer-python
Writes the Singer format from Python
Python Apache License 2.0 Updated Sep 24, 2019
data-validator Public
Forked from target/data-validator
A tool to validate data built around Apache Spark.
Scala Other Updated Jul 3, 2019
deequ Public
Forked from awslabs/deequ
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Scala Apache License 2.0 Updated Mar 10, 2019
hdfs Public
Forked from colinmarc/hdfs
A native Go client for HDFS
Go MIT License Updated Nov 14, 2018
scala-chart Public
Forked from scala-chart/scala-chart
Scala Chart Library
Scala GNU Lesser General Public License v3.0 Updated Apr 24, 2018
spark-deep-learning Public
Forked from databricks/spark-deep-learning
Deep Learning Pipelines for Apache Spark
Python Apache License 2.0 Updated Jun 6, 2017
PowerGraph Public
Forked from jegonzal/PowerGraph
PowerGraph: A framework for large-scale machine learning and graph computation.
C++ Updated Apr 27, 2017
kafka-connect-cassandra Public
Forked from tuplejump/kafka-connect-cassandra
Kafka Connect Cassandra Connector. This project includes source/sink connectors for Cassandra to/from Kafka.
-
-
SILT: A Memory-Efficient, High-Performance Key-Value Store
C++ Other Updated Dec 21, 2015
fantasy-football Public
Forked from hougs/fantasy-football
Choosing a fantasy football team using Spark, Hive, Python, and really just about anything.
Java Updated Feb 13, 2015
petuum Public
Forked from hoqirong/petuum
SailingLab's Petuum project.
C++ BSD 3-Clause "New" or "Revised" License Updated Jul 1, 2014
wumpus Public
Forked from gmargari/wumpus
Wumpus is an information retrieval system developed at the University of Waterloo. Its main purpose is to study issues that arise in the context of indexing dynamic text collections in multi-user e…
C++ GNU General Public License v2.0 Updated Jun 10, 2013
bytecask Public
Forked from pbudzik/bytecask
Key/value database inspired by Bitcask
Scala Updated May 16, 2013