-
Open Source Big Data Dev
- San Francisco, CA, USA
- http://www.holdenkarau.com/resume.pdf?q=github
- @holdenkarau
-
spark Public
Forked from apache/spark
Mirror of Apache Spark
-
spark-testing-base Public
Base classes to use when writing tests with Spark
-
high-performance-spark-examples Public
Forked from high-performance-spark/high-performance-spark-examples
Examples for High Performance Spark
-
spark-upgrade Public
Magic to help Spark pipelines upgrade
-
spark-flowchart Public
Flowchart for debugging Spark applications
-
django-rest-framework-braces Public
Forked from dealertrack/django-rest-framework-braces
Collection of utilities for working with django rest framework (DRF)
Python Other Updated Sep 15, 2024
-
mydotfiles Public
My dotfiles. You probably don't care about this.
-
sparkProjectTemplate.g8 Public
Template for Spark Projects
-
sparklens Public
Forked from qubole/sparklens
Qubole Sparklens tool for performance tuning Apache Spark
Scala Apache License 2.0 Updated May 10, 2024
-
spark-connect-rs Public
Forked from sjrusso8/spark-connect-rs
Apache Spark Connect Client for Rust
-
arrow-datafusion-comet Public
Forked from apache/datafusion-comet
Apache Arrow DataFusion Comet Spark Accelerator
Rust Apache License 2.0 Updated Apr 11, 2024
-
data-validator Public
Forked from target/data-validator
A tool to validate data, built around Apache Spark.
-
gluten Public
Forked from apache/incubator-gluten
Gluten: Plugin to Double SparkSQL's Performance
-
onetable Public
Forked from apache/incubator-xtable
OneTable is an omni-directional converter for table formats that facilitates interoperability across data processing systems and query engines.
Java Apache License 2.0 Updated Nov 28, 2023
-
ray Public
Forked from ray-project/ray
A fast and simple framework for building and running distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning libr…
-
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Apache License 2.0 Updated Oct 5, 2023
-
spark-expectations Public
Forked from Nike-Inc/spark-expectations
A Python Library to support running data quality rules while the spark job is running ⚡
Python Apache License 2.0 Updated Oct 3, 2023
-
lit-parrot Public
Forked from Lightning-AI/litgpt
Implementation of Falcon, StableLM, Pythia, INCITE language models based on nanoGPT. Supports flash attention, LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
Python Apache License 2.0 Updated Jun 17, 2023
-
dolly Public
Forked from databrickslabs/dolly
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Python Apache License 2.0 Updated Jun 3, 2023
-
bitsandbytes Public
Forked from bitsandbytes-foundation/bitsandbytes
8-bit CUDA functions for PyTorch
Python MIT License Updated Jun 3, 2023
-
explore-dolly Public
Exploring what we can do with Databricks' Dolly (and similar)
-
uszipcode-project Public
Forked from MacHu-GWU/uszipcode-project
USA zipcode programmable database, includes up-to-date census and geometry information.
Python MIT License Updated May 29, 2023
-
sparklingpinkpandas Public
Website for Sparkling Pink Pandas (queer, trans focused scooter club)
-
obico-server Public
Forked from TheSpaghettiDetective/obico-server
Obico is a community-built, open-source smart 3D printing platform used by makers, enthusiasts, and tinkerers around the world.
Python GNU Affero General Public License v3.0 Updated Feb 11, 2023
-
looking-glass Public
Forked from gmazoyer/looking-glass
Easy to deploy Looking Glass
PHP GNU General Public License v3.0 Updated Feb 1, 2023