ddelange
💥
["translatio", "imitatio", "aemulatio"]

Stars

etl

Extract-Transform-Load, Data Wrangling, Data Mining, ...
251 repositories

Dead simple Object schema validation

TypeScript 23,104 935 Updated Jan 23, 2025

Germany's most important APIs in a single Python package.

HTML 1,251 67 Updated Oct 19, 2024

Public type stubs for pandas

Python 241 133 Updated Jan 15, 2025

Blackbox Protobuf is a set of tools for working with encoded Protocol Buffers (protobuf) without the matching protobuf definition.

Python 556 92 Updated Nov 27, 2024

borb is a library for reading, creating and manipulating PDF files in python.

Python 3,439 147 Updated Dec 1, 2024

Automated learning of regexes for DNS discovery

Python 362 42 Updated Feb 18, 2023

Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡

TypeScript 484 52 Updated Apr 9, 2024

Download the entire Wayback Machine archive for a given URL.

Python 2,937 191 Updated May 15, 2024

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 7,087 692 Updated Jan 1, 2025

DAOS Storage Stack (client libraries, storage engine, control plane)

C 790 306 Updated Jan 23, 2025

Memphis.dev is a highly scalable and effortless data streaming platform

Go 3,278 220 Updated May 27, 2024

Build data pipelines, the easy way 🛠️

TypeScript 4,100 260 Updated Jun 6, 2023

A next-generation crawling and spidering framework.

Go 12,898 673 Updated Jan 20, 2025

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Python 1,064 143 Updated Sep 3, 2024

Python Extract Transform and Load Tables of Data

Python 1,255 194 Updated May 12, 2024

Immutable and statically-typeable DataFrames with runtime type and data validation

Python 453 34 Updated Jan 23, 2025

The merlin dataloader lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX

Python 412 25 Updated Apr 16, 2024

An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services including Internet Archive, archive.today, Ghostarchive, IPFS, Tel…

Go 1,868 68 Updated Jan 22, 2025

vacuum is the world's fastest OpenAPI 3, OpenAPI 2 / Swagger linter and quality analysis tool. Built in Go, it tears through API specs faster than you can think. vacuum is compatible with Spectral r…

Go 673 50 Updated Jan 22, 2025

USA zipcode programmable database, includes up-to-date census and geometry information.

Python 235 49 Updated May 17, 2024

A full-text search and indexing server written in Rust.

Rust 1,858 72 Updated Mar 6, 2023

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

Python 863 37 Updated Jul 3, 2023

Easily convert Common Crawl to a dataset of captions and documents. Image/text, audio/text, video/text, ...

Python 315 25 Updated Dec 9, 2023

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

Java 8,239 1,885 Updated Jan 23, 2025

A Python package for manipulating 2-dimensional tabular data structures

C++ 1,820 159 Updated Oct 24, 2024

Event-driven networking engine written in Python.

Python 5,678 1,186 Updated Jan 20, 2025

A cross platform way to express data transformation, relational algebra, standardized record expression and plans.

Python 1,240 166 Updated Jan 20, 2025

🔥 Blazing fast bulk data transfers between any cloud 🔥

Python 1,112 63 Updated May 11, 2024

The no-magic web API and microservices framework for Python developers, with an emphasis on reliability and performance at scale.

Python 9,580 950 Updated Jan 21, 2025

A developer-friendly API for converting numerous document formats into PDF files, and more!

Go 8,425 566 Updated Jan 23, 2025