ddelange
💥
["translatio", "imitatio", "aemulatio"]

Stars

etl

Extract-Transform-Load, Data Wrangling, Data Mining, ...
257 repositories

Dead simple Object schema validation

TypeScript 23,214 934 Updated Mar 3, 2025

Germany's most important APIs in one Python package.

HTML 1,266 68 Updated Oct 19, 2024

Public type stubs for pandas

Python 256 139 Updated Mar 7, 2025

Blackbox Protobuf is a set of tools for working with encoded Protocol Buffers (protobuf) without the matching protobuf definition.

Python 570 96 Updated Nov 27, 2024

borb is a library for reading, creating and manipulating PDF files in python.

Python 3,453 148 Updated Dec 1, 2024

Automated learning of regexes for DNS discovery

Python 363 42 Updated Feb 18, 2023

Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡

TypeScript 485 52 Updated Apr 9, 2024

Download the entire Wayback Machine archive for a given URL.

Python 2,972 196 Updated May 15, 2024

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 7,361 709 Updated Feb 11, 2025

DAOS Storage Stack (client libraries, storage engine, control plane)

C 814 309 Updated Mar 7, 2025

Memphis.dev is a highly scalable and effortless data streaming platform

Go 3,280 222 Updated May 27, 2024

Build data pipelines, the easy way 🛠️

TypeScript 4,112 262 Updated Jun 6, 2023

A next-generation crawling and spidering framework.

Go 13,128 690 Updated Mar 3, 2025

NVTabular is a feature engineering and preprocessing library for tabular data designed to quickly and easily manipulate terabyte scale datasets used to train deep learning based recommender systems.

Python 1,071 146 Updated Sep 3, 2024

Python Extract Transform and Load Tables of Data

Python 1,262 194 Updated May 12, 2024

Immutable and statically-typeable DataFrames with runtime type and data validation

Python 455 34 Updated Mar 4, 2025

The merlin dataloader lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX

Python 417 26 Updated Apr 16, 2024

An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services including Internet Archive, archive.today, Ghostarchive, IPFS, Tel…

Go 1,897 69 Updated Mar 6, 2025

vacuum is the world's fastest OpenAPI 3, OpenAPI 2 / Swagger linter and quality analysis tool. Built in Go, it tears through API specs faster than you can think. vacuum is compatible with Spectral r…

Go 725 52 Updated Feb 25, 2025

USA zipcode programmable database, includes up-to-date census and geometry information.

Python 236 49 Updated May 17, 2024

A full-text search and indexing server written in Rust.

Rust 1,863 73 Updated Mar 6, 2023

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

Python 863 37 Updated Jul 3, 2023

Easily convert Common Crawl to a dataset of captions and documents: image/text, audio/text, video/text, ...

Python 317 26 Updated Dec 9, 2023

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

Java 8,330 1,918 Updated Mar 6, 2025

A Python package for manipulating 2-dimensional tabular data structures

C++ 1,822 160 Updated Mar 6, 2025

Event-driven networking engine written in Python.

Python 5,710 1,188 Updated Mar 5, 2025

A cross platform way to express data transformation, relational algebra, standardized record expression and plans.

Python 1,267 166 Updated Mar 6, 2025

🔥 Blazing fast bulk data transfers between any cloud 🔥

Python 1,124 65 Updated May 11, 2024

The no-magic web API and microservices framework for Python developers, with an emphasis on reliability and performance at scale.

Python 9,609 953 Updated Feb 22, 2025

A developer-friendly API for converting numerous document formats into PDF files, and more!

Go 8,752 588 Updated Mar 6, 2025