etl
The most important APIs of Germany in one Python package.
Blackbox Protobuf is a set of tools for working with encoded Protocol Buffers (protobuf) without the matching protobuf definition.
borb is a library for reading, creating and manipulating PDF files in Python.
Automated learning of regexes for DNS discovery
Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js ⚡
Download the entire Wayback Machine archive for a given URL.
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
DAOS Storage Stack (client libraries, storage engine, control plane)
Memphis.dev is a highly scalable and effortless data streaming platform
A next-generation crawling and spidering framework.
NVTabular is a feature engineering and preprocessing library for tabular data, designed to quickly and easily manipulate terabyte-scale datasets used to train deep-learning-based recommender systems.
Python Extract Transform and Load Tables of Data
Immutable and statically-typeable DataFrames with runtime type and data validation
The merlin dataloader lets you rapidly load tabular data for training deep learning models with TensorFlow, PyTorch or JAX
An archiving tool with an IM-style interface that prioritizes privacy and accessibility, integrated with various archival services including Internet Archive, archive.today, Ghostarchive, IPFS, Tel…
vacuum is the world's fastest OpenAPI 3, OpenAPI 2 / Swagger linter and quality analysis tool. Built in Go, it tears through API specs faster than you can think. vacuum is compatible with Spectral r…
USA zipcode programmable database, includes up-to-date census and geometry information.
A full-text search and indexing server written in Rust.
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
Easily convert Common Crawl to a dataset of captions and documents: image/text, audio/text, video/text, ...
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
A Python package for manipulating 2-dimensional tabular data structures
Event-driven networking engine written in Python.
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
🔥 Blazing fast bulk data transfers between any cloud 🔥
The no-magic web API and microservices framework for Python developers, with an emphasis on reliability and performance at scale.
A developer-friendly API for converting numerous document formats into PDF files, and more!