pyDB, a simple project with a simple goal: learn how databases work.

At the core of data engineering live databases and distributed systems, two immensely complex topics. As a foray into them, I'm iteratively building my own simple database, taking courses, and adding complexity as I go.

Phase 1 - hello world

The first version will help me learn the basics. It uses no tutorials and is written entirely in Python. It'll include:

- A command line interface
  - REPL with pretty printing
  - command history
- A tokenizer with create table and select commands
- A CSV serializer/deserializer
- Unit and integration tests
- Error handling
- Logging
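
To make the tokenizer item concrete, here is a minimal sketch of what tokenizing create table and select commands could look like. It is not pyDB's actual code: the `Token` type, the `KEYWORDS` set, the `TOKEN_RE` pattern, and the `tokenize` function are all hypothetical names chosen for illustration, and the keyword list is an assumption limited to what these two commands need.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # KEYWORD, IDENT, NUMBER, or SYMBOL
    text: str


# Hypothetical keyword set covering only CREATE TABLE and SELECT.
KEYWORDS = {"CREATE", "TABLE", "SELECT", "FROM", "INT", "TEXT"}

# One alternative per token class; whitespace is matched so it can be skipped.
TOKEN_RE = re.compile(
    r"(?P<WS>\s+)"
    r"|(?P<NUMBER>\d+)"
    r"|(?P<WORD>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<SYMBOL>[(),;*])"
)


def tokenize(sql: str) -> list[Token]:
    """Split a SQL string into tokens; raise ValueError on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character {sql[pos]!r} at position {pos}")
        pos = match.end()
        kind = match.lastgroup
        text = match.group()
        if kind == "WS":
            continue  # whitespace only separates tokens
        if kind == "WORD":
            upper = text.upper()
            if upper in KEYWORDS:
                tokens.append(Token("KEYWORD", upper))  # keywords are case-insensitive
            else:
                tokens.append(Token("IDENT", text))  # identifiers keep their case
        else:
            tokens.append(Token(kind, text))
    return tokens


if __name__ == "__main__":
    for token in tokenize("CREATE TABLE users (id INT, name TEXT);"):
        print(token)
```

A single regular expression with named groups keeps the sketch short. A hand-written character loop would work just as well and can make error messages easier to tailor.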

Phase 2 - optimization

The next step is to take a CMU course on databases. The optimizations are TBD based on what I learn. They may include:

- Storage updates for
  - compression
  - data versioning
  - distributed storage
- Query optimizers
- More advanced commands like
  - Aggregate functions
  - Where clauses

Phase 3 - rustify c++-ify

Python is not the optimal language for a database, so pyDB will be rewritten in either Rust or C++. That's it!

Edit: I ended up completing all the assignments for the CMU course mentioned above. They included building a copy-on-write trie, a buffer pool manager, a B+ tree, a query optimizer, and concurrency control. I did not make the repo public, to respect the wishes of the course instructors.

Phase 4 - conquer the world?

By conquer the world I mean help others. Obviously. I hope to contribute to some popular open-source databases, ideally ones that solve a common problem like distributed compute or in-memory processing.

kentkr/pyDB
