Merge pull request #179 from tomarrell/engine
engine: implement execution engine
Showing 156 changed files with 5,829 additions and 1,662 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
# File Format | ||
This document describes the v1.x file format of a database. | ||
|
||
## Terms | ||
This section briefly describes the terms used throughout this document. | ||
|
||
A **database** is a single file that holds all information of a single | ||
database. | ||
|
||
## Format | ||
The database is a single file. Its size is a multiple of the page size, which is | ||
16 KiB (16384 bytes) for v1.x. The file consists of pages only, meaning there is | ||
no fixed-size header, only a header page (and possibly overflow pages). | ||
|
||
### Header page | ||
The page with the **ID 0** is the header page of the whole database. It holds | ||
values for the following keys. The keys are given as strings; the actual key | ||
bytes are the UTF-8 encoding of that string. | ||
|
||
* `pageCount` is a record cell whose entry is an 8-byte big-endian unsigned | ||
integer representing the number of pages stored in the file. | ||
* `config` is a pointer cell that points to a page containing the configuration | ||
parameters for this database. | ||
* `tables` is a pointer cell that points to a page containing pointers to | ||
all tables stored in this database. The format of the table pages is | ||
explained in the next section. | ||
|
||
### Table pages | ||
Table pages do not directly hold the data of a table. Instead, they hold pointers to | ||
the pages that do, i.e. the index page and the data page. Table pages do, however, hold | ||
the data definition of the table. The data definition is | ||
a single record that is to be interpreted as a data definition (<span | ||
style="color:red;">**TODO: data definitions**</span>). | ||
|
||
The keys of the three values (schema, index page and data page) are as follows. | ||
|
||
* `datadefinition` is a record cell containing the schema information about this | ||
table, that is, columns, column types, references, triggers etc. How the | ||
schema information is to be interpreted is explained | ||
[here](#data-definition). | ||
* `index` is a pointer cell pointing to the index page of this table. The index | ||
page points to the pages that hold the actual indexes of the table. See more | ||
[here](#index-page). | ||
* `data` is a pointer cell pointing to the data page of this table. See more | ||
[here](#data-pages). | ||
|
||
### Index page | ||
|
||
### Data pages | ||
A data page stores plain records in cells. Cell values are the full records; | ||
cell keys are the RIDs of the records. The RID (row ID) is an 8-byte unsigned | ||
integer, which must not be reused for other records, even after a record has been | ||
deleted. The only case in which an RID appears again is a move: a record is | ||
deleted from a page while it is being written into the same page or | ||
another page. In that case the RID is not actually | ||
reused, just kept while the cell is moved or rewritten. This can happen if the | ||
size of the record grows and the cell has to be rewritten. The cell keys, i.e. the | ||
RIDs, are referenced by cells in the index pages. A full table scan is | ||
performed by obtaining all cells in the data page and checking their records. | ||
|
||
### Data definition | ||
A data definition has the following format (everything is encoded in big | ||
endian). | ||
|
||
* 2 bytes `uint16` the number of columns | ||
* for each column | ||
  * 2 bytes `uint16` frame for the column name | ||
  * name bytes | ||
  * 1 byte `bool` that is 0 if the column is **NOT** nullable, and 1 if the column is | ||
    nullable | ||
  * 2 bytes `uint16` frame for the type name | ||
  * type name bytes |
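
The data definition format listed above can be sketched in Go as a small encoder and decoder. This is a minimal sketch, not the engine's actual implementation: it assumes that a "frame" is a 2-byte big-endian length prefix, and the `Column` type, the function names, and the type names `INTEGER` and `TEXT` are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Column is a hypothetical in-memory form of one column entry.
type Column struct {
	Name     string
	Nullable bool
	TypeName string
}

var errTruncated = errors.New("data definition truncated")

// encodeDataDefinition writes the column count, then for each column:
// name frame, name bytes, nullable byte, type name frame, type name bytes.
// Assumption: a "frame" is a 2-byte big-endian length prefix.
func encodeDataDefinition(cols []Column) []byte {
	var buf bytes.Buffer
	var u16 [2]byte
	putFramed := func(s string) {
		binary.BigEndian.PutUint16(u16[:], uint16(len(s)))
		buf.Write(u16[:])
		buf.WriteString(s)
	}
	binary.BigEndian.PutUint16(u16[:], uint16(len(cols)))
	buf.Write(u16[:])
	for _, c := range cols {
		putFramed(c.Name)
		if c.Nullable {
			buf.WriteByte(1)
		} else {
			buf.WriteByte(0)
		}
		putFramed(c.TypeName)
	}
	return buf.Bytes()
}

// decodeDataDefinition is the inverse of encodeDataDefinition.
func decodeDataDefinition(data []byte) ([]Column, error) {
	// readFramed consumes one length-prefixed string from data.
	readFramed := func() (string, error) {
		if len(data) < 2 {
			return "", errTruncated
		}
		n := int(binary.BigEndian.Uint16(data))
		data = data[2:]
		if len(data) < n {
			return "", errTruncated
		}
		s := string(data[:n])
		data = data[n:]
		return s, nil
	}
	if len(data) < 2 {
		return nil, errTruncated
	}
	count := int(binary.BigEndian.Uint16(data))
	data = data[2:]
	cols := make([]Column, 0, count)
	for i := 0; i < count; i++ {
		name, err := readFramed()
		if err != nil {
			return nil, err
		}
		if len(data) < 1 {
			return nil, errTruncated
		}
		nullable := data[0] == 1
		data = data[1:]
		typeName, err := readFramed()
		if err != nil {
			return nil, err
		}
		cols = append(cols, Column{Name: name, Nullable: nullable, TypeName: typeName})
	}
	return cols, nil
}

func main() {
	cols := []Column{
		{Name: "id", TypeName: "INTEGER"},
		{Name: "name", Nullable: true, TypeName: "TEXT"},
	}
	enc := encodeDataDefinition(cols)
	dec, err := decodeDataDefinition(enc)
	fmt.Println(len(enc), dec, err)
}
```

Length prefixes keep the record self-describing, so a reader can skip a column entry without knowing the type name in advance.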
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
# Page layout | ||
This document describes the layout and format of a single memory page. All pages | ||
are structured like this. | ||
|
||
**Please note:** "2 bytes indicating a length" means that these two bytes, | ||
interpreted as a **big endian** encoded **unsigned two-byte integer**, indicate | ||
said length. In other words, whenever we talk about bytes forming some kind of | ||
number, it is always the big endian encoding of an integer of either 2, 4 or 8 | ||
bytes. | ||
|
||
## Page format | ||
Pages implement the concept of slotted pages. A helpful resource for | ||
understanding the concept is | ||
[this](https://db.in.tum.de/teaching/ss17/moderndbs/chapter3.pdf#page=8) PDF. | ||
Please note, though, that we do not follow the exact data structure that is | ||
proposed in that file. | ||
|
||
![Page Structure](./page_v1.png) | ||
|
||
The above image describes the layout of a page. A page has a **fixed size of | ||
16 KiB** (16384 bytes). | ||
|
||
The **first 6 bytes** are the page header. It has a fixed size and will always | ||
have that 6-byte layout. The **first 4 bytes** represent the page ID. This is | ||
globally unique and is set upon page allocation. A page cannot infer its own ID | ||
without that header field. The **next 2 bytes** represent the cell count in this | ||
page. This is the number of slots that occur after the header, and is updated | ||
with every call to `storeCell`. | ||
|
||
After the header, in **4 byte chunks**, slots are defined. A slot points to an | ||
absolute offset within the page, and holds a size attribute. The **first 2 | ||
bytes** are the offset, the **second 2 bytes** are the size. | ||
|
||
Between slots and data, there is free space. This is the space where new cells | ||
(slots on the left, and data on the right) will be inserted. Slots are always | ||
**sorted by the key of the cell that they point to**. | ||
|
||
A single "slot data" is a full cell, as described [below](#cell-format). | ||
|
||
## Cell format | ||
Cells are simple key-value entities. | ||
|
||
![Cell Structure](./cell_v1.png) | ||
|
||
The above image describes the layout of a cell. A cell consists of a single key | ||
and a single value, which is called the record. Both the key and the | ||
record are preceded by **2 bytes** indicating the length of the key | ||
and the record, respectively. |
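
The page header, slot, and cell layouts described in this document can be sketched in Go as follows. This is a minimal sketch under stated assumptions, not the engine's actual code: all names (`slot`, `encodeCell`, `decodeCell`, `readHeader`, `readSlot`, `buildPage`) are hypothetical, and placing the cell data at the very end of the page is only one reading of "slots on the left, data on the right".

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	pageSize   = 16384 // fixed page size in bytes
	headerSize = 6     // 4-byte page ID + 2-byte cell count
	slotSize   = 4     // 2-byte offset + 2-byte size
)

var errTruncated = errors.New("cell truncated")

// slot points to a cell by absolute offset within the page.
type slot struct{ offset, size uint16 }

// encodeCell lays out a cell as: key length, key, record length, record.
func encodeCell(key, record []byte) []byte {
	out := make([]byte, 4+len(key)+len(record))
	binary.BigEndian.PutUint16(out[0:], uint16(len(key)))
	copy(out[2:], key)
	binary.BigEndian.PutUint16(out[2+len(key):], uint16(len(record)))
	copy(out[4+len(key):], record)
	return out
}

// decodeCell splits a cell back into its key and record.
func decodeCell(data []byte) (key, record []byte, err error) {
	if len(data) < 2 {
		return nil, nil, errTruncated
	}
	klen := int(binary.BigEndian.Uint16(data))
	if len(data) < 4+klen {
		return nil, nil, errTruncated
	}
	key = data[2 : 2+klen]
	rlen := int(binary.BigEndian.Uint16(data[2+klen:]))
	if len(data) < 4+klen+rlen {
		return nil, nil, errTruncated
	}
	return key, data[4+klen : 4+klen+rlen], nil
}

// readHeader returns the page ID and the cell count from the 6-byte header.
func readHeader(page []byte) (id uint32, cellCount uint16) {
	return binary.BigEndian.Uint32(page[0:4]), binary.BigEndian.Uint16(page[4:6])
}

// readSlot returns the i-th slot following the header.
func readSlot(page []byte, i int) slot {
	base := headerSize + i*slotSize
	return slot{
		offset: binary.BigEndian.Uint16(page[base:]),
		size:   binary.BigEndian.Uint16(page[base+2:]),
	}
}

// buildPage creates a page holding exactly one cell.
// Assumption: cell data is written at the end of the page.
func buildPage(id uint32, key, record []byte) []byte {
	page := make([]byte, pageSize)
	cell := encodeCell(key, record)
	off := pageSize - len(cell)
	copy(page[off:], cell)
	binary.BigEndian.PutUint32(page[0:], id)
	binary.BigEndian.PutUint16(page[4:], 1) // one cell stored
	binary.BigEndian.PutUint16(page[headerSize:], uint16(off))
	binary.BigEndian.PutUint16(page[headerSize+2:], uint16(len(cell)))
	return page
}

func main() {
	page := buildPage(7, []byte("k"), []byte("value"))
	id, n := readHeader(page)
	s := readSlot(page, 0)
	start := int(s.offset)
	key, rec, err := decodeCell(page[start : start+int(s.size)])
	fmt.Println(id, n, s, string(key), string(rec), err)
}
```

Because a slot stores both offset and size, a reader can slice the cell out of the page before parsing the key and record lengths inside it.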
3 changes: 3 additions & 0 deletions
internal/compiler/testdata/TestCompileGolden/delete/#00.golden
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
command.Delete{Table:command.SimpleTable{Schema:"", Table:"myTable", Alias:"", Indexed:false, Index:""}, Filter:command.ConstantBooleanExpr{Value:true}} | ||
|
||
String: | ||
Delete[filter=true](myTable) |
3 changes: 3 additions & 0 deletions
internal/compiler/testdata/TestCompileGolden/delete/#01.golden
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
command.Delete{Table:command.SimpleTable{Schema:"mySchema", Table:"myTable", Alias:"", Indexed:false, Index:""}, Filter:command.ConstantBooleanExpr{Value:true}} | ||
|
||
String: | ||
Delete[filter=true](mySchema.myTable) |
3 changes: 3 additions & 0 deletions
internal/compiler/testdata/TestCompileGolden/delete/#02.golden
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
command.Delete{Table:command.SimpleTable{Schema:"", Table:"myTable", Alias:"", Indexed:false, Index:""}, Filter:command.BinaryExpr{Operator:"==", Left:command.LiteralExpr{Value:"col1"}, Right:command.LiteralExpr{Value:"col2"}}} | ||
|
||
String: | ||
Delete[filter=col1 == col2](myTable) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,4 @@ | ||
command.DropTable{IfExists:false, Schema:"", Name:"myTable"} | ||
|
||
String: | ||
DropTable[table=myTable,ifexists=false]() |