Merge pull request #10 from kuzudb/blogs

Migrate old blogs to new site
kuzudb · Feb 29, 2024 · 8b594e1
2 parents 284e0bd + 0b47245

Showing 57 changed files with 3,844 additions and 32 deletions.
diff --git a/README.md b/README.md
@@ -7,19 +7,24 @@ The site is built on top of the [EV0](https://github.com/gndx/ev0-astro-theme) O
 
 Clone this repository to your local machine using Git.
 
-| Command           | Action                                       |
-| :---------------- | :------------------------------------------- |
-| `npm install`     | Installs dependencies                        |
-| `npm run dev`     | Starts local dev server at `localhost:4321`  |
-| `npm run build`   | Build your production site to `./dist/`      |
-| `npm run preview` | Preview your build locally, before deploying |
-| `npm run youtube` | Fetches the Latest YouTube Channel Videos    |
-| `npm run newpost` | Generate a New Blogpost Markdown Entry       |
+| Command           | Action                                             |
+| :---------------- | :------------------------------------------------- |
+| `npm install`     | Installs dependencies                              |
+| `npm start`       | Builds & runs local dev server at `localhost:4321` |
+| `npm run dev`     | Starts local dev server at `localhost:4321`        |
+| `npm run build`   | Build your production site to `./dist/`            |
+| `npm run preview` | Preview your build locally, before deploying       |
+| `npm run youtube` | Fetches the Latest YouTube Channel Videos          |
 
 * Edit the `.astro` files in the `src/pages` directory to add blog, category, tag and other information.
 * The blog layout can be modified from the `src/layouts` directory.
 * Global CSS is located in the `src/styles` directory.
 
+> [!NOTE]
+> For any URL-related changes such as custom slugs or internal navigation between pages, make
+> sure to run `npm run build` first, or simply run `npm start` to watch the local directory for such
+> changes and to build the site before preview.
+
 ## 📝 Configuration Blog
 
 To configure the blog, edit the `src/config/config.json` file. This file contains the following options:
@@ -61,9 +66,13 @@ The menu is configured in the `src/config/menu.json` file. This file contains th
     "url": "/"
   },
   {
-    "name": "Blog",
-    "url": "/blog"
+    "name": "Tags",
+    "url": "/tags"
   },
+  {
+    "name": "Categories",
+    "url": "/categories"
+  }
 ]
 ```
 

diff --git a/public/default.png b/public/default.png
diff --git a/public/img/2023-07-19-iamgraphviz/adminviz.png b/public/img/2023-07-19-iamgraphviz/adminviz.png
diff --git a/public/img/2023-07-19-iamgraphviz/readonlyviz.png b/public/img/2023-07-19-iamgraphviz/readonlyviz.png
diff --git a/public/img/2023-07-19-iamgraphviz/schema.png b/public/img/2023-07-19-iamgraphviz/schema.png
diff --git a/public/img/2023-10-25-kuzuexplorer/preexisting-datasets.png b/public/img/2023-10-25-kuzuexplorer/preexisting-datasets.png
diff --git a/public/img/2023-10-25-kuzuexplorer/query-result-node-link-view.png b/public/img/2023-10-25-kuzuexplorer/query-result-node-link-view.png
diff --git a/public/img/2023-10-25-kuzuexplorer/schema-panel.png b/public/img/2023-10-25-kuzuexplorer/schema-panel.png
diff --git a/public/img/2024-01-04-llms-graphs-part-1/qa-over-enterprise-data.png b/public/img/2024-01-04-llms-graphs-part-1/qa-over-enterprise-data.png
diff --git a/public/img/2024-01-04-llms-graphs-part-1/rag-using-structured-data.png b/public/img/2024-01-04-llms-graphs-part-1/rag-using-structured-data.png
diff --git a/public/img/2024-01-04-llms-graphs-part-1/two-sql-generation-approaches.png b/public/img/2024-01-04-llms-graphs-part-1/two-sql-generation-approaches.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/kg-enhanced-rag-overview.png b/public/img/2024-01-15-llms-graphs-part-2/kg-enhanced-rag-overview.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/kg-enhanced-rag-preprocessing.png b/public/img/2024-01-15-llms-graphs-part-2/kg-enhanced-rag-preprocessing.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/rag-unstructured-overview.png b/public/img/2024-01-15-llms-graphs-part-2/rag-unstructured-overview.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/standard-rag-overview.png b/public/img/2024-01-15-llms-graphs-part-2/standard-rag-overview.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/standard-rag-preprocessing.png b/public/img/2024-01-15-llms-graphs-part-2/standard-rag-preprocessing.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/triples-based-rag-overview.png b/public/img/2024-01-15-llms-graphs-part-2/triples-based-rag-overview.png
diff --git a/public/img/2024-01-15-llms-graphs-part-2/triples-based-rag-preprocessing.png b/public/img/2024-01-15-llms-graphs-part-2/triples-based-rag-preprocessing.png
diff --git a/public/img/2024-01-24-transforming-your-data-1/edge_tables.png b/public/img/2024-01-24-transforming-your-data-1/edge_tables.png
diff --git a/public/img/2024-01-24-transforming-your-data-1/graph_schema.png b/public/img/2024-01-24-transforming-your-data-1/graph_schema.png
diff --git a/public/img/2024-01-24-transforming-your-data-1/graph_viz.png b/public/img/2024-01-24-transforming-your-data-1/graph_viz.png
diff --git a/public/img/2024-01-24-transforming-your-data-1/kuzu_schema_viz.png b/public/img/2024-01-24-transforming-your-data-1/kuzu_schema_viz.png
diff --git a/public/img/2024-01-24-transforming-your-data-1/relational_schema.png b/public/img/2024-01-24-transforming-your-data-1/relational_schema.png
diff --git a/public/img/2024-02-23-transforming-your-data-2/dispute_graph_viz.png b/public/img/2024-02-23-transforming-your-data-2/dispute_graph_viz.png
diff --git a/public/img/2024-02-23-transforming-your-data-2/graph_schema_dispute.png b/public/img/2024-02-23-transforming-your-data-2/graph_schema_dispute.png
diff --git a/public/img/2024-02-23-transforming-your-data-2/kuzu_explorer_schema.png b/public/img/2024-02-23-transforming-your-data-2/kuzu_explorer_schema.png
diff --git a/public/img/2024-02-23-transforming-your-data-2/query_boston_panera.png b/public/img/2024-02-23-transforming-your-data-2/query_boston_panera.png
diff --git a/...mg/2024-02-23-transforming-your-data-2/query_disputed_transactions_vicinity.png b/...mg/2024-02-23-transforming-your-data-2/query_disputed_transactions_vicinity.png
diff --git a/public/img/2024-02-23-transforming-your-data-2/relational_schema_dispute.png b/public/img/2024-02-23-transforming-your-data-2/relational_schema_dispute.png
diff --git a/public/img/default.png b/public/img/default.png
diff --git a/src/content/post/2022-11-15-meet-kuzu.md b/src/content/post/2022-11-15-meet-kuzu.md
@@ -0,0 +1,58 @@
+---
+slug: "meet-kuzu"
+title: "Meet Kùzu 🤗"
+description: "Kùzu is a new embeddable property graph database management system (GDBMS) that is 
+designed for high scalability and very fast querying"
+pubDate: "Nov 15 2022"
+heroImage: "/img/default.png"
+categories: ["release"]
+authors: ["team"]
+tags: ["vision"]
+---
+
+# Meet Kùzu 🤗
+
+Today we are very excited to make an initial version of [Kùzu public on GitHub](https://github.com/kuzudb/kuzu)!
+Kùzu[^1] is a new embeddable property graph database management system (GDBMS) that is 
+designed for high scalability and very fast querying. We are releasing 
+Kùzu today under a permissive MIT license. Through years of research on GDBMSs, we observed a lack of
+highly efficient GDBMSs in the market that adopt state-of-the-art 
+querying and storage techniques and that can very easily integrate into applications, 
+similar to DuckDB or SQLite. Kùzu aims to fill this space and evolve into the 
+go-to open-source system to develop
+graph database applications, e.g., to manage and query your knowledge graphs, 
+and develop graph machine learning and analytics pipelines, 
+e.g., in the Python data science ecosystem. 
+
+Kùzu's core architecture is informed by 6 years of research we conducted 
+at the University of Waterloo on an earlier prototype GDBMS called [GraphflowDB](http://graphflow.io/). 
+Unlike GraphflowDB, which was intended to be a prototype for our research, Kùzu aims to be
+a usable feature-rich system. Some of the primary features of Kùzu's architecture are:
+
+- Flexible Property Graph Data Model and Cypher query language
+- Embeddable, serverless integration into applications
+- Columnar disk-based storage
+- Compressed sparse row-based (CSR) adjacency list/join indices
+- Vectorized and factorized query processor
+- Novel and very fast join algorithms
+- Multi-core query parallelism
+- Serializable ACID transactions
+
+What we are releasing today includes many of the features of the core engine. This is what we
+call "Phase 1" of the project. In the upcoming "Phase 2" of the project, as we continue adding 
+more features to the core engine, e.g., better support for ad-hoc properties, string compression,
+and support for new recursive queries, we will also focus on developing around the core engine
+to more easily ingest data into the system and output data to downstream data science/graph data science
+libraries. You can keep an eye on our tentative [roadmap here](https://github.com/kuzudb/kuzu/issues/981). 
+You can also read more about some of our longer term goals and vision as a system
+in [our new CIDR 2023 paper](https://cs.uwaterloo.ca/~ssalihog/papers/kuzu-tr.pdf), 
+which we will present in Amsterdam next January. 
+
+*And most importantly, please start using Kùzu, tell us your feature requests and use cases, and report bugs. We can evolve into a
+more stable, usable, and feature-rich system only through your feedback!* 
+
+We are looking forward to your feedback and a long and exciting journey as we continue developing Kùzu 🤗. 
+
+---
+
+[^1]: For interested readers: the word kù-zu is the Sumerian (the oldest known human language) word for "wisdom".
diff --git a/src/content/post/2023-01-12-what-every-gdbms-should-do.md b/src/content/post/2023-01-12-what-every-gdbms-should-do.md
@@ -1,9 +1,10 @@
 ---
+slug: "what-every-gdbms-should-do-and-vision"
 title: "What every competent GDBMS should do (a.k.a. the goals and vision of Kùzu)"
 description: "What every competent GDBMS should do (a.k.a. the goals and vision of Kùzu)"
 pubDate: "Jan 12 2023"
 heroImage: "/img/2023-01-12-what-every-gdbms-should-do/bachmann.png"
-categories: ["concepts"]
+categories: ["concept"]
 authors: ["semih"]
 tags: ["vision"]
 ---
@@ -73,9 +74,9 @@ enterprise applications.
 
 I want to start a 3-part blog post to cover the contents of our CIDR paper in a less academic language: 
 
-- __Post 1__: Kùzu's goals and vision as a system 
-- __Post 2__: [Factorization technique for compression](../2023-01-20-factorization)
-- __Post 3__: [Worst-case optimal join algorithms](../2023-02-22-wcoj)
+- __Post 1__: Kùzu's goals and vision as a system (this post)
+- __Post 2__: [Factorization technique for compression](../factorization)
+- __Post 3__: [Worst-case optimal join algorithms](../wcoj)
 
 In this Post 1, I discuss the following: 
    (i)   [an overview of GDBMSs](#overview-of-gdbms-and-a-bit-of-history).

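Review note: the hunk above swaps date-prefixed relative links (`../2023-01-20-factorization`) for slug-based ones (`../factorization`). For posts still to be migrated, the same rewrite could be scripted along the lines below. This is only a sketch: the `SLUGS` table and the `rewrite_links` helper are hypothetical names, and the mapping is inferred from the link changes visible in this diff, not read from the site's build configuration.

```python
import re

# Hypothetical mapping from old date-prefixed file stems to new slugs,
# inferred from the link changes in this diff (not from the site config).
SLUGS = {
    "2023-01-12-what-every-gdbms-should-do": "what-every-gdbms-should-do-and-vision",
    "2023-01-20-factorization": "factorization",
    "2023-02-22-wcoj": "wcoj",
}

# Matches the target of a relative Markdown link such as
# "](../2023-01-20-factorization)" with an optional "#anchor" suffix.
LINK = re.compile(r"\]\(\.\./(\d{4}-\d{2}-\d{2}-[^)#]+)(#[^)]*)?\)")


def rewrite_links(markdown: str) -> str:
    """Replace date-prefixed relative link targets with their slugs."""

    def repl(match: re.Match) -> str:
        stem = match.group(1)
        anchor = match.group(2) or ""
        slug = SLUGS.get(stem)
        if slug is None:
            # Unknown target: leave the link untouched rather than guess.
            return match.group(0)
        return f"](../{slug}{anchor})"

    return LINK.sub(repl, markdown)


if __name__ == "__main__":
    line = "- __Post 2__: [Factorization technique for compression](../2023-01-20-factorization)"
    print(rewrite_links(line))
```

Targets missing from the table are left unchanged, so a custom slug such as `what-every-gdbms-should-do-and-vision` has to be listed explicitly; it cannot be derived by stripping the date prefix.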
diff --git a/src/content/post/2023-01-20-factorization.md b/src/content/post/2023-01-20-factorization.md
@@ -1,9 +1,10 @@
 ---
+slug: "factorization"
 title: "Factorization and great ideas from database theory"
 description: "Factorization and great ideas from database theory"
 pubDate: "Jan 20 2023"
 heroImage: "/img/2023-01-20-factorization/factorization-banner.png"
-categories: ["concepts"]
+categories: ["concept"]
 authors: ["semih"]
 tags: ["internals", "factorization"]
 ---
@@ -55,7 +56,7 @@ In contrast, you can't use factorization to compress your raw database files.
 Factorization has a very unique property:
 it is designed to compress the intermediate 
 data that are generated when query processors of DBMSs evaluate 
-many-to-many (m-n) growing joins. If you have read [my previous blog](../2023-01-12-what-every-gdbms-should-do),
+many-to-many (m-n) growing joins. If you have read [my previous blog](../what-every-gdbms-should-do-and-vision),
 efficiently handling m-n joins was one of the items on my list of properties 
 that competent GDBMSs should excel in. This is because 
 the workloads of GDBMSs commonly contain m-n joins
@@ -471,7 +472,7 @@ in mind is called
 for another time. For now, I invite you to check our performance out on large queries 
 and let us know if we are slow on some queries! The Kùzu team says hi (👋 🙋‍♀️ 🙋🏽) and 
 is at your service to fix all performance bugs as we continue implementing the system! 
-My next post will be about the novel [worst-case optimal join algorithms](../2023-02-22-wcoj), which emerged
+My next post will be about the novel [worst-case optimal join algorithms](../wcoj), which emerged
 from another theoretical insight on m-n joins! Take care until then!
 
 ---
@@ -480,6 +481,6 @@ from another theoretical insight on m-n joins! Take care until then!
 
 [^2]: Vectorization emerged as a design in the context of columnar RDBMSs, which are analytical systems, about 15-20 years old. It is still a very good idea. The prior design was to pass a single tuple between operators called Volcano-style tuple-at-a-time processing, which is quite easy to implement, but quite inefficient on modern CPUs. If you have access to the following link, you can read all about it from the pioneers of [columnar RDBMSs](https://www.nowpublishers.com/article/Details/DBS-024).
 
-[^3]: Note that GDBMSs are able to avoid scans of entire files because notice that they do the join on internal record/node IDs, which mean something very specific. If a system needs to scan the name property of node with record/node ID 75, it can often arithmetically compute the disk page and offset where this is stored, because record IDs are dense, i.e., start from 0, 1, 2..., and so can serve as  pointers if the system's storage design exploits this. This is what I was referring to as "Predefined/pointer-based joins" in my [previous blog post](../2023-01-12-what-every-gdbms-should-do). This is a good feature of GDBMSs that allows them to efficiently evaluate the joins of node records that are happening along the "predefined" edges in the database. I don't know of a mechanism where RDBMSs can do something similar, unless they develop a mechanism to convert value-based joins to pointer-based joins. See my student [Guodong's work last year in VLDB](https://www.vldb.org/pvldb/vol15/p1011-jin.pdf) of how this can be done. In Kùzu, our sideways information passing technique follows Guodong's design in this work.
+[^3]: Note that GDBMSs are able to avoid scans of entire files because notice that they do the join on internal record/node IDs, which mean something very specific. If a system needs to scan the name property of node with record/node ID 75, it can often arithmetically compute the disk page and offset where this is stored, because record IDs are dense, i.e., start from 0, 1, 2..., and so can serve as  pointers if the system's storage design exploits this. This is what I was referring to as "Predefined/pointer-based joins" in my [previous blog post](../what-every-gdbms-should-do-and-vision). This is a good feature of GDBMSs that allows them to efficiently evaluate the joins of node records that are happening along the "predefined" edges in the database. I don't know of a mechanism where RDBMSs can do something similar, unless they develop a mechanism to convert value-based joins to pointer-based joins. See my student [Guodong's work last year in VLDB](https://www.vldb.org/pvldb/vol15/p1011-jin.pdf) of how this can be done. In Kùzu, our sideways information passing technique follows Guodong's design in this work.
 
 [^4]: Umbra is being developed by [Thomas Neumann](https://www.professoren.tum.de/en/neumann-thomas) and his group. If Thomas's name does not ring a bell let me explain his weight in the field like this. As the joke goes, in the field of DBMSs: there are gods at the top, then there is Thomas Neumann, and then other holy people, and then we mere mortals.
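
Review note on footnote 3: the claim that a dense record ID lets the system "arithmetically compute the disk page and offset" can be illustrated with a small sketch. The page size, the fixed value width, and the `locate` helper are assumptions made for illustration only; they do not describe Kùzu's actual storage layout, and a variable-length property such as a name would need an extra indirection.

```python
# Illustrative constants only; not Kùzu's real on-disk parameters.
PAGE_SIZE = 4096    # bytes per disk page (assumed)
VALUE_WIDTH = 8     # bytes per fixed-width property value (assumed)


def locate(node_id: int) -> tuple[int, int]:
    """Return (page index, byte offset within that page) for a dense node ID.

    Because IDs are dense (0, 1, 2, ...), the position of a value in a
    fixed-width column follows from integer arithmetic alone, with no
    index lookup or file scan.
    """
    values_per_page = PAGE_SIZE // VALUE_WIDTH
    page = node_id // values_per_page
    offset = (node_id % values_per_page) * VALUE_WIDTH
    return page, offset


if __name__ == "__main__":
    # Node ID 75 is the example used in the footnote.
    print(locate(75))
```

Under these assumed sizes, node ID 75 falls in page 0 at byte offset 600, which is the sense in which a dense ID can serve as a pointer.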