[docs] Fix grammar in docs #1404 (#1406)
* Update server.md

* Update performance.md

* Update efficient-agdb.md

* Update queries.md

* Update queries.md

* Update concepts.md

* update grammar
michaelvlach authored Dec 23, 2024
1 parent 5cf22bb commit 4d53417
Showing 15 changed files with 172 additions and 139 deletions.
4 changes: 2 additions & 2 deletions agdb_web/pages/en-US/api-docs/php.mdx
@@ -7,13 +7,13 @@ import { Callout, Steps } from "nextra/components";

# PHP

-The php agdb API client is generated with [openapi-generator](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/generators/php.md). The following is the quickstart guide for the agdb client in PHP (connecting to the server). It assumes an `agdb_server` is running locally. Please refer to the [server guide](/docs/guides/how-to-run-server) to learn how to run the server.
+The PHP agdb API client is generated with [openapi-generator](https://github.com/OpenAPITools/openapi-generator/blob/master/docs/generators/php.md). The following is the quickstart guide for the agdb client in PHP (connecting to the server). It assumes an `agdb_server` is running locally. Please refer to the [server guide](/docs/guides/how-to-run-server) to learn how to run the server.

Looking for... [how to run a server?](/docs/guides/how-to-run-server) | [another language?](/api-docs/openapi) | [embedded db guide?](/docs/guides/quickstart)

## Usage

-The following is the from-scratch guide to use `agdb-api` php package.
+The following is the from-scratch guide to use `agdb-api` PHP package.

<Steps>
### Install PHP
4 changes: 2 additions & 2 deletions agdb_web/pages/en-US/api-docs/rust.mdx
@@ -19,7 +19,7 @@ Looking for... [how to run a server?](/docs/guides/how-to-run-server) | [another

Please install the Rust toolchain from the [official source](https://www.rust-lang.org/tools/install).

-### Create an applicaiton
+### Create an application

First we initialize an application called `agdb_client` with cargo:

@@ -54,7 +54,7 @@ async fn main() -> anyhow::Result<()> {

### Create a database user

-First we need to login as default admin user and create our database user and then login as them:
+First we need to log in as default admin user and create our database user and then login as them:

```rs
client.user_login("admin", "admin").await?; // The authentication login is stored in
6 changes: 3 additions & 3 deletions agdb_web/pages/en-US/api-docs/typescript.mdx
@@ -23,7 +23,7 @@ https://nodejs.org/en

### Create your project

-Let's create a directory (e.g. `my_agdb`) and nitialize the package:
+Let's create a directory (e.g. `my_agdb`) and initialize the package:

```bash
mkdir my_agdb
@@ -76,7 +76,7 @@ async function main() {
}
```

-### Create a databasse user
+### Create a database user

To create a database user we use the default admin user:

@@ -145,7 +145,7 @@ let results = (await client.db_exec({ owner: "user1", db: "db1" }, queries))
.data;
```

-### Print the the result of the final query to the console:
+### Print the result of the final query to the console:

```ts
console.log(`User (id: ${results[3].elements[0].id})`);
6 changes: 3 additions & 3 deletions agdb_web/pages/en-US/blog/object-queries.mdx
@@ -5,7 +5,7 @@ description: "Blog, Agnesoft Graph Database"

# Object queries

-The most ubiquitous database query language is SQL which is text based language created in the 1970s. Its biggest advantage is that being text based it can be used from any language to communicate with the database. However just like relational (table) bases databases from the same era it has some major flaws:
+The most ubiquitous database query language is SQL which is text based language created in the 1970s. Its biggest advantage is that being text based it can be used from any language to communicate with the database. However, just like relational (table) bases databases from the same era it has some major flaws:

- It needs to be parsed and interpreted by the database during runtime leading to common syntax errors that are hard or impossible to statically check.
- Being a separate programming language from the client coding language increases cognitive load on the programmer.
@@ -14,6 +14,6 @@ The most ubiquitous database query language is SQL which is text based language

The last point is particularly troublesome because it partially stems from the `schema` issue discussed in the previous points. One common way to avoid changing the schema is to transform the data via queries. This is not only less efficient than representing the data in the correct form directly but also increases the complexity of queries significantly.

-The solutions include heavily sanitizing the user inputs in an attempt to prevent SQL injection attacks, wrapping the constructing of SQL in a builder-pattern to prevent syntax errors and easing the cognitive load by letting programmers create their queries in their main coding language. The complexity is often being reduced by the use of stored SQL procedures (pre-created queries). However all of these options can only mitigate the issues SQL has.
+The solutions include heavily sanitizing the user inputs in an attempt to prevent SQL injection attacks, wrapping the constructing of SQL in a builder-pattern to prevent syntax errors and easing the cognitive load by letting programmers create their queries in their main coding language. The complexity is often being reduced by the use of stored SQL procedures (pre-created queries). However, all of these options can only mitigate the issues SQL has.

-Using native objects representing the queries eliminate all of the SQL issues sacrificing the portability between languages. However that can be relatively easily be made up via already very mature (de)serialization of native objects available in most languages. Using builder pattern to construct these objects further improve their correctness and readability. Native objects carry no additional cognitive load on the programmer and can be easily used just like any other code.
+Using native objects representing the queries eliminate all of SQL issues sacrificing the portability between languages. However, that can be relatively easily be made up via already very mature (de)serialization of native objects available in most languages. Using builder pattern to construct these objects further improve their correctness and readability. Native objects carry no additional cognitive load on the programmer and can be easily used just like any other code.
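
The builder-pattern idea described in the diffed paragraph above can be sketched in a few lines of TypeScript. This is a toy illustration only: the names (`QueryBuilder`, `insertNodes`, `search`) are invented for the example and are not the actual agdb API.

```typescript
// A query is an ordinary, type-checked native object -- no text to parse,
// no runtime syntax errors. (Invented shapes, not the real agdb query types.)
type Query =
    | { insertNodes: { count: number } }
    | { search: { from: number } };

class QueryBuilder {
    // Each step of the builder narrows what can legally come next,
    // so malformed queries fail to compile instead of failing at runtime.
    static insert() {
        return {
            nodes() {
                return {
                    count(n: number): Query {
                        return { insertNodes: { count: n } };
                    },
                };
            },
        };
    }
    static search() {
        return {
            from(id: number): Query {
                return { search: { from: id } };
            },
        };
    }
}

const queries: Query[] = [
    QueryBuilder.insert().nodes().count(3),
    QueryBuilder.search().from(1),
];

console.log(JSON.stringify(queries[0])); // {"insertNodes":{"count":3}}
```

Because the finished query is a plain object, it can be (de)serialized to JSON for transport between languages, which is how the portability sacrificed by dropping a textual query language is recovered.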
@@ -5,10 +5,10 @@ description: "Blog, Agnesoft Graph Database"

# Sharding, replication and performance at scale

-Most databases tackle the issue of (poor) performance at scale by scaling up using replication/sharding strategies. While these techniques are definitely useful and they are planned for `agdb` they should be avoided as much as possible. The increase in complexity when using replication and/or sharding is dramatic and it has adverse performance impact meaning it is only worth it if there is no other choice.
+Most databases tackle the issue of (poor) performance at scale by scaling up using replication/sharding strategies. While these techniques are definitely useful, and they are planned for `agdb` they should be avoided as much as possible. The increase in complexity when using replication and/or sharding is dramatic, and it has adverse performance impact meaning it is only worth it if there is no other choice.

-The `agdb` is designed so that it performs well regardless of the data set size. Direct access operations are O(1) and there is no limit on concurrency. Write operations are O(1) amortized however they are exclusive - there can be only one write operation running on the database at any given time preventing any other read or write operations at the same time. You will still get O(n) complexity when searching the (sub)graph as reading a 1000 connected nodes will take 1000 O(1) operations = O(n) same as reading 1000 rows in a table. However if the data does not indiscriminately connect everything to everything one can have as large data set as the hardware can fit without performance issues. The key is querying only subset of the graph (subgraph) since your query will have performance based on that subgraph and not all the data stored in the database.
+The `agdb` is designed so that it performs well regardless of the data set size. Direct access operations are O(1) and there is no limit on concurrency. Write operations are O(1) amortized however they are exclusive - there can be only one write operation running on the database at any given time preventing any other read or write operations at the same time. You will still get O(n) complexity when searching the (sub)graph as reading 1000 connected nodes will take 1000 O(1) operations = O(n) same as reading 1000 rows in a table. However, if the data does not indiscriminately connect everything to everything one can have as large data set as the hardware can fit without performance issues. The key is querying only subset of the graph (subgraph) since your query will have performance based on that subgraph and not all the data stored in the database.

-The point here is that scaling has significant cost regardless of technology or clever tricks. Only when the database starts exceeding limits of a single machine they shall be considered because adding data replication/backup will mean huge performance hit. To mitigate it to some extent caching can be used but it can never be as performant as local database. The features "at scale" are definitely coming you should avoid using them as much as possible even if available.
+The point here is that scaling has significant cost regardless of technology or clever tricks. Only when the database starts exceeding limits of a single machine they shall be considered because adding data replication/backup will mean huge performance hit. To mitigate it to some extent caching can be used, but it can never be as performant as local database. The features "at scale" are definitely coming you should avoid using them as much as possible even if available.

[For real world performance see dedicated documentation.](/docs/references/performance)
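
The subgraph argument in the diffed paragraph above can be illustrated with a short sketch: traversal from a node costs one O(1) lookup per reachable node, independent of how many other nodes the database holds. The adjacency-map `Graph` here is an invented stand-in, not agdb internals.

```typescript
// node id -> neighbor ids; Map lookups are O(1) on average.
type Graph = Map<number, number[]>;

// Depth-first search that visits only the subgraph reachable from `start`.
function reachable(graph: Graph, start: number): Set<number> {
    const seen = new Set<number>([start]);
    const stack = [start];
    while (stack.length > 0) {
        const node = stack.pop()!;
        for (const next of graph.get(node) ?? []) { // one O(1) lookup per visited node
            if (!seen.has(next)) {
                seen.add(next);
                stack.push(next);
            }
        }
    }
    return seen;
}

// A database with 1,000,000 nodes, of which only a 3-node chain is connected:
const graph: Graph = new Map();
for (let i = 0; i < 1_000_000; i++) graph.set(i, []);
graph.set(0, [1]);
graph.set(1, [2]);

// Searching from node 0 touches exactly the 3-node subgraph,
// regardless of the other 999,997 nodes in the store.
console.log(reachable(graph, 0).size); // 3
```

This is the sense in which query cost tracks the queried subgraph rather than total data set size: the total node count never appears in the traversal's work.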
4 changes: 2 additions & 2 deletions agdb_web/pages/en-US/blog/single-file.mdx
@@ -5,8 +5,8 @@ description: "Blog, Agnesoft Graph Database"

# Single file

-All operating systems have fairly low limit on number of open file descriptors for a program and for all programs in total making this system resource one of the rarest. Furthermore operating over multiple files does not seem to bring in any substantial benefits for the database while it complicates its implementation significantly. The graph database typically needs to have access to the full graph at all times unlike say key-value stores or document databases. Splitting the data into multiple files would therefore be actually detrimental. Lastly overall storage taken by the multiple files would not actually change as the amount of data would be the same.
+All operating systems have fairly low limit on number of open file descriptors for a program and for all programs in total making this system resource one of the rarest. Furthermore, operating over multiple files does not seem to bring in any substantial benefits for the database while it complicates its implementation significantly. The graph database typically needs to have access to the full graph at all times unlike say key-value stores or document databases. Splitting the data into multiple files would therefore be actually detrimental. Lastly overall storage taken by the multiple files would not actually change as the amount of data would be the same.

-Conversely using just a single file (with a second temporary write ahead log file) makes everything simpler and easier. You can for example easily transfer the data to a different machine - it is just one file. The database can also operate on the file directly if memory mapping was turned off to save RAM at the cost of performance. The program would not need to juggle multiple files consuming valuable system resources.
+Conversely, using just a single file (with a second temporary write ahead log file) makes everything simpler and easier. You can for example easily transfer the data to a different machine - it is just one file. The database can also operate on the file directly if memory mapping was turned off to save RAM at the cost of performance. The program would not need to juggle multiple files consuming valuable system resources.

The one file is the database and the data.
