Skip to content

Commit

Permalink
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add docs
Browse files Browse the repository at this point in the history
Sergiu Marin authored and Sergiu Marin committed May 30, 2024

Verified

This commit was signed with the committer’s verified signature.
kianenigma Kian Paimani
1 parent 019d280 commit 06285ee
Showing 1 changed file with 116 additions and 0 deletions.
116 changes: 116 additions & 0 deletions docs/architecture.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
Sonic follows the proactor model of asynchronous completion. For any asynchronous operation, such as a read/write from/to a socket, the user provides a callback which will be invoked at some point in the future, when the operation completes, either successfully, or through an error. In some cases, the completion handler will be invoked immediately as the operation can complete within the asynchronous call.

For example:
```go
b := make([]byte, 4096) // buffer to read into
conn.AsyncRead(b, func(err error, n int) {
// if err == nil, then n is the number of bytes read
// if err != nil, then the read failed and n >= 0
})
```

Note that the user can invoke another asynchronous operation in the provided callback:
```go
var onRead func(error, int)
onRead = func(err error, n int) {
conn.AsyncRead(b, onRead)
}
conn.AsyncRead(b, onRead) // the first read
```

Since the first read can complete immediately, `onRead` can get called before the first `AsyncRead` returns. This also holds for the following `AsyncRead`s done in the callback. In other words, if multiple reads can complete immediately, we keep growing the call stack as we never get the change to return from an `AsyncRead`. The solution is to limit the number of consecutive immediate reads - if we hit the limit, we asynchronously schedule the next read, hence popping the call frames. For example, see `conn.go`.

# The central IO component

The `IO struct` (also called the IO context) is the main entity which provides users with the ability to schedule asynchronous operations. There must be only one `IO` per goroutine. Multiple `IO`s can exist within a program.

Any construct which must provide asynchronous operations, such as `conn.go`, takes an instance of an IO context.

Since the IO context is the central pillar of Sonic, every program's main function will end with a call to run the IO. For example:

```go
ioc.Run() // runs until the program is terminated, yielding the CPU between async operations

for { ioc.PollOne() } // runs until the program is terminated, polling for IO, thus never yielding the CPU between IO operations
```

Under the hood, the IO context uses a platform specific event notification system to get informed when an IO operation can be completed (`epoll` for Linux, `kqueue` for BSD/macOS). In short, this event notification systems allow us to register a socket for read/write events. Calling `epoll` after registering a socket might return an event on that socket, telling us whether it's ready to be read.

# Networking constructs

## Transports

- TCP client/server: `conn.go`, `listen_conn.go`
- connection oriented UDP peer: `packet.go` and `listen_packet.go`
- UDP multicast: see the `multicast` package, which supports IPv4 and source IP filtering.

## Codecs

Codecs are meant to sit on top of stream based protocols. Currently, the only stream-based transport in Sonic is a TCP connection.

Stream transports do not give any guarantee on the number of bytes read. A single socket read might end up with 0 or 100 bytes. This is in contrast with packet based transports, like UDP(/multicast), in which every read returns a complete packet. If your buffer is big enough, you'll read the whole packet. If your packet is smaller than a packet, you'll read as much as possible from the packet and the rest will be discarded.

To help with writing application layer protocols on top of stream based transports, such as WebSocket, Sonic provides users with the notion of a `codec` and `codec stream`.

A `codec` is simply an interface with two functions:
- `Encode(Item, *ByteBuffer) error`: encodes the given `Item` into the `ByteBuffer`
- `Decode(*ByteBuffer) (Item, error)`: decodes and returns the `Item` from the `ByteBuffer`, or, if there are not enough bytes, returns `nil, sonicerrors.ErrNeedMore`.

A `codec stream` takes a user defined `codec` and implements read functions that return an `Item` and write functions that write an `Item`. A `codec stream` automatically handles short reads as notified by the `codec` through `sonicerrors.ErrNeedMore`.

See `codec/frame.go` for a sample codec.

## Buffers

Sonic offers three buffer types:
- `ByteBuffer`: a FIFO like buffer on which `codec`s are based. Not extremely efficient as data is copied on some operations needed by `codec`s
- `MirroredBuffer`: a zero-copy `ByteBuffer` - this was recently added and needs a bit more code to fully replace the `ByteBuffer`
- `BipBuffer`: a zero-copy FIFO buffer best suited for packet based communication

### BipBuffer in Packet Transports

The purpose of the `BipBuffer` is to offer an efficient way to store packets in the event of loss, while the missing packets are replayed.

Say each packet is 1 byte and carries a sequence number. Packet loss occurs when the next received sequence number does not follow the previous one. For example, if we receive `1 2 4` then we miss packet `3`. We must buffer `4` and everything that follows it until we replay packet `3` (through a TCP feed, for example).

To store packets until the missing ones are replayed, we use a `BipBuffer`. Say
`| | | | |` is a bip buffer which can hold 4 bytes. Reading `1 2` goes as follows:
```
|1| | | |
|2| | | |
```

Reading `4 5 6` then results in buffering each packet
```
|4|5|6|x| --> the next read is done in x
```

If we somehow get 3 again, we can then iterate over the `BipBuffer` to process `4 5 6`. All subsequent reads will be done in the place of `x` as long as they're in sequence. If `x` is out of sequence, then it's stored there and we read the next packet in the first byte slot.

If the `BipBuffer` is not full, then it acts like a circular buffer: if the next packet does not fit at the end of the buffer and there's enough space at the beginning, then it will be put there. If not, then an error will be returned - the user can then decide what to do in the event of a full buffer.

Packet loss is the most common reliability issue in exchange communications. Packet re-ordering happens rarely, if ever. As such, we can use the zero-copy nature of a `BipBuffer` to efficiently store out of sequence packets until we replay the missing ones.

#### Efficiently replaying packets

Say we have the packets `1 3 5 7` - we miss `2 4 6`. In a program, we will process `1` and then queue `3 5 7` until `2` is received - when that happens, we are left with `5 7` at which point we wait for `4` etc.

An efficient program will issue replay requests as gaps happen, and not when they're encountered. That means that in our example above, we issue a replay request for `2` after `3`, for `4` after `5` and for `6` after `7`, as they're read from the network. The innefficient alternative is to issue replay requests as the packets are read from the `BipBuffer`.

This can be achieved by keeping track of two sequence numbers:
- `present`: the last valid sequence number read from the network. In the case of `1 3 5 7`, this will be `1`.
- `future`: the sequence number assuming all missing packets are delivered. In the case of `1 3 5 7`, this will be `7`

The flow is as follows:
```
read 1: present=1 future=1
read 3: present=1 future=3, replay 2
read 5: present=1 future=5, replay 4
read 7: present=1 future=7, replay 6
2 is replayed: present=3 future=7
4 is replayed: present=5 future=7
6 is replayed: present=7 future=7
```

We know we don't miss anything if `present == future`. We know we miss some packets if `present < future`. We can never have `present > future`. We know whether we need to replay something after reading a packet by comparing its sequence number with `future`. Say we read `seq`. We then request packets from `future + 1` to `seq - 1` if `seq > future`.

0 comments on commit 06285ee

Please sign in to comment.