- Configurable maximum size of one message
  - via property `message.length.max.bytes` in `application.properties`
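    For example (the value shown here is purely illustrative, not a shipped default):

    ```properties
    message.length.max.bytes=1048576
    ```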
- Dockerfile and other necessary materials for quick deployments
  - Dockerfile for deployments
  - docker-compose.yaml for local environment
- Estimated system workload: 400 rps, with 99% of requests completing within 0.5 s.
- Publisher and subscriber concurrent load tests with 200 rps each:
  - Please refer to Publisher.scala and Subscriber.scala for more information about the load test plans.
  - These can be repeated by running `docker-compose up` in the root directory of this project.
- Messages are NOT lost unless a hardware-level incident occurs.
- Example HTTP calls are given with their respective documentation:
We can improve the current throughput by committing server state changes to a cache layer.
For each client request, we can query or update the cache layer accordingly, respond to the client, and then persist the state changes to disk asynchronously.
It is worth noting that, by implementing this feature, we sacrifice the guarantee that messages are not lost in the event of a hardware failure (for example, the disk fails during the asynchronous call that persists a message after the server has already responded to the client).
Therefore, this behaviour should be enabled or disabled via configuration properties so that the fault tolerance guarantees of the server can be customised.
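A minimal sketch of that write path is shown below. The names are illustrative and not part of the current codebase: the server commits the change to an in-memory cache, acknowledges the client, and persists to disk in the background.

```scala
import scala.collection.mutable
import scala.concurrent.{ExecutionContext, Future}

// Illustrative only: a message store that acknowledges writes before they
// reach disk. If persistToDisk fails after publish returns, the message is
// lost even though the client already received a success response.
class CachedMessageStore(persistToDisk: (String, String) => Unit)
                        (implicit ec: ExecutionContext) {

  private val cache = mutable.Map.empty[String, Vector[String]]

  def publish(topic: String, message: String): Unit = {
    // 1. Commit the change to the cache layer (fast, in-memory).
    synchronized {
      cache.update(topic, cache.getOrElse(topic, Vector.empty) :+ message)
    }
    // 2. Persist asynchronously; the client response does not wait for this.
    Future(persistToDisk(topic, message))
  }
}
```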
We can improve the current throughput by running the server on a machine with higher specifications, especially faster I/O.
This might not be the best option, since the server remains a single point of failure.
We can deploy multiple instances of the server into a cluster, preferably on different machines for redundancy.
This approach can be modelled on well-established solutions such as Apache Kafka.
Rafka maintains the following data elements:
- Subscriber offset (per subscriber)
- Topic messages (per topic)
Therefore, we need to devise a partitioning logic for each of these data elements.
Each instance must be able to serve an individual request as before, while requiring only a view of its own data partition.
The cluster must also be able to load-balance requests to all instances.
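One possible partitioning scheme, sketched below purely as an illustration (it is not an existing Rafka feature), is to hash the topic name, so that a topic's messages and the subscriber offsets that reference it are always handled by the same instance.

```scala
// Illustrative sketch: route a topic (its messages and the subscriber
// offsets keyed by it) to one of n cluster instances by hashing its name.
def partitionFor(topic: String, numInstances: Int): Int =
  math.abs(topic.hashCode % numInstances)

// Example: with a 3-node cluster, every request for topic "payments"
// resolves to the same node index.
// val nodeIndex = partitionFor("payments", 3)
```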
Publishers and subscribers are currently identified by their IP address, or by the IP address of the last proxy that forwarded the request. This is not foolproof and can be spoofed.
Instead, we can require clients to authenticate on a per-request basis via the HTTP Authorization request header, and use the authenticated principal as the client identity rather than the IP address.
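A minimal sketch, assuming Basic authentication purely for illustration (the actual scheme and the credential verification step are left open):

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

// Illustrative only: derive a principal from a Basic Authorization header.
// Real usage would also verify the credentials before trusting the identity.
def principalFrom(authorizationHeader: String): Option[String] =
  authorizationHeader.split(" ", 2) match {
    case Array("Basic", encoded) =>
      val credentials =
        new String(Base64.getDecoder.decode(encoded), StandardCharsets.UTF_8)
      credentials.split(":", 2).headOption // the username becomes the client identity
    case _ => None
  }
```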
Only the mime-type 'application/json' and the charset 'utf-8' are supported.
In addition, we can provide support for multiple mime-types and charsets.
Traffic is not compressed, resulting in larger network bandwidth utilization.
We can provide support for content encoding to compress the traffic.
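As a rough sketch (the HTTP layer would negotiate this via the Accept-Encoding and Content-Encoding headers; the helper below is illustrative only), response bodies could be gzip-compressed before being written out:

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream

// Illustrative only: gzip a response body so it can be served with
// "Content-Encoding: gzip" when the client advertises support for it.
def gzip(body: Array[Byte]): Array[Byte] = {
  val buffer = new ByteArrayOutputStream()
  val out = new GZIPOutputStream(buffer)
  try out.write(body) finally out.close()
  buffer.toByteArray
}
```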
If we scale the cluster horizontally (please see above), messages can be lost in the event of hardware-level failures of individual nodes in the cluster.
To improve the fault tolerance guarantees of the system, we also have to tackle the problem of maintaining the overall health of the cluster and ensuring that data is properly replicated across cluster nodes.
This is a distributed system problem in itself, requiring a leader election mechanism, replication management, etc. We could look at well-established solutions such as Apache ZooKeeper to solve these problems.