SWBus Actors
To simplify building services, we recommend using the actor model.
This approach fits what we are doing in SONiC well, because we are essentially managing the state of each resource (port, LAG, and so on) in response to changes in system state or to messages and requests sent from other services. The actor model makes it easy to keep the resource management logic safe (no cross-resource data access, hence no data locks or race conditions) and to scale in the future (actors can be mapped onto any number of threads).
One option is a simple model with no special framework features other than a runner and possibly a swbus message resender.
pub trait Actor {
    async fn handle_request(
        &mut self,
        payload: Vec<u8>,
        source: ServicePath,
    ) -> Result<{Continue | Quit}, Error> {
        actor::spawn(another_actor); // Spawn a new actor
        actor::send(OutgoingMessage::new(...)); // Queue a message to send when the callback finishes

        // The callback ends with exactly one of the following:
        return Ok(Continue); // Actor continues, messages get sent, request acked Ok
        return Ok(Quit); // Actor dies, messages get sent, request acked Ok
        return Err(...); // Actor continues, messages do NOT get sent, request acked Err
    }
}
Pros:
- Easily maps onto tokio: 1 actor = 1 tokio task. This makes lifetimes, spawning new actors, etc. easy because tokio already handles it.
- Very simple interface: get message and send message. Maybe also an init method.
- No special framework features (swss bridge, state); this makes it very easy to implement the framework, but maybe more complex to implement actors.
Cons:
- Open questions:
- How do we implement the special behavior of the big framework (swss bridge, state, actor dispatcher) as small actors?
- Needs an unsubscribe feature in swbus. When an actor quits, it should no longer be in the routing table.
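The return-value semantics of the simple model can be sketched with the standard library alone. This is an illustration under assumptions, not the framework's API: `Outcome`, `Context`, `run_actor`, and `Echo` are hypothetical names, the handler is synchronous and omits the `source` argument, and a plain loop stands in for the tokio task that would own the actor.

```rust
/// What an actor asks the runner to do after a request (hypothetical name).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Continue,
    Quit,
}

/// Collects messages queued during one callback; stands in for `actor::send`.
#[derive(Default)]
pub struct Context {
    queued: Vec<String>,
}

impl Context {
    pub fn send(&mut self, message: String) {
        self.queued.push(message);
    }
}

pub trait Actor {
    fn handle_request(&mut self, ctx: &mut Context, payload: &[u8]) -> Result<Outcome, String>;
}

/// Feeds requests to one actor in order.
/// Returns the messages that were actually sent and one ack (true = Ok) per handled request.
pub fn run_actor<A: Actor>(actor: &mut A, requests: Vec<Vec<u8>>) -> (Vec<String>, Vec<bool>) {
    let mut sent = Vec::new();
    let mut acks = Vec::new();
    for payload in requests {
        let mut ctx = Context::default();
        match actor.handle_request(&mut ctx, &payload) {
            Ok(outcome) => {
                sent.append(&mut ctx.queued); // success: queued messages go out
                acks.push(true);
                if outcome == Outcome::Quit {
                    break; // actor dies; later requests are never handled
                }
            }
            Err(_) => acks.push(false), // failure: queued messages are discarded
        }
    }
    (sent, acks)
}

/// Example actor: echoes each payload, fails on an empty one, quits on "quit".
pub struct Echo;

impl Actor for Echo {
    fn handle_request(&mut self, ctx: &mut Context, payload: &[u8]) -> Result<Outcome, String> {
        ctx.send(format!("echo:{}", String::from_utf8_lossy(payload)));
        if payload.is_empty() {
            return Err("empty payload".to_string());
        }
        if payload == b"quit" {
            return Ok(Outcome::Quit);
        }
        Ok(Outcome::Continue)
    }
}
```

Because the runner owns both the actor and the per-request `Context`, dropping queued messages on `Err` needs no cooperation from the actor itself. In a tokio-based version the same loop would presumably read from a channel inside a spawned task.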
An actor is a struct that implements two callback functions:
handle_request(&mut self, state: &mut State, outbox: &mut Outbox, message: IncomingMessage)
handle_table_update(&mut self, state: &mut State, outbox: &mut Outbox, key: Key<'_>)
These functions have access to the actor's state (SWSS tables) and can send Swbus messages. State changes (SWSS table writes) and outgoing messages are only committed/sent if the function succeeds.
Actor state is a local, in-memory copy of the SWSS table entries that an actor is subscribed to. One state table corresponds to one key in an SWSS table. Actors must subscribe to specific keys at initialization. State is split into two categories: input and output tables.
- Input tables are read-only. When an input table is updated by another service, it triggers handle_table_update on the actor.
- Output tables are read-write. When an output table is written to during an actor callback, and that callback succeeds, the writes are sent to the database.
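The input/output split and the commit-on-success rule described above can be sketched as follows. Everything here is an assumption made for illustration: the `State`, `Outbox`, and `Committed` layouts, the `run_callback` helper, and the string-based fields are hypothetical, and `Committed` merely stands in for the database and Swbus. The callback works on a scratch copy of the state, which is only published if it returns `Ok`.

```rust
use std::collections::HashMap;

/// Field/value pairs of one table entry.
pub type Fields = HashMap<String, String>;

/// Local copy of the entries an actor subscribed to (hypothetical layout).
#[derive(Default, Clone)]
pub struct State {
    input: HashMap<String, Fields>,  // read-only for the actor
    output: HashMap<String, Fields>, // read-write for the actor
}

impl State {
    pub fn subscribe_input(&mut self, key: &str, fields: &[(&str, &str)]) {
        let entry: Fields = fields.iter().map(|(f, v)| (f.to_string(), v.to_string())).collect();
        self.input.insert(key.to_string(), entry);
    }
    pub fn subscribe_output(&mut self, key: &str) {
        self.output.insert(key.to_string(), Fields::new());
    }
    /// Input tables only hand out shared references.
    pub fn input(&self, key: &str) -> Option<&Fields> {
        self.input.get(key)
    }
    pub fn output(&self, key: &str) -> Option<&Fields> {
        self.output.get(key)
    }
    /// Output tables can be modified, but only for keys subscribed at initialization.
    pub fn output_mut(&mut self, key: &str) -> Option<&mut Fields> {
        self.output.get_mut(key)
    }
}

/// Messages queued by a callback.
#[derive(Default)]
pub struct Outbox {
    messages: Vec<String>,
}

impl Outbox {
    pub fn send(&mut self, message: &str) {
        self.messages.push(message.to_string());
    }
}

/// Stand-in for the database and Swbus: records what was actually committed.
#[derive(Default)]
pub struct Committed {
    pub writes: HashMap<String, Fields>,
    pub sent: Vec<String>,
}

/// Runs one callback against a scratch copy of the state.
/// On Ok, changed output tables and queued messages are committed; on Err, both are dropped.
pub fn run_callback<F>(state: &mut State, db: &mut Committed, callback: F) -> Result<(), String>
where
    F: FnOnce(&mut State, &mut Outbox) -> Result<(), String>,
{
    let mut scratch = state.clone();
    let mut outbox = Outbox::default();
    callback(&mut scratch, &mut outbox)?; // early return leaves `state` and `db` untouched
    for (key, fields) in &scratch.output {
        if state.output.get(key) != Some(fields) {
            db.writes.insert(key.clone(), fields.clone());
        }
    }
    db.sent.append(&mut outbox.messages);
    *state = scratch;
    Ok(())
}
```

Cloning the whole state per callback is the simplest way to get all-or-nothing behavior; a real implementation might track dirty fields instead to avoid the copy.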
The SWSS bridge is the part of the actor framework that is responsible for listening to input table updates and writing to output tables. It is not an actor, but it is intrinsically tied to the actor framework. It keeps track of which actors are subscribed to which input table keys, and sends table updates to them accordingly. If an update arrives for a key that no actor is subscribed to (for example, a newly added key), it is sent to the actor dispatcher instead.
The actor dispatcher is a special callback in the SWSS bridge that receives table updates that are not subscribed to by an actor. This callback can choose to ignore the update, or spawn a new actor and forward the update.
For example, a new HAMgrD ENI would be brought up when a new ENI ID key is written to a DASH table: the SDN controller writes a new key -> the SWSS bridge reads the update, but nobody is subscribed -> the actor dispatcher receives the update and spawns a new actor to handle it.
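The routing between the SWSS bridge and the actor dispatcher might look roughly like the sketch below. All names (`TableUpdate`, `EniActor`, `Dispatch`, `Bridge`, `eni_dispatcher`) and the `ENI|` key prefix are hypothetical, and the sketch assumes that a freshly spawned actor is subscribed to the key that triggered it. Actors here only record the updates they receive.

```rust
use std::collections::HashMap;

/// One table update delivered by the bridge (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct TableUpdate {
    pub key: String,
    pub value: String,
}

/// Minimal actor: just records the updates it was handed.
#[derive(Default)]
pub struct EniActor {
    pub seen: Vec<TableUpdate>,
}

/// What the dispatcher decides for an update nobody is subscribed to.
pub enum Dispatch {
    Ignore,
    Spawn(EniActor),
}

/// Tracks which actor is subscribed to which key and routes updates.
pub struct Bridge<D: FnMut(&TableUpdate) -> Dispatch> {
    subscribers: HashMap<String, EniActor>,
    dispatcher: D,
}

impl<D: FnMut(&TableUpdate) -> Dispatch> Bridge<D> {
    pub fn new(dispatcher: D) -> Self {
        Bridge { subscribers: HashMap::new(), dispatcher }
    }

    pub fn on_update(&mut self, update: TableUpdate) {
        // Subscribed key: deliver straight to the owning actor.
        if let Some(actor) = self.subscribers.get_mut(&update.key) {
            actor.seen.push(update);
            return;
        }
        // Unsubscribed key: let the dispatcher ignore it or spawn an actor for it.
        if let Dispatch::Spawn(mut actor) = (self.dispatcher)(&update) {
            let key = update.key.clone();
            actor.seen.push(update); // forward the triggering update
            self.subscribers.insert(key, actor);
        }
    }

    pub fn actor_count(&self) -> usize {
        self.subscribers.len()
    }

    pub fn seen_by(&self, key: &str) -> Option<&[TableUpdate]> {
        self.subscribers.get(key).map(|actor| actor.seen.as_slice())
    }
}

/// Example dispatcher: spawn an actor only for keys under the made-up "ENI|" prefix.
pub fn eni_dispatcher(update: &TableUpdate) -> Dispatch {
    if update.key.starts_with("ENI|") {
        Dispatch::Spawn(EniActor::default())
    } else {
        Dispatch::Ignore
    }
}
```

Keeping the dispatcher as a plain callback, rather than an actor, matches the description above: the bridge calls it synchronously and the only side effect is an optional new subscription.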
- Actors are spawned by table keys being written. If hamgrd dies, all the actors go down, and we forget which keys were written to. When it restarts, how do we know which actors to bring back up?
- When initializing actor state, should it be empty? Should we have a best-effort rehydration (Table::get() each one, and if we get some data, use it, otherwise initialize empty)?
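If the best-effort rehydration option were chosen, it could look roughly like this. The `Table` trait below is a stand-in that models only a `get` call, not the real SWSS table API; `MemTable` and `rehydrate` are hypothetical names. The sketch does not address the first question (which actors to bring back up); it only shows how a known set of subscribed keys would be initialized.

```rust
use std::collections::HashMap;

/// Field/value pairs of one table entry.
pub type Fields = HashMap<String, String>;

/// Stand-in for an SWSS table handle; only a `get` lookup is modeled.
pub trait Table {
    fn get(&self, key: &str) -> Option<Fields>;
}

/// In-memory table used to exercise the example.
#[derive(Default)]
pub struct MemTable {
    rows: HashMap<String, Fields>,
}

impl MemTable {
    pub fn set(&mut self, key: &str, field: &str, value: &str) {
        self.rows
            .entry(key.to_string())
            .or_default()
            .insert(field.to_string(), value.to_string());
    }
}

impl Table for MemTable {
    fn get(&self, key: &str) -> Option<Fields> {
        self.rows.get(key).cloned()
    }
}

/// Builds initial actor state for the subscribed keys:
/// use whatever the table already holds, otherwise start with an empty entry.
pub fn rehydrate(table: &dyn Table, keys: &[&str]) -> HashMap<String, Fields> {
    keys.iter()
        .map(|&key| (key.to_string(), table.get(key).unwrap_or_default()))
        .collect()
}
```

With this shape, a cold start and a restart share one code path: a missing entry simply yields an empty table for that key.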