Resource Host Discovery Requirements

Introduction

Unbound resource hosts are resource hosts that are not part of a peer. These resource hosts may be newly created or may have been removed from a peer previously. Either way, they MUST announce their availability as unbound resource hosts.

Unbound resource hosts MUST automatically discover peers existing on their LAN willing to accept them, and request to join them. Approval of the resource host join request is at the discretion of the peer administrator and not within the scope of this feature's requirements.

Resource Host Roles

Resource hosts can be thought of as having different roles:

Unbound: not part of a peer
Bound: part of a peer
Peer: part of a peer and hosting the peer management container

Different events can change the role of a resource host:

the import of a management container (Unbound -> Peer)
joining a peer (Unbound -> Bound)
leaving a peer (Bound -> Unbound)

Since the role can change, the agent running on the resource host MUST identify its role before taking any action. To detect the peer role, the agent MUST check whether a management container has been installed on its resource host. To tell bound from unbound, it MUST also check whether the resource host is already participating in a peer.

Communications and Workflow

Based on its role the agent will take different actions.

If unbound the agent announces availability
If bound the agent does nothing
If a peer the agent responds to availability announcements
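
The role check and the per-role actions above can be sketched in Go as follows. This is only an illustration: the names Role, DetectRole and Action are hypothetical, and the two boolean inputs stand in for whatever probes the agent really uses to find a management container or an existing peer membership.

```go
package main

import "fmt"

// Role is the discovery role of a resource host.
type Role int

const (
	Unbound Role = iota // not part of a peer
	Bound               // part of a peer
	Peer                // part of a peer and hosting the management container
)

// DetectRole derives the role from two local checks. Both inputs are
// assumptions standing in for the agent's real probes.
func DetectRole(hasManagementContainer, joinedPeer bool) Role {
	switch {
	case hasManagementContainer:
		return Peer
	case joinedPeer:
		return Bound
	default:
		return Unbound
	}
}

// Action names what the agent does for a given role.
func Action(r Role) string {
	switch r {
	case Unbound:
		return "announce availability"
	case Peer:
		return "respond to availability announcements"
	default:
		return "do nothing"
	}
}

func main() {
	fmt.Println(Action(DetectRole(false, false)))
	fmt.Println(Action(DetectRole(false, true)))
	fmt.Println(Action(DetectRole(true, true)))
}
```

Checking for the management container first means a host that carries one is always treated as a peer, whatever its membership state says.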

Availability Announcement

The availability announcement MUST contain the unique cryptographic identity (PGP fingerprint) of the resource host. This lets peers recognize a resource host whose join request they have previously rejected and ignore its availability announcements.
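
As a rough sketch of such a message, the Go code below encodes and decodes an announcement carrying only the fingerprint. The JSON wire format, the Announcement type and the field name are all assumptions made for illustration; no message format is fixed by these requirements.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Announcement is a hypothetical wire format. Only the presence of the
// fingerprint is required by the text; JSON is an assumption.
type Announcement struct {
	Fingerprint string `json:"fingerprint"`
}

// EncodeAnnouncement serializes an announcement for the given fingerprint.
func EncodeAnnouncement(fingerprint string) ([]byte, error) {
	return json.Marshal(Announcement{Fingerprint: fingerprint})
}

// DecodeAnnouncement parses a received datagram and rejects messages
// that carry no fingerprint.
func DecodeAnnouncement(data []byte) (Announcement, error) {
	var a Announcement
	if err := json.Unmarshal(data, &a); err != nil {
		return a, err
	}
	if a.Fingerprint == "" {
		return a, errors.New("announcement has no fingerprint")
	}
	return a, nil
}

func main() {
	data, err := EncodeAnnouncement("0123456789ABCDEF0123456789ABCDEF01234567")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```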

Once a resource host becomes unbound it SHOULD announce availability on the LAN using a lightweight UDP broadcast. Unless answered directly by a peer, it MUST repeat the broadcast periodically. To avoid unnecessary traffic the time between announcement broadcasts SHOULD increase exponentially until some maximum limit is reached.

For example, cheap bit shifts (powers of 2) can be used to raise the period from an initial 8 seconds until it reaches an upper limit of 4096 seconds.
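
A minimal Go sketch of that schedule follows. The 8 and 4096 second bounds come from the text; the function and constant names are illustrative only.

```go
package main

import (
	"fmt"
	"time"
)

const (
	initialPeriod = 8 * time.Second    // first wait between announcements
	maxPeriod     = 4096 * time.Second // upper limit on the wait
)

// NextPeriod doubles the current period with a left shift and clamps the
// result to the range [initialPeriod, maxPeriod].
func NextPeriod(current time.Duration) time.Duration {
	if current < initialPeriod {
		return initialPeriod
	}
	if current >= maxPeriod {
		return maxPeriod
	}
	next := current << 1
	if next > maxPeriod {
		return maxPeriod
	}
	return next
}

func main() {
	p := initialPeriod
	for i := 0; i < 11; i++ {
		fmt.Println(p)
		p = NextPeriod(p)
	}
}
```

Starting at 8 seconds, nine doublings reach the 4096 second cap, after which the period stays constant.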

Unbound resource hosts MUST continue to broadcast availability until their role changes. When their role changes they no longer broadcast their availability.
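
The announce-until-the-role-changes rule can be sketched as a loop. In this hedged Go example the role check, the send and the wait are injected as functions so the loop can be exercised without a network; AnnounceWhileUnbound and its parameters are hypothetical names, and a real agent would pass a wait function that sleeps.

```go
package main

import "fmt"

// AnnounceWhileUnbound sends an announcement, waits, and doubles the wait
// (8s up to 4096s) for as long as isUnbound reports true. It returns the
// number of announcements sent.
func AnnounceWhileUnbound(
	isUnbound func() bool,
	announce func() error,
	wait func(seconds int),
) (int, error) {
	sent := 0
	period := 8
	for isUnbound() {
		if err := announce(); err != nil {
			return sent, err
		}
		sent++
		wait(period)
		if period < 4096 {
			period <<= 1
		}
	}
	return sent, nil
}

func main() {
	checks := 0
	sent, err := AnnounceWhileUnbound(
		func() bool { checks++; return checks <= 4 }, // role changes after 4 checks
		func() error { fmt.Println("announce"); return nil },
		func(s int) { fmt.Println("wait", s, "seconds") },
	)
	fmt.Println(sent, err)
}
```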

Corner Cases

There MAY be more than one peer on the LAN. In this case multiple responses MAY arrive. With multiple peer responses, the agent MUST send a join request to each responding peer. If no join request is accepted by any peer, the unbound resource host MUST keep announcing its availability. Once bound to a peer, a resource host MUST NOT allow a second join operation to proceed.
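
One way to enforce the last rule is a guard that accepts only the first successful join. The Go sketch below assumes a hypothetical Host type that remembers the fingerprint of the peer it joined; a mutex covers the case where several peers accept at about the same time.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAlreadyBound is returned when a join is attempted on a bound host.
var ErrAlreadyBound = errors.New("resource host is already bound to a peer")

// Host is a hypothetical holder of the binding state.
type Host struct {
	mu   sync.Mutex
	peer string // fingerprint of the joined peer; empty while unbound
}

// Bind records the first accepted join and refuses any later one.
func (h *Host) Bind(peerFingerprint string) error {
	h.mu.Lock()
	defer h.mu.Unlock()
	if peerFingerprint == "" {
		return errors.New("peer fingerprint must not be empty")
	}
	if h.peer != "" {
		return ErrAlreadyBound
	}
	h.peer = peerFingerprint
	return nil
}

func main() {
	h := &Host{}
	fmt.Println(h.Bind("peer-A"))
	fmt.Println(h.Bind("peer-B"))
}
```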

Peer Response

Peers should respond with a lightweight unicast UDP message to the announcing resource host. The unique cryptographic identity (PGP fingerprint) of the peer (not of its resource host) should be sent with the response. Upon receiving the response, the unbound resource host should send a REST request to join the peer, which the peer MAY reject.

Peers that previously rejected or abandoned a resource host MAY choose not to respond to subsequent availability announcements by unbound resource hosts. This is why the availability announcement should carry the unique identifier of the unbound resource host: the PGP fingerprint.
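
On the peer side, this amounts to remembering rejected fingerprints and consulting that set before answering. The Go sketch below is an assumption about how a peer might do this in memory; the Responder type and its methods are hypothetical, and persistence is left out.

```go
package main

import "fmt"

// Responder tracks fingerprints of resource hosts this peer has rejected
// or abandoned.
type Responder struct {
	rejected map[string]struct{}
}

// NewResponder returns a Responder with an empty rejection set.
func NewResponder() *Responder {
	return &Responder{rejected: make(map[string]struct{})}
}

// Reject remembers a fingerprint so later announcements can be ignored.
func (r *Responder) Reject(fingerprint string) {
	r.rejected[fingerprint] = struct{}{}
}

// ShouldRespond reports whether an announcement from this fingerprint
// deserves a unicast response.
func (r *Responder) ShouldRespond(fingerprint string) bool {
	_, seen := r.rejected[fingerprint]
	return !seen
}

func main() {
	r := NewResponder()
	r.Reject("FP-REJECTED")
	fmt.Println(r.ShouldRespond("FP-REJECTED"))
	fmt.Println(r.ShouldRespond("FP-NEW"))
}
```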

Peers MAY in the future have some automated rules for resource hosts joining. They may check the fingerprint for trust relationships, or even compute whether or not to accept the request. These things are beyond the scope of discovery. (https://en.wikipedia.org/wiki/Simple_Service_Discovery_Protocol)

Existing Discovery Protocols

Zero configuration discovery protocols already exist that could help us out here. We SHOULD try an existing protocol before cooking up a home grown protocol ourselves. The only part that we MUST confirm is the ability to include a PGP fingerprint in the protocol's messages.

Simple Service Discovery Protocol

Simple Service Discovery Protocol seems to be just what we need in a tight little package. It sends search requests over UDP multicast to 239.255.255.250 on port 1900. Search responses are unicast UDP messages back.

SSDP Golang Libraries

There are also a few open source Golang libraries out there to choose from on GitHub.

Most of these libraries also implement the UDP based HTTP (HTTPU) service for both clients and servers.

Considerations

SSDP looks like the ideal candidate for our needs. Because its messages are essentially HTTP, we can pack PGP fingerprints into them: maybe we can reuse an existing standard field.
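
To make the idea concrete, the Go sketch below builds a simplified SSDP ssdp:alive NOTIFY message and reads the fingerprint back out of it. Carrying the fingerprint in the USN header with a "pgp:" prefix is purely an assumption for illustration, as is the NT value; a conforming UPnP advertisement also carries headers such as CACHE-CONTROL, LOCATION and SERVER, which are omitted here.

```go
package main

import (
	"bufio"
	"fmt"
	"net/textproto"
	"strings"
)

// ssdpAddr is the standard SSDP multicast address and port.
const ssdpAddr = "239.255.255.250:1900"

// BuildAlive returns a simplified ssdp:alive NOTIFY message. The NT value
// and the "pgp:" USN scheme are hypothetical.
func BuildAlive(fingerprint string) string {
	return "NOTIFY * HTTP/1.1\r\n" +
		"HOST: " + ssdpAddr + "\r\n" +
		"NT: urn:example:service:resource-host:1\r\n" +
		"NTS: ssdp:alive\r\n" +
		"USN: pgp:" + fingerprint + "\r\n" +
		"\r\n"
}

// FingerprintFrom extracts the fingerprint from a NOTIFY message built
// with the scheme above.
func FingerprintFrom(msg string) (string, error) {
	r := textproto.NewReader(bufio.NewReader(strings.NewReader(msg)))
	line, err := r.ReadLine()
	if err != nil {
		return "", err
	}
	if !strings.HasPrefix(line, "NOTIFY ") {
		return "", fmt.Errorf("not a NOTIFY message: %q", line)
	}
	hdr, err := r.ReadMIMEHeader()
	if err != nil {
		return "", err
	}
	usn := hdr.Get("USN")
	if !strings.HasPrefix(usn, "pgp:") {
		return "", fmt.Errorf("USN carries no fingerprint: %q", usn)
	}
	return strings.TrimPrefix(usn, "pgp:"), nil
}

func main() {
	msg := BuildAlive("0123456789ABCDEF0123456789ABCDEF01234567")
	fmt.Print(msg)
	fmt.Println(FingerprintFrom(msg))
}
```

If an existing Go SSDP library is adopted, the same question applies: whether it lets callers set and read the header that ends up carrying the fingerprint.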

SSDP is used with UPnP, which is intended more for home LANs. SSDP has also been abused by botnets for DDoS amplification attacks; see the Wikipedia page. If we are careful with it, SSDP may be just as applicable in the enterprise as it is for home use with UPnP. SSDP gets a bad image because of complaints about UPnP's lack of authentication, which makes UPnP unfit for enterprise environments. That shortfall belongs to UPnP rather than to SSDP, which is a simple discovery mechanism in its own right.

There might be some scalability issues that could make one opt for SLP. However, I don't think that is a problem for the peer-per-rack configuration we're pitching in the enterprise. Plus, most networks are not flat topologies anymore.

Service Location Protocol

SLP is a bit more complex, and probably overkill for our needs. Still, it is worth mentioning. It has a lot more enterprise support and scalability. The scale factor of SLP comes from the use of directory agents, which we do not need. Peers in the enterprise or in hosting providers should not be bigger than a rack with a TOR switch for an isolated network segment.