-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes for handling recursive NHG entries #1636
Changes from 2 commits
cb1ba9f
3a690fc
e869889
481cffc
1066333
a1230bb
5b11b67
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -41,14 +41,14 @@ | |||
| 0.3 | Sep 18, 2023 | Kentaro Ebisawa (NTT) | Update based on discussion at Routing WG on Sep 14th (Scope, Warmboot/Fastboot, CONFIG_DB) | | ||||
| 0.4 | Sep 24, 2023 | Kentaro Ebisawa (NTT) | Add feature enable/disable design and CLI. Update test plan. | | ||||
| 0.5 | Nov 10, 2023 | Kanji Nakano (NTT) | Update feature enable/disable design and CLI. Update test plan. | | ||||
| 0.6 | Feb 09, 2024 | Nikhil Kelapure (BRCM) | Update multipath NHG handling. | | ||||
|
||||
### Scope | ||||
|
||||
This document details the design and implementation of the "fpmsyncd extension" related to NextHop Group behavior in SONiC. | ||||
The goal of this "fpmsyncd extension" is to integrate NextHop Group (NHG) functionality into SONiC by writing NextHop Group entry from `fpmsyncd` to `APPL_DB` for NextHop Group operation in SONiC. | ||||
|
||||
- Scope of this change is to extend `fpmsyncd` to handle `RTM_NEWNEXTHOP` and `RTM_DELNEXTHOP` messages from FPM. | ||||
- There will be no change to SWSS/Orchagent. | ||||
- This change is backward compatible. Upgrade from a SONiC version that does not support this feature does not change the user's expected behavior as this feature is disabled by default. | ||||
|
||||
### Overview | ||||
|
@@ -77,6 +77,9 @@ See [09072023 Routing WG Meeting minutes](https://lists.sonicfoundation.dev/g/so | |||
- `fpmsyncd` to SET/DEL routes to `APPL_DB: ROUTE_TABLE` using `nexthop_group` | ||||
- `fpmsyncd` to SET/DEL NextHop Group entry to `APPL_DB: NEXTHOP_GROUP_TABLE` | ||||
|
||||
`Orchagent extension` requires: | ||||
- `orchagent` to handle member updates from `APPL_DB:NEXTHOP_GROUP_TABLE` for multipath NextHop Group | ||||
|
||||
This feature must be disabled by default. | ||||
- When this feature is disabled, behavior will be the same as before introducing this feature. | ||||
- i.e. `NEXTHOP_GROUP_TABLE` entry will not be created and `nexthop_group` will not be used in `ROUTE_TABLE` entry in `APPL_DB`. | ||||
|
@@ -99,6 +102,8 @@ For multipath route, the `RTM_NEWROUTE` is sent with a list of gateway and inter | |||
|
||||
This `Fpmsyncd extension` will modify `fpmsyncd` to handle `RTM_NEWNEXTHOP` and `RTM_DELNEXTHOP` as below. | ||||
|
||||
This featured introduces new orchagent `NhgOrch` to handle `RTM_NEWNEXTHOP` events and update `NEXTHOP_GROUP_TABLE` entries accordingly | ||||
|
||||
<!-- omit in toc --> | ||||
##### Figure: Fpmsyncd NHG High Level Architecture | ||||
![fig: fpmsyncd nhg architecture](images_fpmsyncd/fpmsyncd-nhg-architecture.png) | ||||
|
@@ -230,14 +235,23 @@ admin@sonic:~$ sonic-db-cli APPL_DB HGETALL ROUTE_TABLE:10.1.1.4 | |||
<tr><td>NHA_GROUP</td><td>[{125,1},{126,1}]</td></tr> | ||||
<tr><td rowspan="2">4</td><td rowspan="2">RTM_NEWROUTE</td><td>RTA_DST</td><td>10.1.1.4</td></tr> | ||||
<tr><td>RTA_NH_ID</td><td>ID127</td></tr> | ||||
<tr><td rowspan="2">5</td><td rowspan="2">RTM_NEWROUTE</td><td>RTA_DST</td><td>10.1.1.5</td></tr> | ||||
<tr><td>RTA_NH_ID</td><td>ID127</td></tr> | ||||
<tr><td rowspan="3">6</td><td rowspan="3">RTM_NEWNEXTHOP</td><td>NHA_ID</td><td>ID128</td></tr> | ||||
<tr><td>NHA_GATEWAY</td><td>10.0.0.4</td></tr> | ||||
<tr><td>NHA_OIF</td><td>22</td></tr> | ||||
<tr><td rowspan="2">7</td><td rowspan="2">RTM_NEWNEXTHOP</td><td>NHA_ID</td><td>ID127</td></tr> | ||||
<tr><td>NHA_GROUP</td><td>[{125,1},{128,1}]</td></tr> | ||||
</table> | ||||
|
||||
A short description of `fpmsyncd` logic flow: | ||||
|
||||
- When receiving `RTM_NEWNEXTHOP` events on sequence 1, 2 and 3, `fpmsyncd` will save the information in an internal list to be used when necessary. | ||||
- When `fpmsyncd` receive the `RTM_NEWROUTE` on sequence 4, the process will write the NextHop Group with ID 118 to the `NEXTHOP_GROUP_TABLE` using the information of gateway and interface from the NextHop Group events with IDs 116 and 117. | ||||
- Then `fpmsyncd` will create a new route entry to `ROUTE_TABLE` with a `nexthop_group` field with value `ID118`. | ||||
- When `fpmsyncd` receives the last `RTM_NEWROUTE` on sequence 5, the process will create a new route entry (but no NextHop Group entry) in `ROUTE_TABLE` with `nexthop_group` field with value `ID118`. (Note: This NextHop Group entry was created when the `fpmsyncd` received the event sequence 4.) | ||||
- When receiving `RTM_NEWNEXTHOP` events on sequence 1 and 2, `fpmsyncd` will write the information in `NEXTHOP_GROUP_TABLE` with NextHop Group `ID125` and `ID126` and with information of gateway and interface respectively. For sequence 3 multi path NextHop Group `ID127` the information in `NEXTHOP_GROUP_TABLE` will contain the list of IDs 125 and 126 | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What changes are expected towards kernel programming? please capture those details if not captured in a separate document. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No new changes are required from frr/kernel side for this feature. When NHG is enabled in FRR, it already programs routes to kernel using reference to the NHID. This implementation in fact make the FPM programming similar to how the routes are being programmed in kernel already |
||||
- When `fpmsyncd` receive the `RTM_NEWROUTE` on sequence 4, the process will create a new route entry to `ROUTE_TABLE` with a `nexthop_group` field with value `ID127`. | ||||
- When `fpmsyncd` receives the next `RTM_NEWROUTE` on sequence 5, the process will create a new route entry (but no NextHop Group entry) in `ROUTE_TABLE` with `nexthop_group` field with value `ID127`. (Note: This NextHop Group entry was created when the `fpmsyncd` received the event sequence 4.) | ||||
- When receiving `RTM_NEWNEXTHOP` events on sequence 6 `fpmsyncd` will write the information in `NEXTHOP_GROUP_TABLE` with NextHop Group `ID128` | ||||
- When receiving `RTM_NEWNEXTHOP` events on sequence 7 `fpmsyncd` will update the information in `NEXTHOP_GROUP_TABLE` with NextHop Group `ID127` and the information will now contain list of IDs 125 and 128 | ||||
|
||||
|
||||
#### Example of entries in ASIC_DB | ||||
|
||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please capture ASIC_DB update for NHG group with multipath There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added diagram to show the updated ASIC_DB entry |
||||
|
@@ -248,6 +262,32 @@ Therefore, even after this enhancement, table entries will be created for `ROUTE | |||
##### Figure: Example of ASIC_DB entry | ||||
![fig3](images_fpmsyncd/fig3.svg) | ||||
|
||||
#### Orchestration Agent Changes to handle NEXTHOP_GROUP | ||||
|
||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please capture the case NH as Interface? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "NH as Interface" will be same as any other NHG. The application ( zebra) will create a NHID for the NHG with NH as interface and the route will use this NHID as reference to the NH |
||||
A new orchestration agent `NhgOrch` will be written to handle the new NEXTHOP_GROUP_TABLE in APP_DB. For adding or updating an entry in the NEXTHOP_GROUP_TABLE, programming will depend on whether the group is configured with one or multiple next hops or is a recursive nexthop group i.e. it contains one or more nexthop group. | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same as previous comment. NhgOrch is extended to handle recursive NEXTHOP_GROUP_TABLE entries in app-db. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||||
|
||||
- NhgOrch treats an NEXTHOP_GROUP as "recursive" if it has "nexthop_group" as the only field. This field contains "member" NEXTHOP_GROUP. The parent NEXTHOP_GROUP key is constructed by combining keys of the member NEXTHOP_GROUPs. Such an NEXTHOP_GROUP always uses a "NEXT_HOP_GOUP" object-id to represent the group | ||||
- Singleton (non-recursive) NEXTHOP_GROUP contain various fields e.g. Inexthop, alias etc. Such NEXTHOP_GROUP form a key by combining these fields and query the NeighOrch with the "key" to get the corresponding SAI ID. So singleton (non-recursive) NEXTHOP_GROUP do not own the SAI ID rather they just reuse the SAI IDs owned by NeighOrch module. | ||||
- If one or more members of a "recursive" NEXTHOP_GROUP are not available, then the "NEXTHOP_GROUP" object is created with just the available members as long as there is one. The msg is retained in NhgOrch's mToSync queue so that it keeps checking for the availability of all the member NEXTHOP_GROUP. As and when the member NEXTHOP_GROUP are available, the "NEXTHOP_GROUP" object is updated with the new members. The "recursive" NEXTHOP_GROUP allow for members to be updated without having to update the referring routes. Since the SAI ID remains unchanged, update is transparent to the RouteOrch modules. | ||||
- When a port-down event is received from Syncd, the corresponding nexthops are removed from all the nexthop groups they're part of. | ||||
|
||||
Dummy code for this logic: | ||||
``` | ||||
if (nexthop_group NOT present) | ||||
sai_id = sai ID of the next hop | ||||
else | ||||
create next hop group | ||||
add next hop group member for each next hop | ||||
sai_id = sai ID of the next hop group | ||||
``` | ||||
|
||||
The handling of whether a next hop group member can be programmed or not, interfaces going up and down, etc. will match the same handling done by next hop groups managed by the route orchagent. | ||||
|
||||
When the next hop group is deleted, the next hop group orchagent will remove the next hop group from the ASIC_DB if required, and remove the association between the group identifier from APP_DB and the SAI identifier from ASIC_DB. This will only happen once the next hop group is no longer referenced by any routes - and so the next hop group will maintain a reference count within orchagent to ensure this. | ||||
|
||||
The route orchagent will parse the new nexthop_group field, and call into the next hop group orchagent to find the corresponding SAI object ID to program as the next hop attribute. If the next hop group does not exist, the route will remain on the list of routes to sync, and will be retried on future runs of the route orchagent. | ||||
|
||||
If a next hop group cannot be programmed because the data plane limit has been reached, one next hop will be picked to be temporarily used for that group. When a route is then programmed using that next hop group, it will be informed that a temporary form of the next hop group is in use, and so the route will remain on the pending list and will retry until the correct next hop group is programmed. This mirrors the current behaviour in the route orchagent when the next hop group limit is reached. The APP_DB entries remain unchanged. | ||||
|
||||
### SAI API | ||||
|
||||
|
@@ -475,6 +515,46 @@ Sample of CONFIG DB snippet given below: | |||
} | ||||
}, | ||||
``` | ||||
#### APPL DB Enhancements | ||||
|
||||
This feature adds a new NEXT_HOP_GROUP_TABLE, to store next hop group information to be used by one or more routes. Recursive/ecmp routes are represented by referencing a list of nexthop group ids in the nexthop_group field. When this field is used, other fields will not be present. | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. "adds a new .." => "extends the existing .." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||||
|
||||
|
||||
``` | ||||
### NEXT_HOP_GROUP_TABLE | ||||
Stores a list of groups of one or more next hops | ||||
Status: Mandatory | ||||
key = NEXT_HOP_GROUP_TABLE:string ; | ||||
arbitrary string identifying the next hop | ||||
group, as determined by the programming | ||||
application. | ||||
nexthop = *prefix, ; IP addresses separated | ||||
“,” (empty indicates no gateway) | ||||
ifname = *PORT_TABLE.key, ; zero or more separated | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why just PORT_TABLE? how about portchannel, vlan and sub-interfaces..etc There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is no change in this DB Table. This is as is from https://github.com/sonic-net/SONiC/blob/master/doc/ip/next_hop_group_hld.md There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we allow NH interfaces other than PORT_TABLE? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, as I understand all possible existing NH interface would be allowed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just mentioning PORT_TABLE alone the schema is weird, please fix it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Changed to "INTF_TABLE" so that all types on interfaces are covered |
||||
by “,” (zero indicates no interface) | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The above fields already exist as part of the NEXT_HOP_GROUP_TABLE: SONiC/doc/ip/next_hop_group_hld.md Line 85 in f799bf2
Please mark "nexthop_group" field as newly added by this HLD. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This table is new in this HLD. Added "existing" and "new" fields labels for the "ROUTE_TABLE" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||||
nexthop_group = NEXT_HOP_GROUP_TABLE:key, ; index within the | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you add an example for the recursive routes? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The DB entry for recursive NHG is same as ECMP NHGs. In that for recursive NHG may contain only one NH and hence the NHA_GROUP will contain only one NHID |
||||
NEXT_HOP_GROUP_TABLE seperated by "," used | ||||
for recursice/ecmp routes. ( When this field | ||||
is present, other fields will not be present) | ||||
|
||||
``` | ||||
The ROUTE_TABLE is then extended to allow a reference to the next hop group to be specified, instead of the current nexthop and intf fields. | ||||
``` | ||||
### ROUTE_TABLE | ||||
;Stores a list of routes | ||||
;Status: Mandatory | ||||
key = ROUTE_TABLE:prefix | ||||
nexthop = *prefix, ;IP addresses separated “,” (empty | ||||
indicates no gateway) | ||||
ifname = *PORT_TABLE.key, ; zero or more separated | ||||
by “,” (zero indicates no interface) | ||||
blackhole = BIT ; Set to 1 if this route is a blackhole | ||||
(or null0) | ||||
nexthop_group = NEXT_HOP_GROUP_TABLE:key ; index within the | ||||
NEXT_HOP_GROUP_TABLE, optionally used instead | ||||
of nexthop and intf fields | ||||
``` | ||||
|
||||
|
||||
### Warmboot and Fastboot Design Impact | ||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nexthop group ID reconciliation as part of the warm-boot is not currently supported. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done |
||||
|
||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "NhgOrch" already exists. The HLD/PR enhances the NhgOrch to handle recursive nexthop groups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done