[T2-Chassis][Route-Convergence]: route-convergence takes upto 10secs in Process crash(swss/syncd) scenarios #21586

Open
deepak-singhal0408 opened this issue Jan 31, 2025 · 1 comment
Labels: P0 (Priority of the issue)

@deepak-singhal0408

On a T2 chassis running a 202405 image, traffic takes up to 10 seconds to converge after a process crash (swss/syncd). This needs to be optimized; ideally, we should achieve sub-second convergence in this scenario.

Testplan:
https://github.com/sonic-net/sonic-mgmt/blob/master/docs/testplan/Convergence%20measurement%20in%20data%20center%20networks.md#test-case--26

Number of prefixes:
60k (30k IPv4 + 30k IPv6) from each upstream neighbor

Number of Upstream Neighbors: 16

Testcase:
https://github.com/sonic-net/sonic-mgmt/blob/master/tests/snappi_tests/multidut/bgp/test_bgp_outbound_uplink_process_crash.py

deepak-singhal0408 self-assigned this Jan 31, 2025
deepak-singhal0408 added the "P0 Priority of the issue" label Jan 31, 2025
@deepak-singhal0408

Traffic loss is observed while these processes are coming back up. Routes are learned from the upstream neighbors and immediately advertised to the downstream neighbors, which attracts traffic before those routes have been programmed in the ASICs.
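
To make the loss window concrete, here is a minimal sketch of the timing relationship described in this comment. The function name and the timestamps are hypothetical and chosen only for illustration; they are not measurements from the test run.

```python
def blackhole_window(advertised_at: float, programmed_at: float) -> float:
    """Seconds during which a route attracts traffic but is not yet in the ASIC.

    Both arguments are seconds since the process restart. If the route is
    advertised only after it has been programmed, the window is zero.
    """
    return max(0.0, programmed_at - advertised_at)


# Hypothetical timestamps, not measured data.
# Case 1: routes are advertised downstream as soon as they are learned.
advertise_early = blackhole_window(advertised_at=2.0, programmed_at=12.0)

# Case 2: advertisement is held back until after ASIC programming completes.
advertise_late = blackhole_window(advertised_at=15.0, programmed_at=12.0)

print(advertise_early, advertise_late)
```

In this model, any mechanism that delays advertisement until programming has finished shrinks the window to zero.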

deepak-singhal0408 changed the title from "[T2-Chassis][Route-Convergence]: route-convergence takes upto 10secs in Process crash swss/syncd scenarios" to "[T2-Chassis][Route-Convergence]: route-convergence takes upto 10secs in Process crash(swss/syncd) scenarios" Jan 31, 2025
rlhui pushed a commit that referenced this issue Feb 11, 2025
mssonicbld added a commit to mssonicbld/sonic-buildimage-msft that referenced this issue Feb 11, 2025

#### Why I did it
Fixes issue: sonic-net/sonic-buildimage#21586

##### Work item tracking
- Microsoft ADO **31196012**:

#### How I did it
Run the TSA-TSB service whenever swss (swss0, swss1, ...) starts up. If the service is already running, reset the TSA-TSB timer.
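
The start-or-reset behaviour described above can be sketched as a restartable hold timer. This is an illustrative model only, not the code in the PR: the class `IsolationTimer`, its method names, and the 60-second hold value are all hypothetical.

```python
import time


class IsolationTimer:
    """Hypothetical model of a TSA-then-TSB hold timer (not the SONiC service)."""

    def __init__(self, hold_seconds, clock=time.monotonic):
        self.hold_seconds = hold_seconds
        self.clock = clock          # injectable clock so the logic is testable
        self.deadline = None        # None means not isolated (TSB state)

    def on_swss_start(self):
        """Enter isolation, or push the deadline out if already isolated."""
        already_running = self.deadline is not None
        self.deadline = self.clock() + self.hold_seconds
        return "reset" if already_running else "started"

    def is_isolated(self):
        """True while the hold timer is running; expire it lazily."""
        if self.deadline is not None and self.clock() >= self.deadline:
            self.deadline = None    # timer expired: back to normal (TSB)
        return self.deadline is not None


# Usage with a fake clock; 60 s is an assumed hold time.
now = [0.0]
timer = IsolationTimer(hold_seconds=60, clock=lambda: now[0])
first = timer.on_swss_start()         # first swss instance starts at t=0
now[0] = 50.0
second = timer.on_swss_start()        # another instance starts at t=50
now[0] = 100.0
isolated_at_100 = timer.is_isolated()  # deadline moved to t=110
now[0] = 111.0
isolated_at_111 = timer.is_isolated()
```

Resetting rather than ignoring the second start means the hold period is measured from the most recent swss startup, so a later instance still gets the full hold time.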

#### How to verify it
Ran the T2 process-crash snappi test from sonic-mgmt to verify convergence.
Before fix: ~10 seconds
After fix: <10 ms


#### Which release branch to backport (provide reason below if selected)


- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)
SONiC.20240532.04

#### Description for the changelog

#### Link to config_db schema for YANG module changes
