[VOQ] Fabric orchagent exit in Supervisor #15321

judyjoseph · 2023-06-03T01:54:14Z

Description

The orchagent instance controlling a fabric ASIC exits on the Nokia chassis supervisor due to a TIMEOUT error. This is seen on a chassis with all the fabric cards inserted.

CPU usage is high, and syslog shows continuous "get:SAI_OBJECT_TYPE_PORT" messages.

Steps to reproduce the issue:

Boot the chassis and observe the behavior on the supervisor.

Describe the results you received:

May 30 07:54:10.454677 svcstr--sup-1 ERR syncd2#syncd: :- threadFunction: time span WD exceeded 30283 ms for SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000181
May 30 07:54:10.454702 svcstr--sup-1 ERR syncd2#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000181
May 30 07:54:10.454702 svcstr--sup-1 ERR syncd2#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS
May 30 07:54:24.705958 svcstr--sup-1 ERR syncd9#syncd: :- threadFunction: time span WD exceeded 30273 ms for SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000184
May 30 07:54:24.705958 svcstr--sup-1 ERR syncd9#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000184
May 30 07:54:24.705958 svcstr--sup-1 ERR syncd9#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS
May 30 07:54:28.366243 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:28.366449 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:28.366508 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:28.366557 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:28.685003 svcstr--sup-1 ERR syncd9#syncd: :- setEndTime: event 'SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000184' took 34253 ms to execute
May 30 07:54:28.685003 svcstr--sup-1 ERR syncd9#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000184
May 30 07:54:28.685003 svcstr--sup-1 ERR syncd9#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS
May 30 07:54:28.709385 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:28.709436 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:28.709487 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:28.709539 svcstr--sup-1 ERR syncd9#syncd: [-bdb:5:1] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:35.872260 svcstr--sup-1 ERR syncd0#syncd: :- threadFunction: time span WD exceeded 30969 ms for SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000182
May 30 07:54:35.872316 svcstr--sup-1 ERR syncd0#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000182
May 30 07:54:35.872368 svcstr--sup-1 ERR syncd0#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS
May 30 07:54:38.298868 svcstr--sup-1 ERR syncd0#syncd: :- setEndTime: event 'SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000182' took 33398 ms to execute
May 30 07:54:38.300775 svcstr--sup-1 ERR syncd0#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000182
May 30 07:54:38.863110 svcstr--sup-1 ERR syncd0#syncd: [-bdb:1:0] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:38.863207 svcstr--sup-1 ERR syncd0#syncd: [-bdb:1:0] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:38.863306 svcstr--sup-1 ERR syncd0#syncd: [-bdb:1:0] SAI_API_PORT:_brcm_sai_port_wred_stats_get:15065 port gport get failed with error Feature unavailable (0xfffffff0).
May 30 07:54:38.863397 svcstr--sup-1 ERR syncd0#syncd: [-bdb:1:0] SAI_API_PORT:brcm_sai_get_port_stats:5187 port wred stats get failed with error -2. 
May 30 07:54:40.229355 svcstr--sup-1 ERR swss2#orchagent: :- wait: SELECT operation result: TIMEOUT on getresponse
May 30 07:54:40.229481 svcstr--sup-1 ERR swss2#orchagent: :- wait: failed to get response for getresponse
May 30 07:54:43.353386 svcstr--sup-1 ERR syncd2#syncd: :- setEndTime: event 'SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000181' took 63181 ms to execute
May 30 07:54:43.353507 svcstr--sup-1 ERR syncd2#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000181
May 30 07:54:43.353566 svcstr--sup-1 ERR syncd2#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS

Jun  3 01:49:29.701348 svcstr--sup-1 NOTICE syncd0#syncd: :- threadFunction: time span 50 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000137'
Jun  3 01:49:29.713273 svcstr--sup-1 NOTICE syncd13#syncd: :- threadFunction: time span 345 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000011b'
Jun  3 01:49:29.789858 svcstr--sup-1 NOTICE syncd15#syncd: :- threadFunction: time span 0 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000142'
Jun  3 01:49:29.817874 svcstr--sup-1 NOTICE syncd1#syncd: :- threadFunction: time span 264 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000125'
Jun  3 01:49:29.866492 svcstr--sup-1 NOTICE syncd10#syncd: :- threadFunction: time span 137 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000012c'
Jun  3 01:49:29.954517 svcstr--sup-1 NOTICE syncd8#syncd: :- threadFunction: time span 114 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000012f'
Jun  3 01:49:30.060723 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 365 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000117'
Jun  3 01:49:30.152688 svcstr--sup-1 NOTICE syncd2#syncd: :- threadFunction: time span 15 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000131'
Jun  3 01:49:30.296935 svcstr--sup-1 NOTICE syncd3#syncd: :- threadFunction: time span 81 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000013b'
Jun  3 01:49:30.353488 svcstr--sup-1 NOTICE syncd11#syncd: :- threadFunction: time span 25 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000012e'
Jun  3 01:49:30.404254 svcstr--sup-1 NOTICE syncd14#syncd: :- threadFunction: time span 166 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000100'
Jun  3 01:49:30.409476 svcstr--sup-1 NOTICE syncd6#syncd: :- threadFunction: time span 191 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x10000000000f6'
Jun  3 01:49:30.467450 svcstr--sup-1 NOTICE syncd12#syncd: :- threadFunction: time span 43 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000104'
Jun  3 01:49:30.504442 svcstr--sup-1 NOTICE syncd5#syncd: :- threadFunction: time span 57 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x100000000013d'
Jun  3 01:49:30.574298 svcstr--sup-1 NOTICE syncd4#syncd: :- threadFunction: time span 34 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x10000000000e4'

Describe the results you expected:

Output of `show version`:

show version 

SONiC Software Version: SONiC.C.20220531.27.05
SONiC OS Version: 11
Distribution: Debian 11.7
Kernel: 5.10.0-18-2-amd64
Build commit: 9e776925c2
Build date: Wed May 24 21:18:30 UTC 2023
Built by: cloudtest@8c5b0374c000000

Output of `show techsupport`:

Additional information you deem important (e.g. issue happens only occasionally):

judyjoseph · 2023-06-05T16:50:23Z

Adding more logs. With fabric polling enabled, TIMEOUTs are seen on random swss/fabric ASIC instances on the SUP of a fully populated chassis.

2023-06-05T02:59:22.0269718Z E               Jun  5 02:52:20.310548 svcstr--sup-1 ERR syncd10#syncd: :- setEndTime: event 'SET:FABRIC_PORT_STAT_COUNTER:oid:0x1000000000169' took 30378 ms to execute
2023-06-05T02:59:22.0270480Z E               
2023-06-05T02:59:22.0271485Z E               Jun  5 02:52:20.310653 svcstr--sup-1 ERR syncd10#syncd: :- logEventData: op: SET, key: FABRIC_PORT_STAT_COUNTER:oid:0x1000000000169
2023-06-05T02:59:22.0272268Z E               
2023-06-05T02:59:22.0273565Z E               Jun  5 02:52:20.310708 svcstr--sup-1 ERR syncd10#syncd: :- logEventData: fv: PORT_COUNTER_ID_LIST: SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_FEC_SYMBOL_ERRORS,SAI_PORT_STAT_IF_IN_FEC_NOT_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_IN_FEC_CORRECTABLE_FRAMES,SAI_PORT_STAT_IF_OUT_OCTETS,SAI_PORT_STAT_IF_IN_FABRIC_DATA_UNITS,SAI_PORT_STAT_IF_IN_ERRORS,SAI_PORT_STAT_IF_IN_OCTETS

judyjoseph · 2023-06-05T16:51:06Z

@saksarav-nokia @mlok-nokia to check.

judyjoseph · 2023-06-05T18:35:53Z

#define FABRIC_PORT_STAT_COUNTER_FLEX_COUNTER_GROUP "FABRIC_PORT_STAT_COUNTER"
#define FABRIC_PORT_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS 10000
#define FABRIC_QUEUE_STAT_COUNTER_FLEX_COUNTER_GROUP "FABRIC_QUEUE_STAT_COUNTER"
#define FABRIC_QUEUE_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS 100000

The fabric polling interval is very aggressive: we poll all fabric ports every 10 sec.

judyjoseph · 2023-06-09T16:24:27Z

I see two areas that need optimization and fixes:

1. On these J2C+ linecards, longer delays are seen in get/set operations on fabric ports. We can follow up with Broadcom on this @saksarav-nokia
2. In the orchagent/fabric orch:
(i) The counter poll is very aggressive, https://github.com/sonic-net/sonic-swss/blob/7702466076f2998eceb86476595966a9cfea9a4d/orchagent/fabricportsorch.cpp#L19, queue stats for all interfaces every 10sec

(ii) The call to updateFabricPortState() is a heavy call, and here it is redundant: https://github.com/sonic-net/sonic-swss/blob/master/orchagent/fabricportsorch.cpp#L360. This is because we already call updateFabricPortState() towards the end of getFabricPortList(), which is called in doTask().
@ngoc-do Could you check please?
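To make point (ii) concrete, here is a minimal Python sketch of the call pattern. The real code is C++ in fabricportsorch.cpp; the class, method names, and call order below are hypothetical stand-ins chosen only to count how often the heavy update runs per doTask() pass.

```python
class FabricPortsOrchSketch:
    """Hypothetical stand-in for the orch; it only counts update calls."""

    def __init__(self, redundant_call=True):
        # redundant_call=True models the behaviour described in this comment.
        self.redundant_call = redundant_call
        self.update_calls = 0

    def update_fabric_port_state(self):
        # In the real orch this is the heavy call; here we just count it.
        self.update_calls += 1

    def get_fabric_port_list(self):
        # Assumption: the port list is built here, then the state update
        # is (redundantly) triggered at the end.
        if self.redundant_call:
            self.update_fabric_port_state()

    def do_task(self):
        self.get_fabric_port_list()
        self.update_fabric_port_state()


def calls_per_task(redundant_call):
    """Number of state updates triggered by a single do_task() pass."""
    orch = FabricPortsOrchSketch(redundant_call)
    orch.do_task()
    return orch.update_calls
```

In this sketch, dropping the extra call halves the number of heavy updates per pass, from two to one.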

judyjoseph · 2023-06-09T16:24:45Z

@arlakshm f.y.i

saksarav-nokia · 2023-06-13T19:36:03Z

@judyjoseph @arlakshm,
We analyzed the issue with fabric port stats polling; our findings are as follows.

1. We have 16 Ramons and 192 fabric ports in each Ramon.
2. When the supervisor is rebooted or a config reload is done, the swss and syncd dockers are started and switch_create is called for each Ramon.
3. As soon as switch_create completes for a given swss/syncd, fabric port stats polling starts from the first fabric port and proceeds one port at a time. But since the CPU is very busy creating the switch for all 16 swss/syncd instances, and the polling interval is 10000 ms, the polling cycle never completes for all 192 ports, and we see the SAI API call get_port_stats invoked only for the first set of fabric ports. When the next polling interval starts in the middle of a cycle, the previous cycle is interrupted and polling starts from the very first port again. This continues until the config reload is complete and all the swss/syncd dockers are up and running, which takes ~5 minutes. After this, I see polling done every 30 secs (is this FABRIC_POLLING_INTERVAL_DEFAULT?) and all 192 ports are polled.
4. The SAI API call get_port_stats, which reads all 8 fabric port stats for a given fabric port, takes only a few ms in the normal state and also during boot and config reload.
5. "threadFunction: time span" messages with higher values are printed for ports that were missed during the incomplete polling cycles mentioned in bullet 3. We also noticed that this time span message is sometimes printed with a time value of 0; this needs to be addressed as well in swss/syncd.

So the only way to address this issue is to optimize the aggressive polling during bootup or config reload.

Thanks,
Sakthi

saksarav-nokia · 2023-06-13T20:45:52Z

cpm_syslog.log
Attached is the syslog taken during config reload that was used for the above analysis. The same behavior is seen during boot up.

arlakshm · 2023-06-14T17:09:09Z

@kenneth-arista, please take a look at this issue

saksarav-nokia · 2023-06-14T18:52:22Z

@kenneth-arista,
We see that it takes ~30 secs to poll the counters of all 192 ports in each polling interval. We increased FABRIC_PORT_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS to 60 secs and this seems to help a lot. Except for the very first poll right after the config reload, all other polling cycles complete for all 192 ports, and each cycle completes in 30 secs.

But it looks like there is another issue with the fabric port counters. Even though all 192 ports are polled in every polling cycle, and polling all 8 counters for each port takes ~0.1 secs or less, we still see "syncd0#syncd: :- threadFunction: time span" logs for a few random ports in every polling cycle. When would we see this?
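As a rough illustration of why the interval matters, here is a small Python sketch using the numbers reported in this thread (192 ports, ~30 secs for a full cycle). It assumes a uniform time per port and that each new interval restarts polling from the first port, as described in the earlier analysis; the function name is hypothetical and not part of swss/syncd.

```python
def ports_polled_per_interval(num_ports, full_cycle_s, interval_s):
    """Ports reached before the next interval restarts the cycle.

    Assumes every port takes the same time (full_cycle_s / num_ports)
    and that an unfinished cycle is abandoned when the interval expires.
    """
    reached = interval_s * num_ports // full_cycle_s
    return min(num_ports, reached)


# Figures taken from the comments above.
at_10s = ports_polled_per_interval(192, 30, 10)
at_60s = ports_polled_per_interval(192, 30, 60)
```

Under these assumptions, a 10 sec interval reaches only 64 of the 192 ports before restarting, while a 60 sec interval leaves room for the full cycle.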

kenneth-arista · 2023-06-22T22:41:32Z

@saksarav-nokia can you paste the output of show fabric reachability for your system?

I believe setting FABRIC_PORT_STAT_FLEX_COUNTER_POLLING_INTERVAL_MS to 60 secs is too slow and will negatively affect fabric link monitoring functionality.

@judyjoseph is correct in that there is a redundant call to updateFabricPortState() at the end of FabricPortsOrch::getFabricPortList(). Let's remove this.

We're gathering some data on our end. As a quick datapoint, we don't see orchagent restarts with config-reload nor during initial boot. However, it is not a fair comparison as we have fewer Ramons and fewer ports per Ramon.

Tagging @jfeng-arista for awareness

saksarav-nokia · 2023-06-23T13:56:21Z

fabric_reach.txt
@kenneth-arista, please find attached the output of show fabric reachability from our cpm card, which has 14 Ramons and 192 fabric ports in each.

abdosi · 2023-06-28T17:56:40Z

@kenneth-arista is working on a PR to remove the extra loop on updateFabricPortState

abdosi · 2023-06-28T17:58:24Z

Also, should we look at enhancing [enable_counters.py](https://github.com/sonic-net/sonic-buildimage/blob/master/dockers/docker-orchagent/enable_counters.py) to delay the start of fabric port polling?
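One possible shape for such a delay, as a minimal Python sketch. This is not the script's actual code: the function names, the injected callables, and the 300 sec threshold (taken from the ~5 minutes that config reload was reported to take above) are all assumptions for illustration.

```python
import time


def seconds_until_polling_start(uptime_s, min_uptime_s=300.0):
    """Remaining wait before fabric port polling should be enabled."""
    return max(0.0, min_uptime_s - uptime_s)


def enable_fabric_polling_when_ready(get_uptime_s, enable_polling,
                                     sleep=time.sleep, min_uptime_s=300.0):
    """Wait until the (hypothetical) uptime threshold, then enable polling.

    get_uptime_s and enable_polling are caller-supplied callables, so the
    sketch makes no claim about how SONiC reads uptime or enables counters.
    """
    delay = seconds_until_polling_start(get_uptime_s(), min_uptime_s)
    if delay > 0:
        sleep(delay)
    enable_polling()
    return delay
```

Passing `sleep` as a parameter keeps the helper easy to exercise without actually waiting.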

Call to updateFabricPortState in FabricPortsOrch::getFabricPortList() is redundant as FabricPortsOrch::doTask() already calls it. This change helps mitigate the MHz spikes during boot up of the supe as described in sonic-net/sonic-buildimage#15321.

kenneth-arista · 2023-07-09T06:23:44Z

@saksarav-nokia looking at your fabric_reach.txt output, not all 192 links are connected. On ASICs with connections, there are only 120 active links. Although Ramon supports up to 192 links, not all of them are used. There must be something else amiss in your setup that is causing these timeouts. We also use Ramon, but have 144 active links and haven't yet seen these timeouts.

To help mitigate orchagent restarts, I posted sonic-net/sonic-swss#2850 to remove the redundant code. However, let's gather more info on what stats are being polled and how long the operations take before changing the polling interval.

saksarav-nokia · 2023-07-10T14:13:56Z

@kenneth-arista, we have 16 Ramons with 192 SFM links in each Ramon. Since we have only 5 (out of 8) IMM cards inserted in this chassis, only 120 SFM links are up. But I see that the SONiC fabric polling code polls the status of all 192 links even if only 120 links are up.
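The 120 figure is consistent with links being spread evenly over the IMM slots (192 × 5 / 8). Here is a minimal Python sketch of that arithmetic, plus a hypothetical filter that would restrict polling to links that are up. The helper names and the even-distribution assumption are illustrative only, and whether skipping down links is acceptable for fabric link monitoring is an open question in this thread.

```python
def expected_up_links(total_links, cards_inserted, card_slots):
    """Links expected up, assuming links are spread evenly over card slots."""
    return total_links * cards_inserted // card_slots


def links_to_poll(link_is_up):
    """Return the IDs of links currently up.

    link_is_up maps a link ID to a bool; how that status would be obtained
    on a real system is outside the scope of this sketch.
    """
    return [link for link, is_up in sorted(link_is_up.items()) if is_up]


# Example with the chassis described above: 5 of 8 IMM cards inserted.
status = {link: link < expected_up_links(192, 5, 8) for link in range(192)}
polled = links_to_poll(status)
```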

kenneth-arista · 2023-07-12T00:46:42Z

@saksarav-nokia can you propose a PR for changing the polling code? It's not productive for me to do it if I can neither test it nor reproduce the problem.

judyjoseph · 2023-07-13T15:26:30Z

Another interesting observation (I have taken port 0x1000000000122 as the example below).

The SAI calls sometimes happen very close together, e.g. twice in consecutive seconds, with the second read taking longer (1337 ms). We will need to check whether there is some overlap, or whether the previous polling pass over the fabric ports had not completed when the next loop started.

admin@svcstr-7250-sup-1:/var/log$ sudo zgrep 0x1000000000122 syslog | grep syncd7
Jul 13 14:29:15.896266 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 272 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:33:18.096046 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 168 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:39:19.286386 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 121 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:39:45.309216 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 149 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:45:20.538341 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 272 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:45:47.559030 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 337 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:45:48.559118 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 1337 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 14:55:45.981219 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 7 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 15:07:45.440779 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 87 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
Jul 13 15:11:15.558521 svcstr--sup-1 NOTICE syncd7#syncd: :- threadFunction: time span 18 ms for 'get:SAI_OBJECT_TYPE_PORT:oid:0x1000000000122'
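To spot such back-to-back reads without scanning the log by eye, here is a minimal Python sketch that parses the "time span" lines above and flags consecutive entries for the same OID that are close together. The function names, the 2 sec gap threshold, and the assumed year (syslog lines carry none) are choices made for this example.

```python
import re
from datetime import datetime

# Matches lines like the syncd "threadFunction: time span" entries above.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}\.\d+) .*"
    r"time span (?P<ms>\d+) ms for "
    r"'get:SAI_OBJECT_TYPE_PORT:(?P<oid>oid:0x[0-9a-f]+)'"
)


def parse_time_spans(lines, year=2023):
    """Return (timestamp, duration_ms, oid) tuples for matching lines."""
    entries = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m.group('ts')}",
                               "%Y %b %d %H:%M:%S.%f")
        entries.append((ts, int(m.group("ms")), m.group("oid")))
    return entries


def close_pairs(entries, max_gap_s=2.0):
    """Consecutive entries for the same OID at most max_gap_s apart."""
    pairs = []
    for prev, cur in zip(entries, entries[1:]):
        gap = (cur[0] - prev[0]).total_seconds()
        if prev[2] == cur[2] and gap <= max_gap_s:
            pairs.append((prev, cur))
    return pairs
```

Feeding it the output of the zgrep command above would list the pairs worth a closer look, such as the 14:45:47 and 14:45:48 entries.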

judyjoseph · 2023-08-07T18:44:05Z

Closing this issue, as we no longer see the orchagent exits with PR #2850. Fine-tuning of the counters is still needed for fabric ports; a new issue will be opened for that.

judyjoseph added chassis-voq Voq chassis changes Chassis 🤖 Modular chassis support labels Jun 3, 2023

judyjoseph added the Issue for 202205 label Jun 5, 2023

judyjoseph self-assigned this Jun 7, 2023

judyjoseph added the Triaged this issue has been triaged label Jun 7, 2023

rlhui added the P0 Priority of the issue label Jun 13, 2023

arlakshm added this to SONiC Chassis Jun 14, 2023

kenneth-arista mentioned this issue Jul 7, 2023

[Chassis]Remove redundant updateFabricPortState sonic-net/sonic-swss#2850

Merged

judyjoseph closed this as completed Aug 7, 2023

github-project-automation bot moved this to Done in SONiC Chassis Aug 7, 2023

judyjoseph mentioned this issue Sep 27, 2023

[Chassis][202205] CPU utilization on SUP card high with get:SAI_OBJECT_TYPE_PORT logs in syslog #16731

Open
