during config reload, teamd is restarted twice #3822
Comments
Still happens with #3823.
Note: this issue reproduces on both master and 201811.
Do we know why teamd restarts twice during config reload?
@lguohan: The
Joe, I have never noticed teamd starting twice on a plain reboot. According to our findings, a double start would leave the system in a broken state, so I lean towards saying that it either does not happen on reboot or happens only rarely. Regards,
@stepanblyschak is correct. At least, I can see that the teamd docker was stopped and restarted during the reboot process.
I did two reboots, and both showed this behavior. The image is HEAD.168-5b18aa5f, which contains #3823. Below is the log after reboot; you can see that the teamd docker was stopped and restarted. I do not know why, after syncd started, systemd stopped teamd and then started it again.
I think it is caused by this line: https://github.com/Azure/sonic-buildimage/blob/master/files/scripts/swss.sh#L90. After syncd is started, the script sequentially restarts teamd, radv, and dhcp_relay, which matches the log above.
The restart was added to address the issue of radv not starting properly, and dhcp_relay does not mind being restarted again. But I think teamd is currently started after swss. Therefore, in the boot sequence, at the time we 'restart' teamd, it should not yet have been started for the first time.
We should look into the change further regardless.
"But I think teamd is currently started after swss." That is not the case based on the log above. As you can see in the following log, swss is considered started, and teamd starts immediately afterwards.
Description
The teamd service is restarted twice during config reload / load_minigraph.
As a result, intfmgrd configures the IP addresses when the team devices are first created, but then teamd restarts and recreates the team devices, so the IP addresses are lost from the interfaces.
I believe there is a second issue: intfmgrd cannot reconfigure interfaces after they have been recreated, which makes restarting teamd without bringing down swss useless.
If SONiC has to support restarting teamd on its own, the other daemons in swss should be able to restore their configuration properly.
It looks like, during the restart of swss, dhcp_relay starts and brings teamd up, and then swss restarts teamd in wait().
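The sequence described above can be illustrated with a toy model. This is only a sketch and not the logic of swss.sh: the function name `simulate_swss_restart` and its parameters are hypothetical, and the two service lists are assumptions taken from this thread (dhcp_relay bringing teamd up early, then swss restarting teamd, radv, and dhcp_relay).

```python
from collections import Counter


def simulate_swss_restart(started_early, restarted_in_wait):
    """Count how often each service is started during one swss restart.

    Hypothetical model: 'started_early' are services that come up before
    swss reaches wait(); 'restarted_in_wait' are the services that swss
    restarts unconditionally afterwards.
    """
    starts = Counter()
    running = set()

    # Phase 1: services pulled up early (e.g. teamd via dhcp_relay).
    for svc in started_early:
        if svc not in running:
            running.add(svc)
            starts[svc] += 1

    # Phase 2: an unconditional restart stops a running service and
    # starts it again, so every listed service gets one more start.
    for svc in restarted_in_wait:
        running.discard(svc)
        running.add(svc)
        starts[svc] += 1

    return starts


starts = simulate_swss_restart(
    started_early=["dhcp_relay", "teamd"],            # assumption from the description
    restarted_in_wait=["teamd", "radv", "dhcp_relay"],  # order quoted in the comments
)
print(dict(starts))
```

In this model teamd is started twice while radv is started once, which mirrors the reported symptom: anything configured on the team devices after the first start is lost when they are recreated by the second one.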
Steps to reproduce the issue:
Describe the results you received:
No IP addresses on Po
Describe the results you expected:
IP address on Po
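To check the received versus expected result in a repeatable way, one option is to parse the one-line output of `ip -o -4 addr show` and list the port channels that have no IPv4 address. This is a minimal sketch: the helper name `portchannels_missing_ipv4`, the interface names, and the sample output are illustrative assumptions, not data from the affected switch.

```python
def portchannels_missing_ipv4(expected, ip_addr_output):
    """Return the expected interfaces that have no IPv4 address.

    'ip_addr_output' is the text printed by `ip -o -4 addr show`, where
    each line looks like "<index>: <ifname> inet <addr>/<len> ...".
    Interfaces without an address do not appear in that output at all,
    so the expected names have to be supplied by the caller.
    """
    have_ip = set()
    for line in ip_addr_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "inet":
            have_ip.add(fields[1])
    return [name for name in expected if name not in have_ip]


# Illustrative sample only; names and addresses are made up.
sample_output = (
    "1: lo    inet 127.0.0.1/8 scope host lo\n"
    "12: PortChannel0001    inet 10.0.0.56/31 scope global PortChannel0001\n"
)
missing = portchannels_missing_ipv4(
    ["PortChannel0001", "PortChannel0002"], sample_output
)
print(missing)
```

An empty list after config reload would correspond to the expected result; a non-empty list corresponds to the failure reported here.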
Additional information you deem important (e.g. issue happens only occasionally):
Distribution: Debian 9.11
Kernel: 4.9.0-9-2-amd64
Build commit: 794d459
Build date: Tue Nov 26 08:18:45 UTC 2019
Built by: johnar@jenkins-worker-4
Platform: x86_64-mlnx_msn2700-r0
HwSKU: ACS-MSN2700
ASIC: mellanox
Serial Number: MT1822K07815
Uptime: 14:23:26 up 2:09, 3 users, load average: 3.53, 4.92, 5.77
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-mlnx HEAD.134-794d4594 4256768182aa 373MB
docker-syncd-mlnx latest 4256768182aa 373MB
docker-fpm-frr HEAD.134-794d4594 2f9c5c2f1898 321MB
docker-fpm-frr latest 2f9c5c2f1898 321MB
docker-sflow HEAD.134-794d4594 2153ecfa328f 305MB
docker-sflow latest 2153ecfa328f 305MB
docker-lldp-sv2 HEAD.134-794d4594 caf9d4c2102e 299MB
docker-lldp-sv2 latest caf9d4c2102e 299MB
docker-dhcp-relay HEAD.134-794d4594 3a0e1f49a73f 289MB
docker-dhcp-relay latest 3a0e1f49a73f 289MB
docker-database HEAD.134-794d4594 a27870d25c87 281MB
docker-database latest a27870d25c87 281MB
docker-snmp-sv2 HEAD.134-794d4594 e0a37b191f29 335MB
docker-snmp-sv2 latest e0a37b191f29 335MB
docker-orchagent HEAD.134-794d4594 53d0f4f69dfb 322MB
docker-orchagent latest 53d0f4f69dfb 322MB
docker-teamd HEAD.134-794d4594 1978c6a4d093 304MB
docker-teamd latest 1978c6a4d093 304MB
docker-sonic-telemetry HEAD.134-794d4594 7fa1c0f6200d 304MB
docker-sonic-telemetry latest 7fa1c0f6200d 304MB
docker-router-advertiser HEAD.134-794d4594 ed1090c14998 281MB
docker-router-advertiser latest ed1090c14998 281MB
docker-platform-monitor HEAD.134-794d4594 11c35c684d4d 565MB
docker-platform-monitor latest 11c35c684d4d 565MB