-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linkmgrd subscribing State DB route event #13
Conversation
Define handlers
…net#8) Description of PR Summary: Fixes # (issue) Fix unstable unit tests. People observed that linkmgrd unit tests appeared to be flaky, and kept failing PR checks. After checking the build logs, I found that in most of the failures, the state change handler didn't seem to be invoked: Linkmgrd's composite state remained as it was before the state change was post. This PR updates code to invoke io_service::reset(), sets the io_service to no longer be in a stopped state, allowing subsequent calls to run_one() to invoke handlers. (When io_service is stopped, any call to run_one() will return immediately without invoking any handler. ) Signed-off-by: Jing Zhang [email protected] Type of change Bug fix Approach What is the motivation for this PR? Stabilize linkmgrd unit tests. How did you do it? Reset stopped io_service before calling to run_one(), to guarantee state change handlers are invoked as expected. How did you verify/test it? Tested building locally. This issue is hard to reproduce, the current build failure data is 2.98% (56/1877 in last 30 days). We should expect to see improvement after this change.
Add pull request template for linkmgrd repo. sign-off: Jing Zhang [email protected]
This reverts commit 9d38f17.
src/MuxManager.cpp
Outdated
} | ||
|
||
std::string nextState = "na"; | ||
if (mIpv4DefaultRouteState == "ok" && mIpv6DefaultRouteState == "ok") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan do we have concern if either IPv4 or IPv6 BGP is not configured? Can we assume we will have them both eventually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated per discussion in Gemini sync-up meeting.
{ | ||
MUXLOGWARNING(boost::format("%s: state db default route state: %s") % mMuxPortConfig.getPortName() % routeState); | ||
|
||
if(mComponentInitState.test(MuxStateComponent)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add space after if
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly.
|
||
if(mComponentInitState.test(MuxStateComponent)) { | ||
if (ms(mCompositeState) == mux_state::MuxState::Label::Active && | ||
routeState == "na") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest putting '{' on a new line, especially when if condition expands multiple lines. new line helps to identify the beginning of the if body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly.
src/MuxManager.cpp
Outdated
} | ||
|
||
PortMapIterator portMapIterator = mPortMap.begin(); | ||
while(portMapIterator != mPortMap.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space after while please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated accordingly.
MUXLOGWARNING(boost::format("%s: state db default route state: %s") % mMuxPortConfig.getPortName() % routeState); | ||
|
||
if (mComponentInitState.test(MuxStateComponent)) { | ||
if (ms(mCompositeState) == mux_state::MuxState::Label::Active && routeState == "na") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking we should check for != Standby
. What if mux is in a transient state and we got this event? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think this is a good point. I will evaluate the behavior and update accordingly.
Updated, thanks!
if (ms(mCompositeState) == mux_state::MuxState::Label::Active && routeState == "na") { | ||
CompositeState nextState = mCompositeState; | ||
enterLinkProberState(nextState, link_prober::LinkProberState::Wait); | ||
switchMuxState(nextState, mux_state::MuxState::Label::Standby); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does it behave if both ToRs have default route as "na" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both ToRs will switch to standby, the one switches first will be "active" at the end.
Do we expect both ToRs to be "standby" in this scenario? This is not supported currently by linkmgrd as I know, we might need find other workaround if so.
Signed-off-by: Longxiang Lyu <[email protected]>
Signed-off-by: Ying Xie <[email protected]>
Signed-off-by: Longxiang Lyu <[email protected]>
Signed-off-by: Longxiang Lyu <[email protected]>
This reverts commit 12b9951.
…c-linkmgrd into subscribeRouteEvent
boost::asio::ip::make_address is unable to parse "::/0" and "0.0.0.0/0", and considering the key will always be these two strings, I updated the code and removed make_address(). Changes have been tested on virtual dual testbeds and worked as expected. @prsunny could you review again? Thanks! |
Summary: Fixes # (issue) This PR is to make linkmgrd subscribe events from ROUTE_TABLE in State DB, and react accordingly: - If any of the two default route state appears to be 'na', linkmgrd should switch to standby. - If both are 'ok', there will be a mux state probing and what happens next depends on linkmgrd state machine and the probing response. Type of change: New feature Motivation for this PR: To make linkmgrd subscribe state DB route event, and handle the switchovers. Documentation: Related PR: sonic-net/sonic-swss#2009
Summary: Fixes # (issue) This PR is to make linkmgrd subscribe events from ROUTE_TABLE in State DB, and react accordingly: - If any of the two default route state appears to be 'na', linkmgrd should switch to standby. - If both are 'ok', there will be a mux state probing and what happens next depends on linkmgrd state machine and the probing response. Type of change: New feature Motivation for this PR: To make linkmgrd subscribe state DB route event, and handle the switchovers. Documentation: Related PR: sonic-net/sonic-swss#2009
Description of PR
Summary:
Fixes # (issue)
This PR is to make linkmgrd subscribe events from ROUTE_TABLE in State DB, and react accordingly:
Type of change
Approach
What is the motivation for this PR?
To make linkmgrd subscribe state DB route event, and handle the switchovers.
How did you do it?
How did you verify/test it?
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation
Related PR: sonic-net/sonic-swss#2009