Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[routeorch] Add support for blackhole routes #1723

Merged
merged 4 commits into from
Apr 27, 2021
Merged

Conversation

shi-su
Copy link
Contributor

@shi-su shi-su commented Apr 26, 2021

What I did
Allow route orch to add/remove routes blackhole routes and add dvs tests.

Why I did it

How I verified it
Verify that the blackhole route gets applied into ASIC_DB with SAI_PACKET_ACTION_DROP

admin@vlab-01:~$ vtysh -c "configure terminal" -c "ip route 3.3.3.0/24 blackhole"
admin@vlab-01:~$ sonic-db-cli APPL_DB hgetall "ROUTE_TABLE:3.3.3.0/24"
{'blackhole': 'true'}
admin@vlab-01:~$ sonic-db-cli ASIC_DB keys "*3.3.3.0/24*"
ASIC_STATE:SAI_OBJECT_TYPE_ROUTE_ENTRY:{"dest":"3.3.3.0/24","switch_id":"oid:0x21000000000000","vr":"oid:0x3000000000022"}
admin@vlab-01:~$ sonic-db-cli ASIC_DB hgetall "ASIC_STATE:SAI_OBJECT_TYPE_ROUTE_ENTRY:{\"dest\":\"3.3.3.0/24\",\"switch_id\":\"oid:0x21000000000000\",\"vr\":\"oid:0x3000000000022\"}"
{'SAI_ROUTE_ENTRY_ATTR_PACKET_ACTION': 'SAI_PACKET_ACTION_DROP'}

Details if related

@shi-su shi-su marked this pull request as ready for review April 26, 2021 16:03
@shi-su shi-su marked this pull request as draft April 26, 2021 16:04
@shi-su shi-su marked this pull request as ready for review April 26, 2021 16:42
@shi-su shi-su requested a review from prsunny April 26, 2021 16:42
if (overlay_nh == false)
if (blackhole)
{
nhg = nhg = NextHopGroupKey();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove extra nhg =

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

self.clear_srv_config(dvs)

# add route entry
dvs.runcmd("vtysh -c \"configure terminal\" -c \"ip route 2.2.2.0/24 blackhole\"")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we add the route to config_db and test the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are two reasons not to do it in this way: 1. It seems that bgpcfgd is not running in docker-sonic-vs. Without bgpcfgd the static route table in config_db will not be applied. 2. I feel these vs tests mainly aim to test the orchagent functionalities. If we choose to add it in config_db, it would include the bgpcfgd functions which are not currently in the scope of dvs tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree to point 1. But for 2, I would say then we could directly add to APP_DB, why even vtysh. IMO, if the infra is available, lets use that path to configure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Not sure if we are planning to update the infra to include the support configuring from config_db in the near future. If not, maybe we should keep it as is and change the configuration after the infra is ready.

@shi-su shi-su merged commit b962a60 into sonic-net:master Apr 27, 2021
lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Apr 29, 2021
[flex-counters] Delay flex counters stats init for faster boot time (sonic-net/sonic-swss#1646)
[routeorch] Add support for blackhole routes (sonic-net/sonic-swss#1723)
Update pool sizes during initialization from timer only (sonic-net/sonic-swss#1708)

Signed-off-by: Shlomi Bitton <[email protected]>
@shi-su shi-su deleted the blackhole branch May 12, 2021 20:41
qiluo-msft pushed a commit that referenced this pull request May 18, 2021
Allow route orch to add/remove routes blackhole routes and add dvs tests.
@shlomibitton
Copy link
Contributor

@shi-su seems like this PR broke this branch can you please check?

@shi-su
Copy link
Contributor Author

shi-su commented May 19, 2021

@shi-su seems like this PR broke this branch can you please check?

Ack. I will fix this in 202012.

raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-buildimage that referenced this pull request May 23, 2021
[flex-counters] Delay flex counters stats init for faster boot time (sonic-net/sonic-swss#1646)
[routeorch] Add support for blackhole routes (sonic-net/sonic-swss#1723)
Update pool sizes during initialization from timer only (sonic-net/sonic-swss#1708)

Signed-off-by: Shlomi Bitton <[email protected]>
carl-nokia pushed a commit to carl-nokia/sonic-buildimage that referenced this pull request Aug 7, 2021
[flex-counters] Delay flex counters stats init for faster boot time (sonic-net/sonic-swss#1646)
[routeorch] Add support for blackhole routes (sonic-net/sonic-swss#1723)
Update pool sizes during initialization from timer only (sonic-net/sonic-swss#1708)

Signed-off-by: Shlomi Bitton <[email protected]>
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this pull request Oct 5, 2021
Allow route orch to add/remove routes blackhole routes and add dvs tests.
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
…oved Error Reporting (sonic-net#1723)

#### Why I did

Recently, a bug was seen which was related to saisdkdump and particularly showed up when `show techsupport` was invoked. Although, it was fixed, the sonic-mgmt test failed to capture it beforehand. 

This highlighted a few shortcomings of the `generate_dump` script and this PR addresses those and also a few additional issues seen

This PR fixes a few things, I'll explain each of them in the next section.

#### What I did

**1) Remove the "Interactive option (-i) for the docker invocation commands"**
  This was the reason why the bug which was  was not captured previously. When the techsupport was invoked remotely (Eg: using sshpass), the `docker exec -it  <docker> <cmd>` command would fail saying ` ‘the input device is not a TTY'`. Hence the (-i) option was removed.

**2) Change the Return Code**
Currently, the script doesn't return any non-zero error codes for most of the intermediate steps (even though they fail), which makes validation hard.

To handle this, a helper function and trap cmd are used. 
```
handle_error() {
  if [ "$1" != "0" ]; then
    echo "ERR: RC:-$1 observed on line $2" >&2
    RETURN_CODE=1
  fi
}
trap 'handle_error $? $LINENO' ERR # This would trap any executions with non-zero return codes
```
The global variable RETURN_CODE is set when this is called and the same RETURN_CODE is returned when `generate_dump` invocation process exits

You may see this is used in multiple functions instead of placing it once on the top of the script. This is because, every function can itself be considered as a subshell and each of them requires a explicit trap command.

When a command is failed with error, this logic would get append a corresponding log to stderr. 
`ERR: RC:-1 observed on line 729`

**3) Improve Error Reporting for save_cmd function**

Currently any error written to the stderr by the intermediate calls are redirected to the same location as stdout, which is usually the file we see under dump/ dir.  This is perfectly fine, but the sonic-mgmt test only parses the text seen in stdout. 

So, a new option (-r) is added to `generate_dump` script and subsequently to `show techsupport` to redirect any intermediate errors seen to the generate_dump's stderr.

With this option enabled, these sort of errors can be captured on the stderr.

```
root@sonic:/home/admin# show techsupport -r
..........
timeout --foreground 5m show queue counters > /var/dump/sonic_dump_r-tigon-04_20210714_062239/dump/queue.counters_1
Traceback (most recent call last):
  File "/usr/local/bin/queuestat", line 373, in <module>
    main()
  File "/usr/local/bin/queuestat", line 368, in main
    queuestat.get_print_all_stat(json_opt)
  File "/usr/local/bin/queuestat", line 239, in get_print_all_stat
    cnstat_dict = self.get_cnstat(self.port_queues_map[port])
  File "/usr/local/bin/queuestat", line 168, in get_cnstat
    cnstat_dict[queue] = get_counters(queue_map[queue])
  File "/usr/local/bin/queuestat", line 158, in get_counters
    fields[pos] = str(int(counter_data))
ValueError: invalid literal for int() with base 10: ''
handle_error $? $LINENO
ERR: RC:-1 observed on line 199
Command: show queue counters timedout after 5 minutes.
.............

Without that option, this'll be the output seen

root@sonic:/home/admin# show techsupport 
..........
timeout --foreground 5m show queue counters &> /var/dump/sonic_dump_r-tigon-04_20210714_062239/dump/queue.counters_1
handle_error $? $LINENO
ERR: RC:-1 observed on line 199
Command: show queue counters timedout after 5 minutes.
.............
```

**4) Minor Error in sdk-dump collection logic handled**
save_file is only called for the files seen in sdk_dump_path and not for directories
```
cp: -r not specified; omitting directory '/tmp/sdk-dumps'
handle_error $? $LINENO
ERR: RC:-1 observed on line 729
tar: sonic_dump_r-tigon-04_20210714_062239/sai_sdk_dump/sdk-dumps: Cannot stat: No such file or directory
tar: Exiting with failure status due to previous errors
tar append operation failed. Aborting to prevent data loss.
```
The reason being, `find /tmp/sdk-dumps` returns ["/tmp/sdk-dumps"] even if the dir is empty. In the next step, save_file cmd is applied on the folder and thus the error. This can be handled by the change specified above

**5) Minor Error in custom plugins logic handled**
Added a condition to check if the dir exists before proceeding forward.
```
if [[ -d ${PLUGINS_DIR} ]]; then
        local -r dump_plugins="$(find ${PLUGINS_DIR} -type f -executable)"
        for plugin in $dump_plugins; do
            # save stdout output of plugin and gzip it
            save_cmd "$plugin" "$(basename $plugin)" true false
        done
    fi
```
Otherwise,  find command might fail saying
```
root@sonic:/home/admin# find /usr/local/bin/debug-dump -type f -executable
find: ‘/usr/local/bin/debug-dump’: No such file or directory
```
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
…#1844)

What I did
Fix: sonic-net/sonic-buildimage#8850

Issue was introduced by sonic-net#1723, sonic-net#1833, and sonic-net#1843 (pending merge)

The error_handler is a great idea but the bash script needs to be error free first.

How I did it
Fix bash script errors.

How to verify it
run show techsupport test..

Signed-off-by: Ying Xie <[email protected]>
EdenGri pushed a commit to EdenGri/sonic-swss that referenced this pull request Feb 28, 2022
- What I did
This PR include some fixes which were missed for sonic-net#1723
i.e. removing -t option from the docker exec commands. to understand why the -it option was removed, refer sonic-net#1723.
Also, the show techsupport exits with $RETURN_CODE only when --redirect-stderr option is used.

Signed-off-by: Vivek Reddy Karri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants