Test zmq with faulty network #9734

Open
1 of 2 tasks
xjules opened this issue Jan 14, 2025 · 1 comment

xjules commented Jan 14, 2025

We should stress test zmq with a "faulty" network before zmq is released in a stable version. The testing should include the following scenarios:

  • dispatchers lose their connection, which is re-established after some time
  • messages from dispatchers are delayed

Ideally, combinations of the above should be tested as well.

@EJahren suggested a library that could be used for this, https://vaurien.readthedocs.io/en/1.8/, but it only supports Python 2.7.
There are other alternatives:

@xjules xjules added this to SCOUT Jan 14, 2025
@xjules xjules converted this from a draft issue Jan 14, 2025
@jonathan-eq jonathan-eq moved this from Todo to In Progress in SCOUT Jan 15, 2025
@jonathan-eq jonathan-eq self-assigned this Jan 15, 2025
@xjules xjules assigned xjules and unassigned jonathan-eq Jan 16, 2025

xjules commented Jan 16, 2025

Tested with the following socat script:

#!/bin/bash

# Default ports
DEFAULT_INBOUND_PORT="51823"
DEFAULT_OUTBOUND_PORT="51822"

# Parse input arguments
INBOUND_PORT="${1:-$DEFAULT_INBOUND_PORT}"    # First argument: inbound port, defaults to 51823
OUTBOUND_PORT="${2:-$DEFAULT_OUTBOUND_PORT}"  # Second argument: outbound port, defaults to 51822

# Start socat forwarding; "fork" handles each incoming connection in a child process
echo "Starting socat to forward inbound port $INBOUND_PORT to outbound port $OUTBOUND_PORT"
socat "TCP-LISTEN:$INBOUND_PORT,reuseaddr,fork" "TCP:127.0.0.1:$OUTBOUND_PORT"

The inbound port is the one that the dealers (client and monitor) connect to; the outbound port is the one that the router binds to.
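
To exercise the lost-connection scenario with this kind of forwarder, one option is to keep the proxy up for a while, take it down, and bring it back. The sketch below is only an assumption about how that could be automated, not necessarily the procedure used for the test reported here; `flap_proxy` and its arguments are hypothetical names. It relies on GNU `timeout`, which by default should also signal the child processes that socat forks per connection, so established connections are expected to be cut as well.

```shell
#!/bin/bash
# Sketch only: flap_proxy is a hypothetical helper, not part of the project.
# It cycles the socat forwarder to simulate a link that drops and comes back.

# flap_proxy INBOUND_PORT OUTBOUND_PORT UP_SECONDS DOWN_SECONDS CYCLES
flap_proxy() {
    local inbound="$1" outbound="$2" up="$3" down="$4" cycles="$5"
    local i=1
    while [ "$i" -le "$cycles" ]; do
        echo "cycle $i: link up for ${up}s"
        # Run the forwarder for UP_SECONDS, then let timeout(1) terminate it.
        timeout "$up" socat "TCP-LISTEN:$inbound,reuseaddr,fork" "TCP:127.0.0.1:$outbound"
        echo "cycle $i: link down for ${down}s"
        # While socat is stopped, connections to the inbound port are refused.
        sleep "$down"
        i=$((i + 1))
    done
}
```

For example, `flap_proxy 51823 51822 30 10 5` would keep the link up for 30 seconds and down for 10 seconds, five times in a row.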

The following was tested:
[x] Lose connection

Everything was able to recover. However, the outage also affected the monitor, so events to the GUI were not sent, whereas all job dispatcher events were resent once the connection was re-enabled.
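
The delay scenario is not covered by a plain socat forward. One possible way to approximate it on a Linux host is the kernel's netem queueing discipline via `tc`. This is a sketch under assumptions: it needs root, `with_delay` is a hypothetical helper, and a netem rule on `lo` delays all loopback traffic, not only the zmq ports.

```shell
#!/bin/bash
# Sketch only: with_delay is a hypothetical helper, not part of the project.
# Requires root and Linux netem support. The delay applies to ALL traffic
# leaving the given device for as long as the wrapped command runs.

# with_delay DEVICE DELAY_MS COMMAND [ARGS...]
with_delay() {
    local dev="$1" delay_ms="$2" status=0
    shift 2
    # Add a fixed delay to every packet leaving DEVICE.
    tc qdisc add dev "$dev" root netem delay "${delay_ms}ms" || return 1
    # Run the wrapped command and remember its exit status.
    "$@" || status=$?
    # Always remove the netem rule again, even if the command failed.
    tc qdisc del dev "$dev" root
    return "$status"
}
```

For example, `with_delay lo 200 sleep 60` would add 200 ms of delay on the loopback device for one minute while an experiment runs in another terminal.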

