ConnectRetry and AutoReconnect do not work when connecting to docker host on MacOS #597

Closed
sirockin opened this issue Apr 29, 2022 · 3 comments


sirockin commented Apr 29, 2022

Summary

When attempting to connect from a docker container to a broker running on the host, the client successfully connects as long as the broker is running. However:

  • With ConnectRetry set, the client will not connect if the broker starts after the client
  • With AutoReconnect set, the client will not reconnect if the broker stops then starts

Steps to Reproduce/Minimum Working Example

In this fork I have adapted the /cmd/docker example to allow connection to a broker on the host machine, and documented changes and steps to reproduce in the example readme.

I repeat these in the next comment (below).

System Info

OS: macOS Big Sur Version 11.6.5
Docker version 20.10.13, build a224086

I have separately tested on WSL/Ubuntu and the bug does not appear:
WSL Ubuntu 20.04
Docker version 20.10.14, build a224086

@sirockin
Author

This example demonstrates a bug in operation when attempting to connect/reconnect to an mqtt broker operating from the host machine. The changes I have made to the original project are as follows:

  • In pub/main.go and sub/main.go:
    • Allow SERVERADDRESS to be set by env var
    • Uncomment logging lines
  • docker-compose.yml:
    • Set SERVERADDRESS on pub and sub from external SERVERADDRESS env var, with default pointing at docker mosquitto.
  • Add new docker-compose.mosquitto.yml to allow starting mosquitto in separate network and exposing to host

Standard operation (all working):

Normal operation

docker-compose up

Result:
pub and sub connect to the broker and behave normally

To demonstrate successful reconnection when the mqtt broker goes down then up

In terminal 1:

docker-compose up

In terminal 2:

docker-compose stop mosquitto
docker-compose start mosquitto

Result:
pub and sub lose connections then successfully reconnect

Successful connection when services start without broker, then broker starts

In terminal 1:

docker-compose up

In terminal 2:

docker-compose stop mosquitto
docker-compose restart sub pub
docker-compose start mosquitto

Result:
After restart, pub and sub services initially can't connect, then succeed when mosquitto is started

Using External Host:

Normal Operation

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

Result:
pub and sub connect to the broker and behave normally

Unsuccessful reconnection when mqtt broker goes down then up

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

In terminal 2:

# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down

# Restart the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

Result:

Both services display a single reconnection message but never recover (pub keeps publishing):

sub_1        | [ERROR] [client]   Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
sub_1        | [DEBUG] [client]   Reconnect failed, sleeping for 1 seconds: network Error : EOF
pub_1        | [ERROR] [net]      connect got error EOF
pub_1        | [ERROR] [client]   Connecting to tcp://host.docker.internal:1883 CONNACK was not CONN_ACCEPTED, but rather Connection Error
pub_1        | [DEBUG] [client]   Reconnect failed, sleeping for 1 seconds: network Error : EOF
...

Failure to connect when services start without broker, then broker starts

In terminal 2:

# Stop the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml down

In terminal 1:

# Start pub and sub pointing at external broker
SERVERADDRESS=host.docker.internal:1883 docker-compose up

In terminal 2:

# Start the external broker
docker-compose -p mosquitto -f docker-compose.mosquitto.yml up -d

Result:

Both services start, hang at "connect started", and never recover:

sub_1        | SERVERADDRESS: host.docker.internal:1883
sub_1        | [DEBUG] [client]   Connect()
sub_1        | [DEBUG] [store]    memorystore initialized
pub_1        | SERVERADDRESS: host.docker.internal:1883
pub_1        | [DEBUG] [client]   Connect()
pub_1        | [DEBUG] [store]    memorystore initialized
pub_1        | [DEBUG] [client]   about to write new connect msg
pub_1        | [DEBUG] [client]   socket connected to broker
pub_1        | [DEBUG] [client]   Using MQTT 3.1.1 protocol
pub_1        | [DEBUG] [net]      connect started
sub_1        | [DEBUG] [client]   about to write new connect msg
sub_1        | [DEBUG] [client]   socket connected to broker
sub_1        | [DEBUG] [client]   Using MQTT 3.1.1 protocol
sub_1        | [DEBUG] [net]      connect started

@sirockin changed the title from "ConnectRetry and AutoReconnect do not work when connecting to docker host" to "ConnectRetry and AutoReconnect do not work when connecting to docker host on MacOS" on Apr 29, 2022
@MattBrittan
Contributor

Unfortunately I don't have access to a Mac so am going to struggle to debug this (as you mention it works OK under Windows/Linux). I suspect that the connection to the broker is being opened to a black hole (so the connection stays open but nothing is received).

Can you please try https://github.com/ChIoT-Tech/paho.mqtt.golang/tree/Issue597 and see if that resolves the issue? If not some additional logging will be needed to identify what is happening.

@sirockin
Author

sirockin commented May 1, 2022

Hi @MattBrittan. Thank you for the swift response. Yes, this does resolve the issue. I'll link to that for the time being and await the merge.

algitbot pushed a commit to alpinelinux/build-server-status that referenced this issue May 5, 2024
This MR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/eclipse/paho.mqtt.golang](https://github.com/eclipse/paho.mqtt.golang) | require | patch | `v1.4.1` -> `v1.4.3` |

---

### Release Notes

<details>
<summary>eclipse/paho.mqtt.golang (github.com/eclipse/paho.mqtt.golang)</summary>

### [`v1.4.3`](https://github.com/eclipse/paho.mqtt.golang/releases/tag/v1.4.3)

[Compare Source](eclipse-paho/paho.mqtt.golang@v1.4.2...v1.4.3)

Release 1.4.3 is a relatively small release to bring in changes made in the eight months since 1.4.2.

Thanks to everyone who submitted issues and contributed code (list of the main merged pull requests below):

#### What's Changed

-   Avoid Panic when keepalive is 1 by [@tomatod](https://github.com/tomatod) in [#622](eclipse-paho/paho.mqtt.golang#622)
-   Allow MQTT username/password in websocket URI by [@MattBrittan](https://github.com/MattBrittan) in [#624](eclipse-paho/paho.mqtt.golang#624)
-   Add backoff when reconnecting following immediate connection loss by [@tomatod](https://github.com/tomatod) in [#625](eclipse-paho/paho.mqtt.golang#625)
-   Update dependencies (github.com/gorilla/websocket, golang.org/x/net, golang.org/x/sync) and specify `go 1.18` in `go.mod`.

**Full Changelog**: eclipse-paho/paho.mqtt.golang@v1.4.2...v1.4.3

### [`v1.4.2`](https://github.com/eclipse/paho.mqtt.golang/releases/tag/v1.4.2)

[Compare Source](eclipse-paho/paho.mqtt.golang@v1.4.1...v1.4.2)

Release 1.4.2 is relatively small and is mostly focused on tidying up the way the library manages the connection status. Previously `sync/atomic` was used to read/update the status but this led to a range of potential deadlocks, and workarounds to avoid these, which made the code difficult to follow. The new [connectionStatus](https://github.com/eclipse/paho.mqtt.golang/blob/master/status.go#L113) separates status handling from `client` and should simplify further development whilst resolving potential race conditions. It is my hope that users will not notice any change (`@master` was updated on 10th August and the updated code has been running in production at a few sites since then without issue).

A further change is that it is now possible to disable auto acknowledgment so that received messages can be manually acknowledged (or, more to the point, not acknowledged!).

Thanks to everyone who submitted issues and contributed code (list of the main merged pull requests below):

#### What's Changed

-   Tidy up use of mutex in `messageIds` by [@MattBrittan](https://github.com/MattBrittan) in [#602](eclipse-paho/paho.mqtt.golang#602)
-   Resolve situation where broker accepted connection but did not respond to CONNECT packet in a timely manner (should be very unusual but was reported in [#597](eclipse-paho/paho.mqtt.golang#597)) by [@MattBrittan](https://github.com/MattBrittan) in [#603](eclipse-paho/paho.mqtt.golang#603)
-   Resolve race condition in test by [@MattBrittan](https://github.com/MattBrittan) in [#606](eclipse-paho/paho.mqtt.golang#606)
-   Re-architect status handling by [@MattBrittan](https://github.com/MattBrittan) in [#607](eclipse-paho/paho.mqtt.golang#607)
-   Enable manual ACK by [@shivamkm07](https://github.com/shivamkm07) in [#578](eclipse-paho/paho.mqtt.golang#578)

#### New Contributors

-   [@shivamkm07](https://github.com/shivamkm07) made their first contribution in [#578](eclipse-paho/paho.mqtt.golang#578)

**Full Changelog**: eclipse-paho/paho.mqtt.golang@v1.4.1...v1.4.2

</details>

See merge request alpine/infra/build-server-status!8