[yugabyted] Error binding socket after docker host restart #18572

Closed
1 task done
pablopla opened this issue Aug 6, 2023 · 3 comments
Labels: area/ecosystem, kind/enhancement, priority/medium


pablopla commented Aug 6, 2023

Jira Link: DB-7509

Description

The Yugabyte container fails to bind its socket after the host restarts.
Docker assigns IP addresses to containers based on their start order. If several containers are running on your machine and you restart the host, the Yugabyte container might get a different address.

To reproduce:

  1. Start a Yugabyte container: sudo docker run -it -d --restart=always -p 5433:5433 --hostname yugabyte --name yugabyte yugabytedb/yugabyte:2.18.1.0-b84 bin/yugabyted start --daemon=false
  2. Check the Yugabyte container's IP address: sudo docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' yugabyte
  3. Start several other Docker containers.
  4. Restart the host machine.
  5. Check whether the Yugabyte container's IP address changed after the restart: sudo docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' yugabyte
  6. If the IP address changed, the logs show an error: cat /root/var/logs/tserver.err
    20 tablet_server_main_impl.cc:208] Network error (yb/util/net/socket.cc:325): Error binding socket to 172.19.0.7:9100: Cannot assign requested address (system error 99)
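
To confirm that the failure comes from a stale address, the address in the log line can be compared with the one Docker currently reports. The following is a minimal sketch; the function names `bind_error_ip` and `check_stale_ip` are hypothetical helpers introduced here, and the log format is assumed to match the line quoted in step 6.

```shell
# Hypothetical helper: read tserver log lines on stdin and print the IP
# address from each "Error binding socket to <ip>:<port>:" message.
bind_error_ip() {
  sed -n 's/.*Error binding socket to \([0-9.]*\):[0-9]*:.*/\1/p'
}

# Hypothetical helper: compare the logged address ($1) with the address
# the container has now ($2) and report whether it changed.
check_stale_ip() {
  logged_ip=$1
  current_ip=$2
  if [ "$logged_ip" = "$current_ip" ]; then
    echo "same"
  else
    echo "changed: $logged_ip -> $current_ip"
  fi
}
```

Against a real container, the first argument could come from piping `docker exec yugabyte cat /root/var/logs/tserver.err` through `bind_error_ip`, and the second from the `docker inspect` command in step 2.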

@FranckPachot
Contributor

Thanks a lot, @pablopla, for filing the issue and identifying the root cause. Yes, YugabyteDB needs a static network identity for its nodes, with a static IP. That's why it is deployed with StatefulSets in Kubernetes. For labs, you can do something similar with a Docker network.

Reproduce without restarting the host

If another container starts while the YugabyteDB container is stopped, it takes the released IP address, and the YugabyteDB container is assigned a new one on its next start:

docker run -it -d --restart=always -p 5433:5433 --hostname yugabyte --name yugabyte yugabytedb/yugabyte:2.18.1.0-b84 bin/yugabyted start --daemon=false
docker exec yugabyte bash -c 'until postgres/bin/pg_isready -h yugabyte ; do sleep 1 ; done | uniq'
docker exec -i yugabyte bash -c 'ysqlsh -h $(hostname)' <<<'create table franck as select 42;'
docker exec yugabyte hostname -i

# stopping the container releases the IP
docker stop yugabyte
# start another container that will take the IP
docker run -d --name zzz alpine sleep infinity
docker exec zzz hostname -i

# If the IP is the same, starting yugabyte will get a new one but tries to bind to the old one
docker start yugabyte
docker logs yugabyte

# cleanup
docker rm -f yugabyte zzz

The error is:

Failed to bind to address: 172.17.0.2:7100:
For more information, check the logs in /root/var/logs

because this address is already used by another container.

Solution: Static IP

The difference here is that I add --network yb42 --ip 172.42.10.1 to docker run, which attaches the container with a static IP address to a network that I created with docker network create --subnet=172.42.0.0/16 yb42.

docker network create --subnet=172.42.0.0/16 yb42
docker run -it -d --restart=always -p 5433:5433 --hostname yugabyte --network yb42 --ip 172.42.10.1 --name yugabyte yugabytedb/yugabyte:2.18.1.0-b84 bin/yugabyted start --daemon=false
docker exec yugabyte bash -c 'until postgres/bin/pg_isready -h yugabyte ; do sleep 1 ; done | uniq'
docker exec -i yugabyte bash -c 'ysqlsh -h $(hostname)' <<<'create table franck as select 42;'
docker exec yugabyte hostname -i

# stopping the container releases the IP
docker stop yugabyte
# start another container that will take the IP
docker run -d --name zzz alpine sleep infinity
docker exec zzz hostname -i

# restart will use the same parameters with the static IP address
docker start yugabyte
docker logs yugabyte
# check that my table is still there
docker exec -i yugabyte bash -c 'ysqlsh -h yugabyte.yb42' <<<'\d'

# cleanup
docker rm -f yugabyte zzz

Of course, if another container is started on the same network without setting another IP address, we will encounter the same problem. But now we have full control of the IP assigned.
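
Docker accepts --ip only on a user-defined network with a configured subnet, and the address has to fall inside that subnet. As a rough sketch, the check below verifies this before calling docker run; `ip_to_int` and `ip_in_subnet` are hypothetical helper names, and the code assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed when address $1 lies inside CIDR block $2,
# for example: ip_in_subnet 172.42.10.1 172.42.0.0/16
ip_in_subnet() {
  addr=$(ip_to_int "$1")
  base=$(ip_to_int "${2%/*}")   # network part before the slash
  bits=${2#*/}                  # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( addr & mask )) -eq $(( base & mask )) ]
}
```

With the values used above, `ip_in_subnet 172.42.10.1 172.42.0.0/16` succeeds, while an address from the default bridge range such as 172.17.0.2 does not.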


pablopla commented Aug 7, 2023

Thank you for the workaround.
This is the first container I've used that requires a static IP address, and it feels limiting in development. I have been using PostgreSQL, Redis, Nginx, Pulsar and many more. It would be convenient if the container cleaned the network state (not the data) before each start and worked with a dynamic IP address.

This should be documented here, here and here. The need to use "--name yugabyte" to be able to connect to the database with ysqlsh also needs to be documented.

The docs on the Docker Hub page should have a short description of Yugabyte and all the specific instructions for using the container.

@FranckPachot
Contributor

Yes, I'll ping the Doc team.
A distributed SQL database is not the same beast as monolithic databases and stateless apps. The nodes must have an identity (name, IP). Reconfiguring after an IP change would have to be broadcast to all nodes. This may not scale (we have the elasticity of starting new nodes quickly), and how would it be handled during a network partition (we are resilient to node or network failure)?

nchandrappa added a commit that referenced this issue Nov 8, 2024
…ased deployments.

Summary:
In Docker-based deployments, yugabyted by default binds to the IP address of the container. When the host machine restarts, yugabyted fails to restart because the container's IP address may have changed. This change updates the default behavior of yugabyted to bind to the container `hostname` in Docker deployments.

### Tests

1. Start yugabyted node docker

```sh
docker run -d --name yugabyte --hostname yugabyte -p7000:7000 -p9000:9000 -p15433:15433 -p5433:5433 -p9042:9042 nchandrappa/yugabyte:2.25.0.0-hostname bin/yugabyted start --background=false
```

2. Stop YB container

```sh
docker stop <container-id>
```

3. Start YB container

```sh
docker start <container-id>
```
Jira: DB-7509

Test Plan: manual test

Reviewers: sgarg-yb

Reviewed By: sgarg-yb

Subscribers: yugabyted-dev

Differential Revision: https://phorge.dev.yugabyte.com/D39786
nchandrappa added a commit that referenced this issue Jan 2, 2025
…name for docker based deployments.

Summary:
Original commit: 59e080a / D39786
In Docker-based deployments, yugabyted by default binds to the IP address of the container. When the host machine restarts, yugabyted fails to restart because the container's IP address may have changed. This change updates the default behavior of yugabyted to bind to the container `hostname` in Docker deployments.

### Tests

1. Start yugabyted node docker

```sh
docker run -d --name yugabyte --hostname yugabyte -p7000:7000 -p9000:9000 -p15433:15433 -p5433:5433 -p9042:9042 nchandrappa/yugabyte:2.25.0.0-hostname bin/yugabyted start --background=false
```

2. Stop YB container

```sh
docker stop <container-id>
```

3. Start YB container

```sh
docker start <container-id>
```
Jira: DB-7509

Test Plan: manual test

Reviewers: sgarg-yb

Reviewed By: sgarg-yb

Subscribers: yugabyted-dev

Differential Revision: https://phorge.dev.yugabyte.com/D40978