Creating a mongo image set with --replSet #246

Closed
gioragutt opened this issue Feb 7, 2018 · 36 comments
Labels
question Usability question, not directly related to an error with the image

Comments

@gioragutt

Hi, I've opened a SO thread, but I guess I can ask it here as well.

I want to create an Image that would use mongo:latest, but would initialize mongod with --replSet, so that I can later (in the custom Dockerfile) call rs.initiate() in the mongo shell.

Is there any way I can do that?

@myashchenko

Hi @gioragutt. Have you resolved this ticket?
I have a similar requirement (run mongo with --replSet and initiate the replica set) and made several attempts to get it working, but it's still not working. See #249 for details.

@crapthings

@gioragutt @YashchenkoN

same here

@MaBeuLux88

I really don't see the problem guys. I did it like this.
Problem solved?

#!/usr/bin/env bash

mkdir -p data/{db1,db2,db3}

#Create docker network
docker network create mongonet

# Here we start our main MongoDB instances in 3.6.3
docker run -d -p 27017:27017 -v $(pwd)/data/db1:/data/db1 \
	-u 1000:1000 -h mongo1 --network mongonet \
	--network-alias mongo1 --name mongo1 \
	mongo:3.6.3 --dbpath /data/db1 --replSet replicaTest --bind_ip_all --logpath /data/db1/mongod.log

docker run -d -p 27018:27017 -v $(pwd)/data/db2:/data/db2 \
	-u 1000:1000 -h mongo2 --network mongonet \
	--network-alias mongo2 --name mongo2 \
	mongo:3.6.3 --dbpath /data/db2 --replSet replicaTest --bind_ip_all --logpath /data/db2/mongod.log

docker run -d -p 27019:27017 -v $(pwd)/data/db3:/data/db3 \
	-u 1000:1000 -h mongo3 --network mongonet \
	--network-alias mongo3 --name mongo3 \
	mongo:3.6.3 --dbpath /data/db3 --replSet replicaTest --bind_ip_all --logpath /data/db3/mongod.log

sleep 1

# Here we initialize the replica
echo 'rs.initiate({
      _id: "replicaTest",
      members: [
         { _id: 0, host: "mongo1:27017" },
         { _id: 1, host: "mongo2:27017" },
         { _id: 2, host: "mongo3:27017", arbiterOnly:true }]});' | mongo

Most of the options on my docker run lines are not mandatory.
Works like a charm for me at least on Debian.
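
For anyone who wants to generate the rs.initiate document rather than hand-write it, here is a minimal Python sketch. The function name build_replset_config and its arbiter_index parameter are made up for illustration; only the document shape (_id, members, host, arbiterOnly) comes from the script above.

```python
import json


def build_replset_config(set_name, hosts, arbiter_index=None):
    """Return a replica set config document suitable for rs.initiate().

    set_name must match the value passed to mongod via --replSet.
    hosts is a list of "hostname:port" strings that every member can resolve.
    arbiter_index (hypothetical parameter) marks one member as arbiterOnly.
    """
    members = []
    for member_id, host in enumerate(hosts):
        member = {"_id": member_id, "host": host}
        if member_id == arbiter_index:
            member["arbiterOnly"] = True
        members.append(member)
    return {"_id": set_name, "members": members}


if __name__ == "__main__":
    config = build_replset_config(
        "replicaTest",
        ["mongo1:27017", "mongo2:27017", "mongo3:27017"],
        arbiter_index=2,
    )
    # JSON is valid input for the mongo shell, so this line can be piped to it.
    print("rs.initiate({});".format(json.dumps(config)))
```

The printed line can be piped to the mongo shell the same way the echo in the script is.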

You could also create your own image (with a Dockerfile) and insert a new YML conf file and use it to start your MongoDB.

@gioragutt
Author

@MaBeuLux88 this is a script. Our question is how do we create a Dockerfile for an image that does this out of the box.

@MaBeuLux88

Ok now I understand and here is my answer 2.0.
I think you can't do that because this configuration is written inside the database itself which would require to start MongoDB before actually starting MongoDB...
In theory you could replace the CMD by a bash script, detect the first run and start this command once then start the mongod. But it really violates the Docker philosophy and you don't really want to go through this.
On top of that, the command line has to change (because it contains the host name) and the 3 nodes would not be identical because this has to be sent to only one of the 3 servers.

Moreover, if I understood your real problem correctly, you want to run MongoDB on AWS, and MongoDB already has a DBaaS: MongoDB Atlas.

It's a fully managed service and has tons of advantages versus running it yourself on AWS.

Example of features you would miss :

  • Automated operations (version upgrade...)
  • Security management (users, SSL, encryption, ...)
  • Continuous backups
  • Data explorer
  • Real time metrics
  • Queryable snapshots
  • Auto scale disk size
  • etc

@yosifkit
Member

Since the data is normally in a volume (/data/db), you'll need to define a different folder in order to be able to docker commit it. Warning that this could be extremely bad for performance since the DB files will be part of the copy-on-write filesystem employed by docker images.

FROM mongo:3.6
CMD [ "--dbpath", "/some/path/outside/data/db/", "--replSet", "myrepl" ]

As far as embedding the data using a RUN line goes, it would be a fairly complex sequence: run the mongod server in the background (probably using the entrypoint script), wait for it to finish starting, call rs.initiate, and then shut down the backgrounded mongod. You might be able to use the /docker-entrypoint-initdb.d/ folder to have the entrypoint automatically run your js file for rs.initiate, but then you'll have to guess when the entrypoint is done and mongo is ready to be stopped.
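
The "wait for it to finish starting" step can be approximated by polling the TCP port instead of sleeping for a fixed time. This is only a sketch using the Python standard library; wait_for_port is a hypothetical helper, and an open port means mongod accepts connections, which is taken here as a proxy for readiness rather than a guarantee.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout` seconds
    have elapsed. `interval` is both the per-attempt connect timeout and the
    pause between attempts.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A caller would run something like wait_for_port("localhost", 27017) before sending rs.initiate, and give up if it returns False.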

@gioragutt
Author

@yosifkit the only thing that is hard here is the part of running mongod with --replSet. I want it to be done in the Dockerfile, so that I don't have to do anything after the container is running.

AFAIK, running commands on the mongod instance from Dockerfile isn't really an issue.

@yosifkit
Member

If you don't want to commit the data to the image then this will work for your first cluster member (since the initdb.d scripts are only run when the data directory is empty).

FROM mongo:3.6
RUN echo "rs.initiate();" > /docker-entrypoint-initdb.d/replica-init.js
CMD [ "--replSet", "myrepl" ]

@MaBeuLux88

This will work only for a standalone node started as a ReplicaSet.
You can't start the 3 nodes that way. They won't connect to each other. That would just create 3 one node replica sets

@gioragutt
Author

@MaBeuLux88 that's basically all I needed for now (or to be correct, then, since I don't even work in the company anymore).

@yosifkit can you elaborate on what's going on in the Dockerfile? I get lines 1 and 2, not sure why 3 is working. I'm not really proficient in Dockerfiles, so I don't understand what exactly happens in the CMD command.

@yosifkit
Member

this will work for your first cluster member

So yes, you will need something more complex for the other nodes. But no, my hurried Dockerfile will not actually work since the host field in the replica config will be set to 127.0.0.1:27017.

@gioragutt, There is an entrypoint script that checks the args for a flag (like --replSet) and prepends mongod to the args.

When you do this (or the CMD from my Dockerfile above):

$ docker run -d --name mongo mongo:3.6 --replSet myrepl

What really is being run is the entrypoint script with the CMD as the arguments:

$ docker-entrypoint.sh --replSet myrepl
$ # which, after initialization, eventually ends with exec so that the bash process executing the shell script is replaced by mongod
$ # i.e. pid 1 of the container becomes mongod
$ exec mongod --replSet myrepl
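
To make the argument handling concrete, here is a simplified Python model of the rule described above: if the first argument looks like a flag, mongod is prepended. The function resolve_command is hypothetical and only mirrors that one check; the real entrypoint script does considerably more (initialization, running the initdb.d scripts) before it execs.

```python
def resolve_command(args):
    """Model how CMD arguments become the final command line.

    If the first argument starts with "-", it is treated as a mongod flag
    and "mongod" is prepended. Anything else is run unchanged.
    """
    if args and args[0].startswith("-"):
        return ["mongod"] + list(args)
    return list(args)


if __name__ == "__main__":
    print(resolve_command(["--replSet", "myrepl"]))
    print(resolve_command(["bash"]))
```

This is why CMD [ "--replSet", "myrepl" ] works without naming mongod explicitly.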

@wglambert wglambert added the question Usability question, not directly related to an error with the image label Apr 25, 2018
@yosifkit
Member

yosifkit commented May 8, 2018

Docs on using a config file or flags added in docker-library/docs#1127.

Adding automation for setting up a replica set is not something we want to add to the image since it requires an external service like consul in order to reliably coordinate which is the first and where to join.

This blog seems useful on setting up a replset (with mongos config servers too): https://dzone.com/articles/composing-a-sharded-mongodb-on-docker.

Edit: this one seems pretty simple: https://medium.com/lucjuggery/mongodb-replica-set-on-swarm-mode-45d66bc9245

@yosifkit yosifkit closed this as completed May 8, 2018
@Lewik

Lewik commented Mar 4, 2019

If you don't want to commit the data to the image then this will work for your first cluster member (since the initdb.d scripts are only run when the data directory is empty).

FROM mongo:3.6
RUN echo "rs.initiate();" > /docker-entrypoint-initdb.d/replica-init.js
CMD [ "--replSet", "myrepl" ]

This should be in the documentation. A lot of devs will start mongo in Docker, and change streams (which require a replica set) are a feature by themselves, even without multiple nodes.

@yosifkit
Member

yosifkit commented Mar 5, 2019

@Lewik, as I mention later:

But no, my hurried Dockerfile will not actually work since the host field in the replica config will be set to 127.0.0.1:27017

If I remember correctly, this would make the node inaccessible from other containers and only work from the Docker host with an exposed port on 27017 (docker run -p 27017:27017), so this would not be useful for the docs.

@Lewik

Lewik commented Mar 5, 2019

@Lewik, as I mention later:

But no, my hurried Dockerfile will not actually work since the host field in the replica config will be set to 127.0.0.1:27017

If I remember correctly, this would make the node inaccessible from other containers and only work from the Docker host with an exposed port on 27017 (docker run -p 27017:27017), so this would not be useful for the docs.

As I said:

change streams ... are a feature by themselves, even without multiple nodes

The ability to listen for database changes is a feature, even for one node. Developers will need only one node with change streams. In production we should set up multiple nodes, of course, but during development one node is enough. And for small projects one DB is enough even in prod.

@dozzes

dozzes commented May 11, 2019

I've tried to set up a replica set to enable change streams for MongoDB 3.6 but have still not succeeded:
Does anyone know how to initialize the DB and enable change streams?

@zhangyoufu

FYI, /docker-entrypoint-initdb.d only works for a replica set listening on localhost.

When scripts under /docker-entrypoint-initdb.d are executed, mongod is forced to listen on localhost. If you write rs.initiate(); in your script, your application may not be happy with a replica set containing a member at localhost:27017. If you write a hostname/FQDN in your script, the initialization will fail:

replSet initiate got NodeNotFound: No host described in new configuration 1 for replica set rs0 maps to this node while validating { _id: "rs0", version: 1.0, members: [ { _id: 0.0, host: "mongo" } ] }

See #339
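
One possible way around this, sketched here and not tested against a live server, is to initiate with localhost during initdb and rewrite the member host afterwards with a reconfig once mongod listens on its real interface. The helper rehost_member below is hypothetical and only builds the new config document; the replSetReconfig command itself requires a version number higher than the current one, which is why the helper increments it. The hostname mongo:27017 is an assumption about your network.

```python
import copy


def rehost_member(config, member_id, new_host):
    """Return a copy of a replica set config with one member's host changed.

    The version field is incremented because replSetReconfig rejects a
    config whose version is not higher than the one currently in use.
    """
    new_config = copy.deepcopy(config)
    for member in new_config["members"]:
        if member["_id"] == member_id:
            member["host"] = new_host
            break
    else:
        raise ValueError("no member with _id {}".format(member_id))
    new_config["version"] = config.get("version", 1) + 1
    return new_config


if __name__ == "__main__":
    current = {
        "_id": "rs0",
        "version": 1,
        "members": [{"_id": 0, "host": "localhost:27017"}],
    }
    # With a driver such as pymongo this document would be sent as
    # client.admin.command("replSetReconfig", updated)  (assumed usage).
    updated = rehost_member(current, 0, "mongo:27017")
    print(updated)
```

The new host must resolve to the node itself, otherwise the reconfig is rejected with the same NodeNotFound error.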

@dozzes

dozzes commented Jul 30, 2019

Replica set can be initialized on container startup.
This script is set as ENTRYPOINT for docker file which installs mongo on linux (base image is ubuntu:16.04)

#!/bin/sh
# start mongod, initialize replica set
mongod --fork --config /opt/mongod.conf
mongo --quiet < /opt/replica.js
# restart mongod    
mongod --shutdown --dbpath /var/lib/mongodb
mongod --config /opt/mongod.conf

replica.js:

rs.slaveOk()
rs.initiate()
rs.initiate({_id:"rs0", members: [{"_id":1, "host":"127.0.0.1:27017"}]})

https://stackoverflow.com/a/56194606/4560419

@tianon
Member

tianon commented Jul 30, 2019 via email

@dozzes

dozzes commented Aug 15, 2019

External applications can check another indication about mongod readiness.

@crapthings

I ended up with this one

https://gist.github.com/crapthings/71fb6156a8e9b31a2fa7946ebd7c4edc

@ziocleto

I ended up with this one

https://gist.github.com/crapthings/71fb6156a8e9b31a2fa7946ebd7c4edc

Wow, that's the real and proper solution right there. Great job!

@bergkvist

bergkvist commented Mar 22, 2020

The only reason I'm here is that I want to use change streams in MongoDB locally (which requires using a replicaset). I don't want to run multiple replicas.

Is there no way of doing this with a single docker image/container?

@MaBeuLux88

@bergkvist Yes of course. Use a single RS. This should do the job no problemo.

#!/usr/bin/env bash
docker run --rm -d -p 27017:27017 --name mongo mongo:4.2.3 --replSet=test
sleep 5
docker exec -it mongo mongo --quiet --eval "rs.initiate()"
docker exec -it mongo mongo --quiet

@bergkvist

@MaBeuLux88 How could I make this work with docker-compose up -d?

@MaBeuLux88

@bergkvist I just made this for you. Let me know if this works: https://github.com/MaBeuLux88/docker/tree/master/mongo-single-node-rs

@WangShuXian6

I ended up with this one

https://gist.github.com/crapthings/71fb6156a8e9b31a2fa7946ebd7c4edc

It's perfect, thank you very much

@eloparco

If you don't want to commit the data to the image then this will work for your first cluster member (since the initdb.d scripts are only run when the data directory is empty).

FROM mongo:3.6
RUN echo "rs.initiate();" > /docker-entrypoint-initdb.d/replica-init.js
CMD [ "--replSet", "myrepl" ]

Do you happen to know why, if I run it as a Dockerfile (and then try to reconnect to the container), everything is fine and the prompt says (in my case) rs0:PRIMARY>, while if I put it in a docker-compose.yml like the following I get rs0:OTHER>?

version: "3"
services:
  db:
    build:
      dockerfile: Dockerfile
      context: .
    ports:
      - "27017:27017"

Is there a workaround to fix this and set the node as primary also when using docker-compose?

@tianon
Member

tianon commented Apr 27, 2020

I'd recommend running rs.status() in your interactive shell to try and get more information about your replica set and what's going wrong.
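
If you fetch the same status document from a script instead of the shell, a small helper can turn it into a readable state. This is a sketch: summarize_status is a hypothetical name, and it only looks at the ok, errmsg and myState fields of the document returned by rs.status() (the replSetGetStatus command). The numeric codes are MongoDB's documented replica set member states.

```python
# Replica set member states as documented by MongoDB.
MEMBER_STATES = {
    0: "STARTUP",
    1: "PRIMARY",
    2: "SECONDARY",
    3: "RECOVERING",
    5: "STARTUP2",
    6: "UNKNOWN",
    7: "ARBITER",
    8: "DOWN",
    9: "ROLLBACK",
    10: "REMOVED",
}


def summarize_status(status):
    """Return a short description of this node's replica set state."""
    if not status.get("ok"):
        # The command failed, e.g. the set was never initiated or the
        # stored config does not list this node.
        return "error: {}".format(status.get("errmsg", "unknown error"))
    state = status.get("myState")
    return MEMBER_STATES.get(state, "OTHER ({})".format(state))


if __name__ == "__main__":
    print(summarize_status({"ok": 1, "myState": 1}))
    print(summarize_status({"ok": 0, "errmsg": "no replset config has been received"}))
```

An error result or anything other than PRIMARY on a single-node set usually points at the config stored in the data directory, not at the Dockerfile itself.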

@eloparco

I did it and I get Our replica set config is invalid or we are not a member of it.
I think the problem is related to what is written in this issue #249, but the solution provided is not working for me.
Do you have any suggestions?

@jdclarke5

Here is a little wait-for-it pattern written in Python, for when Python is already installed on your test container and you don't want to install mongo directly in that container. It may be useful for some.

# Docker compose file for testing
version: "3.4"
services:
  mongo:
    image: mongo:4.2
    container_name: mongo
    network_mode: host
    ports:
      - 27017:27017
    restart: always
    command: --replSet replicaset
  my_container:
    image: my_image
    container_name: my_container
    network_mode: host
    depends_on: 
    - mongo
    command: python wait-for-mongo.py pytest -s

wait-for-mongo.py:

#!/usr/local/bin/python
"""
Ping MongoDB endpoint until ready, then initialise replicaset, and run any
extra args as a subprocess.
"""

import subprocess
from pymongo import MongoClient
from pymongo.errors import OperationFailure, ServerSelectionTimeoutError
import sys
import time

MONGO_URI = "localhost:27017"
PING_RATE = 1
ATTEMPTS = 30

for _ in range(ATTEMPTS):
    print("Attempting connection to {}...".format(MONGO_URI))
    client = MongoClient(MONGO_URI, serverSelectionTimeoutMS=PING_RATE*1000)
    try: 
        client.server_info()
    except ServerSelectionTimeoutError:
        continue
    print("Connected!")
    break
else:
    print("Failed to connect after {} attempts".format(ATTEMPTS))
    sys.exit(1)

print("Initialising replicaset...")
try:
    client.admin.command("replSetInitiate")
except OperationFailure as e:
    if "already initialized" in str(e):
        pass
    else:
        print("OperationFailure: {}".format(e))
        sys.exit(1)

print("Initialised!")

if "wait-for-mongo.py" in sys.argv[0]:
    command = sys.argv[1:]
else:
    command = sys.argv

if command:
    print("Running command: {}".format(" ".join(command)))
    process = subprocess.Popen(
        command, 
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    while True:
        output = process.stdout.readline().strip()
        if output:
            print(output)
        return_code = process.poll()
        if return_code is not None:
            sys.exit(return_code)

@jdclarke5

Here's another (probably cleaner than above) solution using a wait-for-it pattern. The idea is to mount the volume containing the mongo executable from the mongo container into the waiting container, so that there is no need to have mongo installed in that container's image.

# Docker compose file for testing
version: "3.4"
services:
  mongo:
    image: mongo:4.2
    container_name: mongo
    network_mode: host
    ports:
      - 27017:27017
    restart: always
    volumes:
    - mongo-bin:/usr/bin
    command: --replSet replicaset
  my_container:
    image: my_image
    container_name: my_container
    network_mode: host
    depends_on: 
    - mongo
    volumes:
    - mongo-bin:/mongo-bin:ro
    command: ./wait-for-mongo.sh pytest -s
volumes:
  mongo-bin:

wait-for-mongo.sh:

#!/bin/sh

>&2 echo "Waiting for Mongo..."

until [ "$(/mongo-bin/mongo --host localhost --port 27017 --eval "db.stats().ok" | grep -E 'session')" ]; do
  >&2 echo "Attempting connection..."
  sleep 1
done
>&2 echo "Connected!"

>&2 echo "Initialising replicaset..."
>&2 echo $(/mongo-bin/mongo --host localhost --port 27017 --eval "rs.initiate({ _id: 'replicaset', members: [{ _id: 0, host: 'localhost:27017' }] })")

>&2 echo "Running command \"$*\"..."
# exec "$@" keeps each argument intact, even if it contains spaces
exec "$@"

@shanmukha511

Hi @yosifkit

Can the code below work with mongo:3.4.20?

FROM mongo:3.4.20
RUN echo "rs.initiate();" > /docker-entrypoint-initdb.d/replica-init.js
CMD [ "--replSet", "myrepl" ]

Is the replica set not initialised by default there, as it is with 3.6?

@tianon
Member

tianon commented Dec 15, 2020

@Lewik, as I mention later:

But no, my hurried Dockerfile will not actually work since the host field in the replica config will be set to 127.0.0.1:27017

If I remember correctly, this would make the node inaccessible from other containers and only work from the Docker host with an exposed port on 27017 (docker run -p 27017:27017), so this would not be useful for the docs.

@jshbrntt

jshbrntt commented Dec 18, 2020

This gives me what I want.

docker run -it --rm mongo:4.0.20@sha256:dd4cbe24eb8233db92b871cc556b77efcc7f9e67bc9516579796d4d08818273e bash -c "set -m ; mongod --bind_ip_all --replSet rs0 & while ! 2> /dev/null > '/dev/tcp/0.0.0.0/27017'; do sleep 1; done ; mongo --eval 'rs.initiate()' ; fg 1"

@thready

thready commented Apr 6, 2021

I've tried this every which way to no avail. I'm looking for a docker-compose file that can set up my app with a replica set and does the following:

1 - initdb - create some database that my app will use with its initial user
2 - initialize a replica set
3 - ensure one node (the only node needed for dev) can authenticate with itself (I had errors that pointed to the need for the --auth option)

I had 1 and 2 working, but I noticed when calling createIndex, I had a huge backlog of long running commands with db.currentOp(true).

Thanks for any guidance!
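
On point 3: as far as I know, mongod will not start with authorization enabled on a replica set unless a key file is configured for internal authentication (--keyFile), even for a single node. A minimal sketch for generating such a file follows. write_keyfile is a hypothetical helper; the 756-byte size mirrors the commonly used openssl rand -base64 756 and yields 1008 base64 characters, within the 1024-character limit.

```python
import base64
import os
import secrets


def write_keyfile(path, num_bytes=756):
    """Create a base64 key file readable only by its owner.

    Returns the number of characters written. The file is created with
    mode 0400 because mongod rejects key files with open permissions.
    Raises FileExistsError if the path already exists.
    """
    key = base64.b64encode(secrets.token_bytes(num_bytes))
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o400)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key)
    return len(key)
```

The file then has to be mounted into the container, owned by the user mongod runs as, and passed with --keyFile alongside --replSet.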
