This repository contains references to three Docker images for basic data science applications:
- General (Basic analysis, ML applications, Jupyter)
- Deep (Supports deep learning frameworks)
- PySpark
Docker Official Images are a curated set of Docker open source and drop-in solution repositories. These images have clear documentation, promote best practices, and are designed for the most common use cases.
- Create a dedicated folder for the installation.
- Install Docker Desktop for Windows/Linux.
- Open PowerShell as Administrator (Windows) or open a terminal (Linux).
- Change the directory to the dedicated folder.
- Verify that the `docker` command works in the PowerShell/terminal window.
- Build the Docker image using the following command:
docker build -t image_name path/to/Dockerfile/folder
7.1 Create and start a container from the built image (Windows users):
docker run -it `
-v ${PWD}:/workspace/ --gpus all `
-p 8888:8888 -p 8050:8050 `
-p {portOfChoice}:{portOfChoice} `
--name container_name image_name `
/bin/bash
7.2 Create and start a container from the built image (Linux users):
docker run -it \
-v $(pwd):/workspace/ --gpus all \
--net=host \
--name <container_name> \
<image_name> \
/bin/bash
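The build and run steps above can be combined into one script. The following is a minimal sketch for Linux, not part of the repository: the `run` helper and the `DRY_RUN`, `IMAGE_NAME`, `CONTAINER_NAME` and `DOCKERFILE_DIR` variables are assumptions introduced for illustration, and the defaults are the same placeholders used in the commands above. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Placeholder names (assumptions); replace with your own image, container and Dockerfile folder.
IMAGE_NAME="${IMAGE_NAME:-image_name}"
CONTAINER_NAME="${CONTAINER_NAME:-container_name}"
DOCKERFILE_DIR="${DOCKERFILE_DIR:-path/to/Dockerfile/folder}"

# Print the command when DRY_RUN=1 (the default here); execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# When commands would really be executed, skip them with a message if the docker CLI is missing.
if [ "${DRY_RUN:-1}" != "1" ] && ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH" >&2
else
    # Build the image, then create the container with the current folder mounted at /workspace/.
    run docker build -t "$IMAGE_NAME" "$DOCKERFILE_DIR" &&
        run docker run -it -v "$(pwd):/workspace/" --gpus all --net=host \
            --name "$CONTAINER_NAME" "$IMAGE_NAME" /bin/bash
fi
```

Running the script with `DRY_RUN=0` executes the two commands instead of printing them. The `--gpus all` flag assumes a GPU-enabled Docker setup, as in the commands above.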
- Start the container:
docker start container_name
- Attach to the container:
docker attach container_name
- Use the alias `work` to start the tmux sessions.
- If there are any errors after running `work`, install dos2unix with `sudo apt-get install dos2unix` and use it to convert Windows (CRLF) line endings in the tmux and workspace setup config files:
dos2unix {filename}
- If you are building on top of popular images such as tensorflow/pytorch + jupyter notebook, these additional flags during `docker run` might be helpful:
1. -e GRANT_SUDO=yes --user=root
2. --gpus all
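For the dos2unix step above, a small helper can detect and fix carriage returns in a config file. This is a sketch under assumptions: the function names `has_crlf` and `fix_line_endings` are hypothetical, and the fallback simply strips every carriage return with `tr`, which is fine for text config files but not for binary files.

```shell
# Return success if the file contains at least one carriage return character.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Convert a text file to LF line endings in place.
# Uses dos2unix when it is installed; otherwise strips all carriage returns with tr.
fix_line_endings() {
    file="$1"
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix "$file" >/dev/null 2>&1
    else
        tmp="$(mktemp)" &&
            tr -d '\r' < "$file" > "$tmp" &&
            cat "$tmp" > "$file" &&
            rm -f "$tmp"
    fi
}

# Example (hypothetical file name; substitute the config files shipped in your image):
# has_crlf "$HOME/.tmux.conf" && fix_line_endings "$HOME/.tmux.conf"
```

Checking with `has_crlf` first avoids rewriting files that already use Unix line endings.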
- List downloaded images:
docker image ls
- Check currently running or previously run containers:
docker ps -a
- Start a container:
docker start container_name
- Attach to a container:
docker attach container_name
- Detach from tmux workspace:
ctrl+a then d
or tmux detach
or exit
- Delete a container:
docker rm container_name
- Delete all containers:
docker rm -f $(docker ps -a -q)
- Reattach to the tmux workspace:
tmux a
or tmux attach
- Exit a container:
exit
- To check docker disk usage:
docker system df
- To remove all stopped containers, unused networks, unused images and build cache:
docker system prune -a
- To remove docker build cache:
docker builder prune
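Because the prune commands above are destructive, it can help to pick a cleanup level deliberately. The sketch below is an illustration only: `cleanup_commands` and its level names (`cache`, `unused`, `all`) are hypothetical, and it prints the corresponding Docker commands rather than running them.

```shell
# Print the cleanup commands for a given level so they can be reviewed before running.
# Levels (hypothetical names): cache < unused < all, from least to most destructive.
cleanup_commands() {
    case "$1" in
        cache)
            # Build cache only.
            echo "docker builder prune"
            ;;
        unused)
            # Stopped containers and dangling images.
            echo "docker container prune"
            echo "docker image prune"
            ;;
        all)
            # Stopped containers, unused networks, unused images and build cache.
            echo "docker system prune -a"
            ;;
        *)
            echo "usage: cleanup_commands cache|unused|all" >&2
            return 1
            ;;
    esac
}

# Check disk usage first, then decide which level is needed:
#   docker system df
#   cleanup_commands cache
```

Each printed command asks for confirmation when run interactively, so nothing is deleted until you confirm.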
- https://pythonhosted.org/an_example_pypi_project/setuptools.html
- https://docs.docker.com/
- https://tmuxguide.readthedocs.io/en/latest/tmux/tmux.html
- https://linuxize.com/post/getting-started-with-tmux/
- https://www.hostinger.in/tutorials/tmux-beginners-guide-and-cheat-sheet/
- https://docs.docker.com/desktop/install/ubuntu/
- https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository
- https://unix.stackexchange.com/questions/332419/tmux-mouse-mode-on-does-not-allow-to-select-text-with-mouse
- https://stackoverflow.com/questions/61890687/dash-app-refusing-to-start-127-0-0-1-refused-to-connect
- For installing PySpark, refer to this article.
- If you are facing issues running this on macOS, add another flag to docker build:
--platform linux/x86_64
- To copy contents from tmux panes, refer to this issue.
- To add a PySpark kernel to Jupyter, refer to this link.
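The macOS note above can be folded into the build step by choosing the flag from the operating system name. This is a minimal sketch: `build_command_for` is a hypothetical helper, and the image name and folder are the same placeholders used in the build command earlier. It prints the command rather than running it.

```shell
# Print the docker build command, adding --platform linux/x86_64 on macOS.
# $1: operating system name as reported by `uname -s`
# $2: image name, $3: folder containing the Dockerfile
build_command_for() {
    os="$1"
    image="$2"
    context="$3"
    if [ "$os" = "Darwin" ]; then
        echo "docker build --platform linux/x86_64 -t $image $context"
    else
        echo "docker build -t $image $context"
    fi
}

# Show the command for the current machine (placeholder image name and folder).
build_command_for "$(uname -s)" image_name path/to/Dockerfile/folder
```

`uname -s` reports `Darwin` on macOS, so the extra flag is only added there.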