Skip to content

sush4nt/docker-containers

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

docker-containers

This repository contains references to 3 docker images for basic Data Science applications

  1. General (Basic analysis, ML applications, Jupyter)
  2. Deep (Supports deep learning frameworks)
  3. PySpark

Why the need Docker Images?

Docker Official Images are a curated set of Docker open source and drop-in solution repositories. These images have clear documentation, promote best practices, and are designed for the most common use cases.

Guide

Steps:

  1. Create a dedicated folder for installation

  2. Install Docker Desktop for Windows/Linux

  3. Open Powershell; run as Administrator/ Open Terminal

  4. Change the directory to the dedicated folder

  5. Verify if the command docker is working on Powershell/Terminal window

  6. Build the Docker Container using the following command:

        docker build -t image_name path/to/Dockerfile/folder
    

7.1 Build the container using the image built(Windows users):

        docker run -it `
        -v ${PWD}:/workspace/ --gpus all `
        -p 8888:8888 -p 8050:8050 `
        -p {portOfChoice}:{portOfChoice} `
        --name container_name image_name `
        /bin/bash

7.2 Build the container using the image built(Linux users):

         docker run -it \
         -v $(pwd):/workspace/ --gpus all \
         --net=host \
         --name <container_name> \
         <image_name> \
         /bin/bash
  1. Start the container: docker start container_name

  2. Attach to the container: docker attach container_name

  3. Use the alias work to start the tmux sessions

  4. If there are any errors after putting the command work, use sudo apt-get install dos2unix -> apply the conversions to tmux and workspace setup config files -> dos2unix {filename}

  5. If you building on top popular images such as tensorflow/pytorch + jupyter notebook, these additional tags during docker run might be helpful:

        1. -e GRANT_SUDO=yes --user=root
        2. --gpus all
    

Additional docker commands:

  1. List downloaded images: docker image ls
  2. Check currently running or previously run containers: docker ps -a
  3. Start a container: docker start container_name
  4. Attach to a container: docker attach container_name
  5. Detach from tmux workspace: ctrl+a+d or tmux detach or exit
  6. Delete a container: docker rm container_name
  7. Delete all containers: docker rm -f $(docker ps -a -q)
  8. Reattach to the tmux workspace: tmux a or tmux attach
  9. Exit a container: exit
  10. To check docker disk usage: docker system df
  11. To remove all docker build cache, containers and images: docker system prune -all
  12. To remove docker build cache: docker builder prune

References:

  1. https://pythonhosted.org/an_example_pypi_project/setuptools.html
  2. https://docs.docker.com/
  3. https://tmuxguide.readthedocs.io/en/latest/tmux/tmux.html
  4. https://linuxize.com/post/getting-started-with-tmux/
  5. https://www.hostinger.in/tutorials/tmux-beginners-guide-and-cheat-sheet/
  6. https://docs.docker.com/desktop/install/ubuntu/
  7. https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository
  8. https://unix.stackexchange.com/questions/332419/tmux-mouse-mode-on-does-not-allow-to-select-text-with-mouse
  9. https://stackoverflow.com/questions/61890687/dash-app-refusing-to-start-127-0-0-1-refused-to-connect

Additional References

  1. For installing PySpark: refer this article

  2. If you are facing issues running this on MacOS; add another tag to docker build:

         --platform linux/x86_64
    
  3. To copy contents on tmux panes, refer to issue

  4. Add PySpark kernel to jupyter link