A Docker image for experimenting with Apache Sqoop, with Hadoop configured in pseudo-distributed mode (a single-node cluster).
Below are the steps to use this image on Play with Docker.
- First of all, create an account on Docker Hub.
- Log in to Play with Docker using the Docker Hub account you just created.
- You should see a green "Start" button. Click it to start a session.
- Click "+ Add new instance" in the left pane to create a VM.
- A new terminal should show up in the right pane. Here, we need to pull the Docker image from the GitHub Container Registry (GHCR). To do so, execute:
docker pull ghcr.io/kasipavankumar/sqoop-docker:latest
- After the image has been pulled into the VM, we need to start a new container and switch into its terminal (usually bash). To do so, execute:
docker run -it ghcr.io/kasipavankumar/sqoop-docker:latest
At this stage, the container will boot up and start everything required to run Sqoop.
From now on, you will be inside the container's bash shell (terminal). 🚀
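Before running an import, it can help to confirm that the tools are on the PATH inside the container. The snippet below is a minimal sketch, assuming the image exposes `hadoop` and `sqoop` as commands (both are used later in this walkthrough). It only checks that they are present and does not start or modify anything.

```shell
# Count how many of the expected tools are missing from the PATH.
# Assumption: the image provides both `hadoop` and `sqoop` as commands.
missing=0
for tool in hadoop sqoop; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
echo "missing tools: $missing"
```

If either tool is reported as missing, the container has probably not finished booting, or the shell is not the one started by the `docker run` command above.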
To verify that everything works, try the following command:
sqoop import \
--connect jdbc:mysql://localhost/employees \
--table employees \
--username bda \
--password 123456
This should import all the employee data into the Hadoop file system, which can be verified by:
hadoop fs -ls /user/root/employees
The listing should show around 5 files, and running cat on any one of them should show a few employee records. 🎉
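As a rough sketch of that last check, the snippet below prints the first few records from one output file. The file name `part-m-00000` is an assumption based on Sqoop's usual naming for import output; use whichever name the listing above actually shows.

```shell
# Assumed path: Sqoop typically names import output files part-m-00000, part-m-00001, ...
# Replace it with a file name taken from the `hadoop fs -ls` listing.
PART_FILE="/user/root/employees/part-m-00000"

if command -v hadoop >/dev/null 2>&1; then
  # Show only the first five records rather than the whole file.
  hadoop fs -cat "$PART_FILE" | head -n 5
else
  echo "hadoop not found; run this inside the container" >&2
fi
```

Piping through `head` keeps the output short, since a single file may hold a large share of the table.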
D. Kasi Pavan Kumar (c) 2021