Learning Docker by building an R learning environment
Docker makes it easy to try a piece of software without the installation problems you might hit when installing it natively on your system. It gives you an environment that stays separate from the rest of your computer, which you can use as a playground for trying different technologies.
In this post I describe creating an R environment using RStudio Server with Docker.
If you don't have Docker installed, get Docker Desktop here. Once it is installed, follow the instructions below to set up RStudio Server.
GET RSTUDIO DOCKER IMAGE
In a terminal, invoke the following command:
docker pull rocker/rstudio
docker images
will list the images you have locally. With the rstudio
image downloaded, you are ready to start RStudio Server.
docker run -d -p 8787:8787 -v "$(pwd)":/home/rstudio -e PASSWORD='5tr0nG&_passW0rD' rocker/rstudio
Flags passed to the docker run
command
The docker run
command creates a container from an image. The flags passed to it are described below:
-d
starts the container in detached mode, so the terminal prompt returns to you as soon as the command is invoked.
-p
maps a port on the host machine to a port inside the container, written as host:container.
RStudio Server is a web application that makes RStudio available as a web service, and a web service listens on a specific port. For a web service running inside a Docker container to be reachable from the host, the container's port has to be mapped to a port on the host. With the ports mapped, you can access RStudio from a web browser on your system.
-v
is used to mount a folder on your host to a folder inside the container. With -v $(pwd):/home/rstudio
you are mapping the current folder $(pwd)
to the /home/rstudio
folder inside the container.
Mounting folders allows you to store critical files, code, and data on your host system while using the container only for processing.
A word of caution: because the folder is shared, deleting a file under /home/rstudio inside the container also deletes it from the host system's folder. Take care when deleting files inside the container.
-e
allows you to set environment variables in the container. With -e PASSWORD='5tr0nG&_passW0rD'
we are creating an environment variable PASSWORD
inside the container and giving it 5tr0nG&_passW0rD
as the value. The single quotes matter because the shell would otherwise treat & as the end of the command. The PASSWORD
environment variable is necessary because RStudio Server expects it and will error out if it is not set.
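As a quick check of how the shell handles the special characters in this password, here is a small sketch in plain shell with no Docker involved. The variable name RSTUDIO_PASSWORD is only an illustrative choice, not something Docker or RStudio Server looks for.

```shell
# Single quotes stop the shell from interpreting & (run in background)
# or any other special character inside the value.
RSTUDIO_PASSWORD='5tr0nG&_passW0rD'

# Print the value to confirm it survived intact.
printf '%s\n' "$RSTUDIO_PASSWORD"
```

The same variable could then be handed to the container with -e PASSWORD="$RSTUDIO_PASSWORD", which also keeps the password out of the command you type.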
Finally, in the docker run
command, after the flags have been set, you specify the image name, in this case rocker/rstudio
.
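To tie the flags together, here is a minimal sketch of a shell helper that assembles the same docker run command. The function name start_rstudio, its two parameters, and the RSTUDIO_PASSWORD variable are hypothetical names chosen for this example; they are not part of Docker or the rocker images. It mounts a folder you pass in rather than always using the current one, which is one way to limit what a mistaken delete inside the container can reach.

```shell
# start_rstudio [HOST_PORT] [PROJECT_DIR]
# Hypothetical helper: starts RStudio Server from the rocker/rstudio image.
start_rstudio() {
    host_port="${1:-8787}"        # port on the host; the container side stays 8787
    project_dir="${2:-$(pwd)}"    # host folder that becomes /home/rstudio

    # ${VAR:?message} aborts with the message if RSTUDIO_PASSWORD is unset or empty.
    docker run -d \
        -p "${host_port}:8787" \
        -v "${project_dir}:/home/rstudio" \
        -e PASSWORD="${RSTUDIO_PASSWORD:?set RSTUDIO_PASSWORD first}" \
        rocker/rstudio
}
```

With RSTUDIO_PASSWORD set, calling start_rstudio 8787 "$HOME/r-playground" would publish the server on localhost:8787. The folder path is only an example.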
You now have RStudio Server running inside a container and available to you on the host. Open up a browser and type localhost:8787
in the address bar. Log in with rstudio
as the username and the password you set in the environment variable.
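If the login page does not load, a quick check from the terminal can help. The sketch below wraps the standard docker ps command in a small helper. The function name rstudio_status is hypothetical, and the default assumes the container was started from the rocker/rstudio image.

```shell
# rstudio_status [IMAGE]
# Hypothetical helper: lists running containers created from IMAGE.
rstudio_status() {
    image="${1:-rocker/rstudio}"

    # Show ID, status and port mappings for matching running containers.
    docker ps --filter "ancestor=${image}" \
        --format '{{.ID}} {{.Status}} {{.Ports}}'
}
```

An empty result suggests the container is not running. In that case docker ps -a also lists stopped containers, and docker logs followed by a container ID shows what the server printed at startup.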
CUSTOM DOCKER IMAGE
What we did earlier was to download a pre-built image. These images are contributed by people who have written a Dockerfile (a recipe for building the image), built the image from it, and pushed it to Docker Hub for distribution.
As an example, consider the tidyverse
Docker image. The Dockerfile can be found here and is also shown below.
FROM rocker/rstudio:3.6.2
RUN apt-get update -qq && apt-get -y --no-install-recommends install \
libxml2-dev \
libcairo2-dev \
libsqlite3-dev \
libmariadbd-dev \
libmariadbclient-dev \
libpq-dev \
libssh2-1-dev \
unixodbc-dev \
libsasl2-dev \
&& install2.r --error \
--deps TRUE \
tidyverse \
dplyr \
devtools \
formatR \
remotes \
selectr \
caTools \
BiocManager
Copy the contents of the tidyverse Dockerfile into a file locally and name it Dockerfile
. Don't add .txt
or any other extension to the file name. Edit your local Dockerfile
and add any additional R libraries you want installed in the image. You can append R library names to the existing list of R libraries already being installed in the Dockerfile.
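If you only want a few extra packages, a shorter route is to start from the published tidyverse image and add packages on top. This is a different approach from copying the whole file as described above, sketched under the assumption that the rocker/tidyverse:3.6.2 tag is available on Docker Hub. The packages data.table and janitor are only example names. Note that the snippet overwrites any existing Dockerfile in the current folder.

```shell
# Write a small Dockerfile into the current folder.
# The quoted 'EOF' keeps the shell from expanding anything inside the file.
cat > Dockerfile <<'EOF'
# Assumed base image: tidyverse already installed on top of rocker/rstudio.
FROM rocker/tidyverse:3.6.2

# install2.r is the same installer script used in the tidyverse Dockerfile.
# Replace the package names with the ones you need.
RUN install2.r --error \
    data.table \
    janitor
EOF

# Show the result.
cat Dockerfile
```

Because the base image already contains the system libraries and the tidyverse, this variant should build faster than the full Dockerfile, at the cost of depending on one more upstream image.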
docker build -t tidyverse-custom:first .
will build your custom image from the Dockerfile you just created.
In the docker build
command -t
sets the name of the image in name:tag format. Here, tidyverse-custom
is the name and first
is the tag. The docker build
command will now build the image. This can take a while so be prepared to wait.
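Once the build finishes, it can be worth confirming that a package actually loads before starting the server. Below is a minimal sketch of a helper that builds the image and then runs a throwaway container to load one R library. The function name build_and_check is hypothetical, and it assumes that Rscript is on the image's path and that the image allows its default command to be overridden, which I believe holds for the rocker images but have not verified for every tag.

```shell
# build_and_check IMAGE [PACKAGE]
# Hypothetical helper: builds IMAGE from ./Dockerfile, then loads PACKAGE in a
# throwaway container to confirm the R library was installed.
build_and_check() {
    image="$1"
    package="${2:-tidyverse}"

    # --rm removes the check container as soon as Rscript exits.
    docker build -t "$image" . &&
        docker run --rm "$image" Rscript -e "library(${package})"
}
```

For example, build_and_check tidyverse-custom:first dplyr would build the image and fail with an R error if dplyr is missing.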
Once your image is ready, you can invoke docker run
on your image with the same flags described above to stand up a container running RStudio Server with R libraries of your choice.
docker run -d -p 8799:8787 -v "$(pwd)":/home/rstudio -e PASSWORD='5tr0nG&_passW0rD' tidyverse-custom:first
Remember to choose another host port if 8787
on your host is still in use. I've used 8799
in the command above.
docker ps
lists your running containers. Pick the ID of the container you wish to stop and pass it to docker stop
to shut down a running container.
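If you would rather not copy container IDs by hand, the lookup and the stop can be combined. The following is a sketch only: the function name stop_rstudio is hypothetical, and it stops every running container created from the given image, which may be more than you intend if you have several running.

```shell
# stop_rstudio [IMAGE]
# Hypothetical helper: stops every running container created from IMAGE.
stop_rstudio() {
    image="${1:-rocker/rstudio}"

    # -q prints only container IDs, one per line.
    ids=$(docker ps -q --filter "ancestor=${image}")

    if [ -z "$ids" ]; then
        echo "No running containers found for ${image}"
        return 0
    fi

    # Word splitting is intended here: one argument per container ID.
    docker stop $ids
}
```

A stopped container still exists on disk. Passing its ID to docker rm removes it, while the files in the mounted host folder stay where they are.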