Learning Docker by building an R learning environment

Docker makes it easy to try a piece of software without the installation problems you might hit when installing it directly (natively) on your system. It gives you an environment that is kept separate from the rest of your computer, which you can use as a playground for trying different technologies.

In the post below I describe creating an R environment using RStudio Server with Docker.

If you don't have Docker installed, get Docker Desktop here. Once it is installed, the instructions below describe how to set up RStudio Server.

GET RSTUDIO DOCKER IMAGE
In a terminal, run the following command:

docker pull rocker/rstudio

docker images will list the images you have locally. With the rstudio image downloaded, you are ready to start RStudio Server.
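
To confirm that the pull worked without reading through the whole list, you can ask Docker about a single image. The sketch below wraps that check in a small helper. The function name have_image is made up for this post; docker image inspect is a standard Docker CLI subcommand that fails when the image is not present locally.

```shell
# have_image is a hypothetical helper name, not part of Docker.
# It succeeds (exit status 0) when the named image exists locally.
have_image() {
  docker image inspect "$1" > /dev/null 2>&1
}

# Example usage (commented out so that pasting this block runs nothing):
# have_image rocker/rstudio && echo "image is available locally"
```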

docker run -d -p 8787:8787 -v "$(pwd)":/home/rstudio -e PASSWORD='5tr0nG&_passW0rD' rocker/rstudio

Flags passed to the docker run command

The docker run command creates and starts a container from an image. The flags passed to it are described below:

-d starts the container in detached mode, which returns the terminal prompt to you as soon as the command has been invoked.

-p maps ports between the host machine and the container, written as host-port:container-port.
RStudio Server is a web application, and a web application listens on a specific port; here that is port 8787 inside the container. To reach it from outside the container, that port has to be mapped to a port on the host system. With the mapping in place, you can open RStudio in a web browser on your system.
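
To make the host-port:container-port order concrete, here is a minimal sketch that builds the value for -p from a host port of your choosing. The helper name rstudio_publish_arg is hypothetical; 8787 on the right-hand side is the container port from the command above.

```shell
# Hypothetical helper: build the value for -p from a chosen host port.
# The container side stays 8787, the port used inside the container.
rstudio_publish_arg() {
  printf '%s:8787\n' "$1"
}

rstudio_publish_arg 8799   # prints 8799:8787

# e.g. docker run -d -p "$(rstudio_publish_arg 8799)" ... rocker/rstudio
# Once a container is running, docker port <container id> shows its mappings.
```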

-v mounts a folder on your host into the container. With -v $(pwd):/home/rstudio you are mapping the current folder, $(pwd), to the /home/rstudio folder inside the container. If the path to your current folder contains spaces, write it as "$(pwd)" so the shell keeps it as one argument.

Mounting folders allows you to store critical files, code, and data on your host system while using the container only for processing.

A word of caution: because the folder is shared, a file you delete inside the container is also deleted from the host system's folder. Take care when deleting files inside the container.
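
One way to limit the damage is to mount a dedicated project folder rather than everything in the current folder. The sketch below is only one option, not something the image requires. The folder name rstudio-work and the container path /home/rstudio/work are arbitrary choices for illustration.

```shell
# Create a dedicated folder instead of sharing everything in $(pwd).
# "rstudio-work" is an arbitrary, hypothetical folder name.
project_dir="$(pwd)/rstudio-work"
mkdir -p "$project_dir"

# The value for -v is host-path:container-path.
mount_arg="${project_dir}:/home/rstudio/work"
printf '%s\n' "$mount_arg"

# e.g. docker run -d -p 8787:8787 -v "$mount_arg" -e PASSWORD='5tr0nG&_passW0rD' rocker/rstudio
```

Only files under rstudio-work are then visible to, and deletable from, the container.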

-e sets an environment variable in the container. With -e PASSWORD='5tr0nG&_passW0rD' we create an environment variable named PASSWORD inside the container and give it 5tr0nG&_passW0rD as the value. The single quotes matter: without them the shell treats & as a command separator and the password is cut short. The PASSWORD environment variable is necessary because the rocker/rstudio image expects it and will error out if it is not set.
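
The example password contains &, which the shell normally reads as "run the preceding command in the background". Here is a small sketch of keeping the value intact, assuming a POSIX-style shell such as bash or zsh. The variable names are my own.

```shell
# Single quotes stop the shell from interpreting & (or any other character).
rstudio_password='5tr0nG&_passW0rD'

# Double quotes keep the expanded value together as one argument.
env_arg="PASSWORD=${rstudio_password}"
printf '%s\n' "$env_arg"   # prints PASSWORD=5tr0nG&_passW0rD

# e.g. docker run -d -p 8787:8787 -v "$(pwd)":/home/rstudio -e "$env_arg" rocker/rstudio
```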

Finally, after the flags, you specify the image name, in this case rocker/rstudio.

You now have RStudio Server running inside a container and available to you on the host. Open a browser and type localhost:8787 in the address bar. Log in with rstudio as the username and the password you set in the environment variable.
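
If the page does not load, the container's own output is a good first place to look, since -d keeps it off your terminal. A minimal sketch follows. recent_logs is a hypothetical helper name, and the limit of 20 lines is an arbitrary choice; docker logs with --tail is a standard Docker CLI command.

```shell
# Hypothetical helper: print the last 20 log lines of a container.
recent_logs() {
  docker logs --tail 20 "$1"
}

# Example usage (commented out; take the ID from the output of docker ps):
# recent_logs <container id>
```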

CUSTOM DOCKER IMAGE

What we did earlier was download a pre-built image. Such images are contributed by people who have written a Dockerfile (a recipe for building the image), built the image from that Dockerfile, and then pushed it to Docker Hub for distribution.

As an example, consider the tidyverse Docker image. The Dockerfile can be found here and is also shown below.

FROM rocker/rstudio:3.6.2

RUN apt-get update -qq && apt-get -y --no-install-recommends install \
  libxml2-dev \
  libcairo2-dev \
  libsqlite-dev \
  libmariadbd-dev \
  libmariadbclient-dev \
  libpq-dev \
  libssh2-1-dev \
  unixodbc-dev \
  libsasl2-dev \
  && install2.r --error \
    --deps TRUE \
    tidyverse \
    dplyr \
    devtools \
    formatR \
    remotes \
    selectr \
    caTools \
    BiocManager

Copy the contents of the tidyverse Dockerfile into a local file named Dockerfile, with no .txt or other extension. Then edit your local Dockerfile and add any additional R packages you want installed in the image by appending their names to the list already passed to install2.r. Every line in that list except the last must end with a backslash, so add one after BiocManager before appending new names.
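
As an alternative to copying the whole file, a shorter Dockerfile can start from the published rocker/tidyverse image and add packages on top. The sketch below writes such a file into a scratch folder created with mktemp, so nothing in your working folder is overwritten. This is a variation on the approach described here, not the tidyverse image's own recipe, and data.table and janitor are placeholder package choices. Whether the packages install cleanly depends on the package repository configured in the base image.

```shell
# Scratch folder so that an existing Dockerfile is left alone.
build_dir="$(mktemp -d)"

# The quoted 'EOF' keeps backslashes and text exactly as written.
cat > "${build_dir}/Dockerfile" <<'EOF'
# Assumption: start from the published tidyverse image for the same R version.
FROM rocker/tidyverse:3.6.2

# data.table and janitor are example packages; list your own here.
RUN install2.r --error \
    data.table \
    janitor
EOF

cat "${build_dir}/Dockerfile"

# To build it: docker build -t tidyverse-custom:first "$build_dir"
```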

Running docker build -t tidyverse-custom:first . in the folder that contains the Dockerfile you just created will build your custom image.

In the docker build command, -t sets the name of the image in name:tag format. For example, here tidyverse-custom is the name and first is the tag. The trailing . tells Docker to use the current folder as the build context, which is where it looks for the Dockerfile. The build can take a while, so be prepared to wait.
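
The same build can be wrapped in a small function so that the name, tag, and folder are explicit. build_image is a hypothetical helper name. The only Docker call it makes is docker build -t, as described above.

```shell
# Hypothetical helper: build NAME:TAG from the Dockerfile in a folder.
# Usage: build_image NAME TAG [FOLDER]   (FOLDER defaults to the current one)
build_image() {
  docker build -t "$1:$2" "${3:-.}"
}

# Example usage (commented out because a build can take several minutes):
# build_image tidyverse-custom first .
# docker images tidyverse-custom   # lists the tags stored under that name
```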

Once your image is ready, you can invoke docker run on your image with the same flags described above to stand up a container running RStudio Server with R libraries of your choice.

docker run -d -p 8799:8787 -v "$(pwd)":/home/rstudio -e PASSWORD='5tr0nG&_passW0rD' tidyverse-custom:first

Remember to choose another host port if 8787 on your host is still in use. I've used 8799 in the command above.
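
A rough way to check whether a running container already holds a host port is to scan the port column that docker ps reports. In the sketch below, port_used_by_container is a hypothetical helper name. It only knows about Docker containers, so a port held by some other program on your machine will not show up.

```shell
# Hypothetical helper: succeeds if a running container publishes the
# given host port. Port entries look like 0.0.0.0:8787->8787/tcp, so
# ":PORT->" matches the host side of a mapping.
port_used_by_container() {
  docker ps --format '{{.Ports}}' | grep -q ":$1->"
}

# Example usage (commented out):
# port_used_by_container 8787 && echo "8787 is taken, pick another host port"
```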

docker ps lists your running containers. Choose the container id you wish to stop and provide it to docker stop to shut down a running container.
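
A stopped container still exists on disk until it is removed. The sketch below does both steps in one go. stop_and_remove is a hypothetical helper name; docker stop and docker rm are standard Docker CLI commands. Files in the mounted host folder are not touched by either command.

```shell
# Hypothetical helper: stop a container, then remove the stopped container.
# docker rm only runs if docker stop succeeded.
stop_and_remove() {
  docker stop "$1" && docker rm "$1"
}

# Example usage (commented out; take the ID from the output of docker ps):
# stop_and_remove <container id>
```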