Deploy Jupyter Data Stack using Docker!

Published on Wednesday, 23 January, 2019 Infrastructure and Tools

What is Docker?

Before we begin, we need to review what a virtual machine is. A virtual machine (VM) is an emulation of a computer system that runs on a host machine. Mac users may be familiar with Parallels, which lets them run Windows on a Mac; it is a good example of a virtual machine. The drawbacks of virtual machines are that they are difficult to set up, consume a large amount of system resources, and take a long time to boot.

Now that you have an idea of what virtualization is, an alternative to full virtual machine isolation is containerization. Docker came onto the scene in 2013 and has changed how many companies handle testing and deployment. Docker is based on Linux libcontainer and is a management system for creating, managing, and monitoring containers. A container abstracts the operating system, whereas a VM abstracts an entire machine.

[Figure: virtual machine vs. container]

Installing Docker

macOS: Follow the instructions to install Docker on your Mac.

The link provides an easy GUI installer for macOS.

Running a Docker Container

After installing Docker, you should be able to pull and run images. An image is a read-only template that bundles a minimal OS file system with the libraries an application needs to run.

docker search jupyter

Docker Pull and Run

docker pull jupyter/datascience-notebook
docker run -p 8888:8888 jupyter/datascience-notebook

[Screenshot: terminal output of docker run, listing the notebook URLs]

Now copy one of the URLs printed in the terminal and paste it into your browser. You should land on a page like the one below.

[Screenshot: Jupyter Notebook home page]

You have successfully launched a Jupyter Notebook. The rest of the tutorial explains the Dockerfile and docker-compose so that you can launch a database alongside the notebook service (web app).

Using docker-compose to run a Jupyter Notebook and PostgreSQL

Dockerfile

touch Dockerfile
nano Dockerfile

Paste within the Dockerfile the following:
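As a minimal sketch, assuming the goal is a notebook image that can talk to PostgreSQL, the same file can be written straight from the shell. The base image is the one pulled earlier; the `psycopg2-binary` package is an assumption chosen for illustration and may differ from what your project needs.

```shell
# Write a minimal Dockerfile (illustrative contents, adjust to your needs).
cat > Dockerfile <<'EOF'
# Start from the Jupyter data science image pulled earlier.
FROM jupyter/datascience-notebook

# Assumption: add a PostgreSQL client library so notebooks can reach a database.
RUN pip install --no-cache-dir psycopg2-binary
EOF

# Show what was written.
cat Dockerfile
```

Running `docker build -t my-notebook .` in the same directory would then build the image; the tag `my-notebook` is a hypothetical name.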

Dockerfile and docker-compose.yaml

A Dockerfile describes how to build an image. For example, starting from a bare-bones Ubuntu base, you can add MySQL to build one image called mysql and add WordPress to build a second image called mywordpress.

Compose YAML files take these images and run them together as one application. For example, your docker-compose.yml file might define a service called db that runs a database image alongside the other services.
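A compose file with a db service of this kind could be sketched as follows. The service name `notebook`, the password value, and the choice of the official `postgres` image are assumptions for illustration, not a definitive setup.

```shell
# Write a two-service compose file: a PostgreSQL database and the notebook.
cat > docker-compose.yml <<'EOF'
version: "3"   # needed by older docker-compose releases; newer ones ignore it
services:
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: example   # placeholder, change before real use
  notebook:
    image: jupyter/datascience-notebook
    ports:
      - "8888:8888"   # publish the notebook port to the host
    depends_on:
      - db
EOF

# Show what was written.
cat docker-compose.yml
```

To run a locally built Dockerfile instead of the stock image, the `image:` line under `notebook` can be swapped for `build: .`.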

You have deployed a Jupyter Notebook

docker-compose is a simple tool for defining and running multi-container Docker applications. You configure an application's services in a YAML file, and a single command then deploys one or several of those services.
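The difference between deploying every service and deploying just one comes down to the arguments passed to `docker-compose up`. The sketch below wraps that command in a small helper; `compose_up` and the `DRY_RUN` switch are hypothetical names introduced here so the command can be previewed without starting anything, and `db` is an assumed service name.

```shell
# Hypothetical helper: start the named services, or all of them if none are named.
# With DRY_RUN=1 it only prints the command instead of running it.
compose_up() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo docker-compose up -d "$@"
  else
    docker-compose up -d "$@"
  fi
}

# Preview the command for all services, then for the db service only.
DRY_RUN=1 compose_up
DRY_RUN=1 compose_up db
```

Calling `compose_up` without `DRY_RUN=1` runs the real command, which requires Docker and a docker-compose.yml in the current directory.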

References:

Paruchuri, Vik. “Tutorial: Running a Dockerized Jupyter Server for Data Science.” Dataquest, Dataquest.io, 22 Nov. 2015, www.dataquest.io/blog/docker-data-science/.

“8 Surprising Facts About Real Docker Adoption.” Datadog, June 2018, https://www.datadoghq.com/docker-adoption/.

“What’s the Difference Between Containers and Virtual Machines?” ElectronicDesign, 15 July 2016, https://www.electronicdesign.com/technologies/dev-tools/article/21801722/whats-the-difference-between-containers-and-virtual-machines