<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.7.4">Jekyll</generator><link href="https://derekpowell.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://derekpowell.github.io/" rel="alternate" type="text/html" /><updated>2019-04-21T15:58:14-07:00</updated><id>https://derekpowell.github.io/feed.xml</id><title type="html">Derek Powell</title><subtitle>Postdoctoral Scholar at Stanford University</subtitle><author><name>Derek Powell</name><email>derekpowell@stanford.edu</email></author><entry><title type="html">Conducting reproducible research with Docker (Part 3 of 3)</title><link href="https://derekpowell.github.io/posts/2018/02/docker-tutorial-3/" rel="alternate" type="text/html" title="Conducting reproducible research with Docker (Part 3 of 3)" /><published>2018-02-26T00:00:00-08:00</published><updated>2018-02-26T00:00:00-08:00</updated><id>https://derekpowell.github.io/posts/2018/02/docker-tutorial-3</id><content type="html" xml:base="https://derekpowell.github.io/posts/2018/02/docker-tutorial-3/">&lt;p&gt;In the last two entries in this tutorial series I showed you how to use Docker to maintain a reproducible environment for conducting statistical analyses. Conducting reproducible research is primarily about scientific honesty, transparency, and the maintenance of high scientific standards. However, the choice to use Docker for reproducible research also has an awesome side-benefit: the ability to run docker containers remotely on cloud servers. In this tutorial I’ll show you how to run your docker containers on a virtual cloud “workstation” using &lt;a href=&quot;https://m.do.co/c/b5d7c56f84df&quot;&gt;DigitalOcean&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;whats-digitalocean&quot;&gt;What’s DigitalOcean?&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;https://tctechcrunch2011.files.wordpress.com/2016/07/unnamed1.png&quot; alt=&quot;DigitalOcean logo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://m.do.co/c/b5d7c56f84df&quot;&gt;DigitalOcean&lt;/a&gt; is a web-hosting company that lets you rent &lt;a href=&quot;https://en.wikipedia.org/wiki/Virtual_private_server&quot;&gt;virtual servers&lt;/a&gt; called “droplets.” Using droplets or other virtual private servers as remote workstations is a very economical way to access extra computing power. For at least 80% of my workday my laptop is totally sufficient, but sometimes I really need some extra cores or extra RAM (mostly for MCMC sampling with the excellent but computationally demanding &lt;a href=&quot;https://github.com/paul-buerkner/brms&quot;&gt;brms&lt;/a&gt; R package). For those times I can spin up an 8-core workstation with 16 GB of RAM whenever I like at a rate of $0.24 per hour, billed to the minute. Because I only pay for the time I actually use the droplet, the extra computing power usually costs me less than my daily coffee. Perhaps that’s not enough for you? As of this writing, droplets scale all the way up to 32 cores and 192 GB of RAM for $1.43 an hour.&lt;/p&gt;

&lt;p&gt;In addition to DigitalOcean, there are a number of other VPS providers that offer similar services. The heaviest hitters are &lt;a href=&quot;https://aws.amazon.com/&quot;&gt;Amazon Web Services&lt;/a&gt; and &lt;a href=&quot;https://cloud.google.com/compute/&quot;&gt;Google Compute Engine&lt;/a&gt;. I like DigitalOcean because it offers “dedicated” cpu instances, has a nice web interface, has nice CLI tools, offers great documentation, and is about as cheap as you’ll find anywhere for similar quality.&lt;/p&gt;

&lt;h2 id=&quot;running-docker-containers-on-digitalocean-droplets&quot;&gt;Running Docker containers on DigitalOcean droplets&lt;/h2&gt;

&lt;p&gt;In the first part of the tutorial we’ll …&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Sign up for DigitalOcean&lt;/li&gt;
  &lt;li&gt;Create SSH keys to make it easy to access our remote instances&lt;/li&gt;
  &lt;li&gt;Create a DigitalOcean droplet with Docker&lt;/li&gt;
  &lt;li&gt;Run RStudio on our remote droplet&lt;/li&gt;
  &lt;li&gt;“Destroy” the droplet so we stop getting billed for it&lt;/li&gt;
&lt;/ol&gt;

&lt;h2 id=&quot;step-1-sign-up-with-digitalocean&quot;&gt;Step 1: Sign up with DigitalOcean&lt;/h2&gt;

&lt;p&gt;I suggest you sign up using my &lt;a href=&quot;https://m.do.co/c/b5d7c56f84df&quot;&gt;DigitalOcean referral link&lt;/a&gt; to get a $10 credit for DigitalOcean. That way you can finish this tutorial and try out DigitalOcean for free!&lt;/p&gt;

&lt;h2 id=&quot;step-2-create-ssh-keys&quot;&gt;Step 2: Create SSH keys&lt;/h2&gt;

&lt;p&gt;Per &lt;a href=&quot;https://en.wikipedia.org/wiki/Secure_Shell&quot;&gt;Wikipedia&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Secure Shell (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. The best known example application is for remote login to computer systems by users.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We’ll use SSH to communicate with the remote droplet on DigitalOcean and to remotely run commands, such as initiating the docker container. In addition, we can use it to create a “tunnel” between a port on our local machine and the remote machine, and can also use it to transfer files back and forth. Using an SSH key will let us do all of this securely without having to enter passwords at every step of the way.&lt;/p&gt;

&lt;p&gt;DigitalOcean actually offers a great &lt;a href=&quot;https://www.digitalocean.com/community/tutorials/how-to-use-ssh-keys-with-digitalocean-droplets&quot;&gt;tutorial&lt;/a&gt; for creating and using SSH keys on your account, so I’ll leave the heavy lifting to them. You’ll need to follow at least steps 1 through 4 of the linked tutorial.&lt;/p&gt;

&lt;h2 id=&quot;step-3-create-a-digitalocean-droplet-with-docker&quot;&gt;Step 3: Create a DigitalOcean droplet with Docker&lt;/h2&gt;

&lt;p&gt;Now the main event. We’ll create a new DigitalOcean droplet running docker. Sign in to DigitalOcean and choose “create droplet”.&lt;/p&gt;

&lt;html&gt;&lt;center&gt;&lt;img src=&quot;/images/create_droplet.png&quot; alt=&quot;creating a droplet&quot; style=&quot;width: 66%;&quot; /&gt;&lt;/center&gt;&lt;/html&gt;

&lt;p&gt;From the “choose an image” menu select “One-click apps”. Then choose “Docker 17.12.0 on 16.04”. This will create a docker droplet running Ubuntu 16.04 with Docker pre-installed.&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/droplet_images.png&quot; alt=&quot;Choosing a Docker image&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Next, you’ll choose a droplet size. For our purposes let’s choose the 2 vCPU dedicated instance. This gives us some oomph to play around with without costing too much for the purposes of the tutorial.&lt;/p&gt;

&lt;html&gt;&lt;center&gt;&lt;img src=&quot;/images/droplet_sizes.png&quot; alt=&quot;droplet size options&quot; style=&quot;width: 66%;&quot; /&gt;&lt;/center&gt;&lt;/html&gt;

&lt;p&gt;Then, choose your datacenter region. You can choose whichever you like, though some options are only available in certain regions.&lt;/p&gt;

&lt;p&gt;Finally, make sure to &lt;strong&gt;“include the SSH key”&lt;/strong&gt; you created in step 2. Name your droplet however you like and click &lt;strong&gt;“Create”&lt;/strong&gt;.&lt;/p&gt;

&lt;html&gt;&lt;center&gt;&lt;img src=&quot;/images/add_ssh_key.png&quot; alt=&quot;adding keys&quot; style=&quot;width: 250px;&quot; /&gt;&lt;/center&gt;&lt;/html&gt;

&lt;h2 id=&quot;step-4-running-rstudio-remotely&quot;&gt;Step 4: Running RStudio remotely&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;/images/droplet_progress.png&quot; alt=&quot;droplet&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Once your droplet is created, copy its IP address to your clipboard by clicking on it. Now switch back over to your terminal and run (being sure to use your droplet’s IP address):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ssh root@138.68.6.84
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The first time you connect, ssh will ask whether you trust the new host; type &lt;code class=&quot;highlighter-rouge&quot;&gt;yes&lt;/code&gt; at the prompt. This will give you a shell prompt on your remote DigitalOcean server as the root user. Now, you can start your docker container exactly as you would on your local computer.&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Run:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; 8787:8787 &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;yourName &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;PASSWORD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;secretPassword &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ROOT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;TRUE rocker/tidyverse:3.4.3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hop on your browser and point it to your droplet’s IP address and port 8787. As I made this tutorial mine was &lt;code class=&quot;highlighter-rouge&quot;&gt;138.68.1.215:8787&lt;/code&gt;. You should be greeted with the RStudio sign-in page.&lt;/p&gt;

&lt;p&gt;Do note that using a strong, unique password (and possibly username) is much more important now that you’re working on a remote server. Anyone in the world can type in that IP address and port and reach your RStudio sign-in page, so you want to ensure there’s real protection.&lt;/p&gt;
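
&lt;p&gt;One way to tighten this up is the SSH “tunnel” mentioned in step 2. What follows is a minimal sketch rather than the setup used in this tutorial: it assumes the example droplet address from the &lt;code class=&quot;highlighter-rouge&quot;&gt;ssh&lt;/code&gt; command above, and the function names are made up for illustration. Publishing the port on &lt;code class=&quot;highlighter-rouge&quot;&gt;127.0.0.1&lt;/code&gt; keeps RStudio off the public internet, and &lt;code class=&quot;highlighter-rouge&quot;&gt;ssh -L&lt;/code&gt; forwards a port on your laptop to it.&lt;/p&gt;

```shell
# Assumption: the example droplet address used above; replace with your own.
DROPLET_IP=138.68.6.84

# Run on the droplet: publish RStudio on the droplet's loopback interface only,
# so it cannot be reached directly from the internet.
start_rstudio_private() {
    docker run -d -p 127.0.0.1:8787:8787 \
        -e USER=yourName -e PASSWORD=secretPassword -e ROOT=TRUE \
        rocker/tidyverse:3.4.3
}

# Run on your laptop: forward local port 8787 to port 8787 on the droplet.
# -N opens the tunnel without starting a remote shell.
open_rstudio_tunnel() {
    ssh -N -L 8787:127.0.0.1:8787 "root@${DROPLET_IP}"
}
```

&lt;p&gt;With the container started this way and the tunnel open, RStudio should be reachable at &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:8787&lt;/code&gt; in your browser. The tunnel stays open until you press Ctrl-C.&lt;/p&gt;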

&lt;p&gt;Now that you’ve got RStudio running remotely, there are a few different ways to get your files onto it. The most direct is to upload them from the files window in the web interface. You can also securely copy them using ssh and the &lt;code class=&quot;highlighter-rouge&quot;&gt;scp&lt;/code&gt; command.&lt;/p&gt;

&lt;html&gt;&lt;center&gt;&lt;img src=&quot;/images/RStudio_upload.png&quot; alt=&quot;uploading files to rstudio&quot; style=&quot;width: 66%;&quot; /&gt;&lt;/center&gt;&lt;/html&gt;
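
&lt;p&gt;For the &lt;code class=&quot;highlighter-rouge&quot;&gt;scp&lt;/code&gt; route, something like the following could work. This is only a sketch: the droplet address is the example one from above, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;/root/project&lt;/code&gt; destination and the function names are hypothetical. Files copied this way land on the droplet itself, not inside the container, so the container would also need a &lt;code class=&quot;highlighter-rouge&quot;&gt;-v&lt;/code&gt; volume mapping in its &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run&lt;/code&gt; command to see them.&lt;/p&gt;

```shell
# Assumption: the example droplet address used above; replace with your own.
DROPLET_IP=138.68.6.84

# Copy a local folder to the droplet (run on your laptop).
upload_project() {
    scp -r "$1" "root@${DROPLET_IP}:/root/project"
}

# Copy a file or folder from the droplet back into the current local folder.
download_results() {
    scp -r "root@${DROPLET_IP}:/root/project/$1" .
}
```

&lt;p&gt;For example, &lt;code class=&quot;highlighter-rouge&quot;&gt;upload_project ./my-analysis&lt;/code&gt; before you start working and &lt;code class=&quot;highlighter-rouge&quot;&gt;download_results output&lt;/code&gt; when you are done.&lt;/p&gt;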

&lt;p&gt;My personal preference is to interface with github. I save all my R projects as github repositories, and clone whatever I’m working on to the remote machine. You can do this through the command line, or directly in the RStudio interface: Go to &lt;code class=&quot;highlighter-rouge&quot;&gt;File -&amp;gt; New Project -&amp;gt; Version Control -&amp;gt; Git&lt;/code&gt; and enter the repository URL. After you enter your username and password, the files will be cloned to the remote machine and you can commit and push when you are done working.&lt;/p&gt;

&lt;h2 id=&quot;step-5-destroying-the-droplet&quot;&gt;Step 5: Destroying the droplet&lt;/h2&gt;

&lt;p&gt;Once you’re done working you’ll want to “destroy” the droplet so that you are no longer billed for it. This sounds dramatic but I think it’s so-named to ensure you won’t forget to save your work from the droplet to your local machine or to a repository like github. To destroy the droplet, navigate to its page on the DigitalOcean website and choose &lt;strong&gt;“Destroy”&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/droplet_destroy.png&quot; alt=&quot;droplet&quot; /&gt;&lt;/p&gt;
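
&lt;p&gt;If you prefer the terminal, DigitalOcean’s command-line client can do the same thing. The sketch below assumes &lt;code class=&quot;highlighter-rouge&quot;&gt;doctl&lt;/code&gt; is installed and already authenticated with &lt;code class=&quot;highlighter-rouge&quot;&gt;doctl auth init&lt;/code&gt;; the function names and the droplet name in the usage note are hypothetical.&lt;/p&gt;

```shell
# Show the droplets on your account, so you can check what is still running.
list_droplets() {
    doctl compute droplet list
}

# Delete a droplet by name or ID; doctl asks for confirmation first.
destroy_droplet() {
    doctl compute droplet delete "$1"
}
```

&lt;p&gt;For example, &lt;code class=&quot;highlighter-rouge&quot;&gt;destroy_droplet r-workstation&lt;/code&gt;, followed by &lt;code class=&quot;highlighter-rouge&quot;&gt;list_droplets&lt;/code&gt; to confirm nothing is left running and billing you.&lt;/p&gt;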

&lt;h2 id=&quot;creature-comforts&quot;&gt;Creature comforts&lt;/h2&gt;

&lt;p&gt;Working from within a Docker container offers some great advantages, but it can also have some drawbacks. Because reproducibility demands the container be available to anyone, there’s a limit to the amount of customization that we should build into the container itself. For instance, we should &lt;em&gt;never&lt;/em&gt; put any passwords, keys, authentication info, etc. into a Docker container. Here I’ll show how we can add some creature comforts to our RStudio environment within our docker container, without compromising security or preventing others from using it easily.&lt;/p&gt;

&lt;h2 id=&quot;setting-up-git-username-and-password&quot;&gt;Setting up git username and password&lt;/h2&gt;

&lt;p&gt;Using the git and github integration in RStudio server requires telling git who is authoring the commits. As is, this means running the following commands at the shell &lt;em&gt;every&lt;/em&gt; time we create a new docker container:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git config &lt;span class=&quot;nt&quot;&gt;--global&lt;/span&gt; user.name &lt;span class=&quot;s2&quot;&gt;&quot;Your Name&quot;&lt;/span&gt;
git config &lt;span class=&quot;nt&quot;&gt;--global&lt;/span&gt; user.email &lt;span class=&quot;s2&quot;&gt;&quot;yourEmail@gmail.com&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s a pain.&lt;/p&gt;

&lt;p&gt;We’ll fix this by adding a script to the &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/cont-init.d&lt;/code&gt; startup directory of our Rocker-based RStudio container that will perform this step for us. Rather than hard-coding our name and email, which could make the image difficult for others to use, we’ll pass that info in as environment variables.&lt;/p&gt;

&lt;p&gt;Here’s the script we’ll create in our docker project folder (the same folder with the Dockerfile) as &lt;code class=&quot;highlighter-rouge&quot;&gt;git_config.sh&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/with-contenv bash&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;GIT_USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;GIT_USER&lt;/span&gt;:&lt;span class=&quot;p&quot;&gt;=none&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;nv&quot;&gt;GIT_EMAIL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;GIT_EMAIL&lt;/span&gt;:&lt;span class=&quot;p&quot;&gt;=none&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$GIT_USER&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; none &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
	&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;[user]&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\t&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;name=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$GIT_USER&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\n\t&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;email=&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$GIT_EMAIL&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; /home/rstudio/.gitconfig
&lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, we’ll modify our Dockerfile to add this file to the appropriate startup directory. Here’s how we’d modify the Dockerfile I created in part 2 of this tutorial:&lt;/p&gt;

&lt;div class=&quot;language-dockerfile highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;####### Dockerfile #######&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; rocker/tidyverse:3.4.3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;ENV&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; DEBIAN_FRONTEND noninteractive&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;COPY&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; git_config.sh /etc/cont-init.d/gitconfig&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;apt-get update &lt;span class=&quot;nt&quot;&gt;-qq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-install-recommends&lt;/span&gt; install &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;	libglu1-mesa-dev &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; install2.r &lt;span class=&quot;nt&quot;&gt;--error&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nt&quot;&gt;--deps&lt;/span&gt; TRUE &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    lme4 &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    car
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Copying this script into &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/cont-init.d&lt;/code&gt; sets it to run at startup. The script looks for the environment variables &lt;code class=&quot;highlighter-rouge&quot;&gt;GIT_USER&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;GIT_EMAIL&lt;/code&gt;, and if &lt;code class=&quot;highlighter-rouge&quot;&gt;GIT_USER&lt;/code&gt; is set it writes the equivalent settings to &lt;code class=&quot;highlighter-rouge&quot;&gt;/home/rstudio/.gitconfig&lt;/code&gt; for us. When we start the docker container we can pass in that info with &lt;code class=&quot;highlighter-rouge&quot;&gt;-e&lt;/code&gt; flags and it will set things up for us.&lt;/p&gt;
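
&lt;p&gt;Because a mistake here only shows up after an image build, it can be worth trying the logic locally first. The following is a rough sketch that mirrors the script but writes to a scratch file instead of &lt;code class=&quot;highlighter-rouge&quot;&gt;/home/rstudio/.gitconfig&lt;/code&gt;; the function name &lt;code class=&quot;highlighter-rouge&quot;&gt;write_gitconfig&lt;/code&gt; is hypothetical and not part of the image.&lt;/p&gt;

```shell
# Mirrors the logic of git_config.sh, but writes to a path you choose
# instead of /home/rstudio/.gitconfig (hypothetical helper for local testing).
write_gitconfig() {
    target="$1"
    GIT_USER="${GIT_USER:-none}"
    GIT_EMAIL="${GIT_EMAIL:-none}"
    if [ "$GIT_USER" != none ]; then
        printf '[user]\n\tname=%s\n\temail=%s\n' "$GIT_USER" "$GIT_EMAIL" > "$target"
    fi
}

# Stand-ins for the values you would pass to docker with -e flags.
export GIT_USER="Your Name" GIT_EMAIL="yourEmail@gmail.com"

scratch="$(mktemp)"
write_gitconfig "$scratch"
cat "$scratch"
```

&lt;p&gt;If git is installed locally, &lt;code class=&quot;highlighter-rouge&quot;&gt;git config --file &quot;$scratch&quot; user.name&lt;/code&gt; is a quick way to check that git can read the file back.&lt;/p&gt;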

&lt;h2 id=&quot;changing-themes&quot;&gt;Changing themes&lt;/h2&gt;

&lt;p&gt;Personally, I like using the “Solarized Dark” theme in RStudio. Rather than manually changing the themes each time we run the container, we can also make these changes using environment variables.&lt;/p&gt;

&lt;p&gt;To do so, create a &lt;code class=&quot;highlighter-rouge&quot;&gt;set_theme.sh&lt;/code&gt; script in the docker project directory, with the following content:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;#!/usr/bin/with-contenv bash&lt;/span&gt;

&lt;span class=&quot;nv&quot;&gt;THEME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;${&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;THEME&lt;/span&gt;:&lt;span class=&quot;p&quot;&gt;=none&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$THEME&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; none &lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;then
	&lt;/span&gt;mkdir &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; /home/rstudio/.rstudio/monitored/user-settings
	&lt;span class=&quot;nb&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;uiPrefs={&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;theme&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; : &lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;$THEME&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\&quot;&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;}&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\&lt;/span&gt;
	/home/rstudio/.rstudio/monitored/user-settings/user-settings
	chown &lt;span class=&quot;nt&quot;&gt;-R&lt;/span&gt; rstudio /home/rstudio
&lt;span class=&quot;k&quot;&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then, just like before, we add another line to the Dockerfile:&lt;/p&gt;

&lt;div class=&quot;language-dockerfile highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;####### Dockerfile #######&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; rocker/tidyverse:3.4.3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;ENV&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; DEBIAN_FRONTEND noninteractive&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;COPY&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; git_config.sh /etc/cont-init.d/gitconfig&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;COPY&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; set_theme.sh /etc/cont-init.d/theme&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;apt-get update &lt;span class=&quot;nt&quot;&gt;-qq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-install-recommends&lt;/span&gt; install &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;	libglu1-mesa-dev &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; install2.r &lt;span class=&quot;nt&quot;&gt;--error&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nt&quot;&gt;--deps&lt;/span&gt; TRUE &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    lme4 &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    car
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;putting-it-all-together&quot;&gt;Putting it all together&lt;/h2&gt;

&lt;p&gt;When you’ve got your scripts and Dockerfile written correctly, add those scripts to the git repo, commit, and push to trigger the automated build. Once the image is ready, we can pass in our preferred defaults as environment variables to the &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run&lt;/code&gt; command. Note that we now run our own image (here the &lt;code class=&quot;highlighter-rouge&quot;&gt;derekpowell/docker-tut-example&lt;/code&gt; image from part 2; substitute your own), because the stock Rocker image doesn’t contain these scripts.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; 8787:8787 &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;yourName &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;PASSWORD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;secretPassword &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ROOT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;TRUE &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;GIT_USER&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;gitUsername&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;GIT_EMAIL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;yourEmail@gmail.com&quot;&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;THEME&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;Solarized Dark&quot;&lt;/span&gt; derekpowell/docker-tut-example
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Voilà!&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/RStudio_sd.png&quot; alt=&quot;solarized dark theme&quot; /&gt;&lt;/p&gt;

&lt;p&gt;You can extend this general approach to run whatever commands or set whatever settings you like. For more advanced users, here’s more information on the &lt;a href=&quot;https://github.com/just-containers/s6-overlay&quot;&gt;init setup&lt;/a&gt; being used by the Rocker images.&lt;/p&gt;
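
&lt;p&gt;As one illustration of extending the approach, a startup script could clone a project for you. Everything named here is hypothetical: a &lt;code class=&quot;highlighter-rouge&quot;&gt;clone_repo.sh&lt;/code&gt; script, a &lt;code class=&quot;highlighter-rouge&quot;&gt;PROJECT_REPO&lt;/code&gt; environment variable, and a &lt;code class=&quot;highlighter-rouge&quot;&gt;/home/rstudio/project&lt;/code&gt; destination. The sketch assumes the default &lt;code class=&quot;highlighter-rouge&quot;&gt;rstudio&lt;/code&gt; user and a public repository, and in the image it would begin with the same &lt;code class=&quot;highlighter-rouge&quot;&gt;with-contenv&lt;/code&gt; shebang line as the two scripts above.&lt;/p&gt;

```shell
# Hypothetical clone_repo.sh: clone a repository at container startup
# when PROJECT_REPO is passed in with -e, and do nothing otherwise.
PROJECT_REPO=${PROJECT_REPO:-none}
PROJECT_DIR=/home/rstudio/project

if [ "$PROJECT_REPO" != none ]; then
    git clone "$PROJECT_REPO" "$PROJECT_DIR"
    # Startup scripts run as root, so hand the files to the rstudio user.
    chown -R rstudio "$PROJECT_DIR"
fi
```

&lt;p&gt;It would be wired up like the others, with a &lt;code class=&quot;highlighter-rouge&quot;&gt;COPY clone_repo.sh /etc/cont-init.d/clonerepo&lt;/code&gt; line in the Dockerfile and an extra &lt;code class=&quot;highlighter-rouge&quot;&gt;-e PROJECT_REPO=...&lt;/code&gt; flag on &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run&lt;/code&gt;. Private repositories would need credentials, which, as noted earlier, should never be built into the image.&lt;/p&gt;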

&lt;h2 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h2&gt;

&lt;p&gt;One virtue of using Docker containers for reproducible research is that they are complete and yet fully portable. This allows others (including our future selves) to reproduce our work, but with the help of RStudio and RStudio server, it also means we can do that work wherever we want.&lt;/p&gt;

&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;Another option here is to choose &lt;strong&gt;Container Distributions&lt;/strong&gt; and &lt;strong&gt;CoreOS&lt;/strong&gt;. This is a more minimal Linux distribution that also has docker pre-installed. If you choose to go this route you’ll need to log in as user “core”, using &lt;code class=&quot;highlighter-rouge&quot;&gt;ssh core@ip.address&lt;/code&gt; in the next step. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;You might note I’m not mapping a volume into the container. That’s because there isn’t any data or files on this remote server, and instead I plan to do pretty much everything from within the container. If we wanted to &lt;code class=&quot;highlighter-rouge&quot;&gt;scp&lt;/code&gt; some files or something, then we would want to do some mapping. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;</content><author><name>Derek Powell</name><email>derekpowell@stanford.edu</email></author><category term="Research" /><category term="R" /><category term="Reproducibility" /><summary type="html">In the last two entries in this tutorial series I showed you how to use Docker to maintain a reproducible environment for conducting statistical analyses. Conducting reproducible research is primarily about scientific honesty, transparency, and the maintenance of high scientific standards. However, the choice to use Docker for reproducible research also has an awesome side-benefit: the ability to run docker containers remotely on cloud servers. In this tutorial I’ll show you how to run your docker containers on a virtual cloud “workstation” using DigitalOcean.</summary></entry><entry><title type="html">Conducting reproducible research with Docker (Part 2 of 3)</title><link href="https://derekpowell.github.io/posts/2018/02/docker-tutorial-2/" rel="alternate" type="text/html" title="Conducting reproducible research with Docker (Part 2 of 3)" /><published>2018-02-14T00:00:00-08:00</published><updated>2018-02-14T00:00:00-08:00</updated><id>https://derekpowell.github.io/posts/2018/02/docker-tutorial-2</id><content type="html" xml:base="https://derekpowell.github.io/posts/2018/02/docker-tutorial-2/">&lt;p&gt;&lt;em&gt;This post picks up right where &lt;a href=&quot;/posts/2018/02/docker-tutorial-1/&quot;&gt;part 1&lt;/a&gt; of this tutorial left off. If you haven’t read that, I strongly recommend you start there.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In part 1 of this series you saw how to get started using Docker for reproducible research. Here, we’ll build a Docker image with our own custom R environment. This will allow us to work in a reproducible environment with all the packages and libraries we need at hand.&lt;/p&gt;

&lt;h2 id=&quot;the-rocker-project&quot;&gt;The Rocker Project&lt;/h2&gt;

&lt;p&gt;Carl Boettiger and Dirk Eddelbuettel at &lt;a href=&quot;https://www.rocker-project.org/&quot;&gt;The Rocker Project&lt;/a&gt; have done the hard work of properly organizing R and RStudio server applications into well-maintained and versioned docker images. They maintain docker images that run RStudio with sensible security options and some helpful base packages. Their website is also a good resource for help using these images. In part 1, we used their image to run RStudio server in Docker. Now, we’ll build off their work in this tutorial to make our own image with whatever packages we like.&lt;/p&gt;

&lt;h1 id=&quot;tutorial&quot;&gt;Tutorial&lt;/h1&gt;

&lt;p&gt;In this tutorial we’ll …&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Create a repository on github for our docker image&lt;/li&gt;
  &lt;li&gt;Create a dockerfile&lt;/li&gt;
  &lt;li&gt;Set up github and dockerhub integration&lt;/li&gt;
  &lt;li&gt;Practice testing and installing packages&lt;/li&gt;
  &lt;li&gt;Build our dockerfile locally (for testing)&lt;/li&gt;
  &lt;li&gt;Initiate an automated build&lt;/li&gt;
  &lt;li&gt;Tag our dockerfile with a version so we can refer to it later&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
  &lt;p&gt;Along the way, I’ll assume you have some basic working knowledge of git/github and the use of the terminal (though I’ve added some footnotes to help). I’ll also assume you’re on a mac, though things shouldn’t be that different on Linux. I can’t really speak to Windows, but the overall process should be similar.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;step-1-create-a-git-repository-on-github&quot;&gt;Step 1. Create a git repository on github&lt;/h2&gt;

&lt;p&gt;I like to create my repos on github and then clone them to my local machine so I can get the license and .gitignore files from github. I’ve made a repo in my github called “docker-tut-example” for this tutorial. You should name your repo whatever you like, but be sure to use your name in all the code below. Whatever name you use will also appear on Docker hub.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/create_git.png&quot; alt=&quot;create git&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;step-2-create-a-dockerfile&quot;&gt;Step 2. Create a Dockerfile&lt;/h2&gt;

&lt;p&gt;In the root of your git repo directory, create a file called “Dockerfile” (no file extension). If you like, you can do this directly on github, or you can clone the repo to your local machine.&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; In the editor of your choosing, add the following:&lt;/p&gt;

&lt;div class=&quot;language-dockerfile highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;####### Dockerfile #######&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; rocker/tidyverse:3.4.3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will create a docker image that starts from the &lt;code class=&quot;highlighter-rouge&quot;&gt;rocker/tidyverse:3.4.3&lt;/code&gt; image. We’ll use this to build up our own custom image.&lt;/p&gt;

&lt;p&gt;Save the Dockerfile. If you’re working locally, add it to your git repository, commit, and push those changes up to github.&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; Refresh your github page and you should see the Dockerfile has been added.&lt;/p&gt;

&lt;h2 id=&quot;step-3-setting-github-and-docker-hub-integration&quot;&gt;Step 3. Setting up github and docker hub integration&lt;/h2&gt;

&lt;p&gt;Now we’ll set up integration between github and dockerhub for automated builds. This will allow other researchers to use your docker image and allow you to use it across different machines.&lt;/p&gt;

&lt;p&gt;First, follow &lt;a href=&quot;https://docs.docker.com/docker-hub/github/#linking-docker-hub-to-a-github-account&quot;&gt;this guide&lt;/a&gt; to link your github and dockerhub accounts. Once your accounts are set up, head to the “settings” tab on your github repo page. Then, click on the “integrations &amp;amp; services” tab and add the docker service.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/github_integration.png&quot; alt=&quot;integration&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Finally, head over to dockerhub and click “Create” –&amp;gt; “Create Automated Build”, and choose to do so from your github. Choose the appropriate repo from the list.&lt;/p&gt;

&lt;p&gt;Now, whenever you push to this repo, an automated build will be triggered on dockerhub. You can watch the builds occur by checking “Build Details” on the dockerhub image page. If nothing is listed on the build details page, make a change in your README.md file so you can make a new commit and push it to github. This should trigger the build.&lt;/p&gt;

&lt;h2 id=&quot;step-4-testing-installing-packages&quot;&gt;Step 4. Testing installing packages&lt;/h2&gt;

&lt;p&gt;The automated build will take a few minutes. Once it’s ready, download and run it on your own machine using the &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run&lt;/code&gt; command. In the terminal, enter (after replacing with your repo name):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; derekpowell/docker-tut-example
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The image will download to your machine and a container will start running from it in the background.&lt;/p&gt;

&lt;h3 id=&quot;entering-running-containers&quot;&gt;Entering running containers&lt;/h3&gt;

&lt;p&gt;To interact with running containers we need to find out what they’re named. To see a list of running containers, at terminal enter:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt; docker ps 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Docker assigns each running container an id and a weird randomized name. As I was putting together this tutorial it was &lt;code class=&quot;highlighter-rouge&quot;&gt;xenodochial_kilby&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/docker_ps.png&quot; alt=&quot;docker-ps&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To get a bash prompt inside your running docker container, copy the name or container id and run the following command:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker &lt;span class=&quot;nb&quot;&gt;exec&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-i&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-t&lt;/span&gt; xenodochial_kilby /bin/bash
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From here we can experiment with installing R packages and any other changes we might want to make. This allows us to test the install process without having to trigger a full automated build every time we want to add a package.&lt;/p&gt;

&lt;h3 id=&quot;installing-r-packages-from-cran&quot;&gt;Installing R packages from CRAN&lt;/h3&gt;

&lt;p&gt;Let’s try installing an R package using &lt;code class=&quot;highlighter-rouge&quot;&gt;install2.r&lt;/code&gt; from littler (which is already in the container). Suppose we wanted to install the lme4 package (a popular package for hierarchical linear models). At your bash prompt, enter:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;install2.r &lt;span class=&quot;nt&quot;&gt;--error&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--deps&lt;/span&gt; TRUE lme4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will install the lme4 R package and its dependencies, throwing an error if anything fails along the way. If it’s a success, we can safely add this line to our Dockerfile.&lt;/p&gt;
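
&lt;p&gt;One quick way to confirm the install really worked is to try loading the package. This is just a sketch of the kind of check I mean, not something the build requires. At the same bash prompt inside the container, enter:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# load lme4 and print the version that was installed&lt;/span&gt;
Rscript -e 'library(lme4); packageVersion(&amp;quot;lme4&amp;quot;)'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If the package failed to install, &lt;code class=&quot;highlighter-rouge&quot;&gt;library()&lt;/code&gt; will stop with an error instead of printing a version number.&lt;/p&gt;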

&lt;h3 id=&quot;installing-r-packages-from-github&quot;&gt;Installing R packages from github&lt;/h3&gt;

&lt;p&gt;Generally speaking, install2.r is the preferred way to install packages inside a docker container. But, suppose that instead of installing from CRAN, we wanted to install the latest development version from github. We can do that like so:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;R &lt;span class=&quot;nt&quot;&gt;--no-restore&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-save&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'devtools::install_github(&quot;lme4/lme4&quot;,dependencies=TRUE)'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;installing-r-packages-while-specifying-a-specific-version&quot;&gt;Installing R packages while specifying a specific version&lt;/h3&gt;

&lt;p&gt;Finally, let’s say instead of the latest version we actually wanted to install an older version or maybe we just want to be explicit about the version that’s installed. We can do that too:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;R &lt;span class=&quot;nt&quot;&gt;--no-restore&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-save&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;s1&quot;&gt;'devtools::install_version(&quot;lme4&quot;, version=&quot;1.1-14&quot;)'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To generalize, we can run any R command we want from the command line and we can do this in the creation of our docker container image.&lt;/p&gt;

&lt;h3 id=&quot;installing-system-packages&quot;&gt;Installing system packages&lt;/h3&gt;

&lt;p&gt;In some cases, installing an R package might not work as expected and you might end up with something like the following: &lt;code class=&quot;highlighter-rouge&quot;&gt;Error: installation of package ‘rgl’ had non-zero exit status&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is, in fact, the error you’ll get if you try to install the &lt;code class=&quot;highlighter-rouge&quot;&gt;car&lt;/code&gt; package at this stage. This error occurs because we are missing some linux headers for libraries that are required. Unfortunately, littler’s &lt;code class=&quot;highlighter-rouge&quot;&gt;install2.r&lt;/code&gt; script can’t take care of these system-level dependencies. This is (a big part of) why we test!&lt;/p&gt;

&lt;p&gt;Google this error and you’ll find &lt;a href=&quot;https://stackoverflow.com/questions/31982425/error-installation-of-package-rgl-had-non-zero-exit-status&quot;&gt;this stackoverflow post&lt;/a&gt; is the first result. The solution is to install &lt;code class=&quot;highlighter-rouge&quot;&gt;libglu1-mesa-dev&lt;/code&gt; before installing the &lt;code class=&quot;highlighter-rouge&quot;&gt;car&lt;/code&gt; package. To do so, use the following commands:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;apt-get update &lt;span class=&quot;nt&quot;&gt;-qq&lt;/span&gt;
apt-get &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-install-recommends&lt;/span&gt; install libglu1-mesa-dev
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Run those commands, then you can install car using install2.r (I’ll leave this as an exercise for the reader, as they say).&lt;/p&gt;
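
&lt;p&gt;When you’re done experimenting, you can leave the container and shut it down. Keep in mind that anything you installed by hand lives only in that container, which is exactly why the next step moves these commands into the Dockerfile. A minimal sketch, using the container name from my machine (yours will differ):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# leave the bash prompt inside the container&lt;/span&gt;
exit
&lt;span class=&quot;c&quot;&gt;# back on your own machine: stop the container, then remove it&lt;/span&gt;
docker stop xenodochial_kilby
docker rm xenodochial_kilby
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;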

&lt;h2 id=&quot;step-5-building-dockerfile-locally&quot;&gt;Step 5. Building Dockerfile locally&lt;/h2&gt;

&lt;p&gt;Now we’re ready to add the steps we just tested to our docker image. If you haven’t already, clone your github repo to your local machine.&lt;sup id=&quot;fnref:1:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Then, open up the dockerfile and edit it so it looks like this:&lt;/p&gt;

&lt;div class=&quot;language-dockerfile highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;####### Dockerfile #######&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;FROM&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; rocker/tidyverse:3.4.3&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;ENV&lt;/span&gt;&lt;span class=&quot;s&quot;&gt; DEBIAN_FRONTEND=noninteractive&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;RUN &lt;/span&gt;apt-get update &lt;span class=&quot;nt&quot;&gt;-qq&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; apt-get &lt;span class=&quot;nt&quot;&gt;-y&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;--no-install-recommends&lt;/span&gt; install &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;	libglu1-mesa-dev &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; install2.r &lt;span class=&quot;nt&quot;&gt;--error&lt;/span&gt; &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    &lt;span class=&quot;nt&quot;&gt;--deps&lt;/span&gt; TRUE &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    lme4 &lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;    car
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;blockquote&gt;
  &lt;p&gt;A few notes on what’s going on here: &lt;code class=&quot;highlighter-rouge&quot;&gt;RUN&lt;/code&gt; is a Dockerfile instruction that executes shell commands during the building of the image. In the shell, you can chain commands together with &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;amp;&amp;amp;&lt;/code&gt; (each command runs only if the previous one succeeded) and split them onto multiple lines with &lt;code class=&quot;highlighter-rouge&quot;&gt;\&lt;/code&gt;. Everything must be done in noninteractive mode because the build is automated: you won’t be there to press “y” to continue.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At this stage, we could commit and push this up to github to trigger an automated build, but before we do that let’s just make sure everything works by building it locally. From terminal on your local machine,&lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt; navigate up to the parent directory that contains your github repo with &lt;code class=&quot;highlighter-rouge&quot;&gt;cd ..&lt;/code&gt;, and enter the following command (swapping in your repo name):&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker build docker-tut-example 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will build the docker image from the dockerfile locally. Building locally can let you test more quickly, and without cluttering up your github repo with tons of commits. &lt;sup id=&quot;fnref:4&quot;&gt;&lt;a href=&quot;#fn:4&quot; class=&quot;footnote&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;
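
&lt;p&gt;If you’d like to try out the image you just built, it helps to give it a name. Here is a rough sketch, assuming you’re happy to call the local build &lt;code class=&quot;highlighter-rouge&quot;&gt;docker-tut-example:local&lt;/code&gt; (that name and tag are my own invention for illustration) and that the image lets you run &lt;code class=&quot;highlighter-rouge&quot;&gt;Rscript&lt;/code&gt; directly in place of its default command, as the rocker images do as far as I know:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# build again, this time naming the result with -t&lt;/span&gt;
docker build -t docker-tut-example:local docker-tut-example
&lt;span class=&quot;c&quot;&gt;# start a throwaway container (--rm deletes it on exit) and try loading the new packages&lt;/span&gt;
docker run --rm docker-tut-example:local Rscript -e 'library(lme4); library(car)'
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If both packages load without an error, the image is ready to push.&lt;/p&gt;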

&lt;h2 id=&quot;step-6-push-to-github-for-automated-build&quot;&gt;Step 6. Push to github for automated build&lt;/h2&gt;

&lt;p&gt;Once the build succeeds, you can commit and push your repo to github and docker hub will begin automatically building your docker image. You can check on its progress in the “Build Details” tab of your docker hub repo.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/docker_build_details.png&quot; alt=&quot;building progress&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Using the automated build feature of dockerhub might seem like a bit of extra work right now, but it is important for security and trust when you share the images. When you share a docker container that was produced with an automated build, your recipients can check the dockerfile and be sure of its contents.&lt;/p&gt;

&lt;h2 id=&quot;step-7-add-version-tags&quot;&gt;Step 7. Add version tags&lt;/h2&gt;

&lt;p&gt;There are lots of different ways you might organize Docker containers to achieve a reproducible workflow. As far as I can see, the simplest would be to maintain a single “personal” image with the libraries you use most. To maintain reproducibility between different projects, you can version this image using tags. Tags let you have multiple versions of the same image, as we saw when we used the tidyverse:3.4.3 image in Part 1.&lt;/p&gt;

&lt;p&gt;Under this approach, each time you publish a paper or release some work, you would make sure that the docker container was tagged with a version and you would include that with your publication. Something like, “Analyses were conducted using the derekpowell/docker-tut-example:0.0.1 docker image.”&lt;/p&gt;
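
&lt;p&gt;Anyone reading that sentence can then ask docker for exactly that version by putting the tag after the image name. As a sketch, assuming the 0.0.1 tag has been built on dockerhub (we’ll create it below), and borrowing the port and password options from Part 1:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# download the tagged version of the image&lt;/span&gt;
docker pull derekpowell/docker-tut-example:0.0.1
&lt;span class=&quot;c&quot;&gt;# run it, serving RStudio at localhost:8787&lt;/span&gt;
docker run -d -p 8787:8787 -e PASSWORD=rstudio derekpowell/docker-tut-example:0.0.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;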

&lt;p&gt;To set up your dockerhub repo for tagging, head to the “Build Settings” tab on its page.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/docker_build_settings.png&quot; alt=&quot;build settings&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is the tag configuration area. The first row shows that the “master” branch of the github repo is assigned the “latest” tag. This is a special, default tag. If you run &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run rocker/tidyverse&lt;/code&gt; with no specific tag, it will assume that the “latest” version should be used. On the next row, change “Branch” to “Tag” (as shown). Now, when you tag your github repo, that tag will also be reflected on docker. Be sure to click “save changes” when you’re done.&lt;/p&gt;

&lt;p&gt;Now, head back to terminal and tag the current version of your repo:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git tag &lt;span class=&quot;nt&quot;&gt;-a&lt;/span&gt; 0.0.1 &lt;span class=&quot;nt&quot;&gt;-m&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;very first version&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This will tag the repo with “0.0.1”. If you don’t like that number scheme, you can use any other you like, or even more descriptive tags, e.g., “dissertation.” Do note, the normal git push command will not push tags. To push all tags, enter:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git push origin &lt;span class=&quot;nt&quot;&gt;--tags&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then head over to dockerhub and check the “Build Details” tab–you should see the version being built. For more on git tags, check out &lt;a href=&quot;https://git-scm.com/book/en/v2/Git-Basics-Tagging&quot;&gt;this resource&lt;/a&gt;.&lt;/p&gt;
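
&lt;p&gt;If you’d rather not push every tag at once, git also lets you push one by name. A small sketch, reusing the 0.0.1 tag from above:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# list the tags in your local repo&lt;/span&gt;
git tag
&lt;span class=&quot;c&quot;&gt;# push only the 0.0.1 tag to github&lt;/span&gt;
git push origin 0.0.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;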

&lt;h1 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h1&gt;

&lt;p&gt;In this tutorial we created a simple docker container with a custom R environment. Using the steps covered here, you should be able to create a docker container with R, RStudio, and the packages you use most. This will give you a reproducible environment to conduct analyses and a way to share that environment with other researchers.&lt;/p&gt;

&lt;p&gt;In Part 3 of this tutorial series, I plan to cover one of the side benefits of using Docker for reproducible research: the ability to run Docker containers remotely on cloud services like &lt;a href=&quot;https://aws.amazon.com/&quot;&gt;AWS&lt;/a&gt; and &lt;a href=&quot;https://www.digitalocean.com/&quot;&gt;Digital Ocean&lt;/a&gt;. I’ll also cover some ways to reduce the (minor) pain points associated with running RStudio from within a container in day-to-day use.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;Open terminal, navigate to the directory you’d like it to appear in, and type &lt;code class=&quot;highlighter-rouge&quot;&gt;git clone YOUR_REPO_URL&lt;/code&gt;. &lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt; &lt;a href=&quot;#fnref:1:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;sup&gt;2&lt;/sup&gt;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;Add it with &lt;code class=&quot;highlighter-rouge&quot;&gt;git add Dockerfile&lt;/code&gt; and make a new commit with &lt;code class=&quot;highlighter-rouge&quot;&gt;git commit -m &amp;quot;added dockerfile&amp;quot;&lt;/code&gt;. Finally, push this to github with &lt;code class=&quot;highlighter-rouge&quot;&gt;git push&lt;/code&gt;. &lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot;&gt;
      &lt;p&gt;If your terminal window is still at the prompt in your docker container, you can type &lt;code class=&quot;highlighter-rouge&quot;&gt;exit&lt;/code&gt; to exit out (should be familiar if you’ve ever used ssh). &lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:4&quot;&gt;
      &lt;p&gt;On a mac, you may need to make sure Docker has been granted sufficient ram. If not, you may get compiler errors. Access the docker app preferences via the menu bar and bump up the ram if this happens. &lt;a href=&quot;#fnref:4&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;</content><author><name>Derek Powell</name><email>derekpowell@stanford.edu</email></author><category term="Research" /><category term="R" /><category term="Reproducibility" /><summary type="html">This post picks up right where part 1 of this tutorial left off. If you haven’t read that, I strongly recommend you start there.</summary></entry><entry><title type="html">Conducting reproducible research with Docker (Part 1 of 3)</title><link href="https://derekpowell.github.io/posts/2018/02/docker-tutorial-1/" rel="alternate" type="text/html" title="Conducting reproducible research with Docker (Part 1 of 3)" /><published>2018-02-09T00:00:00-08:00</published><updated>2018-02-09T00:00:00-08:00</updated><id>https://derekpowell.github.io/posts/2018/02/docker-tutorial-1</id><content type="html" xml:base="https://derekpowell.github.io/posts/2018/02/docker-tutorial-1/">&lt;p&gt;In scientific research, reproducibility is a necessary (though not sufficient) condition for validity. But conducting reproducible research is hard! Sadly, many psychological studies &lt;a href=&quot;https://www.nature.com/news/over-half-of-psychology-studies-fail-reproducibility-test-1.18248&quot;&gt;fail&lt;/a&gt; tests of empirical reproducibility.  Unfortunately, there’s no software package that can solve the set of structural and statistical issues likely at the root of those non-replications.&lt;/p&gt;

&lt;p&gt;Still, there are some tools that can help us achieve statistical or computational reproducibility. This kind of reproducibility means that another researcher can take &lt;em&gt;our data&lt;/em&gt; and reproduce the analyses we conducted in a published paper. Sadly again, many studies in psychology &lt;a href=&quot;https://www.nature.com/news/stat-checking-software-stirs-up-psychology-1.21049&quot;&gt;fail here too&lt;/a&gt;. However, here the problem really might be solved with better tools–tools like &lt;a href=&quot;https://rmarkdown.rstudio.com/&quot;&gt;R Markdown&lt;/a&gt; that can help ensure that our results sections are reflective of our actual analyses.&lt;/p&gt;

&lt;p&gt;Here I’m going to describe another tool for producing statistically and computationally reproducible research, Docker. Reproducibility demands we make available the data and analysis scripts used in our research projects, but sometimes the line between our personal computer systems and our projects can start to blur. Our projects have “dependencies” that are required for them to run properly. So, to ensure other researchers can reproduce our projects, we need to clue them in to these dependencies in some way. The simplest way would be to dump our &lt;code class=&quot;highlighter-rouge&quot;&gt;sessionInfo()&lt;/code&gt; at the bottom of the page. That’s easy in the moment, but not easy down the road for those who want to reproduce our work. The easier we can make reproducible research the whole way through, the better. Keep in mind, the researcher most likely to attempt to reproduce your work is &lt;em&gt;future you&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Here, I’ll show you how to use Docker to create reproducible workflows for scientific research.&lt;/p&gt;

&lt;h1 id=&quot;what-is-docker&quot;&gt;What is Docker?&lt;/h1&gt;

&lt;p&gt;&lt;img src=&quot;https://www.docker.com/sites/default/files/horizontal.png&quot; alt=&quot;docker logo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Docker is a tool for making containerized applications. The docker engine is like a very lightweight virtual machine engine. A virtual machine is (to oversimplify) a computer program that simulates another computer system, typically another operating system. This allows you to run a windows app on your mac, or a linux program on windows, and so forth.&lt;/p&gt;

&lt;p&gt;Docker creates a “containerized” version of an application that includes everything needed to run the app: OS, headers, libraries, packages, etc. This container is saved as an “image”, that can then be shared with others. This allows people working on different computers, with different OS versions, package versions, etc to share and execute code or apps. So long as you have Docker installed on your computer and the right Docker image, you can spin up a container that will exactly reproduce the environment needed for the app, no matter what your own personal computing environment looks like.&lt;/p&gt;

&lt;p&gt;Maybe you’re seeing how this can help us do reproducible research: if we create a containerized version of R, we can ensure we have R, R packages, system libraries, etc all in the right versions to reproduce the analyses. And because everything is held together in the container, if we share the image with another researcher, or with our future selves, it won’t matter that they might have a different computer with different OS, packages, etc.&lt;/p&gt;

&lt;h2 id=&quot;the-rocker-project&quot;&gt;The Rocker Project&lt;/h2&gt;

&lt;p&gt;Carl Boettiger and Dirk Eddelbuettel at &lt;a href=&quot;https://www.rocker-project.org/&quot;&gt;The Rocker Project&lt;/a&gt; have done the hard work of properly organizing R and RStudio server applications into well-maintained and versioned Docker images. These Docker containers run R and RStudio server with sensible security options and some helpful base packages. Their website is also a good resource for help using these images. In this tutorial, we’ll use their image to run RStudio server in a Docker container. In Part 2, we’ll build off their work to make our own image with whatever packages we like.&lt;/p&gt;

&lt;h2 id=&quot;docker-vs-packrat&quot;&gt;Docker vs. Packrat&lt;/h2&gt;

&lt;p&gt;There’s another solution to the problem of statistical and computational reproducibility in R, called &lt;a href=&quot;https://rstudio.github.io/packrat/&quot;&gt;packrat&lt;/a&gt;. I will admit I don’t have a great deal of familiarity with packrat, but I can discuss some differences. First, packrat is focused on R and R alone. This means if you incorporate other languages (e.g., python) in your projects, you will need multiple reproducibility solutions. In contrast, Docker handles everything. Second, Docker gives us other nice and powerful features, such as an easy way to run code remotely on cloud servers. Finally, there’s nothing stopping you from using both approaches (even using packrat inside your Docker container).&lt;/p&gt;

&lt;h1 id=&quot;tutorial&quot;&gt;Tutorial&lt;/h1&gt;

&lt;p&gt;In this tutorial we’ll …&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Install docker&lt;/li&gt;
  &lt;li&gt;Make a docker cloud account&lt;/li&gt;
  &lt;li&gt;Run a docker image from docker cloud&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Along the way, I’ll assume you are comfortable using the terminal. I’ll also assume you’re on a mac, though things shouldn’t be that different on Linux. I can’t really speak to Windows, but the overall process should be similar.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;1-installing-docker-on-mac&quot;&gt;1. Installing Docker (on Mac)&lt;/h2&gt;

&lt;p&gt;I’d advocate for installing Docker on Mac using &lt;a href=&quot;https://brew.sh/&quot;&gt;homebrew&lt;/a&gt;. If you don’t have homebrew, Docker has an &lt;a href=&quot;https://docs.docker.com/docker-for-mac/install/&quot;&gt;installation guide for mac&lt;/a&gt; that covers all the steps to install the traditional way.&lt;/p&gt;

&lt;p&gt;To install using homebrew, open up terminal and run:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;brew update &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; brew install &lt;span class=&quot;nt&quot;&gt;--cask&lt;/span&gt; docker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then launch docker from your applications (or with spotlight, cmd-space and type “docker”). You’ll need to enter your administrator password.&lt;/p&gt;
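
&lt;p&gt;To check that everything is installed and the docker engine is running, you can try a couple of quick commands in terminal. This is an optional sanity check; &lt;code class=&quot;highlighter-rouge&quot;&gt;hello-world&lt;/code&gt; is a tiny test image published on dockerhub:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# print the installed docker version&lt;/span&gt;
docker --version
&lt;span class=&quot;c&quot;&gt;# download and run a tiny test image, removing the container afterwards&lt;/span&gt;
docker run --rm hello-world
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;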

&lt;p&gt;&lt;em&gt;Optional&lt;/em&gt;: set up bash completion for docker by running the below commands in terminal:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;brew install bash-completion
brew install docker-completion
brew install docker-compose-completion
brew install docker-machine-completion
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h2 id=&quot;2-make-your-docker-cloud-account&quot;&gt;2. Make your Docker cloud account&lt;/h2&gt;

&lt;p&gt;When you first launch Docker it should prompt you to sign in or create a Docker cloud account. Alternately, you can go to &lt;a href=&quot;https://hub.docker.com/&quot;&gt;hub.docker.com&lt;/a&gt; and create an account there. Dockerhub is a centralized store for docker images (saved containers). In the next step, we’ll grab an image from dockerhub to run a container on our machine. Eventually, dockerhub will also host your own personalized docker images (part 2 of this series).&lt;/p&gt;

&lt;p&gt;In the next step, we’ll load the &lt;a href=&quot;https://hub.docker.com/r/rocker/tidyverse/&quot;&gt;tidyverse container&lt;/a&gt; from the rocker project’s page on dockerhub.&lt;/p&gt;

&lt;h2 id=&quot;3-run-a-docker-image-from-docker-hub&quot;&gt;3. Run a docker image from docker hub&lt;/h2&gt;

&lt;p&gt;Ok, now let’s actually get a docker container image running on our machine. First, make sure Docker is running on your machine (check the menubar for the icon). Then, head back over to terminal and enter the following command:&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run &lt;span class=&quot;nt&quot;&gt;-d&lt;/span&gt; &lt;span class=&quot;nt&quot;&gt;-p&lt;/span&gt; 8787:8787 &lt;span class=&quot;nt&quot;&gt;-v&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;pwd&lt;/span&gt;&lt;span class=&quot;sb&quot;&gt;`&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;:/home/rstudio/working &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;PASSWORD&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;rstudio &lt;span class=&quot;nt&quot;&gt;-e&lt;/span&gt; &lt;span class=&quot;nv&quot;&gt;ROOT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;TRUE rocker/tidyverse:3.4.3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There’s a lot going on in here so let’s break down this command.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The first part, &lt;code class=&quot;highlighter-rouge&quot;&gt;docker run&lt;/code&gt; says we want to start running a docker container.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;-d&lt;/code&gt; flag tells the container to run in the background (detached)&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;-p 8787:8787&lt;/code&gt; flag maps a port on the main computer to a port inside the docker container (the format is host:container). This container will end up running an instance of RStudio server, which will be available at &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:8787&lt;/code&gt;. Port 8787 happens to be the default, but it can be nice to be explicit.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;-v `pwd`:/home/rstudio/working&lt;/code&gt; flag (short for &lt;code class=&quot;highlighter-rouge&quot;&gt;--volume&lt;/code&gt;) connects the filesystem on our machine to our docker container. It maps our present working directory to a folder in the docker container called “working” that’s in a location we can access through the RStudio interface. This lets you access whatever data or project files you need from your computer in the docker container.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;-e PASSWORD=rstudio&lt;/code&gt; flag sets an environment variable “PASSWORD” to “rstudio”. This sets the password to access the rstudio server instance. Here we’re just explicitly setting the password to the default, “rstudio”. If you run this remotely (part 3 teaser!), this should obviously be changed.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;-e ROOT=TRUE&lt;/code&gt; flag gives us root access from inside RStudio. This can be helpful for installing linux dependencies when installing R packages.&lt;/li&gt;
  &lt;li&gt;Finally &lt;code class=&quot;highlighter-rouge&quot;&gt;rocker/tidyverse:3.4.3&lt;/code&gt; specifies the docker image to run. That is, version 3.4.3 of the rocker/tidyverse image. If we didn’t specify a tag, docker would default to the “latest” tag.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you run a container without its image present locally, Docker will automatically download it.&lt;/p&gt;
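
&lt;p&gt;To make the pieces concrete, here is a variation on the same command. It’s only a sketch: the host port 8788 and the password are placeholders I picked for illustration.&lt;/p&gt;

&lt;div class=&quot;language-bash highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# same image, but RStudio is served on host port 8788 and the password is changed&lt;/span&gt;
docker run -d -p 8788:8787 -v &amp;quot;`pwd`&amp;quot;:/home/rstudio/working -e PASSWORD=pick-something-better rocker/tidyverse:3.4.3
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With this version, RStudio would be at &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:8788&lt;/code&gt; rather than &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:8787&lt;/code&gt;, because the number before the colon is the port on your machine and the number after it is the port inside the container.&lt;/p&gt;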

&lt;h2 id=&quot;using-rstudio&quot;&gt;Using RStudio&lt;/h2&gt;

&lt;p&gt;Now open up your browser and navigate to &lt;code class=&quot;highlighter-rouge&quot;&gt;localhost:8787&lt;/code&gt;. Enter “rstudio” as your username and whatever password you set as the password (defaults to “rstudio”).&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/rstudio-sign-in.png&quot; alt=&quot;sign in&quot; /&gt;&lt;/p&gt;

&lt;p&gt;You will then be met with a fully-functioning RStudio interface. In the lower right you should see the file browser with the “working” directory we mapped when we ran the container.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/rstudio_interface.png&quot; alt=&quot;file pane&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you make changes to files in “working” inside this Docker container, they will also be reflected on your computer’s file system.&lt;/p&gt;

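&lt;p&gt;Feel free to play around with this. You can list the already-installed R packages by typing &lt;code class=&quot;highlighter-rouge&quot;&gt;installed.packages()&lt;/code&gt;, or check the R version and currently loaded packages with &lt;code class=&quot;highlighter-rouge&quot;&gt;sessionInfo()&lt;/code&gt;.&lt;/p&gt;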

&lt;h2 id=&quot;configuring-your-container&quot;&gt;Configuring your container&lt;/h2&gt;

&lt;p&gt;Finally, you may need to adjust how much of your machine you allow Docker to use. On mac, Docker is very “polite” so it doesn’t give itself very much of your machine’s resources. But, because you plan to be working in this container, you will probably want to give it some more juice.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/docker_prefs.png&quot; alt=&quot;docker preferences&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To fix this, access the docker preferences via the menu button and select the “advanced” tab. Then, adjust to your liking. There doesn’t seem to be any harm to letting docker have full access to your system resources, at least not when used in this fashion.&lt;/p&gt;

&lt;h2 id=&quot;coming-up-next-&quot;&gt;Coming up next …&lt;/h2&gt;

&lt;p&gt;That’s it for Part 1 of this series. Next, in Part 2 we’ll discuss customizing a docker image with your own personal R environment. Till then you might want to poke around a bit and see what’s available on dockerhub. I won’t cover its use in this series, but if you do any work in python, the &lt;a href=&quot;https://hub.docker.com/r/jupyter/datascience-notebook/&quot;&gt;jupyter notebook datascience container&lt;/a&gt; is worth checking out.&lt;/p&gt;</content><author><name>Derek Powell</name><email>derekpowell@stanford.edu</email></author><category term="Research" /><category term="R" /><category term="Reproducibility" /><summary type="html">In scientific research, reproducibility is a necessary (though not sufficient) condition for validity. But conducting reproducible research is hard! Sadly, many psychological studies fail tests of empirical reproducibility. Unfortunately, there’s no software package that can solve the set of structural and statistical issues likely at the root of those non-replications.</summary></entry></feed>