Packrat by example

This tutorial walks you through some of the most common tasks you’ll want to do with packrat, and explains the fundamental concepts behind the package along the way.

First things first

You’re getting ready to start a new project, so you create a new directory that will eventually contain all the .R scripts, CSV data, and other files that are needed for this particular project.

You know you’re going to need to make use of several R packages over the course of this project. So before you write your first line of code, set up the project directory to use Packrat with packrat::init:

> packrat::init("~/projects/babynames")
Adding these packages to packrat:
            _         
    packrat   0.2.0.128

Fetching sources for packrat (0.2.0.128) ... OK (GitHub)
Snapshot written to '/Users/kevin/projects/babynames/packrat/packrat.lock'
Installing packrat (0.2.0.128) ... OK (built source)
init complete!
Packrat mode on. Using library in directory:
- "~/projects/babynames/packrat/lib" 

(Tip: If the current working directory is the project directory, you can omit the path.)

After initializing the project, you will be placed into packrat mode in the project directory. You’re ready to go!

You’re no longer in an ordinary R project; you’re in a Packrat project. The main difference is that a packrat project has its own private package library. Any packages you install from inside a packrat project are available only to that project, and packages you install outside of the project are not available to it.

This is what we mean by “isolation” and it’s a Very Good Thing, as it means that upgrading a package for one project won’t break a totally different project that just happens to reside on the same machine, even if the upgrade contains incompatible changes.

A packrat project contains a few extra files and directories. The init() function creates these files for you, if they don’t already exist.
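To get a feel for what those extra files are, here is a small sketch that reports which of the usual Packrat entries exist under a directory. The list of paths is an assumption based on typical Packrat projects and may differ between Packrat versions, and `packrat_layout()` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: report which of the usual Packrat entries exist
# under a project directory. The path list is an assumption based on
# typical Packrat projects and may differ between Packrat versions.
packrat_layout <- function(project = ".") {
  expected <- c(
    ".Rprofile",            # switches R into packrat mode on startup
    "packrat/init.R",       # bootstrap script sourced by .Rprofile
    "packrat/packrat.lock", # snapshot of package versions
    "packrat/packrat.opts", # project-specific options
    "packrat/lib",          # private package library
    "packrat/src"           # saved source packages
  )
  present <- file.exists(file.path(project, expected))
  setNames(present, expected)
}

# Returns a named logical vector; every entry is FALSE if the
# directory is not a Packrat project.
packrat_layout("~/projects/babynames")
```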

Adding, removing, and updating packages

Adding a package in a Packrat project is easy. The first step is to start R inside your Packrat project, and install the package however you normally do; usually that means either the install.packages() function or the “Install Packages” button in your favorite R IDE. Let’s do this now, with the reshape2 package.

> install.packages("reshape2")

If you completed the previous steps correctly, you just installed the reshape2 package from CRAN into your project’s private package library. Let’s take a snapshot to save the changes in Packrat:

> packrat::snapshot()

Adding these packages to packrat:
             _       
    plyr       1.8.1 
    Rcpp       0.11.2
    reshape2   1.4   
    stringr    0.6.2 

Fetching sources for plyr (1.8.1) ... OK (CRAN current)
Fetching sources for Rcpp (0.11.2) ... OK (CRAN current)
Fetching sources for reshape2 (1.4) ... OK (CRAN current)
Fetching sources for stringr (0.6.2) ... OK (CRAN current)
Snapshot written to '/Users/kevin/projects/babynames/packrat/packrat.lock'

If you have automatic snapshots turned on, Packrat will record package upgrades and additions in the background, so you don’t even need to remember to call packrat::snapshot() manually unless you’re performing a less common action.
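Automatic snapshots are controlled by a project option. The following is a minimal sketch, assuming the option is named `auto.snapshot` as in current Packrat releases. The guard only lets the calls run when packrat is installed and R was started in a Packrat project directory.

```r
# TRUE only when packrat is installed and the working directory
# looks like a Packrat project.
in_project <- requireNamespace("packrat", quietly = TRUE) &&
  file.exists("packrat/packrat.lock")

if (in_project) {
  # Assumption: the option is called `auto.snapshot`.
  packrat::set_opts(auto.snapshot = TRUE)

  # Confirm the setting.
  print(packrat::get_opts("auto.snapshot"))
}
```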

When packrat takes a snapshot, it looks in the project’s private package library for packages that have been added, modified, or removed since the last time snapshot was called. For each package that was added or modified, packrat attempts to find the uncompiled source package on CRAN, Bioconductor, or GitHub (caveat: only for packages that were installed using devtools version 1.4 or later), and saves it in the packrat/src project subdirectory. It also records metadata about each package in the packrat.lock file.
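To make the lockfile metadata concrete, the sketch below writes a small, made-up lockfile to a temporary file and reads it back with base R. The field names follow the DCF-style layout that Packrat lockfiles generally use, but the values are illustrative only, and a real lockfile carries additional fields (such as a hash per package) that are left out here.

```r
# Illustrative lockfile contents. Field names follow Packrat's DCF-style
# layout; the values are made up and some real fields are omitted.
lock_text <- c(
  "PackratFormat: 1.4",
  "PackratVersion: 0.4.8",
  "RVersion: 3.2.0",
  "Repos: CRAN=https://cran.rstudio.com/",
  "",
  "Package: plyr",
  "Source: CRAN",
  "Version: 1.8.1",
  "Requires: Rcpp",
  "",
  "Package: Rcpp",
  "Source: CRAN",
  "Version: 0.11.2"
)
lock_file <- tempfile(fileext = ".lock")
writeLines(lock_text, lock_file)

# read.dcf() parses one record per blank-line-separated block.
lock <- read.dcf(lock_file, all = TRUE)

# The first record holds project-level metadata; the rest describe packages.
pkgs <- lock[!is.na(lock$Package), c("Package", "Version", "Source")]
print(pkgs)
```

In a real project you could point `read.dcf()` at packrat/packrat.lock instead of the temporary file.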

Because we save source packages for all of your dependencies, packrat makes your project more reproducible. When someone else wants to run your project (even if that someone else is you, years in the future, dusting off some old backups), they won’t need to try to figure out what versions of what packages you were running, and where you got them.

Installing local source packages

You may be working on a project with an R package that is not available on any external repository. Don’t fret; packrat can still handle this! Packrat expects such source packages to live in a local repository, which is just a directory containing package sources. You can set this within a packrat project with:

packrat::set_opts(local.repos = "<path_to_repo>")
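Once a local repository is configured, a package can be installed from it by name. The sketch below assumes the package source sits in a subdirectory of the repository named after the package. The repository path and package name are placeholders, and the guard keeps the calls from running unless packrat, the project, and that directory are all present.

```r
# Sketch only: the repository path and package name are placeholders.
local_repo <- "~/local-r-packages"  # hypothetical directory of package sources
pkg_name   <- "mypackage"           # hypothetical package inside that directory

if (requireNamespace("packrat", quietly = TRUE) &&
    file.exists("packrat/packrat.lock") &&
    dir.exists(file.path(local_repo, pkg_name))) {
  packrat::set_opts(local.repos = local_repo)
  packrat::install_local(pkg_name)  # installs into the private library
  packrat::snapshot()               # records the package in packrat.lock
}
```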

Restoring snapshots

Once your project has a snapshot, you can easily install the packages from that snapshot into your private library at any time.

You’ll need to do this, for example, when copying the project to a new computer, especially to one with a different operating system. Let’s simulate this by exiting R and then deleting the packrat/lib subdirectory in your project. Then launch R from your project directory again.

Packrat automates the whole process for you. When you restart R in this directory, you should see the following output:

Packrat is not installed in the local library -- attempting to bootstrap an installation...
> Installing packrat into project private library:
- '/Users/kevin/projects/babynames/packrat/lib/x86_64-apple-darwin13.1.0/3.2.0'
* installing *source* package ‘packrat’ ...
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (packrat)
> Attaching packrat
> Restoring library
Installing plyr (1.8.1) ... OK (built source)
Installing Rcpp (0.11.2) ... OK (built source)
Installing reshape2 (1.4) ... OK (built source)
Installing stringr (0.6.2) ... OK (built source)
> Packrat bootstrap successfully completed. Entering packrat mode...
Packrat mode on. Using library in directory:
- "~/projects/babynames/packrat/lib"

All of the packages in the snapshot have now been installed in your project’s newly created private package library. You can confirm this with packrat::status():

> packrat::status()
Up to date.

Another reason to restore from the packrat snapshot is if you remove a package that you later realize you still needed, or if one of your collaborators makes their own changes to the snapshot. In these cases, you can call packrat::restore().

Let’s remove the plyr package, and use packrat::restore() to bring it back.

> remove.packages("plyr")
Removing package from ‘/Users/kevin/projects/babynames/packrat/lib/x86_64-apple-darwin13.1.0/3.2.0’
(as ‘lib’ is unspecified)

> packrat::status()

The following packages are used in your code, tracked by packrat, but no longer present in your library:
            from   to
    plyr   1.8.1   NA

Use packrat::restore() to restore these libraries.

> packrat::restore()
Installing plyr (1.8.1) ... OK (built source)

Cleaning up

Package libraries can grow over time to include many packages that were needed at one time but are no longer used. Packrat can analyze your code and try to determine which packages you’re using so you can keep your library tidy. Let’s take a look at packrat::status() again:

> packrat::status()
The following packages are installed but not needed:
             _       
    plyr       1.8.1 
    Rcpp       0.11.2
    reshape2   1.4   
    stringr    0.6.2 

Use packrat::clean() to remove them. Or, if they are actually needed
by your project, add `library(packagename)` calls to a .R file
somewhere in your project.

Notice these packages are “installed but not needed”. Packrat attempts to ascertain which packages your project needs by analyzing the *.R script files in your project and looking for calls like library() and require().
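The idea behind this analysis can be sketched in a few lines of base R. This regex-based scan is a simplified illustration, not Packrat's actual implementation, and `find_library_calls()` is a hypothetical helper that only catches the first library() or require() call on each line.

```r
# Simplified stand-in for Packrat's dependency detection: collect package
# names from library()/require() calls in a project's .R files.
find_library_calls <- function(project = ".") {
  r_files <- list.files(project, pattern = "\\.[rR]$",
                        recursive = TRUE, full.names = TRUE)
  pattern <- "(library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)"
  pkgs <- character()
  for (f in r_files) {
    lines <- readLines(f, warn = FALSE)
    # Each match is c(full match, function name, package name).
    m <- regmatches(lines, regexec(pattern, lines))
    pkgs <- c(pkgs, vapply(Filter(length, m), `[`, character(1), 3))
  }
  sort(unique(pkgs))
}

# Try it on a throwaway project containing a single script.
demo <- tempfile()
dir.create(demo)
writeLines("library(reshape2)", file.path(demo, "analysis.R"))
find_library_calls(demo)  # "reshape2"
```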

Our sample project doesn’t have any R code yet, so Packrat isn’t aware that we need any packages. Let’s add some code that depends on the packages. Create a .R file in your project directory with the following contents and save it:

library(reshape2)

Now let’s take another look at packrat::status():

> packrat::status()
Up to date.

Of course, if we were actually finished with the packages, we’d have used packrat::clean() to remove them.
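If you want to see what a cleanup would do before committing to it, something like the following sketch may help. It assumes that `packrat::clean()` accepts a `dry.run` argument, as in current Packrat releases, and it only runs when packrat is installed and R was started in a Packrat project directory.

```r
# TRUE only when packrat is installed and the working directory
# looks like a Packrat project.
in_project <- requireNamespace("packrat", quietly = TRUE) &&
  file.exists("packrat/packrat.lock")

if (in_project) {
  # Assumption: `dry.run = TRUE` previews the removal without
  # touching the private library.
  packrat::clean(dry.run = TRUE)

  # Remove the unused packages, then record the new library state.
  packrat::clean()
  packrat::snapshot()
}
```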