Crosstalk makes it easy to link multiple (Crosstalk-compatible) HTML widgets within an R Markdown page or Shiny app. To begin, you’ll need to install the crosstalk package:
devtools::install_github("rstudio/crosstalk")
Note that at the time of this writing, only a few HTML widgets are Crosstalk-compatible. This article will use a simple d3 scatter plot widget package called d3scatter, and the development version of the Leaflet package.
devtools::install_github("jcheng5/d3scatter")
devtools::install_github("rstudio/leaflet")
Crosstalk is designed to work with widgets that take data frames (or sufficiently data-frame-like objects) as input. d3scatter, for example, takes a data frame:
library(d3scatter)
d3scatter(iris, ~Petal.Length, ~Petal.Width, ~Species)
Crosstalk’s main R API is a SharedData R6 class. You use this class to wrap your data frame, and pass it to a Crosstalk-compatible widget where a data frame would normally be expected.
library(crosstalk)
shared_iris <- SharedData$new(iris)
d3scatter(shared_iris, ~Petal.Length, ~Petal.Width, ~Species)
(It’s not even worth showing the results–it looks just the same as the plot above.)
Things become more interesting when we pass the same SharedData instance to two separate widgets: their selection state becomes linked. (bscols is a simple helper function for creating Bootstrap column layouts, used here to put two plots side by side.)
library(crosstalk)
shared_iris <- SharedData$new(iris)
bscols(
  d3scatter(shared_iris, ~Petal.Length, ~Petal.Width, ~Species, width = "100%", height = 300),
  d3scatter(shared_iris, ~Sepal.Length, ~Sepal.Width, ~Species, width = "100%", height = 300)
)
Click and drag to brush data points in the above plots; notice that their brushing states are linked. This is because they share the same SharedData object. If you created a separate SharedData object for each plot, even with the same underlying data frame, the plots would not be linked.
Note that we’re not limited to linking only two plots, nor do we need to limit ourselves to the same type of widget. Any Crosstalk-compatible widget can be linked with any other.
library(leaflet)
shared_quakes <- SharedData$new(quakes[sample(nrow(quakes), 100),])
bscols(
  leaflet(shared_quakes, width = "100%", height = 300) %>%
    addTiles() %>%
    addMarkers(),
  d3scatter(shared_quakes, ~depth, ~mag, width = "100%", height = 300)
)
The examples so far have used linked brushing. Crosstalk also supports using filter inputs to narrow down data sets. If you’re familiar with input controls in Shiny, Crosstalk filter inputs feel similar, but they don’t require Shiny so they work in static HTML documents.
In the following example, we’ll use three filter inputs to control two plots. (Note that linked brushing still works on the plots.)
shared_mtcars <- SharedData$new(mtcars)
bscols(widths = c(3, NA, NA),
  list(
    filter_checkbox("cyl", "Cylinders", shared_mtcars, ~cyl, inline = TRUE),
    filter_slider("hp", "Horsepower", shared_mtcars, ~hp, width = "100%"),
    filter_select("auto", "Automatic", shared_mtcars, ~ifelse(am == 0, "Yes", "No"))
  ),
  d3scatter(shared_mtcars, ~wt, ~mpg, ~factor(cyl), width = "100%", height = 250),
  d3scatter(shared_mtcars, ~hp, ~qsec, ~factor(cyl), width = "100%", height = 250)
)
These three filter_ functions are part of the Crosstalk package, but third-party filter inputs could certainly be written and shipped in other R packages.
While linked brushing only lets you have an active selection on one widget at a time, you can have multiple active filters and Crosstalk will combine the filters by intersection. In other words, only data points that pass all active filters will be displayed in any of the visualizations.
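To make “combine by intersection” concrete, here is a rough base R sketch of the idea. It is only an illustration, not Crosstalk’s actual implementation; the two filter conditions and all variable names are made up for the example.

```r
# Keys default to row names, so use them to identify rows.
keys <- row.names(mtcars)

# Hypothetical active filters: each one yields the keys it lets through.
pass_cyl <- keys[mtcars$cyl %in% c(4, 6)]  # e.g. a checkbox filter on cyl
pass_hp  <- keys[mtcars$hp >= 100]         # e.g. a slider filter on hp

# Only rows that pass every active filter remain visible.
visible <- Reduce(intersect, list(pass_cyl, pass_hp))
mtcars[visible, c("cyl", "hp")]
```

Adding a third filter would simply add another element to the list passed to Reduce(), which can only shrink (never grow) the visible set.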
The SharedData constructor has two optional parameters.
The first is key, which is one of the central concepts of Crosstalk. This concept is important because Crosstalk widgets communicate with each other using arrays of keys; this is how both selection and filter state are represented.
A key is a unique ID string by which a row can be identified. If you’ve used SQL databases, you can think of these as primary keys, except that their type must be a character vector (whereas databases more often use integer values as keys). And indeed, the same criteria that make for good primary keys in SQL databases also make for good Crosstalk keys; in particular, a key must not be NA or NULL or "".
Keys should also be data that’s safe to share publicly, as they may end up being embedded in web page HTML (e.g., not social security numbers).
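If you want to sanity-check a candidate key column before handing it to SharedData, a small helper along these lines may be useful. Note that is_valid_key is a hypothetical function written for this article, not part of Crosstalk, and it only checks the mechanical criteria (character type, unique, not NA, not empty); whether the values are safe to publish is still up to you.

```r
# Hypothetical helper (not part of Crosstalk): checks the key criteria above.
is_valid_key <- function(keys) {
  is.character(keys) &&        # keys must be a character vector (rejects NULL)
    !anyNA(keys) &&            # no NA values
    all(nzchar(keys)) &&       # no empty strings
    anyDuplicated(keys) == 0   # every key identifies exactly one row
}

is_valid_key(row.names(mtcars))           # car model names: one per row
is_valid_key(as.character(iris$Species))  # species names repeat across rows
```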
If an explicit key argument isn’t passed to SharedData, then row.names() are used if available; if not, then row numbers are used. While row numbers are not ideal because reordering or filtering the data will cause them to change, they are sufficient for simple cases where you are not using Shiny and are not doing anything special with groups (see the next section).
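A quick way to see which keys you get by default is to call the key() method on a SharedData object created without a key argument. This is a minimal sketch using two built-in data sets; the variable names are just for illustration.

```r
library(crosstalk)

# mtcars has meaningful row names (car models), so they become the keys.
keys_mtcars <- SharedData$new(mtcars)$key()
head(keys_mtcars, 3)

# iris has no meaningful row names, so the keys are expected to be
# the row numbers, as strings.
keys_iris <- SharedData$new(iris)$key()
head(keys_iris, 3)
```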
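The key argument can take several forms: a one-sided formula such as ~ColumnName, which will be evaluated in the context of the data; a character vector of length nrow(data); or a function that takes the data and returns a character vector of length nrow(data).
The following code snippet demonstrates all three styles, with identical results: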
state_info <- data.frame(stringsAsFactors = FALSE,
  state.name,
  state.region,
  state.area
)
sd1 <- SharedData$new(state_info, ~state.name)
sd2 <- SharedData$new(state_info, state_info$state.name)
sd3 <- SharedData$new(state_info, function(data) data$state.name)
# Do all three forms give the same results?
all(sd1$key() == sd2$key() & sd2$key() == sd3$key())
[1] TRUE
So far we’ve generated three sets of Crosstalk plots (based on iris, quakes, and mtcars). Each of these sets forms a “group”, or a set of Crosstalk plots/widgets that only communicate with each other. Selecting points on one of the iris plots doesn’t affect the plots in the quakes group, and vice versa.
Every SharedData instance belongs to a group. If you don’t specify a group when creating a SharedData instance, a randomly generated name is used:
shared_iris$groupName()
[1] "SharedDatac9f6656f"
In other words, every SharedData forms its own group, by default.
You can provide a group argument to the SharedData constructor to assign it to a specific group.
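The difference between default and explicit groups can be checked directly with groupName(). In this small sketch, the group name "iris_demo" is an arbitrary label chosen for the example.

```r
library(crosstalk)

# Without a group argument, each instance gets its own random group name,
# so these two are not expected to match.
sd_a <- SharedData$new(iris)
sd_b <- SharedData$new(iris)
sd_a$groupName() == sd_b$groupName()

# With the same explicit group, both instances belong to one group.
sd_c <- SharedData$new(iris, group = "iris_demo")
sd_d <- SharedData$new(iris, group = "iris_demo")
sd_c$groupName() == sd_d$groupName()
```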
It’s critical that all SharedData instances in a group refer conceptually to the same data points, and share the same keys. This doesn’t mean that the data and keys need to be identical across SharedData instances in the group, but rather, that any overlapping key values must refer to the same data point or observation; and conversely, that related data points/observations in different SharedData instances must use identical keys.
This might be useful, for example, in cases where data is subset. The following code plots mtcars in its entirety, and also displays two smaller plots that subset to automatic and manual transmissions. Even though there are three distinct SharedData objects, the plots are linked because the group names are identical.
row.names(mtcars) <- NULL
sd_mtcars_all <- SharedData$new(mtcars, group = "mtcars_subset")
sd_mtcars_auto <- SharedData$new(mtcars[mtcars$am == 0,], group = "mtcars_subset")
sd_mtcars_manual <- SharedData$new(mtcars[mtcars$am == 1,], group = "mtcars_subset")
bscols(widths = c(8, 4),
  d3scatter(sd_mtcars_all, ~hp, ~mpg, ~factor(cyl),
    x_lim = ~range(hp), y_lim = ~range(mpg),
    width = "100%", height = 400),
  list(
    d3scatter(sd_mtcars_auto, ~hp, ~mpg, ~factor(cyl),
      x_lim = range(mtcars$hp), y_lim = range(mtcars$mpg),
      width = "100%", height = 200),
    d3scatter(sd_mtcars_manual, ~hp, ~mpg, ~factor(cyl),
      x_lim = range(mtcars$hp), y_lim = range(mtcars$mpg),
      width = "100%", height = 200)
  )
)
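Note that this example only works because each subset keeps the row names of the full mtcars data frame (after the reset on the first line, these are the original row numbers), so a given key refers to the same car in all three SharedData objects.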
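Before relying on a shared group, it can be worth confirming that the keys really do line up. The sketch below is one way to do that, assuming the default row-name keys; the group name "mtcars_check" and the variable names are made up for the example.

```r
library(crosstalk)

full_data <- mtcars
auto_data <- mtcars[mtcars$am == 0, ]

sd_all  <- SharedData$new(full_data, group = "mtcars_check")
sd_auto <- SharedData$new(auto_data, group = "mtcars_check")

# Every key in the subset should also appear in the full data set...
all(sd_auto$key() %in% sd_all$key())

# ...and a given key should pick out the same observation in both.
k <- sd_auto$key()[1]
isTRUE(all.equal(
  unlist(auto_data[sd_auto$key() == k, ]),
  unlist(full_data[sd_all$key() == k, ])
))
```

If the first check returns FALSE, the subset is probably using keys (for example, fresh row numbers) that no longer correspond to rows in the full data.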