Construct a reactive expression that reads Tableau data: reactive_tableau_data

This function reads data from Tableau. Because several levels of indirection are involved in reading data from Tableau, using this function is significantly more involved than, say, a simple read.csv(). See the Details section for a fuller introduction.

reactive_tableau_data(
  spec,
  options = list(),
  session = shiny::getDefaultReactiveDomain()
)

Arguments

spec

An argument that specifies what specific data should be accessed. This can be specified in a number of ways:

The name of a setting that was set using a value returned from choose_data(). This is the most common scenario for server.
The object returned from choose_data() can be passed in directly. This is likely the approach you should take if you want to access data in config_server based on unsaved config changes (e.g. to give the user a live preview of what their choose_data choices would yield).
You can directly create a spec object using one of the helper functions spec_summary(), spec_underlying(), or spec_datasource(). Use this for cases where the data is not selected based on choose_data() at all, but is programmatically determined or hardcoded. (This should not be common.)

options

A named list of options:

ignoreAliases - Do not use aliases specified in the data source in Tableau. Default is FALSE.
ignoreSelection - If FALSE (the default), only return data for the currently selected marks. Does not apply for datasource tables, only summary and underlying. If "never", then if no marks are selected, NULL is returned. If TRUE, all data is returned, regardless of selection.
includeAllColumns - Return all the columns for the table. Default is FALSE. Does not apply for datasource and summary tables, only underlying.
maxRows - The maximum number of rows to return. Tableau will not, under any circumstances, return more than 10,000 rows for datasource and underlying tables. This option is ignored for summary tables.
columnsToInclude - Character vector of columns that should be included; leaving this option unspecified means all columns should be returned. Does not apply for summary and underlying, only datasource.
truncation - For underlying and datasource reads, Tableau will never, under any circumstances, return more than 10,000 rows of data. This option controls what happens when a result is cut off at that limit. If "warn" (the default), a warning will be displayed to the user and emitted as a warning in the R process, then the available data will be returned. If "ignore", then no warning will be issued. If "error", then an error will be raised.
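
As a rough illustration of the options above, the following sketch builds an options list for an underlying-data read and checks its names against the documented ones. The particular values, the name check, and the setting name "data_spec" in the final comment are assumptions made for illustration, not something this page prescribes.

```r
# Options for an underlying-data read: ignore the mark selection, return all
# columns, ask for at most 5,000 rows, and raise an error on truncation.
opts <- list(
  ignoreSelection = TRUE,
  includeAllColumns = TRUE,
  maxRows = 5000,
  truncation = "error"
)

# Option names documented above. This check is only a convenience of the
# sketch: it catches misspelled names before the list is passed along.
documented <- c(
  "ignoreAliases", "ignoreSelection", "includeAllColumns",
  "maxRows", "columnsToInclude", "truncation"
)
unknown <- setdiff(names(opts), documented)
stopifnot(length(unknown) == 0)

# Inside a Shiny server function, the list would then be passed as
# ("data_spec" is a hypothetical setting name):
# data <- reactive_tableau_data("data_spec", options = opts)
```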

session

The Shiny session object. (You should probably just use the default.)

Details

There are two complicating factors when reading data from Tableau: the first is how to tell shinytableau what specific data table you want to access, and the second is actually accessing the data from R.

Specifying a data table

If we want to access data from Tableau, the Tableau Extension API only allows us to do so via one of the worksheets that are part of the same dashboard.

Each worksheet makes three categories of data available to us:

Summary data: The data in its final form before visualization. If the visualization aggregates measures, then the summary data contains the data after aggregation has been performed. If the worksheet has an active selection, then by default, only the selected data is returned (set the ignoreSelection option to TRUE to retrieve all data).
Underlying data: The underlying data that is used in the visualization, before aggregation operations are performed but after tables are joined. By default, only the columns that are used in the worksheet are included (set includeAllColumns to TRUE if you need them all). If the worksheet has an active selection, then by default, only the selected data is returned (set the ignoreSelection option to TRUE to retrieve all data).
Data source: You can also access the raw data from the data source(s) used by the worksheet. This data is unaffected by the worksheet settings. Tableau data sources are broken into one or more logical tables, like how a relational database has multiple tables.

As an R user, you may find this analogy based on the examples from dplyr::mutate-joins to be helpful in explaining the relationship between data source, underlying, and summary data:

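library(dplyr)

# Data source: one data frame per logical table
# (band_members and band_instruments are example datasets that ship with dplyr)
logical1 <- band_members
logical2 <- band_instruments

# Underlying is joined/selected, but not aggregated
underlying <- logical1 %>%
  full_join(logical2, by = "name") %>%
  select(band, name)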

# Summary is underlying plus aggregation
summary <- underlying %>%
  group_by(band) %>%
  tally(name = "COUNT(name)")
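
The same analogy can be stretched, loosely, to the includeAllColumns and ignoreSelection options. In this sketch, skipping select() stands in for includeAllColumns = TRUE, and a row filter stands in for an active selection; treating a selection of the "Beatles" mark as filter(band == "Beatles") is an assumption made for illustration only.

```r
library(dplyr)

# Underlying data with includeAllColumns = TRUE: keep every joined column
# instead of only the ones the worksheet uses.
underlying_all <- band_members %>%
  full_join(band_instruments, by = "name")

# An active selection acts roughly like a row filter; this is what the
# default (ignoreSelection = FALSE) would return for the selected mark.
selected <- underlying_all %>%
  filter(band == "Beatles")

# Summary data computed from the selected rows only
summary_selected <- selected %>%
  group_by(band) %>%
  tally(name = "COUNT(name)")
```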

The existence of these three levels of data granularity, plus the fact that the underlying and data source levels need additional specification to narrow down which of the multiple data tables at each level are desired, means that providing clear instructions to reactive_tableau_data is surprisingly complicated.

Now that you have some context, see the description for the spec parameter, above, for specific instructions on the different ways to specify data tables, based on current user input, previously saved configuration, or programmatically.

Accessing a data table

We turn our attention now to consuming data from reactive_tableau_data(). Consider the following code snippet, which might appear in config_server:

data_spec <- choose_data("mydata")
data <- reactive_tableau_data(data_spec)

The data variable created here has two complications.

First, it's reactive; like all reactive expressions, you must call data as a function to get at its value. It must be reactive because Tableau data can change (based on selection and filtering, if nothing else), and the user's choices can change as well (in the example, the data_spec object is also reactive).

Second, and more seriously, reading Tableau data is asynchronous, so when you invoke data() what you get back is not a data frame, but the promise of a data frame. Working with promises has its own learning curve, so it's regrettable that they play such a prominent role in reading Tableau data. If this is a new topic for you, start with this talk and then read through the various articles on the promises website.

The bottom line with promises is that you can use any of the normal functions you usually use for manipulating, analyzing, and visualizing data frames, but the manner in which you invoke those functions will be a bit different. Instead of calling print(data()), for example, you'll need to first change to the more pipe-oriented data() %>% print() and then replace the magrittr pipe with the promise-pipe like data() %...>% print(). There's much more to the story, though; for all but the simplest scenarios, you'll need to check out the resources linked in the previous paragraph.
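
The promise-pipe pattern described above can be tried without Tableau. This is a minimal sketch, assuming the promises and later packages are installed; fake_data() is a hypothetical stand-in for the reactive returned by reactive_tableau_data(), and it resolves to a few rows of the built-in mtcars dataset.

```r
library(promises)

# Hypothetical stand-in for the reactive returned by reactive_tableau_data():
# calling it yields a promise of a data frame, not the data frame itself.
fake_data <- function() {
  promise_resolve(head(mtcars))
}

# Each step receives the resolved value of the previous step.
six_cyl_rows <- fake_data() %...>%
  subset(cyl == 6) %...>%
  nrow()

# six_cyl_rows is itself a promise; capture its value in a callback.
row_count <- NULL
done <- then(six_cyl_rows, onFulfilled = function(n) {
  row_count <<- n
})

# Outside Shiny nothing drives the event loop, so run pending callbacks by hand.
while (is.null(row_count) && !later::loop_empty()) {
  later::run_now()
}
```

Inside a Shiny app the manual loop at the end is not needed. Render functions can generally accept a promise, so a pipeline such as data() %...>% head() can be returned from them directly.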

Examples

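if (FALSE) {
# `iv` is assumed to be the input validator available in config_server
data_spec_x <- choose_data("x", iv = iv)
# data_x is reactive; data_x() returns a promise of a data frame
data_x <- reactive_tableau_data(data_spec_x)
}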