The TensorFlow API is composed of a set of Python modules that enable constructing and executing TensorFlow graphs. The tensorflow package provides access to the complete TensorFlow API from within R. This article describes the basic syntax and mechanics of using TensorFlow from R.

The TensorFlow API is divided into modules. The top-level entry point to the API is `tf`

, which provides access to the main TensorFlow module (`tf`

is exported from the tensorflow package). For example, here is a simple “Hello, World” script:

```
library(tensorflow)
sess = tf$Session()
hello <- tf$constant('Hello, TensorFlow!')
sess$run(hello)
a <- tf$constant(10L)
b <- tf$constant(32L)
sess$run(a + b)
```

Functions that begin with a lowercase letter (e.g. `tf$constant`

) are normal R functions. Functions that begin within an uppercase letter (e.g. `tf$Session`

) are functions that create new instances of TensorFlow classes.

The main TensorFlow module (`tf`

) includes a wide variety of functions and classes which you can read more more about in the TensorFlow API documentation (see also the Getting Help section below which describes accessing help for the TensorFlow API directly within the RStudio IDE editor and console).

The TensorFlow API also includes many sub-modules, including the `tf$nn`

module which provides specialized functions for neural networks and the `tf$train`

module which provides a set of classes and functions that helps train models. You can access these modules the same way that functions are accessed (via the `$`

operator). For example:

```
# Call the conv2d function within the nn sub-module
tf$nn$conv2d(x, W, strides=c(1L, 1L, 1L, 1L), padding='SAME')
# Create an optimizer from the train sub-module
optimizer <- tf$train$GradientDescentOptimizer(0.5)
```

Here’s a simple example of using R to define a TensorFlow model:

```
library(tensorflow)
# Create 100 phony x, y data points, y = x * 0.1 + 0.3
x_data <- runif(100, min=0, max=1)
y_data <- x_data * 0.1 + 0.3
# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W <- tf$Variable(tf$random_uniform(shape(1L), -1.0, 1.0))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- W * x_data + b
# Minimize the mean squared errors.
loss <- tf$reduce_mean((y - y_data) ^ 2)
optimizer <- tf$train$GradientDescentOptimizer(0.5)
train <- optimizer$minimize(loss)
# Launch the graph and initialize the variables.
sess = tf$Session()
sess$run(tf$global_variables_initializer())
# Fit the line (Learns best fit is W: 0.1, b: 0.3)
for (step in 1:201) {
sess$run(train)
if (step %% 20 == 0)
cat(step, "-", sess$run(W), sess$run(b), "\n")
}
```

If you’ve seen TensorFlow code written in Python you’ll recognize this code as nearly identical save for some minor syntactic differences (e.g. the use of `$`

rather than `.`

as an object delimiter). You’ll also note that R numeric vectors and random distribution functions are used, whereas Python code would typically use their NumPy equivalents.

The remainder of this article describes the core patterns used for interacting with the TensorFlow API from R,

The TensorFlow API is more strict about numeric types than is customary in R (which often automatically casts from integer to float and vice-versa as necessary). Many TensorFlow function parameters require integers (e.g. for tensor dimensions) and in those cases it’s important to use an R integer literal (e.g. `1L`

). Here’s an example of specifying the `strides`

parameter for a 4-dimensional tensor using integer literals:

`tf$nn$conv2d(x, W, strides=c(1L, 1L, 1L, 1L), padding='SAME')`

Here’s another example of using integer literals when defining a set of integer flags:

```
flags$DEFINE_integer('max_steps', 2000L, 'Number of steps to run trainer.')
flags$DEFINE_integer('hidden1', 128L, 'Number of units in hidden layer 1.')
flags$DEFINE_integer('hidden2', 32L, 'Number of units in hidden layer 2.')
```

Some TensorFlow APIs call for lists of a numeric type. Typically you can use the `c`

function (as illustrated above) to create lists of numeric types. However, there are a couple of special cases (mostly involving specifying the shapes of tensors) where you may need to create a numeric list with an embedded `NULL`

or a numeric list with only a single item. In those cases you’ll want to use the `list`

function rather than `c`

in order to force the argument to be treated as a list rather than a scalar, and to ensure that `NULL`

elements are preserved. For example:

```
x <- tf$placeholder(tf$float32, list(NULL, 784L))
W <- tf$Variable(tf$zeros(list(784L, 10L)))
b <- tf$Variable(tf$zeros(list(10L)))
```

This need to use `list`

rather than `c`

is very common for shape arguments (since they are often of one dimension and in the case of placeholders often have a `NULL`

dimension). For these cases there is a `shape`

function which you can use to make the calling syntax a bit more more clear. For example, the above code could be re-written as:

```
x <- tf$placeholder(tf$float32, shape(NULL, 784L))
W <- tf$Variable(tf$zeros(shape(784L, 10L)))
b <- tf$Variable(tf$zeros(shape(10L)))
```

A tensor is a typed multi-dimensional array. Tensors can take the form of a single value, a vector, a matrix, or an array in many dimensions. When you initialize the value of a tensor you can use the following R data types for various tensor shapes:

Dimensions | R Type | Example |
---|---|---|

1 | vector | `c(1.0, 2.0, 3.0, 4.0)` |

2 | matrix | `matrix(c(1.0,2.0,3.0,4.0), nrow = 2, ncol = 2)` |

3+ | array | `array(rep(1, 365*5*4), dim=c(365, 5, 4))` |

Correspondingly, when a TensorFlow computation yields a value back to R the appropriate data type (vector, matrix, or array) will be returned. You may see references to NumPy arrays in TensorFlow documentation or examples written in Python. The TensorFlow R API doesn’t make use of NumPy arrays but rather their R analogs as described above.

Tensor indexes within the TensorFlow API are 0-based (rather than 1-based as R vectors are). This typically comes up when specifying the dimension of a tensor to operate on (e.g with a function like `tf$reduce_mean`

or `tf$argmax`

). The first dimension of a tensor is specified as `0L`

, the second `1L`

, and so on. For example:

```
# call tf$reduce_mean on the second dimension of the specified tensor
cross_entropy <- tf$reduce_mean(
-tf$reduce_sum(y_ * tf$log(y_conv), reduction_indices=1L)
)
# call tf$argmax on the second dimension of the specified tensor
correct_prediction <- tf$equal(tf$argmax(y_conv, 1L), tf$argmax(y_, 1L))
```

Tensors can created by extracting elements from other tensors, either using functions like `tf$gather`

and `tf$slice`

, or by using `[`

syntax. Extraction using `[`

also expects 0-based indexes, but otherwise follows R conventions: the index range is inclusive (whereas Python omits the last element of an indexing vector) and blank indices indicate that all elements in that dimension should be retained. For example:

```
# take the elements in the first 100 rows and first 5 columns of W
W_small <- W[0:99, 0:4]
# take the last 3 columns, but all rows, of W
W_smallish <- W[, 7:9]
```

The tensorflow package provides Tensor implementations for a number of R generic operators and functions. You can use these in place of the equivalent `tf`

function to make your code more readable and compact. The following generics are implemented:

Generic | TensorFlow |
---|---|

- | tf$neg, tf$sub |

+ | tf$add |

* | tf$mul |

/ | tf$truediv |

%/% | tf$floordiv |

%% | tf$mod |

^ | tf$pow |

%% | tf$mod |

& | tf$logical_and |

| | tf$logical_or |

! | tf$logical_not |

== | tf$equal |

!= | tf$not_equal |

< | tf$less |

<= | tf$less_equal |

> | tf$greater |

>= | tf$greater_equal |

[ | tf$strided_slice |

abs | tf$abs |

sign | tf$sign |

sqrt | tf$sqrt |

floor | tf$floor |

ceiling | tf$ceiling |

round | tf$round |

exp | tf$exp |

log | tf$log |

cos | tf$cos |

sin | tf$sin |

tan | tf$tan |

acos | tf$acos |

asin | tf$asin |

atan | tf$atan |

lgamma | tf$lgamma |

digamma | tf$digamma |

Some TensorFlow APIs accept dictionaries as arguments, the most common of which is the `feed_dict`

argument which feeds training and test data to various functions. If a dictionary is keyed by character string you can simply pass an R named list. However, if a dictionary is keyed by another object type like a tensor (as `feed_dict`

is) then you should use the `dict`

function rather than a named list. For example:

`sess$run(train_step, feed_dict = dict(x = batch_xs, y_ = batch_ys))`

The `x`

and `y_`

variables in the above example are tensor placeholders which are substituted for by the specified training data.

Some TensorFlow APIs accept tuples as arguments. In many cases if you pass a list the API will automatically convert it to a tuple, however there may be instances where you need to explicitly pass a tuple. For these cases you can use the `tuple`

function, for example:

`tuple(1L, 2L, 3L)`

The TensorFlow API includes several functions that yield scoped execution contexts (i.e. blocks of code which execute code on enter and exit, for example to set a default or to open and close a resource).

The R `with`

generic function can be used with TensorFlow objects that define a scoped execution context. For example:

```
with(tf$name_scope('input'), {
x <- tf$placeholder(tf$float32, shape(NULL, 784L), name='x-input')
y_ <- tf$placeholder(tf$float32, shape(NULL, 10L), name='y-input')
})
```

In this case the `tf$name_scope`

execution context will automatically pre-pend `"input/"`

to the specified names, so they will become `"input/x-input"`

and `"input/y-input"`

respectively.

It is sometimes convenient to gain access to the execution context via an R object. For this purpose there is also a custom `%as%`

operator defined, for example:

```
with(tf$Session() %as% sess, {
sess$run(hello)
})
```

The `sess`

variable will be assigned from the execution context and be available only within the expression passed to `with`

.

The TensorFlow API includes several functions that return Python iterators or generators (these are typically used for streaming back results from training or evaluation). You can use the `iterate`

function to apply an R function to each item returned from an interator or generator. For example:

```
iterate(tf$python_io$tf_record_iterator(filename), function(record) {
# Process the record
})
```

If you just want to collect all of the items into an R list or vector, you can call the `iterate`

function with just the iterator and no function to apply:

`records <- iterate(tf$python_io$tf_record_iterator(filename))`

The result will be automatically simplified to an R vector if all values returned from the iteration are length 1 vectors of single primitive type “character”, “complex”, “double”, “integer”, or “logical” (otherwise the result will be an R list).

Note that Python iterators and generators are stateful objects. This means that after making a single pass through an iterator it will be empty (i.e. you can’t execute another iteration against the same underlying iterator).

When you use a TensorFlow API that expects NumPy arrays you can simply use the equivalent R matrix or multi-dimensional array and it will be automatically converted to the required NumPy array type.

There are however some parts of the TensorFlow API which require direct specification of NumPy data types (e.g. the CSV loading functions in tflearn). For these contexts you can import the NumPy module explicitly and refer to it’s types as needed.

For example, here’s a call to tflearn’s `load_csv_with_header`

function that specifies data types using the NumPy module:

```
np <- import("numpy")
training_set <- tf$contrib$learn$datasets$base$load_csv_with_header(
filename = "iris_training.csv",
target_dtype = np$int,
features_dtype = np$float32
)
```

Note that we explicitly import NumPy via `np <- import("numpy")`

prior to accessing the `np$int`

and `np$float32`

types.

TensorFlow includes a facility for driving training hyperparmaeters using command-line flags. You can use this facility by:

Obtaining a reference to the global

`tf$app$flags`

object.Defining whatever flags are appropriate using the various

`DEFINE`

functions provided by the object.Populating a

`FLAGS`

object which will carry the specified values by calling the`parse_flags`

function.

Here’s a complete example:

```
# Basic model parameters as external flags.
flags <- tf$app$flags
flags$DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags$DEFINE_integer('max_steps', 5000L, 'Number of steps to run trainer.')
flags$DEFINE_integer('hidden1', 128L, 'Number of units in hidden layer 1.')
flags$DEFINE_integer('hidden2', 32L, 'Number of units in hidden layer 2.')
flags$DEFINE_integer('batch_size', 100L, 'Batch size.')
flags$DEFINE_string('train_dir', 'MNIST-data', 'Directory to put the training data.')
flags$DEFINE_boolean('fake_data', FALSE, 'If true, uses fake data for unit testing.')
FLAGS <- parse_flags()
```

As you use TensorFlow from R you’ll want to get help on the various functions and classes available within the API. If you are running the vary latest Preview Release (v1.0.18 or later) of RStudio IDE you can get code completion and inline help for the TensorFlow API within RStudio. For example:

Inline help is also available for function parameters:

You can press the F1 key while viewing inline help (or whenever your cursor is over a TensorFlow API symbol) and you will be navigated to the location of that symbol’s help within the TensorFlow API documentation.

The main TensorFlow API reference documents all of the modules, classes, and functions within TensorFlow. This documentation is for the Python API, however since the R API is based on the Python API the documentation is also easily adapted for use with R.

Python data types in the TensorFlow API map to R as follows:

Python | R | Examples |
---|---|---|

Scalar | Single-element vector | `1` , `1L` , `TRUE` , `"foo"` |

List | Multi-element vector | `c(1.0, 2.0, 3.0)` , `c(1L, 2L, 3L)` |

Tuple | List of multiple types | `list(1L, TRUE, "foo")` |

Dict | Named list or `dict` |
`list(a = 1L, b = 2.0)` , `dict(x = x_data)` |

NumPy ndarray | Matrix/Array | `matrix(c(1,2,3,4), nrow = 2, ncol = 2)` |

None, True, False | NULL, TRUE, FALSE | `NULL` , `TRUE` , `FALSE` |