Keras is a high-level neural networks API developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Keras has the following key features:

  • Allows the same code to run on CPU or on GPU, seamlessly.

  • User-friendly API which makes it easy to quickly prototype deep learning models.

  • Built-in support for convolutional networks (for computer vision), recurrent networks (for sequence processing), and any combination of both.

  • Supports arbitrary network architectures: multi-input or multi-output models, layer sharing, model sharing, etc. This means that Keras is appropriate for building essentially any deep learning model, from a memory network to a neural Turing machine.

This website provides documentation for the R interface to Keras. See the main Keras website at https://keras.io for additional information on the project.

Installation

The R interface to Keras uses TensorFlow™ as its default tensor backend engine. To get started, you should therefore install both the keras R package and the TensorFlow engine.

First, install the keras R package from GitHub as follows:

devtools::install_github("rstudio/keras")

Then, use the install_tensorflow() function to install TensorFlow:

library(keras)
install_tensorflow()

This will provide you with a default installation of TensorFlow suitable for use with the keras R package. See the article on TensorFlow installation to learn about more advanced options, including installing a version of TensorFlow that takes advantage of Nvidia GPUs if you have the correct CUDA libraries installed.
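
For example, a GPU-enabled build can be requested at install time. A minimal sketch (the version argument shown here is an assumption; the exact signature has varied across package versions, so consult the installation article for your version):

library(keras)

# Request a GPU-enabled TensorFlow build (requires NVIDIA CUDA/cuDNN;
# the "gpu" version specifier is an assumption -- see the installation article)
install_tensorflow(version = "gpu")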

Getting started: 30 seconds to Keras

The core data structure of Keras is a model, a way to organize layers. The simplest type of model is the Sequential model, a linear stack of layers. For more complex architectures, you should use the Keras functional API, which lets you build arbitrary graphs of layers (a brief sketch follows the sequential example below).

Define a sequential model:

library(keras)
model <- keras_model_sequential() 
model %>% 
  layer_dense(units = 512, activation = 'relu', input_shape = c(784)) %>% 
  layer_dropout(rate = 0.2) %>% 
  layer_dense(units = 512, activation = 'relu') %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 10, activation = 'softmax')
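
For comparison, here is a minimal sketch of the same architecture expressed with the functional API mentioned above (layer_input() and keras_model() are the package's functional-API entry points):

# Functional API: build the graph explicitly from an input tensor
inputs <- layer_input(shape = c(784))
outputs <- inputs %>%
  layer_dense(units = 512, activation = 'relu') %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 512, activation = 'relu') %>%
  layer_dropout(rate = 0.2) %>%
  layer_dense(units = 10, activation = 'softmax')
functional_model <- keras_model(inputs = inputs, outputs = outputs)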

Print a summary of the model’s structure using the summary() function:

summary(model)
Model
_____________________________________________________________________________________
Layer (type)                          Output Shape                      Param #      
=====================================================================================
dense_4 (Dense)                       (None, 512)                       401920       
_____________________________________________________________________________________
dropout_3 (Dropout)                   (None, 512)                       0            
_____________________________________________________________________________________
dense_5 (Dense)                       (None, 512)                       262656       
_____________________________________________________________________________________
dropout_4 (Dropout)                   (None, 512)                       0            
_____________________________________________________________________________________
dense_6 (Dense)                       (None, 10)                        5130         
=====================================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_____________________________________________________________________________________

Compile the model with an appropriate loss function, optimizer, and metrics:

model %>% compile(
  loss = 'categorical_crossentropy',
  optimizer = optimizer_sgd(lr = 0.02),
  metrics = c('accuracy')
)
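
As a concrete illustration of the inputs fit() expects, here is a sketch that prepares MNIST digit data using the dataset and utility functions bundled with the package (the 784-pixel rows and 10 one-hot classes match the model defined above):

# Load MNIST; images arrive as a 60000 x 28 x 28 integer array
mnist <- dataset_mnist()

# Flatten each 28 x 28 image into a row of 784 pixels scaled to [0, 1]
x_train <- array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
x_test <- array_reshape(mnist$test$x, c(nrow(mnist$test$x), 784)) / 255

# One-hot encode the integer labels for categorical_crossentropy
y_train <- to_categorical(mnist$train$y, 10)
y_test <- to_categorical(mnist$test$y, 10)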

You can now iterate on your training data in batches (x_train and y_train are R matrices, such as those prepared above):

history <- model %>% fit(
  x_train, y_train, 
  epochs = 20, batch_size = 128, 
  validation_split = 0.2
)

Plot loss and accuracy metrics from training:

plot(history)
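
Beyond the plot, the per-epoch values are recorded on the returned history object; a quick way to inspect them directly (metrics is the field the keras R training-history object uses):

# Per-epoch training and validation metric values collected by fit()
str(history$metrics)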

Evaluate your performance on test data:

loss_and_metrics <- model %>% evaluate(x_test, y_test, batch_size = 128)
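
evaluate() returns the loss along with each metric specified in compile() as a named list; a sketch of reading the results (the element names, e.g. acc vs. accuracy, are assumptions that vary across Keras versions):

# Access individual results by name (names depend on the Keras version)
cat('Test loss:', loss_and_metrics$loss, '\n')
cat('Test accuracy:', loss_and_metrics$acc, '\n')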

Generate predictions (class probabilities) on new data:

probs <- model %>% predict(x_test, batch_size = 128)
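
Since the final layer uses a softmax over 10 classes, predict() returns a matrix of per-class probabilities. A minimal base-R sketch of reducing these to predicted class labels (the 0-based digit labels are an assumption matching MNIST):

# Index of the most probable class in each row, shifted to 0-based labels
predicted_labels <- max.col(probs) - 1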

Building a question answering system, an image classification model, a Neural Turing Machine, or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?

Learning More

To learn more about Keras, you can check out these articles:

  • The examples demonstrate more advanced models, including transfer learning, variational autoencoders, question answering with memory networks, and text generation with stacked LSTMs.

  • The function reference includes detailed information on all of the functions available in the package.

Why this name, Keras?

Keras (κέρας) means horn in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the Odyssey, where dream spirits (Oneiroi, singular Oneiros) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It’s a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).

Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).

“Oneiroi are beyond our unravelling – who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them.” Homer, Odyssey 19.562 ff (Shewring translation).