`R/layers.R`

`layer_conv_1d_flipout.Rd`

This layer creates a convolution kernel that is convolved
(actually cross-correlated) with the layer input to produce a tensor of
outputs. It may also include a bias addition and an activation function
on the outputs. It assumes the `kernel` and/or `bias` are drawn from distributions.
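To make the "actually cross-correlated" remark concrete, here is a minimal base R sketch with a fixed (non-random) kernel, stride 1 and no padding. The helper name `cross_correlate_1d` is hypothetical and is not part of the package.

```r
# Slide the kernel over the input without flipping it (cross-correlation).
cross_correlate_1d <- function(x, kernel) {
  k <- length(kernel)
  n_out <- length(x) - k + 1  # no padding, stride 1
  vapply(
    seq_len(n_out),
    function(i) sum(x[i:(i + k - 1)] * kernel),
    numeric(1)
  )
}

x <- c(1, 2, 3, 4, 5)
kernel <- c(1, 0, -1)

cross_correlate_1d(x, kernel)       # -2 -2 -2 (kernel applied as-is)
cross_correlate_1d(x, rev(kernel))  #  2  2  2 (true convolution flips the kernel)
```

Because the kernel is learned (here: its distribution is learned), the flip makes no practical difference to what the layer can represent.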

layer_conv_1d_flipout(
  object,
  filters,
  kernel_size,
  strides = 1,
  padding = "valid",
  data_format = "channels_last",
  dilation_rate = 1,
  activation = NULL,
  activity_regularizer = NULL,
  trainable = TRUE,
  kernel_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(),
  kernel_posterior_tensor_fn = function(d) d %>% tfd_sample(),
  kernel_prior_fn = tfp$layers$util$default_multivariate_normal_fn,
  kernel_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
  bias_posterior_fn = tfp$layers$util$default_mean_field_normal_fn(is_singular = TRUE),
  bias_posterior_tensor_fn = function(d) d %>% tfd_sample(),
  bias_prior_fn = NULL,
  bias_divergence_fn = function(q, p, ignore) tfd_kl_divergence(q, p),
  ...
)

Argument | Description |
---|---|
object | Model or layer object |
filters | Integer, the dimensionality of the output space (i.e. the number of filters in the convolution). |
kernel_size | An integer or list of a single integer, specifying the length of the 1D convolution window. |
strides | An integer or list of a single integer, specifying the stride length of the convolution. Specifying any stride value != 1 is incompatible with specifying any |
padding | One of |
data_format | A string, one of |
dilation_rate | An integer or tuple/list of a single integer, specifying the dilation rate to use for dilated convolution. Currently, specifying any |
activation | Activation function. Set it to `NULL` to maintain a linear activation. |
activity_regularizer | Regularizer function for the output. |
trainable | Whether the layer weights will be updated during training. |
kernel_posterior_fn | Function which creates |
kernel_posterior_tensor_fn | Function which takes a |
kernel_prior_fn | Function which creates |
kernel_divergence_fn | Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s) from the surrogate posterior and computes or approximates the KL divergence. The distributions are |
bias_posterior_fn | Function which creates a |
bias_posterior_tensor_fn | Function which takes a |
bias_prior_fn | Function which creates |
bias_divergence_fn | Function which takes the surrogate posterior distribution, prior distribution and random variate sample(s) from the surrogate posterior and computes or approximates the KL divergence. The distributions are |
... | Additional keyword arguments passed to the |
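How `kernel_size`, `strides`, `padding` and `dilation_rate` interact is easiest to see through the length of the output sequence. The base R sketch below uses the usual Keras conventions for 1D convolutions. The helper `conv1d_output_length` is hypothetical, and the `"same"` padding mode is an assumption taken from Keras rather than from the text above.

```r
# Output length of a 1D convolution along the time axis.
conv1d_output_length <- function(input_length, kernel_size, strides = 1,
                                 padding = c("valid", "same"),
                                 dilation_rate = 1) {
  padding <- match.arg(padding)
  if (padding == "same") {
    # Input is padded so that only the stride shortens the sequence.
    return(ceiling(input_length / strides))
  }
  # Dilation spreads the kernel taps apart, widening the window it covers.
  effective_kernel <- dilation_rate * (kernel_size - 1) + 1
  floor((input_length - effective_kernel) / strides) + 1
}

conv1d_output_length(100, kernel_size = 3)                     # 98
conv1d_output_length(100, kernel_size = 3, strides = 2)        # 49
conv1d_output_length(100, kernel_size = 3, dilation_rate = 2)  # 96
conv1d_output_length(100, kernel_size = 3, padding = "same")   # 100
```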

a Keras layer

This layer implements the Bayesian variational inference analogue to
a 1D convolution layer by assuming the `kernel` and/or the `bias` are drawn
from distributions.

By default, the layer implements a stochastic forward pass via sampling from the kernel and bias posteriors,

outputs = f(inputs; kernel, bias), kernel, bias ~ posterior

where f denotes the layer's calculation. It uses the Flipout
estimator (Wen et al., 2018), which performs a Monte Carlo approximation
of the distribution integrating over the `kernel` and `bias`. Flipout uses
roughly twice as many floating point operations as the reparameterization
estimator but has the advantage of significantly lower variance.
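The core idea of Flipout can be sketched in base R. For brevity the sketch uses a plain linear map instead of a convolution, which is an assumption made for illustration: a convolution is also linear in its kernel, so the same identity applies. One perturbation `delta` is sampled for the whole batch, and each example gets its own pair of random sign vectors. The two matrix products in `flipout_out` are where the "roughly twice as many floating point operations" come from. All variable names are hypothetical.

```r
set.seed(1)
batch <- 4; n_in <- 3; n_out <- 2

x      <- matrix(rnorm(batch * n_in), batch, n_in)           # inputs
w_mean <- matrix(rnorm(n_in * n_out), n_in, n_out)           # posterior mean
delta  <- matrix(rnorm(n_in * n_out, sd = 0.1), n_in, n_out) # shared perturbation

# Independent random signs for every example.
s <- matrix(sample(c(-1, 1), batch * n_in,  replace = TRUE), batch, n_in)
r <- matrix(sample(c(-1, 1), batch * n_out, replace = TRUE), batch, n_out)

# Flipout: two batched matrix products instead of one weight matrix per example.
flipout_out <- x %*% w_mean + ((x * s) %*% delta) * r

# Reference: build the per-example weights W_i = w_mean + delta * (s_i r_i^T).
explicit_out <- t(sapply(seq_len(batch), function(i) {
  w_i <- w_mean + delta * outer(s[i, ], r[i, ])
  as.vector(x[i, , drop = FALSE] %*% w_i)
}))

# Both routes give the same outputs.
stopifnot(isTRUE(all.equal(flipout_out, explicit_out)))
```

The sign flips decorrelate the perturbations across examples in a batch, which is the source of the variance reduction described above.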

The arguments permit separate specification of the surrogate posterior
(`q(W|x)`), prior (`p(W)`), and divergence for both the `kernel` and `bias`
distributions.

Upon being built, this layer adds losses (accessible via the `losses`
property) representing the divergences of `kernel` and/or `bias` surrogate
posteriors and their respective priors. When doing minibatch stochastic
optimization, make sure to scale this loss such that it is applied just once
per epoch (e.g. if `kl` is the sum of `losses` for each element of the batch,
you should pass `kl / num_examples_per_epoch` to your optimizer).

You can access the `kernel` and/or `bias` posterior and prior distributions
after the layer is built via the `kernel_posterior`, `kernel_prior`,
`bias_posterior` and `bias_prior` properties.
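The scaling rule can be checked with plain arithmetic. The base R sketch below assumes the per-batch loss is a mean negative log-likelihood per example plus `kl / num_examples_per_epoch`. All numbers are made up for illustration.

```r
num_examples_per_epoch <- 1000
batch_size <- 100
batches_per_epoch <- num_examples_per_epoch / batch_size

kl <- 50         # hypothetical sum of the layer's divergence losses
mean_nll <- 0.7  # hypothetical mean negative log-likelihood per example

# Loss handed to the optimizer for one minibatch.
batch_loss <- mean_nll + kl / num_examples_per_epoch

# Undo the per-example averaging and add up the batches of one epoch.
epoch_total <- batches_per_epoch * batch_size * batch_loss

# The KL term enters the epoch total exactly once.
stopifnot(isTRUE(all.equal(epoch_total,
                           num_examples_per_epoch * mean_nll + kl)))
```

Without the division, the KL term would be counted once per example and would dominate the data term.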

Other layers:
`layer_autoregressive()`, `layer_conv_1d_reparameterization()`, `layer_conv_2d_flipout()`, `layer_conv_2d_reparameterization()`, `layer_conv_3d_flipout()`, `layer_conv_3d_reparameterization()`, `layer_dense_flipout()`, `layer_dense_local_reparameterization()`, `layer_dense_reparameterization()`, `layer_dense_variational()`, `layer_variable()`
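A minimal usage sketch follows. It assumes working installations of keras, tensorflow and tfprobability in mutually compatible versions. The input shape, layer sizes and `n_train` are illustrative assumptions rather than recommendations. The custom `kernel_divergence_fn` has the same signature as the default and only adds the per-epoch scaling described above.

```r
library(keras)
library(tfprobability)

n_train <- 1000  # assumed number of training examples per epoch

# Same signature as the default divergence function, with the KL term scaled
# so that it is counted once per epoch.
scaled_kl <- function(q, p, ignore) tfd_kl_divergence(q, p) / n_train

input <- layer_input(shape = c(100, 1))  # 100 time steps, 1 channel

output <- input %>%
  layer_conv_1d_flipout(
    filters = 8,
    kernel_size = 3,
    activation = "relu",
    kernel_divergence_fn = scaled_kl
  ) %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 1)

model <- keras_model(input, output)

# Divergence terms contributed by the flipout layer.
model$losses
```

Because the forward pass samples from the posterior, repeated calls to `predict()` on the same input will generally return different values.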