The RelaxedBernoulli is a distribution over the unit interval (0,1), which continuously approximates a Bernoulli. The degree of approximation is controlled by a temperature: as the temperature goes to 0 the RelaxedBernoulli becomes discrete with a distribution described by the logits or probs parameters, as the temperature goes to infinity the RelaxedBernoulli becomes the constant distribution that is identically 0.5.
tfd_relaxed_bernoulli( temperature, logits = NULL, probs = NULL, validate_args = FALSE, allow_nan_stats = TRUE, name = "RelaxedBernoulli" )
An 0-D Tensor, representing the temperature of a set of RelaxedBernoulli distributions. The temperature should be positive.
An N-D Tensor representing the log-odds of a positive event. Each entry in the Tensor parametrizes an independent RelaxedBernoulli distribution where the probability of an event is sigmoid(logits). Only one of logits or probs should be passed in.
AAn N-D Tensor representing the probability of a positive event. Each entry in the Tensor parameterizes an independent Bernoulli distribution. Only one of logits or probs should be passed in.
Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE.
Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined.
name prefixed to Ops created by this class.
a distribution instance.
The RelaxedBernoulli distribution is a reparameterized continuous distribution that is the binary special case of the RelaxedOneHotCategorical distribution (Maddison et al., 2016; Jang et al., 2016). For details on the binary special case see the appendix of Maddison et al. (2016) where it is referred to as BinConcrete. If you use this distribution, please cite both papers.
Some care needs to be taken for loss functions that depend on the
log-probability of RelaxedBernoullis, because computing log-probabilities of
the RelaxedBernoulli can suffer from underflow issues. In many case loss
functions such as these are invariant under invertible transformations of
the random variables. The KL divergence, found in the variational autoencoder
loss, is an example. Because RelaxedBernoullis are sampled by a Logistic
random variable followed by a
tf$sigmoid op, one solution is to treat
the Logistic as the random variable and
tf$sigmoid as downstream. The
KL divergences of two Logistics, which are always followed by a
op, is equivalent to evaluating KL divergences of RelaxedBernoulli samples.
See Maddison et al., 2016 for more details where this distribution is called
An alternative approach is to evaluate Bernoulli log probability or KL
directly on relaxed samples, as done in Jang et al., 2016. In this case,
guarantees on the loss are usually violated. For instance, using a Bernoulli
KL in a relaxed ELBO is no longer a lower bound on the log marginal
probability of the observation. Thus care and early stopping are important.