This distribution models a random vector Y = (Y1,...,Yk), making use of a SinhArcsinh transformation (which has adjustable tailweight and skew), a rescaling, and a shift. The SinhArcsinh transformation of the Normal is described in great depth in Sinh-arcsinh distributions. Here we use a slightly different parameterization, in terms of tailweight and skewness. Additionally we allow for distributions other than Normal, and control over scale as well as a "shift" parameter loc.

  loc = NULL,
  scale_diag = NULL,
  scale_identity_multiplier = NULL,
  skewness = NULL,
  tailweight = NULL,
  distribution = NULL,
  validate_args = FALSE,
  allow_nan_stats = TRUE,
  name = "VectorSinhArcsinhDiag"



Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k] where b >= 0 and k is the event size.


Non-zero, floating-point Tensor representing a diagonal matrix added to scale. May have shape [B1, ..., Bb, k], b >= 0, and characterizes b-batches of k x k diagonal matrices added to scale. When both scale_identity_multiplier and scale_diag are NULL then scale is the Identity.


Non-zero, floating-point Tensor representing a scale-identity-matrix added to scale. May have shape [B1, ..., Bb], b >= 0, and characterizes b-batches of scale k x k identity matrices added to scale. When both scale_identity_multiplier and scale_diag are NULL then scale is the Identity.


Skewness parameter. floating-point Tensor with shape broadcastable with event_shape.


Tailweight parameter. floating-point Tensor with shape broadcastable with event_shape.


tf$distributions$Distribution-like instance. Distribution from which k iid samples are used as input to transformation F. Default is tfd_normal(loc = 0, scale = 1). Must be a scalar-batch, scalar-event distribution. Typically distribution$reparameterization_type = FULLY_REPARAMETERIZED or it is a function of non-trainable parameters. WARNING: If you backprop through a VectorSinhArcsinhDiag sample and distribution is not FULLY_REPARAMETERIZED yet is a function of trainable variables, then the gradient will be incorrect!


Logical, default FALSE. When TRUE distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE invalid inputs may silently render incorrect outputs. Default value: FALSE.


Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined.


name prefixed to Ops created by this class.


a distribution instance.


Mathematical Details

Given iid random vector Z = (Z1,...,Zk), we define the VectorSinhArcsinhDiag transformation of Z, Y, parameterized by (loc, scale, skewness, tailweight), via the relation (with @ denoting matrix multiplication):

Y := loc + scale @ F(Z) * (2 / F_0(2))
F(Z) := Sinh( (Arcsinh(Z) + skewness) * tailweight )
F_0(Z) := Sinh( Arcsinh(Z) * tailweight )

This distribution is similar to the location-scale transformation L(Z) := loc + scale @ Z in the following ways:

  • If skewness = 0 and tailweight = 1 (the defaults), F(Z) = Z, and then Y = L(Z) exactly.

  • loc is used in both to shift the result by a constant factor.

  • The multiplication of scale by 2 / F_0(2) ensures that if skewness = 0 P[Y - loc <= 2 * scale] = P[L(Z) - loc <= 2 * scale]. Thus it can be said that the weights in the tails of Y and L(Z) beyond loc + 2 * scale are the same. This distribution is different than loc + scale @ Z due to the reshaping done by F:

    • Positive (negative) skewness leads to positive (negative) skew.

    • positive skew means, the mode of F(Z) is "tilted" to the right.

    • positive skew means positive values of F(Z) become more likely, and negative values become less likely.

    • Larger (smaller) tailweight leads to fatter (thinner) tails.

    • Fatter tails mean larger values of |F(Z)| become more likely.

    • tailweight < 1 leads to a distribution that is "flat" around Y = loc, and a very steep drop-off in the tails.

    • tailweight > 1 leads to a distribution more peaked at the mode with heavier tails. To see the argument about the tails, note that for |Z| >> 1 and |Z| >> (|skewness| * tailweight)**tailweight, we have Y approx 0.5 Z**tailweight e**(sign(Z) skewness * tailweight). To see the argument regarding multiplying scale by 2 / F_0(2),

    P[(Y - loc) / scale <= 2] = P[F(Z) * (2 / F_0(2)) <= 2]
    = P[F(Z) <= F_0(2)]
    = P[Z <= 2]  (if F = F_0).

See also

For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().

Other distributions: tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(), tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(), tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(), tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(), tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(), tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(), tfd_gaussian_process_regression_model(), tfd_gaussian_process(), tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(), tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(), tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(), tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(), tfd_joint_distribution_named(), tfd_joint_distribution_sequential_auto_batched(), tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(), tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(), tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(), tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(), tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(), tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(), tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(), tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(), tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(), tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(), tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(), tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(), tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(), tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(), tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(), tfd_vector_diffeomixture(), tfd_vector_exponential_diag(), tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(), tfd_vector_laplace_linear_operator(), tfd_von_mises_fisher(), tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(), tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()