
The Gaussian error linear unit (GELU) is defined as:

gelu(x) = x * P(X <= x), where X ~ N(0, 1), i.e. gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))).

GELU weights inputs by their value, rather than gating inputs by their sign as in ReLU.
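As a quick check of the formula above, here is a minimal base-R sketch (not the package implementation): since erf(x / sqrt(2)) = 2 * pnorm(x) - 1, the exact form simplifies to x * pnorm(x). The tanh-based variant shown below is the approximation commonly used when approximate = TRUE; treat it as an assumption about this argument rather than a statement of the package's internals.

# Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))) == x * pnorm(x)
gelu_exact <- function(x) x * pnorm(x)

# Tanh-based approximation (assumed to correspond to approximate = TRUE)
gelu_approx <- function(x) {
  0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
}

gelu_exact(c(-2, 0, 2))   # -0.0455003  0.0000000  1.9544997
gelu_approx(c(-2, 0, 2))  # very close to the exact values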

Usage

activation_gelu(x, approximate = FALSE)

Arguments

x

Input tensor.

approximate

A bool, whether to enable the approximate (tanh-based) formulation of GELU. Defaults to FALSE.

Value

A tensor, the result of applying the activation to the input tensor x.
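A minimal usage sketch, assuming the keras3 package is installed and attached; the tensor-construction helper op_array() used here is an assumption, and a plain numeric vector may also be accepted.

library(keras3)

x <- op_array(c(-3, -1, 0, 1, 3))

# Exact GELU (default)
activation_gelu(x)

# Tanh-based approximation
activation_gelu(x, approximate = TRUE)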