Gaussian error linear unit (GELU) activation function.
This is computed element-wise. There are two variants depending on whether approximate is set (default true):
approximate:
    gelu(x) ~= x * 0.5 * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
exact:
    gelu(x) = x * 0.5 * erfc(-x / sqrt(2))
Reference: https://ml-explore.github.io/mlx/build/html/python/nn/_autosummary_functions/mlx.nn.gelu_approx.html
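To make the two variants concrete, below is a minimal element-wise sketch in plain Python. The function names gelu_approx and gelu_exact are illustrative only and are not the MLX API; the bodies follow the two formulas above.

```python
import math

def gelu_approx(x: float) -> float:
    # Tanh approximation: x * 0.5 * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return x * 0.5 * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def gelu_exact(x: float) -> float:
    # Exact form: x * 0.5 * erfc(-x / sqrt(2)), i.e. x * Phi(x)
    return x * 0.5 * math.erfc(-x / math.sqrt(2.0))

if __name__ == "__main__":
    # Compare the two variants at a few sample points.
    for v in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={v:+.1f}  approx={gelu_approx(v):+.6f}  exact={gelu_exact(v):+.6f}")
```

The approximate variant avoids the error function and is slightly cheaper to compute, at the cost of a small numerical difference from the exact form.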