Gated linear unit (GLU) activation function.
Splits the input along the axis dimension into two equal halves, a and b, then computes a * sigmoid(b), where * is element-wise multiplication. The size of that dimension must be even.
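
The split-and-gate step can be sketched in plain NumPy. This is only an illustration of the formula, not the library's implementation: the function name `glu`, the default `axis=-1`, and the explicit even-size check are assumptions made for this example.

```python
import numpy as np

def glu(x, axis=-1):
    """Illustrative GLU: split x in half along `axis`, gate the first half."""
    x = np.asarray(x, dtype=float)
    # Two equal halves only exist when the axis length is even (assumed check).
    if x.shape[axis] % 2 != 0:
        raise ValueError("the size of the split axis must be even")
    a, b = np.split(x, 2, axis=axis)
    # a * sigmoid(b), with sigmoid(b) = 1 / (1 + exp(-b))
    return a * (1.0 / (1.0 + np.exp(-b)))

x = np.array([[1.0, 2.0, 0.0, 0.0]])
y = glu(x)  # a = [[1, 2]], b = [[0, 0]]; sigmoid(0) = 0.5, so y = [[0.5, 1.0]]
```

The output has half the input's size along the split axis: an input of shape (1, 4) gives an output of shape (1, 2).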