What are Gated Neural Networks?
A gate in a neural network acts as a threshold that helps the network decide when to use its normal stacked layers and when to use an identity connection. An identity connection adds the output of a lower layer to the output of the layers stacked above it. In short, it lets the layers of the network learn in increments rather than build each transformation from scratch. The gate decides whether the network can take the shorter identity connection or needs to use the stacked layers.
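As a rough illustration of the idea above, the sketch below blends the output of a transforming layer with the unchanged input, using a gate value between 0 and 1. It is a minimal scalar example, not a definitive implementation: the function name gated_layer, the parameter names (w_h, b_h, w_g, b_g), and the particular blend (gate times layer output plus one minus gate times input, in the style of a highway layer) are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_layer(x, w_h, b_h, w_g, b_g):
    """Blend a transformed value with the unchanged input.

    All names here are hypothetical. The gate g lies in (0, 1):
    g near 1 favours the stacked (transforming) layer,
    g near 0 favours the identity connection.
    """
    h = math.tanh(w_h * x + b_h)   # output of the stacked layer
    g = sigmoid(w_g * x + b_g)     # gate coefficient
    return g * h + (1.0 - g) * x   # weighted mix of layer output and input


# A strongly negative gate bias pushes g toward 0, so the input
# passes through the identity connection almost unchanged.
mostly_identity = gated_layer(2.0, w_h=0.5, b_h=0.0, w_g=0.0, b_g=-20.0)

# A strongly positive gate bias pushes g toward 1, so the result is
# close to the stacked layer's output, tanh(w_h * x + b_h).
mostly_layer = gated_layer(2.0, w_h=0.5, b_h=0.0, w_g=0.0, b_g=20.0)
```

In a trained network the gate parameters would be learned along with the other weights, so the network itself settles how much each layer relies on the identity connection.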
How does a Gated Neural Network function?
The gate is assigned a coefficient that defines how much the network relies on the identity connection relative to the stacked layers. For example, instead of a traditional recurrent neural network architecture with several sequential nodes, the gated recurrent unit uses several cells consecutively, each containing three models (example cell pictured below). A gated neural network of this kind uses two mechanisms known as the update gate and the reset gate. These allow the network to carry information forward across multiple units by storing values in memory. When a critical point is reached, the stored values are used to update the current state.
Update Gate and Reset Gate
A gated neural network contains four main components: the update gate, the reset gate, the current memory unit, and the final memory unit. The update gate determines how much past information is passed along to the future, which helps mitigate the vanishing gradient problem. Because the model learns the gate values on its own, it continues to update the information that is carried forward. The reset gate acts in an opposing way by deciding how much of the past information should be forgotten, given the current state.
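The four components can be sketched as a single step of a gated recurrent unit. This is a minimal scalar version under stated assumptions: the function name gru_step and the parameter dictionary keys are hypothetical, the weights are set by hand rather than learned, and the final blend follows one common convention (update gate near 1 keeps the previous state); some formulations swap the roles of z and 1 - z.

```python
import math


def sigmoid(v):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One scalar GRU step; p is a dict of hypothetical, hand-set weights."""
    # Update gate: how much of the previous state to carry forward.
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])
    # Current memory: candidate state built from the input and the
    # reset-scaled previous state.
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Final memory: blend of the previous state and the candidate.
    return z * h_prev + (1.0 - z) * h_cand


base = {"w_z": 0.0, "u_z": 0.0, "w_r": 0.0, "u_r": 0.0, "b_r": 0.0,
        "w_h": 1.0, "u_h": 1.0, "b_h": 0.0}
inputs = [1.0, -2.0, 3.0]

# Update gate close to 1: the initial state is carried across every step.
keep_params = dict(base, b_z=20.0)
kept = 0.5
for x in inputs:
    kept = gru_step(x, kept, keep_params)

# Update gate close to 0: each step replaces the state with the candidate.
overwrite_params = dict(base, b_z=-20.0)
overwritten = 0.5
for x in inputs:
    overwritten = gru_step(x, overwritten, overwrite_params)
```

With the update gate held near 1, kept stays very close to its starting value of 0.5 after all three inputs, which is the sense in which information is carried forward across units. With the gate held near 0, overwritten is driven by the most recent candidate instead.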