What is a Rectified Linear Unit?
A Rectified Linear Unit is a form of activation function used commonly in deep learning models. In essence, the function returns 0 if it receives a negative input, and if it receives a positive value, it returns that same value unchanged; it can be written as f(x) = max(0, x).
The rectified linear unit, or ReLU, allows a deep learning model to account for non-linearities and interaction effects.
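As a minimal sketch of the definition above (using NumPy and an illustrative function name, not code from any particular library), ReLU can be written in a couple of lines:

```python
import numpy as np

def relu(x):
    # Return 0 for negative inputs and the input itself for positive inputs.
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))
# [0.  0.  0.  1.5 3. ]
```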
How does a Rectified Linear Unit work?
The main benefit of the ReLU function is that its simplicity makes it relatively cheap to compute. Because there is no complicated math, the model can be trained and run in a relatively short time. Models using ReLU also tend to converge faster, because the slope of the function does not plateau as the input grows larger. This means ReLU avoids the vanishing gradient problem that affects alternative functions such as sigmoid or tanh.

Lastly, ReLU is sparsely activated: for all negative inputs, the output is zero. Sparsity is the property that only a subset of neurons is activated for any given input. This is a desirable feature for modern neural networks, as in a sparse network it is more likely that individual neurons are processing meaningful parts of a problem. For example, a model that is processing images of fish may contain a neuron that is specialized to identify fish eyes. That specific neuron would not be activated if the model was processing images of airplanes instead. This specialized use of neurons is what produces network sparsity.
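To make the sparsity and gradient points concrete, the toy snippet below (an illustrative sketch, not taken from any specific framework) applies ReLU to random inputs centred at zero and counts how many activations are exactly zero, then compares the ReLU gradient to the sigmoid gradient for a large input:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random pre-activations centred at zero: roughly half are negative,
# so roughly half of the ReLU outputs are exactly zero (sparsity).
pre_activations = rng.normal(size=10_000)
outputs = relu(pre_activations)
print(f"Fraction of zero activations: {np.mean(outputs == 0):.2f}")  # ~0.50

# The ReLU gradient stays at 1 for positive inputs, while the sigmoid
# gradient shrinks toward zero for large inputs (the vanishing gradient problem).
x = 10.0
print("ReLU gradient at x=10:   ", 1.0 if x > 0 else 0.0)            # 1.0
print("Sigmoid gradient at x=10:", sigmoid(x) * (1 - sigmoid(x)))    # ~4.5e-05
```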