12 Main Dropout Methods: Mathematical and Visual Explanation for DNNs, CNNs, and RNNs
Motivations
- Major challenge -> co-adaptation: neurons become highly dependent on one another. Dropout methods break this co-adaptation, which helps prevent overfitting.
Standard Dropout
For each layer (except the output layer), we set a dropout probability p. At each iteration, each neuron has a probability p of being omitted (a Bernoulli 0-1 draw).
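A minimal sketch of this per-neuron masking, using NumPy and the common "inverted dropout" convention (activations are rescaled by 1/(1-p) at train time so the expected value is unchanged at test time); the function name and signature are illustrative, not from the original:

```python
import numpy as np

def dropout(x, p, training=True, rng=np.random.default_rng(0)):
    """Standard (inverted) dropout: zero each unit with probability p."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p      # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)          # rescale so E[output] == E[input]
```

At inference time (`training=False`) the activations pass through untouched, which is exactly why the train-time rescaling is done.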
DropConnect
To regularize the forward pass of a dense network, you can apply dropout to the neurons. DropConnect [2], introduced by L. Wan et al., does not apply dropout directly to the neurons but to the weights and biases linking them.
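A sketch of this idea for a single dense layer: instead of masking the activations, a Bernoulli mask is sampled over the weight matrix (and bias) before the affine transform. Function and variable names are illustrative:

```python
import numpy as np

def dropconnect_forward(x, W, b, p, rng=np.random.default_rng(0)):
    """DropConnect-style forward pass: drop individual weights/biases with probability p."""
    mask_W = rng.random(W.shape) >= p    # independent mask per connection
    mask_b = rng.random(b.shape) >= p
    return x @ (W * mask_W) + b * mask_b
```

Note that masking weights is a finer-grained perturbation than masking neurons: dropping a neuron removes an entire row of connections at once, while DropConnect drops each connection independently.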
Standout
The difference is that the probability p of omitting a neuron is not constant across the layer. It is adaptive, computed from the values of the weights.
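A hedged sketch of such a weight-dependent keep probability: here the per-unit keep probability is a sigmoid of a scaled pre-activation, so strongly driven units are kept more often. The exact form of the adaptive probability (the `alpha`/`beta` parameters and the ReLU nonlinearity) is an assumption for illustration, not a faithful reproduction of the Standout paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def standout_forward(x, W, b, alpha=1.0, beta=0.0, rng=np.random.default_rng(0)):
    """Adaptive dropout sketch: keep probability depends on the weighted input."""
    h = x @ W + b
    keep_prob = sigmoid(alpha * h + beta)   # larger pre-activation -> more likely kept
    mask = rng.random(h.shape) < keep_prob  # per-unit Bernoulli with adaptive p
    return np.maximum(h, 0.0) * mask        # assumed ReLU activation, then mask
```

The key contrast with standard dropout is visible in the code: `keep_prob` is an array computed from the weights and inputs, not a single scalar shared by the whole layer.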