12 Main Dropout Methods: Mathematical and Visual Explanation for DNNs, CNNs, and RNNs
Motivations
- Major challenge -> co-adaptation: neurons become highly dependent on one another. Dropout methods break this co-adaptation, which helps prevent overfitting.
Standard Dropout
For each layer (except the output layer), we set a dropout probability p. At each iteration, each neuron has a probability p of being omitted (a Bernoulli 0-1 draw).
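A minimal sketch of this per-neuron masking, using NumPy and the common "inverted dropout" convention (activations are rescaled by 1/(1-p) at train time so the expected value is unchanged at test time); the function name and signature are illustrative, not from the original:

```python
import numpy as np

def dropout(x, p, training=True, rng=np.random.default_rng(0)):
    """Standard (inverted) dropout: zero each unit with probability p."""
    if not training or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p      # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)          # rescale so E[output] == E[input]
```

At inference time (`training=False`) the activations pass through untouched, which is exactly why the train-time rescaling is done.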
DropConnect
To regularize the forward pass of a dense network, you can apply dropout to the neurons. DropConnect [2], introduced by L. Wan et al., does not apply dropout directly to the neurons but to the weights and biases linking them.
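A sketch of this idea for a single dense layer: instead of masking the activations, a Bernoulli mask is sampled over the weight matrix (and bias) before the affine transform. Function and variable names are illustrative:

```python
import numpy as np

def dropconnect_forward(x, W, b, p, rng=np.random.default_rng(0)):
    """DropConnect-style forward pass: drop individual weights/biases with probability p."""
    mask_W = rng.random(W.shape) >= p    # independent mask per connection
    mask_b = rng.random(b.shape) >= p
    return x @ (W * mask_W) + b * mask_b
```

Note that masking weights is a finer-grained perturbation than masking neurons: dropping a neuron removes an entire row of connections at once, while DropConnect drops each connection independently.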
Standout
The difference is that the probability p of omitting a neuron is not constant across the layer. It is adaptive, computed from the values of the weights.
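A hedged sketch of such a weight-dependent keep probability: here the per-unit keep probability is a sigmoid of a scaled pre-activation, so strongly driven units are kept more often. The exact form of the adaptive probability (the `alpha`/`beta` parameters and the ReLU nonlinearity) is an assumption for illustration, not a faithful reproduction of the Standout paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def standout_forward(x, W, b, alpha=1.0, beta=0.0, rng=np.random.default_rng(0)):
    """Adaptive dropout sketch: keep probability depends on the weighted input."""
    h = x @ W + b
    keep_prob = sigmoid(alpha * h + beta)   # larger pre-activation -> more likely kept
    mask = rng.random(h.shape) < keep_prob  # per-unit Bernoulli with adaptive p
    return np.maximum(h, 0.0) * mask        # assumed ReLU activation, then mask
```

The key contrast with standard dropout is visible in the code: `keep_prob` is an array computed from the weights and inputs, not a single scalar shared by the whole layer.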