Regularization

What is regularization

Alternative intuition for deep neural networks:
Regularization reduces overfitting by making the weights of units decay toward 0 (since λ is usually large). If the weights are close to zero, activations such as tanh stay in their near-linear regime, so the network becomes almost linear and is less prone to overfitting.
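
A minimal sketch of this decay mechanism, assuming plain gradient descent with an L2 penalty (the names `l2_update`, `lam`, and `grad_W` are illustrative, not from the note):

```python
import numpy as np

def l2_update(W, grad_W, lr=0.1, lam=1.0):
    """One gradient step with an L2 penalty (weight decay).

    The penalty (lam / 2) * ||W||^2 contributes lam * W to the gradient,
    shrinking every weight toward 0 on each step.
    """
    return W - lr * (grad_W + lam * W)

# With a large lam the shrinkage term dominates: here the data gradient is
# zero, so W decays by a factor (1 - lr * lam) per step, pushing the
# network toward the near-linear regime described above.
W = np.random.randn(4, 4)
for _ in range(100):
    W = l2_update(W, grad_W=np.zeros_like(W))
print(np.abs(W).max())  # effectively zero
```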

Cost function with regularization

When you apply regularization, a regularization term is added to the cost function.
See Cost Functions#Cost function with regularization.
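
For reference, the usual L2-regularized cost for a network with $L$ layers trained on $m$ examples (the linked note may use slightly different notation):

$$J_{\text{reg}}(W, b) = J(W, b) + \frac{\lambda}{2m} \sum_{l=1}^{L} \left\lVert W^{[l]} \right\rVert_F^2,$$

where $\lVert \cdot \rVert_F$ is the Frobenius norm.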

Types of Techniques

$$\text{ElasticNet Penalty} = \lambda_1 \sum_{j=1}^{p} |\beta_j| + \lambda_2 \sum_{j=1}^{p} \beta_j^2,$$

where $\lambda_1$ and $\lambda_2$ are tuning parameters that control the strength of the L1 and L2 penalties, respectively.
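
As a concrete example, scikit-learn's `ElasticNet` exposes this penalty through an `(alpha, l1_ratio)` parameterization rather than $(\lambda_1, \lambda_2)$ directly; a minimal sketch based on its documented objective:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

X, y = make_regression(n_samples=100, n_features=20, noise=1.0, random_state=0)

# In scikit-learn's objective: lambda_1 = alpha * l1_ratio and
# lambda_2 = 0.5 * alpha * (1 - l1_ratio).
model = ElasticNet(alpha=1.0, l1_ratio=0.5)
model.fit(X, y)

# The L1 part zeroes out some coefficients; the L2 part shrinks the rest.
print(model.coef_)
```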

Interesting

But long-term training may lead to a flip in large models, see here

Regularization in Bayesian framework
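
A minimal sketch of the standard connection: L2 regularization is equivalent to MAP estimation with a zero-mean Gaussian prior on the weights (a Laplace prior gives L1):

$$\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} \left[ \log p(D \mid \theta) + \log p(\theta) \right], \qquad p(\theta) \propto e^{-\frac{\lambda}{2} \lVert \theta \rVert_2^2} \;\Rightarrow\; -\log p(\theta) = \frac{\lambda}{2} \lVert \theta \rVert_2^2 + \text{const},$$

so maximizing the posterior is the same as minimizing the negative log-likelihood plus the L2 penalty.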