L1-norm vs L2-norm as cost function when standardizing

Asked: 2017-04-08 22:34:47

Tags: machine-learning statistics gradient-descent

I have some data where both the input and the output values are standardized, so the difference between Y and Y_pred is always going to be very small.

I feel that the L2-norm will penalize the model less than the L1-norm, since squaring a number between 0 and 1 always results in a smaller number.

So my question is: is it OK to use the L2-norm when both the input and the output are standardized?

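1 answer:

Answer 0 (score: 1)

That does not matter.

The basic idea/motivation is how to penalize deviations. The L1-norm does not care much about outliers, while the L2-norm penalizes them heavily. This is the basic difference, and you will find plenty of pros and cons, even on Wikipedia.

So regarding your question of whether it makes sense when the expected deviations are small: sure, it behaves the same.

Let's make an example: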
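To make the outlier point concrete, here is a minimal Python sketch. The residual values are made up for illustration (three small errors and one large one) and are not taken from the question. It shows what share of the total loss comes from the single large residual under each norm.

```python
# Hypothetical residuals (y_real - y_pred): three small ones and one outlier.
residuals = [0.1, 0.1, 0.1, 2.0]

# Per-sample penalties under each cost.
l1_terms = [abs(r) for r in residuals]
l2_terms = [r ** 2 for r in residuals]

# Fraction of the total loss contributed by the outlier (last entry).
l1_outlier_share = l1_terms[-1] / sum(l1_terms)
l2_outlier_share = l2_terms[-1] / sum(l2_terms)

print(f"L1 outlier share: {l1_outlier_share:.3f}")
print(f"L2 outlier share: {l2_outlier_share:.3f}")
```

With the squared error, the outlier accounts for a larger share of the total loss than with the absolute error, which is why the L2-norm is pulled more strongly toward fitting outliers.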

y_real 1.0      ||| y_pred 0.8     ||| y_pred 0.6 
l1:                |0.2| = 0.2         |0.4| = 0.4  => 2x more error!
l2:                0.2^2 = 0.04        0.4^2 = 0.16 => 4x more error!

You see, the basic idea still applies!
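The same comparison can be reproduced in a few lines of Python. This is only a sketch of the numbers above, and the variable names are my own. The point is that the ratio between the two penalties does not depend on the absolute size of the errors: doubling the deviation doubles the L1 penalty and quadruples the L2 penalty, whether the deviations are small or large.

```python
y_real = 1.0
y_preds = [0.8, 0.6]  # second prediction is twice as far off as the first

# Absolute (L1) and squared (L2) error for each prediction.
l1_errors = [abs(y_real - p) for p in y_preds]
l2_errors = [(y_real - p) ** 2 for p in y_preds]

# How much more the worse prediction is penalized, relative to the better one.
l1_ratio = l1_errors[1] / l1_errors[0]
l2_ratio = l2_errors[1] / l2_errors[0]

print(f"L1 errors: {l1_errors}, ratio: {l1_ratio:.2f}")
print(f"L2 errors: {l2_errors}, ratio: {l2_ratio:.2f}")
```

Up to floating-point rounding, the ratios come out as roughly 2 and 4. So even though each squared value is smaller than the corresponding absolute value, the L2 cost still separates good from bad predictions more sharply.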