9.24(2) - 爱码网

Model selection

$d = \text{degree of polynomial}$
$d=1,\ h_{\theta}(x)=\theta_0 +\theta_1x$
$d=2,\ h_{\theta}(x)=\theta_0 +\theta_1x+\theta_2x^{2}$
$d=3,\ h_{\theta}(x)=\theta_0 +\theta_1x+\theta_2x^{2}+\theta_3x^{3}$
$d=10,\ h_{\theta}(x)=\theta_0 +\theta_1x+\theta_2x^{2}+\theta_3x^{3}+\cdots+\theta_{10}x^{10}$

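Then fit each model to get $\Theta^{(d)}$, compute $J_{test}(\Theta^{(d)})$ for each one, and choose the degree with the lowest test error.
But a problem remains: because $d$ was itself chosen using the test set, $J_{test}$ is likely to be an optimistic estimate of how the model will do on new examples.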

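To get around this problem, split the data into 3 pieces
(training set 60%, cross-validation set 20%, test set 20%).
Choose $d$ by the cross-validation error, and use the test set only to estimate the generalization error of the chosen model.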

$J_{train}(\theta)=\frac{1}{2m} \displaystyle \sum^{m}_{i=1}(h_{\theta}(x^{(i)})-y^{(i)})^2$
$J_{cv}(\theta)=\frac{1}{2m_{cv}} \displaystyle \sum^{m_{cv}}_{i=1}(h_{\theta}(x_{cv}^{(i)})-y_{cv}^{(i)})^2$
$J_{test}(\theta)=\frac{1}{2m_{test}} \displaystyle \sum^{m_{test}}_{i=1}(h_{\theta}(x_{test}^{(i)})-y_{test}^{(i)})^2$
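
The selection procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data is a synthetic noisy quadratic, each model is fit by ordinary least squares, and the helper names (`half_mse`, `poly_features`, `select_degree`) are made up for this example.

```python
import numpy as np

def half_mse(theta, X, y):
    """J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def poly_features(x, d):
    """Columns [1, x, x^2, ..., x^d] for a 1-D input array x."""
    return np.vander(x, d + 1, increasing=True)

def select_degree(x, y, degrees=range(1, 11), seed=0):
    """Pick d by cross-validation error, then report test error for that d."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(0.6 * len(x))
    n_cv = int(0.2 * len(x))
    # 60% training, 20% cross-validation, 20% test.
    train, cv, test = np.split(idx, [n_train, n_train + n_cv])

    best = None
    for d in degrees:
        # Fit theta on the training set only.
        theta, *_ = np.linalg.lstsq(poly_features(x[train], d), y[train], rcond=None)
        j_cv = half_mse(theta, poly_features(x[cv], d), y[cv])
        if best is None or j_cv < best[1]:
            best = (d, j_cv, theta)

    d, j_cv, theta = best
    # The test set is touched once, for the chosen degree only.
    j_test = half_mse(theta, poly_features(x[test], d), y[test])
    return d, j_cv, j_test

# Synthetic data (an assumption for illustration): a noisy quadratic.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.1, 200)
d, j_cv, j_test = select_degree(x, y)
print(d, j_cv, j_test)
```

Because the test set plays no part in choosing $d$, `j_test` is a fairer estimate of the error on unseen data than `j_cv`.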

Diagnosing bias vs. variance


Regularization and bias/variance

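This part looks at how bias and variance interact with, and are affected by, the regularization of your learning algorithm. A large $\lambda$ pushes the model toward high bias (underfitting); a small $\lambda$ pushes it toward high variance (overfitting).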
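
One way to see this interaction is to fit the same model with several values of $\lambda$ and compare the training and cross-validation errors. The sketch below is an assumption-laden illustration, not a prescribed method: it uses a degree-8 polynomial on synthetic sine data, solves regularized least squares in closed form, and tries an arbitrary grid of $\lambda$ values. The names `fit_regularized` and `cost` are hypothetical.

```python
import numpy as np

def fit_regularized(X, y, lam):
    """Minimize 1/(2m)*sum((X@theta - y)^2) + lam/(2m)*sum(theta[1:]^2)."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # the intercept theta_0 is not regularized
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def cost(theta, X, y):
    """Unregularized error 1/(2m) * sum((X@theta - y)^2)."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

rng = np.random.default_rng(0)

def make(n):
    # Synthetic data (an assumption): noisy sine, degree-8 polynomial features.
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(0, 0.2, n)
    return np.vander(x, 9, increasing=True), y

X_train, y_train = make(30)
X_cv, y_cv = make(100)

# Candidate values of lambda (an arbitrary grid chosen for illustration).
lambdas = [0, 0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24]
results = []
for lam in lambdas:
    theta = fit_regularized(X_train, y_train, lam)
    # Both errors are measured WITHOUT the regularization term.
    results.append((lam, cost(theta, X_train, y_train), cost(theta, X_cv, y_cv)))

best_lam = min(results, key=lambda r: r[2])[0]
for lam, j_train, j_cv in results:
    print(f"lambda={lam:<6} J_train={j_train:.4f} J_cv={j_cv:.4f}")
print("lambda with lowest J_cv:", best_lam)
```

The training error can only grow as $\lambda$ increases, while the cross-validation error typically falls and then rises again; the $\lambda$ with the lowest $J_{cv}$ is the one to keep.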

Learning curves

If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much
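
A learning curve for the high-bias case can be sketched as follows. The setup is an assumption made for illustration: a straight line is fit to synthetic quadratic data, so the model cannot capture the curvature however many examples it sees. The function names are hypothetical.

```python
import numpy as np

def cost(theta, X, y):
    """J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def learning_curve(X_train, y_train, X_cv, y_cv, sizes):
    """Train on the first m examples; report J_train on those m and J_cv on all of cv."""
    rows = []
    for m in sizes:
        theta, *_ = np.linalg.lstsq(X_train[:m], y_train[:m], rcond=None)
        rows.append((m, cost(theta, X_train[:m], y_train[:m]), cost(theta, X_cv, y_cv)))
    return rows

rng = np.random.default_rng(0)

def make(n):
    # Quadratic data, but only [1, x] features: a deliberately underfit model.
    x = rng.uniform(-1, 1, n)
    y = 3.0 * x**2 + rng.normal(0, 0.1, n)
    return np.column_stack([np.ones(n), x]), y

X_train, y_train = make(200)
X_cv, y_cv = make(200)

rows = learning_curve(X_train, y_train, X_cv, y_cv, [2, 5, 10, 20, 50, 100, 200])
for m, j_train, j_cv in rows:
    print(f"m={m:<4} J_train={j_train:.4f} J_cv={j_cv:.4f}")
```

With two examples the line fits the training data exactly, so $J_{train}$ starts near zero; as $m$ grows, $J_{train}$ and $J_{cv}$ approach each other and level off at a high value, which is the signature of high bias.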

Get more training examples (fixes high variance)
Try smaller sets of features (fixes high variance)
Try getting additional features (fixes high bias)
Try adding polynomial features ($x_1^2, x_2^2, x_1x_2$, etc.) (fixes high bias)
Try decreasing $\lambda$ (fixes high bias)
Try increasing $\lambda$ (fixes high variance)