Supervised Learning:
We give the algorithm a data set in which the "right answers" are given, and the task of the algorithm is to produce more of these right answers. Predicting a continuous-valued output is called a regression problem; predicting a discrete-valued output is called a classification problem.
examples: continuous-value prediction (regression), discrete-value prediction (classification).
Unsupervised Learning:
We're just told: here is a data set, can you find some structure in the data? An algorithm that does this is called a clustering algorithm.
examples: grouping news stories, social network analysis, market segmentation.
Supervised Learning
linear regression
m = Number of training examples.
x's = "input" variable / features
y's = "output" variable / "target" variable
(x, y) = one training example
(x^(i), y^(i)) = the i-th training example
hypothesis example: h_θ(x) = θ_0 + θ_1·x
For linear regression, the most commonly used cost is the mean squared error.
cost function: J(θ_0, θ_1) = (1/2m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2
goal: minimize J(θ_0, θ_1) over (θ_0, θ_1)
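The hypothesis and cost function above can be sketched in a few lines of Python (the names `predict` and `cost` are just illustrative, not from the course):

```python
def predict(theta0, theta1, x):
    # hypothesis: h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x

def cost(theta0, theta1, xs, ys):
    # J(theta0, theta1) = (1/2m) * sum over i of (h(x_i) - y_i)^2
    m = len(xs)
    return sum((predict(theta0, theta1, x) - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)
```

For data lying exactly on a line, the cost at the true parameters is zero; any other parameter pair gives a strictly positive cost.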
3-D surface plot:
contour figures (right):
Each of these ovals is a set of points that take on the same value of J(θ_0, θ_1).
a point on the contour plot --> a pair (θ_0, θ_1) --> a line h_θ(x) through the data
Gradient descent
a more general algorithm
repeat until convergence {
    θ_j := θ_j − α · ∂/∂θ_j J(θ_0, θ_1)    (simultaneously for j = 0 and j = 1)
}
α : learning rate
!! update at the same time !!
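A minimal sketch of what "update at the same time" means: both partial derivatives are evaluated at the old parameter values before either parameter is overwritten (the helper name `gd_step` and its arguments are hypothetical):

```python
def gd_step(theta0, theta1, alpha, grad0, grad1):
    # Correct, simultaneous update: grad0 and grad1 both see the
    # OLD (theta0, theta1). Updating theta0 first and then using it
    # inside grad1 would be the wrong, sequential version.
    temp0 = theta0 - alpha * grad0(theta0, theta1)
    temp1 = theta1 - alpha * grad1(theta0, theta1)
    return temp0, temp1
```

If the gradient for θ_1 depends on θ_0, the sequential version would produce a different (incorrect) result.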
Why each update moves closer to the minimum:
the derivative ∂/∂θ_j J is positive where the cost slopes upward (so the update decreases θ_j) and negative where it slopes downward (so the update increases θ_j); either way θ_j moves toward the minimum.
α too small or too large:
if α is too small, gradient descent can be slow.
if α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
As we approach a local minimum, gradient descent automatically takes smaller steps, because the derivative term shrinks toward zero. So there is no need to decrease α over time.
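These learning-rate effects can be demonstrated on the one-parameter cost J(θ) = θ², whose derivative is 2θ (the `descend` helper is an assumed name, not from the course):

```python
def descend(theta, alpha, steps):
    # Minimize J(theta) = theta^2 with gradient descent:
    # dJ/dtheta = 2*theta, so theta := theta - alpha * 2 * theta.
    # Returns the distance from the minimum at theta = 0.
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return abs(theta)
```

Starting from θ = 1: a small α makes slow progress, a moderate α converges quickly, and an α above 1 (for this cost) overshoots more each step and diverges.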
Gradient descent algorithm (applied to linear regression):
repeat until convergence {
    θ_0 := θ_0 − α · (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
    θ_1 := θ_1 − α · (1/m) Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
}
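Putting the pieces together, a runnable sketch of batch gradient descent for one-variable linear regression (the function name `gradient_descent` and the default α and iteration count are illustrative choices):

```python
def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    """Batch gradient descent for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        # errors h(x_i) - y_i at the current parameters
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errs) / m
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m
        # simultaneous update of both parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1
```

On data generated from y = 2x + 1 this converges to θ_0 ≈ 1 and θ_1 ≈ 2.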
Installing Octave: