Supervised Learning:

We give the algorithm a data set in which the "right answers" are given, and the task of the algorithm is to produce more of these right answers. When the output is a continuous value, this is called a regression problem.

example: predicting continuous values (regression), predicting discrete values (classification).

Unsupervised Learning:

We're just told: here is a data set, can you find some structure in the data? An algorithm that does this is called a clustering algorithm.

example: grouping news stories, social network analysis, market segmentation.
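The notes don't name a specific clustering algorithm; as an illustration, here is a minimal 1-D k-means sketch (the function name, the sample points, and the initial centers are all made up for this example):

```python
# Minimal 1-D k-means sketch: alternately assign each point to its nearest
# center, then move each center to the mean of its assigned points.
def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # assignment step: attach each point to its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[j].append(p)
        # update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers, clusters

# Two well-separated groups end up with one center each:
kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.5, 9.5], [0.0, 5.0])
```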

 

Supervised Learning

linear regression

m = Number of training examples.

x's = "input" variable / features

y's = "output" variable / "target" variable

(x , y) = one training example

$(x^{(i)}, y^{(i)})$ = the $i$-th training example

hypothesis example: $h_\theta(x) = \theta_0 + \theta_1 x$

Linear regression most commonly uses the mean squared error.

cost function: $J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$
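The cost function above translates directly into code. A minimal sketch (the function name `cost` and the toy data are made up for this example):

```python
# Squared-error cost J(theta0, theta1) for the hypothesis
# h(x) = theta0 + theta1 * x over m training examples.
def cost(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)

# For data lying exactly on y = 2x, the cost at (theta0, theta1) = (0, 2)
# is zero, the value the "goal" above is minimizing toward:
xs, ys = [1, 2, 3], [2, 4, 6]
cost(0, 2, xs, ys)   # -> 0.0
```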

3-D surface plot:


contour figures:

Each of these ovals is a set of points that take on the same value of $J(\theta_0, \theta_1)$.


Each point $(\theta_0, \theta_1)$ on the contour plot corresponds to one hypothesis line $h_\theta(x) = \theta_0 + \theta_1 x$ in the data plane.

Gradient descent 

a more general algorithm

Outline: start with some $(\theta_0, \theta_1)$, then keep changing them to reduce $J(\theta_0, \theta_1)$, until we hopefully end up at a minimum.

 

Depending on the starting point, gradient descent can end up at different local minima.

repeat until convergence {

         $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$     (for $j = 0$ and $j = 1$)

}

α : learning rate 

!!  update $\theta_0$ and $\theta_1$ simultaneously  !!
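The simultaneous update can be sketched in code. Both partial derivatives are evaluated at the old $(\theta_0, \theta_1)$ before either parameter is overwritten (the function name `step` and the arguments `d0`/`d1`, standing for $\frac{\partial J}{\partial \theta_0}$ and $\frac{\partial J}{\partial \theta_1}$, are made up for this example):

```python
# One gradient-descent step with a simultaneous update: compute both
# new values from the OLD parameters, then assign both at once.
def step(theta0, theta1, alpha, d0, d1):
    temp0 = theta0 - alpha * d0(theta0, theta1)
    temp1 = theta1 - alpha * d1(theta0, theta1)
    return temp0, temp1   # both assigned only after both are computed

# e.g. for J = theta0**2 + theta1**2, the partials are 2*theta0 and 2*theta1:
step(1.0, 1.0, 0.1, lambda a, b: 2 * a, lambda a, b: 2 * b)   # -> (0.8, 0.8)
```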

Why each step moves closer to the minimum:

If $\theta_1$ is to the right of the minimum, the derivative $\frac{d}{d\theta_1} J(\theta_1)$ is positive, so the update decreases $\theta_1$; if it is to the left, the derivative is negative, so the update increases $\theta_1$. Either way, $\theta_1$ moves toward the minimum.

If α is too small or too large:

if α is too small, gradient descent can be slow.

if α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
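Both failure modes are easy to see on a toy cost, say $J(\theta) = \theta^2$ with gradient $2\theta$ (the function name `descend` and the specific α values are made up for this example):

```python
# Repeated gradient-descent steps on J(theta) = theta**2 (gradient 2*theta).
def descend(theta, alpha, steps):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

descend(1.0, 0.01, 100)   # small alpha: inches slowly toward the minimum at 0
descend(1.0, 1.5, 10)     # large alpha: |theta| grows every step -> diverges
```

Each step multiplies θ by (1 − 2α), so |1 − 2α| < 1 converges and |1 − 2α| > 1 diverges.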

 


As we approach a local minimum, gradient descent will automatically take smaller steps, because the derivative term shrinks toward zero. So there is no need to decrease α over time.


Gradient descent algorithm:

repeat until convergence {

         $\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)$

         $\theta_1 := \theta_1 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x^{(i)}$

}
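Putting the pieces together, gradient descent for linear regression can be sketched end to end (the function name, α, step count, and the toy data on the line y = 2x are all made up for this example; the parameters should approach $(\theta_0, \theta_1) = (0, 2)$):

```python
# Gradient descent for linear regression with h(x) = theta0 + theta1 * x,
# using the two update rules above with a simultaneous update.
def gradient_descent(xs, ys, alpha=0.05, steps=2000):
    m = len(xs)
    theta0 = theta1 = 0.0
    for _ in range(steps):
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        d0 = sum(errs) / m                               # dJ/dtheta0
        d1 = sum(e * x for e, x in zip(errs, xs)) / m    # dJ/dtheta1
        # simultaneous update of both parameters
        theta0, theta1 = theta0 - alpha * d0, theta1 - alpha * d1
    return theta0, theta1

gradient_descent([1, 2, 3], [2, 4, 6])   # -> roughly (0.0, 2.0)
```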

Installing octave:


 

 
