Machine Learning Foundations (机器学习基石): Mathematical Foundations
Hsuan-Tien Lin (林轩田), Associate Professor, Computer Science and Information Engineering
Learning to Answer Yes/No
learning: the algorithm $\mathcal{A}$ takes the data $\mathcal{D}$ and the hypothesis set $\mathcal{H}$ to get a final hypothesis $g$
Perceptron Hypothesis Set
Combine the features into a single weighted score
- $\mathbf{x} = (x_1, x_2, \ldots, x_d)$: features of the customer
- multiply each feature (dimension) by its weight, then add them up
- approve credit if $\sum_{i=1}^{d} w_i x_i > \text{threshold}$
- deny credit if $\sum_{i=1}^{d} w_i x_i < \text{threshold}$
- $y \in \{+1(\text{good}), -1(\text{bad})\}$ (an exact-0 score is ignored)
- the linear formula $h \in \mathcal{H}$ is $h(\mathbf{x}) = \operatorname{sign}\big(\big(\sum_{i=1}^{d} w_i x_i\big) - \text{threshold}\big)$
- to simplify, let $w_0 = -\text{threshold}$ and $x_0 = +1$:
- $h(\mathbf{x}) = \operatorname{sign}\big(\sum_{i=0}^{d} w_i x_i\big) = \operatorname{sign}(\mathbf{w}^{\mathsf{T}}\mathbf{x})$ (a vector inner product)
- each $\mathbf{w}$ represents a hypothesis $h$; different parameters correspond to different functions
- Perceptron in $\mathbb{R}^2$: $h(\mathbf{x}) = \operatorname{sign}(w_0 + w_1 x_1 + w_2 x_2)$
- a two-dimensional perceptron
- different parameter choices give different classifiers with different effects
- setting $w_0 + w_1 x_1 + w_2 x_2 = 0$ gives a line, so perceptrons are linear (binary) classifiers
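The hypothesis $h(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^{\mathsf{T}}\mathbf{x})$ can be sketched in a few lines of plain Python. This is a minimal sketch, not the course's code; the weight values are made up for illustration, and the exact-0 score (ignored in the lecture) is mapped to $-1$ here.

```python
# Minimal sketch of a perceptron hypothesis h(x) = sign(w . x),
# using the x_0 = +1 trick to fold the threshold into w[0].
def sign(s):
    # the exact-0 case is ignored in the lecture; we map it to -1 here
    return 1 if s > 0 else -1

def perceptron(w, x):
    # x is the raw feature vector (x_1, ..., x_d); prepend x_0 = +1
    x = [1.0] + list(x)
    return sign(sum(wi * xi for wi, xi in zip(w, x)))

# example weights: w_0 = -threshold = -2, so approve iff x_1 + x_2 > 2
w = [-2.0, 1.0, 1.0]
print(perceptron(w, [3.0, 1.0]))   # score 2 > 0  -> +1 (approve)
print(perceptron(w, [0.5, 0.5]))   # score -1 < 0 -> -1 (deny)
```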
Perceptron Learning Algorithm (PLA)
$\mathcal{H}$ includes all possible perceptrons (infinitely many), so how do we select $g$?
- want, necessary, difficult, idea
- what we want: $g \approx f$ (hard when $f$ is unknown)
- what is feasible: on the known data $\mathcal{D}$, ideally require $g(\mathbf{x}_n) = f(\mathbf{x}_n) = y_n$
- start from some line $g_0$ (its weight vector $\mathbf{w}_0$), then gradually correct its mistakes
Steps (for $t = 0, 1, \ldots$)
- find a mistake of $\mathbf{w}_t$: a point $(\mathbf{x}_{n(t)}, y_{n(t)})$ with $\operatorname{sign}(\mathbf{w}_t^{\mathsf{T}}\mathbf{x}_{n(t)}) \neq y_{n(t)}$
- correct the mistake: $\mathbf{w}_{t+1} \leftarrow \mathbf{w}_t + y_{n(t)}\mathbf{x}_{n(t)}$
- whether an inner product is positive or negative can be judged from the angle between the two vectors
- the correction rotates the weight vector, changing that angle in the right direction
- A fault confessed is half redressed.
Cyclic PLA
- 'correct' mistakes on $\mathcal{D}$ until a full cycle encounters no mistakes
- find the next mistake: follow the naive cycle $(1, \ldots, N)$ or a precomputed random cycle
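The cyclic procedure above can be sketched as follows. This is a minimal illustration with a made-up toy dataset (labels follow $\operatorname{sign}(x_1 - x_2)$); each sample already includes $x_0 = +1$.

```python
# Sketch of cyclic PLA: sweep the data, correct the first mistake found,
# and halt after a full cycle with no mistakes.
def sign(s):
    return 1 if s > 0 else -1

def pla(X, y):
    w = [0.0] * len(X[0])              # start from the zero vector
    halted = False
    while not halted:
        halted = True                  # full cycle with no mistake => halt
        for xn, yn in zip(X, y):
            if sign(sum(wi * xi for wi, xi in zip(w, xn))) != yn:
                # correct the mistake: w <- w + y_n * x_n
                w = [wi + yn * xi for wi, xi in zip(w, xn)]
                halted = False
    return w

# toy linearly separable data (x_0 = 1 prepended); labels = sign(x_1 - x_2)
X = [[1.0, 2.0, 0.0], [1.0, 0.0, 2.0], [1.0, 3.0, 1.0], [1.0, 1.0, 3.0]]
y = [1, -1, 1, -1]
w = pla(X, y)
# the returned line classifies every training point correctly
assert all(sign(sum(wi * xi for wi, xi in zip(w, xn))) == yn
           for xn, yn in zip(X, y))
```

Since the toy data is linearly separable, the convergence guarantee in the next section ensures this loop halts.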
Open questions
- will the cycle necessarily halt?
- is the $g$ we obtain really close to the intended target $f$?
- how does $g$ perform on data outside $\mathcal{D}$?
Quiz question
- pay attention to the second option
Guarantee of PLA
- if PLA halts (no more mistakes can be found)
- (necessary condition) $\mathcal{D}$ allows some $\mathbf{w}$ to make no mistake
- call such a $\mathcal{D}$ linearly separable
linearly separable $\mathcal{D}$ ⇔ there exists a perfect $\mathbf{w}_f$ such that $y_n = \operatorname{sign}(\mathbf{w}_f^{\mathsf{T}}\mathbf{x}_n)$ for every $n$
- Proof 1
- the vector inner product is just a matrix product, e.g. $\mathbf{w}_f^{\mathsf{T}}\mathbf{w}_t$
- $\mathbf{w}_f^{\mathsf{T}}\mathbf{w}_{t+1} = \mathbf{w}_f^{\mathsf{T}}(\mathbf{w}_t + y_{n(t)}\mathbf{x}_{n(t)}) \geq \mathbf{w}_f^{\mathsf{T}}\mathbf{w}_t + \min_n y_n\mathbf{w}_f^{\mathsf{T}}\mathbf{x}_n > \mathbf{w}_f^{\mathsf{T}}\mathbf{w}_t$
- so $\mathbf{w}_t$ gets more aligned with $\mathbf{w}_f$ (the inner product keeps growing)
- known fact: $\mathbf{w}_t$ is changed only when it makes a mistake, i.e. only when $y_{n(t)}\mathbf{w}_t^{\mathsf{T}}\mathbf{x}_{n(t)} \leq 0$
- Proof 2
- $\|\mathbf{w}_t\|$ does not grow too fast (the length increase per update is bounded):
- $\|\mathbf{w}_{t+1}\|^2 = \|\mathbf{w}_t\|^2 + 2y_{n(t)}\mathbf{w}_t^{\mathsf{T}}\mathbf{x}_{n(t)} + \|y_{n(t)}\mathbf{x}_{n(t)}\|^2 \leq \|\mathbf{w}_t\|^2 + 0 + \max_n \|\mathbf{x}_n\|^2$
- therefore the angle between $\mathbf{w}_t$ and $\mathbf{w}_f$ keeps shrinking, bounded below by 0 degrees
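Putting the two proofs together yields the standard convergence bound. A sketch, using the shorthands $R^2 = \max_n \|\mathbf{x}_n\|^2$ and $\rho = \min_n y_n \mathbf{w}_f^{\mathsf{T}}\mathbf{x}_n / \|\mathbf{w}_f\|$ (these symbols are not named in the notes above):

```latex
% Start from w_0 = 0. After T mistake corrections:
%   Proof 1 (applied T times):  w_f^T w_T >= T * ||w_f|| * rho
%   Proof 2 (applied T times):  ||w_T||^2 <= T * R^2
\frac{\mathbf{w}_f^{\mathsf{T}}\mathbf{w}_T}{\|\mathbf{w}_f\|\,\|\mathbf{w}_T\|}
\;\ge\; \frac{T\,\rho}{\sqrt{T}\,R} \;=\; \sqrt{T}\,\frac{\rho}{R}
\quad\Longrightarrow\quad
T \;\le\; \frac{R^2}{\rho^2}
% since the left-hand side is a cosine and therefore at most 1.
```

This is why the angle has a lower bound of 0 degrees: the cosine cannot exceed 1, so the number of corrections $T$ is finite on linearly separable data.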
- Quiz question
Non-Separable Data
linear separable: the inner product of $\mathbf{w}_f$ and $\mathbf{w}_t$ grows fast (the two vectors get closer and closer)
correct by mistake: the length of $\mathbf{w}_t$ grows slowly
PLA 'lines' are more and more aligned with $\mathbf{w}_f$ ⇒ PLA halts
Pros: simple to implement, fast, works in any dimension $d$
- Cons
- 'assumes' linearly separable $\mathcal{D}$ to halt (linear separability is only an assumption)
- not fully sure how long halting takes (the stopping time is unknown in advance)
Learning with Noisy Data
- find the line that makes the fewest mistakes:
- $\mathbf{w}_g = \operatorname{argmin}_{\mathbf{w}} \sum_{n=1}^{N} \big[\!\big[\, y_n \neq \operatorname{sign}(\mathbf{w}^{\mathsf{T}}\mathbf{x}_n) \,\big]\!\big]$
- the double brackets denote a boolean evaluation: 1 if the condition is true, 0 otherwise
- $\operatorname{argmin} f(x)$ denotes the set of arguments $x$ at which $f(x)$ attains its minimum; this combinatorial problem is NP-hard to solve
- Pocket Algorithm
- modify the PLA algorithm by keeping the best weights seen so far in the 'pocket'
- algorithm: initialize the pocket weights $\hat{\mathbf{w}}$; for $t = 0, 1, \ldots$: find a (random) mistake of $\mathbf{w}_t$, correct it with the PLA rule, and if $\mathbf{w}_{t+1}$ makes fewer mistakes on $\mathcal{D}$ than $\hat{\mathbf{w}}$, put $\mathbf{w}_{t+1}$ in the pocket; after enough iterations, return $\hat{\mathbf{w}}$ as $g$
- a simple modification of PLA to find (somewhat) 'best' weights
- on a linearly separable dataset Pocket also finds a perfect line, but it runs slower than PLA (each update must count mistakes over the whole dataset)
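The pocket procedure can be sketched like this. A minimal illustration with a made-up noisy toy dataset (one label deliberately flipped, so no perfect line exists); the update budget and seed are arbitrary choices, not part of the lecture.

```python
import random

# Sketch of the pocket algorithm: PLA-style random corrections,
# but keep the best-so-far weights "in the pocket".
def sign(s):
    return 1 if s > 0 else -1

def mistakes(w, X, y):
    # number of points the line w misclassifies on the whole dataset
    return sum(sign(sum(wi * xi for wi, xi in zip(w, xn))) != yn
               for xn, yn in zip(X, y))

def pocket(X, y, updates=100, seed=0):
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    best_w, best_err = w, mistakes(w, X, y)
    for _ in range(updates):
        # pick a random mistake of the current w (if any remain)
        wrong = [(xn, yn) for xn, yn in zip(X, y)
                 if sign(sum(wi * xi for wi, xi in zip(w, xn))) != yn]
        if not wrong:
            break
        xn, yn = rng.choice(wrong)
        w = [wi + yn * xi for wi, xi in zip(w, xn)]   # PLA correction
        err = mistakes(w, X, y)
        if err < best_err:                            # pocket the better line
            best_w, best_err = w, err
    return best_w

# noisy toy data (x_0 = 1 prepended); the last label is flipped noise
X = [[1.0, 2.0, 0.0], [1.0, 0.0, 2.0], [1.0, 3.0, 1.0], [1.0, 1.0, 3.0],
     [1.0, 4.0, 0.0]]
y = [1, -1, 1, -1, -1]
w = pocket(X, y)
# the pocketed line is never worse than the initial w = 0 (2 mistakes here)
assert mistakes(w, X, y) <= 2
```

Note the design choice the notes point out: every update pays a full pass over $\mathcal{D}$ in `mistakes`, which is exactly why Pocket is slower than plain PLA on separable data.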