Lecture 6: Theory of Generalization
Restriction of Break Point


$m_H(N) \le$ maximum possible $m_H(N)$ given $k$ $\le \text{poly}(N)$
Fun Time
When minimum break point k = 1, what is the maximum possible mH(N) when N=3?
1. 1 ✓ 2. 2 3. 3 4. 4
Explanation
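Since $k = 1$, not even a single point can be shattered: each point can take only one label across all dichotomies, so no second dichotomy can coexist with the first and $m_H(N) = 1$.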
Bounding Function: Basic Cases
Bounding Function
bounding function B(N,k):
maximum possible mH(N) when break point = k
B(N,k)≤poly(N)
In other words, $B(N,k)$ is an upper bound on $m_H(N)$.
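To make the definition concrete, the sketch below brute-forces $B(N,k)$ for very small $N$: it tries every set of dichotomies and keeps the largest one in which no $k$ points are shattered. The names `shatters` and `brute_force_B` are illustrative and not from the course, and the search is exponential, so it is only meant for $N \le 4$.

```python
from itertools import combinations, product


def shatters(dichotomies, points):
    """Return True if `dichotomies` show all 2^len(points) patterns on `points`."""
    patterns = {tuple(d[i] for i in points) for d in dichotomies}
    return len(patterns) == 2 ** len(points)


def brute_force_B(N, k):
    """Largest number of dichotomies on N points such that no k points are shattered."""
    all_dichotomies = list(product((0, 1), repeat=N))
    if k > N:
        # Fewer than k points exist, so the break point imposes no restriction.
        return len(all_dichotomies)
    k_subsets = list(combinations(range(N), k))
    # Try set sizes from largest to smallest; the first feasible size is the maximum.
    for size in range(len(all_dichotomies), 0, -1):
        for candidate in combinations(all_dichotomies, size):
            if not any(shatters(candidate, pts) for pts in k_subsets):
                return size
    return 0


print(brute_force_B(3, 1))  # the Fun Time case: N = 3, k = 1
print(brute_force_B(3, 2))
```

For $N = 3$, $k = 1$ the search finds that only a single dichotomy survives, which agrees with the Fun Time answer above.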
Table of Bounding Function
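The basic cases of the table follow directly from the definition of a break point. As a short summary, stated from the definition rather than copied from the original table:

$$
B(N,k) =
\begin{cases}
1 & \text{if } k = 1 \\
2^N & \text{if } N < k \\
2^N - 1 & \text{if } N = k
\end{cases}
$$

With $k = 1$ only one dichotomy is allowed; with $N < k$ there are not enough points for the break point to restrict anything; with $N = k$ at least one of the $2^N$ dichotomies has to be removed. The entries with $N > k$ need the inductive argument.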

Fun Time
For 2D perceptrons, which of the following claims is true?
1 minimum break point k = 2
2 mH(4)= 15
3 mH(N)<B(N,k) when $N = k = $ minimum break point ✓
4 mH(N)>B(N,k) when $N = k = $ minimum break point
Explanation
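minimum break point k = 4 (some sets of 3 points can be shattered, no set of 4 points can)
$m_H(4) = 14$
$B(N,k)$ is an upper bound on $m_H(N)$, and here it is loose: $B(4,4) = 15 > 14 = m_H(4)$
If you do not remember the 2D perceptron, review Effective Number of Hypotheses in Lecture 5: Training versus Testing.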
Bounding Function: Inductive Cases
B(4,3)=11=2α+β

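$$
\begin{aligned}
B(N,k) &= 2\alpha + \beta \\
\alpha + \beta &\le B(N-1,k) \\
\alpha &\le B(N-1,k-1) \\
\Rightarrow B(N,k) &\le B(N-1,k) + B(N-1,k-1) \\
B(N,k) &\le \sum_{i=0}^{k-1} \binom{N}{i}
\end{aligned}
$$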

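The $\le$ in fact holds with equality, that is:
$$
\begin{aligned}
B(N,k) &= B(N-1,k) + B(N-1,k-1) \\
B(N,k) &= \sum_{i=0}^{k-1} \binom{N}{i} = C_N^0 + C_N^1 + \dots + C_N^{k-1}
\end{aligned}
$$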

2D perceptrons: break point at 4, so $m_H(N) \le B(N,4) = \frac{1}{6}N^3 + \frac{5}{6}N + 1 = O(N^3)$
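The recurrence, the binomial sum, and the cubic polynomial for $k = 4$ can be checked against each other numerically. The following is a minimal sketch; the function names are illustrative, and the base cases ($B(N,1) = 1$, $B(N,k) = 2^N$ for $N < k$, $B(N,k) = 2^N - 1$ for $N = k$) are assumed from the definition of a break point.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def B_recursive(N, k):
    """Bounding function via B(N,k) = B(N-1,k) + B(N-1,k-1) with the basic cases."""
    if k == 1:
        return 1
    if N < k:
        return 2 ** N
    if N == k:
        return 2 ** N - 1
    return B_recursive(N - 1, k) + B_recursive(N - 1, k - 1)


def B_closed_form(N, k):
    """Bounding function as the binomial sum over i = 0 .. k-1."""
    return sum(comb(N, i) for i in range(k))


def B_poly_k4(N):
    """(1/6) N^3 + (5/6) N + 1, written with integer arithmetic."""
    return (N ** 3 + 5 * N + 6) // 6


# Print the table of B(N, k) for N, k = 1 .. 6 (rows are N, columns are k).
for N in range(1, 7):
    print(" ".join(f"{B_recursive(N, k):3d}" for k in range(1, 7)))
```

The row for $N = 4$ contains $B(4,3) = 11$, the value used in the inductive argument above.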
Fun Time
For 1D perceptrons (positive and negative rays), we know that mH(N) = 2N. Let k be the minimum break point. Which of the following is not true?
1 k = 3
2 for some integers $N>0$, $m_H(N)=\sum_{i=0}^{k-1}\binom{N}{i}$
3 for all integers $N>0$, $m_H(N)=\sum_{i=0}^{k-1}\binom{N}{i}$ ✓
4 for all integers $N>2$, $m_H(N)<\sum_{i=0}^{k-1}\binom{N}{i}$
Explanation
minimum break point k = 3
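$B(N,k)=\sum_{i=0}^{k-1}\binom{N}{i}$
$B(N,k)$ is an upper bound on $m_H(N)$. For this hypothesis set, $m_H(N) = B(N,k)$ when $N < k$ (both equal $2^N$), and $m_H(N) = 2N < B(N,3) = \frac{1}{2}N^2 + \frac{1}{2}N + 1$ when $N \ge k$.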
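The comparison can be reproduced by enumerating the dichotomies directly. This sketch assumes $N$ distinct points sorted on a line and models a positive ray as labelling the first `t` points $-1$ and the rest $+1$ (a negative ray flips every label); `ray_dichotomies` and `bound` are illustrative names.

```python
from math import comb


def ray_dichotomies(N):
    """Distinct dichotomies of N sorted points under positive and negative rays."""
    found = set()
    for t in range(N + 1):  # the threshold sits just after the first t points
        positive = tuple([-1] * t + [+1] * (N - t))
        negative = tuple(-label for label in positive)
        found.add(positive)
        found.add(negative)
    return found


def bound(N, k):
    """B(N, k) as the binomial sum over i = 0 .. k-1."""
    return sum(comb(N, i) for i in range(k))


for N in range(1, 7):
    m = len(ray_dichotomies(N))
    relation = "equal" if m == bound(N, 3) else "strictly below"
    print(N, m, bound(N, 3), relation)
```

Equality holds only for $N = 1$ and $N = 2$, which is why the "for all integers" claim is the false one.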
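Extension: revisit the Fun Time from Effective Number of Hypotheses in Lecture 5: Training versus Testing.
Find the number of effective dichotomies of 5 points for 2D perceptrons ($k=4$, $N=5$, $m_H(N) = ? \le \frac{1}{6}N^3+\frac{5}{6}N+1$). Here $N>k$, and equality is not attained.
The correct answer is $22 < \frac{125}{6}+\frac{25}{6}+1 = 26$, so the bound checks out. Revisiting the earlier question is worthwhile.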
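The count of 22 can be reproduced under one geometric assumption that is not spelled out in these notes: for points in convex position (for example on a circle), a line can separate a dichotomy exactly when the $+1$ points form a contiguous arc in cyclic order. The sketch below enumerates those arcs; `convex_position_dichotomies` and `bound` are illustrative names.

```python
from math import comb


def convex_position_dichotomies(N):
    """Dichotomies whose +1 points form a contiguous cyclic arc (assumed separable)."""
    found = set()
    for start in range(N):
        for length in range(N + 1):  # length 0 is all -1, length N is all +1
            labels = [-1] * N
            for offset in range(length):
                labels[(start + offset) % N] = +1
            found.add(tuple(labels))
    return found


def bound(N, k):
    """B(N, k) as the binomial sum over i = 0 .. k-1."""
    return sum(comb(N, i) for i in range(k))


for N in (4, 5):
    print(N, len(convex_position_dichotomies(N)), bound(N, 4))
```

Under this assumption the enumeration gives 14 dichotomies for $N = 4$ and 22 for $N = 5$, against bounds of 15 and 26.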
A Pictorial Proof

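Replace $E_{out}$ (infinitely many possible values) with $E_{in}'$ (finitely many), where $E_{in}'$ is computed on a second sample of size $N$. I have not worked out where this inequality and its coefficient of $\frac{1}{2}$ come from.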

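The upper bound is then expressed in terms of $m_H(2N)$, because $E_{in}$ and $E_{in}'$ together depend on only $2N$ points.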

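Apply Hoeffding's inequality for sampling without replacement. The result is similar, except that $\nu = E_{in}$ and $\mu = \frac{E_{in}+E_{in}'}{2}$.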
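Written as one chain, and following the course's sketch rather than a rigorous proof, the three steps combine as shown here. One common reading of the factor $\frac{1}{2}$, offered as a heuristic: if some $h$ has $|E_{in}(h)-E_{out}(h)|>\epsilon$, then for large enough $N$ a fresh sample gives $|E_{in}'(h)-E_{out}(h)|\le\frac{\epsilon}{2}$ with probability at least $\frac{1}{2}$, and in that case $|E_{in}(h)-E_{in}'(h)|>\frac{\epsilon}{2}$.

$$
\begin{aligned}
\tfrac{1}{2}\,P\left[\exists h \in H \text{ s.t. } |E_{in}(h)-E_{out}(h)|>\epsilon\right]
&\le P\left[\exists h \in H \text{ s.t. } |E_{in}(h)-E_{in}'(h)|>\tfrac{\epsilon}{2}\right] \\
&\le m_H(2N)\,P\left[\text{fixed } h \text{ s.t. } |E_{in}(h)-E_{in}'(h)|>\tfrac{\epsilon}{2}\right] \\
&= m_H(2N)\,P\left[\text{fixed } h \text{ s.t. } \left|E_{in}(h)-\tfrac{E_{in}(h)+E_{in}'(h)}{2}\right|>\tfrac{\epsilon}{4}\right] \\
&\le m_H(2N)\cdot 2\exp\left(-2\left(\tfrac{\epsilon}{4}\right)^2 N\right)
\end{aligned}
$$

Multiplying both sides by 2 gives the bound $4\,m_H(2N)\exp\left(-\frac{1}{8}\epsilon^2 N\right)$.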
Vapnik-Chervonenkis (VC) bound
$$P\left[\exists h \in H \text{ s.t. } |E_{in}(h)-E_{out}(h)|>\epsilon\right] \le 4\,m_H(2N)\exp\left(-\tfrac{1}{8}\epsilon^2 N\right)$$
mH(N) can replace M with a few changes
Fun Time
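For positive rays, $m_H(N)=N+1$. Plug it into the VC bound for $\epsilon = 0.1$ and $N = 10000$. What is the VC bound on BAD events?
$$P\left[\exists h \in H \text{ s.t. } |E_{in}(h)-E_{out}(h)|>\epsilon\right] \le 4\,m_H(2N)\exp\left(-\tfrac{1}{8}\epsilon^2 N\right)$$
1 $2.77\times10^{-87}$
2 $5.54\times10^{-83}$
3 $2.98\times10^{-1}$ ✓
4 $2.29\times10^{-2}$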
Explanation
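Substitute $m_H(2N) = 2N + 1 = 20001$, $\epsilon = 0.1$, and $N = 10000$ into the formula: $4 \cdot 20001 \cdot \exp(-12.5)$, which evaluates to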
0.2981471603789822
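The substitution can be reproduced in a few lines. This is a minimal sketch; `vc_bound` and `positive_rays` are illustrative names, and the growth function is passed in so that other hypothesis sets can be tried.

```python
from math import exp


def vc_bound(growth, N, eps):
    """Right-hand side of the VC bound: 4 * m_H(2N) * exp(-eps^2 * N / 8)."""
    return 4 * growth(2 * N) * exp(-(eps ** 2) * N / 8)


def positive_rays(N):
    """Growth function of positive rays, m_H(N) = N + 1."""
    return N + 1


print(vc_bound(positive_rays, N=10000, eps=0.1))
```

The printed value is about 0.298, so even with 10000 examples the bound is far from tight for this simple hypothesis set.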
Summary
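These notes cover the meaning and derivation of the bounding function $B(N,k)$ and of the VC bound.
Lecture summary
If $m_H(N)$ has a break point and $N$ is large enough, then $E_{out} \approx E_{in}$.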
Restriction of Break Point
break point ‘breaks’ consequent points
Bounding Function: Basic Cases
B(N,k) bounds mH(N) with break point k
Bounding Function: Inductive Cases
B(N,k) is poly(N)
A Pictorial Proof
mH(N) can replace M with a few changes
References
《Machine Learning Foundations》(机器学习基石)—— Hsuan-Tien Lin (林轩田)