The line of thinking is:
(1) modify the cost function of logistic regression, and then write down the optimization objective.

(2) discuss the properties of this new optimization function.

(3) introduce kernel functions.

1 optimization objective

The optimization objective of the SVM is:

$$\min_{\theta}\; C\sum_{i=1}^{m}\Big[y_i\,\mathrm{cost}_1(\theta^T x_i) + (1-y_i)\,\mathrm{cost}_0(\theta^T x_i)\Big] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$$

where:

$C$ plays the role of $1/\lambda$ in regularization.

$\mathrm{cost}_1(z)$ and $\mathrm{cost}_0(z)$ are shown below.

(figure: $\mathrm{cost}_1(z)$, which is zero for $z \ge 1$, and $\mathrm{cost}_0(z)$, which is zero for $z \le -1$)
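A minimal numerical sketch of these two costs (the figures use hinge-style piecewise-linear surrogates; the slope of 1 below is an assumption, since the exact slope is not specified):

```python
import numpy as np

def cost1(z):
    """Cost for a positive example (y = 1): zero once z >= 1, linear below."""
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    """Cost for a negative example (y = 0): zero once z <= -1, linear above."""
    return np.maximum(0.0, 1.0 + z)

z = np.array([-2.0, 0.0, 2.0])
print(cost1(z))  # [3. 1. 0.]
print(cost0(z))  # [0. 1. 3.]
```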

2 nature of this new optimization function

In this function, if $C$ is effectively infinite, the objective reduces to:

$$\min_{\theta}\; \frac{1}{2}\sum_{j=1}^{n}\theta_j^2 \quad \text{subject to} \quad \theta^T x_i \ge 1 \;\text{if}\; y_i = 1, \qquad \theta^T x_i \le -1 \;\text{if}\; y_i = 0$$

We hope that:
For a positive point ($y=1$), $\mathrm{cost}_1(\theta^T x_i) = 0$, which requires $z = \theta^T x_i \ge 1$.
For a negative point ($y=0$), $\mathrm{cost}_0(\theta^T x_i) = 0$, which requires $z = \theta^T x_i \le -1$.

(figure: candidate decision boundaries; the black line keeps the maximum margin from both classes, while the green and pink lines pass close to the data)
In this situation ($C$ effectively infinite), the machine will choose the black line as the decision boundary, because we require $\theta^T x_i$ to reach $1$ or $-1$ at the margin instead of merely crossing $0$.

This maximum-margin classifier provides a more reliable decision boundary. We can imagine that if the boundary were the green or the pink line, a little noise could already cause misclassification.
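This behavior can be sketched with scikit-learn on assumed toy data; with a very large $C$, the fitted linear SVM behaves like a hard-margin, maximum-margin classifier:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters (toy data, assumed for illustration).
X = np.array([[1.0, 1.0], [1.5, 0.5], [2.0, 1.5],
              [4.0, 4.0], [4.5, 3.5], [5.0, 4.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves essentially no slack: the margin constraints
# theta^T x >= 1 / <= -1 must hold, giving the maximum-margin boundary.
clf = SVC(kernel="linear", C=1e6).fit(X, y)
pred = clf.predict([[1.2, 1.0], [4.8, 4.0]])
print(pred)  # [0 1]
```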

3 kernel functions

The most commonly used kernel functions are the Gaussian kernel and the linear kernel.

Kernel functions are used to handle non-linear situations. For example:

(figure: a dataset whose classes can only be separated by a non-linear decision boundary)
The Gaussian kernel function is:

$$f_i = \exp\left(-\frac{\lVert x - x_i \rVert^2}{2\sigma^2}\right)$$
We take $x_1 = (3, 5)$ as a landmark, and the kernel centered at it looks as follows:

(figure: surface plot of the Gaussian kernel centered at $x_1 = (3, 5)$, peaking at 1 at the landmark and decaying toward 0 away from it)
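A minimal sketch of this similarity, using the landmark $x_1 = (3, 5)$ from the text:

```python
import numpy as np

def gaussian_kernel(x, landmark, sigma=1.0):
    """f = exp(-||x - l||^2 / (2 sigma^2)): ~1 near the landmark, ~0 far away."""
    diff = np.asarray(x, dtype=float) - np.asarray(landmark, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

x1 = (3.0, 5.0)                           # the landmark from the text
print(gaussian_kernel((3.0, 5.0), x1))    # 1.0 (exactly at the landmark)
print(gaussian_kernel((10.0, 10.0), x1))  # essentially 0 (far away)
```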
We find that if a point is close to $x_1 = (3, 5)$, the kernel value is close to 1; far away, it is close to 0. Using these similarities as features, we define a new optimization objective:
$$\min_{\theta}\; C\sum_{i=1}^{m}\Big[y_i\,\mathrm{cost}_1(\theta^T f_i) + (1-y_i)\,\mathrm{cost}_0(\theta^T f_i)\Big] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$$
where:
$f_i$ is the vector of Gaussian-kernel similarities between $x_i$ and the landmarks (a common choice is to use every training sample as a landmark).
Minimizing this objective yields $\theta$.
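A sketch of the resulting feature construction on assumed toy data: every training sample serves as a landmark, and each $x$ is mapped to its vector of similarities $f$, on which the objective is then minimized:

```python
import numpy as np

def gaussian_features(X, landmarks, sigma=1.0):
    """F[i, j] = exp(-||X[i] - landmarks[j]||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - landmarks[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # toy samples (assumed)
F = gaussian_features(X, X)   # landmarks = all training samples
print(F.shape)                # (3, 3): one similarity feature per landmark
print(np.diag(F))             # all 1.0: each point is most similar to itself
```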

When $\sigma$ is large, the kernel is wide and the model may underfit; when $\sigma$ is small, the kernel is narrow and the model may overfit.
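This trade-off can be illustrated with scikit-learn, whose RBF kernel uses $\gamma = 1/(2\sigma^2)$, so a small $\sigma$ means a large $\gamma$ (the XOR-like toy data below are assumed for illustration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # XOR-like, non-linear labels

scores = {}
for sigma in (0.05, 1.0, 50.0):
    clf = SVC(kernel="rbf", gamma=1.0 / (2.0 * sigma ** 2)).fit(X, y)
    scores[sigma] = clf.score(X, y)       # training accuracy
print(scores)
# small sigma: near-perfect training accuracy (overfitting tendency);
# very large sigma: the boundary flattens out (underfitting tendency)
```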

If we instead use the plain inner product, $K(x_i, x) = x_i^T x$ (no mapping, $f = x$), we call it the linear kernel.
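That equivalence can be checked directly: scikit-learn's linear kernel gives the same result as feeding the Gram matrix $X X^T$ to a precomputed-kernel SVM (toy data assumed):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 1.0], [1.0, 0.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 1, 1])

lin = SVC(kernel="linear").fit(X, y)
pre = SVC(kernel="precomputed").fit(X @ X.T, y)   # Gram matrix K = X X^T

X_new = np.array([[0.5, 0.5], [3.5, 3.5]])
print(lin.predict(X_new))           # built-in linear kernel
print(pre.predict(X_new @ X.T))     # explicit inner products: same result
```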

There are several other kernel functions, but they are used less commonly.

Reference

Link: 支持向量机通俗导论(理解SVM的三层境界) (A gentle introduction to support vector machines: understanding SVM at three levels).
