Support Vector Machine Learning for Interdependent and Structured Output Spaces

 

This paper describes how SVMs are formulated for structured outputs. It is the foundation for Learning Structural SVMs with Latent Variables.

 

Structural SVMs

What is a structural SVM? Pairs of inputs $x \in \mathcal{X}$ and outputs $y \in \mathcal{Y}$ are drawn from some fixed but unknown probability distribution. Unlike ordinary multiclass classification, where $\mathcal{Y} = \{1, \dots, k\}$ consists of interchangeable, arbitrarily numbered labels, here we consider a structured output space $\mathcal{Y}$. Elements of this space may be, for example, sequences, strings, labeled trees, lattices, or graphs.

 

 

Our goal:

Use a training sample of input-output pairs to learn a mapping $f: \mathcal{X} \to \mathcal{Y}$. Natural language parsing serves as the running example. The function of interest is $f$, which maps a given sentence $x$ to a parse tree $y$. Such a parse tree is shown in the figure below:

[Figure: example parse tree for a sentence]

 

We now want to learn a discriminant function $F: \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ over input-output pairs, so that for a given input $x$ we predict by finding the $y$ that maximizes $F$. Because $y$ is structured, its parts cannot be considered independently of one another. The hypotheses $f$ take the form

$$f(x; w) = \arg\max_{y \in \mathcal{Y}} F(x, y; w)$$

where $w$ denotes the parameter vector.

Here we assume that $F$ is linear in a joint feature representation $\Psi(x, y)$ of input-output pairs (the specific form of $\Psi$ depends on the problem at hand):

$$F(x, y; w) = \langle w, \Psi(x, y) \rangle$$
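To make the prediction rule concrete, here is a minimal Python sketch of $f(x; w) = \arg\max_y \langle w, \Psi(x, y) \rangle$ for a toy sequence-labeling task. The label alphabet `LABELS`, the feature map `joint_feature` (per-label sums of input vectors plus label-transition counts), and the weights are all illustrative assumptions, not taken from the paper, and the argmax is done by brute-force enumeration, which only works for tiny output spaces.

```python
import itertools
import numpy as np

LABELS = ("A", "B")  # hypothetical label alphabet

def joint_feature(x, y):
    """Hypothetical Psi(x, y): per-label sums of inputs, then transition counts."""
    k, d = len(LABELS), x.shape[1]
    emit = np.zeros((k, d))
    trans = np.zeros((k, k))
    for t, label in enumerate(y):
        a = LABELS.index(label)
        emit[a] += x[t]
        if t > 0:
            trans[LABELS.index(y[t - 1]), a] += 1
    return np.concatenate([emit.ravel(), trans.ravel()])

def predict(x, w):
    """f(x; w) = argmax_y <w, Psi(x, y)>, enumerating every label sequence."""
    candidates = itertools.product(LABELS, repeat=len(x))
    return max(candidates, key=lambda y: float(w @ joint_feature(x, y)))

# Three time steps with two input features each.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
# Weight layout: emissions for A (2), emissions for B (2), transitions (4).
w = np.array([1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
print(predict(x, w))
```

With these weights, label A is rewarded for the first input feature and B for the second, so the highest-scoring sequence is A, B, A. For realistic output spaces the enumeration would be replaced by a problem-specific search such as dynamic programming.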

 

Natural language parsing again provides a concrete illustration. In a parse tree $y$ for a sentence $x$, every node corresponds to a grammar rule $g_j$, which in turn has a score $w_j$. A parse tree is scored by the sum of the $w_j$ over all of its nodes. This score can be written as:

$F(x, y; w) = \langle w, \Psi(x, y) \rangle$, where $\Psi(x, y)$ is a histogram counting how often each grammar rule $g_j$ occurs in the parse tree. Using the CKY algorithm, $f(x; w)$ can be computed efficiently by finding the structure $y \in \mathcal{Y}$ that maximizes $F(x, y; w)$.
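The histogram feature map can be sketched in a few lines of Python. The grammar `RULES`, the nested-tuple tree encoding, and the weights below are made up for illustration. The sketch only scores a given tree; it does not implement the CKY search over all valid trees.

```python
from collections import Counter
import numpy as np

# Hypothetical toy grammar; the position in this list fixes the feature index j.
RULES = ["S -> NP VP", "NP -> Det N", "VP -> V NP",
         "Det -> the", "N -> dog", "N -> cat", "V -> chased"]

def rules_in(tree):
    """Yield the grammar rule applied at every node of a nested-tuple tree."""
    head, *children = tree
    if all(isinstance(c, str) for c in children):
        # Lexical rule: the children are words.
        yield f"{head} -> {' '.join(children)}"
        return
    yield f"{head} -> {' '.join(c[0] for c in children)}"
    for child in children:
        yield from rules_in(child)

def psi(tree):
    """Psi(x, y): how often each rule g_j occurs in the tree."""
    counts = Counter(rules_in(tree))
    return np.array([counts[r] for r in RULES], dtype=float)

tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("Det", "the"), ("N", "cat"))))

w = np.array([0.5, 1.0, 0.2, 0.1, 0.3, 0.3, 0.4])  # one score w_j per rule
score = float(w @ psi(tree))  # F(x, y; w) = <w, Psi(x, y)>
print(psi(tree), score)
```

Summing $w_j$ over the nodes and taking the inner product with the histogram give the same number, which is why the tree score is linear in $w$.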

 

 

To learn with structured $y$, we must also define a loss function, which quantifies the discrepancy between the true $y$ and the predicted $\hat{y}$:

loss function: $\Delta: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$
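As an illustration, two common choices of $\Delta$ for sequence outputs are sketched below in Python: the zero-one loss and an unnormalized Hamming loss. The function names are illustrative, and the outputs are assumed to be equal-length label sequences.

```python
def zero_one_loss(y, y_hat):
    """Delta(y, y_hat) = 0 if the whole structure is correct, else 1."""
    return 0.0 if tuple(y) == tuple(y_hat) else 1.0

def hamming_loss(y, y_hat):
    """Delta(y, y_hat) = number of positions where the labels differ."""
    if len(y) != len(y_hat):
        raise ValueError("sequences must have the same length")
    return float(sum(a != b for a, b in zip(y, y_hat)))

y_true = ("A", "B", "A", "A")
near_miss = ("A", "B", "A", "B")  # one wrong label
far_miss = ("B", "A", "B", "B")   # every label wrong

print(zero_one_loss(y_true, near_miss), zero_one_loss(y_true, far_miss))
print(hamming_loss(y_true, near_miss), hamming_loss(y_true, far_miss))
```

The zero-one loss treats both mistakes identically, while the Hamming loss separates a near miss from a completely wrong output. Both return 0 when the prediction equals the true output.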

 

 

Margins and Margin Maximization

 

The condition of zero training error can be written as the following set of nonlinear constraints:

$$\forall i: \quad \max_{y \in \mathcal{Y} \setminus y_i} \langle w, \Psi(x_i, y) \rangle < \langle w, \Psi(x_i, y_i) \rangle$$

Each nonlinear inequality above can be replaced by $|\mathcal{Y}| - 1$ linear inequalities, which yields a total of $n|\mathcal{Y}| - n$ linear constraints:

$$\forall i, \ \forall y \in \mathcal{Y} \setminus y_i: \quad \langle w, \delta\Psi_i(y) \rangle > 0$$

where $\delta\Psi_i(y) \equiv \Psi(x_i, y_i) - \Psi(x_i, y)$.
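The linear constraints can be checked directly on a toy problem. The Python sketch below assumes a plain multiclass setting with `K = 3` classes and a hypothetical feature map that copies $x$ into the block belonging to class $y$; the data and weights are made up.

```python
import numpy as np

K = 3  # number of classes, so |Y| = 3

def joint_feature(x, y):
    """Hypothetical Psi(x, y): x copied into the block of class y."""
    psi = np.zeros(K * x.size)
    psi[y * x.size:(y + 1) * x.size] = x
    return psi

def all_margins(w, X, Y):
    """<w, dPsi_i(y)> for every example i and every wrong output y != y_i."""
    margins = []
    for x, y_true in zip(X, Y):
        for y in range(K):
            if y != y_true:
                d_psi = joint_feature(x, y_true) - joint_feature(x, y)
                margins.append(float(w @ d_psi))
    return margins

X = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, -2.0]])
Y = [0, 1, 2]
w = np.array([1.0, 0.0, 0.0, 1.0, -1.0, -1.0])

margins = all_margins(w, X, Y)
print(len(margins))      # n|Y| - n constraints
print(min(margins) > 0)  # True exactly when the training error is zero
```

Here $n = 3$ and $|\mathcal{Y}| = 3$, so there are $3 \cdot 3 - 3 = 6$ constraints, and this particular $w$ satisfies all of them.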

 

 

 

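Hard-margin optimization

$$\min_{w} \ \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad \forall i, \ \forall y \in \mathcal{Y} \setminus y_i: \ \langle w, \delta\Psi_i(y) \rangle \ge 1, \qquad \delta\Psi_i(y) \equiv \Psi(x_i, y_i) - \Psi(x_i, y)$$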

 

 

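To allow errors on the training set, we introduce slack variables $\xi_i$ and optimize a soft-margin criterion:

$$\min_{w, \xi} \ \frac{1}{2} \|w\|^2 + \frac{C}{n} \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad \forall i: \ \xi_i \ge 0; \quad \forall i, \ \forall y \in \mathcal{Y} \setminus y_i: \ \langle w, \delta\Psi_i(y) \rangle \ge 1 - \xi_i$$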

 

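This formulation implicitly assumes a zero-one classification loss: a wrong prediction costs 1 and a correct one costs 0. For problems such as natural language parsing, where $|\mathcal{Y}|$ is very large, this is inappropriate. The authors describe two remedies. (1) Re-scale the slack variables according to the loss, so that predictions that deviate more from the true $y$ are penalized more heavily (see the right-hand side of (9)):

$$\forall i, \ \forall y \in \mathcal{Y} \setminus y_i: \quad \langle w, \delta\Psi_i(y) \rangle \ge 1 - \frac{\xi_i}{\Delta(y_i, y)}, \qquad \delta\Psi_i(y) \equiv \Psi(x_i, y_i) - \Psi(x_i, y) \tag{9}$$

(2) Re-scale the margin. This approach was proposed by Taskar et al. for the Hamming loss:

$$\forall i, \ \forall y \in \mathcal{Y} \setminus y_i: \quad \langle w, \delta\Psi_i(y) \rangle \ge \Delta(y_i, y) - \xi_i$$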
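The soft-margin variants differ only in how much slack a violated constraint requires. For a fixed $w$, the smallest feasible $\xi_i$ follows from rearranging each constraint. The Python sketch below computes it for one training example, given made-up values of the margins $\langle w, \delta\Psi_i(y) \rangle$ and losses $\Delta(y_i, y)$ for three wrong outputs. The function names are illustrative, and $\Delta(y_i, y) > 0$ is assumed for every $y \ne y_i$.

```python
import numpy as np

def slack_zero_one(margins):
    """Plain soft margin: xi_i >= 1 - <w, dPsi_i(y)> for all wrong y."""
    return max(0.0, float(np.max(1.0 - margins)))

def slack_rescaled(margins, losses):
    """Slack re-scaling: xi_i >= Delta(y_i, y) * (1 - <w, dPsi_i(y)>)."""
    return max(0.0, float(np.max(losses * (1.0 - margins))))

def margin_rescaled(margins, losses):
    """Margin re-scaling: xi_i >= Delta(y_i, y) - <w, dPsi_i(y)>."""
    return max(0.0, float(np.max(losses - margins)))

# Made-up values for one example i and three wrong outputs y.
margins = np.array([0.5, 2.0, -1.0])  # <w, dPsi_i(y)>
losses = np.array([1.0, 3.0, 2.0])    # Delta(y_i, y)

print(slack_zero_one(margins))
print(slack_rescaled(margins, losses))
print(margin_rescaled(margins, losses))
```

In this example the third output is both misranked (negative margin) and costly, so both re-scaled variants demand more slack than the plain soft margin does.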

 

 

At this point the SVM model is fully specified.

 

Next, the authors turn to the actual support vector machine learning, that is, solving the optimization problem. This part is hard!
