Notes on Learning to Ask Good Questions

Motivation

A principal goal of asking questions is to fill information gaps, typically through clarification questions. We take the perspective that a good question is the one whose likely answer will be useful.
In this work, we design a model to rank a candidate set of clarification questions by their usefulness to the given post.

This paper builds on the theory of Expected Value of Perfect Information (EVPI): it chooses to ask the question whose answer would contribute the most information to the post.

Model

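EVPI is a measure that answers the question: if I obtained information X, how useful would X be?
The authors model the task as a ranking problem over a candidate set. The more useful a question is, the higher it should rank.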

We formulate this task as a ranking problem on a set of potential clarification questions.

The authors define the EVPI of a question $q_i$ for a post $p$ as follows:
$EVPI(q_i|p)=\sum_{a_j \in A} \Bbb P[a_j|p,q_i] \Bbb U (p+a_j)$

The value of this question $q_i$ is the expected utility, over all possible answers.
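
A minimal sketch of how the EVPI sum could drive the ranking, assuming the answer probabilities and utilities are already available as plain numbers. The names `evpi` and `rank_questions` and all toy values are hypothetical and only for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def evpi(answer_probs: List[float], utilities: List[float]) -> float:
    """Expected utility of one question: sum_j P[a_j | p, q_i] * U(p + a_j)."""
    if len(answer_probs) != len(utilities):
        raise ValueError("need exactly one utility per candidate answer")
    return sum(prob * util for prob, util in zip(answer_probs, utilities))


def rank_questions(scores: Dict[str, float]) -> List[str]:
    """Return question ids sorted from highest to lowest EVPI."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy numbers (made up): two candidate answers a_1 and a_2.
utilities = [0.9, 0.2]        # U(p + a_1), U(p + a_2)
probs = {
    "q1": [0.75, 0.25],       # P[a_j | p, q1]
    "q2": [0.25, 0.75],       # P[a_j | p, q2]
}
scores = {q: evpi(p, utilities) for q, p in probs.items()}
ranking = rank_questions(scores)
```

With these toy values, `q1` ranks first because it puts more probability on the high-utility answer.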

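So the key is how to compute the two terms in the formula above. In this paper, both are computed by neural networks.
The behavior of the whole model at test time is as follows:
[Figure: the model at test time]
The parts are described one by one below: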

Question & answer candidate generator

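The open-source tool Lucene is used to find the 10 posts most similar to the current post. The questions asked about these 10 posts form the candidate question set. The edits made to those posts in response serve as the candidate answer set.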
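
A rough sketch of the candidate-generation idea. It does not use Lucene: a plain bag-of-words cosine similarity is swapped in as a stand-in for Lucene's retrieval, and the function names, the parallel `questions` and `answers` lists, and the toy posts are all hypothetical.

```python
import math
from collections import Counter
from typing import List


def bow(text: str) -> Counter:
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_k_similar(query: str, posts: List[str], k: int = 10) -> List[int]:
    """Indices of the k posts most similar to the query post."""
    q = bow(query)
    order = sorted(
        range(len(posts)), key=lambda i: similarity(q, bow(posts[i])), reverse=True
    )
    return order[:k]


# Toy data (made up): each post has one question asked about it and one answer.
posts = [
    "how to install python on ubuntu",
    "best pasta recipe",
    "python install fails on ubuntu 16.04",
]
questions = ["which ubuntu version?", "how many servings?", "what is the error?"]
answers = ["ubuntu 14.04", "four servings", "permission denied"]

neighbours = top_k_similar("cannot install python ubuntu", posts, k=2)
question_candidates = [questions[i] for i in neighbours]
answer_candidates = [answers[i] for i in neighbours]
```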

Answer modeling

Given a post $p$ and a question candidate $q_i$, our second step is to calculate how likely is this question to be answered using one of our answer candidates $a_j$.

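Here $q_i$ and $a_j$ both come from the candidate sets described in the previous part.
The authors combine the neural representations of the post and the question to form a representation of the answer $a_i$, so the distance between this answer representation and an answer candidate $a_j$ can be written as: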
$dist(F_{ans}(\overline p,\overline {q_i}), \hat {a_j})=1-cos\_sim(F_{ans}(\overline p,\overline {q_i}),\hat {a_j})$
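In other words, $\Bbb P[a_j|p,q_i]$ is proportional to the similarity between $q_i$ and $q_j$ (the question originally paired with $a_j$), and inversely proportional to $e$ raised to the power of the distance between the answer representation for $a_i$ and $a_j$.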
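
The distance and the proportionality can be sketched together as follows. This is an illustration under stated assumptions: normalizing the scores over the candidate set, treating the question similarities as precomputed non-negative numbers, and the names `cos_sim`, `dist`, and `answer_probs` are choices of this sketch, not details confirmed by the paper. The vectors are toy stand-ins for the neural representations.

```python
import math
from typing import List, Sequence


def cos_sim(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def dist(u: Sequence[float], v: Sequence[float]) -> float:
    """dist = 1 - cos_sim, as in the distance formula."""
    return 1.0 - cos_sim(u, v)


def answer_probs(
    predicted_answer: Sequence[float],
    candidate_answers: List[Sequence[float]],
    question_sims: List[float],
) -> List[float]:
    """Score each a_j as sim(q_i, q_j) * exp(-dist), then normalize (assumption)."""
    scores = [
        sim * math.exp(-dist(predicted_answer, a_hat))
        for a_hat, sim in zip(candidate_answers, question_sims)
    ]
    total = sum(scores)
    if total <= 0.0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


predicted = [1.0, 0.0]                   # stands in for F_ans(p, q_i)
candidates = [[2.0, 0.0], [0.0, 3.0]]    # stand-ins for candidate answer vectors
sims = [1.0, 1.0]                        # sim(q_i, q_j), taken as equal here
probs = answer_probs(predicted, candidates, sims)
```

Here the first candidate points in the same direction as the predicted answer (distance 0) and the second is orthogonal to it (distance 1), so the first receives the larger probability.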