Vowpal Wabbit LDA: Model Selection (Answers)

【Question title】: Vowpal Wabbit LDA: Model Selection
【Posted】: 2015-03-29 12:02:23
【Question】:

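Is there any way within VW (Vowpal Wabbit) to compare the model fit of LDA models? Is the progressive loss that the software prints meaningful for this purpose?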

【Comments】:

【Answer 1】:

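Running vw -h --lda 1 prints help that lists the parameters below. The metrics parameter is off by default. It is used to compute the topic coherence implemented here. Try enabling it by passing --metrics 1.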

Latent Dirichlet Allocation:
  --lda arg                             Run lda with <int> topics

  --lda_alpha arg (=0.100000001)        Prior on sparsity of per-document topic
                                        weights
  --lda_rho arg (=0.100000001)          Prior on sparsity of topic 
                                        distributions
  --lda_D arg (=10000)                  Number of documents
  --lda_epsilon arg (=0.00100000005)    Loop convergence threshold
  --minibatch arg (=1)                  Minibatch size, for LDA
  --math-mode arg (=0)                  Math mode: simd, accuracy, fast-approx
  --metrics arg (=0)                    Compute metrics
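
The flags above can be combined into one run per candidate topic count. The following is a minimal Python sketch, not an official recipe: it only assembles the command lines, using flags from the help output plus -d and --readable_model. The file names docs.vw and topics_<k>.txt, the minibatch size of 256, and the helper names are hypothetical and chosen purely for illustration.

```python
import shlex
import subprocess


def build_vw_lda_command(data_path, num_topics, num_docs,
                         readable_model_path, vw_binary="vw"):
    """Build the argument list for one VW LDA run with --metrics enabled."""
    return [
        vw_binary,
        "-d", data_path,                  # input file in VW format (hypothetical name)
        "--lda", str(num_topics),         # number of topics to fit
        "--lda_D", str(num_docs),         # number of documents in the corpus
        "--lda_alpha", "0.1",             # prior on per-document topic weights
        "--lda_rho", "0.1",               # prior on topic distributions
        "--minibatch", "256",             # illustrative value, not a recommendation
        "--metrics", "1",                 # off by default, per the help text
        "--readable_model", readable_model_path,
    ]


def run_vw_lda(command):
    """Run vw and return what it wrote to stderr.

    Assumption: vw prints its progress table (including the loss
    columns) to stderr, and the vw binary is on PATH.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stderr


# One command per candidate number of topics; nothing is executed here.
candidate_commands = {
    k: build_vw_lda_command("docs.vw", k, 10000, f"topics_{k}.txt")
    for k in (10, 20, 50)
}
for k, cmd in candidate_commands.items():
    print(k, shlex.join(cmd))
```

Passing one of these lists to run_vw_lda would execute the run, provided vw is installed. Keeping --lda_D and the priors fixed across runs means the topic count is the only thing that changes between the candidates being compared.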

Alternatively, jump straight to the source code of the vw utility.

A useful presentation that walks through most of the parameters can be found here.

Python: if you are using gensim

(You tagged this question with python.)

If you are using gensim, a model converted with vwmodel2ldamodel behaves as if Gensim had trained it itself, so you can evaluate it with Gensim's own tools or with other coherence measures.

A good tutorial on how to compare multiple LDA models can be found here.
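
To make the idea of comparing models by coherence concrete, here is a minimal sketch of the UMass coherence measure in plain Python. It is not gensim's or VW's implementation. It assumes you have already extracted each topic's top words (for example from a readable model file) and have the corpus as lists of tokens. The function names and the toy data are hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    top_words: the topic's words, ordered from most to least probable.
    documents: the corpus as a list of token lists.
    Sums log((D(w_m, w_l) + 1) / D(w_l)) over all pairs with l < m,
    where D counts the documents containing the given word(s).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # word never occurs in the corpus; skip the pair
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score


def mean_coherence(topics, documents):
    """Average UMass coherence over all topics of one model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)


# Toy corpus and two hypothetical models with two topics each.
documents = [
    ["cat", "dog", "pet"],
    ["cat", "dog", "vet"],
    ["stock", "bond", "market"],
    ["stock", "market", "trade"],
]
model_a = [["cat", "dog"], ["stock", "market"]]   # words that co-occur
model_b = [["cat", "stock"], ["dog", "market"]]   # words that never co-occur

score_a = mean_coherence(model_a, documents)
score_b = mean_coherence(model_b, documents)
print("model_a:", score_a)
print("model_b:", score_b)
```

A higher score means a topic's top words appear together in documents more often, so model_a scores above model_b here. Scores are only comparable when they are computed on the same corpus and with the same number of top words per topic.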

【Comments】:

【Answer 2】:

In the R statistical package, you can diagnose how well the model fits with a procedure like the one in this question:

How to compute the log-likelihood of the LDA model in vowpal wabbit

In that question I also ask what options exist for doing this within VW itself.

【Comments】: