Yao et al. (2019) give a clear description of point-wise mutual information (PMI):

\[\begin{aligned} PMI(i, j) &= \log \frac{p(i,j)}{p(i)p(j)} \\ p(i, j) &= \frac{\#(i,j)}{\#W} \\ p(i) &= \frac{\#(i)}{\#W} \end{aligned}\]

where \(\#(i)\) is the number of sliding windows in the corpus that contain word \(i\), \(\#(i,j)\) is the number of sliding windows that contain both words \(i\) and \(j\), and \(\#W\) is the total number of sliding windows in the corpus.
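The counting definitions above can be sketched in Python as follows. This is a minimal illustration, not the implementation from either paper: the helper names (`sliding_windows`, `window_counts`, `pmi`), the toy corpus, and the choice to treat a document shorter than the window as a single window are all assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations


def sliding_windows(tokens, window_size):
    """Return every contiguous window of `window_size` tokens.

    Assumption: a document shorter than the window forms one window.
    """
    if not tokens:
        return []
    if len(tokens) <= window_size:
        return [tokens]
    return [tokens[k:k + window_size]
            for k in range(len(tokens) - window_size + 1)]


def window_counts(docs, window_size):
    """Count #W, #(i) and #(i, j) over all sliding windows in `docs`."""
    num_windows = 0
    word_count = Counter()
    pair_count = Counter()
    for tokens in docs:
        for window in sliding_windows(tokens, window_size):
            num_windows += 1
            vocab = sorted(set(window))  # a word counts once per window
            word_count.update(vocab)
            # Pairs are stored in sorted order, e.g. ("a", "b").
            pair_count.update(combinations(vocab, 2))
    return num_windows, word_count, pair_count


def pmi(i, j, num_windows, word_count, pair_count):
    """PMI(i, j) = log(p(i, j) / (p(i) p(j))); -inf if i and j never co-occur."""
    n_ij = pair_count[tuple(sorted((i, j)))]
    if n_ij == 0:
        return float("-inf")
    p_ij = n_ij / num_windows
    p_i = word_count[i] / num_windows
    p_j = word_count[j] / num_windows
    return math.log(p_ij / (p_i * p_j))


# Toy corpus (hypothetical): one document, window size 2.
docs = [["a", "b", "c", "a"]]
num_windows, word_count, pair_count = window_counts(docs, window_size=2)
print(round(pmi("a", "b", num_windows, word_count, pair_count), 4))  # -0.2877
```

In the toy corpus there are three windows, each word appears in two of them, and each pair appears in one, so \(PMI(a, b) = \log\frac{1/3}{(2/3)(2/3)} = \log 0.75\).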

Substituting these counts into the definition cancels one factor of \(\#W\) and gives the simplified form of Levy et al. (2014):

\[PMI(i,j) = \log\frac{\#(i,j)\#W}{\#(i)\#(j)} \]

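For a fixed sliding-window size and corpus, \(\#W\) is a constant, so it contributes only the additive term \(\log \#W\):

\[PMI(i, j) = \log\frac{\#(i,j)}{\#(i)\#(j)} + \log \#W \]

Dropping this constant leaves the relative ordering of word pairs unchanged. The remaining term \(\log\frac{\#(i,j)}{\#(i)\#(j)}\) is no longer equal to PMI itself, however, so sign-based rules such as keeping only pairs with positive PMI do not carry over to it directly.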
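As a quick numerical check of the role of \(\#W\), the sketch below compares the count form that includes \(\#W\) with the count-only form. The counts are invented for illustration, and the function names `pmi_from_counts` and `count_only_score` are hypothetical.

```python
import math

# Hypothetical window counts for three word pairs: (#(i), #(j), #(i, j)).
num_windows = 1000  # #W, fixed for a given corpus and window size
pairs = {
    ("graph", "network"): (50, 40, 20),
    ("graph", "the"): (50, 600, 30),
    ("text", "network"): (80, 40, 2),
}


def pmi_from_counts(n_i, n_j, n_ij, n_w):
    """PMI written in counts: log(#(i,j) * #W / (#(i) * #(j)))."""
    return math.log(n_ij * n_w / (n_i * n_j))


def count_only_score(n_i, n_j, n_ij):
    """The same expression with the constant #W dropped."""
    return math.log(n_ij / (n_i * n_j))


full = {p: pmi_from_counts(*c, num_windows) for p, c in pairs.items()}
reduced = {p: count_only_score(*c) for p, c in pairs.items()}

# The two scores differ by the same constant, log(#W), for every pair ...
offsets = [full[p] - reduced[p] for p in pairs]
# ... so sorting the pairs by either score gives the same order.
same_ranking = sorted(pairs, key=full.get) == sorted(pairs, key=reduced.get)

for p in pairs:
    print(p, round(full[p], 3), round(reduced[p], 3))
```

With these counts the full PMI is positive for one pair, zero for another, and negative for the third, while the count-only score is negative for all three. The ranking is preserved, but the sign is not.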

References

Liang Yao et al., 2019. Graph Convolutional Networks for Text Classification. AAAI.

Omer Levy et al., 2014. Neural Word Embedding as Implicit Matrix Factorization. NIPS.
