BLEU is designed to approximate human judgement at a corpus level, and performs badly if used to evaluate the quality of individual sentences.

 

 

https://en.wikipedia.org/wiki/BLEU

To produce a score for the whole corpus the modified precision scores for the segments are combined using the geometric meanmultiplied by a brevity penalty to prevent very short candidates from receiving too high a score.

相关文章:

  • 2021-10-10
  • 2021-07-12
  • 2022-12-23
  • 2021-06-12
  • 2022-12-23
  • 2022-12-23
  • 2022-12-23
  • 2021-07-09
猜你喜欢
  • 2021-10-29
  • 2022-12-23
  • 2022-03-07
  • 2021-07-20
  • 2021-11-12
  • 2021-11-27
相关资源
相似解决方案