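Natural language processing (NLP) is one of the most challenging branches of artificial intelligence research. With the introduction of techniques such as deep learning, the field is advancing at an unprecedented pace. But which research and resources are must-reads for a beginner? Recently, Kyubyong Park compiled an extensive list.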

 

GitHub project link: _tasks

 

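I have been working on natural language processing (NLP) tasks for a long time. One day it occurred to me that I needed an overview of the vast NLP field, and I knew I was surely not the first person to want a full picture of NLP tasks.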

 

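I tried to study as many types of NLP tasks as I could, but given the limits of my own knowledge, I admit the list is still far from exhaustive. For now, the selected references lean toward recent deep learning research. I hope they offer a starting point for anyone who wants to dig into a particular NLP task. The project will be updated continually, but I would much rather work on it with more people. If you are willing, contributions are welcome.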

 

Anaphora Resolution

Automated Essay Scoring

Automatic Speech Recognition

Automatic Summarization

Coreference Resolution

  • INFO: Coreference Resolution (ts/coref.shtml
  • Paper: Deep Reinforcement Learning for Mention-Ranking Coreference Models (7
  • Paper: Improving Coreference Resolution by Learning Entity-Level Distributed Representations (3
  • Challenge: CoNLL 2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes (/task-description.html
  • Challenge: CoNLL 2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes (/task-description.html

 

Entity Linking

  • See "Named Entity Disambiguation"

 

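Grammatical Error Correction

Grapheme-to-Phoneme Conversion

Language Guessing

  • See "Language Identification"

Language Identification

Language Modeling

Language Recognition

  • See "Language Identification"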

 

Lemmatization

Lip-reading

  • WIKI: Lip reading (ip_reading
  • Paper: Lip Reading Sentences in the Wild (8
  • Paper: 3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition (9
  • Project: Lip Reading - Cross Audio-Visual Recognition using 3D Convolutional Neural Networks (reading-deeplearning
  • Resource: The GRID audiovisual sentence corpus (ridcorpus/

 

Machine Translation

  • Paper: Neural Machine Translation by Jointly Learning to Align and Translate (arxiv.org/abs/1409.0473
  • Paper: Neural Machine Translation in Linear Time (9
  • Paper: 2
  • Challenge: ACL 2014 Ninth Workshop on Statistical Machine Translation (ation-task.html#download
  • Challenge: EMNLP 2017 Second Conference on Machine Translation (WMT17) (ation-task.html
  • Resource: OpenSubtitles2016 (Subtitles2016.php
  • Resource: WIT3: Web Inventory of Transcribed and Translated Talks (wit3.fbk.eu/
  • Resource: The QCRI Educational Domain (QED) Corpus (qedcorpus/

 

Morphological Inflection Generation

  • WIKI: Inflection (nflection
  • Paper: Morphological Inflection Generation Using Character Sequence to Sequence Learning (0
  • Challenge: SIGMORPHON 2016 Shared Task: Morphological Reinflection (/sigmorphon2016/
  • Resource: sigmorphon2016 (l/sigmorphon2016

 

Named Entity Disambiguation

Named Entity Recognition

Paraphrase Detection

Parsing

  • WIKI: Parsing (arsing
  • Toolkit: The Stanford Parser: A statistical parser (re/lex-parser.shtml
  • Toolkit: spaCy parser (endency-parse
  • Paper: A fast and accurate dependency parser using neural networks (4-1082
  • Challenge: CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (/conll17/
  • Challenge: CoNLL 2016 Shared Task: Multilingual Shallow Discourse Parsing (nll16st/
  • Challenge: CoNLL 2015 Shared Task: Shallow Discourse Parsing (nll15st/
  • Challenge: SemEval-2016 Task 8: The meaning representations may be abstract, but this task is concrete! (6/task8/

 

Part-of-Speech Tagging

  • WIKI: Part-of-speech tagging (art-of-speech_tagging
  • Paper: Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss (9.pdf
  • Paper: Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models (php/tacl/article/viewFile/837/192
  • Resource: Treebank-3 (dc99t42
  • Toolkit: nltk.tag package (tml

 

Pinyin-to-Chinese Conversion

  • Paper: Neural Network Language Model for Chinese Pinyin Input Method Engine (5-1052
  • Project: Neural Chinese Transliterator (ral_chinese_transliterator

 

Question Answering

Relationship Extraction

Semantic Role Labeling

Sentence Boundary Disambiguation

Sentiment Analysis

Source Separation

  • WIKI: Source separation (ource_separation
  • Paper: From Blind to Guided Audio Source Separation (/hal-00922378/document
  • Paper: Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation (9
  • Challenge: Signal Separation Evaluation Campaign (SiSEC) (sisec.inria.fr/
  • Challenge: CHiME Speech Separation and Recognition Challenge (hime_challenge/

 

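Speaker Authentication

  • See "Speaker Recognition"

Speaker Diarization

Speaker Recognition

Speech Reading

  • See "Lip-reading"

Speech Recognition

  • See "Automatic Speech Recognition"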

 

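Speech Segmentation

Speech Synthesis

Speech Enhancement

Speech-to-Text

  • See "Automatic Speech Recognition"

Spoken Term Detection

  • See "Speech Segmentation"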

 

Stemming

  • WIKI: Stemming (temming
  • Paper: A Backpropagation Neural Network to Improve Arabic Stemming (No3/7Vol82No3.pdf
  • Toolkit: NLTK Stemmers (l

 

Term Extraction

  • WIKI: Terminology extraction (erminology_extraction
  • Paper: Neural Attention Models for Sequence Classification: Analysis and Application to Key Term Extraction and Dialogue Act Detection (7.pdf

 

Text Simplification

Text-to-Speech

  • See "Speech Synthesis"

 

Textual Entailment

  • WIKI: Textual entailment (extual_entailment
  • Project: Textual Entailment with TensorFlow (t/Entailment-with-Tensorflow
  • Paper: Textual Entailment with Structured Attentions and Composition (6.pdf
  • Challenge: SemEval-2014 Task 1: Evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment (4/task1/
  • Challenge: SemEval-2013 Task 7: The Joint Student Response Analysis and 8th Recognizing Textual Entailment Challenge (013/task7.html

 

Voice Conversion

Voice Recognition

  • See "Speaker Recognition"

 

Word Embeddings

Word Prediction

  • INFO: What is Word Prediction? (ry/wp/what_is.htm
  • Paper: The prediction of character based on recurrent neural network language model (mp/stamp.jsp?arnumber=7960065
  • Paper: An Embedded Deep Learning based Word Prediction (2
  • Paper: Evaluating Word Prediction: Framing Keystroke Savings (8-2066
  • Resource: An Embedded Deep Learning based Word Prediction (dPrediction/master.zip
  • Project: Word Prediction using Convolutional Neural Networks - can you do better than iPhone™ Keyboard? (d_prediction

 

Word Segmentation

Word Sense Disambiguation
