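[Posted]: 2019-06-23 18:30:25
[Question]:
I am using Python with spaCy as my NLP library. I am new to NLP and would like some guidance on extracting tabular information from text. My goal is to find out which types of expense are frozen and which are not. Any guidance would be greatly appreciated.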
TYPE_OF_EXPENSE     FROZEN?   NOT_FROZEN?
purchase order      frozen    null
capital             frozen    null
consulting          frozen    null
business meetings   frozen    null
external hires      frozen    null
KM&L                null      not frozen
travel              null      not frozen
import spacy

# Load the small English pipeline
nlp = spacy.load('en_core_web_sm')

# Adjacent string literals are concatenated, so the text stays one valid string
doc = nlp(
    'Non-revenue-generating purchase order expenditures will be frozen. All capital '
    'related expenditures are frozen effectively for Q4. Following spending categories '
    'are frozen: Consulting, (including existing engagements), Business meetings. '
    'Please note that there is a hiring freeze for external hires, subcontractors '
    'and consulting services. KM&L expenditure will '
    'not be frozen. Travel cost will not be on ‘freeze’.'
)
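My end goal is to extract all of this into an Excel file as a table. Even suggestions covering only a few of the categories above would be much appreciated. Thank you very much.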
[Discussion]:
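One possible starting point, offered as a rough sketch rather than a definitive approach: split the text into sentences, keep the sentences that mention a freeze, mark each one "not frozen" if it contains a negation word, and look up a hand-written list of expense categories in it. Everything named here (`CATEGORIES`, `extract_freeze_table`, `rows_to_csv`, the regular expressions) is made up for illustration and is not part of spaCy. The sentence split uses a plain regex so the sketch runs with the standard library alone; with a loaded spaCy pipeline, iterating over `doc.sents` could presumably take its place. The rule is deliberately naive and will mislabel sentences that mix frozen and non-frozen categories.

```python
import csv
import io
import re

TEXT = (
    "Non-revenue-generating purchase order expenditures will be frozen. "
    "All capital related expenditures are frozen effectively for Q4. "
    "Following spending categories are frozen: Consulting, (including existing "
    "engagements), Business meetings. "
    "Please note that there is a hiring freeze for external hires, "
    "subcontractors and consulting services. "
    "KM&L expenditure will not be frozen. "
    "Travel cost will not be on 'freeze'."
)

# Assumption: the categories of interest are known up front and listed by hand.
CATEGORIES = [
    "purchase order", "capital", "consulting", "business meetings",
    "external hires", "KM&L", "travel",
]

FREEZE = re.compile(r"\b(?:frozen|freeze|froze)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:not|no|never)\b", re.IGNORECASE)


def extract_freeze_table(text, categories):
    """Return (category, status) pairs in order of first mention."""
    status = {}
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not FREEZE.search(sentence):
            continue  # sentence does not talk about a freeze at all
        label = "not frozen" if NEGATION.search(sentence) else "frozen"
        for category in categories:
            # Whole-phrase, case-insensitive match of the category name.
            pattern = r"(?<!\w)" + re.escape(category) + r"(?!\w)"
            if re.search(pattern, sentence, re.IGNORECASE):
                status.setdefault(category, label)  # first mention wins
    return list(status.items())


def rows_to_csv(rows):
    """Render the pairs as CSV text in the three-column layout of the question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["TYPE_OF_EXPENSE", "FROZEN?", "NOT_FROZEN?"])
    for category, label in rows:
        writer.writerow([
            category,
            "frozen" if label == "frozen" else "null",
            "not frozen" if label == "not frozen" else "null",
        ])
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_freeze_table(TEXT, CATEGORIES)))
```

The CSV text can be written to a file and opened directly in Excel. If a real .xlsx file is needed, loading the rows into a pandas DataFrame and calling `to_excel` is one option, assuming pandas and an Excel writer backend are installed.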