【Posted】: 2018-03-11 12:09:51
【Question】:
Suppose I have a string like this:
"IgotthistextfromapdfIscraped.HowdoIsplitthis?"
I want to produce:
"I got this text from a pdf I scraped. How do I split this?"
How can I do this?
【Comments】:
-
"wheeloffortune" -> "wheel" "off" "or" "tune"?
-
@RobertLozyniak
The segment function from python-wordsegment splits it into ['wheel', 'of', 'fortune']. Not bad, right?
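
To illustrate why a frequency-aware segmenter can prefer "wheel of fortune" over "wheel off or tune", here is a minimal sketch of dictionary-based segmentation with dynamic programming. It is not the python-wordsegment implementation. The COUNTS table is a hypothetical toy vocabulary with made-up counts, and the names word_cost and segment are chosen only for this example; a real segmenter would load a large corpus-derived frequency table and would also need to handle capitalization and punctuation.

```python
import math

# Hypothetical toy unigram counts (made up for illustration only).
COUNTS = {
    "wheel": 50, "of": 1000, "off": 200, "or": 800,
    "tune": 40, "fortune": 60,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = max(len(w) for w in COUNTS)


def word_cost(word):
    """Negative log-probability of a known word; infinity for unknown words."""
    count = COUNTS.get(word)
    return -math.log(count / TOTAL) if count else math.inf


def segment(text):
    """Return the lowest-cost split of text into known words, or None if none exists."""
    n = len(text)
    best = [math.inf] * (n + 1)  # best[i]: minimum cost of segmenting text[:i]
    back = [0] * (n + 1)         # back[i]: start index of the last word in that split
    best[0] = 0.0
    for end in range(1, n + 1):
        # Only consider candidate words no longer than the longest known word.
        for start in range(max(0, end - MAX_WORD_LEN), end):
            cost = best[start] + word_cost(text[start:end])
            if cost < best[end]:
                best[end], back[end] = cost, start
    if math.isinf(best[n]):
        return None
    # Walk the back-pointers from the end to recover the words.
    words, end = [], n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


print(segment("wheeloffortune"))
```

With these toy counts, the three-word split has a lower total cost than the four-word split, because each extra word adds its own negative log-probability. The result depends entirely on the frequency table, so a different table could rank the alternatives differently.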
Tags: string algorithm tokenize text-segmentation