【Posted】:2015-03-06 21:39:27
【Question】:
import re

# (regex, replacement) pairs; specific contractions come before the generic ones
replacement_patterns = [
    (r'won\'t', 'will not'),
    (r'can\'t', 'cannot'),
    (r'i\'m', 'i am'),
    (r'ain\'t', 'is not'),
    # \g<1> refers to the word captured by (\w+); raw strings keep the backslash literal
    (r'(\w+)\'ll', r'\g<1> will'),
    (r'(\w+)n\'t', r'\g<1> not'),
    (r'(\w+)\'ve', r'\g<1> have'),
    (r'(\w+)\'s', r'\g<1> is'),
    (r'(\w+)\'re', r'\g<1> are'),
    (r'(\w+)\'d', r'\g<1> would')
]

class RegexpReplacer(object):
    def __init__(self, patterns=replacement_patterns):
        # Compile each regex once up front
        self.patterns = [(re.compile(regex), repl) for (regex, repl)
                         in patterns]

    def replace(self, text):
        s = text
        # Apply every pattern in turn to the running result
        for (pattern, repl) in self.patterns:
            (s, count) = re.subn(pattern, repl, s)
        return s

rep = RegexpReplacer()
print(rep.replace("can't is a contradicton"))
I copied this code from Jacob Perkins's Python Text Processing with NLTK 2.0 Cookbook.
My expected output is: cannot is a contradicton
The actual output is: can't is a contradicton
I cannot find the error in it.
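For anyone trying to reproduce this: the indentation of the posted code was lost, so the following is only a guess at one way to get the reported output. It is a minimal sketch with two hypothetical helper functions (`replace_all` and `replace_early_return` are not from the book) and a shortened pattern list. It assumes the difference lies in where `return s` sits relative to the `for` loop.

```python
import re

# Shortened, pre-compiled pattern list (illustrative subset of the question's list)
patterns = [
    (re.compile(r"won't"), "will not"),
    (re.compile(r"can't"), "cannot"),
]

def replace_all(text):
    """Hypothetical helper: return only after every pattern has been applied."""
    s = text
    for pattern, repl in patterns:
        s = pattern.sub(repl, s)
    return s

def replace_early_return(text):
    """Hypothetical helper: 'return' indented inside the loop (assumed mistake)."""
    s = text
    for pattern, repl in patterns:
        s = pattern.sub(repl, s)
        return s  # exits after the first pattern (won't), so can't is never tried
    return s

print(replace_all("can't is a contradicton"))
print(replace_early_return("can't is a contradicton"))
```

If the second function matches what you see, check that `return s` in your `replace` method is aligned with the `for` statement and not with the loop body.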
【Discussion】:
Tags: python regex output nltk ontology