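Strings are immutable, so they have no insert or delete methods. You can, however, convert the string into a list, which is mutable. I would probably keep a dictionary with each punctuation mark as a key and a list of its indices as the value. The problem you may run into is that, if you have more than one kind of punctuation mark, there is no guarantee they will be re-inserted in the right order. For example: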
    sentence = 'I am called bob. What is your name?'
    punc = ('!', '"', '£', '$', '%', '^', '&', '*', '(', ')', '¬', '`', '{', '}', '~', '@', ':', '?', '>', '<', ',', '.', '/', ';', '#', ']', '[', '/', '*')
    sentence = list(sentence)  # a list of characters is mutable
    Dictionary = {}  # punctuation mark -> list of indices where it occurs
    for i, p in enumerate(sentence):  # enumerate() yields (index, value) pairs
        if p in punc:
            if p in Dictionary:
                Dictionary[p].append(i)
            else:
                Dictionary[p] = [i]
    print(Dictionary)  # => {'.': [15], '?': [34]} (insertion order, Python 3.7+)
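If you'd rather skip the `if p in Dictionary` check, the same index map can be built with `collections.defaultdict`. This is only a sketch: the helper name `index_punctuation` and its parameters are my own, not something from the question.

```python
from collections import defaultdict

# Hypothetical helper; the caller passes in whatever set of marks it cares about.
def index_punctuation(text, marks):
    """Map each punctuation mark found in text to the list of its indices."""
    positions = defaultdict(list)
    for i, ch in enumerate(text):
        if ch in marks:
            positions[ch].append(i)  # a missing key starts out as an empty list
    return dict(positions)

positions = index_punctuation('I am called bob. What is your name?', {'.', '?'})
print(positions)  # => {'.': [15], '?': [34]}
```

Passing the marks as a set also makes each `ch in marks` test a constant-time lookup instead of a scan over a tuple.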
For example, suppose I had an oddly formatted string with a random scattering of different punctuation marks:
    sentence = 'I? am. cal?led ,bob. Wh,at. is your .name?.'
    ... # above code
    print(sentence) # => "I? am. call?ed bob,. What .i,s your .name?."
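That is clearly wrong. Every insert shifts the characters after it, so a stored index is only valid once everything that belongs before it is already back in place. The only reliable way is to go through the indices in the dict from lowest to highest and re-insert the marks in that order.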
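To make the ordering problem concrete, here is a small sketch of the naive restore, which puts the marks back one dictionary key at a time. The short string and the variable names are made up for illustration and are not from the question.

```python
original = 'a,b.c,d'
marks = {',', '.'}

# Record the indices of each mark, keyed by the mark itself.
positions = {}
for i, ch in enumerate(original):
    if ch in marks:
        positions.setdefault(ch, []).append(i)

# Naive restore: handle one key at a time, ignoring the overall index order.
chars = [ch for ch in original if ch not in marks]
for mark, indices in positions.items():
    for i in indices:
        chars.insert(i, mark)  # an index past the end of the list simply appends

naive = ''.join(chars)
print(naive)  # => a,b.cd,
```

The second comma is inserted before the full stop is back, so index 5 is past the end of the list at that point and the comma lands in the wrong place.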
Final code:
    original = sentence = 'I? am. cal?led ,bob. Wh,at. is your .name?.'
    punc = ('!', '"', '£', '$', '%', '^', '&', '*', '(', ')', '¬', '`', '{', '}', '~', '@', ':', '?', '>', '<', ',', '.', '/', ';', '#', ']', '[', '/', '*')
    sentence = list(sentence)
    Dictionary = {}  # punctuation mark -> list of indices where it occurs
    seq = []  # indices of every punctuation mark, in ascending order
    for i, p in enumerate(sentence):
        if p in punc:
            seq.append(i)
            if p in Dictionary:
                Dictionary[p].append(i)
            else:
                Dictionary[p] = [i]
    # Strip the punctuation, keeping only the other characters.
    sentence = list(filter(lambda x: x not in punc, sentence))
    # Re-insert the marks from the lowest index to the highest.
    for i in seq:
        for key, indices in Dictionary.items():
            if i in indices:
                sentence.insert(i, key)
                indices.remove(i)
    assert ''.join(sentence) == original
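If you don't need the per-mark dictionary for anything else, one possible simplification is to store `(index, mark)` pairs in a single list and re-insert them in sorted order. This is a sketch of that variation, not the code above; the function names `split_punctuation` and `restore_punctuation` are hypothetical.

```python
def split_punctuation(text, marks):
    """Return the characters of text without marks, plus (index, mark) pairs."""
    found = [(i, ch) for i, ch in enumerate(text) if ch in marks]
    letters = [ch for ch in text if ch not in marks]
    return letters, found

def restore_punctuation(letters, found):
    """Re-insert the marks in ascending index order and return the joined string."""
    chars = list(letters)  # copy, so the caller's list is left untouched
    for i, mark in sorted(found):
        chars.insert(i, mark)
    return ''.join(chars)

original = 'I? am. cal?led ,bob. Wh,at. is your .name?.'
marks = set('?.,')  # assumed subset of marks, just for this example
letters, found = split_punctuation(original, marks)
assert restore_punctuation(letters, found) == original
```

Sorting the pairs guarantees the lowest-to-highest order even if the list was built or modified in some other order, and it avoids searching every dictionary entry for each index.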