How to replace entries in an array: answers

【Question title】: How to replace entries in an array
【Posted】: 2018-04-03 09:23:32
【Question description】:

I have an array, for example:

key = ['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)', '*', '*', '*', '(DATE)', '*']

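For an array like this, I want to do the following:

Iterate over the array.
Once I find an entry that starts with '(' but does not end with ')',
replace each following '*' entry with the label that came after the '(', up to and including the '*)' entry that closes it.
If an entry is fully enclosed in '()', the parentheses should be stripped, so the second-to-last element '(DATE)' is simply replaced with DATE.

For example, the second entry '(DATE*' is followed by '*', '*', '*)', so all of these entries should be replaced with DATE.

The output should be: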

key = ['*', 'DATE', 'DATE', 'DATE', 'DATE', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', 'GPE', 'GPE', '*', '*', '*', 'DATE', '*']

【Question discussion】:

Tags: python arrays list

【Solution 1】:

**Nothing but some regex and while loops**
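import re

key = ['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*', '*', '*', '*',
       '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)', '*', '*', '*', '(DATE)', '*']
val = 0
while val < len(key):
    value = key[val]
    # An entry containing '(' opens a span (or is a complete '(LABEL)' entry).
    if re.findall(r'\(', value):
        # Extract the label, e.g. 'DATE' from '(DATE*'.
        value = re.findall(r'\w+', value)[0]
        # Overwrite entries until the one containing ')' is reached.
        while re.findall(r'\)', key[val]) == []:
            key[val] = value
            val += 1
        # Overwrite the closing entry as well.
        key[val] = value
    val += 1
print(key)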

Output - ['*', 'DATE', 'DATE', 'DATE', 'DATE', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', 'GPE', 'GPE', '*', '*', '*', 'DATE', '*']

【Discussion】:

【Solution 2】:

I know it's not very Pythonic, but you can try this anyway:

key = ['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*',
   '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)',
   '*', '*', '*', '(DATE)', '*']

for i in key:
    if i.startswith('(') and not (i.endswith(')')):
        a = key[key.index(i)+1:]
        for j in a:
            if j.endswith(')'):
                a = a[:a.index(j)+1]
                break
        for l in range(key.index(i), key.index(i)+len(a)+1):
            key[l] = i.strip('(').strip('*')
    elif i.startswith('(') and i.endswith(')'):
        key[key.index(i)] = i.strip('(').strip(')')

print(key)

It gives output like:

['*', 'DATE', 'DATE', 'DATE', 'DATE', '*', '*', '*', '*', '*', '*', 
 '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', 'GPE', 
'GPE', '*', '*', '*', 'DATE', '*']

【Discussion】:

Glad to hear it :)

【Solution 3】:

key = ['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)', '*', '*', '*', '(DATE)', '*']
outKeys = []
isFound = False
for k in key:
    if k.startswith("(") and k.endswith(")"):
        k = k[k.find("(")+1:k.find(")")]
    elif k.startswith("("):
        k = k[k.find("(")+1:k.find("*")]
        isFound = k
    elif k.endswith(")"):
        k = isFound
        isFound = False
    elif isFound:
        k = isFound
    outKeys.append(k)
print(outKeys)

This gives the output:

['*', 'DATE', 'DATE', 'DATE', 'DATE', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', 'GPE', 'GPE', '*', '*', '*', 'DATE', '*']

【Discussion】:

【Solution 4】:

I suggest this easy-to-read solution. I define a second list, newKey, to avoid modifying the list while iterating over its own elements:

key = ['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)', '*', '*', '*', '(DATE)', '*']


newKey = []
next_x = None

for x in key:
    if x.startswith('(') and x.endswith(')'):
        newKey.append(x.strip('()*'))
    elif x.startswith('('):
        newKey.append(x.strip('(*'))
        next_x = x.strip('(*')
    elif x.endswith(')'):
        newKey.append(next_x.strip('*)'))
        next_x = None
    elif next_x is not None:
        newKey.append(next_x)
    else:
        newKey.append(x)  

key = newKey[:]

print(key)
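
The same idea can be packaged as a function that returns a new list and leaves its input alone. This is only a sketch: the name fill_spans and the shortened test list are illustrative and not part of the answer, and it assumes a stray closing entry with no open span should be passed through unchanged.

```python
def fill_spans(entries):
    """Return a new list with each '(LABEL* ... *)' span replaced by LABEL."""
    result = []
    label = None  # label of the span that is currently open, if any
    for entry in entries:
        if entry.startswith('(') and entry.endswith(')'):
            # Complete entry such as '(DATE)': just strip the decoration.
            result.append(entry.strip('()*'))
        elif entry.startswith('('):
            # Opening entry such as '(DATE*': remember the label.
            label = entry.strip('(*')
            result.append(label)
        elif entry.endswith(')') and label is not None:
            # Closing entry such as '*)': emit the label and close the span.
            result.append(label)
            label = None
        elif label is not None:
            # Entry inside an open span.
            result.append(label)
        else:
            # Entry outside any span (assumption: stray ')' entries land here too).
            result.append(entry)
    return result


key = ['*', '(DATE*', '*', '*)', '*', '(GPE)', '*']
print(fill_spans(key))  # ['*', 'DATE', 'DATE', 'DATE', '*', 'GPE', '*']
print(key)              # the original list is unchanged
```

Because the function never writes to its argument, the caller decides whether to rebind the name (key = fill_spans(key)) or keep both versions.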

【Discussion】:

【Solution 5】:

You can use the following code:

current_entry = None
for i, k in enumerate(key):
    if k.startswith('(') and k.endswith(')'):
        key[i] = k.strip('(').strip(')')
        continue
    if k.startswith('(') and not k.endswith(')'):
        current_entry = k.strip('(').strip('*')
    if current_entry:
        key[i] = current_entry
    if k.endswith(')'):
        current_entry = None
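
The snippet above expects key to exist already and edits it in place. One way to check it, sketched under a few assumptions: the same loop is wrapped in a function (the name replace_in_place is hypothetical), and the list is built to have the same shape as the one in the question rather than copied verbatim.

```python
def replace_in_place(key):
    """Overwrite span entries in key with their label; returns None."""
    current_entry = None
    for i, k in enumerate(key):
        if k.startswith('(') and k.endswith(')'):
            key[i] = k.strip('(').strip(')')
            continue
        if k.startswith('(') and not k.endswith(')'):
            current_entry = k.strip('(').strip('*')
        if current_entry:
            key[i] = current_entry
        if k.endswith(')'):
            current_entry = None


# Same shape as the question's list, built programmatically.
key = (['*', '(DATE*', '*', '*', '*)'] + ['*'] * 18 +
       ['(GPE*', '*)', '*', '*', '*', '(DATE)', '*'])
expected = (['*', 'DATE', 'DATE', 'DATE', 'DATE'] + ['*'] * 18 +
            ['GPE', 'GPE', '*', '*', '*', 'DATE', '*'])

replace_in_place(key)
print(key == expected)  # True
```

Assigning through key[i] while iterating with enumerate is safe here because the list never changes length; only the values at already-visited or current positions are overwritten.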

【Discussion】:

【Solution 6】:

This can be done with a simple regular expression:

import re

string = ' '.join(['*', '(DATE*', '*', '*', '*)', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '*', '(GPE*', '*)', '*', '*', '*', '(DATE)', '*'])
result = re.sub(r'\((.*?)\)', lambda m: ' '.join([m.group(1).replace('*', '').strip()
 for n in range(1 if m.group(0).count('*') == 0 else m.group(0).count('*'))]), string).split(' ')
print(result)
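
The one-liner is dense, so here is a sketch of the same approach with the lambda pulled out into a named helper. The name expand_span and the shortened list are illustrative only. It relies on two assumptions that also hold for the question's data: no entry contains a space, and every entry inside a span contains exactly one '*'.

```python
import re


def expand_span(match):
    """Turn one matched span into its label repeated once per original entry."""
    # match.group(0) is the whole span, e.g. '(DATE* * *)'.
    # match.group(1) is the text between the parentheses, e.g. 'DATE* * *'.
    label = match.group(1).replace('*', '').strip()
    # Each entry in a span carries one '*'; a lone '(DATE)' has none but is
    # still one entry, hence the lower bound of 1.
    count = max(1, match.group(0).count('*'))
    return ' '.join([label] * count)


entries = ['*', '(DATE*', '*', '*)', '*', '(GPE)', '*']
joined = ' '.join(entries)
print(joined)  # * (DATE* * *) * (GPE) *

# The non-greedy (.*?) stops at the first ')' so spans are matched one by one.
result = re.sub(r'\((.*?)\)', expand_span, joined).split(' ')
print(result)  # ['*', 'DATE', 'DATE', 'DATE', '*', 'GPE', '*']
```

Joining on a space and splitting on it again only round-trips cleanly because the replacement for each span contains as many space-separated tokens as the span had entries.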

【Discussion】: