How to remove the prefix from each element of a Python list? Answers

【Question title】: How to remove the prefix from each element of python list?
【Posted】: 2020-01-01 12:20:56
【Question description】:

I have a Python list containing the following items:

[ 1.1 ] 1. a electronic bill presentment system.
[ 1.2 ] a network.
[ 1.3 ] a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.
[ 1.4 ] a central network station configured to receive the transmitted first requests for bills and to transmit, responsive to each of the received first requests, bill availability information for the associated user via the network, wherein each of the plurality of first stations is configured to receive the transmitted bill availability information for its associated user and is operable to transmit second requests for bills of its associated user via the network.
[ 1.5 ] a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.
[ 1.6 ] wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.
[ 2.1 ] 2. a method for presenting electronic bills.
[ 2.2 ] storing, at a plurality of different locations, electronic bills of a plurality of different billers for a plurality of different users.
[ 2.3 ] storing identifiers of the stored electronic bills at a location different than the plurality of different locations.
[ 2.4 ] transmitting a first request for the stored electronic bills for a first of the plurality of users.
[ 2.5 ] transmitting one or more of the stored identifiers of the stored electronic bills for the first user responsive to the transmitted first request, each of the transmitted one or more identifiers being associated with a respective one of the stored electronic bills of a different one of the plurality of billers.
[ 2.6 ] transmitting a second request for at least one of the stored electronic bills identified by the transmitted one or more identifiers.
[ 2.7 ] transmitting the at least one identified stored electronic bill responsive to the transmitted second request.
[ 2.8 ] wherein the transmitted one or more identifiers identifies the stored electronic bills without identifying an amount of the identified stored electronic bills.

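I just need to remove the prefix from each item, but I don't know how to do it.

For example,

I need to remove [2.8](space) and [2.7](space). Each line printed above is one item of the list. Likewise, for [ 1.1 ] 1. a electronic bill presentment system., I need to remove [ 1.1 ] 1.

My code for removing the prefix is below. The logic is to split on spaces first and then drop the non-alphabetic values.

But it does not work properly.

Please help.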

TextDictionaryValuesList = list(TextDictionary.values()) 
# You can make a test list using above given items of mylist

def remove_non_alpha(splitlist):
    for j in range(0, len(splitlist)):
        if(splitlist[j].isalpha()):
            splitlist[j] = splitlist[j]
        else:
            splitlist[j] = ""       

    return splitlist


for i in range(0, len(TextDictionaryValuesList)):

    print(TextDictionaryValuesList[i])

    splitlist = TextDictionaryValuesList[i].split(" ")
    splitlist = remove_non_alpha(splitlist)
    TextDictionaryValuesList[i] = splitlist

print(TextDictionaryValuesList)

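【Question comments】:

This looks like a class/homework problem. Are you familiar with regular expressions?
Yes, I know regular expressions, but I am not able to use them for this.

Tags: python regex list

【Solution 1】:

Here is one way to do it using re.sub: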

import re
l = ['[ 1.1 ] 1. a electronic bill presentment system.','[ 1.2 ] a network.']

[re.sub(r'\[\s*\d+\.*\d*\s*\]\s+(?:\d+\.\s*)?', '', s) for s in l]
# ['a electronic bill presentment system.', 'a network.']

See the demo.
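
To make the pattern in this answer easier to read, here is a minimal sketch that spells out the same regex with re.VERBOSE. The names PREFIX_RE, strip_prefix and items are hypothetical and only used for illustration; the pattern itself is unchanged.

```python
import re

# Hypothetical name; same pattern as in the answer, laid out with re.VERBOSE
# so that each piece can carry its own comment.
PREFIX_RE = re.compile(r"""
    \[ \s* \d+ \.* \d* \s* \]   # bracketed label such as '[ 1.1 ]'
    \s+                         # whitespace after the closing bracket
    (?: \d+ \. \s* )?           # optional claim number such as '1. '
""", re.VERBOSE)

def strip_prefix(text):
    """Remove the '[ n.m ] ' label and an optional 'n. ' that follows it."""
    return PREFIX_RE.sub('', text)

items = ['[ 1.1 ] 1. a electronic bill presentment system.',
         '[ 1.2 ] a network.']
print([strip_prefix(s) for s in items])
```

Compiling the pattern once is just a convenience when the same substitution runs over many strings.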

Testing with a larger list of strings:

l = ['[ 1.1 ] 1. a electronic bill presentment system.',
     '[ 1.2 ] a network.',
     '[ 1.3 ] a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.',
     '[ 1.5 ] a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.',
     '[ 1.6 ] wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.',
     '[ 2.1 ] 2. a method for presenting electronic bills.']

[re.sub(r'\[\s*\d+\.*\d*\s*\]\s+(?:\d+\.\s*)?', '', s) for s in l]

['a electronic bill presentment system.',
 'a network.',
 'a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.',
 'a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.',
 'wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.',
 'a method for presenting electronic bills.']

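【Discussion】:

Does this link create the regex automatically? regex101.com/r/JVPUbx/1
No, it is for testing the regex on sample strings @RinkuYadav
I am curious about the quantifiers used. I would use + for the decimals and * for the spaces. I would at least change the \d?
Fair point @Adirio, on second thought * does make more sense here. I only chose it because there seemed to be at most 1 space in all cases
@yatu What if I want to remove the 1. that follows the ] in [ 1.1 ] 1., what would the regex be to keep only [ 1.1 ]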
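
The two points raised in these comments can be sketched as follows. This is only one possible reading, not the answerer's own code: LABEL, strip_label_and_number and strip_number_keep_label are hypothetical names, the label is assumed to always look like digits, a dot, digits, and the last comment is read as "drop the 1. but keep the [ 1.1 ] label".

```python
import re

# Assumed label shape: '[', optional spaces, digits '.' digits, optional spaces, ']'.
# '+' is used for the digits and '*' for the spaces, as suggested in the comments.
LABEL = r'\[\s*\d+\.\d+\s*\]'

def strip_label_and_number(text):
    """Drop a leading '[ n.m ]' label and an optional 'n. ' after it."""
    return re.sub(r'^' + LABEL + r'\s*(?:\d+\.\s*)?', '', text)

def strip_number_keep_label(text):
    """Keep the '[ n.m ]' label but drop a claim number such as '1. ' after it."""
    return re.sub(r'^(' + LABEL + r')\s*\d+\.\s*', r'\1 ', text)

sample = '[ 1.1 ] 1. a electronic bill presentment system.'
print(strip_label_and_number(sample))
print(strip_number_keep_label(sample))
```

Strings without a claim number after the label are left untouched by strip_number_keep_label, because the number part of that pattern is not optional.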

【Solution 2】:

You should use a regular expression to replace the pattern with an empty string

>>> import re
>>> re.sub(r'\[\s?\d\.\d\s?]\s?(\d(\.\s)?)?', '', '[ 1.1 ] 1. a electronic bill presentment system.')
'a electronic bill presentment system.'

【Discussion】:

As shown in the answer, this does not remove the leading 1 after the square brackets.

【Solution 3】:

import re

data = ["[ 1.1 ] 1. a electronic bill presentment system.","[ 1.2 ] a network."]

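# Keep everything from the first ASCII letter onward.
# (The original class '[a-z,A-Z]' also matched a literal comma.)
result = [re.search(r'[a-zA-Z].*', i).group(0) for i in data]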
print(result)
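
One caveat with the search-based approach above: re.search returns None when a string contains no letter at all, so calling .group(0) on it would raise AttributeError. A minimal sketch with a guard, where from_first_letter is a hypothetical helper name and returning an empty string for such inputs is an assumption about the desired behavior:

```python
import re

def from_first_letter(text):
    """Return text starting at its first ASCII letter, or '' if there is none."""
    match = re.search(r'[a-zA-Z].*', text)
    return match.group(0) if match else ''

data = ["[ 1.1 ] 1. a electronic bill presentment system.", "[ 1.2 ] a network.", "[ 1.3 ]"]
print([from_first_letter(s) for s in data])
```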

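【Discussion】:

【Solution 4】:

If you want to avoid regular expressions, you can keep it simple, as long as you don't have fancy prefixes like [1.2.b] or anything else containing letters.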

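def chop_non_alpha(txt):
    # Return txt starting at its first alphabetic character.
    for i in range(len(txt)):
        if txt[i].isalpha():
            return txt[i:]
    return ""  # no alphabetic character found

# Sample input; replace with your own list of strings.
lines = ["[ 1.1 ] 1. a electronic bill presentment system.", "[ 1.2 ] a network."]
for line in lines:
    print(chop_non_alpha(line))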
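
If prefixes with letters such as [1.2.b] do occur, a regex-free variant could cut at the closing bracket instead of at the first letter. This is only a sketch under the assumption that every prefix is a single bracketed label, optionally followed by a number ending in a dot; strip_bracket_prefix is a hypothetical name.

```python
def strip_bracket_prefix(text):
    """Drop a leading '[...]' label and an optional 'n.' token, without regex."""
    head, sep, tail = text.partition(']')
    # Leave the string alone if it does not start with a bracketed label.
    if not sep or not head.lstrip().startswith('['):
        return text
    tail = tail.lstrip()
    # Drop a claim number such as '1.' when it is the first token.
    first, _, rest = tail.partition(' ')
    if first.endswith('.') and first[:-1].isdigit():
        return rest.lstrip()
    return tail

for line in ['[ 1.1 ] 1. a electronic bill presentment system.',
             '[1.2.b] a network.']:
    print(strip_bracket_prefix(line))
```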

【Discussion】: