Iterating over lists in a series to find similar elements in Python: answers

【Question title】: Iterating over lists in a series to find similar elements in the list in Python
【Posted】: 2019-07-05 19:03:22
【Question description】:

I have a series such as:

ID
1 [a,b,c,d,e]
2 [b,c,d,e,f]
3 [z,t,c,d,w]

I want to print the items that the lists have in common (items that appear in more than one list):

output: [b,c,d,e]

I also want to know which IDs they belong to.
Output:

b: 1,2
c: 1,2,3
d: 1,2,3
e: 1,2

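【Question comments】:

1) Format your input properly; 2) post your initial code.
Can you provide more details or examples? c and d appear in every list, but b and e appear in only 2 of the 3, so it is unclear how many different lists an element must appear in for you to care about it. Say we had 100 lists: would elements that appear in just 2 of them interest you? If not, how many appearances do you need?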

Tags: python list duplicates series

【Solution 1】:

If you build a dictionary that maps each character to the list of indices where it appears, you get both parts of the answer:

from collections import defaultdict
d = defaultdict(list)
arr = [
    ['a','b','c','d','e'],
    ['b','c','d','e','f'],
    ['z','t','c','d','w']
    ]

for ind, l in enumerate(arr):
    for c in l:
        d[c].append(ind)
print(d)

d will be a dictionary like this:

defaultdict(list,
            {'a': [0],
             'b': [0, 1],
             'c': [0, 1, 2],
             'd': [0, 1, 2],
             'e': [0, 1],
             'f': [1],
             'z': [2],
             't': [2],
             'w': [2]})

The items that appear in more than one list can be found with:

[k for k, v in d.items() if len(v) > 1]
# ['b', 'c', 'd', 'e']

You can index directly into the dict to find the indices of the lists an item belongs to:

d['e']
# [0, 1]
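
To get output in the shape the question asks for (IDs starting at 1, shared items only), the same dictionary can be built with `enumerate` starting at 1. This is only a minimal sketch: the names `ids_by_item` and `shared` are made up for illustration, and the 1-based numbering is an assumption taken from the example in the question.

```python
from collections import defaultdict

arr = [
    ['a', 'b', 'c', 'd', 'e'],
    ['b', 'c', 'd', 'e', 'f'],
    ['z', 't', 'c', 'd', 'w'],
]

# ids_by_item (hypothetical name): item -> list of 1-based list IDs
ids_by_item = defaultdict(list)
for list_id, items in enumerate(arr, 1):
    for item in items:
        ids_by_item[item].append(list_id)

# keep only the items that were seen in more than one list
shared = {k: v for k, v in ids_by_item.items() if len(v) > 1}

# print each shared item as "item: id,id,..."
for item, ids in shared.items():
    print(f"{item}: {','.join(map(str, ids))}")
```

Note that if an item can repeat inside a single list, its ID would be appended more than once; collecting the IDs in a set instead of a list avoids that.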

【Comments】:

【Solution 2】:

Let's try a counting approach, since we want a decent time complexity.

l1 = ['a','b','c','d','e']
l2 = ['b','c','d','e','f']
l3 = ['z','t','c','d','w']

# create an empty dictionary
count = dict()

# start your id counter
list_id = 1    

# iterate over the lists
for lst in [l1,l2,l3]:
    # iterate over each list, getting the char
    for char in lst:
        try:
            # try to append the list id to each corresponding char
            count[char].append(list_id)
        except:
            # if the char key doesn't exist in the dict, we add it as a list
            # containing our list id in which it was first found
            count[char] = [list_id]
    # increment our list id, as we have finished looking at this list
    list_id = list_id + 1

# print each char and list that contains more than one list_id
for key in count:
    if len(count[key])>1:
        print(key+': '+str(count[key]))

The output will be:

b: [1, 2]
c: [1, 2, 3]
d: [1, 2, 3]
e: [1, 2]

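【Comments】:

A few notes: 1) You almost never need to increment an id field by hand like this. Just write the loop as for list_id, lst in enumerate([l1,l2,l3], 1): and you can drop both the explicit initialization of list_id and the explicit increment at the end of each iteration. 2) Don't use bare excepts; you are expecting a KeyError, so catch only that, and you won't silently swallow TypeErrors, KeyboardInterrupts, and so on. Or just use collections.defaultdict(list), so that you don't need the try/except at all and can unconditionally do count[char].append(list_id).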
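
The two suggestions in this comment can be combined as follows. This is a sketch of the commenter's advice applied to the answer's code, not a replacement posted by the answer's author; the variable names are kept from the answer.

```python
from collections import defaultdict

l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['b', 'c', 'd', 'e', 'f']
l3 = ['z', 't', 'c', 'd', 'w']

# defaultdict(list) creates the empty list on first access,
# so no try/except is needed around append
count = defaultdict(list)

# enumerate(..., 1) supplies list_id starting at 1, replacing the manual counter
for list_id, lst in enumerate([l1, l2, l3], 1):
    for char in lst:
        count[char].append(list_id)

# print each char that was found in more than one list
for key, ids in count.items():
    if len(ids) > 1:
        print(f"{key}: {ids}")
```

If you prefer to keep a plain dict, the narrower fix is to change the bare `except:` to `except KeyError:`.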

【Solution 3】:

Welcome to Stack Overflow.

If I understand your question, you can solve this with a defaultdict:

from collections import defaultdict

l1 = ['a', 'b', 'c', 'd', 'e']
l2 = ['b', 'c', 'd', 'e', 'f']
l3 = ['z', 't', 'c', 'd', 'w']

output = defaultdict(list)

for l in [l1, l2, l3]:
    for item in l:
        output[item].append(l)

output = [{k: v} for k, v in output.items() if len(v) == 3]

print(output)

Output:

[
  {'c': [['a', 'b', 'c', 'd', 'e'], ['b', 'c', 'd', 'e', 'f'], ['z', 't', 'c', 'd', 'w']]},
  {'d': [['a', 'b', 'c', 'd', 'e'], ['b', 'c', 'd', 'e', 'f'], ['z', 't', 'c', 'd', 'w']]}
]

Does this answer your question?
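
The code above hard-codes `len(v) == 3`, which returns only the items present in all three lists (c and d), while the question's expected output also includes b and e. One possible way to make the threshold explicit is sketched below; the function `items_in_at_least` and its `min_lists` parameter are hypothetical names introduced here, not part of the answer.

```python
from collections import defaultdict

lists = [
    ['a', 'b', 'c', 'd', 'e'],
    ['b', 'c', 'd', 'e', 'f'],
    ['z', 't', 'c', 'd', 'w'],
]

def items_in_at_least(lists, min_lists):
    """Return, sorted, the items that occur in at least min_lists of the lists."""
    seen_in = defaultdict(set)
    for index, lst in enumerate(lists):
        for item in lst:
            # a set of indices, so repeats inside one list count once
            seen_in[item].add(index)
    return sorted(k for k, v in seen_in.items() if len(v) >= min_lists)

in_all = items_in_at_least(lists, len(lists))  # present in every list
in_two_or_more = items_in_at_least(lists, 2)   # present in two or more lists
print(in_all)
print(in_two_or_more)
```

With the sample data, the first call yields c and d and the second yields b, c, d and e, which matches the output requested in the question.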

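【Comments】:

I think you need one more level of looping to get the individual characters.
He said "common items", so I assumed a list and compared its items. If you want to compare the individual characters inside string items, you need a third loop over each character of the item string.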
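
The third loop mentioned here might look like the sketch below. The input data is invented for illustration (the question's lists contain only single characters), and `char_ids` is a hypothetical name.

```python
from collections import defaultdict

# hypothetical input: each list holds multi-character strings
lists = [
    ['ab', 'cd'],
    ['bc', 'de'],
    ['zt', 'cw'],
]

# char_ids: character -> set of 1-based IDs of the lists containing it
char_ids = defaultdict(set)
for list_id, lst in enumerate(lists, 1):  # loop 1: each list
    for item in lst:                      # loop 2: each string in the list
        for char in item:                 # loop 3: each character in the string
            char_ids[char].add(list_id)

# keep only characters that occur in more than one list
shared = {c: sorted(ids) for c, ids in char_ids.items() if len(ids) > 1}
print(shared)
```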