Python list of dictionaries search: answers

【Question title】: Python list of dictionaries search
【Posted】: 2012-01-29 00:35:39
【Question description】:

Suppose I have this:

[
  {"name": "Tom", "age": 10},
  {"name": "Mark", "age": 5},
  {"name": "Pam", "age": 7}
]

By searching for "Pam" as the name, I want to retrieve the matching dictionary: {"name": "Pam", "age": 7}

How can I do this?

【Question discussion】:

Tags: python search dictionary

【Solution 1】:

people = [
{'name': "Tom", 'age': 10},
{'name': "Mark", 'age': 5},
{'name': "Pam", 'age': 7}
]

def search(name):
    for p in people:
        if p['name'] == name:
            return p

search("Pam")

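【Discussion】:

It will return the first dictionary in the list that has the given name.
Just to make this very useful routine a little more generic:
    def search(list, key, value):
        for item in list:
            if item[key] == value:
                return item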
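The generic routine suggested in the comment above could be sketched as follows. The parameter name `records` (used instead of `list`, which would shadow the built-in) and the `default` argument are assumptions added for illustration, not part of the original comment.

```python
def search(records, key, value, default=None):
    """Return the first dict in records whose `key` equals `value`, else `default`."""
    for item in records:
        # .get() returns None for dictionaries that lack the key instead of raising KeyError
        if item.get(key) == value:
            return item
    return default


people = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

print(search(people, "name", "Pam"))  # {'name': 'Pam', 'age': 7}
print(search(people, "age", 5))       # {'name': 'Mark', 'age': 5}
print(search(people, "name", "Sam"))  # None
```

One caveat of this sketch: searching for the value None would also match dictionaries that do not have the key at all.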

【Solution 2】:

You can use a generator expression:

>>> dicts = [
...     { "name": "Tom", "age": 10 },
...     { "name": "Mark", "age": 5 },
...     { "name": "Pam", "age": 7 },
...     { "name": "Dick", "age": 12 }
... ]

>>> next(item for item in dicts if item["name"] == "Pam")
{'age': 7, 'name': 'Pam'}

If you need to handle the item not being there, you can do what user Matt suggested in his comment and provide a default by using a slightly different form of the call:

next((item for item in dicts if item["name"] == "Pam"), None)

And to find the index of the item, rather than the item itself, you can enumerate() the list:

next((i for i, item in enumerate(dicts) if item["name"] == "Pam"), None)

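【Discussion】:

Just to save anyone else a little time: if you need a default value in the event that "Pam" just isn't in the list: next((item for item in dicts if item["name"] == "Pam"), None)
What about [item for item in dicts if item["name"] == "Pam"][0]?
@Moberg, that's still a list comprehension, so it will iterate over the whole input sequence regardless of the position of the matching item.
This will raise a StopIteration error if the value is not present in any of the dictionaries (and no default is given).
@Siemkowski: then add enumerate() to generate a running index: next(i for i, item in enumerate(dicts) if item["name"] == "Pam").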
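To make the difference discussed above concrete, here is a small sketch contrasting next() with and without a default, plus the enumerate() variant. The variable names `outcome`, `missing` and `index` are only illustrative.

```python
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Pam", "age": 7},
]

# Without a default, next() raises StopIteration when nothing matches.
try:
    next(item for item in dicts if item["name"] == "Sam")
    outcome = "found"
except StopIteration:
    outcome = "StopIteration"

# With a default, the same lookup returns the default instead of raising.
missing = next((item for item in dicts if item["name"] == "Sam"), None)

# The enumerate() variant yields the position of the first match.
index = next((i for i, item in enumerate(dicts) if item["name"] == "Pam"), None)

print(outcome, missing, index)  # StopIteration None 1
```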

【Solution 3】:

You can use a list comprehension:

def search(name, people):
    return [element for element in people if element['name'] == name]

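【Discussion】:

This is nice because it returns all matches if there is more than one. Not exactly what the question asked for, but it's what I needed! Thanks!
Note that this returns a list!
Is it possible to pass two conditions? Such as if element['name'] == name and element['age'] == age? I tried it, but it doesn't seem to work; it says element is undefined in the second condition.
@Martynas Yes, it is possible. Don't forget to add the argument age to the function, def search2(name, age, people):, and don't forget to pass this argument as well =). I've just tried two conditions and it works!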
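The two-condition variant described in the comments might look like this. `search2` is the name used in the comment; the extra "Pam" record is made up here only to show that both conditions are applied.

```python
def search2(name, age, people):
    """Return every dict whose 'name' and 'age' both match."""
    return [
        element
        for element in people
        if element["name"] == name and element["age"] == age
    ]


people = [
    {"name": "Tom", "age": 10},
    {"name": "Pam", "age": 7},
    {"name": "Pam", "age": 30},  # hypothetical second "Pam"
]

print(search2("Pam", 7, people))   # [{'name': 'Pam', 'age': 7}]
print(search2("Pam", 99, people))  # []
```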

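【Solution 4】:

My first thought would be that you might want to consider creating a dictionary of these dictionaries, for example if you were going to search it more than a small number of times.

However, that might be a premature optimization. What would be wrong with: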

def get_records(key, store=()):
    '''Return a list of all records in store whose 'name' equals key.

    store is expected to be an iterable of dictionaries.
    '''
    assert key is not None
    return [d for d in store if d['name'] == key]

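【Discussion】:

Actually you can have a dictionary with a name=None item in it; but that wouldn't really work with this list comprehension, and it's probably not sane to allow it in your data store.
Asserts may be skipped if debug mode is off (for example when Python is run with -O).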
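Since assert statements are removed when Python runs with -O, one possible alternative is an explicit check that always executes. This is a sketch, not the answer author's code; raising ValueError and using .get() are assumptions made here.

```python
def get_records(key, store=()):
    """Return all records in store whose 'name' equals key."""
    if key is None:
        # Unlike assert, this check still runs under `python -O`.
        raise ValueError("key must not be None")
    # .get() tolerates records that have no 'name' key
    return [d for d in store if d.get("name") == key]


people = [
    {"name": "Tom", "age": 10},
    {"name": "Pam", "age": 7},
]

print(get_records("Pam", people))  # [{'name': 'Pam', 'age': 7}]
```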

【Solution 5】:

names = [{'name':'Tom', 'age': 10}, {'name': 'Mark', 'age': 5}, {'name': 'Pam', 'age': 7}]
resultlist = [d for d in names if d.get('name', '') == 'Pam']
first_result = resultlist[0]

This is one way...

【Discussion】:

I might suggest [d for d in names if d.get('name', '') == 'Pam'] ... to gracefully handle any entries in "names" that do not have a "name" key.

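【Solution 6】:

You have to go through all the elements of the list. There is no shortcut!

Unless somewhere else you keep a dictionary of the names pointing to the items of the list, but then you have to take care of the consequences of popping an element from your list.

【Discussion】:

This statement is true for an unsorted list with no key index, but it is not true in general. If the list is known to be sorted, not all elements need to be iterated over. Also, if a single record is hit and you know the keys are unique, or you only need one element, the iteration may stop and return that single item.
See @user334856's answer
@MelihYıldız' Maybe my statement was not clear. By using a list comprehension in his answer stackoverflow.com/a/8653572/512225, user334856 is going through the whole list. This confirms my statement. The answer you refer to is another way of saying what I wrote.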
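The sorted-list case raised in the discussion can be illustrated with a binary search. This is a minimal sketch that assumes the list is already sorted by "name" and that names are unique; the helper name `find_sorted` and the parallel `names` list are hypothetical.

```python
from bisect import bisect_left

# Assumption: the records are kept sorted by "name" and names are unique.
people = sorted(
    [
        {"name": "Tom", "age": 10},
        {"name": "Mark", "age": 5},
        {"name": "Pam", "age": 7},
    ],
    key=lambda d: d["name"],
)
names = [d["name"] for d in people]  # parallel list of sort keys, built once


def find_sorted(name):
    """Binary search: O(log n) comparisons instead of scanning every element."""
    i = bisect_left(names, name)
    if i < len(names) and names[i] == name:
        return people[i]
    return None


print(find_sorted("Pam"))  # {'name': 'Pam', 'age': 7}
print(find_sorted("Sam"))  # None
```

Building `names` is itself a full pass over the list, so this only pays off when the same sorted list is searched repeatedly.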

【Solution 7】:

dicts=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

# Index the dictionaries by name; a plain dict is enough when each name maps to one record.
dicts_by_name = {}
for d in dicts:
    dicts_by_name[d['name']] = d

print(dicts_by_name['Tom'])

#output
#>>>
#{'name': 'Tom', 'age': 10}
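If several records can share a name, a variation on this index is to let each name map to a list of records with defaultdict(list). The duplicate "Pam" entry below is invented for the illustration.

```python
from collections import defaultdict

dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Pam", "age": 7},
    {"name": "Pam", "age": 30},  # hypothetical duplicate name
]

# Each name maps to a list, so duplicates are kept rather than overwritten.
dicts_by_name = defaultdict(list)
for d in dicts:
    dicts_by_name[d["name"]].append(d)

print(dicts_by_name["Pam"])
# [{'name': 'Pam', 'age': 7}, {'name': 'Pam', 'age': 30}]

# .get() looks up a missing name without creating an empty entry for it.
print(dicts_by_name.get("Sam", []))  # []
```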

【Discussion】:

【Solution 8】:

Here is a general way of searching for a value in a list of dictionaries:

def search_dictionaries(key, value, list_of_dictionaries):
    return [element for element in list_of_dictionaries if element[key] == value]

【Discussion】:

【Solution 9】:

This looks to me like the most Pythonic way:

people = [
{'name': "Tom", 'age': 10},
{'name': "Mark", 'age': 5},
{'name': "Pam", 'age': 7}
]

filter(lambda person: person['name'] == 'Pam', people)

Result (returned as a list in Python 2):

[{'age': 7, 'name': 'Pam'}]

Note: In Python 3, a filter object is returned. So the Python 3 solution would be:

list(filter(lambda person: person['name'] == 'Pam', people))

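【Discussion】:

It is worth noting that this answer returns a list with all matches for 'Pam' in people; alternatively, we could get a list of all the people who are not 'Pam' by changing the comparison operator to !=. +1
Also worth mentioning that the result is a filter object, not a list. If you want to use things like len(), you need to call list() on the result first. Or: stackoverflow.com/questions/19182188/…
@wasabigeek This is what my Python 2.7 says: people = [ {'name': "Tom", 'age': 10}, {'name': "Mark", 'age': 5}, {'name': "Pam", 'age': 7} ] r = filter(lambda person: person['name'] == 'Pam', people) type(r) list So r is a list
List comprehensions are considered more Pythonic than map/filter/reduce: stackoverflow.com/questions/5426754/google-python-style-guide
To get the first match: next(filter(lambda x: x['name'] == 'Pam', dicts))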
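The point about filter objects in Python 3 can be checked with a short sketch like the one below; the variable names `has_len`, `as_list` and `comprehension` are only illustrative.

```python
people = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

matches = filter(lambda person: person["name"] == "Pam", people)

# In Python 3 a filter object is a lazy iterator, so len() raises TypeError.
try:
    len(matches)
    has_len = True
except TypeError:
    has_len = False

as_list = list(matches)  # materialize it to use len(), indexing, etc.
comprehension = [p for p in people if p["name"] == "Pam"]  # equivalent list comprehension

print(has_len, len(as_list), as_list == comprehension)  # False 1 True
```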

【Solution 10】:

@Frédéric Hamidi's answer is great. In Python 3.x the syntax for .next() changed slightly. Thus a slight modification:

>>> dicts = [
     { "name": "Tom", "age": 10 },
     { "name": "Mark", "age": 5 },
     { "name": "Pam", "age": 7 },
     { "name": "Dick", "age": 12 }
 ]
>>> next(item for item in dicts if item["name"] == "Pam")
{'age': 7, 'name': 'Pam'}

As mentioned in the comments by @Matt, you can add a default value like this:

>>> next((item for item in dicts if item["name"] == "Pam"), False)
{'name': 'Pam', 'age': 7}
>>> next((item for item in dicts if item["name"] == "Sam"), False)
False
>>>

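【Discussion】:

This is the best answer for Python 3.x. If you need a specific element from the dicts, such as age, you can write: next((item.get('age') for item in dicts if item["name"] == "Pam"), False)

【Solution 11】:

Adding a little bit to @FrédéricHamidi's answer.

In case you are not sure whether a key is present in every dict of the list, something like this would help: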

next((item for item in dicts if item.get("name") and item["name"] == "Pam"), None)

【Discussion】:

Or simply item.get("name") == "Pam"
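A quick sketch of why .get() helps when some dictionaries lack the key. The first record below deliberately has no "name" key; it is made up for the example.

```python
dicts = [
    {"age": 3},                  # hypothetical record without a "name" key
    {"name": "Pam", "age": 7},
]

# Indexing with item["name"] raises KeyError as soon as the first dict is tested.
try:
    next(item for item in dicts if item["name"] == "Pam")
    raised = False
except KeyError:
    raised = True

# item.get("name") returns None for the first dict, so the search continues.
found = next((item for item in dicts if item.get("name") == "Pam"), None)

print(raised, found)  # True {'name': 'Pam', 'age': 7}
```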

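【Solution 12】:

Here is a comparison between iterating through the list, using filter + lambda, and refactoring your code (if that is needed or valid for your case) to use a dict of dicts rather than a list of dicts.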

import time

# Build list of dicts
list_of_dicts = list()
for i in range(100000):
    list_of_dicts.append({'id': i, 'name': 'Tom'})

# Build dict of dicts
dict_of_dicts = dict()
for i in range(100000):
    dict_of_dicts[i] = {'name': 'Tom'}


# Find the one with ID of 99999

# 1. iterate through the list
lod_ts = time.time()
for elem in list_of_dicts:
    if elem['id'] == 99999:
        break
lod_tf = time.time()
lod_td = lod_tf - lod_ts

# 2. Use filter (list() forces evaluation, because filter is lazy in Python 3)
f_ts = time.time()
x = list(filter(lambda k: k['id'] == 99999, list_of_dicts))
f_tf = time.time()
f_td = f_tf - f_ts

# 3. find it in dict of dicts
dod_ts = time.time()
x = dict_of_dicts[99999]
dod_tf = time.time()
dod_td = dod_tf - dod_ts


print('List of Dictionaries took: %s' % lod_td)
print('Using filter took: %s' % f_td)
print('Dict of Dicts took: %s' % dod_td)

The output is this:

List of Dictionaries took: 0.0099310874939
Using filter took: 0.0121960639954
Dict of Dicts took: 4.05311584473e-06

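Conclusion: Clearly, having a dictionary of dicts is the most efficient way to search in cases where you know you will be searching by id only. Interestingly, using filter is the slowest solution.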
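A comparison of this kind can also be sketched with the timeit module, which repeats each lookup several times rather than timing a single run. The function names and the repeat count of 20 are arbitrary choices for this sketch; no timings are quoted because they depend on the machine and Python version.

```python
import timeit

list_of_dicts = [{"id": i, "name": "Tom"} for i in range(100000)]
dict_of_dicts = {d["id"]: d for d in list_of_dicts}
target = 99999  # worst case for the linear scans: the last element


def scan():
    # generator expression, stops at the first match
    return next((d for d in list_of_dicts if d["id"] == target), None)


def with_filter():
    # filter + lambda, also stops at the first match thanks to next()
    return next(filter(lambda d: d["id"] == target, list_of_dicts), None)


def by_key():
    # direct dictionary lookup by id
    return dict_of_dicts.get(target)


for fn in (scan, with_filter, by_key):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds:.6f}s for 20 lookups")
```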

【Discussion】:

【Solution 13】:

Have you ever tried the pandas package? It is well suited to this kind of search task and is optimized too.

import pandas as pd

listOfDicts = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

# Create a data frame, keys are used as column headers.
# Dict items with the same key are entered into the same respective column.
df = pd.DataFrame(listOfDicts)

# The pandas dataframe allows you to pick out specific values like so:

df2 = df[ (df['name'] == 'Pam') & (df['age'] == 7) ]

# Alternate syntax, same thing

df2 = df[ (df.name == 'Pam') & (df.age == 7) ]

I have added a little benchmarking below to illustrate pandas' faster runtimes on a larger scale, i.e. 100k+ entries:

setup_large = 'dicts = [];\
[dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 })) for _ in range(25000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'

setup_small = 'dicts = [];\
dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 }));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'

method1 = '[item for item in dicts if item["name"] == "Pam"]'
method2 = 'df[df["name"] == "Pam"]'

import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))

t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method Pandas: ' + str(t.timeit(100)))

#Small Method LC: 0.000191926956177
#Small Method Pandas: 0.044392824173
#Large Method LC: 1.98827004433
#Large Method Pandas: 0.324505090714

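【Discussion】:

【Solution 14】:

I found this thread when I was searching for an answer to the same question. While I realize that it's a late answer, I thought I'd contribute it in case it's useful to anyone else: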

def find_dict_in_list(dicts, default=None, **kwargs):
    """Find first matching :obj:`dict` in :obj:`list`.

    :param list dicts: List of dictionaries.
    :param dict default: Optional. Default dictionary to return.
        Defaults to `None`.
    :param **kwargs: `key=value` pairs to match in :obj:`dict`.

    :returns: First matching :obj:`dict` from `dicts`.
    :rtype: dict

    """

    rval = default
    for d in dicts:
        is_found = False

        # Search for keys in dict.
        for k, v in kwargs.items():
            if d.get(k, None) == v:
                is_found = True

            else:
                is_found = False
                break

        if is_found:
            rval = d
            break

    return rval


if __name__ == '__main__':
    # Tests
    dicts = []
    keys = 'spam eggs shrubbery knight'.split()

    start = 0
    for _ in range(4):
        dct = {k: v for k, v in zip(keys, range(start, start+4))}
        dicts.append(dct)
        start += 4

    # Find each dict based on 'spam' key only.  
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam) == dicts[x]

    # Find each dict based on 'spam' and 'shrubbery' keys.
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam, shrubbery=spam+2) == dicts[x]

    # Search for one correct key, one incorrect key:
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam, shrubbery=spam+1) is None

    # Search for non-existent dict.
    for x in range(len(dicts)):
        spam = x+100
        assert find_dict_in_list(dicts, spam=spam) is None

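【Discussion】:

【Solution 15】:

I tested various methods to go through a list of dictionaries and return the dictionaries where key x has a certain value.

Results:

Speed: list comprehension > generator expression >> normal list iteration >>> filter.
All of them scale linearly with the number of dicts in the list (10x list size -> 10x time).
The number of keys per dictionary does not affect speed significantly, even for large numbers (thousands) of keys. Please see this graph I calculated: https://imgur.com/a/quQzv (method names see below).

All tests were done with Python 3.6.4, W7x64.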

from random import randint
from timeit import timeit


list_dicts = []
for _ in range(1000):     # number of dicts in the list
    dict_tmp = {}
    for i in range(10):   # number of keys for each dict
        dict_tmp[f"key{i}"] = randint(0,50)
    list_dicts.append( dict_tmp )



def a():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass

def b():
    # use 'generator'
    for dict_ in (x for x in list_dicts if x["key3"] == 20):
        pass

def c():
    # use 'list'
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass

def d():
    # use 'filter'
    for dict_ in filter(lambda x: x['key3'] == 20, list_dicts):
        pass

Results:

1.7303 # normal list iteration 
1.3849 # generator expression 
1.3158 # list comprehension 
7.7848 # filter

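【Discussion】:

I added a function z() that implements next, as pointed out by Frédéric Hamidi above. Here are the results from the Py profile.

【Solution 16】:

Simply use a list comprehension:

[i for i in dct if i['name'] == 'Pam'][0]

Sample code: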

dct = [
    {'name': 'Tom', 'age': 10},
    {'name': 'Mark', 'age': 5},
    {'name': 'Pam', 'age': 7}
]

print([i for i in dct if i['name'] == 'Pam'][0])

> {'age': 7, 'name': 'Pam'}

【Discussion】:

【Solution 17】:

You can try this:

''' lst: list of dictionaries '''
lst = [{"name": "Tom", "age": 10}, {"name": "Mark", "age": 5}, {"name": "Pam", "age": 7}]

search = input("What name: ")  # Name to search for (say 'Pam')

# Print the first dictionary whose "name" matches the input.
print([d for d in lst if d["name"] == search][0])  # Output
>>> {'name': 'Pam', 'age': 7}

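【Discussion】:

【Solution 18】:

You can achieve this by using the filter and next functions in Python.

filter filters the given sequence and returns an iterator. next accepts an iterator and returns its next element.

So you can find the element with: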

my_dict = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7}
]

next(filter(lambda obj: obj.get('name') == 'Pam', my_dict), None)

The output is:

{'name': 'Pam', 'age': 7}

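Note: The above code will return None if the name we are searching for is not found.

【Discussion】:

This is a lot slower than list comprehensions.

【Solution 19】:

One simple way using list comprehensions is, if l is the list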

l = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

then

[d['age'] for d in l if d['name']=='Tom']

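【Discussion】:

【Solution 20】:

Most (if not all) of the implementations proposed here have two flaws:

They assume that only one key is passed for searching, while it may be interesting to search on several keys for complex dicts.
They assume that all keys passed for searching exist in the dicts, so they do not deal correctly with the KeyError that occurs when a key is missing.

An updated proposition: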

def find_first_in_list(objects, **kwargs):
    return next((obj for obj in objects if
                 len(set(obj.keys()).intersection(kwargs.keys())) > 0 and
                 all([obj[k] == v for k, v in kwargs.items() if k in obj.keys()])),
                None)

Maybe not the most Pythonic, but at least a bit more failsafe.

Usage:

>>> obj1 = find_first_in_list(list_of_dict, name='Pam', age=7)
>>> obj2 = find_first_in_list(list_of_dict, name='Pam', age=27)
>>> obj3 = find_first_in_list(list_of_dict, name='Pam', address='nowhere')
>>> 
>>> print(obj1, obj2, obj3)
{'name': 'Pam', 'age': 7} None {'name': 'Pam', 'age': 7}
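Note that in the usage above, obj3 still matches even though no dictionary has an 'address' key, because keys missing from a dictionary are ignored. If a stricter behaviour is wanted, where every given key must exist and match, one possible variant uses a sentinel object. The names `find_first_strict` and `_MISSING` are hypothetical, and `list_of_dict` is assumed to be the list from the question.

```python
_MISSING = object()  # sentinel that never compares equal to a real value


def find_first_strict(objects, **kwargs):
    """Return the first dict in which every given key exists and matches."""
    return next(
        (
            obj
            for obj in objects
            if all(obj.get(k, _MISSING) == v for k, v in kwargs.items())
        ),
        None,
    )


list_of_dict = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

print(find_first_strict(list_of_dict, name="Pam", age=7))
# {'name': 'Pam', 'age': 7}
print(find_first_strict(list_of_dict, name="Pam", address="nowhere"))
# None
```

With no keyword arguments at all, all() over an empty sequence is True, so this sketch returns the first dictionary in the list.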


【Discussion】:

【Solution 21】:

def dsearch(lod, **kw):
    return filter(lambda i: all((i[k] == v for (k, v) in kw.items())), lod)

lod=[{'a':33, 'b':'test2', 'c':'a.ing333'},
     {'a':22, 'b':'ihaha', 'c':'fbgval'},
     {'a':33, 'b':'TEst1', 'c':'s.ing123'},
     {'a':22, 'b':'ihaha', 'c':'dfdvbfjkv'}]



list(dsearch(lod, a=22))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
 {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]



list(dsearch(lod, a=22, b='ihaha'))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
 {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]


list(dsearch(lod, a=22, c='fbgval'))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'}]

【Discussion】:

【Solution 22】:

I would create a dict of dicts like so:

names = ["Tom", "Mark", "Pam"]
ages = [10, 5, 7]
my_d = {}

for i, j in zip(names, ages):
    my_d[i] = {"name": i, "age": j}

Or, using exactly the same info as in the posted question:

info_list = [{"name": "Tom", "age": 10}, {"name": "Mark", "age": 5}, {"name": "Pam", "age": 7}]
my_d = {}

for d in info_list:
    my_d[d["name"]] = d

Then you can do my_d["Pam"] and get {"name": "Pam", "age": 7}
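The same index can also be written as a dict comprehension. A sketch, assuming names are unique (a later record with the same name would silently replace an earlier one):

```python
info_list = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# One pass to build the index; afterwards each lookup is a single dict access.
my_d = {d["name"]: d for d in info_list}

print(my_d["Pam"])      # {'name': 'Pam', 'age': 7}
print(my_d.get("Sam"))  # None, instead of a KeyError for an unknown name

# The index holds references to the same dict objects as the list,
# so updating a record through one is visible through the other.
my_d["Pam"]["age"] = 8
print(info_list[2])     # {'name': 'Pam', 'age': 8}
```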

【Discussion】:

【Solution 23】:

Put the accepted answer in a function for easy re-use

def get_item(collection, key, target):
    return next((item for item in collection if item[key] == target), None)

Or also as a lambda

get_item_lambda = lambda collection, key, target : next((item for item in collection if item[key] == target), None)

Result

    key = "name"
    target = "Pam"
    print(get_item(target_list, key, target))
    print(get_item_lambda(target_list, key, target))

    #{'name': 'Pam', 'age': 7}
    #{'name': 'Pam', 'age': 7}

In case the key may not be in the target dictionary, use dict.get to avoid a KeyError

def get_item(collection, key, target):
    return next((item for item in collection if item.get(key, None) == target), None)

get_item_lambda = lambda collection, key, target : next((item for item in collection if item.get(key, None) == target), None)

【Discussion】:
