How to get a set of keys with largest values?

【Question title】: How to get a set of keys with largest values?
【Posted】: 2016-12-05 06:15:31
【Question description】:

I am working on a function:

def common_words(dictionary, N):
     if len(dictionary) > N:
         max(dictionary, key=dictionary.get)

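The function specification is:

The first parameter is a dictionary of word counts and the second is a positive integer N. The function should update the dictionary so that it contains only the most common (highest-frequency) words. At most N words should be included in the dictionary. If including all the words with some word count would cause the dictionary to exceed N words, then no word with that count should be included. (That is, in the case of a tie for the (N+1)-th most common word, omit all the words in the tie.)

So I know I need to get the N items with the highest values, but I don't know how to do that. I also know that once I have the N items, I need to pop any of them whose value is tied with an item that was left out.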

For example, given

k = {'a':5, 'b':4, 'c':4, 'd':1}

then

common_words(k, 2)

should modify k so that it becomes {'a':5}.

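【Question comments】:

Please provide sample input and output.
>>> k= {'a':5,'b':4,'c':4,'d':1} >>> common_words(k,2) should return 'a'
Can you explain your question with an example? I don't understand the logic behind it.

Tags: python dictionary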

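【Solution 1】:

Here is my algorithm for solving this problem.

Extract the data from the dictionary into a list and sort it in descending order of the dictionary values.
Clear the original dictionary.
Group the sorted data into groups that share the same value.
Repopulate the dictionary with all the (key, value) pairs of each group in the sorted list, as long as doing so keeps the total dictionary size <= N; otherwise stop.

The grouping operation is easy to do with the standard itertools.groupby function.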
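
As a small illustration of the grouping step (the sample pairs are made up for this example, not taken from the answer), groupby collects consecutive items that share a key, which is why the list has to be sorted by that key first:

```python
from itertools import groupby

# Hypothetical sample data: (word, count) pairs already sorted by count, descending.
pairs = [('a', 5), ('b', 4), ('c', 4), ('d', 1)]

# groupby only merges *adjacent* items with equal keys, so sorting first matters.
groups = [(count, [word for word, _ in group])
          for count, group in groupby(pairs, key=lambda t: t[1])]

print(groups)  # [(5, ['a']), (4, ['b', 'c']), (1, ['d'])]
```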

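To perform the sorting and the grouping we need an appropriate key function, as described in the groupby, list and sorted docs. Since we need the second item of each tuple, we could use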

def keyfunc(t):
    return t[1]

or

keyfunc = lambda t: t[1]

but it is more efficient to use operator.itemgetter.
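
A quick check, using made-up data, that itemgetter(1) picks the same element as the lambda version and therefore produces the same ordering:

```python
from operator import itemgetter

# Hypothetical sample data for illustration only.
items = [('d', 1), ('a', 5), ('b', 4)]

by_lambda = sorted(items, key=lambda t: t[1], reverse=True)
by_getter = sorted(items, key=itemgetter(1), reverse=True)

print(itemgetter(1)(('a', 5)))  # 5
print(by_lambda == by_getter)   # True
```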

from operator import itemgetter
from itertools import groupby

def common_words(d, n):
    keyfunc = itemgetter(1)
    lst = sorted(d.items(), key=keyfunc, reverse=True)
    d.clear()
    for _, g in groupby(lst, key=keyfunc):
        g = list(g)
        if len(d) + len(g) <= n:
            d.update(g)
        else:
            break

# test

data = {'a':5, 'b':4, 'c':4, 'd':1} 

common_words(data, 4)
print(data)
common_words(data, 2)
print(data)

Output

{'c': 4, 'd': 1, 'b': 4, 'a': 5}
{'a': 5}

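【Comments】:

Hi, thanks for the quick reply! I haven't come across some of these functions before, such as itemgetter and groupby. Could you explain them, and tell me which alternatives I could use instead?
This is just a practice assignment :) I'm trying to practice with dictionaries.
You know, you can look functions up in the docs.
@EnesDal I've added more explanation and some links to the official docs.
Thank you very much for your help!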
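
For readers who, like the commenter, have not met itemgetter or groupby yet, here is a rough sketch of the same idea using only a lambda and plain loops. It is not the answerer's code; the name common_words_plain is made up for this example.

```python
def common_words_plain(d, n):
    """Keep at most n of the most frequent words in d, dropping whole ties."""
    # Sort (word, count) pairs by count, highest first.
    pairs = sorted(d.items(), key=lambda t: t[1], reverse=True)
    d.clear()
    i = 0
    while i < len(pairs):
        # Find the end (j) of the run of pairs that share pairs[i]'s count.
        j = i
        while j < len(pairs) and pairs[j][1] == pairs[i][1]:
            j += 1
        # All earlier pairs were kept, so j is the size d would reach.
        if j > n:
            break
        d.update(pairs[i:j])
        i = j

k = {'a': 5, 'b': 4, 'c': 4, 'd': 1}
common_words_plain(k, 2)
print(k)  # {'a': 5}
```

The function modifies the dictionary in place and returns None, as the question asks.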

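【Solution 2】:

My algorithm is as follows:

First, build a list of tuples from the dictionary, sorted by value from largest to smallest.
Check whether the value of item[N-1] matches that of item[N]; if so, drop item[N-1] (indexing starts at 0, hence the -1).
Finally, convert the slice of the tuple list holding at most N elements back into a dict. If you want to keep the item order, you can use OrderedDict here instead.

If the dictionary holds no more than N items, the function simply returns it as is.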
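
The OrderedDict suggestion could look like the sketch below. It assumes the list of tuples is already sorted as described, and the sample values are made up. On Python 3.7 and later a plain dict also preserves insertion order.

```python
from collections import OrderedDict

# Hypothetical (word, count) pairs, sorted from largest to smallest count.
tmp = [('a', 5), ('b', 4), ('d', 1)]

# Slicing keeps the first two pairs; OrderedDict remembers their order.
ordered = OrderedDict(tmp[:2])
print(list(ordered.items()))  # [('a', 5), ('b', 4)]
```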

def common_words(dictionary, N):
    if len(dictionary) > N:
        tmp = [(k,dictionary[k]) for k in sorted(dictionary, key=dictionary.get, reverse=True)]
        # Drop every word tied with the (N+1)-th most common one, not just one of them
        while N > 0 and tmp[N-1][1] == tmp[N][1]:
            N -= 1
        return dict(tmp[:N])
        # return [i[0] for i in tmp[:N]] # comment line above and uncomment this line to get keys only as your title mention how to get keys
    else:
        return dictionary
        # return dictionary.keys() # comment line above and uncomment this line to get keys only as your title mention how to get keys

>>> common_words({'a':5, 'b':4, 'c':4, 'd':1}, 2)
{'a': 5}

The OP wants to modify the input dictionary inside the function and return None; the code can be changed as follows:

def common_words(dictionary, N):
    if len(dictionary) > N:
        tmp = [(k,dictionary[k]) for k in sorted(dictionary, key=dictionary.get, reverse=True)]
        # Drop every word tied with the (N+1)-th most common one, not just one of them
        while N > 0 and tmp[N-1][1] == tmp[N][1]:
            N -= 1
        # return dict(tmp[:N])
        for i in tmp[N:]:
            dictionary.pop(i[0])

>>> k = {'a':5, 'b':4, 'c':4, 'd':1}
>>> common_words(k, 2)
>>> k
{'a': 5}

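【Comments】:

Thanks for the reply! It helped me a lot. One question, though: I tried changing the code so that it returns nothing and you have to evaluate the dictionary afterwards to see the result, but I couldn't figure it out. For example: >>> k={'lets':3,'go':2,'and':2,'by':1} >>> common_words(k,2) >>> k {'lets':3}
If I understand correctly, you want the input dictionary to become the output of the common_words function. The simpler way is k=common_words(k,2); another way is to change the line return dict(tmp[:N]) to @987654328@, so that the input dictionary is modified inside the function.
I want the function to return None, so that after the function runs you have to enter the dictionary again to get the value, as in the example I posted: you need to ask for k again to get the answer.
See the updated answer, which follows the suggestion in my comment.
Thanks! I read “for i in tmp[N:]:dictionary.pop(i[0]” as a single line and got confused. Thanks again for your help.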