【Posted】: 2021-02-22 23:05:00
【Question】:
I am trying to solve the Top K Frequent Words Leetcode problem in O(N log K) time, but I am getting an undesired result. My Python3 code and console output are below:
from collections import Counter
from typing import List
import heapq

class Solution:
    def topKFrequent(self, words: List[str], k: int) -> List[str]:
        counts = Counter(words)
        print('Word counts:', counts)
        result = []
        for word in counts:
            print('Word being added:', word)
            if len(result) < k:
                # Counts are negated so the most frequent word sits at the heap root.
                heapq.heappush(result, (-counts[word], word))
                print(result)
            else:
                heapq.heappushpop(result, (-counts[word], word))
        # Read the words straight out of the heap's underlying list.
        result = [r[1] for r in result]
        return result
----------- Console output -----------
Word counts: Counter({'the': 3, 'is': 3, 'sunny': 2, 'day': 1})
Word being added: the
[(-3, 'the')]
Word being added: day
[(-3, 'the'), (-1, 'day')]
Word being added: is
[(-3, 'is'), (-1, 'day'), (-3, 'the')]
Word being added: sunny
[(-3, 'is'), (-2, 'sunny'), (-3, 'the'), (-1, 'day')]
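When I run the test case ["the", "day", "is", "sunny", "the", "the", "sunny", "is", "is"] with K = 4, I find that the word 'the' gets shifted to the end of the list (after 'day') once 'is' is added, even though both have a count of 3. This makes sense: a parent only needs to be <= its children, and the children are not ordered relative to each other. Since (-2, 'sunny') and (-3, 'the') are both > (-3, 'is'), the heap invariant is in fact maintained, even though (-3, 'the') < (-2, 'sunny') and is the right child of (-3, 'is'). The expected result is ["is","the","sunny","day"], while the output of my code is ["is","sunny","the","day"].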
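The observation above can be checked with a small sketch. It starts from the final heap list printed in the console output; the variable names are illustrative only. It shows that the list satisfies the heap invariant without being sorted, and that popping repeatedly, rather than reading the list front to back, gives the tuples in ascending order.

```python
import heapq

# The heap exactly as printed on the last line of the console output.
heap = [(-3, 'is'), (-2, 'sunny'), (-3, 'the'), (-1, 'day')]

# Heap invariant: every parent is <= its children; siblings are unordered.
invariant_holds = all(
    heap[(i - 1) // 2] <= heap[i] for i in range(1, len(heap))
)

# Reading the underlying list front to back gives heap order, not sorted order.
heap_order = [word for _, word in heap]

# Popping repeatedly returns the tuples in ascending order.
work = list(heap)
popped_order = [heapq.heappop(work)[1] for _ in range(len(work))]

print(invariant_holds)  # True
print(heap_order)       # ['is', 'sunny', 'the', 'day']
print(popped_order)     # ['is', 'the', 'sunny', 'day']
```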
Should I be using a heap to solve this problem in O(N log K) time, and if so, how can I modify my code to achieve the expected result?
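For reference, here is a minimal sketch of one possible size-K heap approach; it is not a definitive answer. It assumes ties are broken alphabetically, as the expected output suggests. The `Entry` wrapper and the `top_k_frequent` function are hypothetical names introduced for illustration. The idea is to keep the worst candidate at the heap root so it can be evicted, then pop everything and reverse.

```python
import heapq
from collections import Counter
from typing import List


class Entry:
    """Hypothetical wrapper: the 'smallest' entry is the worst candidate."""

    def __init__(self, count: int, word: str) -> None:
        self.count = count
        self.word = word

    def __lt__(self, other: "Entry") -> bool:
        # A lower count is worse; on ties the alphabetically later word is worse.
        if self.count != other.count:
            return self.count < other.count
        return self.word > other.word


def top_k_frequent(words: List[str], k: int) -> List[str]:
    heap: List[Entry] = []
    for word, count in Counter(words).items():
        entry = Entry(count, word)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        else:
            # Push the new entry, then evict the worst of the k + 1 candidates.
            heapq.heappushpop(heap, entry)
    # Pops come out worst-first, so reverse to get best-first.
    ordered = [heapq.heappop(heap).word for _ in range(len(heap))]
    return ordered[::-1]
```

With U distinct words, the heap never holds more than K entries, so the loop costs O(U log K) after the O(N) counting pass. The final pops, not the heap's underlying list, determine the output order.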
【Comments】: