The most efficient way to find the point where the maximum number of intervals overlap in Python: answers

【Question title】: The most efficient way to find the point where maximum number of intervals overlap with python
【Posted】: 2018-12-22 20:05:27
【Question description】:

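Suppose I have a log register that records the times at which users enter and leave a server. I need to find the time at which the number of open sessions is highest. If there are several possible answers, the smallest one should be chosen. The first line of the input contains the number of sessions.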

Example input:

Output:

I tried this script:

from collections import Counter, OrderedDict

load = Counter()
with open("input.txt", "r") as f:
    n = int(f.readline())
    for i in range(n):
        session = f.readline()
        session = session.split()
        load.update(range(int(session[0]), int(session[1])+1))

load = load.most_common()
i = 0
max = load[0][1]
candidates = []
while load[i][1] == max:
    candidates.append(load[i][0])
    i += 1
print(min(candidates))

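First, I use Counter() to count how many times each point occurs. Second, I use load = load.most_common() to sort the resulting dict by number of occurrences. Finally, I take the smallest of all keys that share the maximum value (= number of occurrences).

In fact, it would be much simpler if Counter() returned a dict sorted by key.

Anyway, this is a homework task, and on one of the test inputs it runs for more than 1 second (the given time limit). What can be done to speed it up? I have read about interval trees, but I am not sure whether they are relevant.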

【Comments】:

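"Most efficient" ... "with Python" raises an eyebrow. If every nanosecond matters, why are you using Python? How much readability are you willing to trade for speed?
This looks a lot like a hackerrank.com "riddle" ...
@timgeb I don't see any problem with that. Assume that this condition is part of the task.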

Tags: python algorithm intervals

【Solution 1】:

Suppose ins and outs are the login and logout times:

ins = [4,0,1,7,2]
outs = [5,3,9,8,6]

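Combine them into a single sorted list, using the sign of each number to mark whether it is an "arrival" (positive) or a "departure" (negative):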

times = sorted(ins + [-x for x in outs], key=abs)

Now walk through the list and count the "arrivals" and "departures" as they occur:

lmax = -1
logged = 0
for t in times:
    if t >= 0:
        logged += 1
        if logged > lmax:
            tmax = t
            lmax = logged
    else:
        logged -= 1

print(tmax, lmax)
#2 3
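
One caveat with the sign trick: since -0 == 0, a logout at time 0 cannot be told apart from a login at time 0. A minimal sketch of the same sweep with explicit event tuples follows. It assumes integer times and that a session still counts at its logout time (as the question's range(start, end + 1) suggests); the function name busiest_time is hypothetical.

```python
def busiest_time(ins, outs):
    """Return (time, count) for the earliest time with the most open sessions."""
    # Kind 0 = login, kind 1 = logout. Tuples sort by time first, then kind,
    # so a login at time t is processed before a logout at the same t.
    events = sorted([(t, 0) for t in ins] + [(t, 1) for t in outs])
    best_time, best_count, logged = None, 0, 0
    for t, kind in events:
        if kind == 0:
            logged += 1
            if logged > best_count:  # strict '>' keeps the earliest maximum
                best_time, best_count = t, logged
        else:
            logged -= 1
    return best_time, best_count


print(busiest_time([4, 0, 1, 7, 2], [5, 3, 9, 8, 6]))
```

For the sample lists this prints (2, 3), matching the output above. The cost is dominated by the sort, so it grows with the number of sessions rather than with the length of each session.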

【Comments】:

【Solution 2】:

A quick solution is to store +1 and -1 at the entry and exit times, then sort the dict keys, accumulate the values step by step, and take the maximum:

data = """5
4 5
0 3
1 9
7 8
2 6"""
with open("input.txt", "w") as f:
    f.write(data) 

d = {}
with open("input.txt", "r") as f:
    next(f)
    for line in f:  
        if line.strip():
            start, stop = map(int,line.strip().split())
            d.setdefault(start,0)
            d[start] += 1
            d.setdefault(stop,0)
            d[stop] -= 1

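maxx = 0
s = 0
max_time = None

# accumulate the +1/-1 deltas over the sorted times from the dict
for key in sorted(d):
    s += d[key]
    if maxx < s:  # remember the new maximum and the time at which it occurs
        maxx = s
        max_time = key
# print the time itself; an index into the sorted keys only equals the
# time when the keys happen to be 0, 1, 2, ... without gaps
print(max_time)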

If it is still too slow to solve your challenge, you could use a defaultdict(int).
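
A sketch of what the defaultdict(int) variant might look like, reading from a list of lines instead of input.txt so that it is self-contained. It assumes integer times and, unlike the code above (which places the -1 at the exit time itself), counts a session at its exit time by placing the -1 at stop + 1. The name peak_time is hypothetical.

```python
from collections import defaultdict


def peak_time(lines):
    """Return the earliest time with the most sessions, or None if there are none."""
    delta = defaultdict(int)  # missing keys start at 0, so no setdefault is needed
    for line in lines[1:]:    # the first line holds the session count
        if line.strip():
            start, stop = map(int, line.split())
            delta[start] += 1
            delta[stop + 1] -= 1  # assumption: the session still counts at `stop`
    best_time, best, running = None, 0, 0
    for t in sorted(delta):
        running += delta[t]
        if running > best:  # strict '>' keeps the earliest maximum
            best_time, best = t, running
    return best_time


print(peak_time("5\n4 5\n0 3\n1 9\n7 8\n2 6".splitlines()))
```

For the sample data this prints 2. Whether the exit time should count as part of the session depends on the task statement, so the stop + 1 offset is a choice to check against the expected outputs.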

【Comments】: