Finding (number of) overlaps in a list of time ranges

【Question title】: Finding (number of) overlaps in a list of time ranges
【Posted】: 2010-02-11 14:15:31
【Question】:

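Given a list of time ranges, I need to find the maximum number of overlaps.

Below is a dataset showing a 10-minute interval of calls, from which I am trying to find the maximum number of active lines in that interval, i.e. from the example below, what is the maximum number of calls that were active at the same time: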

CallStart   CallEnd
2:22:22 PM  2:22:33 PM
2:22:35 PM  2:22:42 PM
2:22:36 PM  2:22:43 PM
2:22:46 PM  2:22:54 PM
2:22:49 PM  2:27:21 PM
2:22:57 PM  2:23:03 PM
2:23:29 PM  2:23:40 PM
2:24:08 PM  2:24:14 PM
2:27:37 PM  2:39:14 PM
2:27:47 PM  2:27:55 PM
2:29:04 PM  2:29:26 PM
2:29:31 PM  2:29:43 PM
2:29:45 PM  2:30:10 PM

If anyone knows an algorithm or can point me in the right direction, I would be grateful.

TIA,

Steve F

【Question comments】:

Tags: algorithm

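【Solution 1】:

The following should work:

Sort all your time values and save the Start or End state for each time value.
Set numberOfCalls to 0 (count variable)
Run through your time values and:
- increment numberOfCalls if the time value is marked as Start
- decrement numberOfCalls if the time value is marked as End
- keep track of the maximum value of numberOfCalls during the process (and the time value at which it occurred)

Complexity: O(n log(n)) for sorting, O(n) to run through all records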
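
The steps above, including the part about remembering when the maximum occurred, could be sketched as follows. This is only an illustration: the function name `max_overlap_with_time` and the `fmt` parameter are made up here, and the time format is assumed to match the question's data. Ends are tagged 0 and starts 1 so that, at equal times, an end is processed before a start.

```python
from datetime import datetime

def max_overlap_with_time(calls, fmt="%I:%M:%S %p"):
    """Return (maximum number of simultaneous calls, time it was first reached)."""
    events = []
    for start, end in calls:
        # Tag ends with 0 and starts with 1: tuples sort by time first, then
        # by tag, so an end at time t is handled before a start at time t.
        events.append((datetime.strptime(end, fmt), 0))
        events.append((datetime.strptime(start, fmt), 1))
    events.sort()

    count = 0
    best = 0
    best_time = None
    for when, is_start in events:
        if is_start:
            count += 1
            if count > best:  # new maximum: remember when it happened
                best, best_time = count, when
        else:
            count -= 1
    return best, best_time

calls = [
    ("2:22:35 PM", "2:22:42 PM"),
    ("2:22:36 PM", "2:22:43 PM"),
    ("2:22:43 PM", "2:22:50 PM"),
]
best, when = max_overlap_with_time(calls)
print(best, when.strftime("%H:%M:%S"))
```

Parsing the strings into `datetime` values avoids relying on text comparison of the timestamps.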

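【Comments】:

Simple indeed. I posted another solution that doesn't require sorting, and I wonder how it would fare in terms of performance...
How do you keep track of the maximum value of numberOfCalls?
@ygnhzeus, keep it in a separate variable and update it whenever the current value of numberOfCalls becomes greater than the previous maximum
It misses one use case. Suppose that at some point there are multiple starts and ends, e.g. at 2:25:00 there are 2 starts and 3 ends. The sorted list will then have 5 values at 2:25:00, with the 2 starts and 3 ends in random order. But for the algorithm to work correctly, the ends should come before the starts.
@vladimir very nice and clear solution, thank you. But what if we want to return all the overlapping times rather than the number of overlaps? Thanks again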

【Solution 2】:

Here is a working algorithm in Python

def maximumOverlap(calls):
    times = []
    for call in calls:
        startTime, endTime = call
        times.append((startTime, 'start'))
        times.append((endTime, 'end'))
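    # Tuples sort by time first, then by label; 'end' < 'start', so an end
    # at the same instant is processed before a start.
    # Note: the times are strings and are compared as text, which orders them
    # correctly only when they all share one format and width (as they do here).
    times = sorted(times)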

    count = 0
    maxCount = 0
    for time in times:
        if time[1] == 'start':
            count += 1    # increment on arrival/start
        else:
            count -= 1    # decrement on departure/end
        maxCount = max(count, maxCount)  # maintain maximum
    return maxCount

calls = [
('2:22:22 PM', '2:22:33 PM'),
('2:22:35 PM', '2:22:42 PM'),
('2:22:36 PM', '2:22:43 PM'),
('2:22:46 PM', '2:22:54 PM'),
('2:22:49 PM', '2:27:21 PM'),
('2:22:57 PM', '2:23:03 PM'),
('2:23:29 PM', '2:23:40 PM'),
('2:24:08 PM', '2:24:14 PM'),
('2:27:37 PM', '2:39:14 PM'),
('2:27:47 PM', '2:27:55 PM'),
('2:29:04 PM', '2:29:26 PM'),
('2:29:31 PM', '2:29:43 PM'),
('2:29:45 PM', '2:30:10 PM'),
]
print(maximumOverlap(calls))

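【Comments】:

【Solution 3】:

How about a naive approach:

Take the earliest start time and the latest end time (this is your range R)
Take the shortest call duration, d (sorting, O(n log n))
Create an array C of ceil(R/d) integers, zero-initialized
Now, for each call, add 1 to the cells that cover the call's duration, O(n * ceil(R/d))
Loop over the array C and save the maximum (O(ceil(R/d)))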
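
A rough sketch of these steps, assuming integer timestamps (for example seconds) and calls of non-zero length; the name `bucketed_max_overlap` is made up for illustration. Because each cell spans d time units, two calls that touch the same cell without actually overlapping are both counted there, so the result is an upper bound on the true maximum rather than an exact answer.

```python
import math

def bucketed_max_overlap(calls):
    """Count calls per cell of width d (the shortest call duration).

    calls: list of (start, end) integer pairs with end > start.
    Returns the largest per-cell count, an upper bound on the true overlap.
    """
    lo = min(start for start, _ in calls)    # earliest start
    hi = max(end for _, end in calls)        # latest end
    d = min(end - start for start, end in calls)  # shortest duration, assumed > 0

    cells = [0] * math.ceil((hi - lo) / d)
    for start, end in calls:
        first = (start - lo) // d
        last = (end - lo - 1) // d  # calls are treated as half-open [start, end)
        for k in range(first, last + 1):
            cells[k] += 1
    return max(cells)

print(bucketed_max_overlap([(0, 4), (2, 6)]))   # these two calls really overlap
print(bucketed_max_overlap([(0, 5), (6, 10)]))  # these do not, yet share a cell
```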

I guess you could also model this as a graph and fiddle around with it, but that beats me at the moment.

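【Comments】:

【Solution 4】:

It seems to me that a greedy algorithm will do what is needed. The problem is similar to finding the number of platforms required for a given train timetable, so the number of overlaps will be the number of platforms required.
The callStart times are sorted. Start putting each call into an array (a platform). For calls i and (i + 1), if callEnd[i] > callStart[i+1] then they cannot go in the same array (or platform); put as many calls as possible in the first array. Then repeat the process with the remaining calls until all the calls are used up. In the end, the number of arrays is the maximum number of overlaps. The complexity will be O(n).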
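
One way to read the platform idea is sketched below; `count_platforms` is a hypothetical name, and the interpretation (each pass fills one platform with calls that do not collide with the last call placed on it) is an assumption about what the answer intends. In this sketch each pass is linear, so the total work is proportional to the number of calls times the number of platforms, plus the initial sort.

```python
def count_platforms(calls):
    """Greedy 'train platform' count for (start, end) pairs.

    Each pass fills one platform with as many non-colliding calls as
    possible, taken in start order; the leftovers go to the next pass.
    """
    remaining = sorted(calls)  # order by start time
    platforms = 0
    while remaining:
        platforms += 1
        last_end = None
        leftover = []
        for start, end in remaining:
            # A call fits if it starts at or after the end of the last
            # call already placed on this platform.
            if last_end is None or start >= last_end:
                last_end = end
            else:
                leftover.append((start, end))
        remaining = leftover
    return platforms

print(count_platforms([(0, 10), (5, 15), (10, 20)]))
```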

【Comments】:

【Solution 5】:

The following page has examples of solving this problem in many languages: http://rosettacode.org/wiki/Max_Licenses_In_Use

【Comments】:

【Solution 6】:

Sort the list on CallStart. Then, for each element (i), check for all j < i whether

CallEnd[j] > CallStart[i] // put it in a map with CallStart[i]  as the key and some count

The rest should be easy enough.
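
A minimal sketch of that pairwise check, with the made-up name `max_overlap_quadratic`. Instead of a map it keeps only the running maximum, and it compares every call with all earlier ones, so it is quadratic in the number of calls.

```python
def max_overlap_quadratic(calls):
    """For each call i, count earlier calls j that are still active when i starts."""
    ordered = sorted(calls)  # sort on start time
    best = 0
    for i, (start_i, _) in enumerate(ordered):
        # Call i itself, plus every earlier call whose end lies after start_i.
        active = 1 + sum(1 for _, end_j in ordered[:i] if end_j > start_i)
        best = max(best, active)
    return best

print(max_overlap_quadratic([(0, 10), (5, 15), (10, 20)]))
```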

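【Comments】:

【Solution 7】:

It's amazing how, for some problems, solutions sometimes just pop into one's mind... I think this may be the simplest solution ;)

You can represent the times in seconds, from the beginning of your range (0) to its end (600). A call is a pair of times.

Python algorithm: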

def maxSimultaneousCalls(calls):
  """Returns the maximum number of simultaneous calls
  calls   : list of calls
    (represented as pairs [begin,end] with begin and end in seconds)
  """
  # Shift the calls so that 0 corresponds to the beginning of the first call.
  # (Names other than min/max are used so the builtins are not shadowed.)
  first_begin = min(call[0] for call in calls)

  tmpCalls = [(call[0] - first_begin, call[1] - first_begin) for call in calls]
  last_end = max(call[1] for call in tmpCalls)

  # Find how many calls were active at each second during the interval [0,last_end]
  seconds = [0] * (last_end + 1)
  for call in tmpCalls:
    # A call is counted for each second in [begin, end); the end second is excluded
    for i in range(call[0], call[1]):
      seconds[i] += 1

  return max(seconds)

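Note that this does not tell you which calls were active at that moment ;)

But in terms of complexity it is very easy to evaluate: it is linear in the total duration of the calls.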
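
Since the cost grows with call length, one possible variation (not part of the answer above) is a difference array: record +1 at each begin and -1 at each end, then take a running sum. The sketch below assumes integer seconds and uses the made-up name `max_simultaneous_calls_diff`; each call then costs O(1), plus one pass over the whole time range.

```python
from itertools import accumulate

def max_simultaneous_calls_diff(calls):
    """Difference-array variant: calls are (begin, end) pairs in integer seconds."""
    if not calls:
        return 0
    origin = min(begin for begin, _ in calls)
    horizon = max(end for _, end in calls) - origin

    delta = [0] * (horizon + 1)
    for begin, end in calls:
        delta[begin - origin] += 1  # the call becomes active at its begin second
        delta[end - origin] -= 1    # and is no longer active at its end second

    # The running sum of delta is the number of active calls at each second.
    return max(accumulate(delta))

print(max_simultaneous_calls_diff([(0, 10), (5, 15), (10, 20)]))
```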

【Comments】:

【Solution 8】:

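I think an important element of a good solution to this problem is recognizing that each end time is >= its call's start time and that the start times are ordered. So rather than reading the whole list and sorting it, we only need to read the calls in order of start time and merge from a min-heap of the end times. This also addresses Sanjeev's comment about processing ends before starts when they have exactly the same time value: we poll the end-time min-heap and remove an end time whenever its value is <= the current call's start time.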

import heapq

def max_concurrent_calls(calls):
    """calls: iterable of (start, end) pairs, already ordered by start time."""
    max_calls = 0
    # A min-heap will typically do the least amount of sorting needed here.
    # Its size tells us the number of currently active calls.
    # heapq works on a plain list and supports multiple equal keys.
    end_times = []
    for start, end in calls:
        heapq.heappush(end_times, end)
        # Remove every call that has already ended when this one starts.
        while end_times and end_times[0] <= start:
            heapq.heappop(end_times)
        # Check size after calls have ended so that a start at the same time
        # doesn't count as an additional call.
        # Also handles zero duration calls as not counting.
        max_calls = max(max_calls, len(end_times))
    return max_calls

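【Comments】:

【Solution 9】:

This looks like a reduce operation. The analogy is that each time a call starts, the current number of active calls increases by 1, and each time a call ends, it decreases by 1.

Once you have that stream of active-call counts, all you need to do is apply a max operation to it. Here is a working example, originally written for Python 2: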

from functools import reduce  # a builtin on Python 2; must be imported on Python 3
from itertools import chain
inp = ((123, 125),
       (123, 130),
       (123, 134),
       (130, 131),
       (130, 131),
       (130, 132),)

# technical: tag each point as start or end of a call
data = chain(*(((a, 'start'), (b, 'end')) for a, b in inp))

def r(state, d):
    last = state[-1]
    # if a call is started we add one to the number of calls,
    # if it ends we subtract one
    current = (1 if d[1] == 'start' else -1)
    state.append(last + current)
    return state

# sorted() orders by time, then by tag; 'end' < 'start', so on ties ends are applied first
max_intersect = max(reduce(r, sorted(data), [0]))

print(max_intersect)

【Comments】: