Divide a Python list into subsets of lists (the fewer subsets the better), each with a sum less than K: answers

【Question title】: Divide a python list into subsets of lists (the smaller the number of subsets the better), each with sum less than K
【Posted】: 2019-12-01 07:13:44
【Question】:

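I'm quite new to Python. I'm writing a program and ran into a problem that can be summarized as follows:

Suppose we have a list of numbers (each less than 5), e.g. [1.5, 3, 4, 2.5, 1, 4, 0.5, etc.]. I want to divide this list into subsets of lists, with the condition that the sum of the items in each subset is <= 5. The list can contain up to 200 items.

The optimal solution would be the one that returns the minimum number of subsets. But I'm not looking for the optimal solution, just one that is good enough.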

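【Comments】:

Is it important to keep the list and its subsets in order? (In your example, the solution would then be [[1.5, 3], [4], [2.5, 1], [4, 0.5]])
Are there any other conditions you haven't mentioned? As stated, you could just return subsets that each contain a single number.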

Tags: python list

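【Answer 1】:

This is called the bin packing problem. It is a well-studied NP-hard problem (its decision version is NP-complete), which means no known algorithm gives exact answers (i.e. with the true minimum number of sublists) while also staying efficient on larger inputs.

However, since you only need a "good enough" solution, you're in luck; there are many good heuristics that give good answers in practice. A nice, simple one is the "first-fit decreasing" algorithm: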

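Sort the items in decreasing order (i.e. largest to smallest).
Initialize a list to store the sublists. Initially, there are none.
For each item:
- If any sublist has enough remaining capacity, insert the item into the first such sublist.
- Otherwise, create a new empty sublist and insert the item into it.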

The result is proven to always use at most (11/9)b + 1 sublists, where b is the number of sublists used by an optimal solution (Yue, 1990).
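The first-fit decreasing steps above can be sketched in a few lines of Python. This is a minimal illustration, not code from the answer: the function name `first_fit_decreasing` and the `capacity` parameter are made up for this example, and it assumes every item is no larger than `capacity` (as in the question, where each number is less than 5).

```python
def first_fit_decreasing(items, capacity=5):
    """Pack items into sublists whose sums do not exceed capacity.

    Hypothetical helper for illustration; assumes each item <= capacity.
    """
    bins = []  # the sublists built so far
    sums = []  # running total of each sublist, parallel to bins
    for item in sorted(items, reverse=True):  # largest first
        for i, total in enumerate(sums):
            if total + item <= capacity:
                # first sublist with enough remaining capacity
                bins[i].append(item)
                sums[i] += item
                break
        else:
            # no existing sublist had room, so open a new one
            bins.append([item])
            sums.append(item)
    return bins


print(first_fit_decreasing([1.5, 3, 4, 2.5, 1, 4, 0.5]))
# [[4, 1], [4, 0.5], [3, 1.5], [2.5]]
```

The inner scan makes this quadratic in the worst case, which should be unproblematic for the 200 items mentioned in the question.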

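【Comments】:

Hello, thank you for your answer. After posting the question I ended up with a similar approach. It's good to have confirmation that it is good enough. The paper looks interesting, thanks for the link.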

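【Answer 2】:

I'd argue that this is more of an algorithm question than a Python-specific one, but one algorithm that feels simple enough to me is to sort the list and build "buckets" (sub-lists): start each bucket with the max element, then keep adding numbers from the front of the sorted list until the next one no longer fits.

In Python it might look like this: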

x = [1.5, 3, 4, 2.5 , 1, 4, 0.5]
x.sort()
buckets = []

while x:  # keep going until every number has been placed

    last_elem = x.pop()  # pop removes the last element and returns it
    new_bucket = [last_elem]  # create a new bucket initially with just that
    new_bucket_sum = last_elem

    # for the remaining numbers
    num_added = 0
    for num in x:
        if num + new_bucket_sum > 5:
            break
        new_bucket.append(num) # add it to the sub-list
        new_bucket_sum += num  # account for the sum
        num_added += 1  # increase our count for this iteration

    buckets.append(new_bucket)  # add the bucket
    x = x[num_added:]  # take a sub-list of x (getting rid of the numbers added)


    # Note that we now loop again until all numbers have been placed into buckets

# After the while loop ends, you have all the buckets
print(buckets)

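That's my intuition. I'd say there are more "pythonic" ways to write this algorithm, but since you're new to Python, I figured breaking it down and commenting it might help. There may well be better algorithms out there, too. Cheers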
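As one possible example of a more compact variant of the same idea, the sketch below wraps it in a function and uses `collections.deque` so that numbers can be taken cheaply from both ends of the sorted list. The function name `pack_largest_then_smallest` and the `capacity` parameter are invented for this illustration, and it again assumes no single number exceeds `capacity`.

```python
from collections import deque


def pack_largest_then_smallest(items, capacity=5):
    """Start each bucket with the largest remaining number, then top it
    up with the smallest remaining numbers while they still fit.

    Hypothetical helper for illustration; assumes each item <= capacity.
    """
    remaining = deque(sorted(items))  # smallest on the left, largest on the right
    buckets = []
    while remaining:
        bucket = [remaining.pop()]  # largest remaining number
        total = bucket[0]
        # add from the front (smallest first) until the next one would overflow
        while remaining and total + remaining[0] <= capacity:
            smallest = remaining.popleft()
            bucket.append(smallest)
            total += smallest
        buckets.append(bucket)
    return buckets


print(pack_largest_then_smallest([1.5, 3, 4, 2.5, 1, 4, 0.5]))
# [[4, 0.5], [4, 1], [3, 1.5], [2.5]]
```

Unlike the loop above, this version does not modify the input list, because `sorted` works on a copy.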

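【Comments】:

Thank you very much for your answer. You're right, this is more of an algorithm-type question, but your Python illustration was helpful. I ended up with a similar approach.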

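【Answer 3】:

Just wanted to add that if the elements of the resulting list of lists must keep their original order (relative to the input list), then you can do this: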

elts = [1.5, 3, 4, 2.5 , 1, 4, 0.5]
res = []

temp = []      # for accumulating the numbers
temp_sum = 0   # the sum of the accumulated numbers

for e in elts:
    temp_sum += e    # update the sum with current element
    if temp_sum > 5:
        # if updating the sum with the current element
        # makes the sum overshoot the limit
        # then don't accumulate the current element
        # instead ...
        res.append(temp)  # append the previously accumulated elements to the result
        temp = [e]        # start a new accumulator with the current element
        temp_sum = e      # start a new accumulated sum with the current element
    else:
        # if updating the sum with the current element
        # does not make the sum overshoot the limit ...
        temp.append(e)    # accumulate current element

# finally, append the last accumulator to the result
# (skipped when the input was empty, to avoid a stray empty sub-list)
if temp:
    res.append(temp)

The result res will be [[1.5, 3], [4], [2.5, 1], [4, 0.5]]

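【Comments】:

【Answer 4】:

I liked this challenge, so I wrote a heuristic algorithm based on random sampling of the base list. It keeps looking for the best solution for a preset number of iterations: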

import numpy as np


#base_randlist = np.random.random(200) * 5

base_randlist = np.array([1.5, 3, 4, 2.5 , 1, 4, 0.5])

print(base_randlist)

sets = []
for i in range(10000):

    set_ = []
    subset = []
    randlist = base_randlist

    while randlist.shape[0] != 0:
        while True:
            if randlist.shape[0] == 0:
                set_.append(subset)
                break
            ind = np.random.randint(0, randlist.shape[0])
            last_subset = subset.copy()
            subset.append(randlist[ind])

            if sum(subset) <= 5:
                randlist = np.delete(randlist, ind)
            else:
                set_.append(last_subset)
                subset = []
                break
    sets.append(set_)

min_setnum = np.inf
for i, s in enumerate(sets):
    if min_setnum > len(s):
        min_setnum = len(s)
        min_ind = i

print(sets[min_ind])
print(min_setnum)

Output (from one run; the sampling is random, so the exact subsets can differ between runs):

[1.5 3.  4.  2.5 1.  4.  0.5]
[[3.0, 0.5], [1.5, 2.5], [4.0], [4.0, 1.0]]
4
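One rough way to judge how good a result like this is: no solution can use fewer than ceil(total / capacity) subsets, since each subset holds at most `capacity`. The sketch below checks a candidate solution and computes that bound. The helper names `subset_count_lower_bound` and `is_valid_partition` are made up for this illustration, and the bound is only a simple estimate, so a larger count is not necessarily suboptimal.

```python
import math


def subset_count_lower_bound(items, capacity=5):
    """Smallest number of subsets that could possibly hold all items."""
    return math.ceil(sum(items) / capacity)


def is_valid_partition(items, subsets, capacity=5):
    """True if subsets use exactly the given items and none exceeds capacity."""
    same_items = sorted(items) == sorted(x for s in subsets for x in s)
    within_limit = all(sum(s) <= capacity for s in subsets)
    return same_items and within_limit


items = [1.5, 3, 4, 2.5, 1, 4, 0.5]
found = [[3.0, 0.5], [1.5, 2.5], [4.0], [4.0, 1.0]]  # the output shown above

print(is_valid_partition(items, found))               # True
print(len(found), subset_count_lower_bound(items))    # 4 4
```

Here the numbers sum to 16.5, so at least 4 subsets are needed, and the 4 subsets found above cannot be improved on.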

【Comments】: