Generate all equal-sized partitions of a set: answers

【Question title】: Generate all equal-sized partitions of a set
【Posted】: 2017-02-17 06:28:08
【Question description】:

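I need a generator that takes a set of "agents" and a set of "items" as input and generates all partitions in which every agent gets the same number of items. For example: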

>>> for p in equalPartitions(["A","B"], [1,2,3,4]): print(p)
{'A': [1, 2], 'B': [3, 4]}
{'A': [1, 3], 'B': [2, 4]}
{'A': [1, 4], 'B': [2, 3]}
{'A': [2, 3], 'B': [1, 4]}
{'A': [2, 4], 'B': [1, 3]}
{'A': [3, 4], 'B': [1, 2]}

For two agents this is easy (assuming the number of items is even):

itemsPerAgent = len(items) // len(agents)
for bundle0 in itertools.combinations(items, itemsPerAgent):
    bundle1 = [item for item in items if item not in bundle0]
    yield {
        agents[0]: list(bundle0),
        agents[1]: bundle1
    }

For three agents it gets more complicated:

itemsPerAgent = len(items) // len(agents)
for bundle0 in itertools.combinations(items, itemsPerAgent):
    bundle12 =  [item for item in items if item not in bundle0]
    for bundle1 in itertools.combinations(bundle12, itemsPerAgent):
        bundle2 =  [item for item in bundle12 if item not in bundle1]
        yield {
            agents[0]: list(bundle0),
            agents[1]: list(bundle1),
            agents[2]: bundle2
            }

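Is there a more general solution that works for any number of agents?

【Question discussion】:

Just to clarify: do you always have a number of items that can be divided evenly among the agents (len(items) % len(agents) == 0)? If not, how do you distribute the items among the agents when they cannot be divided evenly?
@Highstaker Yes, I assume the number of items is always an integer multiple of the number of agents.
Are there duplicate items?

Tags: python algorithm python-3.x combinations partitioning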

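【Solution 1】:

Here is a recursive solution. It assigns the appropriate number of items to the first agent, then hands the rest of the problem to further calls of itself: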

from itertools import combinations

def part(agents, items):
    if len(agents) == 1:
        yield {agents[0]: items}
    else:
        quota = len(items) // len(agents)
        for indexes in combinations(range(len(items)), quota):
            remainder = items[:]
            selection = [remainder.pop(i) for i in reversed(indexes)][::-1]
            for result in part(agents[1:], remainder):
                result[agents[0]] = selection
                yield result

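In the trivial case of a single agent, we yield a single dictionary and terminate.

If there is more than one agent, we:

Generate all combinations of indexes into items that should be assigned to the first agent.
Pop the items at those indexes from a copy of items, in reverse order (to avoid shifting the indexes still to be popped), into selection, then reverse the result again with [::-1] to keep the expected order.
Call part() recursively on the remaining agents and items.
Add the selection we have already made to each result yielded by those recursive calls, and yield it.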
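
The popping step is the least obvious part, so here is a minimal sketch of it in isolation. The sample list and the index tuple are made up for illustration; in part() the tuple would come from combinations(range(len(items)), quota).

```python
items = ["a", "b", "c", "d", "e"]
indexes = (1, 3)  # hypothetical: one tuple as produced by combinations()

remainder = items[:]  # work on a copy so the caller's list is untouched
# Pop from the highest index down, so earlier pops cannot shift later targets,
# then reverse the popped values to restore their original order.
selection = [remainder.pop(i) for i in reversed(indexes)][::-1]

print(selection)  # ['b', 'd']
print(remainder)  # ['a', 'c', 'e']
```

Popping in ascending order instead would remove 'b' first and then take 'e' rather than 'd', because every later item moves one position to the left.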

Here it is in action:

>>> for p in part(["A", "B"], [1, 2, 3, 4]):
...     print(p)
... 
{'A': [1, 2], 'B': [3, 4]}
{'A': [1, 3], 'B': [2, 4]}
{'A': [1, 4], 'B': [2, 3]}
{'A': [2, 3], 'B': [1, 4]}
{'A': [2, 4], 'B': [1, 3]}
{'A': [3, 4], 'B': [1, 2]}

>>> for p in part(["A", "B", "C"], [1, 2, 3, 4, 5, 6, 7, 8, 9]):
...     print(p)
... 
{'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
{'A': [1, 2, 3], 'B': [4, 5, 7], 'C': [6, 8, 9]}
{'A': [1, 2, 3], 'B': [4, 5, 8], 'C': [6, 7, 9]}
{'A': [1, 2, 3], 'B': [4, 5, 9], 'C': [6, 7, 8]}
{'A': [1, 2, 3], 'B': [4, 6, 7], 'C': [5, 8, 9]}
  # [...]    
{'A': [7, 8, 9], 'B': [3, 4, 5], 'C': [1, 2, 6]}
{'A': [7, 8, 9], 'B': [3, 4, 6], 'C': [1, 2, 5]}
{'A': [7, 8, 9], 'B': [3, 5, 6], 'C': [1, 2, 4]}
{'A': [7, 8, 9], 'B': [4, 5, 6], 'C': [1, 2, 3]}

>>> for p in part(["A", "B", "C"], [1, 2, 3, 4, 5, 6, 7]):
...     print(p)
... 
{'A': [1, 2], 'B': [3, 4], 'C': [5, 6, 7]}
{'A': [1, 2], 'B': [3, 5], 'C': [4, 6, 7]}
{'A': [1, 2], 'B': [3, 6], 'C': [4, 5, 7]}
{'A': [1, 2], 'B': [3, 7], 'C': [4, 5, 6]}
  # [...]
{'A': [6, 7], 'B': [2, 5], 'C': [1, 3, 4]}
{'A': [6, 7], 'B': [3, 4], 'C': [1, 2, 5]}
{'A': [6, 7], 'B': [3, 5], 'C': [1, 2, 4]}
{'A': [6, 7], 'B': [4, 5], 'C': [1, 2, 3]}

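As you can see, it also handles the case where items cannot be divided evenly among agents (the leftover items go to the later agents). In addition, unlike the various permutations()-based answers, it wastes no work computing duplicate results, so it runs much faster than they do.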
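
One way to check the "no duplicate results" claim is to compare the number of results against the closed-form count n! / (q!)^k for k agents receiving q = n / k items each. The sketch below repeats part() so that it runs on its own; expected_count is a hypothetical helper, not part of the original answer, and it assumes the items divide evenly.

```python
from itertools import combinations
from math import factorial

def part(agents, items):
    # Same generator as in the answer, repeated so this block is standalone.
    if len(agents) == 1:
        yield {agents[0]: items}
    else:
        quota = len(items) // len(agents)
        for indexes in combinations(range(len(items)), quota):
            remainder = items[:]
            selection = [remainder.pop(i) for i in reversed(indexes)][::-1]
            for result in part(agents[1:], remainder):
                result[agents[0]] = selection
                yield result

def expected_count(n_items, n_agents):
    # Hypothetical helper: n! / (q!)^k, valid only when n_items % n_agents == 0.
    quota = n_items // n_agents
    return factorial(n_items) // factorial(quota) ** n_agents

results = list(part(["A", "B", "C"], list(range(1, 10))))
# Freeze every result into a hashable form; duplicates would collapse in the set.
distinct = {tuple(sorted((a, tuple(b)) for a, b in r.items())) for r in results}

print(len(results), len(distinct), expected_count(9, 3))  # 1680 1680 1680
```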

【Discussion】:

This worked best for me. It also works without the second reversal, [::-1].

【Solution 2】:

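If you have a permutations function that can handle repeated elements in its input without producing duplicate permutations in its output, you can do this quite efficiently. Unfortunately, itertools.permutations does not do what we want (len(list(itertools.permutations('aaa'))) is 6, not the 1 we want).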
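
To make the problem concrete, here is a small illustration using a label string chosen for this example: two agents, each due to receive two of four items. itertools.permutations treats equal elements at different positions as distinct, so most of its output is repeated.

```python
from itertools import permutations

labels = "AABB"  # hypothetical input: agent A twice, agent B twice

all_orderings = list(permutations(labels))
distinct_orderings = set(all_orderings)

print(len(all_orderings))       # 24
print(len(distinct_orderings))  # 6
```

The 6 distinct orderings correspond to the 6 partitions listed in the question; the other 18 results are repeats.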

Here is a permutation function I wrote for an earlier question, which happens to do the right thing with repeated input values:

def permutations(seq):
    perm = sorted(seq) # the first permutation is the sequence in sorted order
    while True:
        yield perm

        # find largest index i such that perm[i] < perm[i+1]
        for i in range(len(perm)-2, -1, -1):
            if perm[i] < perm[i+1]:
                break
        else: # if none was found, we've already found the last permutation
            return

        # find the largest index j such that perm[i] < perm[j] (always exists)
        for j in range(len(perm)-1, -1, -1):
            if perm[i] < perm[j]:
                break

        # Swap values at indexes i and j, then reverse the values from i+1
        # to the end. I've packed that all into one operation, with slices.
        perm = perm[:i]+perm[j:j+1]+perm[-1:j:-1]+perm[i:i+1]+perm[j-1:i:-1]

Now, here is how to use it to assign items to your agents:

def equal_partitions(agents, items):
    items_per_agent, extra_items = divmod(len(items), len(agents))
    item_assignments = agents * items_per_agent + agents[:extra_items]
    for assignment in permutations(item_assignments):
        result = {}
        for agent, item in zip(assignment, items):
            result.setdefault(agent, []).append(item)
        yield result

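The first two lines build a list of references to our agents that has the same length as the items list. Each agent is repeated as many times as the number of items it will receive. If the items list cannot be divided exactly evenly, I add a few extra references to the first few agents. You could add something else instead if you prefer (such as ['extra'] * extra_items, perhaps).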

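The main loop runs over the permutations of the assignment list. It then runs an inner loop that matches the agents from the permutation to the corresponding items, and packs the result into a dictionary in the format you want.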
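
As a rough illustration of those two steps, the sketch below builds the label list for an uneven case and then packs a single, hand-picked assignment into a dictionary. The sample values are assumptions chosen for the example.

```python
agents = ["A", "B"]

# Step 1: one label per item; with 5 items, the first agent gets the extra one.
items_per_agent, extra_items = divmod(5, len(agents))
labels = agents * items_per_agent + agents[:extra_items]
print(labels)  # ['A', 'B', 'A', 'B', 'A']

# Step 2: pair one ordering of the labels with the items, position by position.
items = [1, 2, 3, 4]
assignment = ["A", "B", "B", "A"]  # hypothetical: one permutation of A, A, B, B

result = {}
for agent, item in zip(assignment, items):
    result.setdefault(agent, []).append(item)

print(result)  # {'A': [1, 4], 'B': [2, 3]}
```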

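This code should be asymptotically optimal in both time and space for any number of agents or items. That said, it may still be slow, because it relies on my permutations function written in pure Python rather than on a faster implementation in C. There may be an efficient way to get the permutations we want using itertools, but I am not sure how.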

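【Discussion】:

【Solution 3】:

A very memory-inefficient solution, but fairly short and more "pythonic". Also, the order of the dictionaries in the result is quite arbitrary, IMO.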

import itertools as it
from pprint import pprint

agents = ('a', 'b', 'c')
items = (1,2,3,4,5,6,7,8,9)

items_per_agent = len(items) // len(agents)

def split_list(alist,max_size=1):
    """Yield successive n-sized chunks from alist."""
    for i in range(0, len(alist), max_size):
        yield alist[i:i+max_size]

def my_solution():
    # I have put this into one-liner below
    # combos = set()
    # i=0
    # for perm in it.permutations(items, len(items)):
    #   combo = tuple(tuple(sorted(chunk)) for chunk in split_list(perm, max_size=items_per_agent))
    #   combos.add(combo)
    #   print(combo, i)
    #   i+=1

    combos = {tuple(tuple(sorted(chunk)) for chunk in split_list(perm, max_size=items_per_agent)) for perm in it.permutations(items, len(items))}

    # I have put this into one-liner below
    # result = []
    # for combo in combos:
    #   result.append(dict(zip(agents,combo)))

    result = [dict(zip(agents,combo)) for combo in combos]

    pprint(result)

my_solution()
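
To give a feel for the cost of this approach, the sketch below counts how many permutations are examined versus how many distinct partitions survive deduplication. It uses a smaller input of 6 items so that it runs instantly, and the variable names are chosen for this example only.

```python
import itertools as it
from math import factorial

agents = ("a", "b", "c")
items = (1, 2, 3, 4, 5, 6)
size = len(items) // len(agents)

seen = set()
examined = 0
for perm in it.permutations(items):
    examined += 1
    # Sort inside each chunk so that reorderings within a bundle collapse.
    chunks = tuple(tuple(sorted(perm[i:i + size]))
                   for i in range(0, len(perm), size))
    seen.add(chunks)

print(examined, len(seen))  # 720 90

# The same arithmetic for the 9-item input used above:
print(factorial(9), factorial(9) // factorial(3) ** 3)  # 362880 1680
```

For the 9-item input, only 1 in 216 permutations contributes a new partition.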

【Discussion】:

【Solution 4】:

# -*- coding: utf-8 -*-

from itertools import combinations
from copy import copy


def main(agents, items):
    if len(items) % len(agents):
        return []

    result = [{'remain': items}]

    part_size = len(items) // len(agents)  # integer division: combinations() needs an int on Python 3

    while True:
        for item in result[:]:
            remain_agent = set(agents) - set(item.keys())
            if not remain_agent:
                continue

            result.remove(item)

            agent = remain_agent.pop()

            for combination in combinations(item['remain'], part_size):
                current_item = copy(item)
                current_item.update({agent: combination, 'remain': list(set(item['remain']) - set(combination))})
                result.append(current_item)

            break
        else: 
            break

    for item in result:
        item.pop('remain', None)
    return result


if __name__ == '__main__':
    agents = ['A', 'B', 'C']
    items = [1, 2, 3, 4, 5, 6]

    t = main(agents, items)

Here it is in action:

In [3]: agents = ['A', 'B']

In [4]: items = [1, 2, 3, 4]

In [5]: result = main(agents, items)

In [6]: for item in result:
   ...:     print(item)
   ...:
{'A': (1, 2), 'B': (3, 4)}
{'A': (1, 3), 'B': (2, 4)}
{'A': (1, 4), 'B': (2, 3)}
{'A': (2, 3), 'B': (1, 4)}
{'A': (2, 4), 'B': (1, 3)}
{'A': (3, 4), 'B': (1, 2)}

【Discussion】:

【Solution 5】:

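from itertools import permutations
def get(items, no_of_agents):
    def chunks(l, n):
        """Return l split into n chunks; the last chunk takes any leftover items."""
        rt = []
        ln = len(l) // n
        # Cut the first n - 1 chunks of size ln ...
        for i in range(0, ln * (n - 1), ln):
            rt.append(l[i:i + ln])
        # ... and put everything that remains into the last chunk.
        rt.append(l[ln * (n - 1):])
        return rt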

    for i in permutations(items, len(items)):
        yield chunks(i,no_of_agents)

def get_equal_partitions(items, agents):
    for i in get(items, len(agents)):
        yield dict(zip(agents, i))

items = [i for i in range(4)]
agents = ["A","B","C"]

for i in get_equal_partitions(items,agents):
    print(i)

【Discussion】:

This generates every possible permutation of each possible group, which is not what the question asks for.