How to find all partitions of a set (answers)

【Question title】: How to find all partitions of a set
【Posted】: 2013-12-30 01:47:41
【Question】:

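I have a set of distinct values. I'm looking for a way to generate all partitions of this set, i.e. all possible ways of dividing the set into subsets.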

For example, the set {1, 2, 3} has the following partitions:

{ {1}, {2}, {3} },
{ {1, 2}, {3} },
{ {1, 3}, {2} },
{ {1}, {2, 3} },
{ {1, 2, 3} }.

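Since these are sets in the mathematical sense, order is irrelevant. For example, {1, 2}, {3} is the same as {3}, {2, 1} and should not be a separate result.

A thorough definition of set partitions can be found on Wikipedia.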

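【Question comments】:

I can't say I've come across this problem before, and some searching didn't turn up a satisfactory answer either, +1. At first glance the code looks good too (certainly more concise than anything I've seen that comes close to this intent), +1 from me.
For a Python version, see stackoverflow.com/q/19368375/281545

Tags: c# algorithm set partitioning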

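【Solution 1】:

I've found a straightforward recursive solution.

First, let's solve a simpler problem: how to find all partitions that consist of exactly two parts. For an n-element set, we can count an int from 0 to (2^n)-1. This creates every n-bit pattern, with each bit corresponding to one input element. If the bit is 0, we place the element in the first part; if it is 1, the element goes into the second part. This leaves one problem: for each partition, we get a duplicate result where the two parts are swapped. To remedy this, we always place the first element into the first part. We then distribute only the remaining n-1 elements by counting from 0 to (2^(n-1))-1. (Pattern 0 would leave the second part empty, which is why the implementation below starts counting at 1.)

Now that we can split a set into two parts, we can write a recursive function that solves the rest of the problem. The function starts with the original set and finds all two-part partitions. For each of these partitions, it recursively finds all ways to split the second part into two parts, yielding all three-part partitions. It then splits the last part of each of these partitions to generate all four-part partitions, and so on.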
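As a rough illustration of the two steps described above, here is a minimal Python sketch of the same idea. The names `two_part_splits` and `all_partitions` are made up for this example and do not come from the original answer. The sketch assumes the input is an indexable sequence of distinct values.

```python
from typing import Iterator, List, Sequence, Tuple


def two_part_splits(elements: Sequence) -> Iterator[Tuple[list, list]]:
    """Yield every split of `elements` into two non-empty parts.

    The first element is pinned to the first part, so a split never
    shows up a second time with its two parts swapped.
    """
    n = len(elements)
    if n < 2:
        return
    # Pattern 0 would leave the second part empty, so start at 1.
    for pattern in range(1, 1 << (n - 1)):
        parts = ([elements[0]], [])
        for index in range(1, n):
            # Bit (index - 1) decides which part the element goes into.
            parts[(pattern >> (index - 1)) & 1].append(elements[index])
        yield parts


def all_partitions(elements: Sequence, fixed_parts: tuple = ()) -> Iterator[List[list]]:
    """Yield all partitions: the fixed parts followed by a partition of `elements`."""
    # Keep the remaining elements together as one final block.
    yield [*fixed_parts, list(elements)]
    # Split off a first part, then recursively subdivide the second part.
    for first, rest in two_part_splits(elements):
        yield from all_partitions(rest, (*fixed_parts, first))
```

Calling `list(all_partitions([1, 2, 3, 4]))` should produce 15 partitions, starting with `[[1, 2, 3, 4]]` and `[[1, 3, 4], [2]]`.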

The following is an implementation in C#. Calling

Partitioning.GetAllPartitions(new[] { 1, 2, 3, 4 })

yields

{ {1, 2, 3, 4} },
{ {1, 3, 4}, {2} },
{ {1, 2, 4}, {3} },
{ {1, 4}, {2, 3} },
{ {1, 4}, {2}, {3} },
{ {1, 2, 3}, {4} },
{ {1, 3}, {2, 4} },
{ {1, 3}, {2}, {4} },
{ {1, 2}, {3, 4} },
{ {1, 2}, {3}, {4} },
{ {1}, {2, 3, 4} },
{ {1}, {2, 4}, {3} },
{ {1}, {2, 3}, {4} },
{ {1}, {2}, {3, 4} },
{ {1}, {2}, {3}, {4} }.

using System;
using System.Collections.Generic;
using System.Linq;

namespace PartitionTest {
    public static class Partitioning {
        public static IEnumerable<T[][]> GetAllPartitions<T>(T[] elements) {
            return GetAllPartitions(new T[][]{}, elements);
        }

        private static IEnumerable<T[][]> GetAllPartitions<T>(
            T[][] fixedParts, T[] suffixElements)
        {
            // A trivial partition consists of the fixed parts
            // followed by all suffix elements as one block
            yield return fixedParts.Concat(new[] { suffixElements }).ToArray();

            // Get all two-group-partitions of the suffix elements
            // and sub-divide them recursively
            var suffixPartitions = GetTuplePartitions(suffixElements);
            foreach (Tuple<T[], T[]> suffixPartition in suffixPartitions) {
                var subPartitions = GetAllPartitions(
                    fixedParts.Concat(new[] { suffixPartition.Item1 }).ToArray(),
                    suffixPartition.Item2);
                foreach (var subPartition in subPartitions) {
                    yield return subPartition;
                }
            }
        }

        private static IEnumerable<Tuple<T[], T[]>> GetTuplePartitions<T>(
            T[] elements)
        {
            // No result if less than 2 elements
            if (elements.Length < 2) yield break;

            // Generate all 2-part partitions
            for (int pattern = 1; pattern < 1 << (elements.Length - 1); pattern++) {
                // Create the two result sets and
                // assign the first element to the first set
                List<T>[] resultSets = {
                    new List<T> { elements[0] }, new List<T>() };
                // Distribute the remaining elements
                for (int index = 1; index < elements.Length; index++) {
                    resultSets[(pattern >> (index - 1)) & 1].Add(elements[index]);
                }

                yield return Tuple.Create(
                    resultSets[0].ToArray(), resultSets[1].ToArray());
            }
        }
    }
}

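【Discussion】:

Wow, that's cool. You can answer your own question? I never would have thought of that.
Thanks, this is a brilliant solution, you made my day! Here's my JS ES6 implementation. I was looking for exactly this problem, all the combinations of sets you can split a set into, and I couldn't find anything until I learned that the technical term is "set partition".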

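【Solution 2】:

Please refer to the Bell number. Here is a brief way to think about the problem:
consider f(n,m) as the number of ways to partition a set of n elements into m non-empty sets.

For example, the partitions of a set of 3 elements can be:
1) into 1 set: {{1,2,3}}
2) into 2 sets: {{1,2},{3}}, {{1,3},{2}}, {{2,3},{1}}
3) into 3 sets: {{1}, {2}, {3}}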

Now let's calculate f(4,2).
There are two ways to make f(4,2):

A. Add a new set to f(3,1), which converts {{1,2,3}} to {{1,2,3}, {4}}
B. Add 4 to any of the sets of f(3,2), which converts
{{1,2},{3}}, {{1,3},{2}}, {{2,3},{1}}
to
{{1,2,4},{3}}, {{1,2},{3,4}}
{{1,3,4},{2}}, {{1,3},{2,4}}
{{2,3,4},{1}}, {{2,3},{1,4}}

So f(4,2) = f(3,1) + f(3,2)*2
which results in f(n,m) = f(n-1,m-1) + f(n-1,m)*m
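To check the recurrence numerically, here is a small Python sketch that evaluates f(n,m) and sums it over m to get the total number of partitions (the Bell number). The base case f(0,0) = 1 and the helper name `bell` are assumptions added for this example; the answer itself only states the recurrence.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def f(n: int, m: int) -> int:
    """Number of ways to split n distinct elements into m non-empty sets."""
    if n == 0 and m == 0:
        return 1  # assumed convention: the empty set has one (empty) partition
    if m < 1 or m > n:
        return 0
    # Either the new element forms its own set (m - 1 sets before),
    # or it joins one of the m sets that already exist.
    return f(n - 1, m - 1) + f(n - 1, m) * m


def bell(n: int) -> int:
    """Total number of partitions of an n-element set."""
    return sum(f(n, m) for m in range(n + 1))
```

With this sketch, `f(4, 2)` evaluates to `f(3, 1) + f(3, 2) * 2 = 1 + 3 * 2 = 7`, and `bell(3)` gives the 5 partitions listed in the question.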

Here is the Java code to get all partitions of a set:

import java.util.ArrayList;
import java.util.List;

public class SetPartition {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        for(int i=1; i<=3; i++) {
            list.add(i);
        }

        int cnt = 0;
        for(int i=1; i<=list.size(); i++) {
            List<List<List<Integer>>> ret = helper(list, i);
            cnt += ret.size();
            System.out.println(ret);
        }
        System.out.println("Number of partitions: " + cnt);
    }

    // partition f(n, m)
    private static List<List<List<Integer>>> helper(List<Integer> ori, int m) {
        List<List<List<Integer>>> ret = new ArrayList<>();
        if(ori.size() < m || m < 1) return ret;

        if(m == 1) {
            List<List<Integer>> partition = new ArrayList<>();
            partition.add(new ArrayList<>(ori));
            ret.add(partition);
            return ret;
        }

        // f(n-1, m)
        List<List<List<Integer>>> prev1 = helper(ori.subList(0, ori.size() - 1), m);
        for(int i=0; i<prev1.size(); i++) {
            for(int j=0; j<prev1.get(i).size(); j++) {
                // Deep copy from prev1.get(i) to l
                List<List<Integer>> l = new ArrayList<>();
                for(List<Integer> inner : prev1.get(i)) {
                    l.add(new ArrayList<>(inner));
                }

                l.get(j).add(ori.get(ori.size()-1));
                ret.add(l);
            }
        }

        List<Integer> set = new ArrayList<>();
        set.add(ori.get(ori.size() - 1));
        // f(n-1, m-1)
        List<List<List<Integer>>> prev2 = helper(ori.subList(0, ori.size() - 1), m - 1);
        for(int i=0; i<prev2.size(); i++) {
            List<List<Integer>> l = new ArrayList<>(prev2.get(i));
            l.add(set);
            ret.add(l);
        }

        return ret;
    }

}

The result is:
[[[1, 2, 3]]]
[[[1, 3], [2]], [[1], [2, 3]], [[1, 2], [3]]]
[[[1], [2], [3]]]
Number of partitions: 5

【Discussion】:

【Solution 3】:

Here is a non-recursive solution

class Program
{
    static void Main(string[] args)
    {
        var items = new List<Char>() { 'A', 'B', 'C', 'D', 'E' };
        int i = 0;
        foreach (var partition in items.Partitions())
        {
            Console.WriteLine(++i);
            foreach (var group in partition)
            {
                Console.WriteLine(string.Join(",", group));
            }
            Console.WriteLine();
        }
        Console.ReadLine();
    }
}  

public static class Partition
{
    public static IEnumerable<IList<IList<T>>> Partitions<T>(this IList<T> items)
    {
        if (items.Count() == 0)
            yield break;
        var currentPartition = new int[items.Count()];
        do
        {
            var groups = new List<T>[currentPartition.Max() + 1];
            for (int i = 0; i < currentPartition.Length; ++i)
            {
                int groupIndex = currentPartition[i];
                if (groups[groupIndex] == null)
                    groups[groupIndex] = new List<T>();
                groups[groupIndex].Add(items[i]);
            }
            yield return groups;
        } while (NextPartition(currentPartition));
    }

    private static bool NextPartition(int[] currentPartition)
    {
        int index = currentPartition.Length - 1;
        while (index >= 0)
        {
            ++currentPartition[index];
            if (Valid(currentPartition))
                return true;
            currentPartition[index--] = 0;
        }
        return false;
    }

    private static bool Valid(int[] currentPartition)
    {
        var uniqueSymbolsSeen = new HashSet<int>();
        foreach (var item in currentPartition)
        {
            uniqueSymbolsSeen.Add(item);
            if (uniqueSymbolsSeen.Count <= item)
                return false;
        }
        return true;
    }
}

【Discussion】:

【Solution 4】:

Here is a solution in Ruby that's about 20 lines long:

def copy_2d_array(array)
  # dup each inner array so that changes to the copy do not leak into the original
  array.inject([]) {|array_copy, item| array_copy.push(item.dup)}
end

#
# each_partition(n) { |partition| block}
#
# Call the given block for each partition of {1 ... n}
# Each partition is represented as an array of arrays.
# partition[i] is an array indicating the membership of that partition.
#
def each_partition(n)
  if n == 1
    # base case:  There is only one partition of {1}
    yield [[1]]
  else
    # recursively generate the partitions of {1 ... n-1}.
    each_partition(n-1) do |partition|
      # adding n to one of the subsets of partition generates
      # a new partition of {1 ... n}
      partition.each_index do |i|
        partition_copy = copy_2d_array(partition)
        partition_copy[i].push(n)
        yield (partition_copy)    
      end # each_index

      # Also adding the set {n} to a partition of {1 ... n-1}
      # generates a new partition of {1 ... n}
      partition_copy = copy_2d_array(partition)
      partition_copy.push [n]
      yield(partition_copy)
    end # block for recursive call to each_partition
  end # else
end # each_partition

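(I'm not trying to shill for Ruby; I just figured this solution may be easier for some readers to understand.)

【Discussion】:

【Solution 5】:

A trick I use for a set of N members:
1. Calculate 2^N.
2. Write each number from 0 to 2^N - 1 in binary.
3. You get 2^N binary numbers, each of length N (padded with leading zeros), and each number tells you how to split the set into subsets A and B: if the k-th digit is 0, put the k-th element in set A; otherwise put it in set B.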
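The steps above can be sketched in Python as follows. This assumes the numbers run from 0 to 2^N - 1 and that the least significant bit belongs to the first element; both choices, and the name `two_subset_splits`, are assumptions made for this example.

```python
def two_subset_splits(items):
    """Yield (A, B) for every N-bit mask: bit k == 0 sends items[k] to A, else to B."""
    n = len(items)
    for mask in range(1 << n):
        a = [x for k, x in enumerate(items) if not (mask >> k) & 1]
        b = [x for k, x in enumerate(items) if (mask >> k) & 1]
        yield a, b
```

Note that masks 0 and 2^N - 1 leave one of the two subsets empty, and every other split appears twice, once with A and B swapped.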

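【Discussion】:

Careful, this only finds 2-partitions, i.e. ways of dividing the set into 2 subsets. In the original example, this misses { {1}, {2}, {3} }.

【Solution 6】:

Just for fun, here's a shorter, purely iterative version: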

public static IEnumerable<List<List<T>>> GetAllPartitions<T>(T[] elements) {
    var lists = new List<List<T>>();
    var indexes = new int[elements.Length];
    lists.Add(new List<T>());
    lists[0].AddRange(elements);
    for (;;) {
        yield return lists;
        int i,index;
        for (i=indexes.Length-1;; --i) {
            if (i<=0)
                yield break;
            index = indexes[i];
            lists[index].RemoveAt(lists[index].Count-1);
            if (lists[index].Count>0)
                break;
            lists.RemoveAt(index);
        }
        ++index;
        if (index >= lists.Count)
            lists.Add(new List<T>());
        for (;i<indexes.Length;++i) {
            indexes[i]=index;
            lists[index].Add(elements[i]);
            index=0;
        }
    }
}

Test it here: https://ideone.com/EccB5n

And a simpler recursive version:

public static IEnumerable<List<List<T>>> GetAllPartitions<T>(T[] elements, int maxlen) {
    if (maxlen<=0) {
        yield return new List<List<T>>();
    }
    else {
        T elem = elements[maxlen-1];
        var shorter=GetAllPartitions(elements,maxlen-1);
        foreach (var part in shorter) {
            foreach (var list in part.ToArray()) {
                list.Add(elem);
                yield return part;
                list.RemoveAt(list.Count-1);
            }
            var newlist=new List<T>();
            newlist.Add(elem);
            part.Add(newlist);
            yield return part;
            part.RemoveAt(part.Count-1);
        }
    }
}

https://ideone.com/Kdir4e

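【Discussion】:

I tried both, and the recursive solution is very fast. Just one caveat: the contents of each returned partition are valid only while it is being enumerated. As soon as the next partition is enumerated, the contents of the previous one are cleared.
How can this be ported to C++? C++ lacks the yield keyword.
OK, I ported it to C++: stackoverflow.com/questions/59958703/…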
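The caveat in the first comment above (results are only valid while they are being enumerated) can be reproduced with a Python analogue of the recursive version. This is a sketch rather than a port of the exact C# code, and `partitions_in_place` is a name invented for this example.

```python
import copy


def partitions_in_place(elements, count=None):
    """Yield all partitions of elements[:count], reusing one nested list."""
    if count is None:
        count = len(elements)
    if count == 0:
        yield []
        return
    elem = elements[count - 1]
    for part in partitions_in_place(elements, count - 1):
        # Put elem into each existing block in turn, then undo the change.
        for block in list(part):
            block.append(elem)
            yield part
            block.pop()
        # Put elem into a block of its own, then undo the change.
        part.append([elem])
        yield part
        part.pop()


# Storing the yielded objects directly keeps five references to one list,
# which is empty again once the enumeration has finished.
naive = list(partitions_in_place([1, 2, 3]))

# Copying each partition while it is current preserves its contents.
safe = [copy.deepcopy(p) for p in partitions_in_place([1, 2, 3])]
```

Here `naive` ends up as five references to the same empty list, while `safe` holds the five distinct partitions. Copying has a cost, so it only makes sense when the results need to outlive the enumeration.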

【Solution 7】:

I have implemented Donald Knuth's very nice Algorithm H, which lists all partitions, in Matlab:

https://uk.mathworks.com/matlabcentral/fileexchange/62775-allpartitions--s--
http://www-cs-faculty.stanford.edu/~knuth/fasc3b.ps.gz

function [ PI, RGS ] = AllPartitions( S )
    %% check that we have at least two elements
    n = numel(S);
    if n < 2
        error('Set must have two or more elements');
    end    
    %% Donald Knuth's Algorithm H
    % restricted growth strings
    RGS = [];
    % H1
    a = zeros(1,n);
    b = ones(1,n-1);
    m = 1;
    while true
        % H2
        RGS(end+1,:) = a;
        while a(n) ~= m            
            % H3
            a(n) = a(n) + 1;
            RGS(end+1,:) = a;
        end
        % H4
        j = n - 1;
        while a(j) == b(j)
           j = j - 1; 
        end
        % H5
        if j == 1
            break;
        else
            a(j) = a(j) + 1;
        end
        % H6
        m = b(j) + (a(j) == b (j));
        j = j + 1;
        while j < n 
            a(j) = 0;
            b(j) = m;
            j = j + 1;
        end
        a(n) = 0;
    end
    %% get partitions from the restricted growth strings
    PI = PartitionsFromRGS(S, RGS);
end
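The listing calls `PartitionsFromRGS` without showing it. As a guess at what such a helper might do, here is a hypothetical Python equivalent: it assumes each restricted growth string assigns element i to the block numbered `rgs[i]`, with blocks numbered from 0. The name `partitions_from_rgs` and its behavior are assumptions, not the author's Matlab code.

```python
def partitions_from_rgs(items, rgs_rows):
    """Turn restricted growth strings into partitions (lists of blocks)."""
    result = []
    for rgs in rgs_rows:
        # The largest block number in the string determines the block count.
        blocks = [[] for _ in range(max(rgs) + 1)]
        for item, block_index in zip(items, rgs):
            blocks[block_index].append(item)
        result.append(blocks)
    return result
```

For example, the string `(0, 1, 0)` applied to `"abc"` would place `a` and `c` in block 0 and `b` in block 1.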

【Discussion】:

Is the link broken?

【Solution 8】:

def allPossiblePartitions(l): # l is the list whose possible partitions have to be found


    # to get all possible partitions, we consider the binary values from 0 to (2**len(l))//2 - 1
    """
    {123}       --> 000 (0)
    {12} {3}    --> 001 (1)
    {1} {2} {3} --> 010 (2)
    {1} {23}    --> 011 (3)  --> (0 to (2**3//2)-1)

    Each binary string marks the group boundaries: a new group starts
    whenever a bit differs from the previous one, so only groups of
    adjacent elements are produced.
    """
    for i in range(0,(2**len(l))//2):
            s = bin(i).replace('0b',"")
            s = '0'*(len(l)-len(s)) + s
            sublist = []
            prev = s[0]
            partitions = []
            k = 0
            for i in s:
                if (i == prev):
                    partitions.append(l[k])
                    k+=1
                else:
                    sublist.append(partitions)
                    partitions = [l[k]]
                    k+=1
                    prev = i
            sublist.append(partitions)
            print(sublist)

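【Discussion】:

While this code may solve the problem, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and would probably result in more upvotes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add an explanation and to state which limitations and assumptions apply.
@user13052579, your technique is not exhaustive. allPossiblePartitions([ABC]) misses [[AC], [B]]. See Spiralmoon's solution above... According to the Bell numbers, N = 4 should have 15 enumerated partitions, but your technique, allPossiblePartitions([ABCD]), yields 8. Missing: [[AC], [B], [D]]; [[AD], [B], [C]]; [[A], [BD], [C]]; [[ABD], [C]]; [[ACD], [B]].
For N = 4, there are 4 partitions consisting of a one-element set and a three-element set, but allPossiblePartitions() gives only two of them. If your algorithm worked, it would return the Bell number for the given list size.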