下面是一个 C# 算法,它应该比您发布的算法快得多(并且使用更少的内存)。它不使用您使用的整洁的二进制技巧,因此代码要长一些。它比你的有更多的for 循环,并且可能需要一两次使用调试器逐步完成它才能完全理解它。但它实际上是一种更简单的方法,只要您了解它在做什么。
作为奖励,返回的集合具有更“自然”的顺序。它将按照您在问题中列出的顺序返回集合 {1 2 3} 的子集。这不是重点,而是所用算法的副作用。
在我的测试中,我发现这个算法比你为一大组 22 个项目发布的算法快大约 4 倍(这是我可以在我的机器上运行的最大容量,而不会出现过多的磁盘抖动也不会影响结果很多)。你的一次运行大约需要 15.5 秒,而我的大约需要 3.6 秒。
对于较小的列表,差异不太明显。对于一组只有 10 个项目,您的运行 10,000 次大约需要 7.8 秒,而我的运行大约需要 3.2 秒。对于包含 5 个或更少项目的集合,它们几乎同时运行。通过多次迭代,您的运行速度会更快一些。
无论如何,这是代码。抱歉这么久;我试着确保我评论得很好。
/*
* Made it static, because it shouldn't really use or modify state data.
* Making it static also saves a tiny bit of call time, because it doesn't
* have to receive an extra "this" pointer. Also, accessing a local
* parameter is a tiny bit faster than accessing a class member, because
* dereferencing the "this" pointer is not free.
*
* Made it generic so that the same code can handle sets of any type.
*/
static IList<IList<T>> PowerSet<T>(IList<T> set){
if(set == null)
throw new ArgumentNullException("set");
/*
* Caveat:
* If set.Count > 30, this function pukes all over itself without so
* much as wiping up afterwards. Even for 30 elements, though, the
* result set is about 68 GB (if "set" is comprised of ints). 24 or
* 25 elements is a practical limit for current hardware.
*/
int setSize = set.Count;
int subsetCount = 1 << setSize; // MUCH faster than (int)Math.Pow(2, setSize)
T[][] rtn = new T[subsetCount][];
/*
* We don't really need dynamic list allocation. We can calculate
* in advance the number of subsets ("subsetCount" above), and
* the size of each subset (0 through setSize). The performance
* of List<> is pretty horrible when the initial size is not
* guessed well.
*/
int subsetIndex = 0;
for(int subsetSize = 0; subsetSize <= setSize; subsetSize++){
/*
* The "indices" array below is part of how we implement the
* "natural" ordering of the subsets. For a subset of size 3,
* for example, we initialize the indices array with {0, 1, 2};
* Later, we'll increment each index until we reach setSize,
* then carry over to the next index. So, assuming a set size
* of 5, the second iteration will have indices {0, 1, 3}, the
* third will have {0, 1, 4}, and the fifth will involve a carry,
* so we'll have {0, 2, 3}.
*/
int[] indices = new int[subsetSize];
for(int i = 1; i < subsetSize; i++)
indices[i] = i;
/*
* Now we'll iterate over all the subsets we need to make for the
* current subset size. The number of subsets of a given size
* is easily determined with combination (nCr). In other words,
* if I have 5 items in my set and I want all subsets of size 3,
* I need 5-pick-3, or 5C3 = 5! / 3!(5 - 3)! = 10.
*/
for(int i = Combination(setSize, subsetSize); i > 0; i--){
/*
* Copy the items from the input set according to the
* indices we've already set up. Alternatively, if you
* just wanted the indices in your output, you could
* just dup the index array here (but make sure you dup!
* Otherwise the setup step at the bottom of this for
* loop will mess up your output list! You'll also want
* to change the function's return type to
* IList<IList<int>> in that case.
*/
T[] subset = new T[subsetSize];
for(int j = 0; j < subsetSize; j++)
subset[j] = set[indices[j]];
/* Add the subset to the return */
rtn[subsetIndex++] = subset;
/*
* Set up indices for next subset. This looks a lot
* messier than it is. It simply increments the
* right-most index until it overflows, then carries
* over left as far as it needs to. I've made the
* logic as fast as I could, which is why it's hairy-
* looking. Note that the inner for loop won't
* actually run as long as a carry isn't required,
* and will run at most once in any case. The outer
* loop will go through as few iterations as required.
*
* You may notice that this logic doesn't check the
* end case (when the left-most digit overflows). It
* doesn't need to, since the loop up above won't
* execute again in that case, anyway. There's no
* reason to waste time checking that here.
*/
for(int j = subsetSize - 1; j >= 0; j--)
if(++indices[j] <= setSize - subsetSize + j){
for(int k = j + 1; k < subsetSize; k++)
indices[k] = indices[k - 1] + 1;
break;
}
}
}
return rtn;
}
static int Combination(int n, int r){
if(r == 0 || r == n)
return 1;
/*
* The formula for combination is:
*
* n!
* ----------
* r!(n - r)!
*
* We'll actually use a slightly modified version here. The above
* formula forces us to calculate (n - r)! twice. Instead, we only
* multiply for the numerator the factors of n! that aren't canceled
* out by (n - r)! in the denominator.
*/
/*
* nCr == nC(n - r)
* We can use this fact to reduce the number of multiplications we
* perform, as well as the incidence of overflow, where r > n / 2
*/
if(r > n / 2) /* We DO want integer truncation here (7 / 2 = 3) */
r = n - r;
/*
* I originally used all integer math below, with some complicated
* logic and another function to handle cases where the intermediate
* results overflowed a 32-bit int. It was pretty ugly. In later
* testing, I found that the more generalized double-precision
* floating-point approach was actually *faster*, so there was no
* need for the ugly code. But if you want to see a giant WTF, look
* at the edit history for this post!
*/
double denominator = Factorial(r);
double numerator = n;
while(--r > 0)
numerator *= --n;
return (int)(numerator / denominator + 0.1/* Deal with rounding errors. */);
}
/*
* The archetypical factorial implementation is recursive, and is perhaps
* the most often used demonstration of recursion in text books and other
* materials. It's unfortunate, however, that few texts point out that
* it's nearly as simple to write an iterative factorial function that
* will perform better (although tail-end recursion, if implemented by
* the compiler, will help to close the gap).
*/
static double Factorial(int x){
/*
* An all-purpose factorial function would handle negative numbers
* correctly - the result should be Sign(x) * Factorial(Abs(x)) -
* but since we don't need that functionality, we're better off
* saving the few extra clock cycles it would take.
*/
/*
* I originally used all integer math below, but found that the
* double-precision floating-point version is not only more
* general, but also *faster*!
*/
if(x < 2)
return 1;
double rtn = x;
while(--x > 1)
rtn *= x;
return rtn;
}