Finding all the unique permutations of a string without generating duplicates: answers

【Question title】: Finding all the unique permutations of a string without generating duplicates
【Posted】: 2012-02-09 20:00:34
【Question】:

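All permutations of a string can be found with the well-known Steinhaus-Johnson-Trotter algorithm. But if the string contains repeated characters, for example AABB, then the number of unique permutations is 4!/(2! * 2!) = 6.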
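As a quick sanity check on that count, here is a minimal C sketch that computes n!/(k1! * k2! * ...) for a string, where each k is the number of times one character occurs. The function name `count_unique_permutations` is hypothetical (it does not come from the question), and the sketch assumes the result fits in an `unsigned long long`.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of distinct permutations of str,
 * i.e. n! / (k1! * k2! * ...), where each k counts one byte value.
 * Assumes the result fits in unsigned long long. */
unsigned long long count_unique_permutations(const char *str) {
    size_t counts[UCHAR_MAX + 1] = {0};
    unsigned long long result = 1;
    unsigned long long placed = 0;

    for (const unsigned char *p = (const unsigned char *)str; *p; ++p) {
        size_t k = ++counts[*p];   /* this is the k-th occurrence of the byte */
        ++placed;                  /* characters processed so far */
        /* The running value is always the multinomial coefficient of the
         * prefix seen so far, so this division is exact. */
        result = result * placed / k;
    }
    return result;
}
```

Called with "AABB" this yields 6, matching the formula; an array sized from this value is large enough to hold every unique permutation.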

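One way to achieve this is to store all the permutations in an array and then remove the duplicates.

Is there a simpler way to modify the Johnson algorithm so that duplicate permutations are never generated, in the most efficient manner?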

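【Question discussion】:

What is your definition of a permutation? Is BA a valid permutation of AABB?
No, BA is not a valid permutation of AABB.
A permutation is a sequence obtained by shuffling the characters of the string. For a string of length n whose characters are all unique, there are n! possible unique permutations in total.
You can modify Johnson's algorithm to place every occurrence of each letter in one step.
If you can't find a way to avoid generating duplicates, you could benefit from removing them as they are generated, by storing the permutations in a self-balancing BST or a similar sorted structure.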

Tags: c algorithm combinatorics

【Solution 1】:

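First convert the string into a set of unique characters and occurrence counts, e.g. BANANA -> (3,A),(1,B),(2,N). (This can be done by sorting the string and grouping the letters.) Then, for each letter in the set, prepend that letter to all permutations of the set with one fewer of that letter (note the recursion). Continuing the BANANA example, we have: permutations((3,A),(1,B),(2,N)) = A:(permutations((2,A),(1,B),(2,N))) ++ B:(permutations((3,A),(2,N))) ++ N:(permutations((3,A),(1,B),(1,N)))

Here is a working implementation in Haskell: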

circularPermutations::[a]->[[a]]
circularPermutations xs = helper [] xs []
                          where helper acc [] _ = acc
                                helper acc (x:xs) ys =
                                  helper (((x:xs) ++ ys):acc) xs (ys ++ [x])

nrPermutations::[(Int, a)]->[[a]]
nrPermutations x | length x == 1 = [take (fst (head x)) (repeat (snd (head x)))]
nrPermutations xs = concat (map helper (circularPermutations xs))
  where helper ((1,x):xs) = map ((:) x)(nrPermutations xs)
        helper ((n,x):xs) = map ((:) x)(nrPermutations ((n - 1, x):xs))

【Discussion】:

【Solution 2】:

Use the following recursive algorithm:

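PermutList Permute(SymArray fullSymArray){
    PermutList resultList=empty;
    // Base case: the empty array has exactly one permutation, the empty one.
    if (fullSymArray is empty) {
        resultList.add("");
        return resultList;
    }
    for( each distinct symbol A in fullSymArray; repeated symbols are taken only once) {
       // Remove a single occurrence of A, then permute what is left.
       PermutList lesserPermutList=  Permute(fullSymArray without one A);
       for ( each SymArray item in lesserPermutList){
            resultList.add(A+item);
       }
    }
    return resultList;
}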

As you can see, it is quite easy.
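The pseudocode above could be sketched in C roughly as follows. This is one possible reading, not the answerer's own code: it sorts a copy of the input so equal symbols are adjacent, and at each depth it lets only the first unused copy of a repeated symbol be chosen. The names `permute_unique`, `permutation_visitor`, and `count_permutation` are hypothetical, and results are handed to a callback instead of being collected in a list.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: receives each finished permutation. */
typedef void (*permutation_visitor)(const char *permutation, void *context);

static int compare_chars(const void *a, const void *b) {
    return *(const unsigned char *)a - *(const unsigned char *)b;
}

static void permute_rec(const char *sorted, size_t length, bool *used,
                        char *prefix, size_t depth,
                        permutation_visitor visit, void *context) {
    if (depth == length) {              /* every symbol has been placed */
        prefix[depth] = '\0';
        visit(prefix, context);
        return;
    }
    for (size_t i = 0; i < length; ++i) {
        if (used[i]) continue;
        /* "Repeated ones take only once": among equal symbols, only the
         * first one that is still unused may be chosen at this depth. */
        if (i > 0 && sorted[i] == sorted[i - 1] && !used[i - 1]) continue;
        used[i] = true;
        prefix[depth] = sorted[i];
        permute_rec(sorted, length, used, prefix, depth + 1, visit, context);
        used[i] = false;
    }
}

/* Calls visit once per distinct permutation of symbols.
 * Returns 0 on success, -1 if memory allocation fails. */
int permute_unique(const char *symbols, permutation_visitor visit, void *context) {
    size_t length = strlen(symbols);
    char *sorted = malloc(length + 1);
    char *prefix = malloc(length + 1);
    bool *used = calloc(length + 1, sizeof *used);
    int status = -1;
    if (sorted && prefix && used) {
        memcpy(sorted, symbols, length + 1);
        qsort(sorted, length, 1, compare_chars);   /* group equal symbols */
        permute_rec(sorted, length, used, prefix, 0, visit, context);
        status = 0;
    }
    free(sorted);
    free(prefix);
    free(used);
    return status;
}

/* Example visitor: counts permutations; context must point to a size_t. */
void count_permutation(const char *permutation, void *context) {
    (void)permutation;
    ++*(size_t *)context;
}
```

For "AABB" the visitor is called 6 times. A visitor that prints its argument would list the permutations in sorted order, since the input copy is sorted before the recursion starts.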

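【Discussion】:

【Solution 3】:

I think this problem is essentially that of generating multiset permutations. This paper seems relevant: J. F. Korsh and P. S. LaFollette. Loopless array generation of multiset permutations. The Computer Journal, 47(5):612–621, 2004.

From the abstract: This paper presents a loopless algorithm to generate all permutations of a multiset. Each is obtained from its predecessor by making one transposition. It differs from previous such algorithms by using an array for the permutations but requiring storage only linear in its length.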
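The loopless algorithm from the paper is not reproduced here. As a simpler, well-known alternative for multiset permutations, the sketch below implements the lexicographic "next permutation" step (the same idea as C++'s `std::next_permutation`). Starting from the sorted string, repeated calls visit each distinct arrangement exactly once, although consecutive results may differ by more than one transposition. The helper names `swap_chars` and `reverse_range` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

static void swap_chars(char *a, char *b) {
    char tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Reverses s[from..to) in place. */
static void reverse_range(char *s, size_t from, size_t to) {
    while (from + 1 < to) {
        --to;
        swap_chars(&s[from], &s[to]);
        ++from;
    }
}

/* Rearranges s[0..length) into the next permutation in lexicographic order.
 * Returns false, leaving s sorted ascending, if s was already the last one. */
bool next_permutation(char *s, size_t length) {
    if (length < 2) return false;

    /* 1. Find the start i of the longest non-increasing suffix. */
    size_t i = length - 1;
    while (i > 0 && (unsigned char)s[i - 1] >= (unsigned char)s[i]) --i;
    if (i == 0) {
        reverse_range(s, 0, length);    /* wrap around to the first one */
        return false;
    }

    /* 2. Swap the pivot s[i - 1] with the rightmost larger element. */
    size_t pivot = i - 1;
    size_t j = length - 1;
    while ((unsigned char)s[j] <= (unsigned char)s[pivot]) --j;
    swap_chars(&s[pivot], &s[j]);

    /* 3. Put the suffix back into ascending order. */
    reverse_range(s, pivot + 1, length);
    return true;
}
```

To enumerate everything, sort the string first (for example with `qsort`), process it, and keep calling `next_permutation` until it returns false. Because equal characters are never swapped past each other, duplicates are not produced.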

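【Discussion】:

Then how about trying to write it yourself?

【Solution 4】:

In my solution, I generate the options recursively, each time trying to add every letter that has not yet been used as many times as it needs to be.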

#include <string.h>

void fill(char ***adr,int *pos,char *pref) {
    int i,z=1;
    //loop on the chars, and check if should use them
    for (i=0;i<256;i++)
        if (pos[i]) {
            int l=strlen(pref);
            //add the char
            pref[l]=i;
            pos[i]--;
            //call the recursion
            fill(adr,pos,pref);
            //delete the char
            pref[l]=0;
            pos[i]++;
            z=0;
        }
    if (z) strcpy(*(*adr)++,pref);
}

void calc(char **arr,const char *str) {
    int p[256]={0};
    int l=strlen(str);
    char temp[l+1];
    for (;l>=0;l--) temp[l]=0;
    //count the chars (cast so a negative char cannot index out of bounds)
    while (*str) p[(unsigned char)*str++]++;
    fill(&arr,p,temp);
}

Usage example:

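#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    char s[]="AABAF";
    //"AABAF" has 5!/3! = 20 unique permutations
    char *arr[20];
    int i;
    for (i=0;i<20;i++) arr[i]=malloc(sizeof(s));
    calc(arr,s);
    for (i=0;i<20;i++) printf("%d: %s\n",i,arr[i]);
    //release the buffers
    for (i=0;i<20;i++) free(arr[i]);
    return 0;
}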

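【Discussion】:

Added some comments. Any other suggestions?
The most important improvement, even more important than comments, would be descriptive function and variable names. Right now you have two functions named func and calc, and variables named arr, pref, pos, adr, p, l, i, s, and str; none of their purposes is obvious from their names. Using more descriptive variable names improves the readability of the code.
Other, smaller improvements: use descriptive types (z should be a bool, with #include <stdbool.h>); don't use magic numbers (the size of arr, the size of p); don't use strcpy() for anything, ever; and don't forget to call free() on your malloc() allocations :)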
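As an illustration of those review suggestions, here is a sketch of the same count-table recursion with descriptive names, a named constant instead of 256, and a single allocation that is always freed. It is an assumption about what the reviewer had in mind, not the answerer's revised code: the names `for_each_unique_permutation`, `emit_fn`, and `count_emitted` are hypothetical, results go to a callback so nothing needs to be copied with strcpy(), and the z flag is replaced by a length check.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum { SYMBOL_COUNT = UCHAR_MAX + 1 };   /* one slot per possible byte value */

/* Hypothetical callback type: called once per finished permutation. */
typedef void (*emit_fn)(const char *permutation, void *context);

static void extend_prefix(size_t remaining[SYMBOL_COUNT], char *prefix,
                          size_t prefix_length, size_t total_length,
                          emit_fn emit, void *context) {
    if (prefix_length == total_length) {     /* nothing left to place */
        prefix[prefix_length] = '\0';
        emit(prefix, context);
        return;
    }
    for (int symbol = 0; symbol < SYMBOL_COUNT; ++symbol) {
        if (remaining[symbol] == 0) continue;
        --remaining[symbol];                 /* use one occurrence */
        prefix[prefix_length] = (char)symbol;
        extend_prefix(remaining, prefix, prefix_length + 1, total_length,
                      emit, context);
        ++remaining[symbol];                 /* give it back */
    }
}

/* Calls emit once per distinct permutation of text.
 * Returns false if the working buffer cannot be allocated. */
bool for_each_unique_permutation(const char *text, emit_fn emit, void *context) {
    size_t remaining[SYMBOL_COUNT] = {0};
    size_t total_length = strlen(text);
    char *prefix = malloc(total_length + 1);
    if (prefix == NULL) return false;

    for (const unsigned char *p = (const unsigned char *)text; *p; ++p)
        ++remaining[*p];

    extend_prefix(remaining, prefix, 0, total_length, emit, context);
    free(prefix);
    return true;
}

/* Example callback: counts results; context must point to a size_t. */
void count_emitted(const char *permutation, void *context) {
    (void)permutation;
    ++*(size_t *)context;
}
```

With "AABAF" the callback runs 20 times, the same number of slots the usage example allocates by hand.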

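【Solution 5】:

This is a tricky problem: we need to use recursion to find all permutations of a string; for example, the permutations of "AAB" are "AAB", "ABA", and "BAA". We also need to use a Set to make sure there are no duplicate values.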

import java.util.HashSet;
import java.util.Scanner;
class Permutation {

    static HashSet<String> set = new HashSet<String>();
    public static void main (String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.println("Enter :");
        StringBuilder  str = new StringBuilder(in.nextLine());
        NONDuplicatePermutation("",str.toString());  //WITHOUT DUPLICATE PERMUTATION OF STRING
        System.out.println(set);
    }


    public static void NONDuplicatePermutation(String prefix,String str){
        //Explores all n! orderings of the characters; the set discards the duplicates afterwards
        if(str.length()==0){
            set.add(prefix);
        }else{
            for(int i=0;i<str.length();i++){

                NONDuplicatePermutation(prefix+ str.charAt(i), str.substring(0,i)+str.substring(i+1));
            }
        }

    }

}

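【Discussion】:

I wrote my code in Java. I think the logic in my code is already easy to understand.
You are still generating all the permutations (including duplicates). The question states that the solution should avoid doing so.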