How do I create a list of all possible anagrams of a word/string in PostgreSQL

[Question title]: How do I create a list of all possible anagrams of a word/string in PostgreSQL
[Posted]: 2019-02-03 16:11:57
[Question]:

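How do I create a list of all possible anagrams of a word/string in PostgreSQL?

For example, if the string is 'act', the desired output should be:

act, atc, cta, cat, tac, tca

I have a table 'tbl_words' that contains millions of words.

I then want to check this anagram list against my database table and find the valid words.

For the anagram list above, the valid words are: act, cat.

Is there any way to do this?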

Update 1:

I need output like this: (all permutations of the given word)

Any ideas?

[Comments]:

Possible duplicate of PostgreSQL combinations without repetitions

Tags: sql postgresql anagram

[Solution 1]:

This query generates all permutations of a 3-element set:

with recursive numbers as (
    select generate_series(1, 3) as i
),
rec as (
    select i, array[i] as p
    from numbers
union all
    select n.i, p || n.i
    from numbers n
    join rec on cardinality(p) < 3 and not n.i = any(p)
)
select p as permutation
from rec
where cardinality(p) = 3
order by 1

 permutation 
-------------
 {1,2,3}
 {1,3,2}
 {2,1,3}
 {2,3,1}
 {3,1,2}
 {3,2,1}
(6 rows)

Modify the final query to generate the permutations of the letters of a given word:

with recursive numbers as (
    select generate_series(1, 3) as i
),
rec as (
    select i, array[i] as p
    from numbers
union all
    select n.i, p || n.i
    from numbers n
    join rec on cardinality(p) < 3 and not n.i = any(p)
)
select a[p[1]] || a[p[2]] || a[p[3]] as result
from rec
cross join regexp_split_to_array('act', '') as a
where cardinality(p) = 3
order by 1

 result 
--------
 act
 atc
 cat
 cta
 tac
 tca
(6 rows)
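
The two queries above hard-code the length 3. As a minimal sketch of the same recursive idea for a word of any length, the variant below derives the positions from the word itself and builds the string while recursing. The CTE and column names (input, letters, rec, ch, s) are made up for this illustration and are not part of the answer. Note that a word of n letters has n! index permutations, so this is only practical for short words.

```sql
-- Sketch: permutations of the letters of an arbitrary word.
-- CTE/column names here are illustrative, not from the answer above.
with recursive input as (
    select 'act'::text as word
),
letters as (
    -- one row per character position, e.g. (1,'a'), (2,'c'), (3,'t')
    select i, substr(word, i, 1) as ch
    from input
    cross join generate_series(1, length(word)) as i
),
rec as (
    -- start with every single position
    select array[i] as p, ch as s
    from letters
union all
    -- append any position that has not been used yet
    select r.p || l.i, r.s || l.ch
    from rec r
    join letters l on not l.i = any(r.p)
)
select distinct s as result   -- distinct drops duplicates caused by repeated letters
from rec
where cardinality(p) = (select length(word) from input)
order by 1;
```

The recursion stops on its own: once an array contains every position, no row of letters satisfies the join condition.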

[Comments]:

[Solution 2]:

Here is a solution:

with recursive params as (
      select *
      from (values ('cata')) v(str)
     ),
     nums as (
      select str, 1 as n
      from params
      union all
      select str, 1 + n
      from nums
      where n < length(str)
     ),
     pos as (
      select str, array[n] as poses, array_remove(array_agg(n) over (partition by str), n) as rests, 1 as lev
      from nums
      union all
      select pos.str, array_append(pos.poses, nums.n), array_remove(rests, nums.n), lev + 1
      from pos join
           nums
           on pos.str = nums.str and array_position(pos.rests, nums.n) > 0
      where cardinality(rests) > 0
     )
select distinct pos.str , string_agg(substr(pos.str, thepos, 1), '')
from pos cross join lateral
     unnest(pos.poses) thepos
where cardinality(rests) = 0 
group by pos.str, pos.poses;

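This is tricky, especially when the string contains repeated letters. The approach taken here generates all permutations of the numbers from 1 to n, where n is the length of the string. It then uses these numbers as indexes to extract characters from the original string.

The curious will notice that this uses both select distinct and group by. The group by builds one string per index permutation, and the select distinct then removes identical strings produced by different permutations. This seems to be the simplest way to avoid duplicates in the resulting strings.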
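
To illustrate why the de-duplication is needed, here is a small sketch that turns three hard-coded index permutations of 'cata' into strings and counts them. The sample arrays and the names perms, strings, s and ord are chosen for this example only. It also uses unnest ... with ordinality plus an order by inside string_agg, because the input order of an aggregate is not guaranteed without an explicit order by.

```sql
-- Sketch: different index permutations can yield the same string
-- when the word has repeated letters ('cata' has two a's).
with perms(poses) as (
    values (array[1,2,3,4]),   -- c,a,t,a -> 'cata'
           (array[1,4,3,2]),   -- c,a,t,a -> 'cata' (the two a's swapped)
           (array[2,1,3,4])    -- a,c,t,a -> 'acta'
),
strings as (
    select poses,
           -- ord keeps the characters in the order given by the array
           string_agg(substr('cata', u.thepos, 1), '' order by u.ord) as s
    from perms
    cross join lateral unnest(poses) with ordinality as u(thepos, ord)
    group by poses
)
select count(*)          as index_permutations,  -- 3
       count(distinct s) as distinct_strings     -- 2
from strings;
```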

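[Comments]:

Thanks for the reply. It works fine. I have a follow-up question: I need to select this result into a @temp table and then write some logic that searches tbl_words for the anagrams in this list and returns only the valid words.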
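
A possible way to do this, offered as a sketch rather than a tested solution: PostgreSQL has no @temp table variables, but a temporary table (or simply another CTE) can hold the candidates, and a plain join then keeps only the words that exist in tbl_words. The question does not show the schema of tbl_words, so the column name word below is an assumption, as is the table name tmp_anagrams.

```sql
-- Candidate anagrams of 'act', as listed in the question.
-- tmp_anagrams is a hypothetical name; the table disappears at session end.
create temporary table tmp_anagrams (candidate text primary key);

insert into tmp_anagrams (candidate)
values ('act'), ('atc'), ('cat'), ('cta'), ('tac'), ('tca');

-- Assumption: tbl_words has a text column named "word"; adjust to the real schema.
select w.word
from tbl_words w
join tmp_anagrams a on a.candidate = w.word;
```

Instead of the insert with literal values, the table could be filled in one step with create temporary table tmp_anagrams as followed by one of the permutation queries from the answers. With millions of rows in tbl_words, an index on the compared column would likely matter for the join.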