
【Question title】: How would I loop over pairs of values without repetition in bash?
【Posted】: 2019-07-06 07:09:03
【Question description】:

I am working with a program that requires me to examine pairs of variables in a text file by specifying the pairs using indices.

For example:

gcta  --reml-bivar 1 2 --grm test  --pheno test.phen  --out test

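Here 1 and 2 correspond to the values in the first two columns of the text file. If I have 50 columns and want to examine each pair without repetition (1&2, 2&3, 1&3 ... 50), what is the best way to automate this with a loop? Basically, the script would run the same command but take pairs of indices, for example: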

gcta  --reml-bivar 1 3 --grm test  --pheno test.phen  --out test
gcta  --reml-bivar 1 4 --grm test  --pheno test.phen  --out test

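... and so on. Thanks!

【Comments】:

Please add sample input (no descriptions, no images, no links) and your desired output for that sample input (no comments) to your question.
@Cyrus I have done that.
Please post an example input file with two columns in a text file, and show what each pair looks like. What does "each pair without repetition" mean?

Tags: bash loops awk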

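【Solution 1】:

Since you haven't shown us any sample input, we're just guessing, but if your input is a list of numbers (extracted from a file or otherwise), here is one approach: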

$ cat combinations.awk
###################
# Calculate all combinations of a set of strings, see
# https://rosettacode.org/wiki/Combinations#AWK
###################

function get_combs(A,B, i,n,comb) {
    ## Default value for r is to choose 2 from pool of all elements in A.
    ## Can alternatively be set on the command line:-
    ##    awk -v r=<number of items being chosen> -f <scriptname>
    n = length(A)
    if (r=="") r = 2

    comb = ""
    for (i=1; i <= r; i++) { ## First combination of items:
        indices[i] = i
        comb = (i>1 ? comb OFS : "") A[indices[i]]
    }
    B[comb]

    ## While 1st item is less than its maximum permitted value...
    while (indices[1] < n - r + 1) {
        ## loop backwards through all items in the previous
        ## combination of items until an item is found that is
        ## less than its maximum permitted value:
        for (i = r; i >= 1; i--) {
            ## If the equivalently positioned item in the
            ## previous combination of items is less than its
            ## maximum permitted value...
            if (indices[i] < n - r + i) {
                ## increment the current item by 1:
                indices[i]++
                ## Save the current position-index for use
                ## outside this "for" loop:
                p = i
                break
            }
        }

        ## Put consecutive numbers in the remainder of the array,
        ## counting up from position-index p.
        for (i = p + 1; i <= r; i++) indices[i] = indices[i - 1] + 1

        ## Print the current combination of items:
        comb = ""
        for (i=1; i <= r; i++) {
            comb = (i>1 ? comb OFS : "") A[indices[i]]
        }
        B[comb]
    }
}

# Input should be a list of strings
{
    split($0,A)
    delete B
    get_combs(A,B)
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (comb in B) {
        print comb
    }
}


$ awk -f combinations.awk <<< '1 2 3 4'
1 2
1 3
1 4
2 3
2 4
3 4


$ while read -r a b; do
    echo gcta  --reml-bivar "$a" "$b" --grm test  --pheno test.phen  --out test
done < <(awk -f combinations.awk <<< '1 2 3 4')
gcta --reml-bivar 1 2 --grm test --pheno test.phen --out test
gcta --reml-bivar 1 3 --grm test --pheno test.phen --out test
gcta --reml-bivar 1 4 --grm test --pheno test.phen --out test
gcta --reml-bivar 2 3 --grm test --pheno test.phen --out test
gcta --reml-bivar 2 4 --grm test --pheno test.phen --out test
gcta --reml-bivar 3 4 --grm test --pheno test.phen --out test

When you're done testing and happy with the output, remove the echo.

In case anyone reading this wants permutations rather than combinations:

$ cat permutations.awk
###################
# Calculate all permutations of a set of strings, see
# https://en.wikipedia.org/wiki/Heap%27s_algorithm

function get_perm(A,            i, lgth, sep, str) {
    lgth = length(A)
    for (i=1; i<=lgth; i++) {
        str = str sep A[i]
        sep = " "
    }
    return str
}

function swap(A, x, y,  tmp) {
    tmp  = A[x]
    A[x] = A[y]
    A[y] = tmp
}

function generate(n, A, B,      i) {
    if (n == 1) {
        B[get_perm(A)]
    }
    else {
        for (i=1; i <= n; i++) {
            generate(n - 1, A, B)
            if ((n%2) == 0) {
                swap(A, 1, n)
            }
            else {
                swap(A, i, n)
            }
        }
    }
}

function get_perms(A,B) {
    generate(length(A), A, B)
}

###################

# Input should be a list of strings
{
    split($0,A)
    delete B
    get_perms(A,B)
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (perm in B) {
        print perm
    }
}


$ awk -f permutations.awk <<< '1 2 3 4'
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1

Both scripts above use GNU awk for sorted_in to sort the output. If you don't have GNU awk, you can still use the scripts as-is; if you need the output sorted, pipe it to sort.
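
One thing to watch when piping to sort: a plain string sort puts `1 10` before `1 2`, which matters once the indices go past 9. A minimal sketch of a numeric sort, assuming each line holds two whitespace-separated numbers (the function name `sort_pairs` is made up for illustration):

```shell
#!/bin/bash
# sort_pairs: hypothetical helper; reads "a b" lines on stdin and sorts
# them numerically by the first field, then by the second field.
sort_pairs() {
    sort -k1,1n -k2,2n
}

# Example: pairs arriving in arbitrary order.
printf '%s\n' '2 10' '1 10' '1 2' '2 3' | sort_pairs
```

With an awk that lacks sorted_in, the assumed usage would be along the lines of `awk -f combinations.awk <<< '1 2 3 4' | sort_pairs`.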

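【Comments】:

Hi, thanks, this is very helpful! I was wondering whether there is a way to print out a range of numbers to feed into the function you created, rather than typing "1 2 3 4 5 ... 50"? Thanks a lot. @EdMorton
Hi, I tried awk -f combination.awk
To execute a command you use backticks, not single quotes, or better still $(...) (see mywiki.wooledge.org/BashFAQ/082). Try awk -f combinations.awk <<< $(seq 50) or awk -f combinations.awk < <(seq 50)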
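
A note on feeding `seq` into the scripts above: they call `split($0,A)` on each input line, so they expect all items on one line. Plain `seq 50` prints one number per line, and depending on the bash version an unquoted `$(seq 50)` in a here-string may keep those newlines. A minimal sketch that builds a single space-separated line first (the helper name `index_line` is an assumption, not part of the original answer):

```shell
#!/bin/bash
# index_line: hypothetical helper; prints the numbers 1..n on one
# space-separated line, the shape the awk scripts above split into items.
index_line() {
    seq 1 "$1" | paste -s -d ' ' -
}

index_line 5
# Assumed usage with the script from the answer:
#   awk -f combinations.awk <<< "$(index_line 50)"
```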

【Solution 2】:

If I understand correctly and you don't need pairs like '1 1', '2 2', ... or both '1 2' and '2 1', try this script:

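#!/bin/bash

for i in $(seq 1 49); do
    # j starts at i + 1, so each unordered pair appears exactly once
    for j in $(seq $((i + 1)) 50); do
        # pass the two indices as separate arguments
        gcta --reml-bivar "$i" "$j" --grm test --pheno test.phen --out test
    done
done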
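
The same nested loop can be written with bash arithmetic `for` loops, which avoids spawning `seq`. This is only a sketch: the helper name `each_pair` is made up, and the per-pair `--out` prefix is an assumption, added so that each run writes to its own output name instead of reusing `test`.

```shell
#!/bin/bash
# each_pair: hypothetical helper; prints every unordered pair i < j of
# the indices 1..n, one pair per line.
each_pair() {
    local n=$1 i j
    for ((i = 1; i < n; i++)); do
        for ((j = i + 1; j <= n; j++)); do
            printf '%d %d\n' "$i" "$j"
        done
    done
}

# Dry run: echo the commands instead of executing them.
# The --out name test_<i>_<j> is an assumption, not from the question.
each_pair 4 | while read -r i j; do
    echo gcta --reml-bivar "$i" "$j" --grm test --pheno test.phen --out "test_${i}_${j}"
done
```

For 50 columns this yields 50*49/2 = 1225 pairs. Remove the `echo` once the printed commands look right.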

【Comments】:

【Solution 3】:

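1 and 2 correspond to the values in the first two columns of the text file.

each pair without repetition

Let's walk through the process:

We repeat the first column of the file as many times as the file has lines.
We repeat each value (each line) of the second column as many times as the file has lines.
We join the repeated columns -> we have all the combinations.
We need to filter out the "repetitions"; we can compare the result with the original file and filter out the lines that repeat it.
So we get each pair without repetition.
Then we just read the result line by line.

Script: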

# create an input file cause you didn't provide any
cat << EOF > in.txt
1 a
2 b
3 c
4 d
EOF

# get file length
inlen=$(<in.txt wc -l)

# join the columns
paste -d' ' <(
  # repeat the first column inlen times
  # https://askubuntu.com/questions/521465/how-can-i-repeat-the-content-of-a-file-n-times
  seq "$inlen" |
  xargs -I{} cut -d' ' -f1 in.txt
) <(
  # repeat each line inlen times
  # https://unix.stackexchange.com/questions/81904/repeat-each-line-multiple-times
  awk -v v="$inlen" '{for(i=0;i<v;i++)print $2}' in.txt
) |
# filter out repetitions - ie. filter original lines from the file
sort |
comm --output-delimiter='' -3 <(sort in.txt) - |
# read the file line by line
while read -r one two; do
  echo "$one" "$two"
done

This outputs:

1 b
1 c
1 d
2 a
2 c
2 d
3 a
3 b
3 d
4 a
4 b
4 c

【Comments】:

【Solution 4】:

    #!/bin/bash

    #set the length of the combination according to
    #the user's choice

    eval rg+=({1..$2})

    #the code builds the script and runs it (eval)

    eval `
    #Character range depending on user selection
    for i in ${rg[@]} ; do
    echo "for c$i in {1..$1} ;do " 
    done ;


    #The generated loops produce every possible tuple,
    #including tuples with repeated items, so this part
    #writes the condition that filters those out:
    #an if-test requiring every pair of loop variables
    #to differ.


    op1=$2
    op2=$(( $2 - 1 ))
    echo -n "if [ 1 == 1 ] "

    while [ $op1 -gt 1 ]  ; do
    echo -n  \&\& [ '$c'$op1 != '$c'$op2 ]' '
    op2=$(( op2 - 1 ))
    if [ $op2 == 0 ] ; then  
            op1=$(( op1 - 1 ))
            op2=$(( op1 - 1 ))
    fi
    done ;

    echo  ' ; then'
    echo -n "echo "

    for i in ${rg[@]} ; 
    do
    echo -n '$c'$i
    done ;

    echo \;
    echo fi\;

    for i in ${rg[@]} ; do
    echo 'done ;'
    done;`

    example:               range       length
    $ ./combs.bash '{1..2} {a..c} \$ \#' 4
    12ab$
    12ab#
    12acb
    12ac$
    12ac#
    12a$b
    12a$c
    12a$#
    12a#b
    12a#c
    12a#$
    ..........

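【Comments】:

While this code may solve the problem, including an explanation of how and why it solves the problem would really help to improve the quality of your post, and would probably result in more upvotes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and indicate which limitations and assumptions apply.
Sorry for not explaining (English is not my first language, nor really my second). The script produces all combinations of numbers and letters without repeated characters. I added comments to the script and would welcome feedback.

【Solution 5】: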

      #!/bin/bash
      len=$2
      eval c=($1)
      per()
      {
      ((`grep -Poi '[^" ".]'<<<$2|sort|uniq|wc -l` < $((len - ${1}))))&&{ return;}
      (($1 == 0))&&{ echo $2;return;}
      for i in ${c[@]} ; do
      per "$((${1} - 1 ))" "$2 $i"
      done
      }
      per "$2" ""

      #example
      $ ./neto '{0..3} {a..d} \# \!'  7
      0 1 2 3 a b c
      0 1 2 3 a b d
      0 1 2 3 a b #
      0 1 2 3 a b !
      0 1 2 3 a c b
      0 1 2 3 a c d
      0 1 2 3 a c #
      0 1 2 3 a c !
      0 1 2 3 a d b
      ...
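
For comparison, the question asks for combinations, where the order within a group does not matter. A recursive sketch of that idea in plain bash follows. It is not the answer's code: the function name `combos` and its argument order are made up for illustration.

```shell
#!/bin/bash
# combos: hypothetical helper; prints every k-item combination of the
# given items, keeping the items in the order they were passed.
# usage: combos K PREFIX ITEM...   (pass "" as PREFIX on the first call)
combos() {
    local k=$1 prefix=$2
    shift 2
    if (( k == 0 )); then
        # a full combination has been built; drop the leading space
        echo "${prefix# }"
        return
    fi
    # take each remaining item as the next member, then recurse on the
    # items after it, so no item is reused and no ordering repeats
    while (( $# >= k )); do
        local head=$1
        shift
        combos "$((k - 1))" "$prefix $head" "$@"
    done
}

combos 2 "" a b c d
```

Because each recursive call only sees the items after the one just chosen, no separate deduplication step is needed.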

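【Comments】:

This is short, fast code that lets the user determine the range of characters
Welcome to SO. Adding an explanation to your answer helps other users understand the code.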