Shell/Bash - want to map array of file to a set of numbers

【Question title】: Shell/Bash - want to map array of file to a set of numbers
【Posted】: 2021-03-09 11:16:43
【Question】:

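I have a file containing an array of names, and another file containing several series of numbers, each counting from 1 up to the length of the array. Every series has the same length as the array, except that it always ends with an extra 0. A number may be prefixed with "-", in which case the corresponding element should not be mapped; otherwise it is just the plain number.

Now I want to map the first element of the array to the number 1, the second element to 2, and so on. Here is an example:

The first file contains: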

['aaaaa', 'bbb', 'cccc', 'dd', 'eeeee']

The other file contains:

-1 2 -3 -4 -5 0
1 -2 3 4 -5 0
1 2 -3 4 5 0

Desired output:

bbb
aaaaa, cccc, dd
aaaaa, bbb, dd, eeeee

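Note: the output does not have to be in exactly this format. I want to do this entirely in bash/shell, without any other scripts in Python, C, ...

Edit:

I had not checked back for a while, because I did not think anyone would reply this quickly. In the meantime I wrote my own code for this. Here you can see the final result...

...assuming the first file is called tmp.txt and the second file is called tmp_num.txt: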

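NAMES=()
# Strip commas, brackets and single quotes, then split the rest on whitespace
for elem in $(sed "s/,//g; s/[][]//g; s/'//g" tmp.txt)
do
    NAMES+=("$elem")
done

for number in $(cat tmp_num.txt)
do
    # A 0 ends the current series: finish the output line
    if [ "$number" -eq 0 ]; then
        echo
        continue
    fi

    # Only numbers without a leading "-" select a name (arrays are 0-based)
    if [[ ${number:0:1} != "-" ]]; then
        number=$((number-1))
        echo -n "${NAMES[number]} "
    fi
done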
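
The loop above depends on word splitting of an unquoted command substitution, so the line structure of tmp_num.txt is only recovered through the trailing 0. As a minimal sketch of the same idea that reads the numbers file line by line instead, assuming bash and the file layout shown in the question (the function name `map_names` is hypothetical and not part of the original post):

```shell
#!/usr/bin/env bash
# map_names is a hypothetical helper name used only for this sketch.
# $1: file with the bracketed name list, $2: file with the number series.
map_names() {
    local names_file=$1 numbers_file=$2
    local -a names fields picked
    local line n out

    # Delete brackets, single quotes and commas, then split on whitespace.
    line=$(tr -d "[]'," < "$names_file")
    read -ra names <<< "$line"

    # One output line per series; also handle a last line without newline.
    while read -ra fields || (( ${#fields[@]} > 0 )); do
        picked=()
        for n in "${fields[@]}"; do
            # Keep only positive numbers; "-N" and the trailing 0 are skipped.
            if [[ $n =~ ^[0-9]+$ ]] && (( n > 0 )); then
                picked+=("${names[n-1]}")
            fi
        done
        # Join the selected names with ", " and drop the final separator.
        printf -v out '%s, ' "${picked[@]}"
        printf '%s\n' "${out%, }"
    done < "$numbers_file"
}
```

Called as `map_names tmp.txt tmp_num.txt`, this prints one comma-separated line per series. Like the original loop, it assumes that the names themselves contain no spaces or commas.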

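【Comments on the question】:

Please update the question with the code you have tried so far and the (correct/wrong) results that code is generating
Sorry @markp-fuso, I saw your comment too late. After I had found a few good approaches I got started, and then I had the final code.

Tags: linux bash shell awk command-line

【Solution 1】:

Could you please try the following, written and tested with the shown samples in GNU awk. Input_file is your keys file, and valFile is your other file (the one the OP mentions in the question).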

awk '
BEGIN{
  OFS=", "
}
FNR==NR{
  for(i=1;i<=NF;i++){
    match($i,/\047[^\047]*/)
    value[++count]=substr($i,RSTART+1,RLENGTH-1)
  }
  next
}
{
  for(i=1;i<=NF;i++){
    if($i!=0 && $i!~/^-/){
      val=(val?val OFS:"")value[$i]
    }
  }
  print val
  val=""
}
' Input_file valFile

Explanation: a detailed, line-by-line explanation of the above.

awk '                              ##Starting awk program from here.
BEGIN{                             ##Starting BEGIN section of this program from here.
  OFS=", "                         ##Setting OFS as comma space here.  
}
FNR==NR{                           ##Checking condition FNR==NR which will be true when key values file is being read.
  for(i=1;i<=NF;i++){              ##Traversing through all fields here.
    match($i,/\047[^\047]*/)       ##Using match to match from single quote to next occurrence of single quote.
    value[++count]=substr($i,RSTART+1,RLENGTH-1)  ##Creating value with increasing value of count and its value is sub string of matched value.
  }
  next                             ##next will skip all statements from here.
}
{
  for(i=1;i<=NF;i++){               ##Traversing through all fields here.
    if($i!=0 && $i!~/^-/){          ##Checking condition if field is NOT 0 and not starts from - then do following.
      val=(val?val OFS:"")value[$i]  ##Keep adding value of array in val here.
    }
  }
  print val                         ##Printing val here.
  val=""                            ##Nullifying val here.
}
' Input_file valFile                ##Mentioning Input_file names here.
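
To see the `match`/`RSTART`/`RLENGTH` step from the explanation in isolation, here is a small sketch that extracts the quoted name from a single token. The helper name `extract_quoted` is hypothetical, and passing the quote in as the awk variable `q` (instead of writing `\047` in the regex) is a choice made for this sketch only:

```shell
# extract_quoted is a hypothetical helper name for this illustration.
extract_quoted() {
    printf '%s\n' "$1" | awk -v q="'" '{
        # Match an opening quote plus everything up to (not including) the next quote.
        if (match($0, q "[^" q "]*"))
            # Skip the opening quote: start one later, take one character fewer.
            print substr($0, RSTART + 1, RLENGTH - 1)
    }'
}
```

For the token `['aaaaa',` the match starts at the quote in position 2 and is 6 characters long, so `substr` starts at position 3 and takes 5 characters.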

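【Discussion】:

Thanks! That is a super solution!
@kaiserm99, I have now added an explanation to the solution as well, where Input_file is your file containing ['aaaaa', 'bbb', 'cccc', 'dd', 'eeeee'] and valFile is your actual file, cheers.

【Solution 2】:

Input data: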

$ cat map.array
['aaaaa', 'bbb', 'cccc', 'dd', 'eeeee']

$ cat map.numbers
-1 2 -3 -4 -5 0
1 -2 3 4 -5 0
1 2 -3 4 5 0

One awk solution:

awk -v sep="'" '                                 # splitting first file on single quote is cleaner with an input variable

# for first file (FNR==NR) ...

FNR==NR { n=split($0,tmp,sep)                    # split line on single quotes; store results in the tmp[] array
          ndx=0                                  # init arr[] array index
          for ( i=2 ; i<n ; i=i+2 )              # we want the even numbered entries from the tmp[] array
              { ndx++                            # increment arr[] index
                arr[ndx]=tmp[i]                  # copy tmp[] element into arr[]
              }
           next
        }

# for second file ...

        { output=""                              # init output string
          pfx=""                                 # clear prefix string
          for ( i=1 ; i<=ndx ; i++ )             # loop through our arr[] indices
              if ( $i == i )                     # if the value in field # i matches the value of i, eg, 1 == 1
                 { output=output""pfx""arr[i]    # then build our output string
                   pfx=", "                      # set the prefix to ", " for follow-on fields of interest
                 }
           if ( length(output) > 0 )             # if we have something to output ...
              printf "%s\n", output              # printf it
        }
' map.array map.numbers

Note: remove the comments to declutter the code.

The above generates:

bbb
aaaaa, cccc, dd
aaaaa, bbb, dd, eeeee

【Discussion】:

【Solution 3】:

awk 'NR==FNR { 
                str=gensub(/[\[\]'"'"']/,"","g",$0);
                split(str,map,",") 
             } 
     NR != FNR { 
                for (i=1;i<=NF;i++) { 
                                      if ( map[$i]!="") { 
                                                          printf "%s",map[$i] 
                                                        } 
                                      } 
                printf "\n" 
               }' fil1 fil2

One-liner:

awk 'NR==FNR { str=gensub(/[\[\]'"'"']/,"","g",$0);split(str,map,",") } NR != FNR { for (i=1;i<=NF;i++) { if ( map[$i]!="") { printf "%s",map[$i] } } printf "\n" }' fil1 fil2

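Both files are processed. For the first file (NR==FNR), strip out [, ] and ', then split the line into the array map with , as the delimiter. Then for the second file (NR!=FNR), loop through each space-separated field and, if there is an entry for it in the map array, print it.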
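
Note that `gensub` is a GNU awk extension. As a hedged sketch of the same two-pass idea that avoids it, the variant below uses `gsub` and joins the selected names with ", ". The wrapper name `map_with_awk` and the awk variable `q` are assumptions made for this sketch, not part of the answer:

```shell
# map_with_awk is a hypothetical wrapper name used only for this sketch.
# $1: file with the bracketed name list, $2: file with the number series.
map_with_awk() {
    awk -v q="'" '
    NR == FNR {
        gsub(/\[|\]/, "")              # drop the square brackets
        gsub(q, "")                    # drop the single quotes
        n = split($0, map, /, */)      # split on commas plus optional spaces
        next
    }
    {
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i > 0 && $i <= n)     # skips "-N" and the trailing 0
                out = out (out == "" ? "" : ", ") map[$i]
        print out
    }' "$1" "$2"
}
```

Called as `map_with_awk fil1 fil2`, it prints one line per series. Checking `$i > 0` rather than testing for an existing map entry also avoids creating empty array entries for the negative fields.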

【Discussion】:

Answer amended

【Solution 4】:

Assuming your files are "f1" and "f2" respectively, this could be a solution in bash:

#!/bin/bash
A=()
while read -d ',' L; do
        A+=("${L:1:-1}")
done <<< "$(cat f1 | tr -d '[]'),"

while read L; do
        while read -d ' ' P; do
                if ((P>0)); then
                        P=$((P-1))
                        echo -n "${A[$P]} "
                fi
        done <<< "$L"
        echo
done < f2
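
The expansion `${L:1:-1}` in the first loop uses a negative length, which to my knowledge needs bash 4.2 or newer. A minimal sketch of an alternative that strips the surrounding quotes with prefix and suffix removal instead (the function name `strip_quotes` is hypothetical):

```shell
# strip_quotes is a hypothetical helper name for this illustration.
strip_quotes() {
    local s=$1
    s=${s#\'}   # remove one leading single quote, if present
    s=${s%\'}   # remove one trailing single quote, if present
    printf '%s\n' "$s"
}
```

Under that assumption, the array could be filled with `A+=("$(strip_quotes "$L")")` in place of the substring expansion.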

【Discussion】: