How to create multiple arrays from a text file and loop through the values of each array: answers

【Question title】: How to create multiple arrays from a text file and loop through the values of each array
【Posted】: 2021-05-30 12:46:26
【Question description】:

I have a text file with the following content:

Paige
Buckley
Govan
Mayer
King

Harrison
Atkins
Reinhardt
Wilson

Vaughan
Sergovia
Tarrega

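My goal is to create an array for each set of names, then iterate through the values of the first array, move on to the second array, and finally the third. Each set is separated from the next by a blank line in the text file. Any help with code or logic would be much appreciated!

So far I have the following. I am unsure how to proceed once I reach a blank line. My research here also suggests that I could use readarray -d.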

#!/bin/bash

my_array=()
while IFS= read -r line || [[ "$line" ]]; do
    if [[ $line -eq "" ]]; 
.
.
.

        arr+=("$line") # i know this adds the value to the array
done < "$1"
printf '%s\n' "${my_array[@]}"

Desired output:

array1 = (Paige Buckley Govan Mayer King)
array2 = (Harrison Atkins Reinhardt Wilson)
array3 = (Vaughan Sergovia Tarrega)
# then loop through each array, one after the other.

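【Comments on the question】:

Do you really need these arrays? If you plan to loop over them one after the other, why not perform the intended task right away while reading the file?
Also, isn't this your last question? I see you improved it a lot. However, the next time one of your questions is closed, please edit it and ask the people who closed it to reopen it. They did not close your question for good; they asked you to improve it.
"My goal is to create an array for each set of names": is that really your goal? It sounds more like a small step towards something else.
@Socowi Thanks for your input! The logic behind this is part of a larger program. Unfortunately, the sets need to be in separate arrays because certain conditions have to be met. If I combined the arrays, I would have to separate them again later in the code. Originally this was one long text file, and I have an awk statement that breaks the text into sets of n values. That was my question. I will keep that in mind.
On the surface this is easier to do in awk, if we define the record separator as \n\n and the field separator as \n. My concern is that bash is not a good fit for the "larger program": the required data structures do not exist.

Tags: arrays bash shell loops scripting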

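【Solution 1】:

Bash has no arrays of arrays, so you have to represent the structure some other way.

You can leave the newlines in place and build an array whose elements are newline-separated lists of names: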

array=()

elem=""
while IFS= read -r line; do
    if [[ "$line" != "" ]]; then
        elem+="${elem:+$'\n'}$line" # accumulate lines in elem
    else
        array+=("$elem")  # flush elem as array element
        elem=""
    fi
done 
if [[ -n "$elem" ]]; then
   array+=("$elem") # flush the last elem
fi

# iterate over array
for ((i=0;i<${#array[@]};++i)); do
    # each array element is newline separated items
    readarray -t elem <<<"${array[i]}"
    printf 'array%d = (%s)\n' "$i" "${elem[*]}"
done

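You can simplify the loop by using sed and some unique character that does not occur in the data (the -z option is a GNU sed extension), for example: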

readarray -d '#' -t array < <(sed -z 's/\n\n/#/g' file)

But overall, this awk command generates the same output:

awk -v RS= -v FS='\n' '{ 
     printf "array%d = (", NR;
     for (i=1;i<=NF;++i) printf "%s%s", $i, i==NF?"":" ";
     printf ")\n"
}'
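
Going back to the pure-bash approach at the top of this answer, the snippets above only print each group on one line. If the goal is to act on every name, group by group, a nested loop along the following lines might work. This is a minimal sketch rather than part of the original answer: the names `name_groups` and `visited` are made up for illustration, the sample data is inlined through a here-document, and empty groups caused by repeated blank lines are skipped.

```bash
#!/usr/bin/env bash
# Sketch: group blank-line-separated names, then loop over every name per group.
# name_groups and visited are illustrative names, not taken from the answer above.

name_groups=()   # one element per group; names inside an element are newline-joined
elem=""
while IFS= read -r line || [[ -n "$line" ]]; do
    if [[ -n "$line" ]]; then
        elem+="${elem:+$'\n'}$line"        # accumulate the current group
    elif [[ -n "$elem" ]]; then
        name_groups+=("$elem")             # blank line: flush the finished group
        elem=""
    fi
done <<'EOF'
Paige
Buckley
Govan
Mayer
King

Harrison
Atkins
Reinhardt
Wilson

Vaughan
Sergovia
Tarrega
EOF
if [[ -n "$elem" ]]; then
    name_groups+=("$elem")                 # flush the last group
fi

visited=()   # records the visiting order so the result can be inspected afterwards
for i in "${!name_groups[@]}"; do
    readarray -t names <<<"${name_groups[i]}"   # split one group back into an array
    for name in "${names[@]}"; do
        printf 'group %d: %s\n' "$((i + 1))" "$name"
        visited+=("$((i + 1)):$name")
    done
done
```

Inside the inner loop, `names` is an ordinary indexed array, so any per-group condition can be checked there before moving on to the next group.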

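【Comments】:

【Solution 2】:

Assumptions:

Blank lines are truly blank (i.e., there is no need to worry about whitespace on those lines)
There may be consecutive blank lines
Names may contain embedded spaces
The number of groups may vary and is not always 3 (as in the sample data provided in the question)
OP is open to using a (simulated) two-dimensional array instead of a (variable) number of one-dimensional arrays

My data file: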

$ cat names.dat
                       <<< leading blank lines

Paige
Buckley
Govan
Mayer
King Kong

                       <<< consecutive blank lines

Harrison
Atkins
Reinhardt
Wilson

Larry
Moe
Curly
Shemp

Vaughan
Sergovia
Tarrega

                       <<< trailing blank lines

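One idea using a pair of arrays:

Array #1: an associative array, i.e. the aforementioned (simulated) two-dimensional array, indexed by [x,y], where x is a unique identifier for a group of names and y is a unique identifier for a name within that group
Array #2: a one-dimensional array that keeps track of max(y) for each group x

Loading the arrays: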

unset      names max_y                     # make sure array names are not already in use
declare -A names                           # declare associative array

x=1                                        # init group counter
y=0                                        # init name counter
max_y=()                                   # initialize the max(y) array
inc=                                       # clear increment flag

while read -r name
do
    if [[ "${name}" = '' ]]                # if we found a blank line ...
    then
        [[ "${y}" -eq 0 ]]   &&            # if this is a leading blank line then ...
        continue                           # ignore and skip to the next line

        inc=y                              # set flag to increment 'x'
    else
        [[ "${inc}" = 'y' ]] &&            # if increment flag is set ...
        max_y[${x}]="${y}"   &&            # make note of max(y) for this 'x'
        ((x++))              &&            # increment 'x' (group counter)
        y=0                  &&            # reset 'y'
        inc=                               # clear increment flag

        ((y++))                            # increment 'y' (name counter)

        names[${x},${y}]="${name}"         # save the name
    fi

done < names.dat

max_y[${x}]="${y}"                         # make note of the last max(y) value

Contents of the arrays:

$ typeset -p names
declare -A names=([1,5]="King Kong" [1,4]="Mayer" [1,1]="Paige" [1,3]="Govan" [1,2]="Buckley" [3,4]="Shemp" [3,3]="Curly" [3,2]="Moe" [3,1]="Larry" [2,4]="Wilson" [2,2]="Atkins" [2,3]="Reinhardt" [2,1]="Harrison" [4,1]="Vaughan" [4,2]="Sergovia" [4,3]="Tarrega" )


$ for (( i=1; i<=${x}; i++ ))
do
    for (( j=1; j<=${max_y[${i}]}; j++ ))
    do
        echo "names[${i},${j}] : ${names[${i},${j}]}"
    done
    echo ""
done

names[1,1] : Paige
names[1,2] : Buckley
names[1,3] : Govan
names[1,4] : Mayer
names[1,5] : King Kong

names[2,1] : Harrison
names[2,2] : Atkins
names[2,3] : Reinhardt
names[2,4] : Wilson

names[3,1] : Larry
names[3,2] : Moe
names[3,3] : Curly
names[3,4] : Shemp

names[4,1] : Vaughan
names[4,2] : Sergovia
names[4,3] : Tarrega
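
If a later step needs one group as an ordinary array (for example, to test a condition against that group only), the simulated two-dimensional array can be unpacked on demand. The following is only a sketch under stated assumptions: `print_group` is a hypothetical helper that is not part of the answer above, and the sample data is hard-coded so that the snippet runs on its own.

```bash
#!/usr/bin/env bash
# Sketch: pull one group out of the simulated 2-D array as an ordinary array.
# print_group is a hypothetical helper; the data below is a shortened sample.

declare -A names=(
    [1,1]="Paige"    [1,2]="Buckley" [1,3]="King Kong"
    [2,1]="Harrison" [2,2]="Atkins"
)
max_y=([1]=3 [2]=2)                        # number of names in each group

# Print the names of group $1, one per line, in index order.
print_group() {
    local x=$1 y
    for (( y = 1; y <= ${max_y[x]:-0}; y++ )); do
        printf '%s\n' "${names[$x,$y]}"
    done
}

for x in "${!max_y[@]}"; do                # indexed-array keys expand in ascending order
    readarray -t group < <(print_group "$x")
    printf 'group %d has %d names: %s\n' "$x" "${#group[@]}" "${group[*]}"
done
```

Because `readarray -t` splits on newlines only, a name with embedded spaces such as "King Kong" stays a single array element.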

【Comments】:

【Solution 3】:

Using a nameref (requires bash 4.3 or later):

#!/usr/bin/env bash

declare -a array1 array2 array3

declare -n array=array$((n=1))

while IFS= read -r line; do
    test "$line" = "" && declare -n array=array$((n=n+1)) || array+=("$line")
done < "$1"

declare -p array1 array2 array3

Invocation:

bash test.sh data
# result
declare -a array1=([0]="Paige" [1]="Buckley" [2]="Govan" [3]="Mayer" [4]="King")
declare -a array2=([0]="Harrison" [1]="Atkins" [2]="Reinhardt" [3]="Wilson")
declare -a array3=([0]="Vaughan" [1]="Sergovia" [2]="Tarrega")
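
The script above hard-codes three arrays and stops at printing them. A possible extension, sketched here under stated assumptions, keeps a group counter so that the number of arrays follows the data, and then reuses the nameref to loop through array1, array2, ... in order. The names `current` and `datafile` are made up for illustration, and the temporary file stands in for the real data file.

```bash
#!/usr/bin/env bash
# Sketch: fill array1, array2, ... through a nameref, then loop over them in order.
# Assumes bash 4.3+ (namerefs); current and datafile are illustrative names.

datafile=$(mktemp)
printf '%s\n' Paige Buckley '' Harrison Atkins Wilson '' Vaughan > "$datafile"

n=1
declare -a "array$n"
declare -n current="array$n"               # current is now an alias for array1
while IFS= read -r line; do
    if [[ -z "$line" ]]; then
        (( ${#current[@]} )) || continue   # ignore leading or repeated blank lines
        n=$((n + 1))
        declare -a "array$n"
        declare -n current="array$n"       # re-point the alias at the next array
    else
        current+=("$line")
    fi
done < "$datafile"
rm -f "$datafile"

# Visit the arrays one after the other; a trailing blank line in the data
# would leave an empty last array, which the inner loop simply skips.
for (( i = 1; i <= n; i++ )); do
    declare -n current="array$i"
    for name in "${current[@]}"; do
        printf 'array%d: %s\n' "$i" "$name"
    done
done
unset -n current                           # drop the alias, not the array it points to
```

After the script runs, `array1`, `array2`, and so on remain available as ordinary arrays, and `n` holds how many were created.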

【Comments】: