【Question title】: Export from a while loop to a CSV file
【Posted】: 2022-01-23 11:18:53
【Question description】:

Given the following script and dataset. Script:

while IFS="," 
   read v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13; 
   do if [ -z "$v12" ]; 
      then echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,'unknown',$v13"; 
   else echo "$v1, $v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,$v12,$v13"; 
   fi;
done 
>train3.csv

Dataset:

PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,,S
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,,S
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,,C

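I want to export the result to a CSV file named "train3.csv", but my approach does not work: it neither shows the changes nor saves them to the CSV file.

How can I fix this?

The expected result is: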

PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,'unknown',S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,'unknown',S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,'unknown',S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,'unknown',Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
8,0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,'unknown',S
9,1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,'unknown',S
10,1,2,"Nasser, Mrs. Nicholas (Adele Achem)",female,14,1,0,237736,30.0708,'unknown',C

It should also create the new CSV file.

Thanks.

【Discussion】:

    Tags: bash csv


    【Solution 1】:

    With slight modifications to your code:

    #!/bin/bash
    
    datafile='dataset.txt'
    outputfile='train3.csv'
    >"$outputfile"
    
    while IFS="," read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13
    do
        if [[ -z "$v12" ]]
        then
            echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,'unknown',$v13"
        else
            echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,$v12,$v13"
        fi
    done < "$datafile" >"$outputfile"
    

    A good reference for reading data from a file is https://mywiki.wooledge.org/BashFAQ/001

    【Discussion】:

    • It would be better to move >>"$outputfile" out of the loop (fewer file opens/closes), right after done.
    • Indeed, thanks, modified.
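
The difference discussed in these comments can be sketched as follows. This is only an illustration under assumed names: the scratch directory, the file names per_line.csv and once.csv, and the "row N" lines are made up and do not come from the answer.

```shell
#!/bin/bash
# Scratch directory so the sketch does not touch real files.
tmpdir=$(mktemp -d)

# Variant 1: ">>" inside the loop reopens the file on every iteration.
for i in 1 2 3; do
    echo "row $i" >> "$tmpdir/per_line.csv"
done

# Variant 2: one ">" after "done" opens the file once for the whole loop.
for i in 1 2 3; do
    echo "row $i"
done > "$tmpdir/once.csv"

# Both files end up with the same three lines.
cmp "$tmpdir/per_line.csv" "$tmpdir/once.csv" && echo "identical"
```

With >> inside the loop, the output file has to be emptied beforehand, otherwise every rerun appends to the previous result. A single > after done truncates the file by itself.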
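    【Solution 2】:

    Don't use Bash for this. Your input CSV contains quoted strings. You probably have no guarantee that each quoted string contains exactly one comma. If one contains fewer or more commas, your code breaks.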
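
To make the problem concrete, here is a minimal sketch of what IFS="," read does with a quoted name. The line is a shortened version of the first data row, and the variable names id, name, sex and rest are chosen only for this illustration.

```shell
#!/bin/bash
# Shortened sample row: three logical columns (id, name, sex).
line='1,"Braund, Mr. Owen Harris",male'

# read splits on every comma and knows nothing about the double quotes.
IFS=, read -r id name sex rest <<EOF
$line
EOF

# The name is cut in half and every later column shifts by one.
echo "id=[$id] name=[$name] sex=[$sex] rest=[$rest]"
```

The variable sex receives the second half of the name, and the real value male only shows up in rest. A name with no comma or with two commas would shift the columns by a different amount, so no fixed variable position is reliable.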

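    Use a dedicated tool instead, one that handles quoted strings correctly. The easiest tool to use is Perl with the module DBD::CSV. The following command installs it on Debian.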

    sudo apt-get install libdbd-csv-perl
    

    Now you can use SQL to fix your CSV file.

    #! /usr/bin/perl
    
    use DBI;
    $dbh = DBI->connect ("dbi:CSV:")
        or die "Cannot connect: $DBI::errstr";
    
    my $sth = $dbh->prepare ("UPDATE train3.csv SET cabin = ? WHERE cabin is null");
    $sth->execute ("'unknown'");
    $sth->finish;
    
    $dbh->disconnect;
    

    If you don't want to learn Perl, you can use the script as a ready-made program on the command line. Save it as csv.pl and make it executable:

    #! /usr/bin/perl
    use DBI;
    $dbh = DBI->connect ("dbi:CSV:")
        or die "Cannot connect: $DBI::errstr";
    my $sth = $dbh->prepare (shift);
    $sth->execute (@ARGV);
    $sth->finish;
    $dbh->disconnect;
    

    Then you can simply pass the query and its arguments:

    ./csv.pl 'UPDATE train3.csv SET cabin = ? WHERE cabin is null' \'unknown\'
    

    Mind the quoting.
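
The quoting matters because the shell strips ordinary quotes before the program sees its arguments. A small sketch of how the three spellings differ (the variable names plain, escaped and wrapped are made up for this illustration):

```shell
#!/bin/bash
# Plain single quotes: the shell consumes them, the value is just the word.
plain='unknown'
# Backslash-escaped quotes survive as literal characters in the value.
escaped=\'unknown\'
# Equivalent spelling: single quotes protected by double quotes.
wrapped="'unknown'"

# Print each value between brackets to make the quotes visible.
printf '[%s]\n' "$plain" "$escaped" "$wrapped"
```

Only the last two forms pass the single quotes through to the program, which is what the expected output with 'unknown' in the Cabin column requires.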

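    【Discussion】:

      【Solution 3】:

      Your script does not work because read does not know where to read from, and the redirection should come after done. I also improved the script with parameter expansion: ${parameter:-word} uses word when the parameter is unset or empty.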

      while IFS="," read -r v1 v2 v3 v4 v5 v6 v7 v8 v9 v10 v11 v12 v13; do
        echo "$v1,$v2,$v3,$v4,$v5,$v6,$v7,$v8,$v9,$v10,$v11,${v12:-'unknown'},$v13"
      done <dataset.csv >train3.csv
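
As a rough illustration of the expansion used in the loop above, the following sketch shows the three cases side by side. The variable names not_set, empty and cabin are invented for this example.

```shell
#!/bin/bash
unset not_set          # variable that does not exist
empty=''               # variable that exists but is empty
cabin='C85'            # variable with a value

# ":-" substitutes the default for both unset and empty variables.
from_unset="${not_set:-'unknown'}"
from_empty="${empty:-'unknown'}"
from_value="${cabin:-'unknown'}"

# Without the colon, only an unset variable triggers the default.
no_colon="${empty-'unknown'}"

printf '[%s]\n' "$from_unset" "$from_empty" "$from_value" "$no_colon"
```

Inside double quotes the single quotes around unknown are kept as literal characters, which is why the loop prints 'unknown' including the quotes. The expansion does not modify the variable itself; it only changes what is substituted.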
      

      You can avoid the while loop by using other tools

      awk -F, -v unknown="'unknown'" 'BEGIN { OFS="," } !$12 {$12=unknown} 1' < dataset.csv >train3.csv
      

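      Both solutions are confused by the comma inside the quoted Name field (which is why field 12 rather than field 11 is modified). If a name contains no comma, the wrong field is checked.
      If you know that Embarked is a field without commas, you can count from the end of the line instead and use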

      awk -F, -v unknown="'unknown'" '
        BEGIN { OFS="," } 
        !$(NF-1) {$(NF-1)=unknown}
        1' < dataset.csv >train3.csv
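
A quick way to check this variant is to feed it one row whose name contains a comma and one whose name does not. In the sketch below, fill_cabin is a hypothetical wrapper around the same awk program, the first row is taken from the dataset, and the second row (passenger 99, "Nocomma") is made up for the test.

```shell
#!/bin/bash
# Hypothetical wrapper around the awk program from the answer.
fill_cabin() {
    awk -F, -v unknown="'unknown'" '
        BEGIN { OFS="," }
        !$(NF-1) { $(NF-1) = unknown }
        1'
}

# First row: real sample with a comma in the name.
# Second row: invented sample without a comma in the name.
result=$(printf '%s\n' \
    '1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S' \
    '99,0,3,"Nocomma",male,30,0,0,000000,9.99,,S' | fill_cabin)

echo "$result"
```

Both rows get 'unknown' in the second-to-last column, regardless of how many commas the name contains. One caveat of the !$(NF-1) test: awk also treats a numeric zero as false, so a Cabin value of 0 would be overwritten too; comparing with $(NF-1) == "" would be stricter.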
      

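      However, you should use a solution that really understands the CSV format, such as @ceving's answer.

      【Discussion】:

      • Looks good, but I need to run the script as ``` ./script.sh
      • Yes, just remove < dataset.csv. You can edit your question and explain how you invoke the script.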