【Question title】: Replace new lines with dashes under specific condition
【Posted】: 2014-05-07 08:15:55
【Question】:

I am trying to write a command (sed/awk) that replaces new lines with dashes under the following condition:

Nothing should be replaced with a dash here, because the CSV record contains no new line:

X00000;111111;1111111111;This is just a text

In this example, however, the new line should be replaced with a dash:

X00000;111111;1111111111;This is a longer text which contains a 
new line sign.

The output after the replacement should look like this:

X00000;111111;1111111111;This is a longer text which contains a - new line sign.

Edit: this should also work for records like this one:

X00000;111111;1111111111;"This is a longer text which contains a
new line sign
or even more

or a line that even contains only a new line sign

"

In this case, the following output is expected:

X00000;111111;1111111111;"This is a longer text which contains a - new line sign - or even more - - or a line that even contains only a new line sign - "

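【Comments】:

  • Is the number of columns in the CSV fixed at 4? Can the new line be anywhere? Can there be multiple new lines?
  • Yes, the number of columns is fixed at 4, and new lines can only occur in the last column. Thanks
  • Please post the expected output. Should empty lines just become - - -?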

Tags: regex awk sed


【Solution 1】:

Here is an option using sed:

$ cat file
X00000;111111;1111111111;This is just a text
X00000;111111;1111111111;This is a longer text which contains a 
new line sign.
X00000;111111;1111111111;"This is a longer text which contains a
new line sign
or even more

or a line that even contains only a new line sign

"

$ sed  ':a;$bc;N;s/\n/ - /;ba;:c;s/ - X00000;/\nX00000;/g' file
X00000;111111;1111111111;This is just a text
X00000;111111;1111111111;This is a longer text which contains a  - new line sign.
X00000;111111;1111111111;"This is a longer text which contains a - new line sign - or even more -  - or a line that even contains only a new line sign -  - "

Explanation:

sed '
    :a                         # Create a label a
    $bc                        # If it is last line, branch to label c
    N                          # Append next line to pattern space
    s/\n/ - /                  # Remove the \n and replace it with -
    ba                         # Keep repeating above steps until file is complete
    :c                         # Our label c. Do the following when end of file is reached
    s/ - X00000;/\nX00000;/g   # We do this substitution to add newlines where needed. 
' file

【Comments】:

    【Solution 2】:

    With awk you can do:

    $ awk -F ';' 'NF<4{print p, "-", $0;p="";next} p{print p} {p=$0} END{if (p) print p}' file.csv
    X00000;111111;1111111111;This is just a text
    X00000;111111;1111111111;This is a longer text which contains a  - new line sign.
    

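    【Comments】:

    • This is very close to the solution, but the line break is still there, so it looks like this: X00000;111111;1111111111;This is a longer text which contains a - new line sign.
    • Sorry, I had trouble with the formatting. I want the line breaks removed and replaced with a dash. With your command there is only a dash on the new line, but the line break is still there.
    • You can see my awk output in my answer. There is no line break before or after the dash. Where exactly are you getting the extra line break?
    • There are still new lines when the input text contains two new lines, but after running the script a few times it is fine.
    • Can there be more than one new line? It would be better to provide a good data sample in your question so that I can understand it better.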
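The comments above report that line breaks survive when a field contains more than one new line. The following is a minimal sketch of one way to handle that case, not the answerer's own command. It keeps the field-count idea from Solution 2: a line with at least four `;`-separated fields is treated as the start of a new record, and every other line is appended to the pending record with " - ". It therefore assumes that continuation lines never contain three or more semicolons. The function name `join_records` is hypothetical, and stripping trailing blanks before joining is an added choice so that the result shows `a - new` rather than `a  - new`.

```shell
# join_records: hypothetical helper; reads CSV text from stdin or from files.
# Assumption: a new record always has at least 4 ';'-separated fields and
# continuation lines have fewer.
join_records() {
  awk -F ';' '
    { sub(/[ \t]+$/, "") }                    # drop trailing blanks (added choice)
    NR == 1 { rec = $0; next }                # first line opens the first record
    NF >= 4 { print rec; rec = $0; next }     # new record: flush the pending one
            { rec = rec " - " $0 }            # continuation line: join with a dash
    END     { if (NR > 0) print rec }         # flush the last record
  ' "$@"
}
```

Usage would be `join_records file.csv`. An empty continuation line contributes only the separator, so two consecutive line breaks show up as two dashes in a row.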
    【Solution 3】:

    Here is an awk that simply joins everything together.

    $ awk '{printf (NR==1||!/X00000/)?$0:RS $0} END {print ""}' file
    X00000;111111;1111111111;This is just a text
    X00000;111111;1111111111;This is a longer text which contains a new line sign.
    X00000;111111;1111111111;"This is a longer text which contains anew line signor even moreor a line that even contains only a new line sign"
    

    It does not add the - between the joined lines.
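As a sketch of how the same joining idea could also insert the dashes, the variant below picks a separator for each line: nothing before the first line, a line break before a line that starts a record, and " - " before anything else. Like the answer above, it assumes every record starts with `X00000;`. The function name `join_with_dashes` is hypothetical. It also passes an explicit `"%s%s"` format to `printf`, so a `%` in the data is not interpreted as a format specifier.

```shell
# join_with_dashes: hypothetical helper; reads from stdin or from files.
# Assumption: every record begins with the literal prefix "X00000;".
join_with_dashes() {
  awk '
    {
      # Choose the separator that goes in front of the current line.
      sep = (NR == 1) ? "" : (/^X00000;/ ? "\n" : " - ")
      printf "%s%s", sep, $0
    }
    END { if (NR > 0) print "" }   # terminate the last record
  ' "$@"
}
```

Usage would be `join_with_dashes file`. Trailing blanks in the input are kept, so a line that ends in a space produces two spaces before the dash, as in the sed output of Solution 1.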

    【Comments】:
