【Question title】: Replacing quotation marks with "``" and "''"
【Posted】: 2012-02-16 12:44:36
【Question】:

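I have a document containing many " marks, and I want to convert it for use in TeX.

TeX uses two ` marks as the opening quote and two ' marks as the closing quote.

I only want to make this change when " appears an even number of times on a single line (for example, 2, 4, or 6 " marks on the line). For example: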

"This line has 2 quotation marks."
--> ``This line has 2 quotation marks.''

"This line," said the spider, "Has 4 quotation marks."
--> ``This line,'' said the spider, ``Has 4 quotation marks.''

"This line," said the spider, must have a problem, because there are 3 quotation marks."
--> (unchanged)

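My sentences never span lines, so there is no need to check across multiple lines.

Quotations that use single quotes are rare, so I can change those by hand.

How can I convert these?

【Comments on the question】:

    Tags: bash quotes tex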


    【Solution 1】:

    Here is a one-liner that works for me (it relies on gensub, so it needs GNU awk):

    awk -F\" '{if((NF-1)%2==0){res=$0;for(i=1;i<NF;i++){to="``";if(i%2==0){to="'\'\''"}res=gensub("\"", to, 1, res)};print res}else{print}}' input.txt >output.txt
    

    And here is a longer version of the one-liner, with comments:

    BEGIN {
        FS = "\"" # set the field separator to a double quote before any line is read
    }
    {
        if ((NF-1) % 2 == 0) { # if the line contains an even number of double quotes
            res = $0 # save the original line in res
            for (i = 1; i < NF; i++) { # for each double quote
                to = "``" # odd occurrences are opening quotes: replace with ``
                if (i % 2 == 0) { # even occurrences are closing quotes: replace with ''
                    to = "''"
                }
                # replace the first remaining " in res with to and store the result in res
                res = gensub("\"", to, 1, res)
            }
            print res # print the converted line
        } else {
            print # print the original line when there is nothing to change
        }
    }
    

    You can run this script with:

    awk -f replace-quotes.awk input.txt >output.txt
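Since gensub is a GNU awk extension, the script above will not run under other awk implementations. Below is a minimal sketch of the same idea using only the standard sub and gsub functions. The function name tex_quotes and the variables lq and rq are made up for illustration and are not part of the answer.

```shell
# Hypothetical helper (not from the answer above): rewrites "..." pairs as
# TeX quotes on lines holding an even number of double quotes.
# Reads standard input and writes standard output.
tex_quotes() {
    awk -v lq='``' -v rq="''" '
    {
        line = $0
        n = gsub(/"/, "\"", line)   # gsub returns the match count, i.e. the number of quotes
        if (n % 2 == 0) {
            for (i = 1; i <= n; i++)   # odd occurrences open, even occurrences close
                sub(/"/, (i % 2 ? lq : rq), line)
        }
        print line                  # lines with an odd count are printed unchanged
    }'
}
```

It would be used as `tex_quotes < input.txt > output.txt`. Counting with gsub instead of splitting on FS also avoids any dependence on when the field separator is set.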
    

    【Comments】:

      【Solution 2】:

      Here is my one-liner using repeated sed:

      cat file.txt | sed -e 's/"\([^"]*\)"/`\1`/g' | sed '/"/s/`/\"/g' | sed -e 's/`\([^`]*\)`/``\1'\'''\''/g'
      

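      (Note: this will not work correctly if the file already contains backticks (`); otherwise it should do the trick.)

      Edit:

      Removed the backtick bug by simplifying; it now works for all cases: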

      cat file.txt | sed -e 's/"\([^"]*\)"/``\1'\'\''/g' | sed '/"/s/``/"/g' | sed '/"/s/'\'\''/"/g'
      

      With comments:

      cat file.txt |                           # read file
      sed -e 's/"\([^"]*\)"/``\1'\'\''/g' |    # initial replace
      sed '/"/s/``/"/g' |                      # revert `` to " on lines with extra "
      sed '/"/s/'\'\''/"/g'                    # revert '' to " on lines with extra "
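One way to check the edited pipeline is to feed it the three sample lines from the question. The sketch below wraps the same three sed stages in a function; the name sed_quotes is invented here for illustration.

```shell
# Hypothetical wrapper around the three-stage sed pipeline; reads standard input.
sed_quotes() {
    sed -e 's/"\([^"]*\)"/``\1'\'\''/g' |   # stage 1: convert every matched pair
    sed -e '/"/s/``/"/g' |                  # stage 2: leftover quote found, undo the opening marks
    sed -e '/"/s/'\'\''/"/g'                # stage 3: same lines, undo the closing marks
}

# The three sample lines from the question: two even lines and one odd line.
printf '%s\n' \
    '"This line has 2 quotation marks."' \
    '"This line," said the spider, "Has 4 quotation marks."' \
    '"This line," said the spider, must have a problem, because there are 3 quotation marks."' |
    sed_quotes
```

The first two lines should come out converted and the third unchanged. Note that stages 2 and 3 also rewrite any `` or '' that was already present on a line with an odd number of double quotes.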
      

      【Comments】:

      • It's not pretty, but I came up with a sed one-liner
      • Since when have bash scripts been pretty?
      【Solution 3】:

      Using awk

      awk '{n=gsub("\"","&")}!(n%2){while(n--){n%2?Q="`":Q=q;sub("\"",Q Q)}}1' q=\' in
      

      Explanation

      awk '{
        n=gsub("\"","&")   # set n to the number of double quotes in the current line
      }
      !(n%2){              # if the number of quotes is even
        while(n--){        # once per double quote; n is already decremented inside the body
          n%2?Q="`":Q=q    # odd n: opening quote (backtick); even n: closing quote (single quote)
          sub("\"",Q Q)    # replace the next double quote with two of whatever Q is
        }
      }1                   # print every line, converted or untouched
      ' q=\' in            # set the awk variable q to a single quote and read the file named in
      

      Using sed

      sed '/^\([^"]*"[^"]*"[^"]*\)*$/s/"\([^"]*\)"/``\1'\'\''/g' in
      

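      【Comments】:

      • I have been staring at this awk solution (and countless iterations of it :P) for over 2 hours now, and I honestly don't think it can be done any better. Nice use of gsub to count the quotes, nice use of the modulo operator, and nice use of the ternary operator. +1
      【Solution 4】: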

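      This might work for you:

      sed 'h;s/"\([^"]*\)"/``\1'\'\''/g;/"/g' file
      

      Explanation:

      • Copy the original line into the hold space: h
      • Replace each pair of " marks: s/"\([^"]*\)"/``\1'\'\''/g
      • Check for a leftover (unpaired) " and, if one is found, restore the original line from the hold space: /"/g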
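The three steps can also be spelled out as separate sed expressions, which makes the quoting easier to follow. This is only a sketch: the wrapper name hold_quotes and the shell variable sq are invented for illustration, and sq is used just to keep the two closing single quotes out of the nested quoting.

```shell
# Hypothetical wrapper; reads standard input and writes standard output.
hold_quotes() {
    sq="''"   # the two closing single quotes, kept in a shell variable
    # -e h        : copy the untouched line into the hold space
    # -e s/.../g  : convert every matched pair of double quotes
    # -e /"/g     : a double quote is still left, so the count was odd;
    #               the g command copies the hold space back, restoring the line
    sed -e 'h' \
        -e 's/"\([^"]*\)"/``\1'"$sq"'/g' \
        -e '/"/g'
}
```

It would be used as `hold_quotes < file`. Because the original line is restored from the hold space, lines with an odd count come back exactly as they went in, including any `` or '' they already contained.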

      【Comments】:

      • I know. :) +1. This solution explains it well.