Replacing quotation marks with "``" and "''": answers

【Question Title】: Replacing quotation marks with "``" and "''"
【Posted】: 2012-02-16 12:44:36
【Question Description】:

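I have a document containing many " marks, and I want to convert it for use in TeX.

TeX uses 2 ` marks as the opening quotation mark and 2 ' marks as the closing quotation mark.

I only want to change these when " appears an even number of times on a single line (e.g. there are 2, 4, or 6 " on the line). For example: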

"This line has 2 quotation marks."
--> ``This line has 2 quotation marks.''

"This line," said the spider, "Has 4 quotation marks."
--> ``This line,'' said the spider, ``Has 4 quotation marks.''

"This line," said the spider, must have a problem, because there are 3 quotation marks."
--> (unchanged)

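My sentences never break across lines, so there is no need to check multiple lines.

Quotations that use single quotes are rare, so I can change those manually.

How can I convert these?

【Question Comments】:

Tags: bash quotes tex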

【Solution 1】:

Here is a one-liner that works for me:

awk -F\" '{if((NF-1)%2==0){res=$0;for(i=1;i<NF;i++){to="``";if(i%2==0){to="'\'\''"}res=gensub("\"", to, 1, res)};print res}else{print}}' input.txt >output.txt

And here is a long version of that one-liner, with comments:

BEGIN {
    FS = "\"" # set the field separator to a double quote before any line is read
}
{
    if ((NF-1) % 2 == 0) { # if the number of double quotes in the line is even
        res = $0 # save the original line in res
        for (i = 1; i < NF; i++) { # for each double quote
            to = "``" # odd-numbered quotes open, so replace them with ``
            if (i % 2 == 0) { # even-numbered quotes close, so replace them with ''
                to = "''"
            }
            # replace the first remaining " in res with to (gensub is a GNU awk extension)
            res = gensub("\"", to, 1, res)
        }
        print res # print the converted line
    } else {
        print # print the original line when there is nothing to change
    }
}

You can run this script with:

awk -f replace-quotes.awk input.txt >output.txt
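
Because gensub is a GNU awk extension, the script above will not run under awk implementations that lack it (mawk, for example). A minimal portable sketch, assuming only POSIX awk features, splits each line on " and rejoins the fields with alternating replacements. The function name tex_quotes is made up for illustration, and it reads standard input:

```shell
# Hypothetical helper (not from the original answer): reads stdin, writes stdout.
tex_quotes() {
    awk -F'"' '
    NF > 1 && NF % 2 == 1 {   # NF-1 quotes on the line; an even count means NF is odd
        out = $1
        for (i = 2; i <= NF; i++)
            # even i follows an opening quote, odd i follows a closing one
            out = out (i % 2 == 0 ? "``" : "\047\047") $i
        $0 = out
    }
    { print }'
}

# Usage: convert input.txt into output.txt
# tex_quotes <input.txt >output.txt
```

The octal escape \047 stands for a single quote, which avoids fighting the shell's own quoting inside the awk program.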

【Comments】:

【Solution 2】:

Here is my one-liner using repeated sed:

cat file.txt | sed -e 's/"\([^"]*\)"/`\1`/g' | sed '/"/s/`/\"/g' | sed -e 's/`\([^`]*\)`/``\1'\'''\''/g'

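(Note: this will not work correctly if the file already contains backticks (`); otherwise it should do the trick.)

EDIT:

Removed the backtick bug by simplifying; it now works in all cases: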

cat file.txt | sed -e 's/"\([^"]*\)"/``\1'\'\''/g' | sed '/"/s/``/"/g' | sed '/"/s/'\'\''/"/g'

With comments:

cat file.txt |                           # read file
sed -e 's/"\([^"]*\)"/``\1'\'\''/g' |    # initial replace
sed '/"/s/``/"/g' |                      # revert `` to " on lines with extra "
sed '/"/s/'\'\''/"/g'                    # revert '' to " on lines with extra "
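
The three sed processes can also be collapsed into a single invocation, since sed applies each -e expression in order to every line. A sketch under that assumption, reading standard input; the function name tex_quotes_sed is hypothetical:

```shell
# Hypothetical wrapper (not from the original answer): reads stdin, writes stdout.
tex_quotes_sed() {
    # 1) convert every "..." pair, 2) and 3) undo the conversion on lines
    # that still contain a stray " afterwards (odd number of quotes)
    sed -e 's/"\([^"]*\)"/``\1'\'\''/g' \
        -e '/"/s/``/"/g' \
        -e '/"/s/'\'\''/"/g'
}

# Usage:
# tex_quotes_sed <file.txt
```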

【Comments】:

It's not pretty, but I came up with a sed one-liner.
Since when are bash scripts pretty?

【Solution 3】:

Using `awk`

awk '{n=gsub("\"","&")}!(n%2){while(n--){Q=n%2?"`":q;sub("\"",Q Q)}}1' q=\' in

Explanation

awk '{
  n=gsub("\"","&") # set n to the number of quotes in the current line
}
!(n%2){ # if there is an even number of quotes
  while(n--){ # as long as double quotes remain (n is already decremented inside the body)
    Q=n%2?"`":q # odd n means an opening quote (backtick), even n a closing one (single quote)
    sub("\"",Q Q) # replace the next double quote with two of whatever Q is
  }
}1 # print every line, converted or not
' q=\' in # set the q variable to a single quote and pass the file 'in' as input

Using `sed`

sed '/^\([^"]*"[^"]*"[^"]*\)*$/s/"\([^"]*\)"/``\1'\'\''/g' in

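【Comments】:

I have been staring at this awk solution (and countless iterations of it :P) for over 2 hours, and honestly I don't think it can be done any better. Nice use of gsub to count the quotes, nice use of the modulo operator, and nice use of the ternary operator. +1

【Solution 4】:

This might work for you: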

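sed 'h;s/"\([^"]*\)"/``\1'\'\''/g;/"/g' file

Explanation:

Copy the original line to the hold space: h
Replace each pair of ": s/"\([^"]*\)"/``\1'\'\''/g
Check for a leftover (odd) " and, if one is found, restore the original line: /"/g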
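
The steps above can be tried on two of the sample lines from the question. This is only a sketch: the wrapper name tex_quotes_hold is hypothetical, and the closing quote is spelled '\'\'' so that sed receives two apostrophes in the replacement:

```shell
# Hypothetical wrapper (not from the original answer): reads stdin, writes stdout.
tex_quotes_hold() {
    # h      : save the untouched line in the hold space
    # s///g  : convert every "..." pair
    # /"/g   : if a " is left over, copy the saved line back
    sed 'h;s/"\([^"]*\)"/``\1'\'\''/g;/"/g'
}

tex_quotes_hold <<'EOF'
"This line has 2 quotation marks."
"This line," said the spider, must have a problem, because there are 3 quotation marks."
EOF
```

The first line should come out wrapped in `` and '', while the second, which has 3 quotation marks, should be printed unchanged.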

【Comments】:

I know. :) +1. This solution explains it well.
