Replace quotation marks with \quotation and \quote

Time: 2012-04-01 01:54:30

Tags: bash sed awk

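I have a document that contains many quotation marks. I need to replace every pair of " with \quotation{ (opening) and } (closing) for use in ConTeXt, for example: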

"Do not be afraid," said the tiger, "I am a vegetarian."

This should become:

\quotation{Do not be afraid,} said the tiger, \quotation{I am a vegetarian.}
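  • There are no nested quotations in the document.
  • Replacement should happen only when quotation marks are found in pairs. If a line has an odd number of quotation marks, no change should be made to that line, because this indicates an error.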
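  • If the character "/" appears between an opening and a closing quotation mark, no change should be made to that line either, because this is another indication of an error.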
  • Each paragraph sits on a single line, so the code should process the document one line at a time.

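How can I replace these quotation marks with the format that ConTeXt uses?

5 answers:

Answer 0 (score: 2)

This sounds like a terrible thing to automate; the complexity is daunting: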

She said, "Don't say 'stupid', or I'll smack you.", to John's girlfriend.

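There is no good way to tell apart embedded quotes, contractions, and possessives, and nested quotes can be hard to match. A closing quote forgotten somewhere can completely wreck the output. (I have seen dozens of missing quotation marks in Terry Pratchett's books. Is your content any better?)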
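One way to gauge that risk before converting anything is to list the suspicious lines first. The following is only a sketch, assuming an awk is available; the function name find_odd_quote_lines is made up for illustration. It counts double quotes only, so apostrophes in contractions and possessives do not affect the result.

```shell
# Hypothetical helper: print "line-number: text" for every line whose
# count of double quotes is odd (a likely missing opening or closing quote).
# Reads the files given as arguments, or standard input if there are none.
find_odd_quote_lines() {
    awk '{
        line = $0
        # gsub returns the number of replacements, i.e. the number of quotes;
        # it works on a copy so that $0 stays intact for printing
        n = gsub(/"/, "", line)
        if (n % 2 == 1) print FNR ": " $0
    }' "$@"
}
```

Running it as find_odd_quote_lines file before any conversion shows which lines need a manual look.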

Answer 1 (score: 2)

Here is my awk code. It may not be very elegant, but it does the job.

{
    # Split the current line into pieces on the double-quote character;
    # split() returns the number of pieces.
    n = split($0, a, "\"")
    # An even number of pieces means an odd number of quotation marks.
    if (n % 2 == 0) {
        # Unclosed quotation mark: print the line unchanged.
        # Handle it in some other way if you want.
        print
    } else {
        # Text before the first quotation mark is passed through as is.
        printf "%s", a[1]
        # Pieces at even positions are the quoted ones, so each is
        # preceded by the opening macro and followed by a closing brace.
        for (i = 2; i <= n; ++i) {
            if (i % 2 == 0) {
                printf "%s", "\\quotation{" a[i]
            } else {
                printf "%s", "}" a[i]
            }
        }
        # End the line manually, since printf adds no newline.
        printf "\n"
    }
}

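It works by splitting the line on the quotation character and storing the pieces in the array a. For example, the line "Do not be afraid," said the tiger, "I am a vegetarian." is split into: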

a[1]: 
a[2]: Do not be afraid,
a[3]:  said the tiger, 
a[4]: I am a vegetarian.
a[5]: 

a[1] and a[5] are both empty.
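The same splitting idea can be extended to cover the "/" requirement from the question. What follows is a sketch rather than a tested production script: it assumes the intended rule is that a "/" between an opening and a closing quote marks the line as erroneous and leaves it unchanged, and the function name convert_quotes is hypothetical.

```shell
# Sketch only: split-based conversion with the "/" check added.
# convert_quotes is a made-up name; it reads files or standard input.
convert_quotes() {
    awk '{
        n = split($0, a, "\"")
        # an odd number of pieces means an even number of quotes
        ok = (n % 2 == 1)
        # even-numbered pieces are the quoted ones; reject any with a slash
        for (i = 2; ok && i <= n; i += 2)
            if (index(a[i], "/") > 0) ok = 0
        # leave suspicious lines untouched
        if (!ok) { print; next }
        # rebuild the line: text before the first quote, then each piece
        # prefixed by the opening macro (even) or the closing brace (odd)
        out = a[1]
        for (i = 2; i <= n; i++)
            out = out (i % 2 == 0 ? "\\quotation{" : "}") a[i]
        print out
    }' "$@"
}
```

A slash outside the quotes does not block the conversion, because only the even-numbered pieces are inspected.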

Answer 2 (score: 2)

Not perfect, but you could try something like this:

sed 's/"\(.[^"]*\)"/\\quotation{\1}/g' file

Test:

[jaypal:~/Temp] cat file
"Do not be afraid," said the tiger, "I am a vegetarian."

[jaypal:~/Temp] sed 's/"\(.[^"]*\)"/\\quotation{\1}/g' file
\quotation{Do not be afraid,} said the tiger, \quotation{I am a vegetarian.}
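One limitation of the command above is that it also converts the complete pairs on a line that has an odd number of quotes, which the question wants left alone. A possible guard, sketched here with a hypothetical function name convert_balanced, is to put an address in front of the substitution so that it runs only on lines whose quotes pair up. The "/" rule is not handled in this sketch.

```shell
# Sketch: substitute only on lines whose double quotes pair up completely.
# The address regex reads: non-quote text, then any number of "..." groups,
# each followed by more non-quote text, up to the end of the line.
convert_balanced() {
    sed '/^[^"]*\("[^"]*"[^"]*\)*$/ s/"\([^"]*\)"/\\quotation{\1}/g' "$@"
}
```

Lines that do not match the address are printed unchanged, so a line with a stray quote keeps all of its original quotation marks.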

Answer 3 (score: 2)

Another way:

perl -pe 's/"([^"\\]*)"/\\quotation{$1}/g' < input

Answer 4 (score: 1)

This might work for you:

echo -e 'a "b" c "d" e\na "b" c "d e\na "b" c "d/d" e' |
sed 'h;s/"\([^"/]*\)"/\\quotation{\1}/g;/"/{g;s/^/ERROR: /}'
a \quotation{b} c \quotation{d} e
ERROR: a "b" c "d e
ERROR: a "b" c "d/d" e

If you do not want the ERROR message, then:

echo -e 'a "b" c "d" e\na "b" c "d e\na "b" c "d/d" e' | 
sed 'h;s/"\([^"/]*\)"/\\quotation{\1}/g;/"/g'
a \quotation{b} c \quotation{d} e
a "b" c "d e
a "b" c "d/d" e
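For readers less familiar with the hold space, the first one-liner can be spelled out as a commented script. This is the same logic written as a sketch: the function name convert_or_flag is made up, and the substitution uses | as its delimiter (a choice of this sketch, not of the answer) so that the slash inside the bracket expression cannot be mistaken for a delimiter.

```shell
# Hypothetical wrapper around the answer's sed logic, one command per line.
convert_or_flag() {
    sed '
        # h: copy the original line into the hold space
        h
        # convert each pair whose content has no quote and no slash
        s|"\([^"/]*\)"|\\quotation{\1}|g
        # any quote still present means an unmatched quote or a slash in a pair
        /"/ {
            # g: restore the original line from the hold space, then mark it
            g
            s/^/ERROR: /
        }
    ' "$@"
}
```

Dropping the s/^/ERROR: / line gives the behaviour of the second variant, where faulty lines are printed unchanged.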