
Question

How can I use Perl to convert this multi-line text:

Colours:
Red
Green
Yellow
Blue

into:

Colours: Red
Green
Yellow
Blue

Answer 1

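A Unicode linebreak grapheme is matched by \R, which is supported from v5.10 onward.

So you can change the first linebreak in the whole file to a space this way: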

$ perl -Mv5.10 -CSD -i.orig -0777 -pe 's/\R/ /' some_utf8_file.txt

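There are other ways that do not waste as much memory, but getting them right can be tricky. You could leave off -0777 and see whether that is good enough in your case. Note that without -0777 the substitution runs once per line, so it replaces the linebreak at the end of every line rather than only the first one.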
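One lower-memory alternative can be sketched as a line-by-line filter that touches only the first line. This is a minimal sketch, not the answer author's code: the helper name join_first_two_lines and the in-memory sample input are assumptions made for illustration.

```perl
#!/usr/bin/perl
use v5.10;        # \R needs v5.10 or later
use strict;
use warnings;

# Hypothetical helper: copy lines from $in to $out, replacing only the
# linebreak that ends the first line with a space.
sub join_first_two_lines {
    my ($in, $out) = @_;
    my $is_first = 1;
    while (my $line = <$in>) {
        if ($is_first) {
            $line =~ s/\R\z/ /;   # only the first line is modified
            $is_first = 0;
        }
        print {$out} $line;
    }
    return;
}

# Demonstration on in-memory filehandles (sample data from the question).
my $input  = "Colours:\nRed\nGreen\nYellow\nBlue\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open in-memory input: $!";
open my $out, '>', \$output or die "Cannot open in-memory output: $!";
join_first_two_lines($in, $out);
close $out;
close $in;

print $output;
```

Because only one line is held at a time, memory use does not grow with the file size. Real filehandles, such as \*STDIN and \*STDOUT, can be passed in place of the in-memory ones.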

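Edit: Regex Escapes

Here are the escapes supported in regular expressions, along with the release that first supported each one; from v5.6 onward, version numbers are rounded up to the next even release.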

Release  Rx Escape   Meaning
=======  ==========  ===========================================================================================
v1.0     \0          Match character number zero (U+0000, NULL, NUL).
v1.0     \0N,\0NN    Match octal character up to octal 077.
v1.0     \N          Match Nth capture group (decimal) if not in charclass and that many seen, else (octal) character up to octal 377.
v1.0     \NN         Match Nth capture group (decimal) if not in charclass and that many seen, else (octal) character up to octal 377.
v1.0     \NNN        Match Nth capture group (decimal) if not in charclass and that many seen, else (octal) character up to octal 377.
v4.0     \a          Match the alert character (ALERT, BEL).
v5.0     \A          True at the beginning of a string only, not in charclass.
v1.0     \b          Match the backspace char (BACKSPACE, BS) in charclass only.
v1.0     \b          True at Unicode word boundary, outside of charclass only.
v1.0     \B          True when not at Unicode word boundary, not in charclass.
v4.0     \cX         Match ASCII control character Control-X (\cZ, \c[, \c?, etc).
v5.6     \C          Match one byte (C char) even in UTF‑8 (dangerous!), not in charclass.
v1.0     \d          Match any Unicode digit character.
v1.0     \D          Match any Unicode nondigit character.
v4.0     \e          Match the escape character (ESCAPE, ESC, not backslash).
v4.0     \E          End case (\F, \L, \U) or quotemeta (\Q) translation, only if interpolated.
v1.0     \f          Match the form feed character (FORM FEED, FF).
v5.16    \F          Foldcase (not lowercase) till \E, only if interpolated.
v5.10    \g{GROUP}   Match the named or numbered capture group, not in charclass.
v5.0     \G          True at end-of-match position of prior m//g or pos() setting, not in charclass.
v5.10    \h          Match any Unicode horizontal whitespace character.
v5.10    \H          Match any Unicode character except horizontal whitespace.
v5.10    \k<GROUP>   Match the named capture group; also \k'NAME', not in charclass.
v5.10    \K          Keep text to the left of \K out of match, not in charclass.
v4.0     \l          Lowercase (not foldcase) next character only, only if interpolated.
v4.0     \L          Lowercase (not foldcase) till \E, only if interpolated.
v1.0     \n          Match the newline character (usually LINE FEED, LF).
v5.12    \N          Match any character except newline.
v5.6     \N{NAME}    Match the named character or named alias, or if outside of charclass named sequence, but only if interpolated and charnames loaded.
v5.14    \o{NNNNNN}  Match the character given in any number of octal digits.
v5.6     \p{PROP}    Match any character with the named property.
v5.6     \P{PROP}    Match any character without the named property.
v4.0     \Q          Quote (de-meta) metacharacters till \E.
v1.0     \r          Match the return character (usually CARRIAGE RETURN, CR).
v5.10    \R          Match any Unicode linebreak grapheme, only outside of charclass.
v1.0     \s          Match any Unicode whitespace character except \cK.
v1.0     \S          Match any Unicode nonwhitespace character or \cK.
v1.0     \t          Match the tab character (CHARACTER TABULATION, HT).
v4.0     \u          Titlecase (not uppercase) next character only, only if interpolated.
v4.0     \U          Uppercase (not titlecase) till \E, only if interpolated.
v5.10    \v          Match any Unicode vertical whitespace character.
v5.10    \V          Match any character except Unicode vertical whitespace.
v1.0     \w          Match any Unicode “word” character (alphabetics, digits, combining marks, and connector punctuation).
v1.0     \W          Match any Unicode nonword character.
v4.0     \xH         Match the character given in one hex digit.
v4.0     \xHH        Match the character given in two hex digits.
v5.6     \x{HHHHHH}  Match the character given in any number of hex digits.
v5.6     \X          Match Unicode extended grapheme cluster, only outside of charclass.
v5.5     \z          True at end of string only.
v5.0     \Z          True right before optional final newline.
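
To illustrate why \R from the table above can be preferable to \n for this task, here is a small sketch that compares the two on text with CRLF line endings. The sample string and variable names are assumptions made for illustration.

```perl
#!/usr/bin/perl
use v5.10;        # enables say and \R
use strict;
use warnings;

# Sample text with Windows-style CRLF line endings.
my $crlf_text = "Colours:\r\nRed\r\nGreen\r\n";

(my $with_n = $crlf_text) =~ s/\n/ /;   # replaces only the LF, leaving the CR behind
(my $with_R = $crlf_text) =~ s/\R/ /;   # replaces the CR LF pair as a single unit

# Print both results with control characters made visible.
for my $result ($with_n, $with_R) {
    (my $shown = $result) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    say $shown;
}
```

With \n, a stray carriage return stays in front of the inserted space; with \R, the whole linebreak sequence is replaced.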

Answer 2

The interesting case is when you are reading the file line by line:

#!/usr/bin/perl

use strict; use warnings;

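# Read the first line and strip its trailing newline.
if (defined(my $first = <DATA>)) {
    chomp $first;
    # Append the second line, if any, with a space where the newline was.
    if (defined(my $second = <DATA>)) {
        $first .= " $second";
    }
    print $first;
}

# Pass the remaining lines through unchanged.
print while <DATA>;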

__DATA__
Colours:
Red
Green
Yellow
Blue

Answer 3

How about:

$string =~ s/\n/ /;

Answer 4

Simple question, simple answer:

#!/usr/bin/perl

use strict;
use warnings;

my $str = 'Colours:
Red
Green
Yellow
Blue';

$str =~ s/\n/ /;

print "$str\n";

In Perl, how do I join the first two lines of input?
