Question

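I would like to know whether I can remove a \n (newline) only when the current line contains one or more keywords from a list; for example, I want to remove the \n if the line contains the word hello or world.

Example: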

this is an original
file with lines
containing words like hello
and world
this is the end of the file

The result would be:

this is an original
file with lines
containing words like hello and world this is the end of the file

I would like to use sed or awk and, if needed, grep, wc, or any other command that works for this purpose. I want to be able to do this on many files.

Answer 1

Using awk you can do this:

awk '/hello|world/{printf "%s ", $0; next} 1' file
this is an original
file with lines
containing words like hello and world this is the end of the file

Answer 2

With sed this is simple:

sed -r ':a;$!{N;ba};s/((hello|world)[^\n]*)\n/\1 /g' file

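Explanation

:a;$!{N;ba} reads the whole file into the pattern space, like this: this is an original\nfile with lines\ncontaining words like hello\nand world\nthis is the end of the file$
s/((hello|world)[^\n]*)\n/\1 /g searches for the keyword hello or world and replaces the \n that ends that line with a space;
the g flag applies the substitution to every match of the regular expression, not just the first one.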
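
The two steps described above can be tried separately. This is a minimal sketch, assuming GNU sed (needed for `l 0` and for `\n` inside a bracket expression); the function names show_slurped and join_with_sed are hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Assumes GNU sed. Function names are made up for illustration.

# Step 1: slurp the whole input into the pattern space and print it in
# unambiguous form; "l 0" is a GNU extension that disables line wrapping.
show_slurped() {
    sed -n ':a;$!{N;ba};l 0' "$@"
}

# Step 2: same slurp, then replace the newline that ends every line
# containing hello or world with a space.
join_with_sed() {
    sed -E ':a;$!{N;ba};s/((hello|world)[^\n]*)\n/\1 /g' "$@"
}
```

Calling `show_slurped file` prints the single-line view with literal `\n` markers and a trailing `$`; `join_with_sed file` prints the joined result. Because the whole file is held in memory, this approach is best suited to files of moderate size.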

Answer 3

A non-regex approach:

awk '
    BEGIN {
        # define the word list
        w["hello"]
        w["world"]
    }
    {
        printf "%s", $0
        for (i=1; i<=NF; i++) 
            if ($i in w) {
                printf " "
                next
            }
        print ""
    }
'
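
Since the question mentions a list of keywords, the word list could also be passed in from the shell instead of being hard-coded. The following is a sketch under that assumption; the function name join_by_wordlist and the variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: first argument is a space-separated word list,
# remaining arguments are input files (stdin if none are given).
join_by_wordlist() {
    words=$1
    shift
    awk -v words="$words" '
        BEGIN {
            # Build the lookup table from the space-separated list.
            n = split(words, list, " ")
            for (i = 1; i <= n; i++)
                w[list[i]]
        }
        {
            printf "%s", $0
            # A line is joined only if a whole field equals a keyword.
            for (i = 1; i <= NF; i++)
                if ($i in w) {
                    printf " "
                    next
                }
            print ""
        }
    ' "$@"
}
```

For example, `join_by_wordlist "hello world" file`. Note that this compares whole whitespace-separated fields, so a line containing `hello,` or `helloworld` is not joined, unlike the regex-based answers.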

Or a perl one-liner:

perl -pe 'BEGIN {@w = qw(hello world)} s/\n/ / if grep {$_ ~~ @w} split'

To edit the file in place, do the following:

awk '...' filename > tmpfile && mv tmpfile filename
perl -i -pe '...' filename
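
Because the question asks about many files, the temporary-file recipe can be wrapped in a loop. This is only a sketch: the function name join_keyword_lines is hypothetical, it reuses the awk program from Answer 1, and it assumes mktemp is available.

```shell
#!/bin/sh
# Hypothetical helper: rewrite each given file in place, joining any line
# that contains hello or world with the line that follows it.
join_keyword_lines() {
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        # Write the result to a temporary file first, and touch the
        # original only if awk succeeded.
        if awk '/hello|world/{printf "%s ", $0; next} 1' "$f" > "$tmp"; then
            # Copy the contents back instead of using mv, so the original
            # file keeps its permissions.
            cat "$tmp" > "$f"
            rm -f "$tmp"
        else
            rm -f "$tmp"
            return 1
        fi
    done
}
```

It could then be called as `join_keyword_lines *.txt`.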

Answer 4

This might work for you (GNU sed):

sed -r ':a;/^.*(hello|world).*\'\''/M{$bb;N;ba};:b;s/\n/ /g' file

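This checks whether the last line of a possibly multi-line pattern space contains the required string(s). If it does, another line is read, until the end of the file is reached or the last line no longer contains the string(s). The newlines are then replaced with spaces and the line is printed.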
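
The same read-ahead loop can be written in awk, which may be easier to follow than the sed version. This is a sketch of the idea rather than a translation of the sed script; the function name join_lookahead is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper mirroring the "read another line while the last
# one matches" loop described above.
join_lookahead() {
    awk '
        {
            line = $0
            # Keep pulling in the next line while the most recently read
            # line contains a keyword and input remains.
            while ($0 ~ /hello|world/ && (getline) > 0)
                line = line " " $0
            print line
        }
    ' "$@"
}
```

If the final line of the input contains a keyword, getline finds nothing more to read and the accumulated line is printed as it is.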

Answer 5

$ awk '{ORS=(/hello|world/?FS:RS)}1' file
this is an original
file with lines
containing words like hello and world this is the end of the file

Answer 6

sed -n '
:beg
/hello/ b keep
/world/ b keep
H;s/.*//;x;s/\n/ /g;p;b
: keep
H;s/.*//
$ b beg
' YourFile

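This is because the check on the current line may already include an earlier hello or world.

How it works:

On each pattern match, the line is kept in the hold buffer. Otherwise, the hold buffer is loaded, the \n characters are replaced with spaces (using an exchange and emptying the current line, since the available buffer operations are limited), and the content is printed. A special case is added for the last line, which would normally be held and therefore not printed.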
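
The hold-and-flush idea described above can also be sketched in awk, with an ordinary variable playing the role of the hold buffer. This is an illustration under that assumption, not the author's script; the names join_buffered and buf are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: buf stands in for sed's hold buffer.
join_buffered() {
    awk '
        # Append every line to the buffer, space-separated.
        { buf = (buf == "" ? $0 : buf " " $0) }
        # A line without a keyword ends the group: print and reset.
        !/hello|world/ { print buf; buf = "" }
        # Special case: the input ended on a keyword line, so flush
        # whatever is still held.
        END { if (buf != "") print buf }
    ' "$@"
}
```

With this version the output always ends with a newline, even when the last input line contains a keyword.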

Remove the \n newline if the string contains a keyword
