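How to print multiple pattern matches on separate lines

Time: 2019-04-17 13:25:54

Tags: bash awk sed grep

I have a file that I would like to process using bash. The solution can use awk, sed, grep, or a similar tool. The file contains multiple occurrences of a pattern1 ... pattern2 pair on a single line. I would like to extract everything between each pair of occurrences and print each result on a separate line.

I have already tried this: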

cat file.txt | grep -o 'pattern1.*pattern2'

But because `.*` is greedy, this prints everything from the first pattern1 to the last matching pattern2.

$ cat file.txt
pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.

I would like to get:

pattern1 this is the first content pattern2
pattern1 this is the second content pattern2

3 answers:

Answer 0 (score: 1)

This might work for you (GNU sed):

sed -n '/pattern1.*pattern2/{s/pattern1/\n&/;s/.*\n//;s/pattern2/&\n/;P;D}' file

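Set the -n option so that nothing is printed unless explicitly requested.

Process only lines that contain pattern1 followed by pattern2.

Prepend a newline to the first pattern1.

Remove everything up to and including the introduced newline.

Append a newline after the first remaining pattern2.

Print the first line of the pattern space, delete it, and repeat on whatever is left.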
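The steps above can be traced on a line with more than two pairs. The following is a minimal sketch, assuming GNU sed (for `\n` in the replacement text); the function name `extract_pairs` and the three-pair sample line are made up for illustration and are not part of the answer.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed. The function name and sample line are hypothetical.

extract_pairs() {
    sed -n '/pattern1.*pattern2/{
        # put a newline in front of the first pattern1
        s/pattern1/\n&/
        # drop everything up to and including that newline
        s/.*\n//
        # put a newline after the first pattern2 that remains
        s/pattern2/&\n/
        # print the text up to the first newline (one complete pair)
        P
        # delete that text and rerun the script on the remainder
        D
    }'
}

printf '%s\n' 'pattern1 a pattern2 x pattern1 b pattern2 y pattern1 c pattern2 z' |
    extract_pairs
# prints:
# pattern1 a pattern2
# pattern1 b pattern2
# pattern1 c pattern2
```

Because `D` restarts the script without reading new input, each pass peels off one pair; the trailing ` z` no longer matches the address and is silently dropped by `-n`.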

Answer 1 (score: 0)

Try GNU sed:

 sed -E 's/(pattern2).*(pattern1)(.*\1).*/\1\n\2\3/' file.txt
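A quick way to check this command is to feed it the sample line from the question. This is a sketch assuming GNU sed (`-E` and `\n` in the replacement); the function name `split_two_pairs` is made up for illustration. Note that the substitution runs once per line and its pattern extends from the first pattern2 to the end of the line, so it makes a single split and leaves any text before the first pattern2 in place. The check therefore uses exactly the two pairs from the question.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed. The function name is hypothetical.

split_two_pairs() {
    # \1 = first pattern2, \2 = a later pattern1, \3 = text through a later pattern2
    sed -E 's/(pattern2).*(pattern1)(.*\1).*/\1\n\2\3/'
}

line='pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.'

printf '%s\n' "$line" | split_two_pairs
# prints:
# pattern1 this is the first content pattern2
# pattern1 this is the second content pattern2
```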

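Answer 2 (score: 0)

If you do not have access to a tool that supports lookarounds, this approach is lengthy but works robustly using standard tools on any UNIX box: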

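awk '{
    # Escape @, { and } so that { and } are free to stand in for the two patterns
    gsub(/@/,"@A"); gsub(/{/,"@B"); gsub(/}/,"@C"); gsub(/pattern1/,"{"); gsub(/pattern2/,"}")
    out = ""
    # Collect each {...} span that contains no other brace, one per output line
    while( match($0,/{[^{}]*}/) ) {
        out = (out=="" ? "" : out ORS) substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
    $0 = out
    # Restore the patterns, then undo the escaping in reverse order
    gsub(/}/,"pattern2"); gsub(/{/,"pattern1"); gsub(/@C/,"}"); gsub(/@B/,"{"); gsub(/@A/,"@")
} 1' file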

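The above works by creating characters that cannot be present in the input: it first changes every @ to @A, so that the strings @B and @C cannot already occur, and then changes any literal { and } to @B and @C. It can then replace pattern1 and pattern2 with { and } and use those single characters in a negated bracket expression to find the target strings, before returning all the changed characters to their original values. Here it is with some prints added to make it more obvious what is happening at each step: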

$ awk '{
    print "1): " $0 ORS
    gsub(/@/,"@A"); gsub(/{/,"@B"); gsub(/}/,"@C"); gsub(/pattern1/,"{"); gsub(/pattern2/,"}")
    print "2): " $0 ORS
    out = ""
    while( match($0,/{[^{}]*}/) ) {
        out = (out=="" ? "" : out ORS) substr($0,RSTART,RLENGTH)
        $0 = substr($0,RSTART+RLENGTH)
    }
    $0 = out
    print "3): " $0 ORS
    gsub(/}/,"pattern2"); gsub(/{/,"pattern1"); gsub(/@C/,"}"); gsub(/@B/,"{"); gsub(/@A/,"@")
    print "4): " $0 ORS
} 1' file
1): pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.

2): { this is the first content } this is some other stuff { this is the second content } this is the end of the file.

3): { this is the first content }
{ this is the second content }

4): pattern1 this is the first content pattern2
pattern1 this is the second content pattern2

pattern1 this is the first content pattern2
pattern1 this is the second content pattern2
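
For comparison, where a tool with lookaround support is available, the same idea can be written directly. This is a minimal sketch of what that alternative might look like, assuming GNU grep built with PCRE support (`-P`); the function name `extract_pairs_pcre` is made up for illustration. The negative lookahead plays the role of the `[^{}]` bracket expression in the awk script: each character between the delimiters must not start another pattern1 or pattern2.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU grep with PCRE (-P). The function name is hypothetical.

extract_pairs_pcre() {
    # -o prints each match on its own line.
    # (?:(?!pattern1|pattern2).)* consumes one character at a time, but only
    # where neither delimiter begins, so a match cannot run past a pattern2.
    grep -oP 'pattern1(?:(?!pattern1|pattern2).)*pattern2'
}

line='pattern1 this is the first content pattern2 this is some other stuff pattern1 this is the second content pattern2 this is the end of the file.'

printf '%s\n' "$line" | extract_pairs_pcre
# prints:
# pattern1 this is the first content pattern2
# pattern1 this is the second content pattern2
```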