Question

I have logs in the following format:

<<
[ABC] some other data
some other data
>>

<<
DEF some other data
some other data
>>

<<
[ABC] some other data
some other data
>>

I want to select all log entries that contain ABC. Expected result:

<<
[ABC] some other data
some other data
>>

<<
[ABC] some other data
some other data
>>

What is the sed expression for this? To get the content between << and >>, the expression would be:

sed -e '/<</,/>>/!d'

But how can I force it to require [ABC] in between?

Answer 1

This might work for you:

sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p}};d' file
<<
[ABC] some other data
some other data
>>
<<
[ABC] some other data
some other data
>>

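sed comes equipped with a register called the hold space (HS).

You can use the HS to collect data of interest, in this case the lines between /^<</,/^>>/.

h replaces whatever is in the HS with what is in the pattern space (PS)

H appends a newline \n and then the PS to the HS

x swaps the HS and the PS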
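
To make the roles of h, H and x easier to follow, here is the same script laid out one command per line with comments. This is only a sketch: the function name select_abc_blocks is made up for illustration, and it assumes a sed in which \n inside a regex matches a newline embedded in the pattern space (GNU sed does this).

```shell
# select_abc_blocks is a hypothetical wrapper name, not part of sed.
# It reads the files given as arguments, or stdin when none are given.
select_abc_blocks() {
  sed '
    /^<</,/^>>/{
      # Opening line: start a fresh block in the hold space, print nothing yet.
      /^<</{
        h
        d
      }
      # Any other line of the block: append it to the hold space.
      H
      # Closing line: pull the whole block into the pattern space and
      # print it only if the line right after "<<" starts with [ABC].
      /^>>/{
        x
        /^<<\n\[ABC\]/p
      }
    }
    # Suppress the automatic print for every line.
    d
  ' "$@"
}
```

Calling select_abc_blocks file should behave like the one-liner above.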

N.B. This deletes every line except the <<...>> blocks that contain [ABC]. If you want to keep the other lines, use:

sed '/^<</,/^>>/{/^<</{h;d};H;/^>>/{x;/^<<\n\[ABC\]/p};d}' file
<<
[ABC] some other data
some other data
>>


<<
[ABC] some other data
some other data
>>

Answer 2

This works for me:

awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt

Tested as below:

pearl.242> cat temp.txt
<< 
[ABC] some other data 
some other data 
>>  
<< 
DEF some other data 
some other data 
>>  

nkeem

<< 
[ABC] some other data 
some other data 
>> 
pearl.243> awk '$0~/ABC/{print "<<";print;getline;print;getline;print }' temp.txt
<<
[ABC] some other data 
some other data 
>>  
<<
[ABC] some other data 
some other data 
>> 
pearl.244>

If you don't want to hard-code the statement print "<<";, you can go with the following:

pearl.249> awk '$0~/ABC/{print x;print;getline;print;getline;print}{x=$0}' temp.txt
<< 
[ABC] some other data 
some other data 
>>  
<< 
[ABC] some other data 
some other data 
>> 
pearl.250>

Answer 3

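To me, sed is line-based. You can probably talk it into handling multiple lines, but it would be easier to do the job with awk or perl than to try it in sed.

I'd use perl and build a small state machine like this pseudocode (I don't guarantee it catches every detail of what you're trying to achieve):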

state = 0
for each line
    if state == 0
        if line == '<<'
            buffer = line
            state = 1
    else if state == 1
        if line starts with [ABC]
            buffer += line
            state = 2
        else
            state = 0
    else if state == 2
        buffer += line
        if line == '>>'
            do something with buffer
            state = 0
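
One way the pseudocode above might look in awk rather than perl is sketched below. The function name abc_state_machine is made up for illustration. The sketch assumes the delimiter lines are exactly << and >> with no trailing whitespace, keeps the << and >> lines in the buffer, and goes back to state 0 when the first line of a block does not start with [ABC].

```shell
# abc_state_machine is a hypothetical wrapper name.
# state 0: outside a block, 1: just saw "<<", 2: inside an [ABC] block.
abc_state_machine() {
  awk '
    BEGIN { state = 0 }
    state == 0 {
      if ($0 == "<<") { buf = $0; state = 1 }
      next
    }
    state == 1 {
      # Only keep the block if its first line starts with the literal [ABC].
      if (index($0, "[ABC]") == 1) { buf = buf "\n" $0; state = 2 }
      else { state = 0 }
      next
    }
    state == 2 {
      buf = buf "\n" $0
      # Closing line: emit the buffered block and start over.
      if ($0 == ">>") { print buf; state = 0 }
    }
  ' "$@"
}
```

Because the block is buffered until >> is seen, this works for blocks of any length, not just two lines.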

See also http://www.catonmat.net/blog/awk-one-liners-explained-part-three/ for how you might do it with awk as a one-liner...

Answer 4

TXR: built for multi-line work.

@(collect)
<<
[ABC] @line1
@line2
>>
@  (output)
>>
[ABC] @line1
@line2
<<

@  (end)
@(end)

Run:

$ txr data.txr  data
>>
[ABC] some other data
some other data
<<

>>
[ABC] some other data
some other data
<<

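Very basic stuff; you're probably better off sticking with awk until you have a really complicated multi-line extraction job with lots of irregular data, lots of nesting, and so on.

If the log is very large, we should write @(collect :vars ()) so that the collect doesn't implicitly accumulate lists; the job will then run in constant memory.

Also, if the log entries are not always two lines, it gets a bit more complicated. We can use a nested collect to gather the variable number of lines.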

@(collect :vars ())
<<
[ABC] @line1
@  (collect)
@line
@  (until)
>>
@  (end)
@  (output)
>>
[ABC] @line1
@  {line "\n"}
<<

@  (end)
@(end)

sed: matching a multi-line pattern
