Print only one match per line

Time: 2014-11-07 10:52:32

Tags: regex awk grep sh

I have a log like this:

3>DirectMicrophone.obj : error LNK2019: unresolved external symbol _DirectSoundCaptureEnumerateW@8 referenced in function "private: void __thiscall DirectMicrophoneManager::getDevices(void)" (?getDevices@DirectMicrophoneManager@@AAEXXZ)
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _DSDEVID_DefaultVoiceCapture
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _IID_IDirectSoundCapture
3>DirectSoundPlayer.obj : error LNK2019: unresolved external symbol _DirectSoundCreate@12 referenced in function "private: bool __thiscall DirectSoer::CreateDirBuffers(void)" (?CreateDirBuffers@DirPlayer@@AAE_NXZ)
libmodule-text.lib(CTS_Support.obj) : error LNK2001: unresolved external symbol _delete "void __cdecl operator delete(void *)" (??3@YAXPAX@Z)
3>rtmfp_interface.obj : error LNK2001: unresolved external symbol __CIcos

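I want to extract only the highlighted symbols from the log.

There are two ways to do this:
  1. Print the first word after "external symbol" on each line
  2. Print the first word that starts with "_" on each line

I tried the second approach with this script:

    egrep -o "(\s(_\S+))" <log_file>

But it prints every word that starts with "_", not just the first matching word on the line. How can I make the script print only the first match on each line instead of all of them?

Expected output: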

    _DirectSoundCaptureEnumerateW@8
    _DSDEVID_DefaultVoiceCapture
    _IID_IDirectSoundCapture
    _DirectSoundCreate@12
    _delete
    __CIcos 
    

5 answers:

Answer 0 (score: 0)

You can use grep -oP:

grep -oP '^[^:]+:[^_]+\K(\S+)' logs
_DirectSoundCaptureEnumerateW@8
_DSDEVID_DefaultVoiceCapture
_IID_IDirectSoundCapture
_DirectSoundCreate@12
_delete
__CIcos

Or with awk:

awk -F '^[^:]+:[^_]+' '{sub(/ .*$/, "", $2); print $2}' logs
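If grep -P is not available, the second approach from the question (first word starting with "_") can also be sketched in plain awk. This is only a minimal sketch: the function name first_underscore_word is made up for illustration, and it assumes words are separated by whitespace.

```shell
#!/bin/sh
# first_underscore_word: hypothetical helper name, not from the question.
# Reads log lines on stdin and prints, for each line, only the first
# whitespace-separated field that begins with "_".
first_underscore_word() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^_/) {   # first field starting with an underscore
                print $i
                next           # stop scanning this line, go to the next one
            }
        }
    }'
}

# Demo on two lines taken from the log in the question
first_underscore_word <<'EOF'
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _IID_IDirectSoundCapture
3>rtmfp_interface.obj : error LNK2001: unresolved external symbol __CIcos
EOF
# prints:
# _IID_IDirectSoundCapture
# __CIcos
```

The next statement is what keeps later words such as __thiscall from being printed. For a real log, call it as first_underscore_word < logs.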

Answer 1 (score: 0)

If your grep supports -P, you can use the regex below.

grep -oP 'external symbol\h\K_\S+' file
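For a grep that has -o (as the egrep -o in the question does) but no -P, roughly the same result can be had by letting grep keep the marker and cutting it off afterwards. A rough sketch, assuming the symbol is separated from "external symbol" by exactly one space; symbol_after_marker is a made-up name.

```shell
#!/bin/sh
# symbol_after_marker: hypothetical helper name.
# grep -o keeps only the "external symbol _name" part of each line;
# cut then takes the third space-separated field, the symbol itself.
symbol_after_marker() {
    grep -o 'external symbol _[^ ]*' | cut -d ' ' -f 3
}

# Demo on one line taken from the log in the question
symbol_after_marker <<'EOF'
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _DSDEVID_DefaultVoiceCapture
EOF
# prints:
# _DSDEVID_DefaultVoiceCapture
```

Because "external symbol" occurs only once per line in this log, grep -o yields one match per line here.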

Answer 2 (score: 0)

Using (G)awk, printing the text after the first "_" up to the next "*":

awk 'match($0,/_([^*]+)/,a){print a[1]}' file

If it has to be the next word after "external symbol", this will work:

awk 'match($0,/external symbol[^[:alnum:]]+([[:alnum:]]+)/,a){print a[1]}' file

Also, depending on what you count as a word, you can do this to include @:

awk 'match($0,/external symbol[^[:alnum:]]+([[:alnum:]@]+)/,a){print a[1]}' file
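The three-argument match() above is a gawk extension. Where only a POSIX awk is available, a similar effect can be sketched with the two-argument match() and the RSTART and RLENGTH variables. This sketch keeps the leading underscore and treats a word as everything up to the next space, which is an assumption about what counts as a word; symbol_posix_awk is a made-up name.

```shell
#!/bin/sh
# symbol_posix_awk: hypothetical helper name.
# Uses the two-argument match(), which sets RSTART and RLENGTH,
# instead of gawk's three-argument form with a capture array.
symbol_posix_awk() {
    awk '{
        if (match($0, /external symbol +_[^ ]+/)) {
            s = substr($0, RSTART, RLENGTH)    # "external symbol _name"
            sub(/^external symbol +/, "", s)   # drop the marker, keep the symbol
            print s
        }
    }'
}

# Demo on one line taken from the log in the question
symbol_posix_awk <<'EOF'
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _IID_IDirectSoundCapture
EOF
# prints:
# _IID_IDirectSoundCapture
```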

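Answer 3 (score: 0)

Another GNU awk solution (GNU awk is needed because RS is more than one character); the pattern matches the literal ** markers around each symbol: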

awk -v RS='external symbol \\*\\*_' -F'\\*\\*' 'NR>1{print $1}' file
DirectSoundCaptureEnumerateW@8
DSDEVID_DefaultVoiceCapture
IID_IDirectSoundCapture
DirectSoundCreate@12

Answer 4 (score: 0)

OK, try this:

sed 's/[^_]*\( _[^\b\t\s ]*\)[^_]*/\1XXX/;s/\(.*\)XXX.*/\1/;s/.*\(_\)/\1/' logs_data

It says:
[^_]* match any number of characters that are not _
\( start a hold pattern
 _ a space, then a word that starts with an underscore and ends at a boundary \b, a tab \t, or a space \s
\) end the hold pattern
[^_]* match any number of characters that are not _
then replace what was matched with the held text followed by a boundary marker XXX,
and delete everything else, keeping only the first match.
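For comparison, the first approach from the question (the word after "external symbol") can be sketched as a single sed substitution without a marker. This is a hedged alternative rather than a rewrite of the command above; it assumes one space before the symbol, and symbol_sed is a made-up name.

```shell
#!/bin/sh
# symbol_sed: hypothetical helper name.
# -n suppresses normal output; the p flag prints only lines where the
# substitution succeeded, so lines without the marker are dropped.
symbol_sed() {
    sed -n 's/.*external symbol \(_[^ ]*\).*/\1/p'
}

# Demo on two lines taken from the log in the question
symbol_sed <<'EOF'
3>DirectMicrophone.obj : error LNK2001: unresolved external symbol _DSDEVID_DefaultVoiceCapture
3>rtmfp_interface.obj : error LNK2001: unresolved external symbol __CIcos
EOF
# prints:
# _DSDEVID_DefaultVoiceCapture
# __CIcos
```

The whole line is matched and replaced by the captured group, so at most one symbol is printed per line.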