使用正则表达式从日志文件中提取字符串

时间:2017-10-18 21:29:37

标签: regex bash logging awk

我有Home Assistant日志文件,我需要通过bash中的脚本从中提取特定信息。

日志看起来像这样:

2017-09-26 20:54:09 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.player1=playing; media_duration=6054, is_volume_muted=False, media_season=-1, media_episode=-1, media_series_title=, media_content_type=movie, media_album_name=, friendly_name=player1, media_title=My movie, supported_features=55231, entity_picture=/api/media_player_proxy/media_player.player1?token=5ce13d0e4d8ed86b3073a795a4bd82627b387cf2b6c5dfdc8c64ab9b62ec7d61&cache=f4a48, media_content_id=unknown=tt0266543, volume_level=0.91 @ 2017-09-26T20:44:37.080005+02:00>, old_state=<state media_player.player1=playing; media_duration=6054, is_volume_muted=False, media_season=-1, media_episode=-1, media_series_title=, media_content_type=movie, media_album_name=, friendly_name=player1, media_title=My movie, supported_features=55231, entity_picture=/api/media_player_proxy/media_player.player1?token=5ce13d0e4d8ed86b3073a795a4bd82627b387cf2b6c5dfdc8c64ab9b62ec7d61&cache=23108, media_content_id=unknown=tt0266543, volume_level=0.91 @ 2017-09-26T20:44:37.080005+02:00>, entity_id=media_player.player1>
2017-09-26 21:30:22 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.tv_player2=playing; is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:22.660153+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, old_state=<state media_player.tv_player2=playing; supported_features=21437, is_volume_muted=False, media_position_updated_at=2017-09-26T21:30:22.557344+02:00, app_id=233637DE, friendly_name=player2, media_content_id=AcADAFAD, app_name=YouTube, media_position=0, volume_level=1.0 @ 2017-09-26T21:30:21.848088+02:00>, entity_id=media_player.tv_player2>
2017-09-26 21:30:22 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.tv_player2=playing; is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:22.695325+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, old_state=<state media_player.tv_player2=playing; is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:22.660153+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, entity_id=media_player.tv_player2>
2017-09-26 21:30:29 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.tv_player2=playing; media_duration=1789.213605442177, is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:29.115053+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, old_state=<state media_player.tv_player2=playing; is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:22.695325+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, entity_id=media_player.tv_player2>
2017-09-26 21:30:29 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.tv_player2=playing; media_duration=1789.213605442177, is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0.733, supported_features=21437, media_position_updated_at=2017-09-26T21:30:29.843295+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, old_state=<state media_player.tv_player2=playing; media_duration=1789.213605442177, is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0, supported_features=21437, media_position_updated_at=2017-09-26T21:30:29.115053+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, entity_id=media_player.tv_player2>
2017-09-26 21:31:34 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.tv_player2=playing; media_duration=1789.213605442177, is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=65.681, supported_features=21437, media_position_updated_at=2017-09-26T21:31:34.823855+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, old_state=<state media_player.tv_player2=playing; media_duration=1789.213605442177, is_volume_muted=False, app_id=233637DE, app_name=YouTube, volume_level=1.0, friendly_name=player2, media_position=0.733, supported_features=21437, media_position_updated_at=2017-09-26T21:30:29.843295+02:00, entity_picture=/api/media_player_proxy/media_player.tv_player2?token=f03173d412f3c7396762605a4f347b83f651d4c0f23935c8b1fccd9305698ffe&cache=88a6b, media_content_id=AcADAFAD, media_title=YT test movie @ 2017-09-26T21:30:21.848088+02:00>, entity_id=media_player.tv_player2>
2017-09-26 22:45:58 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.player1=playing; media_duration=0, is_volume_muted=False, media_album_name=, media_content_type=music, friendly_name=player1, volume_level=0.75, supported_features=55231, media_title=My radio R1 @ 2017-09-26T22:44:57.441924+02:00>, old_state=<state media_player.player1=playing; media_duration=0, is_volume_muted=False, media_album_name=, media_content_type=music, friendly_name=player1, volume_level=0.75, supported_features=55231, media_title= @ 2017-09-26T22:44:57.441924+02:00>, entity_id=media_player.player1>
2017-09-26 22:46:09 INFO (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: new_state=<state media_player.player1=playing; media_duration=0, is_volume_muted=False, media_album_name=, media_content_type=music, friendly_name=player1, volume_level=0.75, supported_features=55231, media_title= @ 2017-09-26T22:44:57.441924+02:00>, old_state=<state media_player.player1=playing; media_duration=0, is_volume_muted=False, media_album_name=, media_content_type=music, friendly_name=player1, volume_level=0.75, supported_features=55231, media_title=My radio R1 @ 2017-09-26T22:44:57.441924+02:00>, entity_id=media_player.player1>

我需要通过脚本信息提取:时间(第一次出现), friendly_name media_title media_title中没有重复项

输出文件中的类似内容:

2017-09-26 20:54:09, player1, My movie
2017-09-26 21:30:22, player2, YT test movie
2017-09-26 22:45:58, player1, My radio R1

我尝试了grep,sed和没有propper结果。 谢谢

2 个答案:

答案 0 :(得分:1)

GNU awk 解决方案:

awk '{ 
         d=$1" "$2; 
         match($0,/friendly_name=([^,]+).*media_title=([^[:space:]][^@,]+)[[:space:]]*[@,]/,a);
         if (a[1]!="" && a[2]!="" && !(a[1]"@"a[2] in res)) { 
             res[a[1]"@"a[2]]; 
             print d,a[1],a[2] 
         } 
     }' OFS=', ' logfile

输出:

2017-09-26 20:54:09, player1, My movie
2017-09-26 21:30:22, player2, YT test movie 
2017-09-26 22:45:58, player1, My radio R1

答案 1 :(得分:1)

GNU awk 解决方案。由于我们不知道function update { cd the/repo/path git pull -f origin a_branch:tmp_branch } friendly_name出现的顺序,我们需要拆分匹配项:

media_title

结果如下:

$ awk 'match($0,/friendly_name=([^,]+)/,a) && 
       match($0,/media_title=([^,@]+?)(\s@)?/,b) && 
       a[1] && b[1] && !(b[1] in arr){
           arr[b[1]]; 
           printf "%s, %s, %s\n", $1" "$2, a[1], b[1]}' file