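Parsing dbus-monitor output messages

Time: 2019-04-23 12:03:39

Tags: python regex parsing dbus

I am trying to parse the output messages of dbus-monitor. Most messages are multi-line entries (the header line plus its parameters). I need to parse each log message and join it into a single-line entry.

The dbus-monitor output looks like this: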

method call time=462.117843 sender=:1.62 -> destination=org.freedesktop.filehandler serial=122 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=start
int16 29877
uint16 0
method return time=462.117844 sender=org.freedesktop.filehandler -> destination=:1.62 serial=2210 reply_serial=122
int16 29877
uint16 0
method call time=462.117845 sender=:1.62 -> destination=org.freedesktop.filehandler serial=123 path=/org/freedesktop/filehandler/routing; interface=org.freedesktop.filehandler.routing; member=comment
string "starting .."
string "routing"
method return time=462.117846 sender=:1.19 -> destination=:1.62 serial=2212 reply_serial=123
int12 -23145
signal time=463.11223 sender=:1.64 -> destination=(null destination) serial=124 path=/org/freedesktop/fileserver; interface=org.freedesktop.DBus.Properties; member=PropertiesChanged
  string "com.freedesktop.Systemserver"
  array[
    dict entry(
      string "SystemTime"
      variant       struct{
            byte 12
            byte 9
            byte 0
        }
    )
  ]
  array [
  ]

This is the regex I tried for grouping the dbus message headers (the parameters are not grouped):

\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?

I want the output in the following format:

C [sender,serial] path interface+member (parameter1, parameter2, ...)
R [destination,reply_serial] interface+member (parameter1, parameter2, ...)
S [sender, serial] path interface+member (parameter1, parameter2, ...)

The expected output for the dbus-monitor messages above would be:

C [:1.62,122] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.start (29877,0)
R [:1.62,122] org.freedesktop.filehandler.routing.start (29877,0)
C [:1.62,123] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.comment ("starting", "routing")
R [:1.62,123] org.freedesktop.filehandler.routing.comment (-23145)
S [:1.64, 124] /org/freedesktop/fileserver org.freedesktop.DBus.Properties.PropertiesChanged ("com.freedesktop.Systemserver"[("SystemTime",{12,9,0})][])

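How can I achieve the expected result when the entries are usually multi-line? Also, signals have several levels of nesting, which makes their parameters hard to access. Can someone help me parse these dbus messages into the expected format?

3 answers:

Answer 0 (score: 0)

If you absolutely must use dbus-monitor, you are better off using its PCAP output mode by passing the --pcap option. The output is in a well-documented structured format that libpcap can read.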
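To make the PCAP suggestion more concrete, here is a minimal sketch of reading such a capture with only the Python standard library. It assumes the capture is a classic pcap file (24-byte global header, 16-byte record headers, microsecond timestamps) whose link type is 231, the value registered as LINKTYPE_DBUS. The names read_pcap and fake are made up for this illustration, and the payload in the demo is synthetic, not a real D-Bus message.

```python
import io
import struct

LINKTYPE_DBUS = 231  # link-layer type registered for raw D-Bus messages


def read_pcap(stream):
    """Yield (timestamp, payload) for each record of a classic pcap stream."""
    header = stream.read(24)
    if len(header) < 24:
        raise ValueError("truncated pcap header")
    # The magic number tells us the byte order used by the writer.
    magic = header[:4]
    if magic == b'\xd4\xc3\xb2\xa1':
        order = '<'
    elif magic == b'\xa1\xb2\xc3\xd4':
        order = '>'
    else:
        raise ValueError("not a classic microsecond pcap stream")
    # The link type is the last 32-bit field of the global header.
    linktype = struct.unpack(order + 'I', header[20:24])[0]
    if linktype != LINKTYPE_DBUS:
        raise ValueError("unexpected link type %d" % linktype)
    while True:
        rec = stream.read(16)
        if len(rec) < 16:
            break  # end of capture
        sec, usec, incl_len, _orig_len = struct.unpack(order + 'IIII', rec)
        payload = stream.read(incl_len)  # one raw message
        yield sec + usec / 1e6, payload


# Synthetic capture with one fake 4-byte payload, only to exercise the reader.
fake = (struct.pack('<IHHiIII', 0xa1b2c3d4, 2, 4, 0, 0, 65535, LINKTYPE_DBUS)
        + struct.pack('<IIII', 462, 117843, 4, 4)
        + b'l\x01\x00\x01')
for ts, payload in read_pcap(io.BytesIO(fake)):
    print(ts, len(payload))
```

With a real capture you would open the file written by dbus-monitor --pcap in binary mode and pass it to read_pcap. Each payload should then be one message in the D-Bus wire format, which still has to be decoded separately; that part is not shown here.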

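Answer 1 (score: 0)

Since you already have a working regex, you can pass it to re.split to get the message parts you need. Note that for each message entry this yields one string per capturing group, followed by one string that holds all of that message's parameters. This example assumes that all messages are in the string messages: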

import re

regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'
m = re.split(regex, messages)   # messages: str with the whole dbus-monitor output
m = m[1:]                       # discard the (usually empty) text before the first match
remember = dict()
while m:    # each message is 9 capturing groups + 1 parameter string
    if m[0] == 'method call':
        print("C [{2},{4}] {5} {6}.{7}".format(*m), end=' ')
        remember[m[4]] = m[6:8] # store interface+member for return
    if m[0] == 'method return':
        m[6:8] = remember[m[8]] # recall stored interface+member
        print("R [{3},{8}] {6}.{7}".format(*m), end=' ')
    if m[0] == 'signal':
        print("S [{2}, {4}] {5} {6}.{7}".format(*m), end=' ')
    # now handle parameters
    sep = "("
    for p in m[9].split('\n')[1:-1]:    # except empty string at start and end
        if not p.strip():
            continue                    # skip blank lines
        if p[-1] in "[](){}":           # with "encapsulations":
            p = p[-1]                   #   delete spaces, "array", "dict ..."
        p = re.sub(r'^\s*\w*\s*', '', p) # delete spaces and data type
        if p[-1] in "])}":
            sep = ''                    # no separator before closing
        print(sep+p, end='')
        if p[-1] in "[](){}":   sep = ''
        else:                   sep = ', '  # separator after data item
    print(")")
    m = m[10:]                  # delete the processed group of 10 items

The output with the sample data is:

C [:1.62,122] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.start (29877, 0)
R [:1.62,122] org.freedesktop.filehandler.routing.start (29877, 0)
C [:1.62,123] /org/freedesktop/filehandler/routing org.freedesktop.filehandler.routing.comment ("starting ..", "routing")
R [:1.62,123] org.freedesktop.filehandler.routing.comment (-23145)
S [:1.64, 124] /org/freedesktop/fileserver org.freedesktop.DBus.Properties.PropertiesChanged ("com.freedesktop.Systemserver", [("SystemTime", {12, 9, 0})][])
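The grouping relies on how re.split behaves when the pattern contains capturing groups: the captured strings are inserted into the result list between the split-off text pieces. Here is a small sketch of that behaviour, using a deliberately simplified, made-up header pattern with two capturing groups instead of nine (header, sample and chunk are illustration names only):

```python
import re

# Simplified, hypothetical header pattern with 2 capturing groups
# (the full pattern in the answer has 9).
header = re.compile(r'(signal|method call|method return) time=[\d.]+ serial=(\d+)')

sample = (
    'method call time=1.0 serial=7\n'
    '   int16 1\n'
    '   uint16 2\n'
    'signal time=2.0 serial=8\n'
    '   string "x"\n'
)

parts = header.split(sample)
# parts[0] is the text before the first match (empty here). After that the
# list repeats: group 1, group 2, text up to the next match.
chunk = 3  # 2 capturing groups + 1 trailing text piece
records = [parts[i:i + chunk] for i in range(1, len(parts), chunk)]
for kind, serial, params in records:
    # The first and last pieces of the split are empty strings.
    print(kind, serial, params.split('\n')[1:-1])
```

With the full regex the same idea applies with a chunk size of 10, which is why the loop above consumes the list ten items at a time.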

Answer 2 (score: 0)

  

"Can you suggest how to rewrite the code so that it processes the input line by line?"

Here I rearranged it accordingly:

import re

regex = r'\b(signal|method call|method return)\b time=([\d,.]*) sender=([\w,.,:,(,), ]*) -> destination=([\w,.,:,(,), ]*) serial=([(,),\w]*) (?:path=([\w,\/]*); interface=([\w,.]*); member=([\w,_,-]*))?(?:reply_serial=([\d]*))?'
remember = dict()
sep = None                      # None until the first message header is seen
with open('dbusl.in') as f:
    for line in f:
        m = re.match(regex, line)
        if m:
            if sep is not None: print(")")  # end the previous parameter group
            m = list(m.groups())        # each match is 9 capturing groups
            if m[0] == 'method call':
                print("C [{2},{4}] {5} {6}.{7}".format(*m), end=' ')
                remember[m[4]] = m[6:8]     # store interface+member for return
            if m[0] == 'method return':
                m[6:8] = remember.pop(m[8]) # recall stored interface+member
                print("R [{3},{8}] {6}.{7}".format(*m), end=' ')
            if m[0] == 'signal':
                print("S [{2}, {4}] {5} {6}.{7}".format(*m), end=' ')
            sep = "("
        else:
            p = line.rstrip()               # now handle parameters
            if not p:
                continue                    # skip blank lines
            if p[-1] in "[](){}":           # with "encapsulations":
                p = p[-1]                   #   delete spaces, "array", "dict ..."
            p = re.sub(r'^\s*\w*\s*', '', p) # delete spaces and data type
            if p[-1] in "])}":
                sep = ''                    # no separator before closing
            print(sep+p, end='')
            if p[-1] in "[](){}":   sep = ''
            else:                   sep = ', '  # separator after data item
if sep is not None: print(")")  # end the last parameter group

Note that I also changed m[6:8] = remember[m[8]] to m[6:8] = remember.pop(m[8]), so that the memory for interface + member data that is no longer needed gets freed.
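As a small illustration of that difference: dict.pop returns the stored value and removes the entry in one step, so the dictionary only holds calls that are still waiting for a reply. The fallback value in the second half is my own assumption for the case where a reply shows up whose call was never captured (for example because monitoring started in between); the code above does not handle that case and would raise KeyError.

```python
remember = {}

# A method call stores interface and member under its serial number.
remember['122'] = ['org.freedesktop.filehandler.routing', 'start']

# The matching return reads the entry and removes it at the same time.
iface, member = remember.pop('122')
print(iface + '.' + member)
print(len(remember))  # no pending calls are left

# Hypothetical fallback for a reply whose call was never seen:
unknown = ['?', '?']
iface, member = remember.pop('999', unknown)
print(iface + '.' + member)
```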