Creating name/value pairs in Python

Time: 2013-10-01 12:44:46

Tags: python string parsing

I have a Python script whose output is stored in a variable named jvmData:

Stats name=jvmRuntimeModule, type=jvmRuntimeModule#
{
name=HeapSize, ID=1, description=The total memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=BoundedRangeStatistic, lowWaterMark=1048576, highWaterMark=1048576, current=1048576, integral=0.0, lowerBound=1048576, upperBound=2097152

name=FreeMemory, ID=2, description=The free memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=348466

name=UsedMemory, ID=3, description=The amount of used memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=700109

name=UpTime, ID=4, description=The amount of time (in seconds) that the Java virtual machine has been running., unit=SECOND, type=CountStatistic, count=3706565

name=ProcessCpuUsage, ID=5, description=The CPU Usage (in percent) of the Java virtual machine., unit=N/A, type=CountStatistic, count=0
}

All I want to do is print the name/value pairs for the important parts, which in this case are just:

HeapSize=1048576
FreeMemory=348466
UsedMemory=700109
UpTime=3706565
ProcessCpuUsage=0

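I'm not at all good with Python :) The only solution I can think of seems long-winded: split the output into lines, throw away the first, second, and last lines, then loop over each remaining line, handling the different cases (sometimes the value is in current, sometimes in count), finding string lengths, and so on.

Perhaps (surely) I'm missing some nice function I could use to put these into the equivalent of a Java HashMap?

2 answers:

Answer 0: (score: 2)

The "equivalent of a Java HashMap" is called a dictionary in Python. As for parsing this, just iterate over the lines that contain data, build a dict of all the key/value pairs in each line, and special-case HeapSize: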

jvmData = "..." # the string holding the data
jvmLines = jvmData.split("\n") # a list of the lines in the string
# keep only the data lines; startswith skips the "Stats name=..." header,
# which also contains "name=" and would otherwise raise a KeyError below
lines = [l.strip() for l in jvmLines if l.strip().startswith("name=")]
result = {}
for line in lines:
    # split on the first "=" only, in case a value contains one
    data = dict(s.split("=", 1) for s in line.split(", "))
    # the value is in "current" for HeapSize or in "count" otherwise
    value = data["current"] if data["name"] == "HeapSize" else data["count"]
    result[data["name"]] = value
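
For reference, here is a minimal self-contained sketch of the same approach that can be run as-is against the sample from the question. The helper name parse_jvm_stats is made up for illustration, and it picks current whenever that field is present instead of special-casing HeapSize, which is an assumption about the format. The pairs are also kept in a list so they print in input order on Python versions where dicts are unordered.

```python
# The sample output from the question, copied verbatim.
SAMPLE = """Stats name=jvmRuntimeModule, type=jvmRuntimeModule#
{
name=HeapSize, ID=1, description=The total memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=BoundedRangeStatistic, lowWaterMark=1048576, highWaterMark=1048576, current=1048576, integral=0.0, lowerBound=1048576, upperBound=2097152

name=FreeMemory, ID=2, description=The free memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=348466

name=UsedMemory, ID=3, description=The amount of used memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=CountStatistic, count=700109

name=UpTime, ID=4, description=The amount of time (in seconds) that the Java virtual machine has been running., unit=SECOND, type=CountStatistic, count=3706565

name=ProcessCpuUsage, ID=5, description=The CPU Usage (in percent) of the Java virtual machine., unit=N/A, type=CountStatistic, count=0
}
"""


def parse_jvm_stats(text):
    """Return a list of (name, value) tuples, one per data line, in input order.

    parse_jvm_stats is a hypothetical helper, not part of any library.
    """
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("name="):
            continue  # skips the "Stats name=..." header, the braces and blank lines
        fields = dict(part.split("=", 1) for part in line.split(", "))
        # assumption: lines with a "current" field report it; the others report "count"
        if "current" in fields:
            value = fields["current"]
        else:
            value = fields["count"]
        pairs.append((fields["name"], value))
    return pairs


pairs = parse_jvm_stats(SAMPLE)
result = dict(pairs)  # the "HashMap" view of the same data
for name, value in pairs:
    print("%s=%s" % (name, value))
```

Looking up a single statistic afterwards is then just result["UpTime"]. The values stay strings; wrap them in int() if numbers are needed.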

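Since you seem to be stuck on Jython 2.1, here is a version that should work with it (obviously untested). It is basically the same as above, but the list comprehension is replaced by filter, the dict is built in an explicit loop (the dict() built-in only arrived in Python 2.2, and substring tests with in only in 2.3), and the ternary if/else operator is not used:

jvmData = "..." # the string holding the data
jvmLines = jvmData.split("\n") # a list of the lines in the string
# keep only the data lines; startswith also skips the "Stats name=..." header
lines = filter(lambda x: x.strip().startswith("name="), jvmLines)
result = {}
for line in lines:
    # build the dict of key/value pairs for this line by hand
    data = {}
    for pair in line.strip().split(", "):
        key, value = pair.split("=", 1)
        data[key] = value
    # the value is in "current" for HeapSize or in "count" otherwise
    if data["name"] == "HeapSize":
        result[data["name"]] = data["current"]
    else:
        result[data["name"]] = data["count"]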

Answer 1: (score: 0)

Try using the find function and a little re:

import re

final_map = {}
NAME = 'name='
COUNT = 'count='
CURRENT = 'current='  # the question asks for "current" when there is no "count"

def main():
    with open(r'<file_location>', 'r') as f:
        # keep only the data lines, which start with "name="
        lines = [line for line in f if re.search(r'^name=', line)]
    for line in lines:
        # the value follows "count=" when present, otherwise "current="
        sub = COUNT if line.find(COUNT) != -1 else CURRENT
        name = line[line.find(NAME) + len(NAME):line.find(',')]
        final_map[name] = line[line.find(sub) + len(sub):].split(',')[0].strip()
        # a single-argument print(...) works in both Python 2 and 3
        print(name + '=' + final_map[name])

if __name__ == '__main__':
    main()

Output:

HeapSize=1048576
FreeMemory=348466
UsedMemory=700109
UpTime=3706565
ProcessCpuUsage=0
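
Since the question keeps the output in the jvmData variable and not in a file, the same extraction can also be sketched directly on a string with a single regular expression. This is only an illustration, assuming each data line starts with name= and carries either a count= or a current= field; the function name extract_pairs is hypothetical, and the sample below is the header plus two data lines from the question.

```python
import re

# One match per data line: the name at the start of the line, then the first
# "count=" or "current=" value on that same line. \b keeps the alternation from
# matching inside a longer key.
LINE_RE = re.compile(
    r"^\s*name=([^,]+),.*?\b(?:count|current)=([^,\s]+)",
    re.MULTILINE,
)


def extract_pairs(text):
    """Return (name, value) tuples for every data line in text (hypothetical helper)."""
    return LINE_RE.findall(text)


sample = """Stats name=jvmRuntimeModule, type=jvmRuntimeModule#
{
name=HeapSize, ID=1, description=The total memory (in KBytes) in the Java virtual machine run time., unit=KILOBYTE, type=BoundedRangeStatistic, lowWaterMark=1048576, highWaterMark=1048576, current=1048576, integral=0.0, lowerBound=1048576, upperBound=2097152

name=UpTime, ID=4, description=The amount of time (in seconds) that the Java virtual machine has been running., unit=SECOND, type=CountStatistic, count=3706565
}
"""

for name, value in extract_pairs(sample):
    print(name + "=" + value)
```

The header line is skipped because it starts with "Stats" and so fails the name= anchor. The re module in very old interpreters such as Jython 2.1 may behave differently, so treat this as a sketch for a current Python.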