Question

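I have a fairly hard problem that I can't solve. The idea is to walk through a section of data and detect any indentation (always spaces). Whenever a line is indented more than the previous one, for example by 4 more spaces, the first line should become a dictionary key and the values that follow should be appended to it.

If there is another level of indentation, a new dictionary with its own key and values should be created. This should happen recursively until the whole data has been processed. To make things easier to understand, I made an example: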

Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0

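The idea is to be able to return any part of the data in dictionary form. dictionary.get("Chassis 1:") should return all the data, and dictionary.get("Servers") should return everything that is indented deeper than the line "Servers". dictionary.get("PSU 1:") should give {"PSU 1:":"Presence: Equipped", "VID: V00", "HW Revision: 0"} and so on. I have drawn a small diagram to demonstrate this; each colour is a separate dictionary.

When the indentation becomes shallower again, for example going from 8 to 4 spaces, the data should be appended to the dictionary that holds the less indented data.

I have tried this in code, but it doesn't come anywhere near what I want.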

for item in Array:
    regexpatt = re.search(":$", item)
    if regexpatt:
        keyFound = True
        break

if not keyFound:
    return Array

#Verify if we still have lines with spaces
spaceFound = False
for item in Array:
    if item != item.lstrip():
        spaceFound = True
        break

if not spaceFound:
    return Array

keyFound = False
key=""
counter = -1
for item in Array:
    counter += 1
    valueTrim = item.lstrip()
    valueL = len(item)
    valueTrimL = len(valueTrim)
    diff = (valueL - valueTrimL)
    nextSame = False
    if item in Array:
        nextValue = Array[counter]
        nextDiff = (len(nextValue) - len(nextValue.lstrip()))
        if diff == nextDiff:
            nextSame = True


    if diff == 0 and valueTrim != "" and nextSame is True:
        match = re.search(":$", item)
        if match:
            key = item
            newArray[key] = []
            deptDetermine = True
            keyFound = True
    elif diff == 0 and valueTrim != "" and keyFound is False:
        newArray["0"].append(item)
    elif valueTrim != "":
        if depthDetermine:
            depth = diff
            deptDetermine = False
        #newValue = item[-valueL +depth]
        item = item.lstrip().rstrip()
        newArray[key].append(item)

for item in newArray:
    if item != "0":
        newArray[key] = newArray[key]

return newArray

The result should look something like this, for example:

{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision: 0", 
            "Presence: Equipped", 
            "VID: V00"
        }, 
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters: 1", 
                "Acknowledged Cores: 16", 
                "Equiped PID: e63-samp-33", 
                "Equipped Product Name: EEE UCS B200 M3", 
                "Equipped VID: V01"
            }
        }
    }
}

I hope this explains the concept well enough.

Answer 1

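This should give you the nested structure you want.

If you also want every nested dictionary to be reachable directly from the root, uncomment the "if ... is not root" parts.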

def parse(data):
    root = {}
    currentDict = root
    prevLevel = -1
    # stack of (dict that contains a section, indent of that section's header)
    parents = []
    for line in data:
        if line.strip() == '':
            continue
        level = len(line) - len(line.lstrip(" "))
        # partition() does not raise on a line without a colon;
        # such a line is treated as a section header with an empty value
        key, _, value = [val.strip() for val in line.partition(':')]

        if level > prevLevel and not value:
            # deeper section header: open a new nested dict
            currentDict[key] = {}
            # if currentDict is not root:
            #     root[key] = currentDict[key]
            parents.append((currentDict, level))
            currentDict = currentDict[key]
            prevLevel = level
        elif level <= prevLevel and not value:
            # header at the same or a shallower level: climb back to its parent
            parentDict, parentLevel = parents.pop()
            while parentLevel != level:
                if not parents:
                    return root
                parentDict, parentLevel = parents.pop()
            parentDict[key] = {}
            parents.append((parentDict, level))
            # if parentDict is not root:
            #     root[key] = parentDict[key]
            currentDict = parentDict[key]
            prevLevel = level
        else:
            # plain "key: value" line
            currentDict[key] = value
    return root




import json

with open('data.txt', 'r') as f:
    data = parse(f)

# pretty-print the nested dict
print(json.dumps(data, sort_keys=True, indent=4))

Output:

{
    "Chassis 1": {
        "PSU 1": {
            "HW Revision": "0", 
            "Presence": "Equipped", 
            "VID": "V00"
        }, 
        "Servers": {
            "Server 1/1": {
                "Acknowledged Adapters": "1", 
                "Acknowledged Cores": "16", 
                "Equiped PID": "e63-samp-33", 
                "Equipped Product Name": "EEE UCS B200 M3", 
                "Equipped VID": "V01"
            }
        }
    }
}
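
The question also asks for any section to be retrievable by its name, and real data may contain a key/value line that follows a nested block at a shallower indent. The following is only a minimal, self-contained sketch of one way to cover both, not a replacement for the code above: it tracks open sections on an explicit stack and adds a recursive lookup. The names parse_indented and find_section are hypothetical, and the "Model: demo" line in the sample is made up purely to show a dedented key/value line.

```python
SAMPLE = """\
Chassis 1:
    Servers:
        Server 1/1:
            Equipped VID: V01
            Acknowledged Cores: 16
    PSU 1:
        Presence: Equipped
        VID: V00
    Model: demo
"""


def parse_indented(lines):
    """Build a nested dict from space-indented 'key: value' lines."""
    root = {}
    # Each entry is (indent of a section header, dict holding its children).
    stack = [(-1, root)]
    for line in lines:
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        key, _, value = (part.strip() for part in line.partition(":"))
        # Close every section whose header is indented at least as far as this line.
        while stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1]
        if value:
            parent[key] = value
        else:
            parent[key] = {}
            stack.append((indent, parent[key]))
    return root


def find_section(tree, name):
    """Return the first value stored under `name` at any depth, else None."""
    for key, value in tree.items():
        if key == name:
            return value
        if isinstance(value, dict):
            found = find_section(value, name)
            if found is not None:
                return found
    return None


tree = parse_indented(SAMPLE.splitlines())
print(find_section(tree, "PSU 1"))
print(find_section(tree, "Model"))
```

Because the lookup is depth-first, a name that occurs in several sections returns only the first match; if names can repeat, looking up by full path would be safer.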

Answer 2

That data format really does look like YAML. Just in case someone stumbles upon this question and is fine with a library solution:

import yaml
import pprint

s = """
Chassis 1:
    Servers:
        Server 1/1:
            Equipped Product Name: EEE UCS B200 M3
            Equiped PID: e63-samp-33
            Equipped VID: V01
            Acknowledged Cores: 16
            Acknowledged Adapters: 1
    PSU 1:
        Presence: Equipped
        VID: V00
        HW Revision: 0
"""

# safe_load avoids the Loader argument that newer PyYAML versions require for yaml.load
d = yaml.safe_load(s)
pprint.pprint(d)

Output:

{'Chassis 1': {'PSU 1': {'HW Revision': 0,
                         'Presence': 'Equipped',
                         'VID': 'V00'},
               'Servers': {'Server 1/1': {'Acknowledged Adapters': 1,
                                          'Acknowledged Cores': 16,
                                          'Equiped PID': 'e63-samp-33',
                                          'Equipped Product Name': 'EEE UCS B200 M3',
                                          'Equipped VID': 'V01'}}}}
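
Note that in this output the numeric-looking values come back as integers ('HW Revision': 0), whereas the hand-written parser in Answer 1 keeps every value as a string. If string values are needed, one option is to post-process the result. A minimal sketch, assuming only plain nested dicts; stringify_values is a hypothetical helper name, and the sample dict is copied by hand from part of the output above rather than loaded from YAML:

```python
def stringify_values(node):
    """Recursively convert every non-dict value to str, leaving keys unchanged."""
    if isinstance(node, dict):
        return {key: stringify_values(value) for key, value in node.items()}
    return str(node)


# Hand-copied fragment of the structure printed above.
d = {'Chassis 1': {'PSU 1': {'HW Revision': 0,
                             'Presence': 'Equipped',
                             'VID': 'V00'}}}

print(stringify_values(d))
```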
