How do I parse XML and write the hierarchy to a Django database?

Time: 2019-04-26 12:47:00

Tags: python django xml parsing

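I am trying to parse an XML file and store its hierarchy in a database, but I seem to be doing something wrong, particularly when creating the objects in the database: storing this data takes a long time and a large number of queries. I don't know how to do this better and hope you can help.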

Sample XML file:

<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Сімферополь</CITY_NAME><CITY_REGION_NAME></CITY_REGION_NAME></RECORD>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Сімферополь</CITY_NAME><CITY_REGION_NAME></CITY_REGION_NAME></RECORD>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Севастополь</CITY_NAME><CITY_REGION_NAME>с.Балка</CITY_REGION_NAME></RECORD>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Севастополь</CITY_NAME><CITY_REGION_NAME>с.Набережне</CITY_REGION_NAME></RECORD>

import xml.etree.ElementTree as ET


def data_xml_parser(xml_file):
    tree = ET.parse(xml_file)
    # get document root
    root = tree.getroot()
    structure = []
    seen = set()
    # find all records of data
    for record in root.iter('RECORD'):
        # .text is None for an empty element such as <REGION_NAME></REGION_NAME>
        obl_name = record.find('OBL_NAME').text
        region_name = record.find('REGION_NAME').text
        city_name = record.find('CITY_NAME').text
        city_region_name = record.find('CITY_REGION_NAME').text

        # the key is 'REGION_NAME' (not 'R_NAME') so that it matches the
        # entry.get('REGION_NAME') lookup used when writing to the database
        d = {'OBL_NAME': obl_name, 'REGION_NAME': region_name, 'CITY_NAME': city_name,
             'CITY_REGION_NAME': city_region_name}
        # prevent duplicate data
        t = tuple(d.items())
        if t not in seen:
            seen.add(t)
            structure.append(d)
    return structure
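
To make the parser's behaviour easier to check, here is a minimal, self-contained sketch of the same deduplication run against an in-memory string. The `<DATA>` root element, the `FIELDS` tuple and the `unique_records` helper are assumptions made for illustration; the question only shows the `<RECORD>` rows. It uses `findtext`, which returns an empty string for an empty element, and normalizes that to `None`.

```python
import xml.etree.ElementTree as ET

# Assumption: the real file wraps its records in a single root element.
# The name <DATA> is hypothetical; only the <RECORD> rows come from the question.
SAMPLE = """<DATA>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Сімферополь</CITY_NAME><CITY_REGION_NAME></CITY_REGION_NAME></RECORD>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Сімферополь</CITY_NAME><CITY_REGION_NAME></CITY_REGION_NAME></RECORD>
<RECORD><OBL_NAME>Автономна Республіка Крим</OBL_NAME><REGION_NAME></REGION_NAME><CITY_NAME>м.Севастополь</CITY_NAME><CITY_REGION_NAME>с.Балка</CITY_REGION_NAME></RECORD>
</DATA>"""

FIELDS = ('OBL_NAME', 'REGION_NAME', 'CITY_NAME', 'CITY_REGION_NAME')


def unique_records(xml_text):
    """Return one dict per distinct <RECORD>, keeping first-seen order."""
    root = ET.fromstring(xml_text)
    seen = set()
    rows = []
    for record in root.iter('RECORD'):
        # findtext gives '' for an empty element and None for a missing one;
        # "or None" maps both cases to None.
        row = tuple(record.findtext(field) or None for field in FIELDS)
        if row not in seen:
            seen.add(row)
            rows.append(dict(zip(FIELDS, row)))
    return rows


rows = unique_records(SAMPLE)
```

With this sample the duplicated Сімферополь row collapses into one entry, and the empty `REGION_NAME` elements come back as `None`, which is what the `is not None` checks in the import code rely on.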

And this is the file where I try to write the data to the database as Obl_name -> Region_name -> City_name -> City_region_name:


from django.core.management.base import BaseCommand

# Place and get_dataset come from the project's own modules (imports not shown)


class Command(BaseCommand):

    help = 'Fill DB with initial data'

    def handle(self, *args, **options):
        for entry in get_dataset():
            # get_or_create returns an (object, created) tuple, so unpack it;
            # passing the whole tuple as parent= would fail
            oblast, _ = Place.objects.get_or_create(name=entry.get('OBL_NAME'))
            region = None
            city = None
            if entry.get('REGION_NAME') is not None:
                region, _ = Place.objects.get_or_create(name=entry.get('REGION_NAME'),
                                                        parent=oblast)
            if entry.get('CITY_NAME') is not None and region is not None:
                city, _ = Place.objects.get_or_create(name=entry.get('CITY_NAME'),
                                                      parent=region)
            elif entry.get('CITY_NAME') is not None:
                # no region in this record: attach the city directly to the oblast
                city, _ = Place.objects.get_or_create(name=entry.get('CITY_NAME'),
                                                      parent=oblast)
            # if entry.get('CITY_REGION_NAME') is not None and city is not None:
            #     Place.objects.get_or_create(name=entry.get('CITY_REGION_NAME'),
            #                                 parent=city)
        self.stdout.write(self.style.SUCCESS('Database updated successfully!'))
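
One possible way to cut down the number of queries in the command above is to remember every place that has already been stored, so that each distinct oblast, region, city and city region is written only once instead of once per record. The sketch below is not a definitive implementation. `import_places` and its `create(name, parent)` callable are hypothetical names introduced here; the storage step is passed in as a function so the idea can be shown without a database.

```python
def import_places(records, create):
    """Store each distinct place once, walking oblast -> region -> city -> city region.

    `create(name, parent)` is a hypothetical callable that stores one place
    and returns it. The cache key is the path of names from the top level
    down, so the same name under a different parent is a separate place.
    """
    levels = ('OBL_NAME', 'REGION_NAME', 'CITY_NAME', 'CITY_REGION_NAME')
    cache = {}  # maps a path such as (oblast, city) to the stored place
    for record in records:
        parent = None
        path = ()
        for level in levels:
            name = record.get(level)
            if not name:
                # Empty level: the next level attaches to the current parent.
                continue
            path += (name,)
            if path not in cache:
                cache[path] = create(name, parent)
            parent = cache[path]
    return cache


# Stand-in for the database so the sketch runs on its own.
calls = []


def fake_create(name, parent):
    calls.append((name, parent))
    return name


records = [
    {'OBL_NAME': 'A', 'REGION_NAME': None, 'CITY_NAME': 'c1', 'CITY_REGION_NAME': None},
    {'OBL_NAME': 'A', 'REGION_NAME': None, 'CITY_NAME': 'c2', 'CITY_REGION_NAME': 'v1'},
    {'OBL_NAME': 'A', 'REGION_NAME': None, 'CITY_NAME': 'c2', 'CITY_REGION_NAME': 'v2'},
]
cache = import_places(records, fake_create)
```

In the Django command, `create` could be something like `lambda name, parent: Place.objects.get_or_create(name=name, parent=parent)[0]`, assuming `Place` has the `name` and `parent` fields used in the question. Running the whole loop inside `transaction.atomic()` from `django.db` may also help, since the writes are then committed together rather than one at a time.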

0 answers:

No answers