Get all tag names from XML (tag names only)

Time: 2017-02-28 13:00:26

Tags: python xml python-3.x parsing tags

I am trying to print all the tags (tag names only) of an XML document, but I am running into a problem with join():

DEXML = urlopen('# the URL of the XML')

tree_DE = ET.parse(DEXML)

root_DE = tree_DE.findall('.//*')
a = []

for element in list(set(root_DE)):
    x = str(element)
    m = re.search("'[a-zA-Z]+'", x)
    m = ",".join()
    a.append(m)

print(a)

Running this code produces the following error:
TypeError: join() takes exactly one argument (0 given)

2 answers:

Answer 0: (score: 0)

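import re
import xml.etree.ElementTree as ET
from urllib.request import urlopen

DEXML = urlopen('# the URL of the XML')

tree_DE = ET.parse(DEXML)

root_DE = tree_DE.findall('.//*')
a = []

for element in root_DE:
    # str(element) looks like "<Element 'tag' at 0x...>"
    x = str(element)
    seq = re.search("'([a-zA-Z]+)'", x)
    # join() needs an iterable of strings; re.search() returns a
    # Match object (or None), so take the captured tag name instead
    if seq:
        a.append(seq.group(1))

# join() is called on the separator and takes the list as its argument
print(",".join(a))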
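
The regex step can be skipped, because every element already exposes its name through the tag attribute. Below is a minimal sketch of that idea, not the asker's exact setup: SAMPLE_XML and unique_tag_names are hypothetical names, and an inline string stands in for the downloaded document. Note that root.iter() also includes the root element, unlike findall('.//*').

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document standing in for the downloaded XML.
SAMPLE_XML = """
<catalog>
    <book id="1"><title>A</title><price>10</price></book>
    <book id="2"><title>B</title><price>12</price></book>
</catalog>
"""


def unique_tag_names(xml_text):
    """Return the sorted, de-duplicated tag names found in xml_text."""
    root = ET.fromstring(xml_text)
    # root.iter() yields the root element and every descendant.
    return sorted({element.tag for element in root.iter()})


tags = unique_tag_names(SAMPLE_XML)
# str.join takes one iterable of strings as its argument.
print(",".join(tags))  # book,catalog,price,title
```

Collecting the names in a set removes duplicates by name; calling set() on the elements themselves would not, since each element is a distinct object.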

Answer 1: (score: 0)

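The right way is to read the tag attribute of each element while parsing, rather than extracting the name from str(element) with a regex: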

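import xml.etree.ElementTree as ET
from urllib.request import urlopen

PRD_XML = urlopen('URL.xml_1')
DEX_ML = urlopen('URL.xml_2')

# The lists must exist before the loops append to them
PRD_tags = []
DE_tags = []

# iterparse yields (event, element) pairs; by default only "end" events
for event, element in ET.iterparse(PRD_XML):
    PRD_tags.append(element.tag)

for event_2, element_2 in ET.iterparse(DEX_ML):
    DE_tags.append(element_2.tag)


def compare():
    # True if at least one tag name appears in both documents
    return any(item in DE_tags for item in PRD_tags)

# Passes only when the two documents share no tag names
assert not compare(), "no match"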
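
If the goal is to see which tag names the two documents share, sets make the comparison more direct than nested list lookups. The following is a rough sketch under assumptions: tag_names is a hypothetical helper, and the two io.BytesIO objects are made-up stand-ins for the responses returned by urlopen.

```python
import io
import xml.etree.ElementTree as ET


def tag_names(source):
    """Collect the set of tag names from a file-like XML source."""
    # iterparse reports "end" events by default, one per element.
    return {element.tag for _event, element in ET.iterparse(source)}


# Hypothetical in-memory documents in place of the two URLs.
prd_source = io.BytesIO(b"<root><name/><price/></root>")
de_source = io.BytesIO(b"<root><name/><stock/></root>")

prd_tags = tag_names(prd_source)
de_tags = tag_names(de_source)

common = prd_tags & de_tags        # tag names present in both documents
only_in_prd = prd_tags - de_tags   # tag names missing from the second one

print(sorted(common))       # ['name', 'root']
print(sorted(only_in_prd))  # ['price']
```

Set intersection and difference also report which names match or differ, instead of only a True/False result.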