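Why does my code only parse part of the XML file?

Time: 2016-08-22 14:56:01

Tags: python xml

Apologies in advance, I'm new to Python. I'm trying to sum all the <count> elements in this XML file, but for some reason my code only processes part of the file. I've tried to figure out why but failed. Could I get some advice? Thanks. Sorry for the long file.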

import xml.etree.ElementTree as ET

input='''
<commentinfo>
<note>This file contains the sample data for testing</note>
<comments>
<comment>
<name>Romina</name>
<count>97</count>
</comment>
<comment>
<name>Laurie</name>
<count>97</count>
</comment>
<comment>
<name>Bayli</name>
<count>90</count>
</comment>
<comment>
<name>Siyona</name>
<count>90</count>
</comment>
<comment>
<name>Taisha</name>
<count>88</count>
</comment>
<comment>
<name>Ameelia</name>
<count>87</count>
</comment>
<comment>
<name>Alanda</name>
<count>87</count>
</comment>
<comment>
<name>Prasheeta</name>
<count>80</count>
</comment>
<comment>
<name>Risa</name>
<count>79</count>
</comment>
<comment>
<name>Asif</name>
<count>79</count>
</comment>
<comment>
<name>Zi</name>
<count>78</count>
</comment>
<comment>
<name>Ediomi</name>
<count>76</count>
</comment>
<comment>
<name>Danyil</name>
<count>76</count>
</comment>
<comment>
<name>Barry</name>
<count>72</count>
</comment>
<comment>
<count>64</count>
<name>Lance</name>
<count>72</count>
</comment>
<comment>
<name>Hattie</name>
<count>66</count>
</comment>
<comment>
<name>Mathu</name>
<count>66</count>
</comment>
<comment>
<name>Bowie</name>
<count>65</count>
</comment>
<comment>
<name>Samara</name>
<count>65</count>
</comment>
<comment>
<name>Uchenna</name>
</comment>
<comment>
<name>Shauni</name>
<count>61</count>
</comment>
<comment>
<name>Georgia</name>
<count>61</count>
</comment>
<comment>
<name>Rivan</name>
<count>59</count>
</comment>
<comment>
<name>Kenan</name>
<count>58</count>
</comment>
<comment>
<name>Isma</name>
<count>57</count>
</comment>
<comment>
<name>Hassan</name>
<count>57</count>
</comment>
<comment>
<name>Samanthalee</name>
<count>54</count>
</comment>
<comment>
<name>Alexa</name>
<count>51</count>
</comment>
<comment>
<name>Caine</name>
<count>49</count>
</comment>
<comment>
<name>Grady</name>
<count>47</count>
</comment>
<comment>
<name>Anne</name>
<count>40</count>
</comment>
<comment>
<name>Rihan</name>
<count>38</count>
</comment>
<comment>
<name>Alexei</name>
<count>37</count>
</comment>
<comment>
<name>Indie</name>
<count>36</count>
</comment>
<comment>
<name>Rhuairidh</name>
<count>36</count>
</comment>
<comment>
<name>Annoushka</name>
<count>32</count>
</comment>
<comment>
<name>Kenzi</name>
<count>25</count>
</comment>
<comment>
<name>Shahd</name>
<count>24</count>
</comment>
<comment>
<name>Irvine</name>
<count>22</count>
</comment>
<comment>
<name>Carys</name>
<count>21</count>
</comment>
<comment>
<name>Skye</name>
<count>19</count>
</comment>
<comment>
<name>Atiya</name>
<count>18</count>
</comment>
<comment>
<name>Rohan</name>
<count>18</count>
</comment>
<comment>
<name>Nuala</name>
<count>14</count>
</comment>
<comment>
<name>Carlo</name>
<count>12</count>
</comment>
<comment>
<name>Maram</name>
<count>12</count>
</comment>
<comment>
<name>Japleen</name>
<count>9</count>
</comment>
<comment>
<name>Breeanna</name>
<count>7</count>
</comment>
<comment>
<name>Zaaine</name>
<count>3</count>
</comment>
<comment>
<name>Inika</name>
<count>2</count>
</comment>
</comments>
</commentinfo>'''

tree = ET.fromstring(input)
counts = tree.findall('comments/comment')

summa = 0
for item in counts:
    try:
        k = item.find('count').text
        k = int(k)
        print(k)
        summa += k
    except:
        break

print(summa)

1 answer:

Answer 0 (score: 1)

One of your <comment> tags has no <count>:

<comment>
<name>Uchenna</name>
</comment>

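This causes item.find('count') to return None. None has no .text attribute, so an AttributeError is raised. Your overly broad exception handler catches that AttributeError and the break ends the loop early.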
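To see this behaviour in isolation, here is a minimal sketch. The two-entry `sample` string and the variable names are made up for illustration; only `ET.fromstring`, `findall` and `find` come from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sample: the second <comment> has no <count>.
sample = '''
<comments>
  <comment><name>Samara</name><count>65</count></comment>
  <comment><name>Uchenna</name></comment>
</comments>'''

root = ET.fromstring(sample)
first, second = root.findall('comment')

print(first.find('count').text)  # prints 65
print(second.find('count'))      # prints None: no matching child element

try:
    second.find('count').text    # attribute lookup on None
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

The last print shows the AttributeError message that the bare except in the question was silently swallowing.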

This is a good demonstration of why you should never use:

try:
    ...
except:
    ...
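As a further illustration (not part of the question's code), a bare except also hides plain programming mistakes. In this sketch both hypothetical functions contain the same deliberate typo, `valeu`; only the one with the narrow handler lets the resulting NameError surface.

```python
def total_with_bare_except(values):
    total = 0
    for value in values:
        try:
            total += int(valeu)  # deliberate typo: raises NameError
        except:                  # swallows the NameError as well
            break
    return total


def total_with_narrow_except(values):
    total = 0
    for value in values:
        try:
            total += int(valeu)  # same deliberate typo
        except ValueError:       # only handles non-numeric strings
            continue
    return total


print(total_with_bare_except(['1', '2']))  # prints 0, the bug goes unnoticed

try:
    total_with_narrow_except(['1', '2'])
except NameError as exc:
    print('bug surfaced:', exc)
```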

You should catch only the exceptions you know how to handle (and keep the code inside the try suite as small as possible). In this case:

for item in counts:
    try:
        k = item.find('count').text
        k = int(k)
    except (AttributeError, ValueError):  # missing or malformed `<count>`
        continue  # skip that tag and keep summing the others
    print(k)
    summa += k
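An alternative you could consider, sketched here under my own assumptions rather than taken from the question, is to avoid the AttributeError altogether with `Element.findtext`, which returns None when the child is missing. The helper name `sum_counts` and the shortened `sample` data are hypothetical.

```python
import xml.etree.ElementTree as ET


def sum_counts(xml_text):
    """Sum every <count> under comments/comment, skipping missing or non-numeric ones."""
    root = ET.fromstring(xml_text)
    total = 0
    for item in root.findall('comments/comment'):
        text = item.findtext('count')  # None when there is no <count> child
        if text is None:
            continue
        try:
            total += int(text)
        except ValueError:             # empty or non-numeric <count>
            continue
    return total


# Shortened, made-up sample in the same shape as the question's data.
sample = '''
<commentinfo>
  <comments>
    <comment><name>Samara</name><count>65</count></comment>
    <comment><name>Uchenna</name></comment>
    <comment><name>Shauni</name><count>61</count></comment>
  </comments>
</commentinfo>'''

print(sum_counts(sample))  # prints 126
```

With the explicit None check, the try suite only wraps the int() conversion, so the only exception left to handle is ValueError.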