Question

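I want to parse an HTML table into a 2D array (rows and columns) in Python using HTMLParser only. I don't want to use BeautifulSoup or any other non-standard library.

This is a personal project, and I'm doing it this way for fun :P

Anyway, here is my code. It gives me a very confusing error - it says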

Answer 1

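I haven't checked what you are trying to do overall, but you assign a string to self.txt and then try to use it as a list.

In the constructor, initialize self.txt with an empty list: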

def __init__(self):
    ...
    self.txt = []
    ...

Then, in the handle_data method, treat self.txt as a list everywhere. As written, the two branches disagree:

def handle_data(self, text):
    if (len(self.txt) > 0 ) :
        self.txt.append(text + " ") # <-- Here you consider self.txt is a list

    if (self.in_table == 1 and self.in_th == 0):
        self.txt = text.lstrip() # <-- Here you **assign a string** to self.txt
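
For reference, here is a minimal sketch of how the whole thing could look once the cell buffer is always a list. It is not your code: it assumes Python 3 (where the class lives in html.parser), and the names TableParser, parse_table, rows and row are made up for this example. It also assumes the tables are not nested.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cells of every <tr> into a list of rows (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []    # finished rows, one list of cell strings per <tr>
        self.row = None   # cells of the row being read, None outside a <tr>
        self.txt = None   # text fragments of the current cell: a list or None, never a str

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th") and self.row is not None:
            self.txt = []  # start a fresh buffer for this cell

    def handle_data(self, text):
        # Only collect text while inside a cell; always append, never reassign.
        if self.txt is not None:
            self.txt.append(text)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.txt is not None:
            # Join the fragments once, when the cell is complete.
            self.row.append("".join(self.txt).strip())
            self.txt = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None


def parse_table(html):
    """Return the table cells found in html as a 2D list of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling parse_table on a string of HTML returns one inner list per row. Header cells are kept here; if you want to skip them, as the in_th flag in your code suggests, drop "th" from the two tuples.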

Parsing a table in Python using HTMLParser

1 answer: