Question

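Hello, I called find_all on a BeautifulSoup object and got back a bs4.element.ResultSet object (or, after slicing, a list).

I want to run find_all again on that result, but a bs4.element.ResultSet object does not allow it. I can loop over each element of the bs4.element.ResultSet object and call find_all on each one. But can I avoid the loop and convert the result back into a BeautifulSoup object?

Please see the code for details. Thanks.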

from bs4 import BeautifulSoup

html_1 = """
<table>
    <thead>
        <tr class="myClass">
            <th>A</th>
            <th>B</th>
            <th>C</th>
            <th>D</th>
        </tr>
    </thead>
</table>
"""
soup = BeautifulSoup(html_1, 'html.parser')

type(soup) #bs4.BeautifulSoup

# do find_all on beautifulsoup object
th_all = soup.find_all('th')

# the result is of type bs4.element.ResultSet or similarly list
type(th_all) #bs4.element.ResultSet
type(th_all[0:1]) #list

# now I want to run find_all again on the result
th_all.find_all(text='A')  # does not work: ResultSet has no find_all (AttributeError)

# can I avoid the need for this loop?
for th in th_all:
    th.find_all(text='A')  # works

Answer 1

ResultSet is a subclass of list, not of the Tag class, which is where the find* methods are defined. Looping over the results of find_all() is the most common approach:

th_all = soup.find_all('th')
result = []
for th in th_all:
    result.extend(th.find_all(text='A'))
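
If the goal is mainly to avoid writing the loop out, here is a minimal sketch of two alternatives, assuming the same table as in the question. The first folds the loop into a nested list comprehension; the second re-parses the serialized tags, which is one way to get a BeautifulSoup object back (at the cost of parsing the markup a second time). The names matches, fragment and reparsed_matches are only illustrative, and string= is used as the newer name of the text= argument.

```python
from bs4 import BeautifulSoup

html_1 = "<table><thead><tr><th>A</th><th>B</th><th>C</th><th>D</th></tr></thead></table>"
soup = BeautifulSoup(html_1, "html.parser")

th_all = soup.find_all("th")

# ResultSet is a list subclass; the find* methods live on its elements (Tag).
print(isinstance(th_all, list))        # True
print(hasattr(th_all[0], "find_all"))  # True

# Option 1: the same loop, written as a nested comprehension.
matches = [hit for th in th_all for hit in th.find_all(string="A")]

# Option 2: serialize the tags and parse them again to get a new
# BeautifulSoup object that supports find_all directly.
fragment = BeautifulSoup("".join(str(th) for th in th_all), "html.parser")
reparsed_matches = fragment.find_all(string="A")

print(matches, reparsed_matches)
```

Both variants still visit every th internally, so they are a matter of convenience rather than speed.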

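CSS selectors can often solve this in one go, but not everything you can do with find_all() can be done with select(). For instance, there is no "text" search available in the bs4 CSS selectors. But if, for example, you had to find all td elements inside th elements, you could do: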

soup.select("th td")
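
As a rough illustration of pairing a selector with a text filter, the sketch below assumes a small made-up table in which some th cells wrap their label in a b element. select() does the nested search in a single call, and the text comparison is then done in plain Python. The markup and the names bold_in_th, labels and only_a are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration: two headers use <b>, one does not.
html = "<table><tr><th><b>A</b></th><th><b>B</b></th><th>C</th></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# One call finds every <b> nested anywhere inside a <th>.
bold_in_th = soup.select("th b")
labels = [b.get_text() for b in bold_in_th]

# The text condition is applied afterwards, outside the selector.
only_a = [b for b in bold_in_th if b.get_text(strip=True) == "A"]

print(labels)
print(only_a)
```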

beautifulsoup: find_all on a bs4.element.ResultSet object or list?