Python script - find_all

Time: 2017-09-29 11:29:05

Tags: python beautifulsoup

The following script prints every <ar-save-item> tag it finds:
import requests
from bs4 import BeautifulSoup

def getrec():
    key = "Paneer"
    url = "http://allrecipes.com/search/results/?wt=" + key + "&sort=re"
    print(url)
    response = requests.get(url)
    result_page = BeautifulSoup(response.content, 'lxml')
    # find_all returns a list of every matching tag
    r = result_page.find_all('ar-save-item')
    for res in r:
        print(r)  # prints the whole list, not the single tag res

However, I only want to display the data-id value of each tag. How can I do that?

The output looks like this:

[<ar-save-item class="favorite" data-id="73715" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/00/42/82/428269.jpg'" data-name='"Paneer"' data-type="'Recipe'"></ar-save-item>, <ar-save-item class="favorite" data-id="212521" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/00/32/99/329922.jpg'" data-name='"Shahi Paneer"' data-type="'Recipe'"></ar-save-item>, <ar-save-item class="favorite" data-id="221826" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/01/03/63/1036376.jpg'" data-name='"Palak Paneer (Indian Spinach and Paneer)"' data-type="'Recipe'"></ar-save-item>

Desired result:

data-id="73715"
data-id="212521"
and so on. Please help.

1 answer:

Answer 0 (score: 0)

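res behaves like a dict. You can get the value with res['data-id'] or with the get() method, res.get('data-id'). It is better to use get(), because it returns None when the tag has no data-id attribute, whereas using data-id as a key in res raises an exception.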

import requests
from bs4 import BeautifulSoup

def getrec():

    key = "Paneer"
    url = "http://allrecipes.com/search/results/?wt="+key+"&sort=re"

    response = requests.get(url)
    result_page = BeautifulSoup(response.content,'lxml')
    r = result_page.find_all('ar-save-item')

    for res in r:
        print('data-id =', res.get('data-id'))

getrec()

Output:

data-id = 73715
data-id = 212521
data-id = 221826
data-id = 212756
data-id = 232201
data-id = 222787
data-id = 232203
data-id = 240652
data-id = 138127
data-id = 256164
data-id = 221828
data-id = 212814
data-id = 106159
data-id = 159147
data-id = 86602
data-id = 237491
data-id = 213235
data-id = 228957
data-id = 228899
data-id = 232202
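
The difference between res['data-id'] and res.get('data-id') can be checked without touching the network. The following is only a minimal sketch: SAMPLE_HTML is hypothetical markup modelled on the output in the question (the second tag deliberately lacks data-id), the helper names extract_data_ids and subscript_raises_keyerror are made up for illustration, and it uses the built-in 'html.parser' instead of 'lxml' so that no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup modelled on the output in the question;
# the second tag deliberately has no data-id attribute.
SAMPLE_HTML = """
<ar-save-item class="favorite" data-id="73715" data-type="'Recipe'"></ar-save-item>
<ar-save-item class="favorite" data-type="'Recipe'"></ar-save-item>
<ar-save-item class="favorite" data-id="212521" data-type="'Recipe'"></ar-save-item>
"""


def extract_data_ids(html):
    """Return the data-id of every <ar-save-item> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    ids = []
    for res in soup.find_all("ar-save-item"):
        data_id = res.get("data-id")  # None when the attribute is missing
        if data_id is not None:
            ids.append(data_id)
    return ids


def subscript_raises_keyerror(html):
    """Check that res['data-id'] raises KeyError on the tag without data-id."""
    soup = BeautifulSoup(html, "html.parser")
    tag_without_id = soup.find_all("ar-save-item")[1]
    try:
        tag_without_id["data-id"]
    except KeyError:
        return True
    return False


if __name__ == "__main__":
    print(extract_data_ids(SAMPLE_HTML))
    print(subscript_raises_keyerror(SAMPLE_HTML))
```

Collecting the values into a list instead of printing them makes it easy to reuse them later, and skipping None keeps tags without the attribute from breaking the loop.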