Beautiful Soup cannot find a div

Time: 2019-09-08 17:40:12

Tags: html python-3.x beautifulsoup python-requests lxml

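I am trying to use Beautiful Soup to get the HTML inside a div that I select by its class, but the div cannot be found. I tried changing the name to the class of a sub-div instead:

<div class = 'Parent'><div class = 'child'> </div></div>

but that still does not find the div. What needs to be changed?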

<div id="SEDOC-1558061327644--611898955" class="se_doc_viewer se_body_wrap se_theme_default  se_pc" data-docversion="1.0">
<div class="se_doc_header_start" id="SEDOC-1558061327644--611898955_se_doc_header_start"></div>
<!-- SE_DOC_HEADER_START -->
<div id="SEDOC-1558061327644--611898955_viewer_head" class="se_viewer_head"></div>
<div class="se_component_wrap">
<div class="se_component se_documentTitle default  is-fill">
    <div class="se_sectionArea is-fill se_align-left">
        <!-- $SE3-TITLE_TOP Post Service placeholder --><a href="/my/series/detail.nhn?seriesNo=517466&amp;memberNo=29747755&amp;prevVolumeNo=20163796" onclick="mug.common.nclick(this, 'tit.series');"><div class="se_series"><i class="se_ico_series" style="display:none">스타에디터3</i><i class="se_ico_series" style="">시리즈</i><i class="se_ico_groupseries" style="display:none">콜라보</i>"Test" a week</div></a><!-- $SE3-TITLE_TOP Post Service placeholder -->
        <!-- SE_DOC_HEADER_TITLE_TOP-->
        <div id="SEDOC-1558061327644--611898955_se_doc_title_top" class="se_doc_title_top"></div>
        <div class="se_editArea">
            <div class="se_viewArea se_ff_nanumgothic se_fs_D2">
                <div class="se_editView se_title">
                    <div class="se_textView">



                    </div>
                </div>
            </div>
        </div><!-- SE_DOC_HEADER_TITLE_MIDDLE-->
        <div id="SEDOC-1558061327644--611898955_se_doc_title_middle" class="se_doc_title_middle"></div>
        <!-- SE_DOC_HEADER_CONTENTS_START -->
        <!-- $SE3-TITLE_CONTENTS Post Service placeholder --><div class="se_container"><a href="/my.nhn?memberNo=29747755" onclick="mug.common.nclick(this, 'tit.profile');" class="se_thumbnail"><img src="https://post-phinf.pstatic.net/20160308_60/jypentertainment_1457428790611iij1y_JPEG/jypentertainment_5695439586391775582.jpg?type=f120_120" data-src="https://post-phinf.pstatic.net/20160308_60/jypentertainment_1457428790611iij1y_JPEG/jypentertainment_5695439586391775582.jpg?type=f120_120" onerror="this.onerror=null;this.src='https://ssl.pstatic.net/static.post/image/im/img_default.gif'" alt="JYPnation님의 프로필 사진"></a><div class="se_group"><div class="se_author_area"><p><a href="/my.nhn?memberNo=29747755" onclick="mug.common.nclick(this, 'tit.name');"><span class="se_author">JYPnation</span></a></p><p style=""><span class="ico_official"><i class="blind">공식</i></span></p><p><i class="se_divide_dot" style="display:none"></i><span class="se_follower" style="display:none">{=badgeName}</span></p></div><p class="se_detail"><span class="se_publishDate">2019.05.17. 12:02</span><i class="se_divide_line"></i><span class="se_view" style="">53,049 읽음 </span><span class="se_secret" style="display:none">비밀글</span></p></div><div class="flw_area"><a href="#" class="btn_stats __se3_stat_btn" style="display:none" onclick="mug.common.nclick(this, 'tit.stat');" target="_blank"><span>통계</span></a>
</div></div><!-- $SE3-TITLE_CONTENTS Post Service placeholder -->
        <!-- SE_DOC_HEADER_CONTENTS_END -->
        <!-- SE_DOC_HEADER_TITLE_BOTTOM-->
        <div id="SEDOC-1558061327644--611898955_se_doc_title_bottom" class="se_doc_title_bottom"></div>
    </div>
</div>

</div>
<!-- {{{$SE3-CONTENTS_HEADER}}} -->
<!-- SE_DOC_HEADER_END -->
<div class="se_doc_header_end" id="SEDOC-1558061327644--611898955_se_doc_header_end"></div>
<div class="se_doc_contents_start" id="SEDOC-1558061327644--611898955_se_doc_contents_start"></div>
<!-- SE_DOC_CONTENTS_START -->
<div class="se_component_wrap sect_dsc __se_component_area">

<div class="se_component se_paragraph default">
    <div class="se_sectionArea">
        <div class="se_editArea">
            <div class="se_viewArea se_ff_nanumgothic se_fs_T3 se_align-center">
                <div class="se_editView">
                    <div class="se_textView">

                    </div>
                </div>
            </div>
        </div>
    </div>
</div>
<div class="se_component se_image default">
            <div class="se_sectionArea se_align-left">
                <div class="se_editArea">
                    <div class="se_viewArea" style="max-width:683px">
        <a href="#" onclick="return false;" class="se_mediaArea __se_image_link __se_link" data-linktype="img" data-linkdata="{&quot;imgId&quot; : &quot;SEDOC-1558061327644--611898955_image_3_img&quot;, &quot;src&quot; : &quot;https://post-phinf.pstatic.net/MjAxOTA1MTdfMjk1/MDAxNTU4MDYwMjA0ODEw.NZ-xPvBVeYEluUvn3wktkslX3VboIF7Ks5UqBmNvDZQg.drS4FNylyMyk17gVXmsJlS7jPbkoZML1fC65Y5caLV0g.GIF/5._%EB%82%98%EB%AA%A8_GIF.gif&quot;, &quot;linkUse&quot; : &quot;false&quot;, &quot;link&quot; : &quot;&quot;}">
                            <img id="SEDOC-1558061327644--611898955_image_3_img" class="se_mediaImage __se_img_el" src="https://post-phinf.pstatic.net/MjAxOTA1MTdfMjk1/MDAxNTU4MDYwMjA0ODEw.NZ-xPvBVeYEluUvn3wktkslX3VboIF7Ks5UqBmNvDZQg.drS4FNylyMyk17gVXmsJlS7jPbkoZML1fC65Y5caLV0g.GIF/5._%EB%82%98%EB%AA%A8_GIF.gif?type=w1200" data-src="https://post-phinf.pstatic.net/MjAxOTA1MTdfMjk1/MDAxNTU4MDYwMjA0ODEw.NZ-xPvBVeYEluUvn3wktkslX3VboIF7Ks5UqBmNvDZQg.drS4FNylyMyk17gVXmsJlS7jPbkoZML1fC65Y5caLV0g.GIF/5._%EB%82%98%EB%AA%A8_GIF.gif?type=w1200" width="683" height="960" data-attachment-id="Igx1zPa0KwmLJxVgMUYz0YqRGSa8" alt="">

        </a>
                    </div>
                </div>
            </div>
        </div>

</div>
    </div>
</div>

</div>
</div>
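
from requests import get
from bs4 import BeautifulSoup

url = 'https://page_url'  # placeholder URL from the question
response = get(url)
html_soup = BeautifulSoup(response.text, 'lxml')
print(html_soup)

# Look for divs whose class attribute is exactly this string
container = html_soup.find_all("div", {"class": "se_component_wrap sect_dsc __se_component_area"})

print(len(container))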

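I expect it to print the HTML inside div.se_component_wrap sect_dsc __se_component_area.

I used len(container) to check whether anything was found, and the result is 0. However, when I use print(html_soup), the div I want does appear in the output.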
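
One possible explanation for this symptom, which nothing in this post confirms, is that the markup is delivered as raw text inside another element such as a <script> tag. In that case it shows up in print(html_soup) but is never parsed into div elements. The sketch below illustrates that scenario only. SAMPLE, the script type, and the helper find_wrapped_divs are hypothetical stand-ins, and it uses the built-in html.parser instead of lxml so that it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text: the target div exists only as
# plain text inside a <script> tag, so the parser never builds a div element.
SAMPLE = """
<html><body>
<script type="text/x-template" id="clip">
<div class="se_component_wrap sect_dsc __se_component_area"><p>hello</p></div>
</script>
</body></html>
"""

# CSS selector that matches a div carrying all three classes, in any order.
SELECTOR = "div.se_component_wrap.sect_dsc.__se_component_area"


def find_wrapped_divs(html, selector=SELECTOR):
    """Search the document, then fall back to re-parsing <script> text."""
    soup = BeautifulSoup(html, "html.parser")
    found = soup.select(selector)
    if found:
        return found
    # Fallback: treat the text of each <script> as HTML and search again.
    for script in soup.find_all("script"):
        inner = BeautifulSoup(script.string or "", "html.parser")
        found = inner.select(selector)
        if found:
            return found
    return []


# A direct search finds nothing, because the div is only text in the script.
direct = BeautifulSoup(SAMPLE, "html.parser").find_all("div", class_="sect_dsc")
container = find_wrapped_divs(SAMPLE)
print(len(direct), len(container))
```

If the real page behaves differently (for example, the content is inserted by JavaScript after the page loads), this fallback would not help, and inspecting which tag encloses the div in the print(html_soup) output would be the next thing to check.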

0 answers:

No answers yet