Prevent BS4 from adding duplicate tags

Date: 2020-03-09 04:35:48

Tags: python python-3.x beautifulsoup

I am appending an HTML snippet/element to an existing HTML document, and BS4 duplicates an element in it. How can I prevent this?

Simplified code

from bs4 import BeautifulSoup as bs4
html = bs4("<!DOCTYPE html>", "html5lib")
message =  bs4("<span>Complete all required fields.<span>", "html.parser")
html.select("body")[0].append(message)
print(html.prettify())

Output

<!DOCTYPE html>
<html>
 <head>
 </head>
 <body>
  <span>
   Complete all required fields.
   <span>
   </span>
  </span>
 </body>
</html>

Expected

<!DOCTYPE html>
<html>
 <head>
 </head>
 <body>
  <span>
   Complete all required fields.
  </span>
 </body>
</html>

1 answer:

Answer 0 (score: 1)

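You were almost there, but you forgot to close the span. The message ends with a second opening <span> instead of a closing </span>, so the parser treats it as a nested element and closes both spans for you. BS4 is not duplicating anything; it is repairing the malformed markup.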

from bs4 import BeautifulSoup as bs4
html = bs4("<!DOCTYPE html>", "html5lib")
message = bs4("<span>Complete all required fields.</span>", "html.parser")  # changed: closing </span> instead of a second <span>
html.select("body")[0].append(message)
print(html.prettify())

Output:
<!DOCTYPE html>
<html>
 <head>
 </head>
 <body>
  <span>
   Complete all required fields.
  </span>
 </body>
</html>
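
If the snippets come from somewhere you don't control, it may help to check them for unclosed tags before handing them to BS4. Below is a minimal sketch using only the standard library's html.parser module. The names TagBalanceChecker, VOID_TAGS and unclosed_tags are made up for this illustration and are not part of BS4. It is a simple stack check, not a full HTML validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not tracked.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: keeps a stack of tags that are still open."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # tag names opened but not yet closed

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Only pop when the closing tag matches the innermost open tag;
        # any other closing tag is ignored by this simple check.
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()


def unclosed_tags(fragment):
    """Return the tags in `fragment` that are never closed."""
    checker = TagBalanceChecker()
    checker.feed(fragment)
    checker.close()
    return checker.open_tags


# The snippet from the question: the second <span> is an opener, not a closer.
print(unclosed_tags("<span>Complete all required fields.<span>"))
# The corrected snippet from the answer.
print(unclosed_tags("<span>Complete all required fields.</span>"))
```

A non-empty result means the fragment has tags the parser will have to close on its own, which is what produced the extra nested span in the question.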