virtualenv: the same version of BeautifulSoup returns different results

Date: 2013-01-23 15:06:21

Tags: python beautifulsoup

I have one copy of BeautifulSoup installed in the general Python path and another installed in a virtualenv:

beautifulsoup4  - 4.1.3        - active  # in general Python installation

beautifulsoup4  - 4.1.3        - active # in virtualenv path

I run the following code in both environments:

import urllib2
import unicodedata
from bs4 import BeautifulSoup
from collections import Counter
soup = BeautifulSoup(urllib2.urlopen('http://www.thehindu.com/news/cities/bangalore/aero-india-takes-off-on-february-6/article4329776.ece').fp)

In the general Python installation, it gives me:

>>> soup.select('.article-text .body')
[<p class="body"> It is that time when aviation buffs get ready to take off to the Air Force Station in Yelahanka here when the ninth edition of Aero India will be inaugurated by Defence Minister A.K. Antony on February 6.</p>, <p class="body">They can watch aerobatics by, among others, the Flying Bulls from the Czech Republic and Russian Knights — the Russian Air Force Aerobatic Team will complement Indian Air Force’s Sarang Aerobatic Team — at the biennial event that provides a platform for Indian and foreign vendors.</p>, <p class="body">However, IAF’s pride — the Surya Kiran Aerobatic Tea — which has performed to huge plaudits from the audience in the previous shows, will not be there for the country’s premier air show, a press release said.</p>, <p class="body">All exhibition space has been sold out and this edition is expected to see the participation of over 600 companies and 768 overseas delegations. </p>, <p class="body">The largest overseas participation is from the U.S. followed by Israel and Russia. The other major participants include France, the U.K., Germany and Belgium, Bulgaria, Italy, Ukraine, Australia, Belarus, Czech Republic, Japan, Norway, South Africa, Spain, Switzerland, Austria, Brazil, Canada, The Netherlands, Romania, Sweden, Singapore and the UAE.</p>, <p class="body">Organised by the Department of Defence Production, the five-day show aims at promoting products and services being offered by the Indian Defence industry in the international market.</p>]
>>> 

In the virtualenv environment, it returns nothing:

>>> soup.select('.article-text .body')  
[]

What causes this? How can I fix it in the virtualenv?

2 answers:

Answer 0: (score: 1)

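The most common cause of this problem is that one environment has a parser library installed which the other lacks. When no parser is named, BeautifulSoup picks the best one it can find (lxml, then html5lib, then Python's built-in html.parser), and different parsers can build different trees from the same imperfect markup. Check which parsers each environment has.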
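One way to do that check is to run the same small script in both environments and compare the output. This is only a sketch: the helper name installed_parsers is made up for illustration, and it assumes lxml and html5lib are the optional parser libraries of interest (html.parser ships with Python, so it is always present).

```python
import importlib


def installed_parsers(candidates=("lxml", "html5lib")):
    """Return the candidate parser libraries that can be imported here.

    `installed_parsers` is a hypothetical helper, not part of bs4.
    """
    found = []
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            # Not installed in this environment; skip it.
            continue
        found.append(name)
    return found


# Run this in the general installation and in the virtualenv, then compare.
print(installed_parsers())
```

If the two lists differ, the missing library is the likely reason the two environments parse the page differently.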

Answer 1: (score: 0)

I just ran into the same problem. The solution that worked for me was to specify the parser explicitly. In my case, that was: soup = BeautifulSoup(markup, "html5lib")
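As a rough illustration of the same idea, here is the selector from the question applied with an explicitly named parser. The inline markup is a made-up stand-in for the downloaded page, and "html.parser" is used only because it needs no extra install; "html5lib" or "lxml" can be passed the same way if that library is installed in both environments.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup standing in for the downloaded article page.
markup = '<div class="article-text"><p class="body">Hello</p></div>'

# Naming the parser keeps BeautifulSoup from choosing one based on
# whatever happens to be installed in the current environment.
soup = BeautifulSoup(markup, "html.parser")
matches = soup.select(".article-text .body")
print(len(matches))
```

Whichever parser is chosen, the point is to use the same name in both environments so the tree is built the same way.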