I am trying to parse a table from a Dropbox link (https://www.dropbox.com/s/i77mern7joxc9ur/TestResultCodelistVoC.xlsx). It is an .xlsx sheet, and so far I have tried two approaches.
Method 1
codeID_url = 'https://www.dropbox.com/s/i77mern7joxc9ur/TestResultCodelistVoC.xlsx'
tables = pd.read_html(codeID_url)
df_codeID = tables[0]
gives:
ValueError: No tables found
That makes sense, since after all I am not parsing a table from an HTML page. The same command works fine for the tables on this page (https://www.ecdc.europa.eu/en/covid-19/variants-concern).
Method 2
codeID_url = 'https://www.dropbox.com/s/i77mern7joxc9ur/TestResultCodelistVoC.xlsx'
data = pd.read_excel(codeID_url,'TestResultCodelistVoC')
gives:
XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'<!DOCTYP'
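I did find a thread about the same error, but all of its answers deal with local .xls files, whereas in my case I am trying to parse a web link that ultimately points to an .xlsx file.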
I have also come across a solution that uses a Dropbox token, but if possible I would first like to download the table above without a Dropbox account.
Answer 0 (score: 0)
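Add ?dl=1 to the end of the URL. Without it, the Dropbox share link returns an HTML preview page rather than the file itself, which is why read_excel found b'<!DOCTYP' where it expected spreadsheet data.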
>>> import pandas as pd
>>>
>>> url = 'https://www.dropbox.com/s/i77mern7joxc9ur/TestResultCodelistVoC.xlsx?dl=1'
>>> df = pd.read_excel(url)
>>> print(df)
Codelistname Codesystem name ... Short label DE 1st Release
0 TestResultCodelistVoC NaN ... Confirmed 501Y.V1 NaN
1 TestResultCodelistVoC NaN ... Confirmed 501Y.V2 NaN
2 TestResultCodelistVoC NaN ... Confirmed 501Y.V3 NaN
3 TestResultCodelistVoC NaN ... Confirmed 501Y.V3.P1 NaN
4 TestResultCodelistVoC NaN ... Confirmed 501Y.V3.P2 NaN
5 TestResultCodelistVoC NaN ... Confirmed not one of the listed VOC NaN
6 TestResultCodelistVoC NaN ... Compatible with 501Y.V1 NaN
7 TestResultCodelistVoC NaN ... Compatible with 501Y.V2 NaN
8 TestResultCodelistVoC NaN ... Compatible with 501Y.V3 NaN
9 TestResultCodelistVoC NaN ... Compatible with 501Y.V3.P1 NaN
10 TestResultCodelistVoC NaN ... Compatible with 501Y.V3.P2 NaN
11 TestResultCodelistVoC NaN ... Compatible with 501Y.V2-3 NaN
12 TestResultCodelistVoC NaN ... Compatible with a VOC NaN
13 TestResultCodelistVoC NaN ... Confirmed MinkCluster5 NaN
14 TestResultCodelistVoC NaN ... Compatible with MinkCluster5 NaN
15 TestResultCodelistVoC NaN ... Not compatible with 501Y.V1 NaN
16 TestResultCodelistVoC NaN ... Not compatible with 501Y.V2-3 NaN
17 TestResultCodelistVoC NaN ... No compatibility with VOC detected (VOC not fu... NaN
18 TestResultCodelistVoC NaN ... Other variant of concern NaN
[19 rows x 12 columns]
>>>
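If the share link might already carry a query string (for example ?dl=0), appending ?dl=1 by hand can produce a malformed URL. The following is a minimal sketch of a more defensive variant, not part of the original answer. The helper names force_direct_download, looks_like_xlsx and read_dropbox_excel are hypothetical, and the sketch assumes that Dropbox keeps honouring dl=1 as the direct-download switch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_direct_download(share_url: str) -> str:
    """Return share_url with dl=1 in its query string (hypothetical helper).

    Any existing dl=... value is replaced; other query parameters are kept.
    """
    parts = urlsplit(share_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "dl"
    ]
    query.append(("dl", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def looks_like_xlsx(head: bytes) -> bool:
    """Check for the ZIP signature that an .xlsx file normally starts with.

    An HTML preview page starts with something like b'<!DOCTYP' instead.
    """
    return head.startswith(b"PK\x03\x04")


def read_dropbox_excel(share_url: str, **read_excel_kwargs):
    """Read a shared .xlsx into a DataFrame (hypothetical wrapper)."""
    # Imported here so the two helpers above can be used without pandas.
    import pandas as pd

    return pd.read_excel(force_direct_download(share_url), **read_excel_kwargs)
```

With this in place, read_dropbox_excel(codeID_url) should be equivalent to the pd.read_excel call shown in the answer, and looks_like_xlsx can be run on the first bytes of a download to confirm that a spreadsheet came back rather than an HTML page.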