How do I remove numbers from names?

Time: 2018-07-31 00:51:28

Tags: python web-scraping lxml urlopen

I have started scraping a web page, but the text I extract has the number 0.01 appended to it.

For example, I want the name "Doe,John0.01" to look like "Doe,John".

Here is the code so far...

from urllib.request import urlopen
from lxml import html

response = urlopen("https://www.baseball-reference.com/leagues/MLB/2018-standard-pitching.shtml")
content = response.read()

tree = html.fromstring(content)

comment_html = tree.xpath('//comment()[contains(., "players_standard_pitching")]')[0]

comment_html = str(comment_html).replace("-->", "")
comment_html = comment_html.replace("<!--", "")

tree = html.fromstring(comment_html)

for pitcher_row in tree.xpath('//table[@id="players_standard_pitching"]/tbody/tr[contains(@class, "full_table")]'):
    csk = pitcher_row.xpath('./td[@data-stat="player"]/@csk')[0]
    print(csk)
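
One possible way to drop the trailing number is a regular expression. This is only a sketch: it assumes the unwanted suffix is always a plain decimal number (such as 0.01) at the very end of the csk value, and strip_trailing_number is a hypothetical helper name, not part of lxml.

```python
import re

# Assumption: the unwanted suffix is digits, optionally followed by
# a decimal part, at the very end of the string (e.g. "0.01").
_TRAILING_NUMBER = re.compile(r"\d+(?:\.\d+)?$")


def strip_trailing_number(csk: str) -> str:
    """Return csk without a trailing numeric suffix, if one is present."""
    return _TRAILING_NUMBER.sub("", csk)


print(strip_trailing_number("Doe,John0.01"))  # Doe,John
```

Inside the loop above, this would be used as print(strip_trailing_number(csk)). Values without a numeric suffix are returned unchanged.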

0 answers:

No answers