Web scraping different live football score websites

Asked: 2017-03-17 09:19:07

Tags: python web-scraping live

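I need football scores in a database for an application I want to develop. The APIs I have found are incomplete or lack some of the features I need. Is it legal to scrape live-score websites? I figure I could scrape several different websites without generating much traffic. What do you think? Thanks.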

1 answer:

Answer 0 (score: 0)

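I don't believe it is illegal to parse data from a website. You may want to do something like the following: a program that goes to a specified web page, grabs the data found on a specific line, and saves it to a file for other programs to use.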

# This gets the price of a stock from a quotes page and saves it to a file.
# Let's hope this works.
# Written for Python 3 (in Python 2 the import was
# "from HTMLParser import HTMLParser" and input() was raw_input()).
import time
from html.parser import HTMLParser

import requests

# Line of the page's HTML source on which the price appeared.
# This is specific to the page layout and breaks if the layout changes.
PRICE_LINE = 154


# Create a subclass and override the handler methods
class MyHTMLParser(HTMLParser):
    # Handle character and text data (tag contents)
    def handle_data(self, data):
        pos = self.getpos()  # (line number, offset) in the HTML source
        if pos[0] == PRICE_LINE:
            price = data
            print(price)
            # Open the file for writing, creating it if it doesn't exist
            with open("price.txt", "w") as f:
                f.write(price)


def main():
    stockname = input("stock symbol: ")
    while True:
        # Instantiate a fresh parser and feed it the page's HTML
        parser = MyHTMLParser()
        r = requests.get("http://stocks.tradingcharts.com/stocks/quotes/" + stockname)
        parser.feed(r.text)
        # Put it on a timer since the page is updated once every 5 minutes
        time.sleep(300)


if __name__ == "__main__":
    main()
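
Picking the value by line number is fragile, since any change to the page shifts the lines. Here is a rough sketch of the same idea that matches on an element attribute instead. The markup (`<span class="score">`), the `ScoreParser` class, the `extract_scores` helper, and the `SAMPLE` string are all made up for illustration; a real live-score page will use different tags and class names, so inspect its HTML first.

```python
from html.parser import HTMLParser


class ScoreParser(HTMLParser):
    """Collect the text of every <span class="score"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.in_score = False  # True while inside a matching <span>
        self.scores = []       # text collected so far

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "span" and ("class", "score") in attrs:
            self.in_score = True

    def handle_endtag(self, tag):
        # Assumes score spans are not nested inside other spans
        if tag == "span":
            self.in_score = False

    def handle_data(self, data):
        if self.in_score and data.strip():
            self.scores.append(data.strip())


def extract_scores(html):
    """Return the list of score strings found in the given HTML text."""
    parser = ScoreParser()
    parser.feed(html)
    parser.close()
    return parser.scores


# Made-up sample standing in for a downloaded page
SAMPLE = """
<div>
  <span class="team">Home FC</span> <span class="score">2</span>
  <span class="team">Away United</span> <span class="score">1</span>
</div>
"""

if __name__ == "__main__":
    print(extract_scores(SAMPLE))
```

In the program above, you would pass `r.text` to `extract_scores` in place of `SAMPLE` and write the result to your file or database.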