如何在不重新加载的情况下检测网页上的更改

时间:2016-04-21 11:29:07

标签: python

我发现使用perl可以检测到。 How to detect a changed webpage? 但不幸的是,我不知道perl。 在python中有一种方法吗? 如果不复杂,你能举个详细的例子吗?

1 个答案:

答案 0 :(得分:1)

你的意思是一个python脚本,它读取一个网页并显示它是否与上次访问不同?一个非常简单的版本(适用于python2和python3):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import os
import requests
from hashlib import sha1

recent_hash_filename = ".recent_hash"


def test(url):
    print("looking up %s" % url)
    if not os.path.exists(recent_hash_filename):
        open(recent_hash_filename, 'a').close()

    hash_fetched = sha1()
    hash_read    = ""
    r = requests.get(url)
    hash_fetched.update(r.text.encode("utf8"))

    with open(recent_hash_filename) as f:
        hash_read = f.read()

    print(hash_fetched.hexdigest())
    print(hash_read)

    if hash_fetched.hexdigest() == hash_read:
        print("same")
    else:
        print("different")

    with open(recent_hash_filename, "w") as f:
        f.write(hash_fetched.hexdigest())

if __name__ == '__main__':
    if len(sys.argv) > 1:
        url = sys.argv[1]
    else:
        url = "https://www.heise.de"

    test(url)

    print("done")

如果您有任何疑问,请告诉我