Question

I am using the following code to scrape specific data from 2 sites.

firstclass = input("First class: ")
nestedclass = input("Nested class: ")
classend = input("Class close tag: ")
exportlist = []

def getNames(i):     #i is the html string.
    i=str(i)
    check = i.find(firstclass)
    while check != -1:
        logging("Making new loop...")  #function to show the message together with time in the console
        i = str(i)
        i = i.replace(firstclass, '\n', 1)
        logging("progress = 25%")
        i = i.split('\n')
        i = str(i[1])
        logging("progress = 50%")
        i = i.replace(nestedclass, '\n', 1) 
        i = i.split('\n')
        logging("progress = 75%")
        i = str(i[1])
        i = i.replace(classend, '\n', 1)
        logging("Loop done ! ")
        i = i.split('\n')
        exportlist.append(i[0])
        i = str(i[1])
        check = i.find(firstclass)


        if check < 500 and check!= -1:              #This part removes the next data piece,
            logging("In short Check")               #if it's very close to the previous one.
            i = str(i)                              #In case of double data in short distance
            i = i.replace(firstclass, '\n', 1) 
            i = i.split('\n')
            i = str(i[1])
            i = i.replace(nestedclass, '\n', 1)
            i = i.split('\n')
            i = str(i)
            i = i.replace(classend, '\n', 1)
            i = i.split('\n')
            i = str(i)
            check = i.find(firstclass)

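This is the part of the code that gives me the most trouble. For the last 2 weeks it ran slowly but well enough: even if it took 10 to 20 minutes, I got correct results. Now, perhaps because the input file keeps growing, I get this when I run it: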

Memory error.

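I tried removing the first-class step, but the code does not work correctly without it, because the nested class also appears outside the first class and that produces wrong results. So, are there any suggestions for a better way to write this code?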

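Also, I am using 64-bit Python on a machine with 64 GB of RAM, which I think should be enough in this case. If there is a way to increase the memory available to Python, I am ready to do that.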
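
For comparison, here is a minimal sketch of the same extraction done with `str.find` and a moving offset, so the HTML string is never copied or split inside the loop. It is only an illustration under assumptions: the names `find_item`, `extract_names` and `min_gap`, and the sample markers, are hypothetical and not part of the code above, and unlike the code above it does not treat newlines in the HTML specially.

```python
def find_item(html, start, first, nested, end):
    """Locate the next item at or after `start`.

    Returns (value_start, value_end, next_pos), or None if any of the
    three markers is missing.
    """
    a = html.find(first, start)
    if a == -1:
        return None
    b = html.find(nested, a + len(first))
    if b == -1:
        return None
    value_start = b + len(nested)
    c = html.find(end, value_start)
    if c == -1:
        return None
    return value_start, c, c + len(end)


def extract_names(html, first, nested, end, min_gap=500):
    """Collect the text between `nested` and `end` after each `first` marker.

    If another `first` marker starts less than `min_gap` characters after
    an item, that one following item is skipped as a near-duplicate.
    """
    results = []
    pos = 0
    while True:
        item = find_item(html, pos, first, nested, end)
        if item is None:
            break
        value_start, value_end, pos = item
        results.append(html[value_start:value_end])

        nxt = html.find(first, pos)
        if nxt != -1 and nxt - pos < min_gap:
            dup = find_item(html, pos, first, nested, end)
            if dup is None:
                break
            pos = dup[2]  # continue scanning after the skipped item
    return results


# Hypothetical sample data, only to show the call.
sample = (
    '<span class="name">outside</span>'
    '<div class="card"><span class="name">Alice</span></div>'
    '<div class="card"><span class="name">Alice again</span></div>'
    + ' ' * 600 +
    '<div class="card"><span class="name">Bob</span></div>'
)
names = extract_names(sample, '<div class="card">', '<span class="name">', '</span>')
print(names)  # ['Alice', 'Bob']
```

Because only integer offsets change between iterations, memory use stays close to the size of the input string plus the extracted values.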

Python memory error - optimizing part of the code in a scraper to reduce memory usage

0 answers: