Question

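I want to print the results in a separate function, but when I refer to the variables I can't use them because they live in different functions. Can anyone tell me how to edit my code so that this works? P.S. I know I should use BeautifulSoup... but I can't install it on my computer.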

import urllib2
from urllib2 import urlopen
import re
import cookielib
from cookielib import CookieJar
import time
c_j = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(c_j))
opener.addheaders = [('User-agent','Mozilla/5.0')] #Makes the website think we are using Firefox by setting a header

def proxies1():
      try:
          page = 'http://free-proxy-list.net/' #Sets the variable page as our website
          sourceCode = opener.open(page).read() #Reads the source code
          titles = re.findall('<tr><td>(.*?)</td><td>', sourceCode) #Parses the Html, collects the proxies
     for title in titles:
         proxy1 = title.replace(',', '').replace("!", '').replace(":", '').replace(";", '') 
     except Exception, e:

         print str(e)

def ports1():
      try:
        page = 'http://free-proxy-list.net/' #Sets the variable page as our website
        sourceCode = opener.open(page).read() #Reads the source code
        banana = re.findall('</td><td>(.*?)</td><td>', sourceCode) #Parses the Html, collects the proxies
        for title in banana:
              port1 = title.replace('a', '').replace('b', '').replace('c', '').replace('d', '').replace('e', '').replace('f', '').replace('g', '').replace('h', '') \
  .replace('i', '').replace('j', '').replace('k', '').replace('l', '').replace('m', '').replace('n', '').replace('o', '').replace('p', '') \
  .replace('q', '').replace('r', '').replace('s', '').replace('t', '').replace('u', '').replace('v', '').replace('w', '').replace('x', '') \
  .replace('y', '').replace('z', '').replace('A', '').replace('B', '').replace('C', '').replace('D', '').replace('E', '').replace('F', '').replace('G', '') \
  .replace('H', '').replace('I', '').replace('J', '').replace('K', '').replace('L', '').replace('M', '').replace('N', '').replace('O', '') \
  .replace('P', '').replace('Q', '').replace('R', '').replace('S', '').replace('T', '').replace('U', '').replace('V', '').replace('W', '') \
  .replace('X', '').replace('Y', '').replace('Z', '')
  except Exception, e:
    print str(e)



def printfun():

      print str(proxy1) + ":" + str(port1)

printfun()

I know my indentation is a bit off; Stack Overflow messed it up... How can I do this?

Answer 1

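You can't use one function's local variables in another function. That is, in fact, the whole point of local variables.

You could get around this by using global variables, but that's a bad idea.

A similar idea, with all the same benefits but none of the problems, is to use object attributes. Change the two functions into methods of a class, and change the variables into attributes of that class. Like this: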
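To make the trade-off concrete, here is a minimal sketch of the global-variable route. The names last_proxy, find_proxy and print_proxy and the sample address are made up for illustration; they are not from the question's code.

```python
# Hypothetical module-level state; any function in the module can rebind it.
last_proxy = None

def find_proxy():
    # Without this declaration the assignment would create a new local instead.
    global last_proxy
    last_proxy = "192.0.2.1"  # made-up sample address

def print_proxy():
    # Reading a global needs no declaration, but the result depends on
    # whether find_proxy() has already run, which the caller cannot see.
    return "proxy: %s" % last_proxy

before = print_proxy()   # find_proxy() has not run yet
find_proxy()
after = print_proxy()
```

The hidden ordering dependency between the two calls is one reason globals tend to cause trouble as a script grows.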

class ProxyParser(object):
    def proxies1(self):
        try:
            self.page = 'http://free-proxy-list.net/' #Sets the attribute page as our website
            self.sourceCode = opener.open(self.page).read() #Reads the source code
            self.titles = re.findall('<tr><td>(.*?)</td><td>', self.sourceCode) #Parses the HTML, collects the proxies
            for title in self.titles:
                self.proxy1 = title.replace(',', '').replace("!", '').replace(":", '').replace(";", '')
        except Exception, e:
            print str(e)

    def ports1(self):
        try:
            for title in self.titles:
                # etc.

proxy_parser = ProxyParser()
proxy_parser.proxies1()
proxy_parser.ports1()

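Basically, just add self to the start of each function's parameter list and self. before each local variable. They are now instance variables, shared by all of that object's methods, rather than local to each individual function.

Another option is to return values from one function and then pass them as arguments to the other. Like this: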
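Here is a small offline sketch of the class approach that can be run as-is. It is only an illustration under some assumptions: SAMPLE_HTML is a made-up stand-in for the downloaded page, the constructor argument and the port pattern are my own choices rather than the question's, and the syntax is kept compatible with Python 3 as well.

```python
import re

# Made-up stand-in for the page source, so the sketch runs without a network.
SAMPLE_HTML = (
    "<table>"
    "<tr><td>192.0.2.1</td><td>8080</td><td>US</td></tr>"
    "<tr><td>192.0.2.2</td><td>3128</td><td>DE</td></tr>"
    "</table>"
)

class ProxyParser(object):
    def __init__(self, source_code):
        # Attributes set on self are visible to every method of this object.
        self.source_code = source_code
        self.proxies = []
        self.ports = []

    def proxies1(self):
        # First cell of each row (assumed to be the address).
        self.proxies = re.findall(r'<tr><td>(.*?)</td><td>', self.source_code)

    def ports1(self):
        # Second cell of each row (assumed to be the port).
        self.ports = re.findall(r'<tr><td>.*?</td><td>(.*?)</td><td>',
                                self.source_code)

    def printfun(self):
        # Uses what the other two methods stored on self.
        return ["%s:%s" % pair for pair in zip(self.proxies, self.ports)]

parser = ProxyParser(SAMPLE_HTML)
parser.proxies1()
parser.ports1()
lines = parser.printfun()
for line in lines:
    print(line)
```

Because the results live on the object, code outside the class can read parser.proxies and parser.ports too.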

def proxies1():
    titles = [] #Default, so the return below still works if the download fails
    try:
        page = 'http://free-proxy-list.net/' #Sets the variable page as our website
        sourceCode = opener.open(page).read() #Reads the source code
        titles = re.findall('<tr><td>(.*?)</td><td>', sourceCode) #Parses the HTML, collects the proxies
        for title in titles:
            proxy1 = title.replace(',', '').replace("!", '').replace(":", '').replace(";", '')
    except Exception, e:
        print str(e)
    return titles

def ports1(titles):
    for title in titles:
        # etc.

titles = proxies1()
ports1(titles)

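Meanwhile, although your question asks how to use one function's variables in another function, I think what you actually want is to use them from outside both functions. Fortunately, exactly the same solutions work there too: either return the values you want to use, or store them as instance attributes.

But before you get there... you have to have the values you want. Your proxies1 function just rebinds the proxy1 variable over and over. So even if you return proxy1, it will only be the last one on the page. Likewise, your ports1 function does the same thing with port1.

I'm still not sure what you're trying to do here, but you probably want to return all of the proxies. You can do that by building up a list and returning it or, if you're feeling adventurous, by yielding each one. The caller can then get a list (or iterator) of proxies from one function and a list (or iterator) of ports from the other, zip them together, and loop over the result. Like this: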
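A tiny sketch of that rebinding problem, using a hypothetical helper last_only and made-up sample strings rather than the real page:

```python
def last_only(titles):
    cleaned = None
    for title in titles:
        # Each pass rebinds the same name, discarding the previous value.
        cleaned = title.replace(',', '')
    return cleaned

# Made-up sample input standing in for the regex matches.
sample = ["192.0.2.1,", "192.0.2.2,", "192.0.2.3,"]
result = last_only(sample)
print(result)
```

Three items go in, but only the final one comes back out.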

proxies = proxies1()
ports = ports1()
for proxy, port in zip(proxies, ports):
    print proxy + ':' + port
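For something you can run without the network, here is a sketch of both variants side by side: proxies1 builds and returns a list, while ports1 is a generator. Treat it as an illustration only; SAMPLE_HTML is invented, the source_code parameter and the port pattern are my assumptions, and the syntax also works on Python 3.

```python
import re

# Made-up stand-in for the page source.
SAMPLE_HTML = (
    "<tr><td>192.0.2.1</td><td>8080</td><td>US</td></tr>"
    "<tr><td>192.0.2.2</td><td>3128</td><td>DE</td></tr>"
)

def proxies1(source_code):
    # List variant: collect every match, then return them all at once.
    proxies = []
    for title in re.findall(r'<tr><td>(.*?)</td><td>', source_code):
        proxies.append(title.strip())
    return proxies

def ports1(source_code):
    # Generator variant: hand back each port as the caller asks for it.
    for port in re.findall(r'<tr><td>.*?</td><td>(.*?)</td><td>', source_code):
        yield port.strip()

# zip accepts any iterable, so a list and a generator can be paired directly.
pairs = [proxy + ':' + port
         for proxy, port in zip(proxies1(SAMPLE_HTML), ports1(SAMPLE_HTML))]
for line in pairs:
    print(line)
```

Note that zip stops at the shorter input, so a row missing a port would silently shorten the output.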

Answer 2

Are you trying to print a list of proxy:port addresses? This should help:

import urllib2
from urllib2 import urlopen
import re
import cookielib
from cookielib import CookieJar
import time
c_j = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(c_j))
opener.addheaders = [('User-agent','Mozilla/5.0')]

def proxies():
    page = 'http://free-proxy-list.net/'
    sourceCode = opener.open(page).read()
    proxy_ports = re.findall('<tr><td>(.*?)</td><td>(.*?)</td><td>', sourceCode)
    addresses = []
    for pp in proxy_ports:
        addresses.append("%s:%s" % pp)
    return addresses

print "\n".join(proxies())
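The key trick here is that re.findall returns a list of tuples when the pattern has more than one group, and each tuple can be fed straight into "%s:%s". A quick offline check of that behaviour, with a made-up SAMPLE_HTML in place of the real page:

```python
import re

# Made-up stand-in for the page source.
SAMPLE_HTML = (
    "<tr><td>192.0.2.1</td><td>8080</td><td>US</td></tr>"
    "<tr><td>192.0.2.2</td><td>3128</td><td>DE</td></tr>"
)

# Two groups in the pattern, so each match is a (proxy, port) tuple.
proxy_ports = re.findall(r'<tr><td>(.*?)</td><td>(.*?)</td><td>', SAMPLE_HTML)

# A 2-tuple fills both %s placeholders in one step.
addresses = ["%s:%s" % pp for pp in proxy_ports]
print("\n".join(addresses))
```

Capturing both cells in one pass also keeps each address paired with the port from its own row.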

How do I use local variables in other functions?
