
How can I open a website with urllib through a proxy in Python?

  •  20
  • Bruno 'Shady'  · Tech Community  · 15 years ago

    Here is the code, as an example:

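    import time
    import urllib

    # Python 2 code, as originally posted; `website` holds the URL to check.
    while True:
        try:
            h = urllib.urlopen(website)
            break
        except IOError:  # a bare except would also swallow Ctrl+C
            print '['+time.strftime('%Y/%m/%d %H:%M:%S')+'] '+'ERROR. Trying again in a few seconds...'
            time.sleep(5)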
    
    4 answers  |  up to 9 years ago
        1
  •  50
  •   Oren    5 years ago

    By default, urlopen uses the environment variable http_proxy to determine which HTTP proxy to use:

    $ export http_proxy='http://myproxy.example.com:1234'
    $ python myscript.py  # Using http://myproxy.example.com:1234 as a proxy
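
    To check which proxy settings Python actually picks up from the environment, a minimal Python 3 sketch can call urllib.request.getproxies(). The helper name detected_proxies is made up for illustration, and the proxy URL is the same placeholder as in the shell example.

```python
import os
import urllib.request


def detected_proxies():
    """Return the scheme -> proxy URL mapping that urllib discovers.

    getproxies() reads environment variables named <scheme>_proxy,
    such as http_proxy.
    """
    return urllib.request.getproxies()


# Equivalent to `export http_proxy=...`, but for this process only (placeholder URL).
os.environ["http_proxy"] = "http://myproxy.example.com:1234"

proxies = detected_proxies()
print("HTTP proxy in use: %s" % proxies.get("http"))
```

    If the printed value is None, the variable is probably not set in the shell that launches the script.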
    

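    Alternatively, to specify the proxy inside your application, pass a proxies argument to urlopen (this is Python 2's urllib.urlopen; Python 3's urllib.request.urlopen does not accept this argument):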

    proxies = {'http': 'http://myproxy.example.com:1234'}
    print("Using HTTP proxy %s" % proxies['http'])
    urllib.urlopen("http://www.google.com", proxies=proxies)
    

    Edit: If I understand your comments correctly, you want to try several proxies and print each one as you try it. How about something like this?

    import time
    import urllib

    candidate_proxies = ['http://proxy1.example.com:1234',
                         'http://proxy2.example.com:1234',
                         'http://proxy3.example.com:1234']
    for proxy in candidate_proxies:
        print("Trying HTTP proxy %s" % proxy)
        try:
            result = urllib.urlopen("http://www.google.com", proxies={'http': proxy})
            print("Got URL using proxy %s" % proxy)
            break
        except IOError:  # connection problems surface as IOError in Python 2
            print("Trying next proxy in 5 seconds")
            time.sleep(5)
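
    For Python 3, where urlopen no longer takes a proxies argument, the same "try each proxy in turn" idea can be sketched with a separate ProxyHandler per candidate. The names open_with_proxy and first_working_proxy, and the fetch and delay parameters, are assumptions made for this sketch rather than anything from urllib itself.

```python
import time
import urllib.request


def open_with_proxy(url, proxy, timeout=10):
    """Open url through a single HTTP proxy using a throwaway opener."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    opener = urllib.request.build_opener(handler)
    return opener.open(url, timeout=timeout)


def first_working_proxy(url, candidates, fetch=open_with_proxy, delay=5):
    """Try each proxy in order; return (proxy, response) for the first success.

    Returns (None, None) when every candidate fails.
    """
    for proxy in candidates:
        print("Trying HTTP proxy %s" % proxy)
        try:
            response = fetch(url, proxy)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            print("Proxy %s failed (%s); waiting %s seconds" % (proxy, exc, delay))
            time.sleep(delay)
        else:
            print("Got URL using proxy %s" % proxy)
            return proxy, response
    return None, None
```

    Calling first_working_proxy("http://www.google.com", candidate_proxies) would then return the first proxy that answered together with its response object. Building a local opener per proxy avoids install_opener, so a failed proxy does not leave global state behind.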
    
        2
  •  42
  •   DomTomCat    9 years ago

    #!/usr/bin/env python3
    import urllib.request
    
    proxy_support = urllib.request.ProxyHandler({'http' : 'http://user:pass@server:port', 
                                                 'https': 'https://...'})
    opener = urllib.request.build_opener(proxy_support)
    urllib.request.install_opener(opener)
    
    with urllib.request.urlopen(url) as response:
        html = response.read()  # process the response body here
    

    See also the relevant section in the Python 3 docs.

        3
  •  4
  •   daz    9 years ago

    The sample code below shows how to connect through a proxy with urllib:

    import urllib.request

    authinfo = urllib.request.HTTPBasicAuthHandler()
    
    proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})
    
    # build a new opener that adds authentication and caching FTP handlers
    opener = urllib.request.build_opener(proxy_support, authinfo,
                                         urllib.request.CacheFTPHandler)
    
    # install it
    urllib.request.install_opener(opener)
    
    f = urllib.request.urlopen('http://www.google.com/')
    
        4
  •  2
  •   CDspace Matrix    8 years ago

    For http and https, use:

    proxies = {'http':'http://proxy-source-ip:proxy-port',
               'https':'https://proxy-source-ip:proxy-port'}
    

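    Entries for other schemes can be added similarly. The dictionary is keyed by scheme, so each scheme maps to exactly one proxy; repeating the 'http' key would silently keep only the last value.

    proxies = {'http':'http://proxy-source-ip:proxy-port',
               'https':'https://proxy-source-ip:proxy-port',
               'ftp':'http://proxy-source-ip:proxy-port',
               # ...
              }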
    

    filehandle = urllib.urlopen( external_url , proxies=proxies)
    

    To use no proxy at all (for links inside your own network):

    filehandle = urllib.urlopen(external_url, proxies={})
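
    In Python 3 the rough equivalent of proxies={} is to build an opener around an empty ProxyHandler, which also keeps any http_proxy environment variable from being applied. The function name build_direct_opener is only illustrative.

```python
import urllib.request


def build_direct_opener():
    """Return an opener that ignores environment proxy settings.

    Passing an explicit ProxyHandler({}) replaces the default ProxyHandler,
    which would otherwise read proxies from the environment.
    """
    no_proxy_handler = urllib.request.ProxyHandler({})
    return urllib.request.build_opener(no_proxy_handler)


opener = build_direct_opener()
# opener.open(internal_url) would now connect directly, without a proxy.
```

    Here internal_url stands for whatever in-network address you need to reach; it is not defined in the sketch.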
    

    To authenticate to the proxy with a username and password:

    proxies = {'http':'http://username:password@proxy-source-ip:proxy-port',
               'https':'https://username:password@proxy-source-ip:proxy-port'}
    

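    Note: avoid special characters such as : and @ in the username and password, because they also delimit the parts of the proxy URL.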
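
    If the credentials do contain such characters, one possible workaround is to percent-encode them before building the proxy URL. This is a sketch under the assumption that your urllib version decodes the escapes again before sending the credentials (Python 3's urllib.request should), so it is worth checking against your proxy. The helper proxy_url and the sample credentials are hypothetical.

```python
from urllib.parse import quote


def proxy_url(user, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded credentials.

    safe="" makes quote() escape every reserved character, including
    ':', '@' and '/', so none of them can be read as a URL delimiter.
    """
    return "%s://%s:%s@%s:%d" % (
        scheme, quote(user, safe=""), quote(password, safe=""), host, port)


# Hypothetical credentials containing '@' and ':'.
example = proxy_url("user@corp", "p:ss@word", "proxy.example.com", 3128)
print(example)
```

    The resulting string can be used as a value in the proxies dictionary.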