代码之家 › 专栏 › 技术社区 › Bruno 'Shady'

如何在Python中通过代理打开带有urllib的网站?

proxy python

Bruno 'Shady' · 技术社区 · 15 年前

这是代码,举个例子

while True:
    try:
        h = urllib.urlopen(website)
        break
    except:
        print '['+time.strftime('%Y/%m/%d %H:%M:%S')+'] '+'ERROR. Trying again in a few seconds...'
        time.sleep(5)

4 回复 | 直到 9 年前

Oren 5 年前

默认情况下, urlopen http_proxy 要确定要使用的HTTP代理,请执行以下操作:

$ export http_proxy='http://myproxy.example.com:1234'
$ python myscript.py  # Using http://myproxy.example.com:1234 as a proxy

proxies 论据

proxies = {'http': 'http://myproxy.example.com:1234'}
print("Using HTTP proxy %s" % proxies['http'])
urllib.urlopen("http://www.google.com", proxies=proxies)

编辑: 如果我正确理解您的评论,您需要尝试几个代理并在尝试时打印每个代理。像这样的怎么样?

candidate_proxies = ['http://proxy1.example.com:1234',
                     'http://proxy2.example.com:1234',
                     'http://proxy3.example.com:1234']
for proxy in candidate_proxies:
    print("Trying HTTP proxy %s" % proxy)
    try:
        result = urllib.urlopen("http://www.google.com", proxies={'http': proxy})
        print("Got URL using proxy %s" % proxy)
        break
    except:
        print("Trying next proxy in 5 seconds")
        time.sleep(5)

DomTomCat 9 年前

#!/usr/bin/env python3
import urllib.request

proxy_support = urllib.request.ProxyHandler({'http' : 'http://user:pass@server:port', 
                                             'https': 'https://...'})
opener = urllib.request.build_opener(proxy_support)
urllib.request.install_opener(opener)

with urllib.request.urlopen(url) as response:
    # ... implement things such as 'html = response.read()'

另请参阅 the relevant section in the Python 3 docs

daz 9 年前

下面的示例代码指导如何使用urllib通过代理进行连接:

authinfo = urllib.request.HTTPBasicAuthHandler()

proxy_support = urllib.request.ProxyHandler({"http" : "http://ahad-haam:3128"})

# build a new opener that adds authentication and caching FTP handlers
opener = urllib.request.build_opener(proxy_support, authinfo,
                                     urllib.request.CacheFTPHandler)

# install it
urllib.request.install_opener(opener)

f = urllib.request.urlopen('http://www.google.com/')
"""

CDspace Matrix 8 年前

对于http和https使用:

proxies = {'http':'http://proxy-source-ip:proxy-port',
           'https':'https://proxy-source-ip:proxy-port'}

可以类似地添加更多代理

proxies = {'http':'http://proxy1-source-ip:proxy-port',
           'http':'http://proxy2-source-ip:proxy-port'
           ...
          }

filehandle = urllib.urlopen( external_url , proxies=proxies)

不要使用任何代理(如果是网络中的链接)

filehandle = urllib.urlopen(external_url, proxies={})

proxies = {'http':'http://username:password@proxy-source-ip:proxy-port',
           'https':'https://username:password@proxy-source-ip:proxy-port'}

注意:避免使用特殊字符,如 :,@

推荐文章

Google User · Django管理员在`list_display中未显示`creation_date`字段`

6 月前

user29747013 · 如何创建一个新的数据框架,其中包含原始数据框架中列的聚合列?

6 月前

ÎÎÎ½Î· ÎÎ®Î¹Î½Î¿Ï · Python lxml.html语法错误:使用lxml find时XPATH的谓词无效

6 月前

user29715306 · from_users=和chats=电视节目中的差异

6 月前

Redshoe · 当执行numpy.genfromtxt()时,python是否会读取文件的所有行?

6 月前

RASEL MAHMUD · 为什么以及如何在is_even()函数内的IF条件中递归X变量在满足0后递增?[副本]

6 月前

prayner · 更新嵌套字典包含列表中的项

7 月前

Bringo Jr · 我可以在O(n)中解决这个问题吗?

7 月前

Dave · 如何在for循环中修改列表值

7 月前

Shukurullox Komiljonov · 从记录中获得相互和解。使用SQL

7 月前