代码之家  ›  专栏  ›  技术社区  ›  Daniel Von Fange

Python:httppost使用流媒体发布大文件

  •  16
  • Daniel Von Fange  · 技术社区  · 15 年前

    import urllib2
    
    f = open('somelargefile.zip','rb')
    request = urllib2.Request(url,f.read())
    request.add_header("Content-Type", "application/zip")
    response = urllib2.urlopen(request)
    

    但是,这会在发布之前将整个文件的内容读入内存。如何让它将文件流式传输到服务器?

    6 回复  |  直到 15 年前
        1
  •  29
  •   Daniel Von Fange    15 年前

    通过阅读systempuntoout链接到的邮件列表线程,我找到了解决方案的线索。

    这个 mmap 模块允许您打开类似字符串的文件。文件的一部分按需加载到内存中。

    import urllib2
    import mmap
    
    # Open the file as a memory mapped string. Looks like a string, but 
    # actually accesses the file behind the scenes. 
    f = open('somelargefile.zip','rb')
    mmapped_file_as_string = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    
    # Do the request
    request = urllib2.Request(url, mmapped_file_as_string)
    request.add_header("Content-Type", "application/zip")
    response = urllib2.urlopen(request)
    
    #close everything
    mmapped_file_as_string.close()
    f.close()
    
        2
  •  3
  •   systempuntoout    15 年前

    你试过吗 Mechanize

    from mechanize import Browser
    br = Browser()
    br.open(url)
    br.form.add_file(open('largefile.zip'), 'application/zip', 'largefile.zip')
    br.submit()
    

    或者,如果不想使用多部分/表单数据,请选中 this 老职位。

    它提出了两种选择:

      1. Use mmap, Memory Mapped file object
      2. Patch httplib.HTTPConnection.send
    
        3
  •  3
  •   Brian Beach    10 年前

    您需要自己设置内容长度头。如果未设置,urllib2将对数据调用len(),而文件对象不支持。

    import os.path
    import urllib2
    
    data = open(filename, 'r')
    headers = { 'Content-Length' : os.path.getsize(filename) }
    response = urllib2.urlopen(url, data, headers)
    

    HTTPConnection 上课时间 httplib.py 在Python 2.7中:

    def send(self, data):
        """Send `data' to the server."""
        if self.sock is None:
            if self.auto_open:
                self.connect()
            else:
                raise NotConnected()
    
        if self.debuglevel > 0:
            print "send:", repr(data)
        blocksize = 8192
        if hasattr(data,'read') and not isinstance(data, array):
            if self.debuglevel > 0: print "sendIng a read()able"
            datablock = data.read(blocksize)
            while datablock:
                self.sock.sendall(datablock)
                datablock = data.read(blocksize)
        else:
            self.sock.sendall(data)
    
        4
  •  1
  •   JimB    15 年前

    试试pycurl。我没有任何东西安装程序会接受一个大文件

    import os
    import pycurl
    
    class FileReader:
        def __init__(self, fp):
            self.fp = fp
        def read_callback(self, size):
            return self.fp.read(size)
    
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.UPLOAD, 1)
    c.setopt(pycurl.READFUNCTION, FileReader(open(filename, 'rb')).read_callback)
    filesize = os.path.getsize(filename)
    c.setopt(pycurl.INFILESIZE, filesize)
    c.perform()
    c.close()
    
        5
  •  1
  •   EthanP    8 年前

    requests 你能做什么

    with open('massive-body', 'rb') as f:
        requests.post('http://some.url/streamed', data=f)
    

    here in their docs

        6
  •  0
  •   Sergey Nudnov    5 年前

    下面是Python 2/Python 3的工作示例:

    try:
        from urllib2 import urlopen, Request
    except:
        from urllib.request import urlopen, Request
    
    headers = { 'Content-length': str(os.path.getsize(filepath)) }
    with open(filepath, 'rb') as f:
        req = Request(url, data=f, headers=headers)
        result = urlopen(req).read().decode()