
Python: telling when an FTP transfer is complete

  •  3
  • ʞɔıu  · Tech Community  · 16 years ago

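    I have to download some files from an FTP server. That sounds mundane enough. However, this server behaves in such a way that, if the file is very large, the connection simply hangs once the download has ostensibly completed.

    How can I handle this gracefully using ftplib in Python?

    Sample Python code: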

    from ftplib import FTP
    
    ...
    
    ftp = FTP(host)
    ftp.login(login, passwd)
    files=ftp.nlst()
    ftp.set_debuglevel(2)
    
    for fname in files:
        # Use a context manager so each local file is closed after its transfer.
        with open(fname, 'wb') as f:
            ret_status = ftp.retrbinary('RETR ' + fname, f.write)
    

    Debug output from the above:

    *cmd* 'TYPE I'
    *put* 'TYPE I\r\n'
    *get* '200 Type set to I.\r\n'
    *resp* '200 Type set to I.'
    *cmd* 'PASV'
    *put* 'PASV\r\n'
    *get* '227 Entering Passive Mode (0,0,0,0,10,52).\r\n'
    *resp* '227 Entering Passive Mode (0,0,0,0,10,52).'
    *cmd* 'RETR some_file'
    *put* 'RETR some_file\r\n'
    *get* '125 Data connection already open; Transfer starting.\r\n'
    *resp* '125 Data connection already open; Transfer starting.'
    [just sits there indefinitely]
    

    When I try the same download with curl -v, it looks like this:

    * About to connect() to some_server port 21 (#0)
    *   Trying some_ip... connected
    * Connected to some_server (some_ip) port 21 (#0)
    < 220 Microsoft FTP Service
    > USER some_user
    < 331 Password required for some_user.
    > PASS some_password
    < 230 User some_user logged in.
    > PWD
    < 257 "/some_dir" is current directory.
    * Entry path is '/some_dir'
    > EPSV
    * Connect data stream passively
    < 500 'EPSV': command not understood
    * disabling EPSV usage
    > PASV
    < 227 Entering Passive Mode (0,0,0,0,11,116).
    *   Trying some_ip... connected
    * Connecting to some_ip (some_ip) port 2932
    > TYPE I
    < 200 Type set to I.
    > SIZE some_file
    < 213 229376897
    > RETR some_file
    < 125 Data connection already open; Transfer starting.
    * Maxdownload = -1
    * Getting file with size: 229376897
    { [data not shown]
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  218M  100  218M    0     0   182k      0  0:20:28  0:20:28 --:--:--     0* FTP response timeout
    * control connection looks dead
    100  218M  100  218M    0     0   182k      0  0:20:29  0:20:29 --:--:--     0* Connection #0 to host some_server left intact
    
    curl: (28) FTP response timeout
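    * Closing connection #0

    And the same download with wget, which reports the control connection as closed, retries, and then finds nothing left to download: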
    

    --2009-07-09 11:32:23--  ftp://some_server/some_file
               => `some_file'
    Resolving some_server... 0.0.0.0
    Connecting to some_server|0.0.0.0|:21... connected.
    Logging in as some_user ... Logged in!
    ==> SYST ... done.    ==> PWD ... done.
    ==> TYPE I ... done.  ==> CWD not needed.
    ==> SIZE some_file ... 229376897
    ==> PASV ... done.    ==> RETR some_file ... done.
    Length: 229376897 (219M)
    
    100%[==========================================================>] 229,376,897  387K/s   in 18m 54s 
    
    2009-07-09 11:51:17 (198 KB/s) - Control connection closed.
    Retrying.
    
    --2009-07-09 12:06:18--  ftp://some_server/some_file
      (try: 2) => `some_file'
    Connecting to some_server|0.0.0.0|:21... connected.
    Logging in as some_user ... Logged in!
    ==> SYST ... done.    ==> PWD ... done.
    ==> TYPE I ... done.  ==> CWD not needed.
    ==> SIZE some_file ... 229376897
    ==> PASV ... done.    ==> REST 229376897 ... done.    
    ==> RETR some_file ... done.
    Length: 229376897 (219M), 0 (0) remaining
    
    100%[+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++] 229,376,897 --.-K/s   in 0s      
    
    2009-07-09 12:06:18 (0.00 B/s) - `some_file' saved [229376897]
    
    2 answers  |  as of 16 years ago
        1
  •  0
  •   John Fouhy    16 years ago

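    I think some debugging might be useful here. Could you fold the class below into your code? (I haven't done that myself, because I know this version works and didn't want to risk introducing an error. You should be able to put the class at the top of your file and replace the body of your loop with what I've written after # LOOP BODY.)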

    class CounterFile:
        """Wrap a file object and report how many bytes have been written."""

        def __init__(self, file, maxsize):
            self.file = file
            self.count = 0
            self.maxsize = maxsize

        def write(self, data):
            self.count += len(data)
            print("total %d bytes / %d" % (self.count, self.maxsize))
            if self.count == self.maxsize:
                print("   Should be complete")
            self.file.write(data)


    from ftplib import FTP
    ftp = FTP('ftp.gimp.org')
    ftp.login('ftp', 'thouis@gmail.com')
    ftp.set_debuglevel(2)

    ftp.cwd('/pub/gimp/v2.6/')
    fname = 'gimp-2.6.2.tar.bz2'

    # LOOP BODY
    sz = ftp.size(fname)
    if sz is None:
        print("Could not get size!")
        sz = 0
    with open(fname, 'wb') as f:
        ret_status = ftp.retrbinary('RETR ' + fname, CounterFile(f, sz).write)
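
    If the counter shows that every byte arrived and the hang only happens afterwards, one possible way to cope is to give the connection a timeout and then decide from the file size whether the download really finished. The following is a minimal sketch, not something tested against your server: `retrieve_tolerating_hang` is a hypothetical helper name, and it assumes `ftp` was created with a timeout (for example `FTP(host, timeout=60)`) and that the server answers `SIZE`.

```python
import os
import socket


def retrieve_tolerating_hang(ftp, fname, local_path=None):
    """Download fname; accept a timeout if the whole file is already on disk.

    Assumption: `ftp` is an ftplib.FTP instance created with a timeout,
    so a dead control connection raises instead of blocking forever.
    Returns True if the server confirmed the transfer, False if the file
    is complete but the final reply never arrived.
    """
    local_path = local_path or fname
    ftp.voidcmd('TYPE I')       # some servers only answer SIZE in binary mode
    expected = ftp.size(fname)  # None if the server gives no usable size
    try:
        with open(local_path, 'wb') as f:
            ftp.retrbinary('RETR ' + fname, f.write)
    except (socket.timeout, EOFError):
        # The final reply never came (timeout) or the control connection
        # was closed (EOFError). Accept the file only if every expected
        # byte is on disk; otherwise let the error propagate.
        if expected is None or os.path.getsize(local_path) != expected:
            raise
        return False
    return True
```

    A `False` return means the file is complete but the server never confirmed it, so it is probably safest to close that connection and log in again before fetching the next file.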
    
        2
  •  0
  •   user227667    16 years ago

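    I've never used ftplib, but maybe you could do something like this:

    1. Get the name and size of the file you want.
    2. Start a new daemon thread to download the file.
    3. In the main thread, check every few seconds whether the size of the file on disk equals the target size.
    4. When it does, wait a few seconds to give the connection a chance to close cleanly, then exit the program.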
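
    The four steps above could be sketched roughly as follows. This is only an illustration under some assumptions: `download_with_watchdog` and its parameters are hypothetical names, and `download` stands for any zero-argument callable that writes the file to `local_path` (with ftplib it would wrap `retrbinary`). The file should be written unbuffered or flushed after every block, otherwise the size on disk can lag behind what has actually been received.

```python
import os
import threading
import time


def download_with_watchdog(download, local_path, expected_size,
                           poll_interval=2.0, grace=5.0, give_up_after=3600.0):
    """Run download() in a daemon thread and poll the file size on disk.

    Returns True once local_path holds at least expected_size bytes,
    False if the worker stops early or give_up_after seconds pass.
    """
    # Step 2: the daemon flag keeps a hung transfer from blocking exit.
    worker = threading.Thread(target=download, daemon=True)
    worker.start()

    deadline = time.monotonic() + give_up_after
    while time.monotonic() < deadline:
        # Sample liveness before the size, so a worker that finishes
        # between the two checks is not mistaken for a failure.
        alive = worker.is_alive()
        size = os.path.getsize(local_path) if os.path.exists(local_path) else 0
        if size >= expected_size:
            # Step 4: give the connection a moment to close by itself.
            worker.join(timeout=grace)
            return True
        if not alive:
            return False  # the worker ended without producing the whole file
        time.sleep(poll_interval)  # Step 3: check again in a little while
    return False
```

    With ftplib, `download` could be a small function that opens the file with `open(fname, 'wb', buffering=0)` and passes its `write` method to `ftp.retrbinary('RETR ' + fname, ...)`, with `expected_size` taken from `ftp.size(fname)`. Because the worker is a daemon thread, a hung transfer will not keep the interpreter alive once the main thread exits.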