Version: | 2.1.1 |
---|---|
Date: | 2006-08-19 |
Summary: | high-level FTP client library for Python |
Keywords: | FTP, ftplib substitute, virtual filesystem, pure Python |
Author: | Stefan Schwarzer <sschwarzer@sschwarzer.net> |
Russian translation: | Anton Stepanov <antymail@mail.ru> |
The ftputil module is a high-level interface to the ftplib module. The FTPHost objects generated from it allow many operations similar to those of os, os.path and shutil.
Examples:
    import ftputil

    # download some files from the login directory
    host = ftputil.FTPHost('ftp.domain.com', 'user', 'password')
    names = host.listdir(host.curdir)
    for name in names:
        if host.path.isfile(name):
            host.download(name, name, 'b')  # remote, local, binary mode

    # make a new directory and copy a remote file into it
    host.mkdir('newdir')
    source = host.file('index.html', 'r')  # file-like object
    target = host.file('newdir/index.html', 'w')  # file-like object
    host.copyfileobj(source, target)  # similar to shutil.copyfileobj
    source.close()
    target.close()
Also, there are FTPHost.lstat and FTPHost.stat to request size and modification time of a file. The latter can also follow links, similar to os.stat. Even FTPHost.walk and FTPHost.path.walk work.
The distribution contains a custom UserTuple module to provide stat results with Python versions 2.0 and 2.1.
The exceptions are in the namespace of the ftp_error module, e. g. ftp_error.TemporaryError. Getting the exception classes from the "package module" ftputil is deprecated.
The exceptions are organized as follows:
    FTPError
        FTPOSError(FTPError, OSError)
            PermanentError(FTPOSError)
            TemporaryError(FTPOSError)
        FTPIOError(FTPError)
        InternalError(FTPError)
            InaccessibleLoginDirError(InternalError)
            ParserError(InternalError)
            RootDirError(InternalError)
            TimeShiftError(InternalError)
and are described here:
FTPError
is the root of the exception hierarchy of the module.
FTPOSError
is derived from OSError. This is for similarity between the os module and FTPHost objects. Compare
    try:
        os.chdir('nonexisting_directory')
    except OSError:
        ...
with
    host = ftputil.FTPHost('host', 'user', 'password')
    try:
        host.chdir('nonexisting_directory')
    except OSError:
        ...
Imagine a function
    def func(path, file):
        ...
which works on the local file system and catches OSErrors. If you change the parameter list to
    def func(path, file, os=os):
        ...
where os denotes the os module, you can call the function also as
    host = ftputil.FTPHost('host', 'user', 'password')
    func(path, file, os=host)
to use the same code for a local and remote file system. Another similarity between OSError and FTPOSError is that the latter holds the FTP server return code in the errno attribute of the exception object and the error text in strerror.
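The pattern described above can be tried without an FTP server. The following is a minimal sketch, not ftputil code: FakeHost and FakePermanentError are hypothetical stand-ins for an FTPHost object and an exception derived from OSError, and describe is an illustrative function name.

```python
import os
import tempfile


class FakePermanentError(OSError):
    """Hypothetical stand-in for an FTP error class derived from OSError."""


class FakeHost:
    """Hypothetical object offering the same listdir call as the os module."""

    def listdir(self, path):
        # Mimic a server reply: return code in errno, message in strerror.
        raise FakePermanentError(550, "%s: no such directory" % path)


def describe(path, os=os):
    """Summarize a directory; `os` may be the os module or a host-like object."""
    try:
        return "%d entries" % len(os.listdir(path))
    except OSError as error:
        return "error %s: %s" % (error.errno, error.strerror)


# Same function, local file system
with tempfile.TemporaryDirectory() as directory:
    local_result = describe(directory)

# Same function, stand-in for a remote host
remote_result = describe("missing", os=FakeHost())
```

Because the stand-in exception derives from OSError, the single except clause handles both cases, and the code and text are available through errno and strerror as described for FTPOSError.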
PermanentError
is raised for 5xx return codes from the FTP server (again, that's similar but not identical to ftplib.error_perm).
TemporaryError
is raised for FTP return codes from the 4xx category. This corresponds to ftplib.error_temp (though TemporaryError and ftplib.error_temp are not identical).
FTPIOError
denotes an I/O error on the remote host. This appears mainly with file-like objects which are retrieved by invoking FTPHost.file (FTPHost.open is an alias). Compare
    >>> try:
    ...     f = open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    2
    No such file or directory
with
    >>> host = ftputil.FTPHost('host', 'user', 'password')
    >>> try:
    ...     f = host.open('not_there')
    ... except IOError, obj:
    ...     print obj.errno
    ...     print obj.strerror
    ...
    550
    550 not_there: No such file or directory.
As you can see, both code snippets are similar. (However, the error codes aren't the same.)
InternalError
subsumes exception classes for signaling errors due to limitations of the FTP protocol or the concrete implementation of ftputil.
InaccessibleLoginDirError
This exception is only raised if both of the following conditions are met:
ParserError
is used for errors during the parsing of directory listings from the server. This exception is used by the FTPHost methods stat, lstat, and listdir.
RootDirError
Because of the implementation of the lstat method it is not possible to do a stat call on the root directory /. If you know any way to do it, please let me know. :-)
This problem does not affect stat calls on items in the root directory.
TimeShiftError
is used to denote errors which relate to setting the time shift, for example trying to set a value which is not a multiple of a full hour.
FTPHost instances may be generated with the following call:
host = ftputil.FTPHost(host, user, password, account, session_factory=ftplib.FTP)
The first four parameters are strings with the same meaning as for the FTP class in the ftplib module. The keyword argument session_factory may be used to generate FTP connections with other factories than the default ftplib.FTP. For example, the M2Crypto distribution uses a secure FTP class which is derived from ftplib.FTP.
In fact, all positional and keyword arguments other than session_factory are passed to the factory to generate a new background session (which happens for every remote file that is opened; see below).
This functionality of the constructor also allows you to wrap ftplib.FTP objects to do something that wouldn't be possible with the ftplib.FTP constructor alone.
As an example, assume you want to connect to a port other than the default, but ftplib.FTP only offers this by means of its connect method, not via its constructor. The solution is to provide a wrapper class:
    import ftplib
    import ftputil

    EXAMPLE_PORT = 50001

    class MySession(ftplib.FTP):
        def __init__(self, host, userid, password, port):
            """Act like ftplib.FTP's constructor but connect to other port."""
            ftplib.FTP.__init__(self)
            self.connect(host, port)
            self.login(userid, password)

    # try not to use MySession() as factory, - use the class itself
    host = ftputil.FTPHost(host, userid, password,
                           port=EXAMPLE_PORT, session_factory=MySession)
    # use `host` as usual
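The argument forwarding described above can be modelled in a few lines. This is a toy sketch under stated assumptions, not ftputil's implementation: ToyHost and RecordingSession are hypothetical names, and the session class only records what it receives instead of opening a connection.

```python
class RecordingSession:
    """Hypothetical stand-in for ftplib.FTP; it only records its arguments."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class ToyHost:
    """Toy model of the forwarding described in the text; NOT ftputil code."""

    def __init__(self, *args, **kwargs):
        # Take the factory out of the keyword arguments ...
        self._factory = kwargs.pop("session_factory", RecordingSession)
        # ... and remember everything else for later sessions.
        self._args = args
        self._kwargs = kwargs
        self.session = self.make_session()

    def make_session(self):
        # Every new session gets the same positional and keyword arguments.
        return self._factory(*self._args, **self._kwargs)


host = ToyHost("ftp.example.com", "user", "secret",
               port=50001, session_factory=RecordingSession)
```

In this model the port keyword reaches the factory untouched, which is why a wrapper class such as MySession can accept it even though the host object itself knows nothing about ports.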
On login, the format of the directory listings (needed for stat'ing files and directories) should be determined automatically. If not, please file a bug.
curdir, pardir, sep
are strings which denote the current and the parent directory on the remote server. sep identifies the path separator. Though RFC 959 (File Transfer Protocol) notes that these values may depend on the FTP server implementation, the Unix counterparts seem to work well in practice, even for non-Unix servers.
upload(source, target, mode='')
copies a local source file (given by a filename, i. e. a string) to the remote host under the name target. Both source and target may be absolute paths or relative to their corresponding current directory (on the local or the remote host, respectively). The mode may be "" or "a" for ASCII uploads or "b" for binary uploads. ASCII mode is the default (again, similar to regular local file objects).
download(source, target, mode='')
performs a download from the remote source to a target file. Both source and target are strings. Additionally, the description of the upload method applies here, too.
upload_if_newer(source, target, mode='')
is similar to the upload method. The only difference is that the upload is only invoked if the time of the last modification for the source file is more recent than that of the target file, or the target doesn't exist at all. If an upload actually happened, the return value is a true value, else a false value.
Note that this method only checks the existence and/or the modification time of the source and target file; it can't recognize a change in the transfer mode, e. g.
    # transfer in ASCII mode
    host.upload_if_newer('source_file', 'target_file', 'a')
    # won't transfer the file again, which is bad!
    host.upload_if_newer('source_file', 'target_file', 'b')
Similarly, if a transfer is interrupted, the remote file will have a newer modification time than the local file, and thus the transfer won't be repeated if upload_if_newer is used a second time. There are (at least) two possibilities after a failed upload:
If it seems that a file is uploaded unnecessarily, read the subsection on time shift settings.
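The decision rule of upload_if_newer can be illustrated with two local files. This is a minimal sketch, assuming plain modification-time comparison and no time shift; copy_if_newer is a hypothetical helper and not part of ftputil.

```python
import os
import shutil


def copy_if_newer(source, target):
    """Copy source to target only if target is missing or older than source.

    Returns True if a copy actually happened, else False.
    """
    if (os.path.exists(target)
            and os.path.getmtime(source) <= os.path.getmtime(target)):
        # Target exists and is at least as recent as the source: nothing to do.
        return False
    shutil.copyfile(source, target)
    return True
```

Like the method described above, the helper looks only at existence and modification times, so it shares the same blind spots: it notices neither a changed transfer mode nor an incomplete target left behind by an interrupted transfer.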
download_if_newer(source, target, mode='')
corresponds to upload_if_newer but performs a download from the server to the local host. Read the descriptions of download and upload_if_newer for more. If a download actually happened, the return value is a true value, else a false value.
If it seems that a file is downloaded unnecessarily, read the subsection on time shift settings.
set_time_shift(time_shift)
sets the so-called time shift value (measured in seconds). The time shift is the difference between the local time of the server and the local time of the client at a given moment, i. e. by definition
time_shift = server_time - client_time
Setting this value is important if upload_if_newer and download_if_newer should work correctly even if the time zone of the FTP server differs from that of the client (where ftputil runs). Note that the time shift value can be negative.
If the time shift value is invalid, e. g. not a multiple of a full hour, or with an absolute value larger than 24 hours, a TimeShiftError is raised.
See also synchronize_times for a way to set the time shift with a simple method call.
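The definition and the validity rules above can be written down as a short sketch. The function names are hypothetical, and ValueError stands in for TimeShiftError so that the example runs without ftputil.

```python
HOUR = 60 * 60  # seconds


def check_time_shift(time_shift):
    """Return time_shift if it is a whole number of hours within +/- 24 hours."""
    if time_shift % HOUR != 0:
        raise ValueError("time shift is not a multiple of a full hour")
    if abs(time_shift) > 24 * HOUR:
        raise ValueError("absolute time shift is larger than 24 hours")
    return time_shift


def server_to_client_time(server_time, time_shift):
    """Apply time_shift = server_time - client_time, solved for client_time."""
    return server_time - time_shift


# Server clock two hours behind the client: the shift is negative.
shift = check_time_shift(-2 * HOUR)
client_time = server_to_client_time(1000000, shift)
```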
time_shift()
returns the currently set time shift value. See set_time_shift (above) for its definition.
synchronize_times()
synchronizes the local times of the server and the client, so that upload_if_newer and download_if_newer work as expected, even if the client and the server are in different time zones. For this to work, all of the following conditions must be true:
If you can't fulfill these conditions, you can nevertheless set the time shift value manually with set_time_shift. Trying to call synchronize_times if the above conditions aren't true results in a TimeShiftError exception.
mkdir(path, [mode])
makes the given directory on the remote host. This doesn't construct "intermediate" directories which don't already exist. The mode parameter is ignored; this is for compatibility with os.mkdir if an FTPHost object is passed into a function instead of the os module (see the subsection on Python exceptions above for an explanation).
makedirs(path, [mode])
works similarly to mkdir (see above), but also makes intermediate directories, like os.makedirs. The mode parameter is only there for compatibility with os.makedirs and is ignored.
rmdir(path)
removes the given remote directory. If it's not empty, a PermanentError is raised.
rmtree(path, ignore_errors=False, onerror=None)
removes the given remote, possibly non-empty, directory tree. The interface of this method is rather complex, in favor of compatibility with shutil.rmtree.
If ignore_errors is set to a true value, errors are ignored. If ignore_errors is a false value and onerror isn't set, all exceptions occurring during the tree iteration and processing are raised. These exceptions are all of type PermanentError.
To distinguish between error situations, pass in a callable for onerror. This callable must accept three arguments: func, path and exc_info. func is a bound method object, for example your_host_object.listdir. path is the path that was the most recent argument of the respective method (listdir, remove, rmdir). exc_info is the exception info as returned by sys.exc_info.
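A callable for onerror might look like the following sketch. It runs without ftputil: the failing remove function and FakePermanentError are hypothetical stand-ins (in real use func would be a bound method of the host object), and the handler is invoked by hand to show the three arguments.

```python
import sys

failures = []


class FakePermanentError(OSError):
    """Hypothetical stand-in for the PermanentError mentioned in the text."""


def log_failure(func, path, exc_info):
    """Record which operation failed on which path instead of raising."""
    exc_type, exc_value, _traceback = exc_info
    failures.append((func.__name__, path, exc_type.__name__))


def remove(path):
    """Stand-in for a host method that fails."""
    raise FakePermanentError(550, "%s: permission denied" % path)


# Simulate what a tree removal would do when one step fails.
try:
    remove("docs/readme.txt")
except FakePermanentError:
    log_failure(remove, "docs/readme.txt", sys.exc_info())
```

Collecting the failures in a list lets the caller decide after the whole tree has been processed whether any of them matter.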
The code of rmtree is taken from Python's shutil module and adapted for ftputil.
Note: I find this interface rather complicated and would like to simplify it without making error handling too difficult. Possible changes to rmtree will depend on the discussion between the versions 2.1b and 2.1.
remove(path)
removes a file or link on the remote host (similar to os.remove).
unlink(path)
is an alias for remove.
listdir(path)
returns a list containing the names of the files and directories in the given path; similar to os.listdir. The special names . and .. are not in the list.
The methods lstat and stat (and others) rely on the directory listing format used by the FTP server. When connecting to a host, FTPHost's constructor tries to guess the right format, which mostly succeeds. However, if you get strange results or ParserError exceptions by a mere lstat call, please file a bug.
If lstat or stat yield wrong modification dates or times, look at the methods that deal with time zone differences (time shift).
lstat(path)
returns an object similar to that from os.lstat (a "tuple" with additional attributes; see the documentation of the os module for details). However, due to the nature of the application, there are some important aspects to keep in mind:
Currently, ftputil recognizes the common Unix-style and Microsoft/DOS-style directory formats. If you need to parse output from another server type, please write to the ftputil mailing list.
FTPHost objects contain an attribute named path, similar to os.path. The following methods can be applied to the remote host with the same semantics as for os.path:
    abspath(path)
    basename(path)
    commonprefix(path_list)
    dirname(path)
    exists(path)
    getmtime(path)
    getsize(path)
    isabs(path)
    isdir(path)
    isfile(path)
    islink(path)
    join(path1, path2, ...)
    normcase(path)
    normpath(path)
    split(path)
    splitdrive(path)
    splitext(path)
    walk(path, func, arg)
walk(top, topdown=True, onerror=None)
iterates over a directory tree, similar to os.walk in Python 2.3 and above. Actually, FTPHost.walk uses the code from Python with just the necessary modifications, so see the documentation of os.walk.
Unlike the rest of ftputil, which should work with every Python version from 2.0 on, the walk method requires generators and thus will only work with Python 2.2 and above.
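Since walk is described as following os.walk, the iteration pattern can be tried locally first. The sketch below uses only the standard library; the fs parameter is an assumption that follows the os=host idea from the exceptions section, so a host object offering walk and path.join could be passed in its place.

```python
import os
import tempfile


def list_files(top, fs=os):
    """Return the sorted paths of all files below top."""
    found = []
    for dirpath, dirnames, filenames in fs.walk(top):
        for name in filenames:
            found.append(fs.path.join(dirpath, name))
    return sorted(found)


# Build a small tree and walk it.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for relative in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(top, relative), "w") as handle:
            handle.write("example")
    expected = sorted([os.path.join(top, "a.txt"),
                       os.path.join(top, "sub", "b.txt")])
    walked = list_files(top)
```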
path.walk(path, func, arg)
For Python 2.1 the walk method in FTPHost.path can be used.
close()
closes the connection to the remote host. After this, no more interaction with the FTP server is possible without using a new FTPHost object.
rename(source, target)
renames the source file (or directory) on the FTP server.
copyfileobj(source, target, length=64*1024)
copies the contents from the file-like object source to the file-like object target. The only difference from shutil.copyfileobj is the default buffer size. Note that arbitrary file-like objects can be used as arguments (e. g. local files, remote FTP files). See File-like objects for construction and use of remote file-like objects.
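What a chunked copy between file-like objects does can be sketched as follows. This is an illustration in the spirit of shutil.copyfileobj, not ftputil's own code; copy_in_chunks is a hypothetical name, and in-memory buffers stand in for local or remote files.

```python
import io


def copy_in_chunks(source, target, length=64 * 1024):
    """Copy data from source to target, reading at most length bytes at a time."""
    while True:
        chunk = source.read(length)
        if not chunk:
            # An empty read signals the end of the source.
            break
        target.write(chunk)


source = io.BytesIO(b"x" * 200000)
target = io.BytesIO()
copy_in_chunks(source, target)
```

Any object with a suitable read method can serve as source and any object with a write method as target, which is why local and remote files can be mixed freely.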
FTPFile objects are returned by a call to FTPHost.file (or FTPHost.open).
FTPHost.file(path, mode='r')
returns a file-like object that refers to the path on the remote host. This path may be absolute or relative to the current directory on the remote host (this directory can be determined with the getcwd method). As with local file objects the default mode is "r", i. e. reading text files. Valid modes are "r", "rb", "w", and "wb".
FTPHost.open(path, mode='r')
is an alias for file (see above).
The methods
    close()
    read([count])
    readline([count])
    readlines()
    write(data)
    writelines(string_sequence)
    xreadlines()
and the attribute closed have the same semantics as for file objects of a local disk file system. For details, see the section File objects in the Library Reference.
Note that ftputil supports both binary mode and text mode with the appropriate line ending conversions.
See the download page. Announcements will be sent to the mailing list. Announcements on major updates will also be posted to the newsgroup comp.lang.python .
Yes, please visit http://ftputil.sschwarzer.net/mailinglist to subscribe or read the archives.
Before reporting a bug, make sure that you already tried the latest version of ftputil. There the bug might have already been fixed.
Please see http://ftputil.sschwarzer.net/issuetrackernotes for guidelines on entering a bug in ftputil's ticket system. If you are unsure if the behaviour you found is a bug or not, you can write to the ftputil mailing list. In either case you must not include confidential information (user id, password, file names, etc.) in the problem report! Be careful!
By default, an instantiated FTPHost object connects on the usual FTP ports. If you have to use a different port, refer to the section FTPHost construction.
You can use the same approach to connect in active or passive mode, as you like.
Use a wrapper class for ftplib.FTP, as described in section FTPHost construction:
    import ftplib

    class ActiveFTPSession(ftplib.FTP):
        def __init__(self, host, userid, password):
            """
            Act like ftplib.FTP's constructor but use active mode
            explicitly.
            """
            ftplib.FTP.__init__(self)
            # connect on the default FTP port; this constructor takes no port
            self.connect(host)
            self.login(userid, password)
            # see http://docs.python.org/lib/ftp-objects.html
            self.set_pasv(False)
Use this class as the session_factory argument in FTPHost's constructor.
You may find that ftputil uploads or downloads files unnecessarily, or not when it should. This can happen when the FTP server is in a different time zone than the client on which ftputil runs. Please see the section on setting the time shift. It may even be sufficient to call synchronize_times.
Please see the previous tip.
Please send an email with your problem report or question to the ftputil mailing list, and we'll see what we can do for you. :-)
If not overridden via installation options, the ftputil files reside in the ftputil package. The documentation (in reStructuredText and in HTML format) is in the same directory.
The files _test_*.py and _mock_ftplib.py are for unit-testing. If you only use ftputil (i. e. don't modify it), you can delete these files.
ftputil is written by Stefan Schwarzer <sschwarzer@sschwarzer.net>.
Feedback is appreciated. :-)