This page details a proposed new APT method for communicating between APT and the DebTorrent program. More information on APT methods can be found in the [http://packages.debian.org/libapt-pkg-doc libapt-pkg-doc package].

The Current State of Communication

Currently, the DebTorrent program uses APT's HTTP retrieval method to communicate with APT. It implements an almost complete proxy for downloading files from HTTP mirrors, with two exceptions. First, since it considers Packages files to be torrents, it notes when they are requested and starts the corresponding torrent. Second, when DebTorrent receives a request for a package file (which it identifies by extension), it finds the torrent that contains that file and begins downloading it using the DebTorrent protocol (i.e. not using HTTP). Once the download is complete, it passes the file on to APT as if it had been downloaded directly from the HTTP mirror.

The major problems with this method are:

To solve the first problem of slow startup of downloads, multiple packages need to be downloaded at once from the same torrent, without waiting for one to finish before starting another. This could be as simple as telling APT to pipeline multiple requests to DebTorrent, which would alleviate some of the problem.

The second problem is trickier, as APT is only aware of when downloads begin and end. Pipelined downloads may help, since more files will be starting and finishing, which makes the lack of progress information less noticeable to the user. However, BitTorrent downloads usually proceed in such a way that all the files complete at nearly the same time, at the end of the download. So, even with pipelined downloads, it may appear to the user that nothing is happening (at which point they may abort the download) until all the files suddenly arrive at the end.

Proposal A: Implement HTTP/1.1 Pipelining

The HTTP protocol already has functionality built into it to allow for [http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html pipelining multiple requests] over the same connection. The current DebTorrent APT request listener only implements HTTP/1.0, so some functionality would need to be added to support HTTP/1.1. The existing http method of APT could then be used to pipeline multiple requests. The APT http method has a configuration parameter (Acquire::http::Pipeline-Depth) that controls the maximum number of outstanding requests sent on a single connection. It defaults to 5, and the code caps it at 10. The HTTP protocol also specifies that the files returned over a pipelined connection must arrive in the same order as they were requested, since the HTTP server response does not identify the file, and APT's http method conforms to this.
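The in-order requirement means a finished file cannot be sent until every file requested before it has been sent. The following is a minimal Python sketch of that constraint, not DebTorrent's actual listener; the class name `InOrderResponder` and its methods are hypothetical, and it assumes each path is requested only once.

```python
from collections import OrderedDict


class InOrderResponder:
    """Release completed downloads strictly in request order (hypothetical helper)."""

    def __init__(self):
        # Maps request path -> response body, or None while still downloading.
        self._pending = OrderedDict()

    def request(self, path):
        """Register a pipelined request in arrival order."""
        self._pending[path] = None

    def complete(self, path, body):
        """Record a finished download and return the responses that may now be sent."""
        self._pending[path] = body
        ready = []
        # Only the head of the queue may be sent; stop at the first unfinished file.
        while self._pending:
            head, head_body = next(iter(self._pending.items()))
            if head_body is None:
                break
            self._pending.popitem(last=False)
            ready.append((head, head_body))
        return ready
```

If three files are requested and the third finishes first, `complete` returns an empty list for it. The file is held back until the first two have been sent, which is why a slow file at the head of the queue delays everything behind it.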

The advantages of this are:

The disadvantages are:

Proposal B: Modify APT's http Method

Considering the limitations of the pipelining possible under the standard HTTP protocol, APT's implementation of the protocol could be extended to support some of the desired features. APT could add non-standard headers to its requests to indicate to DebTorrent that it supports the extension, and DebTorrent could reply with non-standard headers indicating its own abilities to APT.
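One way such a header exchange might work is sketched below in Python. The proposal does not define any header or extension names, so `X-DebTorrent-Extensions` and the tokens `unordered` and `status` are purely hypothetical placeholders.

```python
# Hypothetical header name; the proposal does not define one.
EXT_HEADER = "X-DebTorrent-Extensions"


def parse_extensions(headers):
    """Return the set of extension tokens advertised in a header dict."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get(EXT_HEADER.lower(), "")
    return {token.strip().lower() for token in raw.split(",") if token.strip()}


def negotiate(request_headers, server_supported):
    """Return the reply header DebTorrent could send, or {} if nothing is shared."""
    common = parse_extensions(request_headers) & set(server_supported)
    if not common:
        return {}
    return {EXT_HEADER: ", ".join(sorted(common))}
```

A client that sends no such header gets no such header back, so an unmodified APT http method would continue to see ordinary HTTP behaviour.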

The advantages are:

The disadvantages are:

Proposal C: A New APT Method

Instead of using APT's http method, a new method will be developed for APT to use when requesting downloads of files (and possibly other information, see proposal D below) from DebTorrent. It will be indicated in the sources.list file by "debtorrent://...". The new method will be based on the current http method and will use the HTTP protocol to communicate with DebTorrent, which will allow DebTorrent to be accessed easily from other machines on the network. The debtorrent method will send all requests from APT to the DebTorrent program immediately, without waiting for any requests to complete. This "ultra-pipelining" will allow all the downloads to occur in parallel. The debtorrent method will then expect the downloaded files to be returned in an arbitrary order, and will pass them on in that order to the APT program (which supports an arbitrary ordering).
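The bookkeeping this implies on the method side can be sketched in Python as follows. This is only an illustration: the class `UnorderedTracker` is hypothetical, and it assumes that each reply from DebTorrent carries the URI it answers, which plain HTTP pipelining does not provide.

```python
class UnorderedTracker:
    """Track outstanding requests whose replies may arrive in any order (hypothetical)."""

    def __init__(self, on_done):
        self._outstanding = set()
        # Callback invoked as soon as a file arrives, e.g. to notify APT.
        self._on_done = on_done

    def request(self, uri):
        """Forward a request immediately, without waiting for earlier ones."""
        self._outstanding.add(uri)

    def complete(self, uri, body):
        """Match a reply to its request by URI and hand it over at once."""
        if uri not in self._outstanding:
            raise KeyError("reply for a URI that was never requested: %s" % uri)
        self._outstanding.remove(uri)
        self._on_done(uri, body)

    def outstanding(self):
        """Number of requests still waiting for a reply."""
        return len(self._outstanding)
```

Because a reply is matched by URI rather than by position, a file that finishes early is delivered immediately instead of waiting behind slower files.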

This new method could be added as a separate package (for an example, see the [http://packages.debian.org/apt-transport-https apt-transport-https package]), so adding the method itself does not require any changes to the APT code. However, some changes to the APT code are still needed to benefit from it, as the current APT code will only pass a maximum of 10 requests to a method at a time. For this method to improve on the HTTP/1.1 pipelining solution, that limit would have to be raised or eliminated.

The advantages are:

The disadvantages are:

Proposal D: A Status Protocol

To solve the second problem, the lack of status updates in APT, a special HTTP connection will be opened between the APT method and the DebTorrent program. This connection will communicate the status of ongoing downloads from the DebTorrent program to the APT method. However, for these status updates to be shown to the user, they would also need to be communicated from the method to the APT program. This would require some changes to the APT code, both to support one or two new messages in the method communication protocol and to use the information in these messages in the display shown to the user.

The format used to communicate the status has not yet been determined. Two candidates are XML-RPC and bencoded dictionaries.

The [http://www.xmlrpc.com/spec XML-RPC parameter format] is a good candidate for the communication format, as it is simple, well understood, operates over HTTP, and contains all the functionality needed for the status information. There is also a library available in the Python standard library for [http://docs.python.org/lib/node658.html creating and parsing XML-RPC strings from Python data].
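As a rough illustration, a status message could be serialised with the standard library as shown below. This sketch uses Python 3's `xmlrpc.client` module (the successor to the `xmlrpclib` module of Python 2); the field names and the file path in `status` are made up for the example, since the proposal does not define the contents of a status message.

```python
import xmlrpc.client

# Hypothetical status fields; the proposal does not specify them.
status = {
    "file": "pool/main/a/apt/apt.deb",
    "downloaded": 524288,
    "size": 1048576,
    "peers": 12,
}

# Serialise the dictionary as the single parameter of an XML-RPC response.
payload = xmlrpc.client.dumps((status,), methodresponse=True)

# Parse it back; loads() returns the parameter tuple and the method name,
# which is None for a response.
(decoded,), method_name = xmlrpc.client.loads(payload)
```

The payload is a plain XML string, so it can be inspected by eye or sent as the body of an HTTP response.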

Although the XML-RPC method does have the advantage of readability, it also involves a lot of overhead, both in transmitted bytes and in parsing time. A more efficient method, which is also more familiar to BitTorrent users, would be to use a [http://en.wikipedia.org/wiki/Bencode Bencoded dictionary] instead of XML. Since all torrent information is stored bencoded, BitTorrent clients already have the functionality needed to bencode/bdecode the Python types that the format supports (integers, strings, lists, and dictionaries). The disadvantage of this method is that bencoded data is not as human-readable as XML, and functionality would have to be added to the APT method to encode/decode it (though this is probably true of XML as well).
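To give a sense of how small the format is, here is a minimal bencode/bdecode sketch in Python 3. It is not the code used by DebTorrent or any BitTorrent client, and it assumes that text strings are encoded as UTF-8 and that decoded strings are returned as bytes.

```python
def bencode(value):
    """Encode ints, strings, lists, and dicts in bencoded form."""
    if isinstance(value, bool):
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def bdecode(data, pos=0):
    """Decode one value starting at pos; return (value, next position)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            result[key], pos = bdecode(data, pos)
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        start = colon + 1
        length = int(data[pos:colon])
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)
```

For example, `bencode({"size": 1048576, "file": "apt.deb"})` produces `b"d4:file7:apt.deb4:sizei1048576ee"`, which is considerably shorter than the equivalent XML-RPC document but harder to read at a glance.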

Summary

Four methods of improving the communication between DebTorrent and APT have been suggested. These proposals are not mutually exclusive, however, and some combination of them is probably the best solution. A combination of A (to support the most general situation), C (to solve the first problem), and D (to solve the second) may be ideal.