This page describes a method used by DebTorrent for solving one of the problems of using a bittorrent-like program to serve a constantly updating archive of files, like the testing and unstable suites of the debian archive. These suites are currently updated twice daily, with plans to update more frequently in the future.

The problem this creates is that a traditional torrent created for files in the archive is identified by an infohash (an SHA1 hash of the file names, sizes and SHA1 hashes of the files). If a single file in the torrent changes, or a new one is added, or an old one removed, the infohash will change, which will create a new torrent and swarm of peers. The peers connected to this torrent will not be able to connect to the peers from the previous torrent, even though they have 99% of the files in common. This will fracture the downloading population into many torrents of very small size, reducing the efficiency of the download (bittorrent is most efficient when the swarm of peers is very large).

To solve this problem requires some fundamental changes to the way bittorrent operates: This was implemented in version 0.1.6, and the current DebTorrent works this way.

The following sections contain more information on some of these changes.

Creating the Torrent

There are several possibilities for choosing the information to hash to create the torrent identifier (infohash). It should be specific to the torrent, and so should contain information about the files in it. Examples are, the fields from the Release file, such as origin, label, codename (sid), version, and suite (unstable); also, the component (contrib) and architecture (amd64) are probably good choices. Additionally, it should be specific to this generation of the infohash, so the date and time of the generation should be included in the hash, as well as the maximum piece size used.

The unique piece numbers will first be assigned in the traditional bittorrent fashion when a new torrent is generated. The files will be sorted in order by their full path name (which is different than the order in the Packages file), and assigned piece numbers sequentially from 0 (zero). On subsequent modification of the files contained in the torrent, new files will be given piece numbers starting with the largest piece number already in the torrent. The new files will again be ordered by their full path name for this assignment, and given new piece numbers in that order. Files that are removed will have their piece numbers kept in reserve, resulting in gaps in the piece numbering of clients using the new torrent. The clients will know where these gaps are, but not the file names that go in them, nor the file's hashes. Modifying files without changing the file name is not allowed.

A good strategy for when to restart the unique piece numbering is necessary. Otherwise the piece numbers would grow forever until a lot of communication would be wasted sending bitfields that are mostly empty, and memory is wasted storing lists that are mostly empty. The strategy could be one of the following:

Probably the last strategy is the best, as it does not rely on any other factor than the amount of wastage in the torrent. See the discussion below for a more in depth analysis of when to restart piece numbering.

New Torrent Files

A new file will need to be distributed to peers to supplement the information available in Packages files to create this new torrent. These files (one per torrent) will contain the information needed by a DebTorrent client to identify the torrent to others, and to order the pieces appropriately for the download. This information will be in the RFC 2822 header field format, as used by the Release files. It will begin with the fields shown below:

The last entries in the torrent file will be a list of the file name and starting piece numbers for the files. This will have a field name of PieceNumbers, and will consist of one file per line, listing first the starting piece number (right justified) followed by the file name.

Here is an example of what the complete torrent file should look like:

Torrent: 1a8d7c18b3b5d89fedd0b113f251b6c1002b8710
OriginalDate: Sun, 08 Apr 2007 10:41:22 UTC
Date: Tue, 07 Aug 2007 22:53:09 UTC
PieceSize: 524288
NextPiece: 2486
OriginalPieces: 2460
Codename: sid
Suite: unstable
Component: contrib
Architecture: amd64
TorrentHashFields: Codename Suite Component Architecture PieceSize OriginalDate
    0 pool/contrib/a/acx100/acx100-source_20060521-3_amd64.deb
    1 pool/contrib/a/alien-arena/alien-arena_6.05-1_amd64.deb
   10 pool/contrib/a/alien-arena/alien-arena-dbg_6.05-1_amd64.deb
   11 pool/contrib/a/alien-arena/alien-arena-server_6.05-1_amd64.deb
   25 pool/contrib/a/alien-arena/alien-arena-server-dbg_6.05-1_amd64.deb
   36 pool/contrib/a/amoeba/amoeba_1.1-19_amd64.deb
   39 pool/contrib/a/arbortext-catalog/arbortext-catalog_1.01-3.3_amd64.deb
   41 pool/contrib/a/argouml/argouml-doc_0.19.6-2.1_amd64.deb
   57 pool/contrib/a/aspectj/aspectj_1.1.1-2_amd64.deb
 2456 pool/contrib/x/xmmplayer/xmms-xmmplayer_0.3.3-1+b1_amd64.deb
 2459 pool/contrib/x/xserver-xorg/xserver-xorg_0.10.6-1_amd64.deb
 2460 pool/contrib/a/arbortext-catalog/xt-catalog_1.01-3.3_amd64.deb
 2476 pool/contrib/x/xtrs/xtrs_4.9c-1_amd64.deb
 2477 pool/contrib/y/ydpdict/ydpdict_0.66-1_amd64.deb
 2478 pool/contrib/a/alsa-tools/alsa-firmware-loaders_1.0.14-1_amd64.deb
 2485 pool/contrib/a/argouml/argouml_0.19.6-2.1_amd64.deb

The actual torrent files for the testing (lenny) and unstable (sid) suites are available online, as well as a status page showing the growing size of the number of pieces.

Currently, the limitations of having these files kept separately from the mirrors and the Packages files that reference them requires some special handling. Since a user might have obtained a Packages file from a slightly out-of-date mirror, but then got the most current torrent file, the unique piece numberings need to be kept for a time until all mirrors can be sure to be lined up. Currently old references to unique piece numbers are kept until the torrent is regenerated, which allows out-of-date mirror users to extract the old information they need, and all users ignore piece numbers they aren't interested in. The restarting of a torrent will cause trouble though for out-of-date mirror users if they try to download a file that is no longer in the up-to-date archive. This file will not be in their torrent, since it is no longer listed in the unique piece numbering file, so trying to install it will result in a 'file not found' error. Hopefully in the future, this data will be kept with (or in) the Packages files, so as not to cause any problems (see 'Adding this information to the Packages/Release files' below).

Architecture:all packages

Since architecture:all packages are used by everybody, it is important to keep all the downloaders of them together in a single torrent, rather than splitting them up into the individual architecture torrents. Therefore, an extra torrent of only architecture:all packages will be created and added to, just as the other architecture torrents are. The architecture:all packages will then not be listed in the individual architecture torrents (i,e. i386). Creating the 'all' torrent is currently quite simple, since all architectures contain exactly the same architecture:all packages, the torrent can be created identically for each of them. However, this behavior can not be depended on in the future, so a better strategy is needed.

The architecture:all torrent will therefore be created from all of the other architecture's Packages files. It will contain an entry for each architecture:all file found in any of the Packages files. Thus, if an architecture:all file exists only in the i386 Packages file, clients downloading the 'all' torrent for other architectures will simply ignore this entry, and so the file will be in a piece gap just like removed files are. When new architecture:all files are added to the archive, these files are added to the end of the 'all' torrent just as for the others, the only exception being that all the new architecture:all files added to all the architectures will be added to the same 'all' torrent file.

Adding this information to the Packages/Release files

Eventually, it would be favorable to include a lot of the information in the torrent file in the standard Packages and Release files to avoid duplication and avoid possible errors in retrieving the data.

The largest part of the torrent file, each file's starting piece number, would be easily included in the Packages file using a single StartingPiece field. This would save a lot of storage space, as the largest part of the torrent file is mostly a duplication of the file names. This would also work for the architecture:all packages, except that their numbering would be separate from the architecture-specific files.

The remainder of the torrent data could be mostly included in the main Release file, including a multi-line field indicating the torrent identifier for each torrent. Alternatively, the torrent data could be put into the architecture specific torrent files (e.g. debian/dists/sid/main/binary-alpha/Release), as those files don't seem to be used for anything anyway. (Then a new Release file would be needed for the 'all' torrent.)

Peer Protocol

Implementing unique piece numbers means that a single torrent swarm will contain peers who have a different idea of how many pieces there are in the complete torrent. However, they both know exactly where the piece numbers that they do have in common belong in the torrent, even if one of them doesn't know exactly what file they belong to (but that's OK, since that peer doesn't have, nor want to download the unknown file anyway). It's important that peers do not drop these connections unnecessarily, which requires some modifications to the communication protocol between peers.

The rest of the protocol should be fine for these peers. A peer will never advertise that it has a piece of a file it doesn't know about, so it will never get a request to upload that piece (if it does, the connection should be dropped).

Compressed bitfields

Considering the already large, and now growing size of the bitfield, it may be beneficial to compress it before sending it to another peer to minimize the bandwidth. This would involve adding a new COMPRESSED-BITFIELD message, that would be used by a peer when sending a bitfield longer than 1000. The compressed bitfield would be created using the gzip compression module, and should result in a large savings since the bitfield will be mostly filled with 0's.


Restarting the unique piece numbering

An analysis of the increasing number of pieces used by this strategy has been performed on data from the Debian archive's testing and unstable suites for 3 months, from May 9th to August 7th, 2007. This data is similar to that shown here for the unstable suite, but contains more information as the number of new pieces can not be directly determined from the total size of the new files.

Starting the unique piece numbering strategy on May 9th, and adding pieces to the end of the torrents for these 3 months, would result in torrents with the number of pieces shown in the following 2 graphs (main = black, contrib = blue, non-free = red, architecture-specific = solid, architecture:all = dashed):

(Note: the reason for the lack of new pieces during May was a pending transition of glibc which occurred on June 1st.)

(Note: the one lonely solid black line is for main-hurd-i386, which has half as many pieces as the other architectures.)

The proposed restarting of piece numbering strategy is to wait until the torrent is half empty. Or, rather that the number of pieces in the torrent has approximately doubled from what it started at. To better understand the effects of this strategy, it is more informative to observe the percentage increase in the number of pieces for each torrent, which will prevent the large number of pieces in main from overwhelming the smaller torrents such as contrib/non-free. Here are the same 2 graphs, shown as a percentage increase (main = black, contrib = blue, non-free = red, architecture-specific = solid, architecture:all = dashed):

The large increase in one of the contrib (red) torrents is from the addition of a large package to the non-free s390 architecture, which more than doubled the size of the torrent. This shows the drawback of this approximation for the wasted number of pieces, since these pieces are not actually wasted, as they were never in the torrent to begin with and so left behind no gap.

Other than that discrepancy, the plan of restarting the piece numbering once a torrent reaches twice it's original size seems sound. For unstable, this would mean a torrent would last almost 2 months, while a testing torrent would probably last about twice as long. In both cases the total number of pieces would max out at about 60,000, though this may change if the archive size grows considerably.

The elimination of seeds

An important difference between this proposal and the standard bittorrent, is that there is no longer a possibility to be a seed. This is because, once a torrent has a piece gap, and new pieces added to it, it is no longer possible for a single peer to have all the pieces. This is probably not an issue though, since seeding is very difficult in a DebTorrent system already, due to the very large size of the archive, and the lack of a peer's interest in all of the packages. The only peers who would have all the latest pieces would be mirrors, but even they would lack any old pieces that have been removed.

Bittorrent only depends on seeding for the initial seed to upload pieces into the system. Since DebTorrent falls back to a backup HTTP download from a mirror, this initial seed requirement is no longer an issue, as the first person to request an updated package will become the seed for it in the system.

There are some optimizations, especially in the tracker, that will have to be rethought because of this. For example, the tracker usually separates seeds and leechers so as to not return other seeds to a seed when it requests peers to connect to.

Not your problem (Packages.gz rather than individual archives)

The issue of packages changing rapidly is an inherent problem that is common to all download methods - http, ftp, rsync, debtorrent, you name it. It is therefore "not your problem to solve". Debian has already recognised the seriousness of the issue, and has created rred, and a system for downloading package diffs, generated daily, for the testing, unstable and experimental archives.

The bottom line is this: attempting to solve this issue, which is common to all download methods, is not adviseable at this time - it's far too complicated an optimisation, and it is an optimisation, when there are more important issues to solve.