Hrm, packages larger than 50MB in sid/i386 (main, contrib and non-free) [0]:

Eclipse NLS: (76MB)
    76498868 eclipse-platform-nls

Software: (104MB) (not arch:all)
   104032026 gcc-snapshot

Clipart: (144MB)
   144354894 openclipart-png

Documentation: (147MB)
    52801680 context-doc-nonfree
    94869090 koffice-doc

TeX: (192MB)
    56200406 texlive-fonts-extra
    56559816 tetex-src
    79907246 texlive-latex-extra

Debug packages: (369MB) (not arch:all)
    53959746 boson-dbg
    55430908 icedove-dbg
    56274922 koffice-dbg
    57383550 libqt4-debug
    59787420 iceape-dbg
    86404478 libgl1-mesa-dri-dbg

Game data: (1,413MB)
    52247006 nexuiz-music
    52552812 scorched3d-data
    60332050 vegastrike-music
    63569974 fillets-ng-data
    69398222 beneath-a-steel-sky
    75266604 openarena-data
    86669382 torcs-data-tracks
   100913422 tremulous-data
   103359260 freedroidrpg-data
   132611332 vdrift-full
   138455838 nexuiz-data
   140544192 vegastrike-data
   161313228 fgfs-base
   176337384 sauerbraten-data

The largest deb in unstable is sauerbraten-data at 176MB; the largest
arch-specific deb is atlas3-test for ia64 at 158MB. The total size of debs
in sid/i386 is 17,259MB, and the packages listed above add up to about
2,448MB, so packages above 50MB already make up about 14% of the archive
by size.
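
For what it's worth, the share above can be checked with a variation of
[0]. This is only a sketch: big_share is a made-up name, it reads
uncompressed Packages stanzas on stdin, assumes every stanza carries a
Size: field, and counts 1MB as 1,000,000 bytes as in the figures above.

```shell
# big_share [THRESHOLD]: hypothetical helper, not part of any Debian tool.
# Reads Packages stanzas on stdin and prints three fields: total bytes,
# bytes in debs larger than THRESHOLD (default 50000000), and the second
# as a percentage of the first.
big_share() {
    awk -v T="${1:-50000000}" '
        # Add the current stanza to the running totals, then forget it.
        function flush() {
            if (S != "") { total += S; if (S + 0 > T) big += S }
            S = ""
        }
        /^Size:/ { S = $2 }
        /^$/     { flush() }      # a blank line ends a stanza
        END      {
            flush()               # last stanza may lack a trailing blank line
            if (total > 0)
                printf "%.0f %.0f %.1f%%\n", total, big, 100 * big / total
        }
    '
}
```

Something like "zcat dists/sid/main/binary-i386/Packages.gz | big_share"
would give the figure for one component; the numbers quoted above cover
main, contrib and non-free together.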

Moving game data elsewhere would require some way for games in main to
depend on data elsewhere.

Moving -dbg data elsewhere would presumably require some way to upload
to both archives in a single step, and we'd want to keep them reasonably
tightly coupled.

Not moving -dbg stuff elsewhere would mean there'd be no need to worry
about supporting arch-specific stuff, afaics. OTOH, if we did support
arch-specific stuff, maybe it'd make sense to move the game engines
into the other area along with the data if there's no point to having
the engine without a huge amount of data to use it with.

> The advantages would be: [...]
> - make it possible to not include such data on the regular binary CDs,
>   but for example on separate arch-independent "data" CDs

Particularly for game data, it seems like it'd make more sense (at least
from a user's pov) to include the game code and the game data on the same
CD, given they'd always be installed at the same time. I've no idea if
that could realistically be done, or if there's any point thinking about
it 'til later though.


[0] find dists/sid -name Packages.gz |
       xargs zcat |
       awk '/^Package:/ {P=$2} /^Architecture:/ {A=$2} /^Size:/ {S=$2} /^$/ {if (S > 50000000) { print S, P, A }}' |
       sort -n | uniq | less
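
As written, [0] walks the Packages.gz of every architecture under
dists/sid. A rough sketch of restricting it to a single architecture
follows. It assumes the usual dists/<suite>/<component>/binary-<arch>/
layout and that each Packages file ends with a blank line; big_debs is a
hypothetical name, and gzip -dc stands in for zcat. It would also pick up
any debian-installer indices below the given directory.

```shell
# big_debs ROOT ARCH [THRESHOLD]: hypothetical wrapper around [0].
# Lists "size package architecture" for every deb larger than THRESHOLD
# bytes (default 50000000) found in ROOT's binary-ARCH Packages.gz files.
big_debs() {
    find "$1" -path "*/binary-$2/Packages.gz" -exec gzip -dc {} + |
        awk -v T="${3:-50000000}" '
            /^Package:/      { P = $2 }
            /^Architecture:/ { A = $2 }
            /^Size:/         { S = $2 }
            # A blank line ends the stanza: report it if large, then reset S.
            /^$/             { if (S + 0 > T) print S, P, A; S = "" }
        ' |
        sort -n | uniq
}
```

For example, "big_debs dists/sid i386 | less" should give the i386 view,
with arch:all packages included since they are listed in each
architecture's Packages file.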