Most of this is currently outdated. We had an i18n BoF during DebConf8, where we discussed how this could actually work and be accepted by all of Debian, including the archive, dpkg and so on. The notes from this BoF are available at i18n/TranslationDebsDebconfMeeting.

Subsequent to that meeting, there was also a Debian QA / FTPMaster meeting in Extremadura in 2008 at which various elements of the Translation Deb support were agreed and collated into a DEP - Debian Enhancement Proposal.

As such, all future discussion of Translation Debs, TDebs and related issues needs to be done via the DEP:

Please see DEP-4 for the current TDeb status and discussion.

http://dep.debian.net/deps/dep4/

The following is retained for historical purposes only.


Translation debs, or tdebs, are a concept aimed at solving the i18n/TranslationDataDistribution problem in Debian. The concept was discussed at the 2006 I18N meeting in Extremadura, Spain, and at DebConf7 during the Wacky Ideas BoF. This page describes one implementation approach; other discussed implementations are available at i18n/TranslationDebsProposals.


Implementation design

Definition of tdebs

A tdeb is an ancillary package that contains all localization information corresponding to a {language,package} pair. It is also possible to group multiple translations of a package in one single package. The archive is a regular Debian package (with a different suffix).

This could contain all the types of localization material:

Note: the po-debconf localization material is somewhat special, since certain situations need to be handled: preinst and postrm scripts need to be localized, so the debconf translation needs to be in place and configured already when those scripts are run. For this reason, its inclusion in tdebs should be postponed until these problems are solved. (I welcome people to restate the problems we discussed about po-debconf, or any problems regarding split po-debconf translations.)

The filename format would be something like:

$PACK_$VER_all.$LANG.tdeb

So for the Romanian localization material for wormux_0.7.4-3, the tdeb package name would be wormux_0.7.4-3_all.ro.tdeb.
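
The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the helper names and the regular expression are assumptions made for this example and are not part of dpkg or of any agreed specification.

```python
import re

# Hypothetical pattern for $PACK_$VER_all.$LANG.tdeb; the exact set of
# characters allowed in each component is an assumption of this sketch.
TDEB_RE = re.compile(
    r"^(?P<package>[a-z0-9][a-z0-9+.-]+)_"
    r"(?P<version>[^_]+)_all\."
    r"(?P<lang>[A-Za-z_@]+)\.tdeb$"
)


def tdeb_filename(package, version, lang):
    """Build a tdeb file name from its three components."""
    return f"{package}_{version}_all.{lang}.tdeb"


def parse_tdeb_filename(filename):
    """Split a tdeb file name into (package, version, lang)."""
    match = TDEB_RE.match(filename)
    if match is None:
        raise ValueError(f"not a tdeb file name: {filename!r}")
    return match.group("package"), match.group("version"), match.group("lang")


print(tdeb_filename("wormux", "0.7.4-3", "ro"))
print(parse_tdeb_filename("wormux_0.7.4-3_all.ro.tdeb"))
```

The same pattern would also accept a grouped identifier such as weeu in the language position, since the proposal leaves the form of that identifier open.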

For the cases where multiple languages are grouped in one single tdeb, there is the possibility of using a different language identifier in the name (possibly the concatenated language codes, or special identifiers like weeu for western Europe).

Dependency handling

How do tdebs relate to regular debs? How to update translations in stable?

Since translations are (usually) not the cause of application problems, it would be nice to allow translation updates even after the release.

The following diagram shows how tdebs result from a package that is released (in stable) and translations are updated later.

regular_deb-src -+-(dpkg-bp)-+--> .deb packages (current debs without l10n material)
 (tdeb-ized)     |           |
                 |           +--> .tdeb packages (as many as l10n material exists)
                 |
                 +--(dpkg-gentdebsrc)---> (.dsc + .tar.gz) = tdeb-source package(s?) (for translation updates)
                                                                  |
                                                                  +--(dpkg-bp)--> .tdeb packages (newer)

dpkg-gentdebsrc is a tool that needs to be created. It creates a whole new Debian source package which contains only l10n material. This new source, if built with dpkg-buildpackage, should generate a new set of tdebs that supersede the initially generated ones.

Note: it is not clear whether generating a source tdeb for each language would be a good thing, but if it were done, each translation maintainer would get the opportunity to maintain their own language's translation.

Supplemental tools / Changes to handle tdebs / Clarifications

Changes needed

Changes in dpkg

New location in /var/lib/dpkg/info/l10n/$LANG (or /var/lib/dpkg/info/tdeb/$LANG) which contains the .list files.

Properties of the tdebs:

Installation of tdeb packages will not be different in any way, but it must take place after the regular deb has been installed (to have maintainer scripts available). dpkg will refuse to install a tdeb if the deb is not present (this can be forced).
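
As a rough illustration of the two dpkg properties described above (the per-language .list location and the refusal to install a tdeb without its deb), here is a minimal Python sketch. The function names are hypothetical, and the assumption that the .list file is named after the package, as for regular debs, is not stated in the proposal.

```python
from pathlib import PurePosixPath

# One of the two locations floated in the proposal; the alternative is
# /var/lib/dpkg/info/tdeb/$LANG.
L10N_INFO_DIR = PurePosixPath("/var/lib/dpkg/info/l10n")


def tdeb_list_path(package, lang):
    """Return where the .list file of a tdeb would live.

    Assumption: the file is named <package>.list, as for regular debs.
    """
    return L10N_INFO_DIR / lang / f"{package}.list"


def may_install_tdeb(package, installed_debs, force=False):
    """A tdeb is installable only if its regular deb is present, unless forced."""
    return force or package in installed_debs


print(tdeb_list_path("wormux", "ro"))
print(may_install_tdeb("wormux", {"wormux", "bash"}))
print(may_install_tdeb("wormux", {"bash"}))
```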

Discussion

Without further changes, would there be a way to remove a tdeb? -- FilipusKlutiero

Changes in apt and archive

Selecting translations

/etc/apt/sources.list does not need any changes.

Translations to be installed are selected via the LANG variable in /etc/default/locale and the new EXTRALANGS variable in the same file.

Example for a system mainly localized in Romanian with French and Brazilian Portuguese extra translations:

$ cat /etc/default/locale
LANG=ro_RO.UTF-8
EXTRALANGS='fr pt_BR'
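
To make the selection rule concrete, the following Python sketch reads the contents of such a file and lists the languages whose translations would be wanted. EXTRALANGS is the new variable proposed here; the function names, and the idea of falling back from ro_RO to ro, are assumptions of this example, since the proposal does not say how a locale maps to a tdeb language code.

```python
def wanted_languages(locale_text):
    """Return the locales named by LANG and EXTRALANGS, without codesets."""
    settings = {}
    for raw in locale_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes, as a shell would.
        settings[key.strip()] = value.strip().strip("'\"")

    languages = []
    # LANG=ro_RO.UTF-8 -> ro_RO (codeset and @modifier removed).
    main = settings.get("LANG", "").split(".", 1)[0].split("@", 1)[0]
    if main and main not in ("C", "POSIX"):
        languages.append(main)
    for extra in settings.get("EXTRALANGS", "").split():
        if extra not in languages:
            languages.append(extra)
    return languages


def fallback_chain(locale_name):
    """Assumed lookup order: the full locale first, then the bare language."""
    language = locale_name.split("_", 1)[0]
    return [locale_name] if language == locale_name else [locale_name, language]


sample = "LANG=ro_RO.UTF-8\nEXTRALANGS='fr pt_BR'\n"
print(wanted_languages(sample))
print(fallback_chain("ro_RO"))
```

With the example file above, the main locale comes first and the extra languages follow in the order given.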

Archive layout is not relevant and not bound to this

Advantages

Disadvantages

Discussion

The first Google hit for "/etc/default/locale" is from sco.com. Since that file is apparently a standard Unix file, I doubt that it's a good idea to use it only for APT. I think that an APT configuration parameter such as APT::Default-Translations would be better. When this is not defined, APT could rely on /etc/default/locale or /etc/locale.gen. -- FilipusKlutiero

We already have /etc/default/locale, and it was introduced recently (after or with the release of Sarge), so I am not sure that it is a standard Unix file.

Format of the Translations file

The file is similar to a Packages file, except there is no description and other needless information.

Example of a Romanian tdeb for wormux 0.7.4-3 whose translation has been updated for the first time, giving version 0.7.4-3+t1:

Package: wormux
Installed-Size: $IS
Size: $S
Maintainer: Debian Games Team <pkg-games-devel @ lists.alioth.debian.org>
Architecture: all
Source: wormux-tdebs
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
MD5sum: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

or, if there will be a tdeb-source for each language:

Package: wormux
Installed-Size: $IS
Size: $S
Maintainer: Debian L10N Romanian <debian-l10n-romanian @ lists.debian.org>
Architecture: all
Source: wormux-tdeb-ro
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
MD5sum: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
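
Because the proposed Translations file uses the same "Field: value" layout as a Packages file, it can be read with a very small parser. The Python sketch below is illustrative only: it assumes stanzas are separated by blank lines and contain no continuation lines (which matches the examples above, since there is no description field), and the function names are made up for this example.

```python
def parse_stanza(text):
    """Parse one 'Field: value' stanza into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        fields[key.strip()] = value.strip()
    return fields


def parse_translations(text):
    """Parse a whole file made of blank-line-separated stanzas."""
    return [parse_stanza(block) for block in text.split("\n\n") if block.strip()]


def stanza_language(fields):
    """Take the language from the file name suffix, e.g. '..._all.ro.tdeb'."""
    return fields["Filename"].rsplit(".", 2)[1]


sample = """\
Package: wormux
Architecture: all
Source: wormux-tdeb-ro
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
"""

for entry in parse_translations(sample):
    print(entry["Package"], entry["Version"], stanza_language(entry))
```

Comparing 0.7.4-3+t1 against 0.7.4-3 would need Debian's version ordering rules, which this sketch deliberately leaves out.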

Notes:

Discussion

Problems, Questions and Discussions about this proposal

Size of the resulting Packages files would be huge

EddyP : well, no, because the tdebs are in a separate section (another source must be added in sources.list), so it is a separate file. Indeed, there could be many Translations files, one for each language, but their sizes would depend on the available l10n material for each package. See the archive format above.

Why not just add hooks to dpkg to not install locales that are not interesting to the user?

EddyP : because this is what localepurge is doing, and it is a hack. Also, having tdebs would allow updates of translations after releases. Note that l10n material is more than just .mo files and localized man pages; it can include things like audio files for a game with speech in a certain language. Also, for economic reasons we should prefer slimmer packages and less bandwidth usage.

Maintainers do not have to care that much about translation updates and can delegate that to translation teams; translators are empowered to backport translations and will be able to make translations in stable releases.

Maybe better to use language packs

RaphaelHertzog: I edited the UsefulImprovements page to add a note concerning mirrors. They don't necessarily like smaller downloads if it means more files to download, because each file costs a disk seek, which is time lost during which they can't send any data out. There's a reason why Ubuntu has big "language packs" instead of many small files, and I believe that you must take that into account as well. As time goes on, we can care less about local disk usage and accept some middle ground. IMO we should group translations even if it means that we have some useless translations on disk. It's already the case... and I won't be bothered by having some unused French translations instead of having all translations in all languages. Some intelligent grouping should be possible (all of essential, then all of standard, then the rest grouped by logical groups, maybe our official sections).

Mirrors might drop some languages

EddyP: If a mirror chooses not to mirror a language, then, since there is no special line in apt to indicate where to get those translations, there is a need either to specify a new mirror just to get the tdebs or to have some mechanism/setting in /etc/default/locale. Which would be the best approach?

Some translation material might be arch specific, so placing it in an arch all tdeb is wrong

EddyP: Frans pointed out during Neil's talk at FOSDEM 2008 (lo-res video) that there might be cases where a translation changes with the arch.


CategoryLocalization