

This document describes the two methods currently used to distribute localized data, along with their known issues. Localized data includes binary MO files, translated documentation files and translated sounds (for example, in games).

Improvements to the second method are not discussed here, but in TranslationDebs.

Overview

There are currently two methods to distribute the localized data of a localized package. A localized package is a package which is available for more than one locale. For example, iceweasel is a localized package.

The first method wastes disk space and bandwidth, while the second is less user-friendly.

Bundling localized data for all languages with the localized package

Description

This is the simplest solution. Localized data is included in the localized package (or, rarely, in one of its dependencies).

Examples

apt

The apt package contains localized manual pages in /usr/share/man/ and binary MO files for several languages in /usr/share/locale/. All of the files provided by apt in these directories are localized data.

$ dpkg -L apt|perl -nle 'print if -f'|xargs du -c|tail -n 1
3824    total
$ dpkg -L apt|egrep '/usr/share/man|/usr/share/locale'|perl -nle 'print if -f'|xargs du -c|tail -n 1
2456    total
$ apt-cache show apt|egrep '(Installed-Size|Version)'
Installed-Size: 4312

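This shows that localized data makes up roughly 57 to 64 percent of the size of apt 0.6.46.4-0.1: 2456 out of 4312 when compared with Installed-Size, or 2456 out of 3824 when compared with the du total of all regular files.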
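
The share quoted above can be recomputed from the three numbers in the command output. The following is only a minimal sketch: the helper name share is made up for this illustration and is not part of any Debian tool.

```shell
#!/bin/sh
# share PART TOTAL: print PART as a percentage of TOTAL, with one decimal.
# "share" is a hypothetical helper name used only for this example.
share() {
    LC_ALL=C awk -v part="$1" -v total="$2" \
        'BEGIN { printf "%.1f\n", 100 * part / total }'
}

share 2456 4312   # localized data against Installed-Size
share 2456 3824   # localized data against the du total of all regular files
```

The two results bracket the real share, since Installed-Size and the du total measure the package size slightly differently.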

cupsys-client

cupsys-client uses several types of localized data, including manual pages and PO files. Manual pages are included in cupsys-client, as is typical. Meanwhile, PO files are included in a dependency, cupsys-common.

Issues

  1. Localized data increases the localized package size.
    1. On multi-architecture mirrors, architecture-specific packages bundling localized data increase disk usage and the bandwidth used for synchronization, because the same localized data is duplicated in the package for each architecture.
    2. Bandwidth usage increases for users and for the mirrors serving them.
    3. Disk space usage increases for users. localepurge, which is considered a hack, exists to mitigate this issue.
    4. Installation takes longer, because a larger .deb has to be downloaded and unpacked.
  2. Localized data is in the same binary package and therefore has to be built from the same source package as the application.
    1. Localized data cannot be handled by different maintainers.
    2. Translations cannot be updated independently of the application binary packages, so a translation update could cause a regression in the application. This makes translation updates risky during a freeze.
    3. A translation update means that the localized package needs to be rebuilt. This causes larger updates (mostly more bandwidth usage) and increased buildd usage. To reduce these issues, maintainers tend to wait for a new software release before providing the translation updates. As a result, translators' work takes longer to reach users (e.g. debconf updates sitting in the BTS).
  3. If the package contains not just its own translations but also those of other packages that depend on it, all translations are installed regardless of which of those dependent packages are actually used.

Language packages

Description

Each language has a separate package for its localized data. Since the localized package and its localized data are in different binary packages, the language packages can be built either from the localized package's source package or from a separate source package (depending on upstream's approach to translations).

Examples

iceweasel

The iceweasel package relies on iceweasel-l10n-fr to provide the French translation. As of March 2007, there are 50 iceweasel-l10n packages. Their combined installed size is 40912 KiB (about 40 MiB), roughly 1.5 times the installed size of iceweasel itself (26940 KiB). Bundling them would therefore multiply iceweasel's installed size by about 2.5; shipping them as separate packages avoids this.

$ grep-aptavail -r -P 'iceweasel$' -s Installed-Size
Installed-Size: 26940
$ grep-aptavail --eregex -P 'iceweasel-l10n.*' -s Installed-Size|cut -d ' ' -f2|awk '{t=t+$1}END{print t}'
40912

Language packages are built from the iceweasel-l10n source package. iceweasel doesn't depend on its language packages.
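
The size comparison for iceweasel can be reproduced from the two Installed-Size figures shown in the output above. This is a rough sketch only: sum_sizes and growth are hypothetical helper names, and the sample input fed to sum_sizes is made up to show the summing step, not real package data.

```shell
#!/bin/sh
# sum_sizes: read "Installed-Size: N" lines on stdin and print the total.
# Hypothetical helper, equivalent in spirit to the cut/awk pipeline above.
sum_sizes() {
    awk '$1 == "Installed-Size:" { total += $2 } END { print total + 0 }'
}

# growth BASE EXTRA: factor by which BASE grows if EXTRA is bundled into it.
growth() {
    LC_ALL=C awk -v base="$1" -v extra="$2" \
        'BEGIN { printf "%.2f\n", (base + extra) / base }'
}

# Made-up sample input, only to illustrate the summing step.
printf 'Installed-Size: 100\nInstalled-Size: 250\n' | sum_sizes

# Figures from the output above: iceweasel alone, then all l10n packages.
growth 26940 40912
```

With the figures above, growth reports a factor of about 2.5, which is what bundling every language into iceweasel would cost in installed size.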

debian-reference

debian-reference is an example of a localized package which isn't an application. It recommends debian-reference-fr and all its other language packages.

Issues

  1. There is no good way to have the language package installed when it is useful and only then.
    1. A conservative approach, such as recommending all language packages, creates issues similar to issue 1 of the first distribution method.
    2. For localized packages which don't recommend their language packages, users need to specifically select the language packages they want to install in order to make their translation available.
  2. The number of packages in the distribution grows with the number of language packages.
    1. Increases the time needed by package management tools to handle the dependency tree.
    2. Increases the time needed to download Packages.
    3. Greatly increases the number of packages returned by searches for the localized package's name, which can be confusing.
  3. Language packages which don't depend on the package they localize are not automatically removed when that package is removed.
  4. Installation of an application and the localized data for one language requires the installation of two packages rather than one, which can trigger two disk seeks rather than a single one on mirrors. This significantly reduces performance of mirrors limited by disk I/O.
  5. If the language package is provided for a large collection of software (such as KDE), users have to install the full language package regardless of which pieces of software they actually use.