Differences between revisions 1 and 2
Revision 1 as of 2007-02-18 12:23:10
Size: 12799
Editor: EddyPetrisor
Comment: praparing to move TranslationDebs page
Revision 2 as of 2007-02-18 12:27:28
Size: 12831
Editor: EddyPetrisor
Comment: more clarifications

Translation debs, or tdebs, are a concept aimed at solving the ["I18n/TranslationDataDistribution"] problem in Debian. The concept was discussed at the 2006 I18N meeting in Extremadura, Spain. Possible implementations of the concept are collected here. Other implementations that were discussed should be added to the page by their authors, along with the pros and cons of each approach.

Aigarius proposal

TDeb structure

TDebs could use the same format as regular deb files, or be simple tar.gz (or tar.bz2) archives.

Dpkg level changes

A new folder would be introduced: /var/lib/dpkg/info/tdebs/. This folder would contain the .list files for tdebs and would be consulted when removing the files of a tdeb and (optionally) for conflict resolution. Allowing tdebs to overwrite files from their respective base packages might ease the transition.

A new hook script would be added: /var/lib/dpkg/info/*.posttrans. If a package has special i18n requirements and some commands need to be run after the installation of a translation package, it could provide this script. dpkg would call it after the installation (or removal) of a translation package, with appropriate parameters (the ISO language code, for example).

EddyP: I wonder if there could be cases where a single posttrans wouldn't fit all the tdebs.

/var/lib/dpkg/status is modified to add an "Installed-Translations:" field that would consist of a comma-separated list of the translations the current package has installed.

When "dpkg --install somepackage-1.0-4.ru.tdeb" is run, dpkg determines the base package name, asserts that the base package is installed, unpacks the tdeb, puts its file list into /var/lib/dpkg/info/tdebs/somepackage.ru.list, adds the language to the "Installed-Translations" field in the status file and runs /var/lib/dpkg/info/somepackage.posttrans if it exists.
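The bookkeeping described above can be sketched in Python. This is only an illustration, not dpkg code: the function name, the in-memory dictionaries standing in for the status file and the info directory, and the rule that the version part starts with a digit are all assumptions made for the example.

```python
import re

# Assumed file name shape: <package>-<version>.<lang>.tdeb, with the version
# starting with a digit (the proposal does not define how to split the name).
TDEB_NAME = re.compile(
    r"^(?P<package>.+?)-(?P<version>\d.*)\.(?P<lang>[a-z]{2,3})\.tdeb$"
)


def install_tdeb(filename, status, file_lists, unpacked_files):
    """Simulate the bookkeeping only; nothing is unpacked for real.

    status: dict mapping installed package name -> dict of status fields
    file_lists: dict mapping .list path -> list of files (stands in for disk)
    unpacked_files: the files contained in the tdeb
    Returns the path of the posttrans hook that would be run if it exists.
    """
    match = TDEB_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a tdeb file name: {filename}")
    package, lang = match["package"], match["lang"]

    # The base package must already be installed.
    if package not in status:
        raise RuntimeError(f"base package {package} is not installed")

    # Record the file list under /var/lib/dpkg/info/tdebs/.
    list_path = f"/var/lib/dpkg/info/tdebs/{package}.{lang}.list"
    file_lists[list_path] = list(unpacked_files)

    # Add the language to the comma-separated Installed-Translations field.
    field = status[package].get("Installed-Translations", "")
    langs = [code for code in field.split(", ") if code]
    if lang not in langs:
        langs.append(lang)
    status[package]["Installed-Translations"] = ", ".join(langs)

    return f"/var/lib/dpkg/info/{package}.posttrans"
```

A real implementation would also have to handle removal, which reads the same .list file back and drops the language from the field.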

APT level changes

A configuration file /etc/apt/languages.list would be introduced that lists language codes for which translations must be installed.

Translations will be installed upon installation or upgrade of a package (for one package) or upon an upgrade or dist-upgrade (for all packages).

Translation packages will be in no way included in the dependency calculations. Packages for selected languages for all installed packages will be installed. All other translations will be removed.
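The selection rule above amounts to simple set arithmetic. The sketch below is a hypothetical illustration, not APT code: the function name and the shape of its arguments are assumptions, and the languages are taken to be already read from /etc/apt/languages.list.

```python
def plan_translations(wanted_languages, installed_packages,
                      installed_translations, available):
    """Decide which translations to install and which to remove.

    wanted_languages: set of language codes from /etc/apt/languages.list
    installed_packages: set of installed package names
    installed_translations: dict package -> set of installed language codes
    available: dict package -> set of language codes offered by the mirror
    Returns (to_install, to_remove) as sets of (package, language) pairs.
    """
    to_install, to_remove = set(), set()
    for package in installed_packages:
        have = installed_translations.get(package, set())
        offered = available.get(package, set())
        # Selected languages that exist on the mirror and are not installed yet.
        for lang in (wanted_languages & offered) - have:
            to_install.add((package, lang))
        # Every translation outside the selected languages is removed.
        for lang in have - wanted_languages:
            to_remove.add((package, lang))
    return to_install, to_remove
```

No dependency information enters the calculation, which matches the statement that translation packages stay out of dependency resolution.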

Downloading and parsing of the Translations file from the mirror (see below) will need to be added. It would be preferable to parse this file sequentially, without holding it all in memory, to keep memory use low. Alternatively, fetching could be implemented without an index.

Mirror level changes

A Translations file will need to be added at the same level as the Sources files are now. The Translations file could be as simple as "packagename-version: comma-separated list of ISO codes of available translations". It would also be possible to avoid needing such a file and simply construct request URLs from known components: package name, version and language. A 404 error would indicate the absence of such a translation.

Translations could be located in the package pool in a separate subdirectory of a directory of the package, for example /debian/pool/main/s/sb/sbackup/tdebs/
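Both variants can be sketched briefly. The code below is an assumption-laden illustration: the function names are made up, the tdeb file name follows the "somepackage-1.0-4.ru.tdeb" pattern used earlier, and the pool directory is passed in by the caller rather than derived.

```python
def parse_translations_line(line):
    """Parse 'packagename-version: lang1, lang2' into (key, [languages])."""
    # Split at the last colon so that a version with an epoch survives.
    key, sep, langs = line.rpartition(":")
    if not sep:
        raise ValueError(f"malformed Translations line: {line!r}")
    return key.strip(), [code.strip() for code in langs.split(",") if code.strip()]


def iter_translations(lines):
    """Yield entries one line at a time, so the whole file never sits in memory."""
    for line in lines:
        line = line.strip()
        if line:
            yield parse_translations_line(line)


def tdeb_url(mirror, pool_dir, package, version, lang):
    """Index-free variant: build the URL directly; a 404 means no translation."""
    return (f"{mirror.rstrip('/')}/{pool_dir.strip('/')}"
            f"/tdebs/{package}-{version}.{lang}.tdeb")
```

Passing an open file object to iter_translations gives the sequential parsing mentioned above, since a file is iterated line by line.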

Archive maintenance changes

(I know little of this, so this will need corrections)

TDebs could be created by stripping translations from existing packages (a temporary solution), by using the (not ready yet) Big Universal Debian i18n System, or by manual uploads. Translations could be extracted from packages either at build time (with the help of some debhelper tool) or even after the upload, just prior to installing the package into the archive.

Did I forget anything?

EddyP's proposal

Definition of tdebs

An ancillary package that contains all localization information that corresponds to a {language,package} pair. The archive is a regular Debian package (with a different suffix).

This could contain all the types of localization material:

  • .mo files /usr/share/locale/${LANG}/LC_MESSAGES/*.mo
  • localized man pages
  • any other localized material like audio, video and images that correspond to the package in question

Note: the po-debconf localization material is somewhat special, since certain situations need to be handled: preinst and postrm scripts need to be localized, so the debconf translation needs to be in place and configured already when those scripts are run. For this reason, its inclusion in the tdebs should be postponed until the problems are solved. (I welcome people to reiterate the problems we discussed about po-debconf, or any problems regarding split po-debconf translations.)

The filename format would be something like:

$PACK_$VER_all.$LANG.tdeb

So for the Romanian localization material for wormux_0.7.4-3, the tdeb package name would be wormux_0.7.4-3_all.ro.tdeb.
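As a small illustration of this naming scheme, the sketch below builds and splits such file names. The helper names are hypothetical; the splitting relies on the fact that neither Debian package names nor versions may contain an underscore.

```python
def tdeb_filename(package, version, lang):
    """Build $PACK_$VER_all.$LANG.tdeb."""
    return f"{package}_{version}_all.{lang}.tdeb"


def parse_tdeb_filename(filename):
    """Split a tdeb file name back into (package, version, language)."""
    if not filename.endswith(".tdeb"):
        raise ValueError(f"not a tdeb file name: {filename}")
    stem = filename[: -len(".tdeb")]
    # The language is whatever follows the last dot of the remaining stem.
    rest, sep, lang = stem.rpartition(".")
    parts = rest.split("_")
    if not sep or len(parts) != 3 or parts[2] != "all":
        raise ValueError(f"unexpected tdeb file name: {filename}")
    return parts[0], parts[1], lang
```

For the example above, tdeb_filename("wormux", "0.7.4-3", "ro") gives wormux_0.7.4-3_all.ro.tdeb, and parse_tdeb_filename reverses it.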

Dependency handling

  • tdebs are marked as automatically installed dependencies of the main packages, but they themselves really depend on the debs.
  • trying to install a tdeb without the .deb should normally fail.
  • installing a deb via apt/aptitude/synaptic/any_other_aptitude_like_tool should result in the installation of the deb and of all the tdebs available for that version, and should mark all those tdebs as automatically installed dependencies.

How do tdebs relate to regular debs? / How to update translations in stable?

Since translations are (usually) not the cause of application problems it would be nice to allow translation updates even after the release.

The following diagram shows how tdebs result from a package that is released (in stable) and translations are updated later.

regular_deb-src -+-(dpkg-bp)-+--> .deb packages (current debs without l10n material)
 (tdeb-ized)     |           |
                 |           +--> .tdeb packages (as many as l10n material exists)
                 |
                 +--(dpkg-gentdebsrc)---> (.dsc + .tar.gz) = tdeb-source package(s?) (for translation updates)
                                                                  |
                                                                  +--(dpkg-bp)--> .tdeb packages (newer)

dpkg-gentdebsrc is a tool that needs to be created. It creates a whole new Debian source package which contains only l10n material. This new source, if built with dpkg-buildpackage, should generate a new set of tdebs that supersede the initially generated ones.

Note: it is not clear whether generating a source tdeb for each language would be a good thing, but if it were done, each translation team would get the opportunity to maintain its own language's translation.

Supplemental tools / Changes to handle tdebs / Clarifications

  • Aptitude/Apt/dpkg must be modified to allow installation of the binary and source tdeb packages (see explanations above and examples). (E.g.: apt-get source --l10n ro wormux should do the right thing)

  • By default, tdebs are not visible in searches, views, etc. Users need to force the display by adding an option like "--l10n ro", or some menu option.
  • dpkg-gentdebsrc - a helper command that creates source tdeb packages needs to be created (some people suggested that the tdeb-source packages could be made by hand in the beginning, but I believe that this could result in bad tdebs-sources; that can lead to many mistakes and packages of poor quality)
  • the main deb package will no longer contain l10n material
    • Q: how does one ensure smooth upgrades for translations without losing them (e.g.: from etch to lenny without losing translations)?
    • A: by providing the material in both the base package and the tdebs; the files are handled via diversions (? am I missing something ?)

Changes needed

Changes in dpkg

A new location, /var/lib/dpkg/info/l10n/$LANG (or /var/lib/dpkg/info/tdeb/$LANG), contains the .list files. The idea is almost the same as Aigars', except that tdebs can have all types of maintainer scripts, provided either by the main package (filename /var/lib/dpkg/info/l10n/package.{pre,post}{inst,rm}) or by the tdebs themselves (filename /var/lib/dpkg/info/l10n/$LANG/package.{pre,post}{inst,rm}). The tdeb maintainer scripts override the scripts provided by the main package if tdeb_version >> package_version. If a tdeb wants to stop providing or using its maintainer scripts, it must provide empty scripts.

Installation of tdeb packages will not be different in any way, but it must take place after the regular deb was installed (so that the maintainer scripts are available). dpkg will refuse to install a tdeb if the deb is not present (this can be forced). The Installed-Translations: field is populated as Aigars proposed.
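The override rule can be sketched as a path lookup. This is an illustration under stated assumptions: the function name is made up, the /var/lib/dpkg/info/l10n variant of the location is used, and the caller supplies the result of the version comparison, since Debian version ordering is not reimplemented here.

```python
import os

L10N_DIR = "/var/lib/dpkg/info/l10n"


def maintainer_script(package, lang, script, tdeb_is_newer, exists=os.path.exists):
    """Return the path of the maintainer script to run, or None if there is none.

    script: one of "preinst", "postinst", "prerm", "postrm"
    tdeb_is_newer: True when tdeb_version >> package_version (computed by the caller)
    exists: file-existence check, replaceable for testing
    """
    from_tdeb = f"{L10N_DIR}/{lang}/{package}.{script}"
    from_package = f"{L10N_DIR}/{package}.{script}"
    # A newer tdeb overrides the main package's script, even with an empty one.
    if tdeb_is_newer and exists(from_tdeb):
        return from_tdeb
    if exists(from_package):
        return from_package
    return None
```

Because an empty tdeb script is still selected when the tdeb is newer, shipping one is enough to disable the script inherited from the main package.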

Changes in apt and archive

* The l10n material is selected by adding new sources in /etc/apt/sources.list. No additional enabling, disabling or configuration is needed.

* The tdebs will live in the common pool section.

* Apt will take care to install all available tdebs for a given installed package, after the deb was installed. Dependency resolving is done in apt libraries, NOT in the applications/frontends.

Selecting translations -- Entirely new (incompatible) approach

/etc/apt/sources.list can contain new sources which implicitly select desired languages:

tdeb-$LANG http://ftp.debian.org/debian/ etch main

Archive layout is: ftp.debian.org/debian/dists/l10n/dists/etch/main/$LANG/Translations

Selecting translations -- Compatible approach

If the previous approach is unacceptable for backward compatibility reasons, a line like the following should be used:

deb http://ftp.debian.org/debian/ etch l10n/$LANG/main

Archive layout is: ftp.debian.org/debian/dists/dists/etch/l10n/$LANG/main/{Translations,Packages}

The compatible archive should contain, in the expected place, a Packages file with no packages and a Translations file with the corresponding tdebs.
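Under either approach, the selected languages follow from the sources.list lines alone. The sketch below is hypothetical (the function name is made up) and recognizes only the two line shapes shown above: a "tdeb-$LANG" type, or a "deb" line with an "l10n/$LANG/main"-style component.

```python
def selected_languages(sources_list_text):
    """Return the languages selected by sources.list lines, in order of appearance."""
    languages = []
    for raw in sources_list_text.splitlines():
        # Drop comments, then split into: type, URI, distribution, components...
        fields = raw.split("#", 1)[0].split()
        if len(fields) < 4:
            continue
        kind, components = fields[0], fields[3:]
        found = []
        if kind.startswith("tdeb-"):
            # Incompatible approach: the language is part of the source type.
            found.append(kind[len("tdeb-"):])
        elif kind == "deb":
            # Compatible approach: the language is inside the component name.
            for component in components:
                parts = component.split("/")
                if len(parts) == 3 and parts[0] == "l10n":
                    found.append(parts[1])
        for lang in found:
            if lang not in languages:
                languages.append(lang)
    return languages
```

A plain "deb ... etch main" line yields no language, so existing entries are unaffected by the lookup.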

Format of the Translations file

The file is similar to a Packages file, except there is no description and other needless information.

Example for a Romanian tdeb for wormux 0.7.4-3, translation updated for the first time, version is 0.7.4-3+t1.

Package: wormux
Language: ro
Installed-Size: $IS
Size: $S
Maintainer: Debian Games Team <pkg-games-devel @ lists.alioth.debian.org>
Architecture: all
Source: wormux-tdebs
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
Depends: wormux (= 0.7.4-3)
MD5sum: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

or, if there will be a tdeb-source for each language:

Package: wormux
Language: ro
Installed-Size: $IS
Size: $S
Maintainer: Debian L10N Romanian <debian-l10n-romanian @ lists.debian.org>
Architecture: all
Source: wormux-tdeb-ro
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
Depends: wormux (= 0.7.4-3)
MD5sum: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA1: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SHA256: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Notes:

  • Package is the deb package for which the translation is provided (do we need this?)
  • The Maintainer field could change (a translation maintainership team) after release.
  • The Source field could differ at different points in time (e.g.: "Debian Games Team" for 0.7.4-3, "Debian L10N Romanian" for 0.7.4-3+t1)
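A stanza in this format is straightforward to read. The sketch below is only an illustration with hypothetical function names: it parses single-line "Name: value" fields (continuation lines are not handled) and checks that the Filename field agrees with the Package, Version and Language fields.

```python
def parse_stanza(text):
    """Parse one Translations stanza into a dict of field name -> value."""
    fields = {}
    for line in text.strip().splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed field line: {line!r}")
        fields[name.strip()] = value.strip()
    return fields


def filename_is_consistent(fields):
    """Check that Filename ends with <Package>_<Version>_all.<Language>.tdeb."""
    expected = f"{fields['Package']}_{fields['Version']}_all.{fields['Language']}.tdeb"
    return fields["Filename"].split("/")[-1] == expected


EXAMPLE = """\
Package: wormux
Language: ro
Architecture: all
Source: wormux-tdebs
Version: 0.7.4-3+t1
Filename: pool/main/w/wormux/wormux_0.7.4-3+t1_all.ro.tdeb
Depends: wormux (= 0.7.4-3)
"""
```

Calling parse_stanza(EXAMPLE) gives a dictionary whose "Language" entry is "ro", and filename_is_consistent accepts it.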

Problems, Questions and Discussions about this proposal

  • Size of the resulting Packages files would be huge
    • EddyP, DateTime : well, no, because the tdebs are in a separate section (another source must be added in sources.list), so it is a separate file. Indeed, there could be many Translations files, one per language, but their sizes would depend on the l10n material available for each package. See the archive format above, in either the compatible or the incompatible proposal.

  • Why not just add hooks to dpkg to not install locales that are not interesting to the user ?
    • EddyP, DateTime : because this is what localepurge is doing and is a hack. Also, having tdebs would allow updates of translations after releases. Note that l10n material is more than just .mo files and localized man pages; it can contain audio files for a game with speech in a certain language. Also, because of [wiki:UsefulImprovements economic reasons] we should prefer slimmer packages and less bandwidth usage.

[Fill in]