i18n Worksession Extremadura
(This is a work in progress: it will be posted to d-d-a and CC'ed to -i18n and the Junta de Extremadura authorities. Please complete the missing parts if possible. I'll reassemble things from your comments and additions, then post a report. The target is to have this ready for Sunday, September 17th. Parts still missing at that time will be... removed. :-)
Please don't use fancy wiki formatting as we need a text-only report
The first Debian internationalisation meeting took place from September 7th 2006 to September 9th 2006 in Casar de Caceres, Extremadura, Spain.
This meeting was organised as part of the "Extremadura sessions", entirely sponsored by the government of the Extremadura region in Spain ("Junta de Extremadura") as a commitment and reward to the Debian Project, which is the base of LinEx, the custom Linux distribution they use for their general IT project entirely based on free software.
23 people from all over the world, representing various areas of the Debian internationalisation and localisation effort, as well as representatives of related projects, took part in this meeting. The full list of participants is available at http://wiki.debian.org/I18n/Extremadura2006.
The meeting was organized with several technical and social goals:
- take a new step towards a real "i18n Task Force" for the Debian Project
- draw up the final plans for an official "infrastructure server" for all Debian i18n and l10n activities
- reinforce the collaboration with the WordForge free software project, which was decided during sessions at Debconf6 in Mexico and continued in a "Google Summer of Code" project granted to Gintautas Miliauskas on "improvements to the architecture of the Pootle server: separation of backend and frontend"
- continue the revival of the Debian Packages Description Translation Project (DDTP) and begin to integrate it in a first Debian i18n server
- hold more specialized talks, BOFs and brainstorming sessions about:
  - use of po4a
  - localization-config revival for etch
  - modularization of language handling in D-I
  - "language packs"
  - testing D-I localisation
GSOC 2006 and projects related to Wordforge's Pootle
Gintautas Miliauskas presented the results and achievements of his work during the GSOC 2006. The initial goal was the separation of the frontend and the backend in the Pootle server, in collaboration with Pootle developers (Friedel Wolff, from the Wordforge project, was present during the meeting, as well as Javier Solá, co-director of the Wordforge project).
Gintautas succeeded in writing a storage backend that Pootle will be able to use. This will allow the storage to be separated from the Pootle frontend.
Gintautas and Friedel Wolff began, during the meeting, to work on integrating this work into the Pootle server (the 0.10 version was released on August 29th 2006).
The meeting allowed Wordforge contributors and directors to draw up more precise plans for their roadmap and eventually figure out how to bring new resources into their project to meet these plans. We again verified the deep commitment of the Wordforge community to meeting the needs of the Debian project and working as partners.
Building the Debian infrastructure server
The Debian i18n task force and the Junta de Extremadura representatives (namely César Gómez Martín, who organized all the local logistics, travel and related practical items) agreed to dedicate a server to the Debian i18n activities.
This server will be hosted in the Junta de Extremadura datacenter, in Badajoz, Spain. It will be entirely dedicated to the Debian i18n activities, first as a test platform for the future Debian i18n infrastructure and later as part of the official Debian servers network.
During the first phase, this server will be added to the debian.net domain. Felipe Augusto van de Wiel will be the main server administrator, helped by César Gómez Martín as local contact. Felipe will build a system admin team for the testing and setup phase.
The initial server was set up by Felipe during the meeting. We consider this the first technical achievement towards a Debian i18n infrastructure. The server features a Pootle server, and chrooted environments have been set up for the installation of alternative or complementary software (for instance, Eddy Petrisor began working on setting up an implementation of transdict).
Initial work began to "feed" the server with data extracted from the Debian packages description translations, with the help of Michael Bramer, initiator and leader of the DDTP project, who was present at the meeting. This data will help Wordforge developers push Pootle to its limits and improve its ability to sustain high loads.
This data will also help test the integration of Gintautas's work, namely the storage backend, under heavy load conditions.
Michael Bramer presented the status of the DDTP project. The Debian mirror infrastructure is now ready to host Translate-<lang> files for the use of modified APT versions. A version of APT which can use these translated descriptions has been successfully tested.
The i18n team committed to getting this modified APT into etch, and to supporting the translated descriptions feature and any bugs that may result from it.
A very basic infrastructure exists to allow translation updates. It meets the very simple needs of translating the material, even though it is very far from the ideal infrastructure.
A first attempt to feed the demo Pootle server with PO files generated from the raw DDTP material has been made. Though not completely successful, it showed that, after some more debugging, we could very soon have our demo server include the DDTP translations. This will serve as a high load test. However, managing translation updates through this method will not be supported, and the demo server should not be used for production work. We recommend using the DDTSS interface, written by Martijn van Oosterhout.
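As an illustration of the conversion step mentioned above, here is a minimal Python sketch that turns one package description into a gettext PO entry. The function names, the sample package and the choice of recording the package name in a "#." comment are assumptions made for this example only; the scripts actually used for the DDTP export are not described in this report.

```python
def po_quote(text):
    """Escape a string for use inside a PO double-quoted literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def po_field(keyword, text):
    """Format one msgid/msgstr field, using one PO string per source line."""
    lines = text.split("\n")
    if len(lines) == 1:
        return '%s "%s"' % (keyword, po_quote(text))
    out = ['%s ""' % keyword]
    for i, line in enumerate(lines):
        # Every line but the last keeps its line break, written as \n.
        suffix = "\\n" if i < len(lines) - 1 else ""
        out.append('"%s%s"' % (po_quote(line), suffix))
    return "\n".join(out)


def description_to_po_entry(package, description, translation=""):
    """Build a PO entry for one description (hypothetical layout)."""
    return "\n".join([
        "#. Package: %s" % package,
        po_field("msgid", description),
        po_field("msgstr", translation),
    ]) + "\n"


if __name__ == "__main__":
    # "hello" and its description are sample data, not real DDTP material.
    print(description_to_po_entry("hello", "A greeter\nPrints a greeting."))
```

Keeping one PO string per description line makes the generated files easier to review by eye, at the cost of slightly longer output.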
The basis for more active work by the Debian i18n task force has been laid.
We will begin working in a few directions, some before the release, some after:
- complete the transition to po-debconf (and make the use of it a policy requirement)
- push the inclusion of translation work in packages
- help the gettext 0.15 transition
It was decided to request the addition of a debian-i18n pseudo-package. Most work will be tracked by using metabugs on this package. Metabugs will be used to identify different categories of i18n bugs (some ideas were: transition-po-debconf, transition-po4a-manpages, transition-new-gettext, transition-utf8-support, cat-po-debconf, cat-po-native, cat-po4a). The combination of these metabugs, of blockers, and of the existing usertags (for languages) will be helpful for the i18n Task Force. Gerfried Fuchs is responsible for asking for the pseudo-package creation.
An NMU campaign will start to push as many po-debconf translations as possible into packages during the next months. It will use the infrastructure and methods put in place by Lucas Wall and Christian Perrier back in Jan. 2005 for a similar campaign to push po-debconf transitions.
Thomas Huriaux and Gerfried Fuchs will initiate the work by identifying pending l10n bugs and sorting packages according to the age and number of pending l10n bugs (in various categories if possible). Lucas will be contacted about re-using his infrastructure for this campaign (Felipe Augusto van de Wiel). The templates will have to be checked (Stefano Canepa), and the pre-NMU schedule could also be reviewed.
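The sorting step described above could look like the following minimal Python sketch. It assumes the pending l10n bugs are already available as (package, date opened) pairs; the function name, the input format and the choice of ranking by the oldest bug first and the bug count second are assumptions for illustration, not decisions taken at the meeting.

```python
from datetime import date


def rank_packages(bugs, today):
    """Return package names, most urgent first.

    bugs: iterable of (package, date_opened) pairs, one per pending
    l10n bug (hypothetical input format).
    """
    stats = {}
    for package, opened in bugs:
        age = (today - opened).days
        oldest, count = stats.get(package, (0, 0))
        stats[package] = (max(oldest, age), count + 1)
    # Oldest pending bug first, then number of pending bugs,
    # then package name so that the output order is deterministic.
    return sorted(stats, key=lambda p: (-stats[p][0], -stats[p][1], p))


if __name__ == "__main__":
    # Placeholder package names and dates, not real bug data.
    sample = [
        ("foo", date(2006, 1, 10)),
        ("bar", date(2005, 6, 1)),
        ("foo", date(2006, 8, 1)),
    ]
    print(rank_packages(sample, today=date(2006, 9, 9)))
```

A different weighting (for example, bug count before age) only requires changing the sort key.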
The Debian Developers present at the meeting reaffirmed their commitment to participate in this NMU campaign.
Packages which do not use po-debconf for the interaction with users should not be allowed in Etch+1 (RC). This should be proposed as a release goal.
Localization-config (l-c) revival
Christian Perrier presented the l-c package, which is aimed at completing the system localization on installed systems, in connection with D-I.
l-c is used in the sarge installer to handle various localization/internationalization related parameters which are not considered to be properly handled in the relevant packages: X server keyboard settings, GDM localization, dictionaries settings, KDE parameters, etc.
In sarge, l-c is run during the second stage install, in two steps, before and after the packages and tasks installation. So far, this has not been re-integrated into D-I. The D-I team is waiting for this to happen, even though it is not considered release critical for D-I.
Christian did some early work for that purpose and mentioned that this all needs testing. The new version of the package, which provides a new udeb package, was processed by the ftpmasters during the weekend.
Several components that previously required the use of l-c now handle l10n correctly, so the tool's importance will quite likely decrease.
However, some work now has to be done to adapt l-c actions to etch. Gerfried Fuchs agreed to lead this task, first together with Christian Perrier, backup maintainer, then with Konstantinos Margaritis, the main maintainer.
Fonts and Input Methods (Keyboard handling - console and X)
Javier Solá presented the Khmer font. This highlighted some assumptions made by users of Latin glyphs (height of glyphs, hyperlink decoration, shortcuts for menus). Friedel Wolff pointed to a page started on the translate wiki (http://translate.sourceforge.net/wiki/l10n/displaysettings) to gather this information.
Guntupalli Karunakar talked about input methods (X and Gnome keyboard, SCIM, IME extension for Firefox), Jaldhar Vyas presented SCIM (Smart Common Input Method), and Kenshi Muto talked about Japanese glyphs and input methods.
This topic also came up during the l-c BOF session. That session concluded that an interesting post-etch task would be to create a matrix of all languages we support in D-I and, for each, identify what the default keymap in X should be, then recreate this keymap with the console-setup tools, and add it to console-data. These keymaps would then be the only ones offered in D-I, which would help achieve consistency between console and X keymaps. Felipe Augusto van de Wiel volunteered for this work.
Improving Debian i18n/l10n Documentation
One area of activity is improving the i18n/l10n documentation, especially the i18n guide (http://www.debian.org/doc/manuals/intro-i18n/) and the documentation related to the areas discussed in this report. This also includes documentation about tools and topics such as defoma, Unicode fonts, input methods and SCIM, as well as a quick and easy guide to building a CDDD (CDD for Dummies). Jaldhar Vyas and Guntupalli Karunakar volunteered for this.
Modularization of D-I languages support
There was an extensive discussion on how to improve the way d-i handles translations so that it will be possible, in the future, to provide as many translations as we are provided with.
The current d-i limitations are:
- initrd size
- RAM consumption
- required bandwidth
The ideas discussed to work around these limitations were:
- separate translations from udebs and only download the one selected by users
- generate different initrds per language family
- only translate non-expert questions
- reduce localechooser translations (all country names in all languages)
- move translations into 2 udebs (one for initrd components and another for other components)
- use the 'lowmem' mechanisms to remove unused translations
Initially a side discussion of the D-I modularization, this topic turned into an in-depth improvised brainstorming session. A first draft summary is available at I18n/TranslationDataDistribution.
A language pack (or language package) is a "complement" to a software package that provides a translation for a given language separately from the main package. It is distributed separately and can either be produced by the upstream developers and extracted from the main package source, or be produced by independent third parties. For more information, see I18n/LanguagePacks.
Translations are currently distributed in the Debian archive through:
- Binary packages
- Architecture independent packages associated with binary software packages
The discussion started by focusing on one of the advantages of Ubuntu's language pack approach: the ability to provide updated translations post-release. There was some agreement to aim for a similar goal for etch+1. Some initial work (pre-etch) could include:
- Ubuntu's glibc patch to have an alternate location for MO binary files
- study a mechanism for translation updates for non-gettext data
Testing D-I translations
The need for more tests of the D-I translations was repeated. It is important that many users test the installer in their languages. Lior Kaplan presented how to use qemu to make these tests (how to run qemu, how to test the translations, make changes, and test again efficiently).
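As a rough illustration of such a test setup, the following Python sketch builds a qemu command line that boots an installer ISO with an empty disk image attached. The binary name, file names and memory size are assumptions for this example (newer QEMU releases ship the binary as qemu-system-<arch>), and this is not necessarily the procedure Lior presented.

```python
import subprocess


def qemu_command(iso_path, disk_image, memory_mb=256):
    """Build the argument list for booting an installer ISO in qemu."""
    return [
        "qemu",                # assumed binary name, adjust to your system
        "-m", str(memory_mb),  # guest RAM in megabytes
        "-hda", disk_image,    # target disk for the installation
        "-cdrom", iso_path,    # installer image to test
        "-boot", "d",          # boot from the CD-ROM drive
    ]


def run_installer(iso_path, disk_image):
    """Start qemu and wait until the emulator window is closed."""
    return subprocess.call(qemu_command(iso_path, disk_image))
```

The disk image can be created beforehand with qemu-img create. Rebuilding the ISO after a translation change and calling run_installer again gives a quick edit-and-retest loop.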
Defining the needs of Debian for its infrastructure server
This discussion essentially reaffirmed the needs we mentioned in the Debconf6 i18n sessions. See http://lists.debian.org/debian-i18n/2006/05/msg00135.html
These identified needs should be restated in a shorter document, probably maintained on the wiki. The Wordforge developers will then be able to say whether each of these requirements is already supported, planned to be supported, or to be added to Pootle's roadmap.
i18n wiki and IRC channels
The next i18n server will feature a wiki dedicated to i18n activities. We will think about moving things to the general Debian wiki when that appears more appropriate. The i18n wiki should only be a work wiki for meetings, common work, etc.
The i18n Task Force runs a #debian-i18n channel on irc.debian.org. All Debian developers and contributors are welcome to join and contact i18n wizards on that channel.
Videos of the meeting will be available at http://meetings-archive.debian.net/pub/debian-meetings/2006/ (this will be announced separately on debian-i18n)