FreeDict - free (multi/bi)lingual dictionaries
This wiki page documents the workflow and the usage of FreeDict dictionary databases in Debian. Users should skip directly to the [[#Using_FreeDict_dictionaries|Using FreeDict dictionaries]] section.
The FreeDict project aims to provide free bilingual and multilingual dictionary databases, and this team aims to package the latest versions for inclusion in both Ubuntu and Debian Stable/Unstable.
The preferred editable form of the data is TEI/XML, a file format under active maintenance by the Text Encoding Initiative.
The source can be obtained by checking out:
git clone https://anonscm.debian.org/pkg-freedict/freedict.git
If you have an alioth account, check out:
git clone ssh://<username>@alioth.debian.org/git/pkg-freedict/pkg-freedict.git
Using FreeDict dictionaries
A list of dictionaries can be obtained e.g. through
aptitude search dict-freedict
You can install all dictionaries using
apt-get install dict-freedict-all
(if anyone wants that ;-)).
In addition to the dictionary databases you need a program to view them. You have two options:
- you can install goldendict, which views the databases directly
- you can use the classical dictd client/server software:
Explanation Of Files Under debian/
This is essentially the FreeDict API: an XML document containing meta information about all available dictionaries and their release locations. Important for Debian are the size, author, name and download location of each dictionary. It is downloaded from http://www.freedict.org/freedict-database.xml and can be fetched using debian/fetchdictdata.py -x.
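To illustrate how such meta information might be read, here is a minimal Python sketch that parses a small inline sample. The element and attribute names (`dictionary`, `release`, `name`, `maintainerName`, `URL`, `size`) and all sample values are assumptions made for illustration; they are not taken from debian/fetchdictdata.py and should be checked against the real freedict-database.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; element and attribute names are assumptions and may
# differ from the real freedict-database.xml.
SAMPLE = """\
<FreeDictDatabase>
  <dictionary name="afr-deu" maintainerName="Example Maintainer">
    <release platform="src" URL="https://example.org/afr-deu.src.tar.xz" size="123456"/>
  </dictionary>
</FreeDictDatabase>
"""


def list_dictionaries(xml_text):
    """Return one dict per <dictionary> with the fields Debian cares about."""
    root = ET.fromstring(xml_text)
    result = []
    for node in root.iter("dictionary"):
        release = node.find("release")  # first release element, if any
        result.append({
            "name": node.get("name"),
            "author": node.get("maintainerName"),
            "url": release.get("URL") if release is not None else None,
            "size": int(release.get("size")) if release is not None else None,
        })
    return result


if __name__ == "__main__":
    for entry in list_dictionaries(SAMPLE):
        print(entry)
```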
Get orig.tar.gz Distribution
In order to fetch the latest FreeDict source, one has to execute:
FreeDict does not provide all dictionaries as one source, but releases each dictionary individually. However, the release process is exactly the same for all dictionaries, and updating more than 80 packages every time would be cumbersome, so it is better to have all dictionaries in one repository. The latest release orig tarball can be obtained using debian/fetchdictdata.py -f.
From time to time, one should also update the debian/iso.* file containing the language codes, which are used to translate short dictionary names like afr-deu into long names for package descriptions. The codes can be found at http://www-01.sil.org/iso639-3/download.asp.
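The name translation can be pictured with the following minimal Python sketch. The small mapping stands in for the language-code file; how the real debian/iso.* file is laid out and how fetchdictdata.py reads it are not shown here, and the function name `long_name` is hypothetical.

```python
# Tiny excerpt of a language-code table. The real debian/iso.* file may use a
# different layout; this mapping is an assumption for illustration only.
ISO_639_3 = {
    "afr": "Afrikaans",
    "deu": "German",
    "eng": "English",
    "hun": "Hungarian",
}


def long_name(short_name, codes=ISO_639_3):
    """Turn a short name such as 'afr-deu' into 'Afrikaans-German'."""
    source, target = short_name.split("-")
    return f"{codes[source]}-{codes[target]}"


if __name__ == "__main__":
    print(long_name("afr-deu"))
```

An unknown code raises a KeyError in this sketch, which is one way to notice that the language-code file needs updating.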
The file debian/control is auto-generated.
Modifications for the source package should be made in debian/control.HEAD; everything else is then generated with
Any bug in the generated control file has to be fixed in debian/fetchdictdata.py.
debian/copyright is also semi-auto-generated. The command
generates the copyright (and control) file. The generated file MUST be checked afterwards.
The generation consists of the following steps:
- include debian/copyright.snippets/HEAD
- for each dictionary look for an exception in debian/copyright.snippets and if present, include it
- every dictionary where the license cannot be found will get "FIXME" as its license value
If a FIXME entry is found, it is best to investigate the copyright information for this dictionary and extend the parser or, if that is not possible, add an exception. As the name suggests, there should be only a few exceptions.
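The generation steps listed above can be sketched in Python as follows. This is not the code from debian/fetchdictdata.py: the function name `build_copyright`, the one-snippet-file-per-dictionary naming, the shape of the `licenses` input and the `Files:`/`License:` stanza layout are all assumptions made for illustration.

```python
from pathlib import Path


def build_copyright(snippet_dir, licenses):
    """Assemble copyright text following the three steps described above.

    `licenses` maps dictionary name -> parsed license, or None when the
    parser could not determine one (hypothetical input format).
    """
    snippet_dir = Path(snippet_dir)
    parts = [(snippet_dir / "HEAD").read_text()]  # step 1: fixed header
    for name in sorted(licenses):
        exception = snippet_dir / name  # assumed naming: one file per dictionary
        if exception.is_file():
            parts.append(exception.read_text())  # step 2: hand-written exception
        else:
            # step 3: fall back to the parsed license, or FIXME if unknown
            license_name = licenses[name] or "FIXME"
            parts.append(f"Files: {name}/*\nLicense: {license_name}\n")
    return "\n".join(parts)
```

Searching the result for "FIXME" then gives the list of dictionaries that still need manual investigation.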
The Actual Build Process
The build process is essentially only a loop going through all directories and executing
One special exception is hun-eng and eng-hun, where the database is extracted from a special file before the run described above.
Building all the databases takes a very long time. To test the packaging, the BUILD_MODE environment variable can be set, so that no actual dictionaries are generated (only empty files are created).
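The overall shape of such a loop with a test switch might look like the following Python sketch. It is an illustration only: the real build is not implemented this way as far as this page states, `build_one` is a placeholder for the actual per-dictionary build command, treating any non-empty BUILD_MODE value as "test mode" is an assumption, and the placeholder file names follow the usual dictd layout rather than anything confirmed by the packaging.

```python
import os
from pathlib import Path


def build_all(source_root, build_one):
    """Loop over all dictionary directories and build each one.

    `build_one` is a placeholder callable for the real per-dictionary build
    command, which is not reproduced here.
    """
    test_mode = bool(os.environ.get("BUILD_MODE"))  # assumption: any non-empty value
    built = []
    for directory in sorted(Path(source_root).iterdir()):
        if not directory.is_dir():
            continue  # skip plain files next to the dictionary directories
        if test_mode:
            # Create empty placeholder files instead of real databases.
            # The file names are assumptions based on the usual dictd layout.
            (directory / f"{directory.name}.dict.dz").touch()
            (directory / f"{directory.name}.index").touch()
        else:
            build_one(directory)
        built.append(directory.name)
    return built
```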
Contact And Help Wanted!
A mailing list exists, where packaging issues are discussed. If you like IRC, you can join #freedict on the OFTC network. We're looking forward to speaking to you!
Please also see the To Do section.
To Do / Roadmap (?)
A good place to start is to have a look at the open bugs.
Further things to be done:
- package freedict-tools as separate package
- parallelize build process (there are actually no data dependencies)
- adjust maintainer scripts to register databases in goldendict
- package descriptions are autogenerated from data; do the very same with translations, see http://www.debian.org/international/l10n/ddtp
- review old patches under debian/patches.old and commit those upstream, if appropriate
- this work could be split among several people, so that everyone has only a small piece of work