Title: AppStream and Component Metadata for Debian
DEP: 11
State: CANDIDATE
Date: 2011-10-12
URL: http://wiki.debian.org/AppStreamDebianProposal
Source: http://wiki.debian.org/AppStreamDebianProposal?action=info
Drivers: Matthias Klumpp <matthias@tenstral.net>, Julian Andres Klode <jak@debian.org>, Michael Vogt <mvo@debian.org>
License: GPL-3
Abstract: Proposal for an additional file in Debian repositories containing information about components packages provide as well as all data required for the cross-distro application manager project AppStream.
This is an old draft!
For the most recent proposal, please go to DEP-11.
This page contains a proposal for how Debian could provide useful package metadata to applications and users, and implement its own version of the cross-distribution AppStream specification, so that we can get rid of big packages like "app-install-data".
The proposed solution addresses two goals:
- Implement a Debian-style version of the AppStream project metadata to provide information about available applications.
- Provide new metadata describing the components a package contains.
AppStream is a cross-distro effort to provide an application manager ("AppStore") for all distributions, with advanced features like ratings and reviews. It uses the well-known Open Collaboration Services API to achieve this. The project should also improve collaboration with upstreams as well as with other distributions (packages can be compared easily using this metadata, which is useful for e.g. sharing patches). More detailed information about AppStream can be found at Freedesktop.Distributions.
The ComponentMetadata index file
We propose adding a new index file for application installers such as software-center, to replace the manually maintained app-install-data package. Building that package requires mirroring and scanning large portions of the archive, so it is not very up to date.
Moving the metadata to the server side gives clients up-to-date information about the available applications. The metadata we need is the information from the .desktop files in /usr/share/applications/, together with the icons used by those .desktop files.
This file should also contain information about the components a package provides. Components are, for example, shared libraries, pkg-config files, KDE Plasma data providers, fonts, codecs, firmware, Perl modules, Python modules, Haskell modules, printer drivers, etc. Applications can use this information to automatically install missing functionality on request. This is already implemented for RPM-based distributions using PackageKit. Applications that use PackageKit to search for components on Debian currently get very poor results because the metadata is missing: at the moment, components are guessed from package names.
Proposed format
AppStream suggested the use of an XML file for providing the meta information. As Debian does not use XML anywhere else in the archive, and we do not expect anyone in Debian to like XML, we propose to use a simple RFC 822-style format for the index, like the other Debian index files.
Here is a list of the fields we propose for this index file.
Fields
A block needs to contain a "Package" field and an "Architectures" field, as well as an "Application" field and/or additional fields describing which components the package provides. Fields referring to a .desktop file are not required if the package does not provide an application with a graphical UI.
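As an illustration of the rules above, here is a minimal Python sketch that parses a blank-line-separated, RFC 822-style index and checks each block. The sample blocks, their values, the comma separator and the helper names (parse_stanzas, check_stanza) are assumptions made for illustration; the proposal does not fix them.

```python
REQUIRED = ("Package", "Architectures")
# Fields that say what a package provides (taken from the lists in this proposal).
PROVIDER_FIELDS = ("Application", "PlasmaServices", "SharedLibs", "Python2",
                   "Python3", "Modaliases", "Firmwares")

def parse_stanzas(text):
    """Split an index into blocks and return a list of field dicts."""
    stanzas, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():                # a blank line ends a block
            if current:
                stanzas.append(current)
            current, last = {}, None
        elif line[0] in " \t" and last:     # continuation of the previous field
            current[last] += "\n" + line.strip()
        else:
            name, _, value = line.partition(":")
            last = name.strip()
            current[last] = value.strip()
    if current:
        stanzas.append(current)
    return stanzas

def check_stanza(stanza):
    """Return a list of problems found in one block (empty if it is valid)."""
    problems = [f"missing {field}" for field in REQUIRED if field not in stanza]
    if not any(field in stanza for field in PROVIDER_FIELDS):
        problems.append("no Application or component field")
    if "MimeType" in stanza and "Application" not in stanza:
        problems.append("MimeType without Application")
    return problems

# Hypothetical sample data, not taken from a real archive.
SAMPLE = """\
Package: foo
Version: 1.0-1
Architectures: amd64, i386
Application: foo.desktop
Name: Foo
MimeType: text/plain, text/x-foo

Package: libbar1
SharedLibs: libbar.so.1
"""

stanzas = parse_stanzas(SAMPLE)
for stanza in stanzas:
    print(stanza.get("Package"), check_stanza(stanza))
```

The second sample block deliberately omits "Architectures", so the check reports it as a problem.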
Basic fields
Basic fields which are always required.
Field-Name: Package
Description: The same as in a Packages file
Field-Name: Version
Description: The same as in a Packages file. Used to associate the components described with exactly one .deb package, and to display a version for applications.
Field-Name: Architectures
Description: Contains all architectures this package has been built for (amd64, i386, armel, kfreebsd-*, etc.). This is an additional field that is not present in the AppStream specification; it is included here to uniquely associate a .desktop file with exactly one .deb file. It also makes it possible to exclude metadata if a package is not (yet) present on one architecture, avoiding lots of duplication and wasted disk space.
Component fields
Fields describing resources ("components") this package provides. A list of possible components can be found here.
Field-Name: PlasmaServices
Description: The Plasma services (KDE4 desktop) this package provides.
Field-Name: SharedLibs
Description: A list of all shared libraries this package provides. (e.g. libprojectM.so.2, libogg.so, ...)
Field-Name: Python2
Description: A list of all Python2 modules this package provides
Field-Name: Python3
Description: A list of all Python3 modules this package provides
Field-Name: MimeType
Description: The "MimeType" field of the .desktop file, post-processed. Just as with "Categories", we want to separate entries using commas instead of semicolons. A "MimeType" field can only exist if there is an "Application" field too.
Field-Name: Modaliases
Description: A list of "modalias" globs representing the hardware types (for example USB, PCI, ACPI, DMI) this package handles. Useful for installing printer drivers, USB protocol drivers for smartphones, firmware, kernel drivers which are not merged upstream yet, and similar hardware support.
Field-Name: Firmwares
Description: A list of firmware files included in the package, to make it possible to find the right firmware package to install for a given kernel driver.
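To show how a client could use the "Modaliases" field, here is a minimal Python sketch that matches a device's modalias string against the globs listed for each package. It uses Python's fnmatchcase as a stand-in for the kernel's own matching; the package names, globs, device string and the function name packages_for_device are all hypothetical.

```python
from fnmatch import fnmatchcase

def packages_for_device(device_modalias, index):
    """Return the packages whose Modaliases globs match the device's modalias."""
    matches = []
    for package, globs in index.items():
        if any(fnmatchcase(device_modalias, glob) for glob in globs):
            matches.append(package)
    return sorted(matches)

# Hypothetical data: package names and globs are made up for illustration.
INDEX = {
    "example-printer-driver": ["usb:v04A9p*"],
    "example-wifi-firmware": ["pci:v00008086d00004222*"],
}

device = "usb:v04A9p10D3d0100dc00dsc00dp00ic07isc01ip02"
print(packages_for_device(device, INDEX))
```

Because the globs can cover whole vendors or device classes, one entry can stand for many individual device IDs.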
AppStream app fields
Fields describing the application this package ships, for use in software-center-like applications.
Field-Name: Application
Description: Name of a .desktop file, which should serve as a unique identifier. In practice, the name of a .desktop file is not always unique, for example when packages build optimized and unoptimized versions of the same application.
Field-Name: Name
Description: The "Name" field of the .desktop file
Field-Name: Name-<lang>
Description: Localized name, taken from the "Name[lang]" field of the .desktop file
Field-Name: Summary
Description: The "Comment" field of the .desktop file
Field-Name: Summary-<lang>
Description: Localized comment, taken from the "Comment[lang]" field of the .desktop file
Field-Name: Keywords
Description: The "Keywords" field of the .desktop file
Field-Name: Keywords-<lang>
Description: Localized keywords, taken from the "Keywords[lang]" field of the .desktop file
Field-Name: Icon
Description: The name of the icon, created from the .desktop "Icon" field. This one is a bit more complicated: if the original icon was given as a path, we need to rename it to something that is not a path, for example by replacing the path separators with underscores.
Field-Name: Categories
Description: The "Categories" field of the .desktop file, post-processed. The "Categories" field of a .desktop file is separated by semicolons; we probably want to use commas here.
Field-Name: Homepage
Description: As in the Packages file. This field appears in the original XML specification. We could read it from the Packages file, but it probably does no harm to copy it here, so that all display information is in one place.
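The field mappings above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample .desktop content is made up, the function names are hypothetical, and the choice of ", " as list separator and of stripping the leading slash from icon paths goes beyond what the proposal specifies.

```python
import configparser
import re

LOCALIZED = re.compile(r"^(Name|Comment|Keywords)\[(.+)\]$")
RENAMED = {"Comment": "Summary"}      # .desktop key -> index field name

def comma_list(value):
    """Turn a semicolon-separated .desktop list into a comma-separated one."""
    return ", ".join(item for item in value.split(";") if item)

def icon_name(value):
    """Flatten an icon path into a plain name; leave plain names untouched."""
    return value.strip("/").replace("/", "_") if "/" in value else value

def desktop_to_fields(desktop_id, text):
    """Map the [Desktop Entry] group of a .desktop file to index fields."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str          # keep key case, e.g. "Name[de]"
    parser.read_string(text)
    fields = {"Application": desktop_id}
    for key, value in parser["Desktop Entry"].items():
        localized = LOCALIZED.match(key)
        if localized:
            base, lang = localized.groups()
            fields[f"{RENAMED.get(base, base)}-{lang}"] = value
        elif key in ("Name", "Comment", "Keywords"):
            fields[RENAMED.get(key, key)] = value
        elif key in ("Categories", "MimeType"):
            fields[key] = comma_list(value)
        elif key == "Icon":
            fields["Icon"] = icon_name(value)
    return fields

# Hypothetical .desktop content for illustration.
SAMPLE = """\
[Desktop Entry]
Type=Application
Name=Foo Player
Name[de]=Foo-Spieler
Comment=Play music
Icon=/usr/share/pixmaps/foo.png
Categories=AudioVideo;Player;
MimeType=audio/ogg;audio/mpeg;
Exec=foo %U
"""

fields = desktop_to_fields("foo.desktop", SAMPLE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Keys that the index does not use, such as "Type" and "Exec", are simply skipped.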
Location in the Archive
We propose that the indices are provided alongside the Packages files, that is, in dists/<SUITE>/<COMPONENT>/ComponentMetadata.xy, compressed with whatever format ftpmaster chooses. The file will only be downloaded on demand, for example by the Software Center. On servers this file is not really required.
Application Icons
We also need to store the icons for the applications somewhere and provide e.g. a tarball of those icons as part of the archive. That tarball could be located at something like dists/<SUITE>/<COMPONENT>/Applications-Icons.tar.gz.
The icons in the tarball should probably be 32x32 pixels in size and located in the "icons/32x32" sub-directory, at least according to the AppStream site. We could also deal with other schemes.
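A client could locate and read the icon tarball roughly as in the following Python sketch. The suite and component values, the icon file names and the helper names are assumptions for illustration; the demo tarball is built in memory so that the sketch runs without access to an archive.

```python
import io
import posixpath
import tarfile

ICON_DIR = "icons/32x32"   # sub-directory suggested in the text above

def archive_path(suite, component, filename):
    """Relative path of a per-component file below dists/."""
    return posixpath.join("dists", suite, component, filename)

def list_icons(tar_bytes):
    """Return the icon file names stored under icons/32x32 in the tarball."""
    names = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and posixpath.dirname(member.name) == ICON_DIR:
                names.append(posixpath.basename(member.name))
    return sorted(names)

def make_demo_tarball():
    """Build a tiny in-memory tarball with placeholder icon files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name in ("icons/32x32/foo.png",
                     "icons/32x32/usr_share_pixmaps_bar.png"):
            data = b"not a real image"
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

print(archive_path("unstable", "main", "Applications-Icons.tar.gz"))
print(list_icons(make_demo_tarball()))
```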
Implementation Hints for ftpmaster
ftpmaster might want to look at app-install-data's extractor code (available, for example, in git) for hints on how icons can be located and named. All data can be generated by one script and updated per package, so resource use should be low. Much of the information can already be generated by just scanning the Contents.tar.gz file.
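As a rough idea of what such a scan could look like, the following Python sketch derives "SharedLibs" and "Application" candidates from Contents-style lines. It assumes each line holds a file path, whitespace, and a comma-separated list of section/package entries; the path patterns, sample lines and the function name scan_contents are illustrative guesses, not the extractor used by app-install-data.

```python
import re
from collections import defaultdict

# Heuristic patterns (assumptions): libraries directly in lib/ or one level below.
SHARED_LIB = re.compile(r"^(?:usr/)?lib/(?:[^/]+/)?(lib[^/]+\.so(?:\.\d+)*)$")
DESKTOP = re.compile(r"^usr/share/applications/([^/]+\.desktop)$")

def scan_contents(lines):
    """Collect SharedLibs and Application candidates per package."""
    found = defaultdict(lambda: {"SharedLibs": set(), "Application": set()})
    for line in lines:
        parts = line.rstrip().rsplit(None, 1)   # the path itself may contain spaces
        if len(parts) != 2:
            continue
        path, owners = parts[0].strip(), parts[1]
        for field, pattern in (("SharedLibs", SHARED_LIB), ("Application", DESKTOP)):
            match = pattern.match(path)
            if match:
                for owner in owners.split(","):
                    package = owner.rsplit("/", 1)[-1]   # drop the section prefix
                    found[package][field].add(match.group(1))
    return {package: {f: sorted(v) for f, v in fields.items() if v}
            for package, fields in found.items()}

# Hypothetical sample lines for illustration.
SAMPLE = [
    "usr/bin/foo                                 utils/foo",
    "usr/share/applications/foo.desktop          utils/foo",
    "usr/lib/libogg.so.0                         libs/libogg0",
    "usr/lib/x86_64-linux-gnu/libprojectM.so.2   libs/libprojectm2",
]

result = scan_contents(SAMPLE)
for package in sorted(result):
    print(package, result[package])
```

Fields such as "Name" or "Icon" still require reading the .desktop files themselves, so a scan like this only covers part of the index.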
Client-side implementation
The data from the ComponentMetadata file will be used to generate a Xapian database, as the AppStream code does. The only difference is that we won't generate the Xapian database from XML files. This database will then be queried by the Software Center or another application management tool. The component information will be fetched via APT and made available to PackageKit and APTDaemon, so other applications can make use of it.
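The component lookup described here can be pictured with the following Python sketch. It uses a plain in-memory dictionary as a stand-in for the real database (it does not use Xapian, PackageKit or APT), and the block contents, the comma separator and the function names are assumptions for illustration.

```python
from collections import defaultdict

COMPONENT_FIELDS = ("SharedLibs", "Python2", "Python3",
                    "PlasmaServices", "Firmwares")

def build_component_index(stanzas):
    """Map (field, component) pairs to the set of packages providing them."""
    index = defaultdict(set)
    for stanza in stanzas:
        for field in COMPONENT_FIELDS:
            for item in stanza.get(field, "").split(","):
                item = item.strip()
                if item:
                    index[(field, item)].add(stanza["Package"])
    return index

def what_provides(index, field, item):
    """Return the sorted list of packages providing the given component."""
    return sorted(index.get((field, item), ()))

# Hypothetical parsed blocks from a ComponentMetadata file.
STANZAS = [
    {"Package": "libogg0", "Architectures": "amd64", "SharedLibs": "libogg.so.0"},
    {"Package": "python-foo", "Architectures": "amd64", "Python2": "foo, foo.utils"},
]

index = build_component_index(STANZAS)
print(what_provides(index, "SharedLibs", "libogg.so.0"))
print(what_provides(index, "Python2", "foo.utils"))
```

An application that fails to load a component could issue such a query and offer to install the resulting package.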
The Fedora approach
In Fedora, a similar feature is implemented using the Provides RPM field. Supported MIME types are listed as 'mimehandler(text/plain)' in the list of provides for a given RPM, and provided fonts are listed as 'font(FontName)' in the same field. The same is done for Perl modules using 'perl(module)', TeX packages using 'tex(packagename)', OCaml modules using 'ocaml(modulename)', Windows DLLs using 'mingw32(dllname)', Mono using 'mono(module)' and GStreamer features using 'gstreamer0.10(feature)'. This makes it possible to install a package by naming a module, font or MIME type it provides. For example, a simple 'yum install tex(packagename)' installs the package providing that TeX package.
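To make the 'kind(value)' convention concrete, here is a small Python sketch that splits such a provides entry into its two parts. The regular expression and the function name parse_provide are illustrative assumptions, not part of RPM or yum.

```python
import re

# kind(value), e.g. mimehandler(text/plain) or gstreamer0.10(feature)
PROVIDE = re.compile(r"^([A-Za-z0-9_.+-]+)\((.+)\)$")

def parse_provide(entry):
    """Split 'kind(value)' into (kind, value); return None for plain names."""
    match = PROVIDE.match(entry.strip())
    return (match.group(1), match.group(2)) if match else None

for entry in ("mimehandler(text/plain)", "perl(module)",
              "gstreamer0.10(feature)", "plainname"):
    print(entry, "->", parse_provide(entry))
```

The fields proposed on this page carry the same kind of information, but keep it in a separate index instead of the package's own provides list.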
More information is available from
Comments
Why not just use debtags? They exist and are already supported by all the tools. --ThorstenGlaser
The additional information would mess up debtags. (We would get many new, very technical tags.) Also, this is technical information, which IMHO does not belong there. Also, the way debtags work is not really suitable for storing the data described above. --MatthiasKlumpp
Would this be an appropriate place to store information about the upstream VCS, VCS tagging scheme and upstream bug tracker? They are upstream metadata, but don't seem to really be useful for AppStream. --JelmerVernooij
They are useful for AppStream users, but this file (ComponentMetadata) proposed in DEP11 is mostly about auto-generated metadata. If this information is added to the package itself somewhere, we might be able to include it in DEP11 later. Maybe providing a DOAP file for the packages might be nice too, but that's out of scope for DEP11. --MatthiasKlumpp
The fields Usb-Id and Pci-Id are probably not a good design choice. The Linux kernel has the concept of modalias, which contains bus, ID, class etc. and is probably a better abstraction to use. This allows packages to declare support for entire classes of devices (think USB video cameras) and ranges of IDs (think all cards from a given vendor). The modalias values from the kernel are the basis of the mechanism Ubuntu uses to locate its hardware-specific packages. See the Ubuntu packages ubuntu-drivers-common and jockey for their implementation details. I suggest dropping these fields and replacing them with values compatible with the Linux kernel and the Ubuntu Packages files, or perhaps dropping them completely and storing them in the Packages files like Ubuntu does instead. I've changed the proposal and replaced the Usb-Id and Pci-Id fields with a Modaliases field. --PetterReinholdtsen