Maven Repository Specification

This page specifies how to install Java libraries in a Maven-compatible way, which makes it possible to use Maven for Debian packaging (see Java/MavenBuilder). This specification is intended to be

Status

Discussion in the mailing list has started.


This document is not up-to-date.

If you are looking for how to use the tools provided by package maven-repo-helper, please install this package then open the documentation files located in /usr/share/doc/maven-repo-helper.


Motivation: advantages of using Maven

Maven has advantages for upstream developers that won't be repeated here; they are the reason why more and more projects are switching to Maven as their primary build tool. Detailed information about Maven can be found on Maven's homepage and in the book Maven: The Definitive Guide.

Maven maintains a model of a project in a file pom.xml, in which the developer assigns attributes to the project.

Most of those attributes can directly be used for Debian packaging but the most interesting ones are the dependencies.

Imagine a project 'a' that depends on 2 other projects 'b' and 'c' where 'b' itself depends on 'd', 'e', 'f' and 'c' depends on 'f', 'g', 'h'.

a ---> b ---> d
   |      |
   |      |-> e
   |      |
   |       -> f
   |
    -> c ---> f
          |
          |-> g
          |
           -> h

In a later upstream version, 'c' adds another dependency 'i'. Without Maven, that means we have to change all reverse dependencies of 'c', including 'a' (for example by adding i.jar to DEB_JARS in debian/rules). Maven does this automatically for us, so we do not have to touch the reverse dependencies of any package when its dependencies change.
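The effect can be illustrated with a small sketch. It is not Maven's resolution algorithm (which also deals with versions, scopes and conflict mediation); it only walks a hypothetical graph that mirrors the diagram above, and the names DEPENDENCIES and transitive_dependencies are made up for this example.

```python
# Hypothetical dependency graph mirroring the diagram above.
DEPENDENCIES = {
    "a": ["b", "c"],
    "b": ["d", "e", "f"],
    "c": ["f", "g", "h"],
}


def transitive_dependencies(graph, root):
    """Return every artifact reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph.get(root, []))
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # 'f' is reachable through both 'b' and 'c'
        seen.add(name)
        stack.extend(graph.get(name, []))
    return sorted(seen)


before = transitive_dependencies(DEPENDENCIES, "a")

# A later upstream version of 'c' adds 'i': only the entry for 'c' changes.
updated = dict(DEPENDENCIES, c=DEPENDENCIES["c"] + ["i"])
after = transitive_dependencies(updated, "a")
```

Only the entry for 'c' is edited, yet the set computed for 'a' picks up 'i' without 'a' being touched.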

Problems with upstream's repository (central)

There is one central repository for Maven artifacts at http://repo2.maven.org/maven2/ that ships all releases of an artifact. The artifact log4j:log4j has 12 different versions at http://repo2.maven.org/maven2/log4j/log4j/, and Maven downloads one of them while building a package that declares log4j:log4j as a dependency. It is sometimes difficult to predict which version Maven will download, which makes Maven hard to use in offline mode. For building Debian packages, however, offline mode is essential: all dependencies must be available as Debian packages, and it is not acceptable to download artifacts from the central Maven repository during the build process.

The package maven-debian-helper tries to solve this problem by providing a local repository below the following directory:

REPO=/usr/share/maven-repo

We will reference this location as $REPO in the specification.

Alternatives

JPackage

The documentation of JPackage can be found at http://www.jpackage.org/cgi-bin/viewvc.cgi/src/jpackage-utils/doc/jpackage-1.5-policy.xhtml?root=jpackage&view=co. There is no information there on how to use Maven. JPackage uses a patched Maven that understands the package layout in /usr/share/java. As a maintainer you have to learn the toolset, and that is why JPackage fails the 'easy to use' requirement.

JPackage cheats on version numbers: whenever a POM requests a specific version like 1.2.3, its Maven just delivers what it has in /usr/share/java without considering the requested version at all. They obviously did not solve the problem of having multiple versions of an artifact installed at the same time, but we have various versions of asm, commons-collections, junit, and more in Debian, and we must have a solution for that.

Ubuntu

Ubuntu has its own specification at https://wiki.ubuntu.com/JavaTeam/Specs/MavenSupportSpec. There is no (nontrivial) package yet that follows the specification. The Maven plugins and helper libraries mentioned in the specification have not been packaged in Ubuntu yet, but they are packaged in Debian. Trying to use the Ubuntu toolset for packaging things like jetty looks very hard. It might be worth checking the spec again later. We are in contact with the Ubuntu developers and we are trying to cooperate as much as possible.

Targets

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

This specification is targeted at the following types of packages:

  1. Packages that use Maven for building SHALL install their artifacts into $REPO. Those packages SHOULD use maven-debian-helper which will do most of the work automatically.

  2. Packages that don't use Maven (yet) but whose upstream developers do: they SHALL install their artifacts into $REPO after making sure they follow the specification. Patching of the pom.xml files might be necessary. Maven-debian-helper MAY be used to check the conformance to the spec.

  3. Packages where the upstream developers don't use Maven but pom.xml files are provided for Maven users: the artifacts SHOULD be installed into $REPO after making sure they follow the specification. Patching of the pom.xml files might be necessary. Maven-debian-helper MAY be used to check the conformance to the spec.

  4. All other packages: pom.xml files from other sources (central, mvnrepository.com or hand written) MAY be installed into $REPO after making sure the artifacts follow the specification. Patching of the pom.xml files might be necessary. Maven-debian-helper MAY be used to check the conformance to the spec.

For packages that are used very often by Maven-based packages (example: junit), the MAY or SHOULD used above SHOULD be upgraded to a SHALL.

Specification

Artifacts MUST be installed into $REPO/$GROUPID/$ARTIFACTID/$VERSION/ where $GROUPID is the result of groupId.replace( '.', '/' ). The pom.xml files MUST be installed as $ARTIFACTID-$VERSION.pom and jar files as $ARTIFACTID-$VERSION.jar. An unversioned symlink $ARTIFACTID.jar to the jar file SHOULD be installed into /usr/share/java/.
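As a minimal sketch of the path rule above, the following computes the install locations for one artifact. The function name repo_paths and the dictionary keys are hypothetical and not part of any helper package; only the directory layout comes from this specification.

```python
import posixpath

REPO = "/usr/share/maven-repo"


def repo_paths(group_id, artifact_id, version, repo=REPO):
    """Compute install paths following the layout described above."""
    group_dir = group_id.replace(".", "/")  # $GROUPID
    directory = posixpath.join(repo, group_dir, artifact_id, version)
    base = f"{artifact_id}-{version}"
    return {
        "pom": posixpath.join(directory, base + ".pom"),
        "jar": posixpath.join(directory, base + ".jar"),
        # Unversioned symlink that SHOULD point at the jar.
        "java_link": posixpath.join("/usr/share/java", artifact_id + ".jar"),
    }


paths = repo_paths("org.apache.maven", "maven-core", "2.0.9")
```

For org.apache.maven:maven-core:2.0.9 the POM ends up at /usr/share/maven-repo/org/apache/maven/maven-core/2.0.9/maven-core-2.0.9.pom.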

All compile and run time dependencies, including parents and plugins, MUST be resolved by packages that are available in Debian. Test dependencies need not be resolvable unless you want to build and run the test code. Dependencies that do not yet follow this specification can be referred to with <scope>system</scope> and <systemPath>/usr/share/java/$ARTIFACTID.jar</systemPath>, but this SHOULD be avoided if possible. Example:

<dependency>
  <groupId>org.apache.maven</groupId>
  <artifactId>maven-core</artifactId>
  <version>2.0.9</version>
</dependency>

could be changed to

<dependency>
  <groupId>org.apache.maven</groupId>
  <artifactId>maven-core</artifactId>
  <version>2.0.9</version>
  <scope>system</scope>
  <systemPath>/usr/share/java/maven2.jar</systemPath>
</dependency>

as long as the Debian package maven2 does not ship its pom files. The hardcoded version number is ignored by Maven if the <systemPath> element is specified.

Hard coded version numbers in dependencies SHOULD be avoided and replaced by properties: $GROUPID.$ARTIFACTID.version. Example:

<dependency>
  <groupId>org.apache.maven</groupId>
  <artifactId>maven-core</artifactId>
  <version>2.0.9</version>
</dependency>

should be changed to

<dependency>
  <groupId>org.apache.maven</groupId>
  <artifactId>maven-core</artifactId>
  <version>${org.apache.maven.maven-core.version}</version>
</dependency>
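The substitution shown above could be automated roughly as follows. This is only a sketch: it assumes a POM without an XML namespace, as in the examples on this page (real POMs often declare one, in which case the tag lookups would need adjusting), and the names POM and use_version_properties are hypothetical. The property itself still has to be defined somewhere, which is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

# Assumption: no XML namespace on <project>, as in the examples on this page.
POM = """<project>
  <dependencies>
    <dependency>
      <groupId>org.apache.maven</groupId>
      <artifactId>maven-core</artifactId>
      <version>2.0.9</version>
    </dependency>
  </dependencies>
</project>"""


def use_version_properties(pom_text):
    """Replace hard-coded dependency versions with ${groupId.artifactId.version}."""
    root = ET.fromstring(pom_text)
    for dep in root.iter("dependency"):
        group_id = dep.findtext("groupId")
        artifact_id = dep.findtext("artifactId")
        version = dep.find("version")
        if group_id and artifact_id and version is not None:
            version.text = "${%s.%s.version}" % (group_id, artifact_id)
    return ET.tostring(root, encoding="unicode")


result = use_version_properties(POM)
```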

Due to two bugs in Maven 2.0.x, the interpolation of properties does not work for references to parent POMs. That is why you either need to hard code the parent version or comment out the parent element entirely. This specification will be updated as soon as those bugs are fixed in a stable release of Maven.

Glossary

Some of Maven's concepts are explained here, but please do not expect an exact reference.

Artifact

An artifact is a module in a Maven project. Every artifact has one pom.xml file (called the POM) and has zero or one binary jar files. An artifact can be uniquely addressed by the <artifactId>, <groupId>, and <version> elements.

Dependency

A reference to another artifact that is needed for building, testing, or during runtime. It is specified by the <dependency> element.

Parent

Every artifact can have zero or one parent, specified by the <parent> element. Parents are somewhat similar to dependencies but not identical.

Plugin

Maven uses plugins to carry out most of the work of the build process, such as the maven-clean-plugin, maven-compiler-plugin, and maven-jar-plugin, to name just a few. Specialized plugins can be used to customize the build process, and they are specified by the <plugin> element.

POM

The project object model that describes the artifact and its build process. It is represented as a file pom.xml in the source code which gets renamed to $ARTIFACTID-$VERSION.pom during installation.

Project

One or more modules can be built in one build process, and they usually share the same version number. In a multimodule project the modules are specified by the <module> element. That is why every Maven project is best packaged as one Debian source package.

Brainstorming

The Maven repo should support smooth upgrades of Java libraries. When a new version of a library is installed in a Debian system, this is what should happen:

  1. Files in $REPO/$GROUPID/$ARTIFACTID/$OLD_VERSION/ are deleted

  2. The new POM file and link to the jar are installed under $REPO/$GROUPID/$ARTIFACTID/$NEW_VERSION/

  3. Other POMs which have a dependency on $GROUPID:$ARTIFACTID:$OLD_VERSION should see their dependencies updated to $GROUPID:$ARTIFACTID:$NEW_VERSION

1. and 2. are simple file operations, but 3. implies that dpkg should somehow parse all POM files installed under $REPO and update the dependency version where necessary.

I propose another solution, which keeps the amount of effort to a minimum, keeps $REPO consistent and usable at all times, and works well with Maven.

The idea is to maintain 2 versions of each artifact under the Maven repository. The first version uses the native version from Maven, to keep compatibility.

The second version is more interesting: its version is converted to a Debian managed version, usually 'debian', but it could be '1.x' to represent any version compatible with the version 1 of the API.

The Maven repository will look like this:

  /usr/share/maven-repo/
       commons-beanutils/commons-beanutils/1.8.0/
              commons-beanutils-1.8.0.jar
              commons-beanutils-1.8.0.pom
       commons-beanutils/commons-beanutils/debian/
              commons-beanutils-debian.jar -> ../../../commons-beanutils/commons-beanutils/1.8.0/commons-beanutils-1.8.0.jar
              commons-beanutils-debian.pom
       junit/junit/3.8.2/
              junit-3.8.2.jar
              junit-3.8.2.pom
       junit/junit/3.x/
              junit-3.x.jar -> ../../../junit/junit/3.8.2/junit-3.8.2.jar
              junit-3.x.pom

Note that I'm using links for the jars from the Debian version to the native version, to avoid duplication and ease upgrades.
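A rough sketch of how such a layout could be created is shown below. It works in a temporary directory instead of /usr/share/maven-repo, assumes a POSIX system where symlinks are available, and writes an empty placeholder instead of a real jar. The function name install_with_debian_version is hypothetical. The link target it computes is the shortest relative path (../3.8.2/junit-3.8.2.jar), which points to the same file as the longer ../../../ form used in the listing above.

```python
import os
import tempfile


def install_with_debian_version(repo, group_id, artifact_id, version,
                                debian_version="debian"):
    """Create the native directory plus a Debian-version directory linking to it."""
    group_dir = group_id.replace(".", "/")
    native_dir = os.path.join(repo, group_dir, artifact_id, version)
    debian_dir = os.path.join(repo, group_dir, artifact_id, debian_version)
    os.makedirs(native_dir, exist_ok=True)
    os.makedirs(debian_dir, exist_ok=True)

    native_jar = os.path.join(native_dir, f"{artifact_id}-{version}.jar")
    # Placeholder only: a real install would copy the built jar here.
    open(native_jar, "wb").close()

    link = os.path.join(debian_dir, f"{artifact_id}-{debian_version}.jar")
    # A relative target keeps the link valid if the repository is relocated.
    target = os.path.relpath(native_jar, start=debian_dir)
    os.symlink(target, link)
    return link


with tempfile.TemporaryDirectory() as repo:
    link = install_with_debian_version(repo, "junit", "junit", "3.8.2", "3.x")
    link_target = os.readlink(link)
    resolves = os.path.exists(link)  # follows the symlink to the native jar
```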

Now the real trick is in how dependencies are versioned in each POM: we replace all native versions with Debian versions.

This is the (simplified) content of commons-beanutils-1.8.0.pom:

<?xml version="1.0" encoding="UTF-8"?>
<project>
        <modelVersion>4.0.0</modelVersion>
        <groupId>commons-beanutils</groupId>
        <artifactId>commons-beanutils</artifactId>
        <version>1.8.0</version>
        <packaging>jar</packaging>
        <dependencies>
                <dependency>
                        <groupId>commons-logging</groupId>
                        <artifactId>commons-logging</artifactId>
                        <version>debian</version>
                </dependency>
                <dependency>
                        <groupId>commons-collections</groupId>
                        <artifactId>commons-collections</artifactId>
                        <version>3.x</version>
                        <optional>true</optional>
                </dependency>
                <dependency>
                        <groupId>commons-collections</groupId>
                        <artifactId>commons-collections-testframework</artifactId>
                        <version>debian</version>
                        <scope>test</scope>
                </dependency>
                <dependency>
                        <groupId>junit</groupId>
                        <artifactId>junit</artifactId>
                        <version>3.x</version>
                        <scope>test</scope>
                </dependency>
        </dependencies>
</project>

commons-beanutils-debian.pom has the same content, except that <version> is now 'debian':

<?xml version="1.0" encoding="UTF-8"?>
<project>
        <modelVersion>4.0.0</modelVersion>
        <groupId>commons-beanutils</groupId>
        <artifactId>commons-beanutils</artifactId>
        <version>debian</version>
        <packaging>jar</packaging>
  [...]  
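The two POM variants described above could be derived from the upstream POM along these lines. This is a sketch under several assumptions: the POM has no XML namespace, the rule table DEBIAN_VERSIONS and the function debianize are hypothetical, and the upstream version numbers in NATIVE_POM are illustrative values rather than data from a real package.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: (groupId, artifactId) -> Debian-managed version.
DEBIAN_VERSIONS = {
    ("commons-logging", "commons-logging"): "debian",
    ("junit", "junit"): "3.x",
}

# Illustrative upstream POM; the dependency versions are made up.
NATIVE_POM = """<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>commons-beanutils</groupId>
  <artifactId>commons-beanutils</artifactId>
  <version>1.8.0</version>
  <dependencies>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>1.1.1</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>"""


def debianize(pom_text, own_version=None, default="debian"):
    """Rewrite dependency versions to Debian versions; optionally set the POM's own version."""
    root = ET.fromstring(pom_text)
    for dep in root.iter("dependency"):
        key = (dep.findtext("groupId"), dep.findtext("artifactId"))
        version = dep.find("version")
        if version is not None:
            version.text = DEBIAN_VERSIONS.get(key, default)
    if own_version is not None:
        root.find("version").text = own_version  # direct child of <project>
    return root


native = debianize(NATIVE_POM)                        # keeps 1.8.0 as its own version
debian = debianize(NATIVE_POM, own_version="debian")  # same content, version 'debian'
```

Both results carry the same Debian-versioned dependencies; they differ only in the project's own <version> element.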

This layout makes it easy to upgrade libraries independently of each other. It also keeps compatibility with the Maven central repository, so that you can mix and match Debian-controlled parts of the repository with downloads from the Internet if you wish.

I am currently preparing a package (maven-repo-helper) which could be used to manage the files in $REPO. My idea is to use some Java code to automatically clean up the POM files before their insertion in /usr/share/maven-repo, and to provide some helper scripts similar to the dh_* scripts which can install the jar files and the POM descriptors in the repository.

TODO

The merging of maven-debian-helper and maven-repo-helper (aka Debian and Ubuntu) should be reflected correctly in this spec.