This page proposes making debian/copyright machine-interpretable. The file is one of the most important in Debian packaging, yet its format is vague and varies tremendously across packages, making it difficult to parse automatically.

This is not a proposal to change the policy in the short term.

Rationale

The diversity of free software licenses means that Debian needs to care not only about the freeness of a given work, but also about the compatibility of its license with the other parts of Debian it uses.

The arrival of the GPL version 3, its incompatibility with version 2, and our inability to spot the software where the incompatibility might be problematic are the most recent illustration of this limitation.

There are a few precedents, too. One is the GPL/OpenSSL incompatibility. Apart from grepping debian/copyright, which is prone to numerous false positives (packaging under the GPL but software under another license) or false negatives (GPL software with an "OpenSSL special exception" form of dual licensing), there is no reliable way to know which software in Debian might be problematic.

And there is more to come. There are issues with shipping GPLv2-only software with a CDDL operating system such as Nexenta. The GPL version 3 solves this issue, but not all GPL software can switch to it and we have no way to know how much of Debian should be stripped from such a system.

Proposal

I suggest adding simple RFC2822 multiline fields to debian/copyright, containing machine-interpretable values for copyright holders, known licenses, upstream URLs, etc.

These fields should be clear enough to obviate duplicating their information somewhere else in the file.

Compatibility and human-readability

It is important that debian/copyright remain human-readable, and thus that this proposal not be overengineered by adding too many fields. However, I believe that as it is, it remains clear enough to a human (as suggested by the examples at the end).

Also, it is important to allow any form of free text in the file, be it before or after the machine-interpretable part. I therefore suggest that fields can be interspersed anywhere in the file. An interpreter should ignore any line that neither starts with a known field name nor is a continuation line (one that starts with a space and follows a valid line).

For clarity we should recommend separating machine-interpretable parts with empty lines.
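
To illustrate the rule above, here is a minimal Python sketch of such an interpreter. It is only one way it could be written, not part of the proposal: the set of known field names (KNOWN_FIELDS), the function name parse_fields and the case-sensitive matching of field names are assumptions made for the example.

```python
import re

# Field names assumed for this sketch; the proposal leaves the final list open.
KNOWN_FIELDS = {"Files", "Copyright", "License"}
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s?(.*)$")


def parse_fields(text):
    """Return a list of (name, value) pairs, ignoring free text."""
    fields = []
    in_field = False  # True while the previous line belonged to a field
    for line in text.splitlines():
        m = FIELD_RE.match(line)
        if m and m.group(1) in KNOWN_FIELDS:
            fields.append([m.group(1), m.group(2)])
            in_field = True
        elif in_field and line.startswith(" "):
            # Continuation line: starts with a space and follows a valid line.
            fields[-1][1] += "\n" + line[1:]
        else:
            # Free text or an empty line: ignored, and it ends the current field.
            in_field = False
    return [(name, value) for name, value in fields]
```

Free text before, between or after the fields is simply skipped, so an existing copyright file can be converted gradually.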

It is probably a good idea to keep a human-readable (traditional) part in the copyright file. While the proposed format is quite easily readable for technically experienced people, it might not be so for less technically experienced ones, and they should not be barred from understanding the copyright situation of a given package. I know that this is extra work, but it is good practice IMHO, and it can probably be dropped once interpreters for the copyright file exist. (PatrickSchoenfeld)

I think it is important to keep the files strictly in RFC2822 format, and the human-readable licence text certainly seems readable with only "." separating the paragraphs. -- NoahSlater

Fields

The details are yet to be discussed. Here is a list of what is needed, and which fields I suggest to add:

Proposal: Full RFC822 specification

I propose that debian/copyright is fully in RFC822 format. I have added the following construction to a few of my packages:

X-Format-Specification: http://wiki.debian.org/Proposals/CopyrightFormat
X-Debianized-By: Morten Kjeldgaard <mok@bioxray.au.dk>
X-Debianized-Date: Sun, 10 Jun 2007 16:13:07 +0000.
X-Source-Downloaded-From: http://sourceforge.net/projects/btk/
X-Upstream-Author: Tim Robertson <kid50@users.sourceforge.net>,
    Chris Saunders <ctsa@users.sourceforge.net>

If formally adopted in the standard, the "X-" in front of each label can be dropped. One argument for doing it this way is that it is important to be able to see the author (copyright owner) of the original debianization.

I find that given a copyright template with the above lines, plus a template set of [Files:, Copyright:, License:] it becomes quite fast and easy to write the copyright file.

The mandatory statement, saying where to find the standard license file, is also RFC822'ed:

X-Comment: On Debian systems, the complete text of the GNU Lesser General Public License can be found in `/usr/share/common-licenses/LGPL-2.1'.

Again, the "X-" signifies a private tag.

One might object that this makes debian/copyright less "humanly" readable. OTOH, the standardized formatting makes it feasible to write a nice "copyright browser" that formats the file beautifully.

Please note that the RFC822 examples above are not correctly indented in the Wiki.

-- MortenKjeldgaard

For my packages where the upstream tarball is not versioned or no watch file is available I have used the following:

X-Source-Downloaded-From: http://www.intertwingly.net/code/venus/
X-Source-Get-Original: ./debian/rules get-orig-source
X-Source-Size: 389120
X-Source-MD5: 934d927eecfdb5a1a4a17798de3ed60f

Also, for any dependencies that the get-orig-source target might require, I propose to use:

X-Source-Original-Depends: autoconf, automake, libtool, subversion-tools
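
As an illustration of what the X-Source-Size and X-Source-MD5 fields above make possible, here is a small Python sketch that checks a downloaded tarball against them. The function name verify_source and its parameters are hypothetical; nothing in this proposal defines such a tool.

```python
import hashlib
import os


def verify_source(path, expected_size, expected_md5):
    """Check a tarball against X-Source-Size and X-Source-MD5 values."""
    # Cheap check first: the size in bytes must match exactly.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large tarballs are not loaded into memory.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

A get-orig-source target could call something like this after the download and fail if it returns False.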

Additionally, I propose that the License field can be used in isolation to supplement the main Files blocks. A use case for this is when a package contains many components that are copyrighted by different people but all use the same licence. Including the licence blocks separately lets you avoid repeating potentially long licence text. An example is included below:

Files: src/js/editline/*
Copyright: Copyright 1993, Simmule Turner,
 Copyright 1993, Rich Salz
License: MPL-1.1 | GPL-2 | LGPL-2.1
Files: src/js/fdlibm/*
Copyright: Copyright 1993, Sun Microsystems Corporation
License: MPL-1.1 | GPL-2 | LGPL-2.1
License: MPL-1.1
 [LICENCE TEXT]
License: GPL-2
 [LICENCE TEXT]
License: LGPL-2.1
 [LICENCE TEXT]

The couchdb package uses this to avoid repeating each licence for every Files block.

-- NoahSlater
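
To show how an interpreter might use such standalone licence blocks, here is a rough Python sketch. It assumes the file has already been parsed into a list of dictionaries, one per block, where a standalone block has a License field but no Files field; that data layout and the name license_texts are assumptions made for the example only.

```python
import re


def license_texts(blocks):
    """Map each Files block to the full texts of the licenses it names."""
    # Standalone blocks: first line is the keyword, the rest is the text.
    texts = {}
    for block in blocks:
        if "Files" not in block:
            keyword, _, text = block["License"].partition("\n")
            texts[keyword.strip()] = text
    resolved = {}
    for block in blocks:
        if "Files" in block:
            # Only the first line of the field holds the license keywords.
            expression = block["License"].partition("\n")[0]
            keywords = [k.strip() for k in re.split(r"[|,]", expression)]
            # None marks a keyword with no standalone text in this file.
            resolved[block["Files"]] = {k: texts.get(k) for k in keywords}
    return resolved
```

A keyword that maps to None would be a hint that the licence text is missing from the file or is expected to come from /usr/share/common-licenses.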

I propose an additional field, X-Non-Free-Autobuild, to fulfil the [http://lists.debian.org/msgid-search/20061129152824.GT2560@mails.so.argh.org requirements for the use of the non-free autobuilders] from the release.net team. Here, for example, is what I use in non-free/clustalw:

X-Non-Free-Autobuild: yes
  The licence does not forbid Debian from using autobuilders to create binary
  packages.

-- CharlesPlessy

File patterns

Field format

The contents of the Files field should be a list of comma-separated values:

Files: foo.c, bar.*, baz.[ch]

File names containing spaces or commas should be put within double quotes. The backslash character is an escape character, both inside and outside double quotes:

Files: "Program Files/*", manual\[english\].txt
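
The quoting rules above could be implemented along the following lines. This Python sketch makes one assumption that the proposal does not settle: the backslash and the character after it are kept together in the resulting pattern, so that the pattern matcher can later treat the pair as an escape, the way find does. The name split_files_field is made up for the example.

```python
def split_files_field(value):
    """Split the value of a Files field into a list of patterns."""
    patterns, current = [], []
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # Escaped character: never a separator or a quote.
            # The pair is kept intact for the pattern matcher.
            current.append(value[i:i + 2])
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "," and not in_quotes:
            patterns.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    patterns.append("".join(current).strip())
    # Note: stripping also removes spaces at the edges of quoted names.
    return [p for p in patterns if p]
```

Applied to the two example values above, it yields three patterns for the first and two for the second, with "Program Files/*" kept as a single pattern.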

Pattern syntax

Patterns are the ones recognised by the find utility's -name and -wholename flags. They behave as if find had been called in the following way from the top source directory:

find . -wholename "$PATTERN"

This will match all Makefile.am files in the tree and all Python scripts:

Files: */Makefile.am, *.py

But this will only match the top-level Makefile.am:

Files: ./Makefile.am

Special rule: if a pattern $PATTERN does not match any file in the source, it is implicitly considered to be expanded to */$PATTERN. This is to avoid insane verbosity when referring to a unique file buried deep in the tree.
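
For tools that cannot call find, the same behaviour can be approximated in Python with the standard fnmatch module, whose "*" also matches across "/" as -wholename does. This is a sketch under assumptions: the function name match_pattern is invented, paths are expected in the "./dir/file" form that find prints, and fnmatch does not understand backslash escapes, so escaped patterns would need extra handling.

```python
import fnmatch


def match_pattern(pattern, paths):
    """Return the paths matched by one Files pattern.

    `paths` are relative to the top source directory and start with "./",
    as printed by `find .`.
    """
    matched = [p for p in paths if fnmatch.fnmatchcase(p, pattern)]
    if not matched:
        # Special rule: a pattern that matches nothing is retried as */PATTERN.
        matched = [p for p in paths if fnmatch.fnmatchcase(p, "*/" + pattern)]
    return matched
```

With this sketch, a bare getopt.c matches nothing by itself and is therefore retried as */getopt.c, which finds the file wherever it sits in the tree.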

Match order

It is quite common for a work to have most of its files under a given license, and only a few files (for instance, embedded getopt.c and getopt.h) under another. However, it makes more sense to have the copyright file list the "main" license first.

Matches should be exclusive (a file can only match one rule). The rule that finally applies to a file is the most specific one (the one that matches the fewest files) or, if this is ambiguous, the last one in the file.

Thus, in the case of getopt.c, it is the second rule that has to be taken into account:

Files: *
Copyright: [the main work’s author]
License: [the main work’s license]
Files: getopt.*
Copyright: © 2000 the NetBSD Foundation, Inc.
License: other-BSD
 [text of the NetBSD license]
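
The match order can be sketched in Python as follows. The names matched_files and assign_rules are hypothetical, the rules are assumed to be given as one list of patterns per Files block in file order, and the pattern matching relies on the standard fnmatch module as an approximation of find's -wholename.

```python
import fnmatch


def matched_files(pattern, paths):
    """Paths (in "./dir/file" form) matched by one pattern."""
    hits = [p for p in paths if fnmatch.fnmatchcase(p, pattern)]
    if not hits:
        # Special rule: an unmatched pattern is retried as */PATTERN.
        hits = [p for p in paths if fnmatch.fnmatchcase(p, "*/" + pattern)]
    return set(hits)


def assign_rules(rules, paths):
    """Map each path to the index of the single rule that applies to it."""
    coverage = []
    for patterns in rules:
        files = set()
        for pattern in patterns:
            files |= matched_files(pattern, paths)
        coverage.append(files)
    result = {}
    for path in paths:
        candidates = [i for i, files in enumerate(coverage) if path in files]
        if candidates:
            # Fewest matched files wins; on a tie, the later rule wins.
            result[path] = min(candidates, key=lambda i: (len(coverage[i]), -i))
    return result
```

For the getopt example, calling assign_rules([["*"], ["getopt.*"]], paths) assigns getopt.c and getopt.h to the second rule and every other file to the first.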

License keywords

The "License" field should not contain arbitrary values, which is why there needs to be a list of accepted keywords, each with a very specific, unambiguous meaning. Here is a non-exhaustive list; please help fill it with popular license names we are likely to meet in Debian:

|| keyword || meaning ||
|| GPL-any || GNU General Public License, author did not specify version (probably the same as GPL-1+) ||
|| GPL-1 || GNU General Public License, version 1 only ||
|| GPL-1+ || GNU General Public License, version 1 or later (probably the same as GPL-any) ||
|| GPL-2 || GNU General Public License, version 2 only ||
|| GPL-2+ || GNU General Public License, version 2 or later ||
|| GPL-3 || GNU General Public License, version 3 only ||
|| GPL-3+ || GNU General Public License, version 3 or later ||
|| LGPL-any || GNU Library/Lesser General Public License, author did not specify version ||
|| LGPL-2 || GNU Library General Public License, version 2 only ||
|| LGPL-2+ || GNU Library General Public License, version 2 or later ||
|| LGPL-2.1 || GNU Lesser General Public License, version 2.1 only ||
|| LGPL-2.1+ || GNU Lesser General Public License, version 2.1 or later ||
|| LGPL-3 || GNU Lesser General Public License, version 3 only ||
|| LGPL-3+ || GNU Lesser General Public License, version 3 or later ||
|| PSF || Python License, author did not specify version ||
|| PSF-2 || Python License, version 2 only ||
|| GFDL-any || GNU Free Documentation License, author did not specify version (maybe this needs mention of the fact that we accept no invariant sections, etc.) ||
|| GFDL-1.1 || GNU Free Documentation License, version 1.1 only (same note as above) ||
|| GFDL-1.1+ || GNU Free Documentation License, version 1.1 or newer (same note as above) ||
|| GFDL-1.2 || GNU Free Documentation License, version 1.2 only (same note as above) ||
|| GFDL-1.2+ || GNU Free Documentation License, version 1.2 or newer (same note as above) ||
|| GAP || GNU All-Permissive license, http://www.gnu.org/prep/maintain/maintain.html#License-Notices-for-Other-Files ||
|| BSD-2 || Two-clause BSD license ||
|| BSD-3 || Three-clause BSD license, with no-endorsement clause, as seen in /usr/share/common-licenses/BSD ||
|| BSD-4 || Four-clause BSD license, with no-endorsement clause and advertising clause; GPL-incompatible (need exact text) ||
|| Apache-1.0 || Apache license, version 1.0; not GPL-compatible ||
|| Apache-1.1 || Apache license, version 1.1; not GPL-compatible ||
|| Apache-2.0 || Apache license, version 2.0; GPL-3-compatible, not GPL-2-compatible ||
|| MPL-1.1 || Mozilla Public License, version 1.1 only, http://www.mozilla.org/MPL/MPL-1.1.html ||
|| Artistic || The original Artistic license, as seen in /usr/share/common-licenses/Artistic ||
|| Artistic-2.0 || The Artistic license, version 2.0, http://www.perlfoundation.org/artistic_license_2_0 ||
|| LPPL-1.3a || The LaTeX Project Public License, version 1.3a, http://www.latex-project.org/lppl/lppl-1-3a.txt; GPL-incompatible. Note that works under any version of the LPPL often have additional restrictions attached; check carefully. ||
|| ZPL || Zope Public License, author did not specify version ||
|| ZPL-2.1 || Zope Public License, version 2.1 only ||
|| EPL-1.1 || Erlang Public License, version 1.1 only ||
|| EFL-2 || Eiffel Forum License, version 2 only ||
|| CC-BY-3 || Creative Commons Attribution License (Unported), version 3.0 only ||
|| CC-BY-SA-3 || Creative Commons Attribution-ShareAlike License (Unported), version 3.0 only ||
|| ZLIB || The zlib/libpng license as in http://www.opensource.org/licenses/zlib-license.php ||
|| ... || add your favourite license here ||
|| other || Anything else not covered in this list; should be clarified in the following lines of the field ||

Stuff that we might want but that needs to be clarified:

|| other-BSD || a BSD-like license (not sure it's wise to have this keyword, especially since it might be GPL-incompatible; if in doubt, let's stick with "other") ||
|| MIT || Several variants of the MIT license exist: the standard version with three paragraphs (blanket permission, keep this notice, NO WARRANTY), a version with a no-endorsement clause, and other versions with slight wording differences. ||
|| PD || public domain, not applicable everywhere ||

License names are case-insensitive.

The syntax of the field should follow that of debian/control's Depends field. The pipe character "|" is used for code that can be used under the terms of either license. The comma "," is used for code that must be used under the terms of both licenses (for rare cases where a single file contains code under both licenses).

For instance, this is a simple, "GPL version 2 or later" field:

License: GPL-2+

This is a dual-licensed GPL/Artistic work such as Perl:

License: GPL-1+ | Artistic

This is for a file that has both GPL and classic BSD code in it:

License: GPL-any, BSD-3

And this is for a file that has Perl code and classic BSD code in it:

License: GPL-1+ | Artistic, BSD-3
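
Since the syntax follows the Depends field, "|" binds more tightly than ",", so the last example reads as (GPL-1+ or Artistic) and BSD-3. A minimal Python sketch of a parser for the first line of the field could look like this; the function name parse_license and the choice of lowercasing keywords (to reflect that license names are case-insensitive) are assumptions of the example.

```python
def parse_license(expression):
    """Parse the first line of a License field, Depends-style.

    Returns a list of groups. Every group applies ("," means "and"), and
    within a group any one keyword may be chosen ("|" means "or").
    """
    groups = []
    for part in expression.split(","):
        # Keywords are case-insensitive, so normalise them to lowercase.
        alternatives = [k.strip().lower() for k in part.split("|")]
        groups.append([k for k in alternatives if k])
    return [g for g in groups if g]
```

A compatibility checker could then require that each group contains at least one license acceptable for the intended use.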

A GPL-2+ work with the OpenSSL exception is in effect a dual-licensed work that can be redistributed either under the GPL-2+, or under the GPL-2+ with the OpenSSL exception. It is thus expressed as "GPL-2+ | other":

License: GPL-2+ | other
 In addition, as a special exception, the author of this program gives
 permission to link the code of its release with the OpenSSL project's
 "OpenSSL" library (or with modified versions of it that use the same
 license as the "OpenSSL" library), and distribute the linked executables.
 You must obey the GNU General Public License in all respects for all of
 the code used other than "OpenSSL".  If you modify this file, you may
 extend this exception to your version of the file, but you are not
 obligated to do so.  If you do not wish to do so, delete this exception
 statement from your version.

Examples

Simple example

Here is a very simple example. This is the original copyright file for xsol:

This package was debianized by Josip Rodin <jrodin@jagor.srce.hr> on
Sun,  8 Nov 1998 18:00:00 +0100
Original source may be found at: ftp://sunsite.unc.edu/pub/Linux/X11/games/
Upstream author: Brian Masney <masneyb@newwave.net>.
Licensed under the terms of GNU GPL v2 (or later).
On Debian systems, the complete text of the GNU General Public License
can be found in file "/usr/share/common-licenses/GPL".

And this is a possible machine-interpretable format:

Original source may be found at: ftp://sunsite.unc.edu/pub/Linux/X11/games/
Files: debian/*
Copyright: [previous packager whose copyright might still apply]
           © 1998 Josip Rodin <jrodin@jagor.srce.hr>
License: [license of the packaging itself, if meaningful]
Files: *
Copyright: Brian Masney <masneyb@newwave.net>
License: GPL-2+
 This package is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2 of the License, or
 (at your option) any later version.
On Debian systems, the complete text of the GNU General Public License
can be found in file "/usr/share/common-licenses/GPL".

Complex example

This is the original copyright file for monsterz:

This package was downloaded from http://sam.zoy.org/monsterz/
monsterz.c, monsterz.py: Copyright (c) 2004-2005 Sam Hocevar <sam@zoy.org>
 |             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
 |                     Version 2, December 2004
 |
 |  Copyright (C) 2004 Sam Hocevar
 |   22 rue de Plaisance, 75014 Paris, France
 |  Everyone is permitted to copy and distribute verbatim or modified
 |  copies of this license document, and changing it is allowed as long
 |  as the name is changed.
 |
 |             DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
 |    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 |
 |   0. You just DO WHAT THE FUCK YOU WANT TO.
music.s3m: Copyright (c) 1998 MenTaLguY <http://moonbase.rydia.net/>
 |  music.s3m was put in the public domain by MenTaLguY.
applause.wav, pop.wav: Copyright (c) 2002, 2005 Sun Microsystems, Inc.
 |  applause.wav was taken from OpenOffice.org's applause.wav and pop.wav was
 |  taken from OpenOffice.org's laser.wav. This product is made available
 |  subject to the terms of GNU Lesser General Public License Version 2.1.
click.wav: Copyright (c) Michael Speck <kulkanie@gmx.net>
 |  click.wav was taken from Barrage's click.wav. This program is free
 |  software; you can redistribute it and/or modify it under the terms
 |  of the GNU General Public License as published by the Free Software
 |  Foundation; either version 2 of the License, or (at your option) any
 |  later version.
boing.wav, ding.wav, duh.wav, grunt.wav, laugh.wav, whip.wav:
  Copyright (C) 2003 by David White <davidnwhite@optusnet.com.au> and the
  Battle for Wesnoth project
  Copyright (C) 2006 Sam Hocevar <sam@zoy.org>
 |  boing.wav was taken from Wesnoth's spear.wav and reworked by Sam
 |  Hocevar, ding.wav was taken from receive.wav, duh.wav was taken from
 |  female-strong-hit.wav, grunt.wav was taken from dwarf-die.wav, laugh.wav
 |  was taken from zombie-hit.wav, whip.wav was taken from dagger-swish.wav.
 |  This program is free software; you can redistribute it and/or modify
 |  it under the terms of the GNU General Public License. This program is
 |  distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY.
warning.wav: Copyright (c) Mike Kershaw <dragorn@kismetwireless.net>
 |  warning.wav was taken from Kismet's alert.wav. It is distributed under
 |  the terms of the GNU General Public License.
On Debian GNU/Linux systems, the complete text of the GNU General
Public License can be found in `/usr/share/common-licenses/GPL' and the
complete text of the GNU Lesser General Public License can be found in
`/usr/share/common-licenses/LGPL'.

Proposed format:

This package was downloaded from http://sam.zoy.org/monsterz/
Files: debian/*
Copyright: © 2004-2007 Sam Hocevar <sam@zoy.org>
License: GPL-2+
 The Debian packaging information is under the GPL, version 2 or later
Files: *.c, *.py
Copyright: © 2004-2005 Sam Hocevar <sam@zoy.org>
License: other-BSD
              DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                      Version 2, December 2004
 .
   Copyright (C) 2004 Sam Hocevar
    22 rue de Plaisance, 75014 Paris, France
   Everyone is permitted to copy and distribute verbatim or modified
   copies of this license document, and changing it is allowed as long
   as the name is changed.
 .
              DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
     TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
 .
    0. You just DO WHAT THE FUCK YOU WANT TO.
Files: music.s3m
Copyright: © 1998 MenTaLguY <http://moonbase.rydia.net/>
License: PD
 music.s3m was put in the public domain by MenTaLguY.
Files: applause.wav, pop.wav
Copyright: © 2002, 2005 Sun Microsystems, Inc.
License: LGPL-2.1
 applause.wav was taken from OpenOffice.org's applause.wav and pop.wav was
 taken from OpenOffice.org's laser.wav. This product is made available
 subject to the terms of GNU Lesser General Public License Version 2.1.
Files: click.wav
Copyright: © Michael Speck <kulkanie@gmx.net>
License: GPL-2+
 click.wav was taken from Barrage's click.wav. This program is free
 software; you can redistribute it and/or modify it under the terms
 of the GNU General Public License as published by the Free Software
 Foundation; either version 2 of the License, or (at your option) any
 later version.
Files: boing.wav, ding.wav, duh.wav, grunt.wav, laugh.wav, whip.wav
Copyright: © 2003 by David White <davidnwhite@optusnet.com.au> and the
                  Battle for Wesnoth project
           © 2006 Sam Hocevar <sam@zoy.org>
License: GPL-any
 boing.wav was taken from Wesnoth's spear.wav and reworked by Sam
 Hocevar, ding.wav was taken from receive.wav, duh.wav was taken from
 female-strong-hit.wav, grunt.wav was taken from dwarf-die.wav, laugh.wav
 was taken from zombie-hit.wav, whip.wav was taken from dagger-swish.wav.
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License. This program is
 distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY.
Files: warning.wav
Copyright: © Mike Kershaw <dragorn@kismetwireless.net>
License: GPL-any
 warning.wav was taken from Kismet's alert.wav. It is distributed under
 the terms of the GNU General Public License.
On Debian GNU/Linux systems, the complete text of the GNU General
Public License can be found in `/usr/share/common-licenses/GPL' and the
complete text of the GNU Lesser General Public License can be found in
`/usr/share/common-licenses/LGPL'.

This is how it could look in vim:

attachment:debian-copyright-vim-syntax.png