This page helps with writing a "get-orig-source" target.

"get-orig-source" is an optional target in the "debian/rules" file that obtains the orig.tar archive with the upstream sources.

It must comply with the policy requirements in §4.9.

In its simplest form it can help to invoke uscan to fetch orig.tar:

PKD  = $(abspath $(dir $(MAKEFILE_LIST)))
PKG  = $(word 2,$(shell dpkg-parsechangelog -l$(PKD)/changelog | grep ^Source))
VER ?= $(shell dpkg-parsechangelog -l$(PKD)/changelog | perl -ne 'print $$1 if m{^Version:\s+(?:\d+:)?(\d.*)(?:\-\d+.*)};')

.PHONY: get-orig-source
## http://wiki.debian.org/onlyjob/get-orig-source
get-orig-source:  $(info I: $(PKG)_$(VER))
        @echo "# Downloading..."
        uscan --noconf --verbose --rename --destdir=$(CURDIR) --check-dirname-level=0 --force-download --download-version $(VER) $(PKD)

The above code (without the informational printout) is close to a minimal implementation of the policy requirements.

PKD holds the path to the package's debian directory, which is needed when get-orig-source is not called from the current working directory.

PKG holds the source package name (to avoid hard-coding it).

VER is the package version without the Debian revision number. It can be overridden from the command line:

VER=0.2 debian/rules get-orig-source

Note:
Since dpkg 1.17 you can simply use the -S option of dpkg-parsechangelog instead of extracting the version number with sed, awk or perl. In that case, add dpkg-dev (>= 1.17) to Build-Depends.

VER  = $(shell dpkg-parsechangelog -l$(PKD)/changelog -SVersion | cut -d- -f1)
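
The `cut -d- -f1` pipeline above splits at the first hyphen and leaves any epoch in place, which is fine for versions such as "0.2-1". As a minimal sketch of a stricter rule, assuming the version follows the usual [epoch:]upstream[-revision] layout, the shell function below strips the epoch and only the part after the last hyphen. The name upstream_version is hypothetical and not part of dpkg.

```shell
#!/bin/sh
# Hypothetical helper: reduce a full Debian version to its upstream part.
upstream_version() {
    v=$1
    # Drop the epoch ("1:") if there is one.
    case $v in
        *:*) v=${v#*:} ;;
    esac
    # Drop the Debian revision: everything from the LAST hyphen onwards.
    case $v in
        *-*) v=${v%-*} ;;
    esac
    printf '%s\n' "$v"
}

upstream_version "1:0.2-1"
upstream_version "2.0-rc1-3"
```

In debian/rules such a function would have to be wrapped in a `$(shell ...)` call or replaced by an equivalent sed expression; it is shown here only to make the splitting rule explicit.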

The above implementation of "get-orig-source" prints some information before invoking uscan with rather non-trivial arguments, which is why it can be helpful to have get-orig-source even for a trivial uscan invocation. Each of the parameters given to uscan serves a purpose.
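
For reference, the uscan invocation used above can be rebuilt one option at a time. The sketch below only prints the command line instead of running it. The function name uscan_command and the sample values are placeholders, and the option descriptions paraphrase the uscan manual page.

```shell
#!/bin/sh
# Print the uscan command line used by get-orig-source (does not run uscan).
# $1: upstream version, $2: path to the debian directory, $3: download directory
uscan_command() {
    ver=$1 pkd=$2 dest=$3
    set -- uscan
    # Do not read devscripts configuration files (must be the first option).
    set -- "$@" --noconf
    # Report what uscan is doing.
    set -- "$@" --verbose
    # Rename the download to the Debian orig.tar name instead of symlinking.
    set -- "$@" --rename
    # Directory in which to place the downloaded file.
    set -- "$@" "--destdir=$dest"
    # Never check that the directory name matches the package name.
    set -- "$@" --check-dirname-level=0
    # Download even if the package is already up to date.
    set -- "$@" --force-download
    # Fetch this specific version rather than the newest one.
    set -- "$@" --download-version "$ver"
    # Directory to scan for debian/watch.
    set -- "$@" "$pkd"
    printf '%s\n' "$*"
}

uscan_command 0.2 debian .
```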

I always had trouble remembering all those parameters, so get-orig-source was helpful in that respect. More often, however, get-orig-source is used to generate orig.tar from a VCS checkout and/or to repackage orig.tar to get rid of non-DFSG or other unwanted content.

Repackaging orig.tar

Let's improve the above example with repackaging and clean-up:

PKD   = $(abspath $(dir $(MAKEFILE_LIST)))
PKG   = $(word 2,$(shell dpkg-parsechangelog -l$(PKD)/changelog | grep ^Source))
UVER  = $(shell dpkg-parsechangelog -l$(PKD)/changelog | perl -ne 'print $$1 if m{^Version:\s+(?:\d+:)?(\d.*)(?:\-\d+.*)};')
DTYPE = +dfsg
VER  ?= $(subst $(DTYPE),,$(UVER))

## http://wiki.debian.org/onlyjob/get-orig-source
.PHONY: get-orig-source
get-orig-source: $(PKG)_$(VER)$(DTYPE).orig.tar.xz $(info I: $(PKG)_$(VER)$(DTYPE))
        @

$(PKG)_$(VER)$(DTYPE).orig.tar.xz:
        @echo "# Downloading..."
        uscan --noconf --verbose --rename --destdir=$(CURDIR) --check-dirname-level=0 --force-download --download-version $(VER) $(PKD)
        $(if $(wildcard $(PKG)-$(VER)),$(error $(PKG)-$(VER) exist, aborting..))
        @echo "# Extracting..."
        mkdir $(PKG)-$(VER) \
        && tar -xf $(PKG)_$(VER).orig.tar.* --directory $(PKG)-$(VER) --strip-components 1 \
        || $(RM) -r $(PKG)-$(VER)
        @echo "# Cleaning-up..."
        cd $(PKG)-$(VER) \
        && find . -depth -type d -name 'windows' -exec $(RM) -r {} \; -printf 'removed %p\n' \
        && $(RM) -r -v \
            notneededdir/unwantedfile1 \
            notneededdir/unwantedfiles.*
        #$(RM) -v $(PKG)_$(VER).orig.tar.*
        @echo "# Packing..."
        find -L "$(PKG)-$(VER)" -xdev -type f -print | sort \
        | XZ_OPT="-6v" tar -caf "$(PKG)_$(VER)$(DTYPE).orig.tar.xz" -T- --owner=root --group=root --mode=a+rX \
        && $(RM) -r "$(PKG)-$(VER)"

This example splits the get-orig-source task into two parts:

  1. Downloading, extracting and cleaning.
  2. Packing.

get-orig-source depends on the .orig.tar.xz target, which is responsible for download, extraction and (re-)packing. The original orig.tar is deliberately left behind to avoid re-downloading in case further clean-up is required.

The "DTYPE" variable sets the name/type of the repacked archive, e.g. "+dfsg" or "+repack".

"VER" is the raw upstream version without "+dfsg"; it is the version passed to uscan.

Packing turned out to be more complicated than just invoking `tar -caf`: GNU tar stores directory times and permissions, so generated archives always differ unless the above workaround is used, which is to filter out directories and pack only files, in sorted order. This trick helps to create predictable (binary identical) tars even if archiving took place on different architectures and file systems. Previously, making identical tars was considered to be nearly impossible. :)
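
The effect of packing only regular files in sorted order can be checked on a throwaway tree. This is a minimal sketch, assuming GNU tar and a scratch directory from mktemp; the demo-0.1 tree and the pack helper are made up for illustration, and uncompressed archives are used to keep the comparison simple.

```shell
#!/bin/sh
# Build a small scratch tree to archive.
work=$(mktemp -d)
mkdir -p "$work/demo-0.1/src"
printf 'hello\n' > "$work/demo-0.1/README"
printf 'int main(void){return 0;}\n' > "$work/demo-0.1/src/main.c"

# Pack regular files only, in a fixed order, with normalised owner and mode.
# $1: absolute path of the tar file to create
pack() {
    ( cd "$work" \
      && find -L demo-0.1 -xdev -type f -print | LC_ALL=C sort \
      | tar -cf "$1" -T- --owner=root --group=root --mode=a+rX )
}

pack "$work/a.tar"
# Change a directory timestamp; directories are not listed, so they are
# not stored as separate archive members.
touch -t 200101010000 "$work/demo-0.1/src"
pack "$work/b.tar"

cmp "$work/a.tar" "$work/b.tar" && echo "archives are identical"
```

File modification times are still recorded, so the archives stay comparable only as long as the files themselves keep the same times.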

Often orig.tar needs to be generated from a VCS checkout. This can be useful when upstream releases no tarballs, or when their tarballs are messy or not to be trusted.

Thanks to this split, only small modifications to the above example are needed to adapt it for a git checkout:

orig.tar from git checkout

## checkout from git (add "git" to Build-Depends)
UURL = git://github.com/someauthor/someproject.git
UDATE = $(shell date --rfc-3339=seconds --date='TZ="UTC" $(shell echo $(VER) | perl -ne 'print "$$1-$$2-$$3" if m/\+(?:git|svn|hg)(\d{4})(\d{2})(\d{2})/')')
$(PKG)_$(VER)$(DTYPE).orig.tar.xz: $(info I: UDATE=$(UDATE))
        $(if $(wildcard $(PKG)-$(VER)),$(error $(PKG)-$(VER) exist, aborting..))
        @echo "# Downloading..."
        git clone $(UURL) $(PKG)-$(VER) \
        || $(RM) -r $(PKG)-$(VER)
        cd $(PKG)-$(VER) \
        && { git checkout v$(VER) || git checkout $$(git log -n1 --format=%h --before="$(UDATE)"); } \
        && { [ -s ChangeLog ] || ( echo "# Generating ChangeLog..." \
           ; git log --pretty="format:%ad  %aN  <%aE>%n%n%x09* %s%n" --date=short > ChangeLog \
           ; touch -d "$$(git log -1 --format='%ci')" ChangeLog); } \
        && echo "# Setting times..." \
        && git ls-tree -r --name-only HEAD | while read F ; do touch --no-dereference -d "$$(git log -1 --format="%ai" -- "$$F")" "$$F"; done \
        && echo "# Cleaning-up..." \
        && $(RM) -r -v \
            notneededdir/unwantedfile1 \
            notneededdir/unwantedfiles.* \
        && $(RM) -r .git .git*
        @echo "# Packing..."
        find -L "$(PKG)-$(VER)" -xdev -type f -print | sort \
        | XZ_OPT="-6v" tar -caf "$(PKG)_$(VER)$(DTYPE).orig.tar.xz" -T- --owner=root --group=root --mode=a+rX \
        && $(RM) -r "$(PKG)-$(VER)"

In the above example I do several interesting things:

After cloning the upstream repository I try to check out the tag that is the upstream version prefixed with "v"; if upstream tags their releases in a different way, you'll need to adjust "v$(VER)". If that fails, there is a fallback to check out the last commit before the given date, in which case the package version may look like "0.1+git20130618".
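
The date fallback relies on pulling a YYYYMMDD stamp out of the version string. A rough shell equivalent of the perl one-liner in UDATE, with version_date as a hypothetical helper name, could look like this:

```shell
#!/bin/sh
# Print the date embedded in a version like "0.1+git20130618" as YYYY-MM-DD.
# Prints nothing when the version carries no +git/+svn/+hg date stamp.
version_date() {
    printf '%s\n' "$1" \
    | sed -nE 's/.*\+(git|svn|hg)([0-9]{4})([0-9]{2})([0-9]{2}).*/\2-\3-\4/p'
}

version_date "0.1+git20130618"
version_date "0.1"
```

A plain release version yields an empty string, which is one reason the recipe tries the tag first and only then falls back to the date.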

All files' times are set according to their commit dates, as I want to preserve this information (FYI, GitHub-generated orig.tars reset modification times, so don't expect to get two identical tars even if both of them were downloaded from the same URL).

A "ChangeLog" file is generated from the commit log right after the checkout, but only if it does not exist or is empty. The commit log is formatted according to the GNU ChangeLog format convention, and the file's modification time is set to the time of the last commit.
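
The `[ -s ChangeLog ] || ( ... )` idiom in the recipe treats an empty file the same as a missing one. A small stand-alone sketch of that behaviour follows; ensure_changelog is a hypothetical helper, and printf stands in for the real `git log` call.

```shell
#!/bin/sh
# Write a ChangeLog into directory $1 using the command given in the
# remaining arguments, unless a non-empty ChangeLog is already there.
ensure_changelog() {
    dir=$1
    shift
    if [ -s "$dir/ChangeLog" ]; then
        echo "# Keeping existing ChangeLog"
    else
        echo "# Generating ChangeLog..."
        "$@" > "$dir/ChangeLog"
    fi
}

demo=$(mktemp -d)
: > "$demo/ChangeLog"    # an empty file counts as "no ChangeLog"
ensure_changelog "$demo" printf '%s\n' "2013-06-18  Some Author  <author@example.org>"
ensure_changelog "$demo" printf '%s\n' "this line is never written"
cat "$demo/ChangeLog"
```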

Finally, ".git" files and directories are removed to prepare the upstream directory tree for packing.

The Subversion example is quite similar:

orig.tar from SVN checkout

## checkout from subversion (add "subversion, svn2cl | subversion-tools (<< 1.7.5)" to Build-Depends)
UURL = svn://some.project.org/code/trunk
REV   = $(shell echo $(VER) | perl -ne 'print "$$1" if m/(?:git|svn|hg)(\d+)/;')
$(PKG)_$(VER)$(DTYPE).orig.tar.xz: $(info I: REV=$(REV))
        $(if $(wildcard $(PKG)-$(VER)),$(error $(PKG)-$(VER) exist, aborting..))
        svn checkout --config-option config:miscellany:use-commit-times=yes -r $(REV) \
            $(UURL) $(PKG)-$(VER) \
        || $(RM) -r $(PKG)-$(VER)
        @echo "Clean-up..."
        cd $(PKG)-$(VER) \
        && $(RM) -r -v \
            notneededdir/unwantedfile1 \
            notneededdir/unwantedfiles.* \
        && { [ -s ChangeLog ] || ( echo "# Generating ChangeLog..." \
            ; svn2cl --break-before-msg --include-rev \
            && perl -0pi -e 's{(\d+\])[^:]+?:\s+}{$$1 }sgm;' ChangeLog); } \
        && find . -depth -name ".svn" -exec $(RM) -r '{}' \;
        @echo "# Packing..."
        find -L $(PKG)-$(VER) -xdev -type f -print | sort \
        | XZ_OPT="-6v" tar -caf "$(PKG)_$(VER)$(DTYPE).orig.tar.xz" -T- --owner=root --group=root --mode=a+rX \
        && $(RM) -r "$(PKG)-$(VER)"

The above example checks out a revision given as a number, as in "0.1+svn1234". To check out by date instead, for a version like "0.1+svn20130618", modify REV to add curly brackets around the date:

REV   = $(shell echo $(VER) | perl -ne 'print "{$$1}" if m/(?:git|svn|hg)(\d+)/;')
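
Both variants of REV do the same extraction and differ only in the curly brackets. As a sketch of that logic in plain shell (svn_rev and its "date" switch are hypothetical names used for illustration only):

```shell
#!/bin/sh
# Print the value for "svn checkout -r" derived from a package version.
# $1: version such as "0.1+svn1234" or "0.1+svn20130618"
# $2: "date" to wrap the digits in curly brackets, anything else for a number
svn_rev() {
    rev=$(printf '%s\n' "$1" | sed -nE 's/.*(git|svn|hg)([0-9]+).*/\2/p')
    # Fail when the version carries no VCS suffix with digits.
    [ -n "$rev" ] || return 1
    case ${2:-number} in
        date) printf '{%s}\n' "$rev" ;;
        *)    printf '%s\n' "$rev" ;;
    esac
}

svn_rev "0.1+svn1234"
svn_rev "0.1+svn20130618" date
```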

orig.tar from mercurial checkout (:FIXME:)

The package version is expected to contain a date: "0.1+hg20130818".

## checkout from mercurial (add "mercurial" to Build-Depends)
UURL = http://some.project.org/code/project
UDATE = $(shell date --utc --date='TZ="UTC" $(shell echo $(VER) | perl -ne 'print "$$1-$$2-$$3" if m/\+(?:git|svn|hg)(\d{4})(\d{2})(\d{2})/')' "+%F %T %z")
BRANCH= default
$(PKG)_$(VER)$(DTYPE).orig.tar.xz: $(info I: UDATE=$(UDATE))
        $(if $(wildcard $(PKG)-$(VER)),$(error $(PKG)-$(VER) exist, aborting..))
        hg clone --branch $(BRANCH) $(UURL) $(PKG)-$(VER) \
        || $(RM) -r $(PKG)-$(VER)
        cd $(PKG)-$(VER) \
        && hg update --date "<$(UDATE)" \
        && { [ -s ChangeLog ] || ( echo "# Generating ChangeLog..." \
        && hg log --style=changelog --date "<$(UDATE)" > ChangeLog ); } \
        && echo "# Clean-up..." \
        && $(RM) -r -v \
            notneededdir/unwantedfile1 \
            notneededdir/unwantedfiles.* \
        && $(RM) -r .hg .hg*
        @echo "# Packing..."
        find -L "$(PKG)-$(VER)" -xdev -type f -print | sort \
        | XZ_OPT="-6v" tar -caf "$(PKG)_$(VER)$(DTYPE).orig.tar.xz" -T- --owner=root --group=root --mode=a+rX \
        && $(RM) -r "$(PKG)-$(VER)"

orig.tar from bazaar checkout

## checkout from bzr (add "bzr (>= 2.6.0~bzr6520)" to Build-Depends, see #666496)
UURL = http://some.project.org/code/trunk
$(PKG)_$(VER)$(DTYPE).orig.tar.xz:
        $(if $(wildcard $(PKG)-$(VER)),$(error $(PKG)-$(VER) exist, aborting..))
        bzr checkout --hardlink --lightweight --revision=tag:release-$(VER) $(UURL) $(PKG)-$(VER) \
        || $(RM) -r $(PKG)-$(VER)
        @echo "# Packing..."
        find -L "$(PKG)-$(VER)" -xdev -type f -print | sort \
        | XZ_OPT="-6v" tar -caf "$(PKG)_$(VER)$(DTYPE).orig.tar.xz" -T- --owner=root --group=root --mode=a+rX \
        && $(RM) -r "$(PKG)-$(VER)"

Available Makefile snippets providing useful variables