Dpkg source format version 3.0 (git)

FAQ

This FAQ concerns the new 3.0 (git) format for Debian source packages.

The version documented here is included in dpkg 1.15.8.

Q: How do I convert a Debian source package to be built using the git format?

A: All that dpkg-buildpackage or debuild needs in order to build the new source format is for debian/source/format to contain "3.0 (git)". Be aware, however, that this format is not yet supported by the Debian upload queue, so only do this if you are experimenting with the new format. You also need to build the package from a git checkout of your source package. Note that git-buildpackage explicitly creates a non-Git source tree to build from, so stick to dpkg-buildpackage or debuild. There are some additional things you might want to consider; see below.
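
As a rough sketch of the conversion step described above, the following shell commands write the format file. They assume only that the current directory is the top level of the package's git checkout.

```shell
# Sketch: run from the top level of the package's git checkout.
mkdir -p debian/source

# The format file must contain exactly this one line.
printf '3.0 (git)\n' > debian/source/format

# Show the result for a quick visual check.
cat debian/source/format
```

Commit the new file before building (for example with git add and git commit), since only committed changes end up in the source package.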

Q: What about pristine tarballs?

A: You can use the pristine-tar package to commit a delta file. It can then later extract the pristine source tarball directly from the git repository.

Q: So we all have to use git now?

A: No, not really. dpkg-source will continue to support the .orig.tar.gz+diff format. Switch to the git format if you like, or ignore it if you don't. Note that a bzr format (experimental) and a quilt format are also available among the v3 source formats.

Q: But doesn't this mean that people have to learn git to contribute to or NMU my package?

A: Yes, contributors or NMUers will need to know some basic git commands like "git commit -a", since any change they make needs to be checked in before the package is built.

Q: Might this not be the start of a slippery slope? I don't mind learning git, but what if we get a dozen other revision control systems used in source packages?

A: It will be up to the dpkg developers to keep things sane. (Obligatory comment about sanity of dpkg developers elided. ;-) On the other hand, having a real revision control system available in the source package format avoids the need for dbs and all its ilk. You have to learn those too before contributing to a package that uses them. So in the end, adding git to the source format might make things simpler and more standardised.

Q: Won't this waste bandwidth? Normally, I can upload only a small .diff.gz, and not the whole .orig.tar.gz. There's no diff to accompany the git bundle ...

A: Yes, you have to upload the whole git bundle. However, there's also the possibility of using git push, which is much, much faster than uploading often bloated .diff.gz files.

Now, imagine a special upload queue on git.debian.org. You upload just the .debs, .dsc and .changes, and git push the rest, and this queue builds the tarball for you and uploads it on to ftp-master. Done imagining? Go implement it. :-) (also see this blog post).

Q: Teams like the XSF and kernel team work from git branches of the upstream source. These repositories may not be appropriate to use as a source package, firstly because they can be quite large, and secondly because they can contain historic (or, in the case of the kernel, present) files with license problems.

Would it be practical for such a package to have its git history collapsed to the latest upstream release and the debianisation changes, and have that simplified history uploaded in the git bundle?

A: Sure. Just use dpkg-source's --git-depth option to specify a depth, and a shallow clone will be created containing only the history you want in the Debian package. This option can be put in debian/source/options or debian/source/local-options.

This is recommended for any package that uses git upstream, because the ftp-masters shouldn't have to review the entire historical content of the git repo for badly licensed content.
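
For illustration, a depth could be recorded in debian/source/options as sketched below. The value 10 is an arbitrary assumption, not a recommendation. In that file the option is written without its leading dashes.

```shell
# Sketch: run from the top level of the package's git checkout.
mkdir -p debian/source

# Append the option so that any existing entries in the file are kept.
cat >> debian/source/options <<'EOF'
# Limit the history shipped in the source package (10 is an arbitrary example value).
git-depth=10
EOF

# Show the resulting options file.
cat debian/source/options
```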

Q: Won't keeping all this history around bloat the archive something fierce?

A: It depends. Git is quite efficient, and more often than not, after converting a source package to git, you'll find that the git bundle is smaller than the old .orig.tar.gz.

If you have metric tons of upstream history in git (i.e., if you're maintaining linux-2.6 or glibc) and the tarball does get too large, use --git-depth.

If you are packaging a source that contains large gzipped data files (maybe a game's data), or other files that git cannot handle efficiently, git may not be the right choice for that package.

Q: I'm worried about people who want to unpack Debian source packages on a box running {SunOS,Windows}. Before, all they needed were standard unix commands: tar, gzip, and patch. Now they need git, and who knows what other new tools next.

A: If you have a specific use-case for your source package that involves it being able to be unpacked that way, you shouldn't switch it to use git. However, we can't let {SunOS,Windows} get in the way of innovation. :-) Also, the git source format is simply a git-bundle file. It can be cloned with plain old git if you don't have dpkg-source, and nothing else needs to be done to unpack the package. So this is actually probably the easiest format to unpack by hand, if you have git.
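
The following sketch shows what unpacking by hand looks like. Since no real source package is at hand here, it first builds a stand-in bundle in a scratch directory. The names hello_1.0.git and hello-1.0, and the maintainer identity, are hypothetical.

```shell
# Work in a scratch directory so nothing existing is touched.
work=$(mktemp -d)
cd "$work"

# Build a stand-in bundle; a real one would come from the archive.
git init -q src
git -C src config user.name "Example Maintainer"
git -C src config user.email "maintainer@example.org"
printf 'hello\n' > src/README
git -C src add README
git -C src commit -q -m "Initial import"
git -C src bundle create "$work/hello_1.0.git" --all

# Unpacking by hand is an ordinary clone of the bundle file.
git clone -q hello_1.0.git hello-1.0

# The clone is a normal working tree containing the packaged files.
ls hello-1.0
```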

Q: But, since more tools are being used for unpacking source packages, isn't there more potential for security exposure from insecure tools?

A: Yes, this is something the dpkg developers need to consider when adding support for a new version control system.

Git bundles use the same data formats as the git packs sent over the wire when cloning a git repository. So any security holes in git that can be exploited by a malicious source package could affect anyone pulling from a git repo on the net. This is actually a good thing, because it means that there's pressure to avoid such holes. Other revision control systems might not have their repositories used in ways that promote secure code (for example, svk's repository is local to one user).

Q: What if the git repository format breaks in some non-backwards-compatible way? We don't want Debian to be stuck with source packages that don't unpack anymore!

A: This seems unlikely to happen with git, because this same format is used for publishing git repositories all over the net.

In general, this is again something that has to be considered on a per-system basis. The subversion database format has changed a lot in the past, and nothing is keeping it from changing in the future, so we're unlikely to see dpkg-source supporting .svn.tar.gz.

Q: Are hooks preserved?

A: No, git-bundle does not preserve hooks or other configuration in .git. All you get from a bundle are all the local branches, tags, and history.

Q: If I apt-get source a package that uses git, can I use git pull, etc?

A: The bundle file is configured as the origin, so pulling from it is not useful. You can always add remotes as usual.

In general, you should be able to use a git repo obtained via apt-get source just like any other repo. (There will be some limitations if the maintainer chose to publish a shallow clone.)

Q: dpkg-source complains that there are uncommitted changes and fails my package build?

A: This check is done because only committed changes will end up in the source package. If dpkg-source allowed a build to happen from an unclean repository, you could end up with a source package that doesn't match the binaries you distribute.

If you want to use dpkg-buildpackage to do a test build, you can use dpkg-buildpackage -b to skip building the source package.

Of course, you can always just list files to ignore in .gitignore. Or, to ignore a file only for one build, use dpkg-buildpackage -i.

Q: But I didn't change any files myself and it still complains?

A: There are some bad old habits that can lead to this happening. Maybe your package deletes a file as part of its build process? With the old source format, file deletions can't be represented, and are just ignored by dpkg-source, so it's not uncommon for files to be deleted during the build. If you convert such a package to use the git source format, you need to fix it to not delete files during the build.

Similarly, the old source format can't represent changes to binary files, while git can, and so if binary files are modified as part of the build, you'll need to work around it.

Simple example: A package ships with a.out in the upstream tarball (yuck), but also creates it during the build. Of course the clean rule removes it. The fix is easy in this case: just git rm a.out and commit.
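
The a.out fix can be sketched in a scratch repository as follows. The file names other than a.out, the commit messages, and the maintainer identity are made up for the illustration.

```shell
# Scratch repository standing in for a package checkout.
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

# Upstream shipped a prebuilt a.out, and it was committed along with the source.
printf 'binary placeholder\n' > a.out
printf 'int main(void) { return 0; }\n' > main.c
git add a.out main.c
git commit -q -m "Import upstream source"

# The fix: stop tracking the generated file, then commit that removal.
git rm -q a.out
git commit -q -m "Remove prebuilt a.out; the build regenerates it"

# a.out is no longer tracked, so the clean rule deleting it
# no longer shows up as an uncommitted change.
git ls-files
```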

Q: What if I have one git repository for a lot of packages?

A: Sorry, but you need to have one repo per package for this to work.

http://people.freedesktop.org/~jamey/git-split

Q: How can it be enforced that the upstream source is not changed between Debian revisions?

A: dak could check for an upstream tag referencing the same commit between uploads, and perhaps also verify the tag signature. This would ensure that the upstream source cannot be altered in an upload, just as is enforced for orig.tar right now. dak could also check the debian tag if required. These tags could also be present in the .dsc and benefit from its signature, although signed tag verification would still be useful for a potential git-only workflow in the future.

Discussion

Pros and cons of accepting packages in format 3.0 (git) in our archive were discussed at DebConf10 and reported on debian-devel.

The following table attempts to roughly summarise discussions on the Debian mailing lists. Links to threads are indicated with a date.

Problem

Solution or comment

Limitations

Uploading a Git archive requires reviewing the entire contents of the archive. (16 May 2012, 642801)

Shallow clones or similar solutions, like dropping all commits not part of the graph between two signed tags identifying a Debian version and an upstream version.

How to represent stable updates or local rebuilds? (18 May 2012)

Git branches. It is also possible to locally rebuild without committing.

Like other formats, 3.0 (git) needs to allow further uploads, including NMUs. (18 May 2012)

VCS history is not as convenient as a patch directory to review the changes Debian makes to Upstream's work. (18 May 2012)

Export the changes as patches in debian/patches for patch-tracker, or find a similar way to present relevant diffs.

How to remove unredistributable files without rewriting history? (18 May 2012)

The source package's VCS is becoming the preferred form for modification. (11 Aug 2010)

Use the 3.0 (git) format

