Differences between revisions 1 and 2
Revision 1 as of 2007-10-05 23:12:31
Size: 9215
Editor: JoeyHess
Comment: new page documenting my dpkg-source v3 git stuff
Revision 2 as of 2007-10-05 23:17:51
Size: 9213
Editor: JoeyHess
Comment:

This FAQ concerns the new .git.tar.gz format for Debian source packages. Support for this is not yet available in dpkg, but I have patches at git://kitenet.net/dpkg, and they have been submitted to the dpkg maintainers. --JoeyHess

Q: How do I convert a Debian source package to be built using the git format?

A: All that's really needed is to edit debian/control and put "Format: 3.0 (git)" in the source stanza. Of course, you need to build the package from a git checkout of your source package. There are some additional things you might want to consider; see below.
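The edit to debian/control can be scripted. Here is a minimal sketch, assuming the source stanza is the first paragraph of the file and that stanzas are separated by blank lines; the function name add_git_format is made up for illustration and is not part of dpkg.

```python
FORMAT_LINE = "Format: 3.0 (git)"


def add_git_format(control_text):
    """Return control_text with the git Format field in the source stanza.

    Assumption: the source stanza is the first paragraph of debian/control.
    """
    lines = control_text.splitlines()

    # The first stanza ends at the first blank line that follows some content.
    end = len(lines)
    seen_content = False
    for i, line in enumerate(lines):
        if line.strip():
            seen_content = True
        elif seen_content:
            end = i
            break

    stanza = lines[:end]
    # Replace an existing Format field, or append one if there is none.
    # Continuation lines start with whitespace, so they never match here.
    for i, line in enumerate(stanza):
        if line.lower().startswith("format:"):
            stanza[i] = FORMAT_LINE
            break
    else:
        stanza.append(FORMAT_LINE)

    return "\n".join(stanza + lines[end:]) + "\n"
```

Running it twice gives the same result as running it once, so it is safe to call from a conversion script.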

Q: What about pristine tarballs?

A: You can use the pristine-tar package to generate a small "delta" file that can be checked into git. You'll also need to have a branch in git that corresponds to the upstream source of the current release. If you satisfy these two conditions, then it will be possible to regenerate upstream's exact tarball using the package's git repository.

(There's a lot of room for automation and standardisation here.)

Q: So we all have to use git now?

A: No, not really. dpkg-source will continue to support the .orig.tar.gz+diff format. Switch to the git format if you like it; ignore it if you don't.

Q: But doesn't this mean that people have to learn git to contribute to or NMU my package?

A: Yes, contributors or NMUers will need to know some basic git commands like "git commit -a", since any change they make needs to be checked in before the package is built.

Q: Might this not be the start of a slippery slope? I don't mind learning git, but what if we get a dozen other revision control systems used in source packages?

A: It will be up to the dpkg developers to keep things sane. (Obligatory comment about sanity of dpkg developers elided. ;-) On the other hand, having a real revision control system available in the source package format leaves less need for dbs and all its ilk. You have to learn them too, before contributing to a package that uses them. So in the end adding git to the source format might make things simpler and more standardised.

Q: Won't this waste bandwidth? Normally, I can upload only a small .diff.gz, and not the whole .orig.tar.gz. There's no diff to accompany the .git.tar.gz ...

A: Yes, you have to upload the whole .git.tar.gz. However, there's also the possibility of using git push, which is much, much faster than uploading often bloated .diff.gz files.

Now, imagine a special upload queue on git.debian.org. You upload just the .debs, .dsc and .changes, and git push the rest, and this queue builds the tarball for you and uploads it on to ftp-master. Done imagining? Go implement it. :-)

Q: Teams like the XSF and kernel team work from git branches of the upstream source. These repositories may not be appropriate to use as the git.tar.gz, firstly because they can be quite large, and secondly because they can contain historic (or, in the case of the kernel, present) files with license problems.

Would it be practical for such a package to have its git history collapsed to the latest upstream release and the debianisation changes, and have that simplified history uploaded in the .git.tar.gz?

A: Sure. Just use git clone --depth to create a shallow clone containing only the history you want in the Debian package, and build from that.

This is recommended for any package that uses git upstream, because the ftp-masters shouldn't have to review the entire historical content of the git repo for badly licensed content.
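As a rough sketch of the shallow-clone step: git's option for truncating history at clone time is --depth, which takes the number of commits to keep from the tip. The helper names, URL, and default depth below are placeholders, not anything dpkg provides.

```python
import subprocess


def shallow_clone_command(url, dest, depth=1):
    """Build the git command line for a clone truncated to `depth` commits."""
    return ["git", "clone", "--depth", str(depth), url, dest]


def shallow_clone(url, dest, depth=1):
    """Run the shallow clone; raises CalledProcessError if git fails."""
    subprocess.run(shallow_clone_command(url, dest, depth), check=True)
```

For example, shallow_clone("git://example.org/foo.git", "foo", depth=20) would keep the 20 most recent commits; pick a depth that covers the latest upstream release plus the debianisation changes.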

(There's a lot of room for automation and standardisation here.)

Q: Won't keeping all this history around bloat the archive something fierce?

A: It depends. Git is quite efficient, and more often than not, after converting a source package to git, you'll find that the .git.tar.gz is smaller than the old .orig.tar.gz.

If you have metric tons of upstream history in git -- i.e., if you're maintaining linux-2.6 or glibc -- and the tarball does get too large, use git clone --depth to make a shallow clone.

If you are packaging a source that contains large gzipped data files (maybe a game's data), or other files that git cannot handle efficiently, git may not be the right choice for that package.

Q: I'm worried about people who want to unpack Debian source packages on a box running {SunOS,Windows}. Before, all they needed were standard unix commands: tar, gzip, and patch. Now they need git, and who knows what other new tools next.

A: If you have a specific use-case for your source package that involves it being able to be unpacked that way, you shouldn't switch it to use git. However, we can't let {SunOS,Windows} get in the way of innovation. :-)

Q: But, since more tools are being used for unpacking source packages, isn't there more potential for security exposure from insecure tools?

A: Yes, this is something the dpkg developers need to consider when adding support for a new version control system.

Git's model means that any security holes in git that can be exploited by a malicious source package could affect anyone pulling from a git repo on the net. This is actually a good thing, because it means that there's pressure to avoid such holes. Other revision control systems might not have their repositories used in ways that promote secure code (for example, svk's repository is local to one user).

(Actually, a few files in .git have to be sanitised when it is unpacked: hook scripts are disabled, and some settings in .git/config are commented out.)

Q: What if the git repository format breaks in some non-backwards-compatible way? We don't want Debian to be stuck with source packages that don't unpack anymore!

A: This seems unlikely to happen with git, because this same repo format is used for publishing git repositories all over the net.

In general, this is again something that has to be considered on a per-system basis. The subversion database format has changed a lot in the past, and nothing is keeping it from changing in the future, so we're unlikely to see dpkg-source supporting .svn.tar.gz.

Q: Why does dpkg-source remove the execute bits from git hooks when unpacking a source package?

A: Since these hooks are run automatically, they can be problematic and even potentially a security problem. In general, unpacking a source package should not involve running any code from the package, and so the git backend takes care to prevent that from happening.

If the hooks are useful to you, you can of course chmod +x them after unpacking.
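To make the hook handling concrete, here is a small sketch of what "removing the execute bits" amounts to. It is not the code from the dpkg patches; disable_hooks is an illustrative name, and the sketch assumes a POSIX filesystem and the usual .git/hooks layout.

```python
import stat
from pathlib import Path

# All three execute permission bits (user, group, other).
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def disable_hooks(repo_dir):
    """Clear the execute bits on every file in .git/hooks.

    Returns the names of the hooks that were changed.
    """
    hooks_dir = Path(repo_dir) / ".git" / "hooks"
    changed = []
    if not hooks_dir.is_dir():
        return changed
    for hook in sorted(hooks_dir.iterdir()):
        # Skip directories and symlinks; only plain files are touched.
        if hook.is_symlink() or not hook.is_file():
            continue
        mode = hook.stat().st_mode
        if mode & EXEC_BITS:
            hook.chmod(stat.S_IMODE(mode) & ~EXEC_BITS)
            changed.append(hook.name)
    return changed
```

Git only runs hooks that are executable, so clearing the bits is enough to keep them from firing, and chmod +x restores them.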

Q: Why does dpkg-source comment out parts of .git/config when unpacking a source package?

A: Some things in .git/config, like aliases, may be unexpected or unsafe when provided by a third party in a source package. To be better safe than sorry, dpkg-source comments out settings that are not in its whitelist.
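A whitelist filter of this kind might look roughly like the sketch below. The ALLOWED set is invented for illustration and is not dpkg-source's real whitelist, and the parser only understands the common layout of one section header or one setting per line (plus backslash continuations). Anything it does not recognise gets commented out, which errs on the safe side.

```python
import re

# Hypothetical whitelist of "section.key" names; NOT dpkg-source's real list.
ALLOWED = {
    "core.repositoryformatversion", "core.filemode", "core.bare",
    "remote.url", "remote.fetch", "branch.remote", "branch.merge",
}

SECTION_RE = re.compile(r'^\s*\[\s*([A-Za-z0-9.-]+)(?:\s+"[^"]*")?\s*\]\s*$')
KEY_RE = re.compile(r'^\s*([A-Za-z][A-Za-z0-9-]*)\s*(?:=|$)')


def sanitise_config(text, allowed=ALLOWED):
    """Comment out every setting whose section.key is not in `allowed`."""
    out = []
    section = None
    in_continuation = False   # previous setting line ended with a backslash
    keep = True
    for line in text.splitlines():
        stripped = line.strip()
        if in_continuation:
            pass  # a continuation line shares the fate of the line before it
        elif not stripped or stripped[0] in "#;":
            out.append(line)  # blank lines and comments pass through
            continue
        else:
            m = SECTION_RE.match(line)
            if m:
                section = m.group(1).lower()
                out.append(line)
                continue
            k = KEY_RE.match(line)
            keep = bool(k and section
                        and f"{section}.{k.group(1).lower()}" in allowed)
        out.append(line if keep else "# " + line)
        in_continuation = line.rstrip().endswith("\\")
    return "\n".join(out) + "\n"
```

With this whitelist, an [alias] entry or a core.pager setting ends up commented out, while remote URLs survive so that git pull keeps working.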

Q: If I apt-get source a package that uses git, can I use git pull, etc?

A: Yes, and it's good practice for the package's maintainer to make sure they have a .git/config that allows this, since it goes into the source package.

In general, you should be able to use a git repo obtained via apt-get source just like any other repo. (There will be some limitations if the maintainer chose to publish a shallow clone.)

Q: dpkg-source complains that there are uncommitted changes and fails my package build. Why?

A: This check is done because only committed changes will end up in the source package. If dpkg-source allowed a build to happen from an unclean repository, you could end up with a source package that doesn't match the binaries you distribute.
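If you want to run a comparable check yourself before building, something like the following works. It is only a sketch: it uses git status --porcelain, which may differ from what dpkg-source does internally, and the function names are made up.

```python
import subprocess


def parse_porcelain(output):
    """Extract paths from `git status --porcelain` output.

    Each line is a two-character status code, a space, then the path.
    (Renames appear as "old -> new"; that text is returned unchanged.)
    """
    return [line[3:] for line in output.splitlines() if len(line) > 3]


def uncommitted_paths(repo_dir="."):
    """Return modified, deleted, added or untracked paths in repo_dir."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return parse_porcelain(result.stdout)
```

An empty list from uncommitted_paths() means the working tree is clean. Files matched by .gitignore are not reported by git status, so they do not show up here either.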

If you want to use dpkg-buildpackage to do a test build, you can use dpkg-buildpackage -b to skip building the source package.

Of course, you can always just list files to ignore in .gitignore. Or, to ignore a file only for one build, use dpkg-buildpackage -i.

Q: But I didn't change any files myself and it still complains?

A: There are some bad old habits that can lead to this happening. Maybe your package deletes a file as part of its build process? With the old source format, file deletions can't be represented, and are just ignored by dpkg-source, so it's not uncommon for files to be deleted during the build. If you convert such a package to use the git source format, you need to fix it to not delete files during the build.

Similarly, the old source format can't represent changes to binary files, while git can, and so if binary files are modified as part of the build, you'll need to work around it.

Simple example: A package ships with a.out in the upstream tarball (yuck), but also creates it during the build. Of course the clean rule removes it. The fix is easy in this case: just git rm a.out and commit.

Q: What if I have one git repository for a lot of packages?

A: Sorry, but you need to have one repo per package for this to work.

http://people.freedesktop.org/~jamey/git-split