Differences between revisions 1 and 17 (spanning 16 versions)
Revision 1 as of 2007-10-05 23:12:31
Size: 9215
Editor: JoeyHess
Comment: new page documenting my dpkg-source v3 git stuff
Revision 17 as of 2012-05-18 23:21:58
Size: 9312
Comment: Split in two sections, FAQ and discussion.
Line 1: Line 1:
This FAQ concerns a the new .git.tar.gz format for debian source packages.
Support for this is not yet available in dpkg, but I have patches at
git://kitenet.net/dpkg, and it's been submitted to the dpkg maintainers.
--JoeyHess
= Dpkg source format version 3.0 (git) =

== FAQ ==

'''This FAQ concerns the new 3.0 (git) format for Debian source packages.'''

The version documented here is included in dpkg 1.15.8.
Line 8: Line 11:
'''A''': All that's really needed is to edit debian/control and put
"Format: 3.0 (git)" in the source stanza. Of course, you need to
build the package from a git checkout of your source package.
There are some additional things you might want to consider, see
below.
'''A''': Although all that's needed for dpkg-buildpackage or debuild to build the new source format is to edit debian/source/format to contain
"3.0 (git)", be aware that this format is not yet supported by the Debian upload queue, so only do this if you are experimenting with the new format. To do this, you also need to build the package from a git checkout of your source package (and note that git-buildpackage explicitly creates a non-Git source tree to build from, so stick to dpkg-buildpackage or debuild). There are some additional things you might want to consider, see below.
Line 16: Line 16:
'''A''': You can use the pristine-tar package to generate a small "delta" file that can be checked into git. You'll also need to have a branch in
git that corresponds to the upstream source of the current release.
If you satisfy these two conditions, then it will be possible to
regenerate upstream's exact tarball using the package's git repository.
   
(There's a lot of room for automation and standardisation here.)
'''A''': You can use the pristine-tar package to commit a delta file. It can then later extract the pristine source tarball directly from the git repository.
Line 27: Line 22:
ignore it if you don't like. ignore it if you don't like. Note that bzr (experimental) and a quilt format are also available in v3 source format packages.
Line 38: Line 33:
in the source package format leaves need for dbs and all its ilk. in the source package format avoids need for dbs and all its ilk.
Line 45: Line 40:
the .git.tar.gz ... the git bundle ...
Line 47: Line 42:
'''A''': Yes, you have to upload the whole .git.tar.gz. However, there's also '''A''': Yes, you have to upload the whole git bundle. However, there's also
Line 54: Line 49:
imagining? Go implement it. :-) imagining? Go implement it. :-) (also see
[[http://blog.madduck.net/debian/2005.08.11-rcs-uploads|this blog post]]).
Line 58: Line 54:
the git.tar.gz, firstly because they can be quite large, and secondly a source package, firstly because they can be quite large, and secondly
Line 64: Line 60:
changes, and have that simplified history uploaded in the .git.tar.gz? changes, and have that simplified history uploaded in the git bundle?
Line 66: Line 62:
'''A''': Sure. Just use git clone --shallow to create a shallow clone containing only the history you want in the Debian package, and build from that. '''A''': Sure. Just use dpkg-source's --git-depth option to specify a depth, and a shallow clone will be created containing only the history you want in the Debian package. This option can be put in debian/source/options or debian/source/local-options.
Line 72: Line 68:
(There's a lot of room for automation and standardisation here.)
Line 77: Line 71:
'''A''': It depends. Git is quite efficient, and more commonly, after converting a source package to git, you'll find that the .git.tar.gz is smaller than the old .orig.tar.gz. '''A''': It depends. Git is quite efficient, and more commonly, after converting a source package to git, you'll find that the git bundle is smaller than the old .orig.tar.gz.
Line 81: Line 75:
use git clone --shallow. use --git-depth.
Line 89: Line 83:
'''A''': If you have a specific use-case for your source package that involves it being able to be unpacked that way, you shouldn't switch it to use git. However, we can't let {SunOS,Windows} get in the way of innovation. :-) '''A''': If you have a specific use-case for your source package that involves it being able to be unpacked that way, you shouldn't switch it to use git. However, we can't let {SunOS,Windows} get in the way of innovation. :-) Also, the git source format is simply a git-bundle file. It can be cloned with
plain old git if you don't have dpkg-source, and nothing else needs to be done to unpack the package.
So this is actually probably the easiest format to unpack by hand, if you have git.
Line 95: Line 91:
Git's model means that any security holes in git that can be exploited by Git bundles use the same data formats as the git packs sent over the wire when cloning a git repository.
So
any security holes in git that can be exploited by
Line 102: Line 99:
(Actually, a few files in .git have to be sanitised when it is unpacked:
hook scripts are disabled, and some settings in .git/config are
commented out.)
Line 110: Line 103:
'''A''': This seems unlikely to happen with git, because this same repo format is used for publishing git repositories all over the net. '''A''': This seems unlikely to happen with git, because this same format is used for publishing git repositories all over the net.
Line 117: Line 110:
'''Q''': Why does dpkg-source remove the execute bits from git hooks when unpacking a source package? '''Q''': Are hooks preserved?
Line 119: Line 112:
'''A''': Since these hooks are run automatically, they can be probalimatic and even potentially a security problem. In general, unpacking a source
package should not involve running any code from the package, and so the git backend takes care to prevent that from happening.

If the hooks are useful to you, you can of course chmod +x them after
unpacking.

'''Q''': Why does dpkg-source comment out parts of .git/config when unpacking a source package?

'''A''': Some things in .git/config, like aliases, may be unexpected or unsafe when provided by a third party in a source package. To be better safe than sorry, dpkg-source comments out settings that are not in its
whitelist.
'''A''': No, git-bundle does not preserve hooks or other configuration in .git. All you get from a bundle are all the local branches, tags, and history.
Line 132: Line 116:
'''A''': Yes, and it's good practice for the package's maintainer to make sure they have a .git/config that allows this, since it goes into the source
package.
'''A''': The bundle file is configured as the origin, so pulling from it is not useful. You can always
add remotes as usual.
Line 166: Line 150:
<http://people.freedesktop.org/~jamey/git-split> http://people.freedesktop.org/~jamey/git-split

== Discussion ==

Pros and cons for accepting packages in format 3.0 (git) in our archive have been discussed in DebConf10 and reported on [[http://lists.debian.org/debian-devel/2010/08/msg00244.html|debian-devel]].

----
__See also:__
 * [[Projects/DebSrc3.0]] - Source package formats "3.0 (quilt)" and "3.0 (native)"

Dpkg source format version 3.0 (git)

FAQ

This FAQ concerns the new 3.0 (git) format for Debian source packages.

The version documented here is included in dpkg 1.15.8.

Q: How do I convert a debian source package to be built using the git format?

A: For dpkg-buildpackage or debuild to build the new source format, all that's needed is to edit debian/source/format to contain "3.0 (git)". Be aware, though, that this format is not yet supported by the Debian upload queue, so only do this if you are experimenting with the new format. You also need to build the package from a git checkout of your source package. Note that git-buildpackage explicitly creates a non-git source tree to build from, so stick to dpkg-buildpackage or debuild. There are some additional things you might want to consider; see below.
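
As a minimal sketch of the edit described above, the following writes the format file into a scratch directory. The package directory name "hello" is hypothetical; in a real package you would run the two middle commands at the top of your git checkout and commit the result.

```shell
# Scratch location standing in for the top of a source package checkout
# ("hello" is a made-up package name used only for illustration).
pkg="$(mktemp -d)/hello"
mkdir -p "$pkg/debian/source"

# The whole conversion: debian/source/format holds the format name.
echo '3.0 (git)' > "$pkg/debian/source/format"
cat "$pkg/debian/source/format"

# In a real checkout, commit this file and then build with
# dpkg-buildpackage or debuild (not run here).
```

The scratch directory is left in place so the result can be inspected; remove it by hand when you are done.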

Q: What about pristine tarballs?

A: You can use the pristine-tar package to commit a delta file to the git repository. pristine-tar can then later regenerate the pristine source tarball directly from that repository.

Q: So we all have to use git now?

A: No, not really. dpkg-source will continue to support the .orig.tar.gz+diff format. Switch to using the git format if you like, or ignore it if you don't. Note that bzr (experimental) and quilt formats are also available as v3 source formats.

Q: But doesn't this mean that people have to learn git to contribute to or NMU my package?

A: Yes, contributors or NMUers will need to know some basic git commands like "git commit -a", since any change they make needs to be checked in before the package is built.

Q: Might this not be the start of a slippery slope? I don't mind learning git, but what if we get a dozen other revision control systems used in source packages?

A: It will be up to the dpkg developers to keep things sane. (Obligatory comment about sanity of dpkg developers elided. ;-) On the other hand, having a real revision control system available in the source package format avoids the need for dbs and all its ilk. You have to learn those too before contributing to a package that uses them. So in the end, adding git to the source format might make things simpler and more standardised.

Q: Won't this waste bandwidth? Normally, I can upload only a small .diff.gz, and not the whole .orig.tar.gz. There's no diff to accompany the git bundle ...

A: Yes, you have to upload the whole git bundle. However, there's also the possibility of using git push, which is much, much faster than uploading often bloated .diff.gz files.

Now, imagine a special upload queue on git.debian.org. You upload just the .debs, .dsc and .changes, and git push the rest, and this queue builds the tarball for you and uploads it to ftp-master. Done imagining? Go implement it. :-) (Also see this blog post: http://blog.madduck.net/debian/2005.08.11-rcs-uploads)

Q: Teams like the XSF and kernel team work from git branches of the upstream source. These repositories may not be appropriate to use as a source package, firstly because they can be quite large, and secondly because they can contain historic (or, in the case of the kernel, present) files with license problems.

Would it be practical for such a package to have its git history collapsed to the latest upstream release and the debianisation changes, and have that simplified history uploaded in the git bundle?

A: Sure. Just use dpkg-source's --git-depth option to specify a depth, and a shallow clone will be created containing only the history you want in the Debian package. This option can be put in debian/source/options or debian/source/local-options.
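
A small sketch of putting that option into debian/source/options, assuming the usual convention for that file of one long option per line with the leading "--" stripped. The package directory "hello" and the depth of 1 are illustrative choices, not recommendations.

```shell
# Scratch location standing in for a source package checkout
# ("hello" is a hypothetical package name).
pkg="$(mktemp -d)/hello"
mkdir -p "$pkg/debian/source"

# Equivalent to passing --git-depth=1 to dpkg-source on every build.
# The value 1 is only an example; pick the depth that keeps the
# history you actually want to ship.
printf 'git-depth = 1\n' > "$pkg/debian/source/options"
cat "$pkg/debian/source/options"
```

Use debian/source/local-options instead if the setting should apply to your own builds without being shipped in the source package.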

This is recommended for any package that uses git upstream, because the ftp-masters shouldn't have to review the entire historical content of the git repo for badly licensed content.

Q: Won't keeping all this history around bloat the archive something fierce?

A: It depends. Git is quite efficient, and more commonly, after converting a source package to git, you'll find that the git bundle is smaller than the old .orig.tar.gz.

If you have metric tons of upstream history in git -- i.e., if you're maintaining linux-2.6 or glibc -- and the tarball does get too large, use --git-depth.

If you are packaging a source that contains large gzipped data files (maybe a game's data), or other files that git cannot handle efficiently, git may not be the right choice for that package.

Q: I'm worried about people who want to unpack Debian source packages on a box running {SunOS,Windows}. Before, all they needed were standard Unix commands: tar, gzip, and patch. Now they need git, and who knows what other new tools next.

A: If you have a specific use-case for your source package that involves it being able to be unpacked that way, you shouldn't switch it to use git. However, we can't let {SunOS,Windows} get in the way of innovation. :-) Also, the git source format is simply a git-bundle file. It can be cloned with plain old git if you don't have dpkg-source, and nothing else needs to be done to unpack the package. So this is actually probably the easiest format to unpack by hand, if you have git.
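
To illustrate unpacking by hand, here is a sketch that builds a tiny bundle with plain git and then clones it, with no dpkg-source involved. The file name hello_1.0.git, the branch name main, and the commit identity are assumptions made for the demonstration; a real source package ships its own bundle file.

```shell
work="$(mktemp -d)"

# Build a throwaway repository with a single commit.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
echo 'hello' > "$work/src/hello.txt"
git -C "$work/src" add hello.txt
git -C "$work/src" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Initial import'

# Pack it into a bundle file, standing in for the one in a source package.
git -C "$work/src" bundle create "$work/hello_1.0.git" HEAD main

# "Unpacking" is an ordinary clone of that file.
git clone -q "$work/hello_1.0.git" "$work/unpacked"
cat "$work/unpacked/hello.txt"
```

Everything lives under the scratch directory in $work, which can be deleted afterwards.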

Q: But, since more tools are being used for unpacking source packages, isn't there more potential for security exposure from insecure tools?

A: Yes, this is something the dpkg developers need to consider when adding support for a new version control system.

Git bundles use the same data formats as the git packs sent over the wire when cloning a git repository. So any security holes in git that can be exploited by a malicious source package could affect anyone pulling from a git repo on the net. This is actually a good thing, because it means that there's pressure to avoid such holes. Other revision control systems might not have their repositories used in ways that promote secure code (for example, svk's repository is local to one user).

Q: What if the git repository format breaks in some non-backwards-compatible way? We don't want Debian to be stuck with source packages that don't unpack anymore!

A: This seems unlikely to happen with git, because this same format is used for publishing git repositories all over the net.

In general, this is again something that has to be considered on a per-system basis. The subversion database format has changed a lot in the past, and nothing is keeping it from changing in the future, so we're unlikely to see dpkg-source supporting .svn.tar.gz.

Q: Are hooks preserved?

A: No, git-bundle does not preserve hooks or other configuration in .git. All you get from a bundle are all the local branches, tags, and history.
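
The behaviour can be checked with plain git. This sketch installs a hook in a throwaway repository, bundles it, and looks for the hook in a clone of the bundle. Repository names, the branch name main, and the hook contents are all made up for the test.

```shell
work="$(mktemp -d)"

# Throwaway repository with one commit.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
echo 'hello' > "$work/src/hello.txt"
git -C "$work/src" add hello.txt
git -C "$work/src" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Initial import'

# Install an executable hook in the original repository.
printf '#!/bin/sh\nexit 0\n' > "$work/src/.git/hooks/pre-commit"
chmod +x "$work/src/.git/hooks/pre-commit"

# Bundle, then clone the bundle as a recipient would.
git -C "$work/src" bundle create "$work/hello_1.0.git" HEAD main
git clone -q "$work/hello_1.0.git" "$work/unpacked"

# The hook exists only in the original, not in the clone.
if [ -e "$work/unpacked/.git/hooks/pre-commit" ]; then
    hook_state=present
else
    hook_state=absent
fi
echo "pre-commit hook in clone: $hook_state"
```

The clone may still contain the stock *.sample hooks that git installs from its templates; those are inactive and do not come from the bundle.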

Q: If I apt-get source a package that uses git, can I use git pull, etc?

A: The bundle file is configured as the origin, so pulling from it is not useful. You can always add remotes as usual.

In general, you should be able to use a git repo obtained via apt-get source just like any other repo. (There will be some limitations if the maintainer chose to publish a shallow clone.)
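
A sketch of what that looks like in practice, using a locally built bundle in place of one fetched by apt-get source. The remote name "upstream" and the example.org URL are hypothetical; adding a remote does not contact the network.

```shell
work="$(mktemp -d)"

# Stand-in for a bundle obtained from a source package.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
echo 'hello' > "$work/src/hello.txt"
git -C "$work/src" add hello.txt
git -C "$work/src" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Initial import'
git -C "$work/src" bundle create "$work/hello_1.0.git" HEAD main
git clone -q "$work/hello_1.0.git" "$work/unpacked"

# origin points at the bundle file itself.
origin_url="$(git -C "$work/unpacked" remote get-url origin)"
echo "origin: $origin_url"

# Add a real repository as a second remote (hypothetical URL).
git -C "$work/unpacked" remote add upstream https://example.org/hello.git
git -C "$work/unpacked" remote
```

After this, git fetch upstream would be the way to pick up new history, assuming the URL pointed at a real repository.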

Q: dpkg-source complains that there are uncommitted changes and fails my package build?

A: This check is done because only committed changes will end up in the source package. If dpkg-source allowed a build to happen from an unclean repository, you could end up with a source package that doesn't match the binaries you distribute.

If you want to use dpkg-buildpackage to do a test build, you can use dpkg-buildpackage -b to skip building the source package.

Of course, you can always just list files to ignore in .gitignore. Or, to ignore a file only for one build, use dpkg-buildpackage -i.
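
If you want to see what would trip the check before starting a build, git status can give a rough preview. This is only an approximation using plain git, not the exact test dpkg-source performs; the repository and file names are invented for the example.

```shell
work="$(mktemp -d)"

# Throwaway repository with one committed file.
git init -q "$work/hello"
echo 'hello' > "$work/hello/README"
git -C "$work/hello" add README
git -C "$work/hello" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Initial import'

# Simulate an edit that was never committed.
echo 'local tweak' >> "$work/hello/README"

# --porcelain prints one line per modified or untracked file,
# and nothing at all when the tree is clean.
dirty="$(git -C "$work/hello" status --porcelain)"
if [ -n "$dirty" ]; then
    echo 'Uncommitted changes found:'
    echo "$dirty"
else
    echo 'Working tree is clean.'
fi
```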

Q: But I didn't change any files myself and it still complains?

A: There are some bad old habits that can lead to this happening. Maybe your package deletes a file as part of its build process? With the old source format, file deletions can't be represented, and are just ignored by dpkg-source, so it's not uncommon for files to be deleted during the build. If you convert such a package to use the git source format, you need to fix it to not delete files during the build.

Similarly, the old source format can't represent changes to binary files, while git can, and so if binary files are modified as part of the build, you'll need to work around it.

Simple example: A package ships with a.out in the upstream tarball (yuck), but also creates it during the build. Of course the clean rule removes it. The fix is easy in this case, just git rm a.out and commit.
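
The a.out fix can be sketched as follows in a throwaway repository. The file contents, commit messages, and commit identity are placeholders.

```shell
work="$(mktemp -d)"

# Repository that mistakenly tracks a build product from the tarball.
git init -q "$work/hello"
printf 'stale binary placeholder\n' > "$work/hello/a.out"
echo 'int main(void) { return 0; }' > "$work/hello/main.c"
git -C "$work/hello" add a.out main.c
git -C "$work/hello" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Import upstream tarball'

# Stop tracking the file that the build recreates and the clean rule removes.
git -C "$work/hello" rm -q a.out
git -C "$work/hello" -c user.name='Example' -c user.email='example@example.org' \
    commit -q -m 'Stop tracking a.out; the build regenerates it'

# Only the real source file remains tracked.
tracked="$(git -C "$work/hello" ls-files)"
echo "$tracked"
```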

Q: What if I have one git repository for a lot of packages?

A: Sorry, but you need to have one repo per package for this to work.

http://people.freedesktop.org/~jamey/git-split

Discussion

Pros and cons of accepting packages in format 3.0 (git) in our archive were discussed at DebConf10 and reported on debian-devel (http://lists.debian.org/debian-devel/2010/08/msg00244.html).


See also:
 * Projects/DebSrc3.0 - Source package formats "3.0 (quilt)" and "3.0 (native)"