Debian Hadoop packaging efforts
Debian currently does not include Hadoop packages. There are several reasons for this; in particular, the Hadoop build process loads various dependencies via Maven instead of using distribution-supplied packages. Java projects like this are not easy to package because of their interdependencies, and the Hadoop stack is unfortunately full of odd dependencies (including Apache Forrest, which for a long time only worked with Java 1.5).
If you want to build Debian packages, the most complete effort can be found at the Apache Bigtop project: http://bigtop.apache.org/. Unfortunately, the build process for these packages is currently of disastrous quality and should only be attempted within disposable virtual machines, because it requires root permissions and installs non-packaged software.
The Bigtop build scripts allow building Hadoop packages on a recent Debian system. However, the resulting packages do not live up to Debian quality standards. In particular, they include copies of .jar files from other packages, and thus will not benefit from security updates made to those packages.
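To see how many .jar files such a package bundles, one can list them in an unpacked package tree. The following is only a minimal sketch: it assumes the package contents were already extracted into a directory (for example with `dpkg-deb -x`), and the function name `find_bundled_jars` is hypothetical, not part of Bigtop or any Debian tool.

```python
from pathlib import Path


def find_bundled_jars(root):
    """Return the sorted relative paths of all .jar files below root.

    root is assumed to be a directory holding the extracted contents
    of a package; this helper is illustrative only.
    """
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.jar")
        if p.is_file()
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_jars.py <extracted-package-directory>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for jar in find_bundled_jars(target):
        print(jar)
```

Each listed file that duplicates a library already packaged in Debian is a candidate for replacement by a dependency on the corresponding Debian package.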
If you are interested in getting Hadoop packages into Debian,
- Coordinate with the Java packaging team
- Coordinate with upstream Apache Bigtop
to avoid duplicate efforts. Thank you.
Information below is dated 2010
This page is used to track information regarding the packaging of Hadoop in Debian.
from the packaging repository, debian/TODO
Packaging Repo: http://git.debian.org/?p=pkg-java/hadoop.git;a=summary
Solved by excluding parts of hadoop:
jets3t: ant-based, lots of dependencies, but we have most of them, except:
Java XMLBuilder: single Java class, no very specific dep, ITP: 573804
kfs (KosmosFS): ant-based, no very specific dep.
KosmosFS and jets3t (Amazon S3) are optional filesystem implementations. It should be possible to build a Debian Hadoop package without them until they are packaged.
For Katta (not a part of hadoop):