Differences between revisions 4 and 5
Revision 4 as of 2013-11-14 10:19:51
Size: 10977
Editor: wookey
Comment:
Revision 5 as of 2014-11-27 05:54:19
Size: 17201
Editor: wookey
Comment: major rewrite started - not finished yet
Line 1: Line 1:
= Debian Cross Toolchain overview =
Line 3: Line 5:
= Related Pages =

 * CoinstallableToolchains : Scheme to allow co-installable compilers
 * MultiarchCrossToolchainBuild : Info on building toolchains using multiarch methods
 * [[toolchain/BootstrapIssues]] : Breakage found when building debian crosstoolchains (might save you some time!)


= Cross toolchain =

== Identify issues ==
 * CrossToolchains : Top-level Cross Toolchain index, status, and Installation details

== Cross-toolchain building ==

The cross-toolchain has several components. The fundamental ones are:
 * binutils
 * gcc

But there are also necessary related parts:
 * target-arch libc
 * target-arch linux-libc-headers
 * gcc-defaults symlinks

And related packages such as build-essential/crossbuild-essential, dpkg-cross, and cross-pkg-config. The importance or otherwise of these is discussed later.

Binutils is straightforward as it does not have any host-arch dependencies and is easily built for any supported target arch, to produce a binutils-<triplet> package.

The interesting bit is gcc. <triplet>-gcc needs a <target>-libc-dev and <target>-linux-libc-headers to build. Those can be generated in various ways, and the gcc packaging supports two locations for them.
 1. in 'classic cross-compiler dirs': /usr/<triplet>/{lib,include}
 1. in multiarch locations: /usr/lib/<triplet>, /usr/include/<triplet>

The first case is the default, the second case is enabled by setting 'with_deps_on_target_arch_pkgs=yes'.

The files are supplied in corresponding dependencies:
 1. libc-dev-<target>-cross, linux-libc-dev-<target>-cross
 1. libc-dev:<target>, linux-libc-dev:<target>

We refer to these build types as 'standalone', and 'multiarch', because in the first case the resulting cross gcc depends on -cross:all packages, and in the second case depends on foreign-arch :<target> packages.

There are also two ways of getting the -cross packages. They can be copied from the :<target> packages (by moving files around using dpkg-cross), or they can be cross-built from libc and linux (currently still using dpkg-cross).


A multiarch build produces (for example) gcc-arm-linux-gnueabihf (and cpp,g++,gfortran) against the linux-libc-dev:armhf, libc6-dev:armhf, libstdc++-dev:armhf and libgcc1:armhf already in the archive. The build is quick and simple, but the resulting package has cross-arch dependencies (on the various :armhf packages), so you need to enable multiarch to install it. You will need to enable multiarch for most cross-builds anyway.

A standalone build bootstrapped from source produces (for example) linux-libc-dev-armhf-cross, libc-dev-armhf-cross, libstdc++-dev-armhf-cross, libgcc1-armhf-cross and gcc-arm-linux-gnueabihf (and cpp,g++,gfortran) from the kernel, glibc, and gcc sources, via the toolchain bootstrap process. Because the foreign-arch libraries are converted to build-arch packages, the toolchain does not need multiarch enabled to build, but the build is much longer and more complicated, and you end up with two copies of those libraries on your system.

This type of build is necessary when the target architecture is not in the debian archive as the libraries are not available to build against.


=== Multiarch vs Multilib ===

Multiarch and multilib can be viewed as alternative ways of doing the same thing (providing a place for foreign libraries). In the toolchain

Targeting related architectures (such as i386/amd64, armel/armhf, mips/mipsel) can be done in two different ways. You can either build one cross-compiler for each target, and install whichever ones you need, or you can build one cross-compiler which has two or more multilibs installed, install just that one cross-compiler and use build options to control which code is output.

For example, on amd64 you can target i386 either by running:
 * {{{i386-linux-gnu-gcc}}}
or by running
 * {{{x86_64-linux-gnu-gcc -m32}}}

These options are not consistent across different architecture sets,
whilst use of <triplet>-gcc always works:
  {{{arm-linux-gnueabi-gcc}}} produces the same (armel) output on all arches
  {{{arm-linux-gnueabihf-gcc -mfloat-abi=softfp}}} can be used instead on armhf to produce armel binaries.

Building these multilibbed cross-toolchains is a lot more fiddly than plain multiarch ones. Thus the current debian cross-toolchains are not multilibbed and only support the <triplet>-gcc method. Install whichever targets you need, and use the same commands everywhere. Encourage upstreams to call tools this way, rather than using -mabi=blah options, although in the x86 world the use of -m32/-m64 is common and it is probably too late to change. Some packages may need their build options adjusting to do this right.


== Cross Toolchain build methodologies ==

Debian cross-toolchains have existed in various forms since 2000.

First (circa 2000) was the 'toolchain-source' package, which was a copy of the gcc sources that could be used to build cross-compilers. This suffered from being behind the normal gcc packages, with different patches and bugs. The cross-support rules in this were merged into the main gcc package, and gcc output a gcc-source package so that cross-toolchains could be built using that.

For many years the emdebian project used this functionality to build cross-toolchain binaries for Debian. These builds were done by using the libc/linux-headers from the target arch, converted to libc-<target>-cross/linux-libc-dev-<target>-cross with dpkg-cross, then building the package against those. The 'buildcross' tool was developed to mechanise this process.

The problem with this method was that it could not easily be made into a standard package that would build in the archive, because there was no way to express the dependency on the foreign-arch libc/linux-libc-dev, and also because downloading as part of a package build is not permitted. Thus the packages lived outside the archive (at emdebian) for a decade or so. They became quite well used.

Whilst multiarch was being developed it became clear that it could solve this problem of specifying foreign dependencies for cross-toolchains, and explicit-arch dependencies were included in the spec partly for that reason.

Meanwhile linaro wanted cross-toolchains in Ubuntu before all this was ready so Marcin packaged up the whole toolchain bootstrap procedure into a package which build-depped on linux, binutils, libc, and gcc sources, and built linux-libc-dev-<target>-cross, binutils-<triplet>, libc-<target>-cross, gcc-<triplet>, via gcc stage1, libc stage1, gcc stage2, libc stage2, gcc stage3. This was the only way to build a cross-toolchain inside a standard package at the time. Those toolchains went into Ubuntu 10.10 and at the emdebian sprint at ARM in 2009 we agreed that they would be fixed up to build on Debian and uploaded there too, until multiarch-built cross-toolchains were available/practical.

Unfortunately that was never done so Debian still had no in-archive cross-toolchains for wheezy, and the emdebian toolchains were not maintained any more as we expected a move to the new multiarch ones quickly. That took much longer than expected due to other work/distractions.

Multiarch-built cross-toolchains were working in 2013, but still could not be uploaded until the infrastructure learned about foreign-arch dependencies. This <more later>...
Line 35: Line 105:
  Also a gcc-defaults-cross package is desired (scroll down).
 
Line 39: Line 108:
  * NumSourcePackages = ($build_host_arch ('amd64') -> $build_target_arch ('armel') ) x (11 ports + X non-official ports)
  
  If we split the build into several source packages, the maintenance burden grows to:
  
  * Y source packages x NumSourcePackages
  
  * A possible optimization at build time would be to have cross compilers to build depend on themselves.

  

Debian Cross Toolchain overview

  • CrossToolchains : Top-level Cross Toolchain index, status, and Installation details

Cross-toolchain building

The cross-toolchain has several components. The fundamental ones are:

  • binutils
  • gcc

But there are also necessary related parts:

  • target-arch libc
  • target-arch linux-libc-headers
  • gcc-defaults symlinks

And related packages such as build-essential/crossbuild-essential, dpkg-cross, and cross-pkg-config. The importance or otherwise of these is discussed later.

Binutils is straightforward as it does not have any host-arch dependencies and is easily built for any supported target arch, to produce a binutils-<triplet> package.

The interesting bit is gcc. <triplet>-gcc needs a <target>-libc-dev and <target>-linux-libc-headers to build. Those can be generated in various ways, and the gcc packaging supports two locations for them.

  1. in 'classic cross-compiler dirs': /usr/<triplet>/{lib,include}

  2. in multiarch locations: /usr/lib/<triplet>, /usr/include/<triplet>

The first case is the default, the second case is enabled by setting 'with_deps_on_target_arch_pkgs=yes'.

The files are supplied in corresponding dependencies:

  1. libc-dev-<target>-cross, linux-libc-dev-<target>-cross

  2. libc-dev:<target>, linux-libc-dev:<target>

We refer to these build types as 'standalone', and 'multiarch', because in the first case the resulting cross gcc depends on -cross:all packages, and in the second case depends on foreign-arch :<target> packages.
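The two build types can be summarised in a small Python sketch. This is only an illustration of the naming patterns given above; the function and class names (cross_layout, CrossLayout) are made up for this example and are not part of the gcc packaging.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossLayout:
    """Where a cross gcc looks for target files, and what it depends on."""
    lib_dir: str
    include_dir: str
    dependencies: tuple


def cross_layout(target, triplet, multiarch=False):
    """Return the layout for a 'standalone' (default) or 'multiarch' build.

    target  -- Debian architecture name, e.g. 'armhf'
    triplet -- matching GNU triplet, e.g. 'arm-linux-gnueabihf'
    """
    if multiarch:
        # Corresponds to with_deps_on_target_arch_pkgs=yes
        return CrossLayout(
            lib_dir=f"/usr/lib/{triplet}",
            include_dir=f"/usr/include/{triplet}",
            dependencies=(f"libc-dev:{target}", f"linux-libc-dev:{target}"),
        )
    # Classic cross-compiler directories, fed by -cross packages
    return CrossLayout(
        lib_dir=f"/usr/{triplet}/lib",
        include_dir=f"/usr/{triplet}/include",
        dependencies=(f"libc-dev-{target}-cross",
                      f"linux-libc-dev-{target}-cross"),
    )


if __name__ == "__main__":
    for flag in (False, True):
        print(cross_layout("armhf", "arm-linux-gnueabihf", multiarch=flag))
```

The package names are the generic patterns used on this page; real package names in the archive may carry extra version numbers (libc6-dev rather than libc-dev, for instance).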

There are also two ways of getting the -cross packages. They can be copied from the :<target> packages (by moving files around using dpkg-cross), or they can be cross-built from libc and linux (currently still using dpkg-cross).

A multiarch build produces (for example) gcc-arm-linux-gnueabihf (and cpp,g++,gfortran) against the linux-libc-dev:armhf, libc6-dev:armhf, libstdc++-dev:armhf and libgcc1:armhf already in the archive. The build is quick and simple, but the resulting package has cross-arch dependencies (on the various :armhf packages), so you need to enable multiarch to install it. You will need to enable multiarch for most cross-builds anyway.

A standalone build bootstrapped from source produces (for example) linux-libc-dev-armhf-cross, libc-dev-armhf-cross, libstdc++-dev-armhf-cross, libgcc1-armhf-cross and gcc-arm-linux-gnueabihf (and cpp,g++,gfortran) from the kernel, glibc, and gcc sources, via the toolchain bootstrap process. Because the foreign-arch libraries are converted to build-arch packages, the toolchain does not need multiarch enabled to build, but the build is much longer and more complicated, and you end up with two copies of those libraries on your system.

This type of build is necessary when the target architecture is not in the debian archive as the libraries are not available to build against.

Multiarch vs Multilib

Multiarch and multilib can be viewed as alternative ways of doing the same thing (providing a place for foreign libraries). In the toolchain

Targeting related architectures (such as i386/amd64, armel/armhf, mips/mipsel) can be done in two different ways. You can either build one cross-compiler for each target, and install whichever ones you need, or you can build one cross-compiler which has two or more multilibs installed, install just that one cross-compiler and use build options to control which code is output.

For example, on amd64 you can target i386 either by running:

  • i386-linux-gnu-gcc

or by running

  • x86_64-linux-gnu-gcc -m32

These options are not consistent across different architecture sets, whilst use of <triplet>-gcc always works:

  • arm-linux-gnueabi-gcc produces the same (armel) output on all arches
  • arm-linux-gnueabihf-gcc -mfloat-abi=softfp can be used instead on armhf to produce armel binaries.

Building these multilibbed cross-toolchains is a lot more fiddly than plain multiarch ones. Thus the current debian cross-toolchains are not multilibbed and only support the <triplet>-gcc method. Install whichever targets you need, and use the same commands everywhere. Encourage upstreams to call tools this way, rather than using -mabi=blah options, although in the x86 world the use of -m32/-m64 is common and it is probably too late to change. Some packages may need their build options adjusting to do this right.
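As a rough illustration of "use the same commands everywhere", a build script can derive the compiler name from the target architecture alone, with no per-architecture flags. The table below is a small hand-written subset of Debian architecture to GNU triplet mappings, and compiler_command is a hypothetical helper for this sketch; on a real Debian system dpkg-architecture is the authoritative source for the triplet.

```python
# Debian architecture name -> GNU triplet (illustrative subset only)
TRIPLETS = {
    "amd64": "x86_64-linux-gnu",
    "i386": "i386-linux-gnu",
    "armel": "arm-linux-gnueabi",
    "armhf": "arm-linux-gnueabihf",
    "mips": "mips-linux-gnu",
    "mipsel": "mipsel-linux-gnu",
}


def compiler_command(target, tool="gcc"):
    """Return the <triplet>-<tool> name for a Debian architecture."""
    try:
        triplet = TRIPLETS[target]
    except KeyError:
        raise ValueError(f"no triplet known for architecture {target!r}")
    return f"{triplet}-{tool}"


if __name__ == "__main__":
    # The same call works for every target; no -m32 style switches needed.
    for arch in ("i386", "armel", "armhf"):
        print(arch, "->", compiler_command(arch))
```

The point of the design is that the caller never needs to know which related architectures could share a multilibbed compiler.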

Cross Toolchain build methodologies

Debian cross-toolchains have existed in various forms since 2000.

First (circa 2000) was the 'toolchain-source' package, which was a copy of the gcc sources that could be used to build cross-compilers. This suffered from being behind the normal gcc packages, with different patches and bugs. The cross-support rules in this were merged into the main gcc package, and gcc output a gcc-source package so that cross-toolchains could be built using that.

For many years the emdebian project used this functionality to build cross-toolchain binaries for Debian. These builds were done by using the libc/linux-headers from the target arch, converted to libc-<target>-cross/linux-libc-dev-<target>-cross with dpkg-cross, then building the package against those. The 'buildcross' tool was developed to mechanise this process.

The problem with this method was that it could not easily be made into a standard package that would build in the archive, because there was no way to express the dependency on the foreign-arch libc/linux-libc-dev, and also because downloading as part of a package build is not permitted. Thus the packages lived outside the archive (at emdebian) for a decade or so. They became quite well used.

Whilst multiarch was being developed it became clear that it could solve this problem of specifying foreign dependencies for cross-toolchains, and explicit-arch dependencies were included in the spec partly for that reason.

Meanwhile linaro wanted cross-toolchains in Ubuntu before all this was ready so Marcin packaged up the whole toolchain bootstrap procedure into a package which build-depped on linux, binutils, libc, and gcc sources, and built linux-libc-dev-<target>-cross, binutils-<triplet>, libc-<target>-cross, gcc-<triplet>, via gcc stage1, libc stage1, gcc stage2, libc stage2, gcc stage3. This was the only way to build a cross-toolchain inside a standard package at the time. Those toolchains went into Ubuntu 10.10 and at the emdebian sprint at ARM in 2009 we agreed that they would be fixed up to build on Debian and uploaded there too, until multiarch-built cross-toolchains were available/practical.

Unfortunately that was never done so Debian still had no in-archive cross-toolchains for wheezy, and the emdebian toolchains were not maintained any more as we expected a move to the new multiarch ones quickly. That took much longer than expected due to other work/distractions.

Multiarch-built cross-toolchains were working in 2013, but still could not be uploaded until the infrastructure learned about foreign-arch dependencies. This <more later>...

  • There are currently different approaches for cross toolchain builds:
  • Cross packages output of former Debian source packages
    • Not a good idea, because builds would take a long time and a cross-build failure would hold up native builds.
    • If cross packages are built off former Debian packages a version skew is introduced.
  • Build $build_host_arch ('amd64') -> any on one source package

    • Not a good idea, as a build failure on one architecture holds up all the others: a build can take 48h, and imagine the situation when the 10th arch fails...
  • Build $build_target_arch ('armel') complete run from one source package
    • This is not the usual Debian approach of modular builds, but it simplifies the build process
    • We can upload 11 source packages at once, and if some of them fail we still have the others built, so there is less work to fix
  • Build $build_target_arch ('armel') complete run from multiple source packages
    • Modular builds are nice, but it is a burden to maintain multiple source packages for each $build_target_arch
  • Jonas proposes to build the current dpkg-cross packages as architecture 'all' packages built on one specific architecture, i.e. linux-libc-dev-armel-cross can be built on armel as an arch:all package.
    • This approach avoids bootstraps (saving compile time), keeps package maintainership in the proper Debian source packages, but makes it difficult to build cross compilers for new architectures.
    After discussion, we decided to follow the approach described under point 3 (Marcin's proposal). We have 11 Debian ports, which basically means:
  • Building cross gcc requires the libc(-dev) and linux-libc-dev packages from the target arch. For now they either have to be fetched by hand, which is impossible on a buildd, or bootstrapped from sources, which is not currently implemented in the Debian packages (the Linaro/Ubuntu packages do implement it); there is work in progress.

Source

Handling cross compiler versions/defaults

For now cross gcc uses update-alternatives to select the default version. Since my (Marcin) changes landed in the gcc-4.[45] versions (and now also in gcc-4.6), the newest version is selected by default. This affects Debian, where 4.4 is the default and 4.5 can be provided. In Ubuntu it is solved by the 'gcc-defaults-armel-cross' package.

Marcin notes

Status

Currently we have two ways of building a cross toolchain in the Debian/Ubuntu world:

  • EmDebian one (echo $arch >debian/target + build)

  • Ubuntu one (bootstrap whole cross compiler)

EmDebian way

Should work in any Debian-derived distribution because of its simplicity. The problem is that it is a manual process which can be automated but is still impossible to do on a buildd, and as such it cannot be added to the Debian repository. EmDebian developers solved that by having a daemon which rebuilds the toolchain packages after their updates in the Debian archive.

Another problem is the manual fetching of the eglibc and linux packages for the target arch. But this part can be solved by using a multiarch-capable APT (apt-get -o APT::architecture=armel download libc6-dev).
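A script automating this fetch step might assemble the APT command line as follows. This is a minimal sketch: fetch_command is a hypothetical helper, and it only builds the argument list from the command quoted above without running it; passing linux-libc-dev as a second package is an assumption based on the requirements listed earlier on this page.

```python
import shlex


def fetch_command(arch, packages):
    """Build the apt-get argument list that downloads target-arch packages."""
    if not packages:
        raise ValueError("at least one package name is required")
    return ["apt-get", "-o", f"APT::architecture={arch}", "download", *packages]


if __name__ == "__main__":
    argv = fetch_command("armel", ["libc6-dev", "linux-libc-dev"])
    # Print the command for inspection; a real script could hand argv to
    # subprocess.run(argv, check=True) on a system with APT available.
    print(shlex.join(argv))
```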

Ubuntu way

The Ubuntu way handles building the cross toolchain differently: by a full bootstrap of it. Because the final gcc (gcc stage3 in bootstrap terminology) requires the target headers to be available in the /usr/$ARCH/ directories, I split the toolchain into two packages:

  • armel-cross-toolchain-base (does binutils, eglibc, libgcc)
  • gcc-4.x-armel-cross (does gcc without libgcc packages)

So far packages for gcc 4.4 and 4.5 have been created. The 4.6 version will follow soon; it will basically be a copy of the 4.5 one.

But how to get Ubuntu source packages working under Debian?

Experimental requirements

First we need binutils 2.21 and gcc-4.5 from experimental - they contain all the changes I made for Ubuntu 10.10 'maverick' and all later ones. Many things got cleaned up, and the code duplication that was present for cross targets was eliminated in favour of reusing the native packaging as much as possible. The effect is that we have -dbg packages for all libraries and soon also -dbgsym ones. Some work may still need to be done to make sure that cross toolchains for all Debian architectures can be built and used.

In-progress packaging

The next requirements are armel-cross-toolchain-base and gcc-4.5-armel-cross from my git repository on the git.linaro.org server. The latter is the same as the Ubuntu one but has its build dependencies lowered (Ubuntu has eglibc 2.12, Debian has 2.11 for example). The situation is worse with armel-cross-toolchain-base...

How it works

To bootstrap the cross toolchain I reuse the sources which are available in the *-source binary packages for the binutils/eglibc/gcc-4.5/linux-2.6 components. For binutils and gcc-4.[456] there is no problem as the changes are present.

Eglibc/Linux problems

The situation is worse with the eglibc and linux-2.6 -source packages as they do not provide the Debian packaging inside. I opened a bug against linux-2.6 but so far it has been refused with an answer like "wait for multiarch it will solve your problem". I assume the answer will be similar for eglibc but I will report a wishlist bug anyway. So far, as a workaround, I have included the whole eglibc packaging (4MB) inside armel-cross-toolchain-base, and the same with linux-2.6. The effect is ugly and unmaintainable, but at least I have something to test.

Build problems

Current Debian builds of the final eglibc fail on building "nscd/others". It is a linking problem, as ld is not able to find ld-linux.so for some symbols. It links fine if I run the failing command line with the library added.

If the build fails at the "build-linux" stage, the cause is an incomplete copy of the linux-2.6 packaging; this was solved by making the copy complete.

Bootstrap order and dependencies

  1. binutils-cross sysrooted
  2. gcc1-cross (requires 1)
  3. linux-headers-cross
  4. eglibc1-cross (requires 2)
  5. gcc2-cross (requires 4, gives libgcc packages)
  6. eglibc-final-cross (requires 5, gives all eglibc packages)
  7. binutils-cross without sysroot (gives binutils-cross packages)

Why two builds of binutils? gcc1 and gcc2 are built with sysroot enabled as we do not have access to the /usr/ARCH directories during the build. So we need a binutils which will also use the sysroot.
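The bootstrap order can also be treated as a dependency graph and checked mechanically. The sketch below encodes only the "requires" relations stated in the list above (steps without a stated requirement are left unconstrained, which is an assumption of this example) and uses graphlib from the Python standard library (3.9 or later). The step labels and helper names are made up for illustration.

```python
from graphlib import TopologicalSorter

# step -> set of steps it requires, as stated in the bootstrap list
BOOTSTRAP_DEPS = {
    "binutils-cross (sysrooted)": set(),
    "gcc1-cross": {"binutils-cross (sysrooted)"},
    "linux-headers-cross": set(),
    "eglibc1-cross": {"gcc1-cross"},
    "gcc2-cross": {"eglibc1-cross"},
    "eglibc-final-cross": {"gcc2-cross"},
    "binutils-cross (no sysroot)": set(),
}


def bootstrap_order(deps=BOOTSTRAP_DEPS):
    """Return one valid build order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(deps).static_order())


def respects_dependencies(order, deps=BOOTSTRAP_DEPS):
    """Check that every step appears after all the steps it requires."""
    position = {step: index for index, step in enumerate(order)}
    return all(
        position[required] < position[step]
        for step, requirements in deps.items()
        for required in requirements
    )


if __name__ == "__main__":
    for number, step in enumerate(bootstrap_order(), start=1):
        print(number, step)
```

Several orders satisfy the stated requirements, so the computed order need not match the numbered list exactly; respects_dependencies accepts any of them, including the documented one.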

Patches used

  • Solves missing file in -source package - I should fix binutils for not needing this as we apply it again (it is normally applied during normal binutils builds due to missing file in upstream tarball of 2.21 release).
    • patches/binutils/add-gnu-oids.texi.diff
  • Disables building of documentation, sources, udebs as we do not need them. Need to clean them or change to shell action.
    • patches/eglibc: 0001-limit-packages-for-backport-version.patch local-no-notneeded-packages.patch debian-local-no-notneeded-packages.patch
  • Disable building of localedata for all languages except first one. The process of building them takes lot of time and we do not use this data in cross toolchain.
    • local-kill-locales.patch debian-local-kill-locales.patch
  • Debian does not ship eglibc manual in eglibc-source but also does not ship manual/ directory which was created by one of debian patches. This patch handles this. Need to report bug.
    • debian-local-remove-manual.diff
  • Force using of gcc-4.4 to build eglibc under Lucid. Was used only for Linaro toolchain PPA and got dropped on Friday (solved by applying it into eglibc). Need to merge ubuntu/natty version into debian branch.
    • lucid-force-gcc-4.4.patch
  • Do not apply patches when PATCHED_SOURCE=yes is given. Added to Ubuntu eglibc some time ago on my request.
    • ubuntu-backport-50-patched-sources.patch
  • Stages support for bootstrapping cross compiler. Taken from Ubuntu packaging.
    • ubuntu-backport-51-stages.patch
  • Moves GFDL_INVARIANT_FREE variable outside of stages support so it got used under Debian. Need to report bug and get it merged.
    • patches/gcc-4.5: handle-doc-for-stages.diff
  • Do not use gold to build compiler - it fails on building because there is no lto plugin built during gcc2 stage. Merge request sent long time ago, need to check it again and discuss with Matthias Klose.
    • no-gold.diff
  • Adds support for building linux-libc-dev package only.
    • patches/linux: linux-stage1.diff

Multiarch future

There is ongoing work on getting multiarch dpkg working for both the Debian and Ubuntu distributions. When it reaches its final state, both ways of building the cross compiler will have to be changed, because there will be no need to manually fetch target arch packages: we could just build-depend on them. But that's the future - the first stage of deploying multiarch will not give us this, because the whole build infrastructure of both distributions needs to be changed first.

But what will we have to do once we have final multiarch support? I think that we will be able to abandon the armel-cross-toolchain-base package in favour of a binutils-cross one, as there will be no need to cross build eglibc or the linux headers (we will just build-depend on the target packages).

On the Ubuntu side I will still maintain the (by then deprecated) packages because of the LTS support which I promised to our users. But this part will not affect Ubuntu 'current' or Debian 'wheezy'.

Results

Common development on cross toolchains will happen in a Cross Toolchain Team at Alioth under collab-maint