Differences between revisions 96 and 97
Revision 96 as of 2006-09-28 14:14:56
Size: 24416
Editor: MartinGuy
Comment: Mantion pubic repo of ARM EABI ipackages
Revision 97 as of 2006-11-29 23:41:22
Size: 24416
Editor: wookey
Comment:

ARM EABI Port

EABI is the new "Embedded" ABI by [http://arm.com ARM Ltd]. EABI is actually a family of ABIs, and one of the "subABIs" is GNU EABI, for Linux. The effective changes for users are:

  • Mixing soft and hardfloat code is possible
  • Structure packing is not as painful as it used to be
  • More compatibility with various tools (in future - currently linux-elf is well supported)
  • A more efficient [http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=3105/4 syscall convention]

  • At present (with gcc-4.1.1) it works with ARMv4t, ARMv5t processors and above, but supporting ARMv4 (e.g., StrongARM) requires modification to GCC. See "Thumb interworking" below.

GCC view

The new ABI is not only a new ABI field; it is also a new GCC target.

Legacy ABI

  • ABI flags passed to binutils: -mabi=apcs-gnu -mfpu=fpa
  • gcc -dumpmachine: arm-unknown-linux
  • objdump -x for compiled binary:

private flags = 2: [APCS-32] [FPA float format] [has entry point]
  • "file" on compiled Debian binary:

ELF 32-bit LSB executable, ARM, version 1 (ARM), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), for GNU/Linux 2.2.0, stripped

ARM EABI

  • ABI flags passed by gcc to binutils: -mabi=aapcs-linux -mfloat-abi=soft -meabi=4
  • gcc -dumpmachine: arm-unknown-linux-gnueabi
  • objdump -x for compiled binary:

private flags = 4000002: [Version4 EABI] [has entry point]
  • "file" on compiled binary (under Debian):

ELF 32-bit LSB executable, ARM, version 1 (SYSV), for GNU/Linux 2.4.17, dynamically linked (uses shared libs), for GNU/Linux 2.4.17, stripped
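The "private flags" values printed by objdump above are the hexadecimal e_flags word of the ELF header. As a minimal sketch, assuming the usual binutils bit assignments (EABI version in the top byte, 0x2 for "has entry point"), the two values shown can be decoded like this. The function names are illustrative only.

```python
# Sketch: decode the ARM ELF e_flags word that objdump prints as "private flags".
# Assumption: the EABI version lives in the top byte (EF_ARM_EABIMASK) and
# bit 1 is EF_ARM_HASENTRY, as in the binutils ARM ELF definitions.
EF_ARM_EABIMASK = 0xFF000000
EF_ARM_HASENTRY = 0x00000002


def eabi_version(e_flags: int) -> int:
    """Return the EABI version; 0 means a legacy (pre-EABI) binary."""
    return (e_flags & EF_ARM_EABIMASK) >> 24


def has_entry_point(e_flags: int) -> bool:
    """True when objdump would print "[has entry point]"."""
    return bool(e_flags & EF_ARM_HASENTRY)


legacy_flags = 0x2        # "private flags = 2" from the legacy ABI binary
eabi_flags = 0x4000002    # "private flags = 4000002" from the EABI binary

print(eabi_version(legacy_flags), eabi_version(eabi_flags))
```

A version of 0 identifies an old-ABI binary, while 4 corresponds to the "[Version4 EABI]" label.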

ARM floating point

The current Debian port creates hardfloat FPA instructions. FPA stands for "Floating Point Accelerator". Since the FPA floating point unit was implemented in only a very few ARM cores, these days FPA instructions are emulated in the kernel via illegal-instruction faults. This is of course very inefficient: about 10 times slower than -msoft-float for a FIR test program. The FPA unit also has the peculiarity of having mixed-endian doubles, which is usually the biggest grief for ARM porters, along with structure packing issues.
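To illustrate what "mixed-endian doubles" means in practice, here is a small sketch. It assumes the commonly described FPA layout on little-endian ARM: each 32-bit word is little-endian, but the most significant word is stored first. The helper names are made up for this example.

```python
import struct


def ieee_le_bytes(x: float) -> bytes:
    """Plain little-endian IEEE-754 double (the layout used by VFP and softfloat EABI)."""
    return struct.pack("<d", x)


def fpa_le_bytes(x: float) -> bytes:
    """Assumed FPA layout: the two little-endian 32-bit words are swapped."""
    raw = struct.pack("<d", x)
    return raw[4:] + raw[:4]


def fpa_to_float(raw: bytes) -> float:
    """Convert 8 bytes in the assumed FPA layout back to a Python float."""
    return struct.unpack("<d", raw[4:] + raw[:4])[0]


print(ieee_le_bytes(1.0).hex())  # natural little-endian layout
print(fpa_le_bytes(1.0).hex())   # same value with the word halves swapped
```

Code that dumps doubles to binary files, or that picks them apart with unions, needs this word swap when moving data between old-ABI and EABI systems.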

ARM has now introduced a new floating point unit, VFP (Vector Floating Point), which uses a different instruction set from FPA and stores floats in natural-endian [http://www.cs.berkeley.edu/~ejr/Projects/ieee754/ IEEE-754 format]. VFP is implemented in some newer ARM9/10/11 cores, such as those in the new TI OMAP2 family. It seems likely that ARM cores without VFP will remain popular, since many ARM applications do not need floating point.

To complicate things further, ARM processors are being integrated with many other FPUs and DSPs, each of which adds its own set of instructions to the ARM set:

  • Cirrus Logic's EP93XX series integrate an ARM920T core with a Maverick Crunch FPU. This also uses IEEE-754, though it uses a different instruction set from VFP. Current ARM-Debian users cannot use their Maverick FPUs at all except by programming in assembler or using an alternative compiler. GCC has flags to generate Maverick FP instructions (-mfpu=maverick), but the .o files cannot be linked with the standard Debian GCC startup files or libraries.
  • Intel's iWMMXt unit is used in their PXA270 processor with an XScale main core. This adds integer SIMD and some other instructions, but there is currently no iWMMXt processor with hardware floating point capabilities. iWMMXt processors are incompatible with FPA due to opcode overlap, while they could have a VFP coprocessor in principle. That said, iWMMXt instructions should make softfloat fairly quick anyway. Again, GCC support exists (-march=iwmmxt) for this but is also currently unusable within standard Debian.
  • Texas Instruments' OMAP, OMAP2, [http://focus.ti.com/dsp/docs/dspplatformscontentnp.tsp?sectionId=2&familyId=749&tabId=1398 DaVinci DM644x series] and numerous other products integrate an ARM9/ARM11 core with their own DSP core for multimedia acceleration and/or telecommunication signal processing. Most DSPs do fixed-point math. DSP code is completely separated from ARM code. Under Linux, [http://dspgateway.sourceforge.net/pub/index.php DSP Gateway] or proprietary solutions are used to load code for execution on the c55x/c6xx and provide a way for ARM and DSP code to communicate.

For a general-purpose distribution like Debian, targeting binary compatibility (as opposed to source-based distributions that currently are more popular among Linux systems), EABI lets us have our cake and eat it. We can make a soft-float distribution using IEEE-754 with FPU-specific versions of packages (linux-kernel-2.6.x-vfp, libc6-iwmmxt, mediaplayer-maverick, etc.) where needed. This also enables individual packages to do runtime FPU detection and call code compiled for different FP scenarios (in liboil for example).

The major FP variants worth supporting as alternative versions of FP-critical packages seem to be:

  • the current arm arch supporting ARMv3 with or without FPA. This may also be necessary for armv4 processors.
  • EABI for generic ARM (>= v4t? >= v5t?) using IEEE softfloat

  • EABI for lowest common denominator VFP (there are now more than one VFP "extended" variant)
  • EABI for Maverick FPU (for which the baseline CPU is armv4t)
  • EABI for iWMMXt using iWMMXt-specific softfloat (baseline CPU is armv5t? xscale?)

Struct packing and alignment

With the new ABI, default structure packing changes, as do some default data sizes and alignment (which also have a knock-on effect on structure packing). In particular, under the old ABI the minimum size and alignment of a structure was 4 bytes. Under the EABI there is no minimum and the alignment is determined by the types of the components it contains. This will break programs that know too much about the way structures are packed and can break code that writes binary files by dumping and reading structures.

Stack alignment

The ARM EABI requires 8-byte stack alignment at public function entry points, compared to the previous 4-byte alignment.

64-bit data type alignment

"One of the key differences between the traditional GNU/Linux ABI and the EABI is that 64-bit types (like long long) are aligned differently. In the traditional ABI, these types had 4-byte alignment; in the EABI they have 8-byte alignment. As a result, if you use the same structure definitions (in a header file) and include it in code used in both the kernel and in application code, you may find that the structure size and alignment differ."
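The structure packing and 64-bit alignment changes described above can be made concrete with a toy layout calculator. This is only a sketch under stated assumptions: the type tables, the `layout` function and the `min_struct_align` parameter are illustrative, and real compilers handle many more cases (bitfields, packed attributes, nested structures).

```python
def layout(members, types, min_struct_align=1):
    """Compute member offsets and total size of a C struct.

    members: list of (name, type_name) in declaration order
    types: dict mapping type_name -> (size, alignment) in bytes
    min_struct_align: minimum alignment (and hence size granule) of any struct
    """
    offset = 0
    struct_align = min_struct_align
    offsets = {}
    for name, type_name in members:
        size, align = types[type_name]
        offset = (offset + align - 1) // align * align  # pad up to member alignment
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    total = (offset + struct_align - 1) // struct_align * struct_align  # tail padding
    return offsets, total


# Assumed type tables: only the alignment of long long differs.
OLD_ABI_TYPES = {"char": (1, 1), "int": (4, 4), "long long": (8, 4)}
EABI_TYPES = {"char": (1, 1), "int": (4, 4), "long long": (8, 8)}

one_char = [("c", "char")]
mixed = [("a", "int"), ("b", "long long")]

print(layout(one_char, OLD_ABI_TYPES, min_struct_align=4))  # old ABI: 4-byte minimum
print(layout(one_char, EABI_TYPES))                         # EABI: no minimum
print(layout(mixed, OLD_ABI_TYPES, min_struct_align=4))
print(layout(mixed, EABI_TYPES))
```

The second pair shows why a header shared between old-ABI and EABI code can disagree on both member offsets and total size.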

Enum sizes

The EABI defines an optional system for controlling the size of C enumerated types. For arm-linux it was decided to keep the existing behaviour (enums are at least the same size as an int) for consistency with other Linux systems.

This is also reflected in the -mabi=aapcs or -mabi=aapcs-linux switches to GCC: aapcs defines enums to be a variable sized type, while with aapcs-linux they are always ints (4 bytes).
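As a rough sketch of the difference between the two settings, the following models a variable-sized enum as the smallest integer container that holds all of its values, and the arm-linux behaviour as "at least an int". The function names and the exact container-selection rule are assumptions for illustration; the compiler's real rules may differ in detail.

```python
def smallest_container(values):
    """Smallest size in bytes (1, 2, 4 or 8) able to hold all enumerator values."""
    lo, hi = min(values), max(values)
    for size in (1, 2, 4, 8):
        bits = 8 * size
        if lo >= 0:
            fits = hi < (1 << bits)  # unsigned container
        else:
            fits = lo >= -(1 << (bits - 1)) and hi < (1 << (bits - 1))  # signed container
        if fits:
            return size
    raise ValueError("values do not fit in 64 bits")


def enum_size(values, abi):
    """Size of an enum under 'aapcs' (variable sized) or 'aapcs-linux' (at least int)."""
    needed = smallest_container(values)
    return needed if abi == "aapcs" else max(4, needed)


print(enum_size([0, 1, 2], "aapcs"), enum_size([0, 1, 2], "aapcs-linux"))
```

Objects built with different enum sizes cannot safely share structures that contain enums, which is one reason a single convention was fixed for arm-linux.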

System call interface

One area affected by alignment/struct packing is the few system calls that pass structures or 64-bit types.

Since this already causes an incompatible change in the system call interface, the opportunity has been taken to slip in a more efficient, totally incompatible way of doing system calls: instead of using the swi __NR_SYSCALL_BASE(==0x900000)+N instruction, where N is the number of the system call, swi 0 is always used with the system call number stashed in register r7. This is more efficient because the kernel no longer has to go and fish N out of the instruction stream(*), which used to have a negative impact on the efficiency of processors with separate text and data caches (i.e. most ARMs).

Fortunately, the two schemes can coexist, and EABI kernels have an option to support the old syscall interface (including old structure layout rules) for running non-EABI binaries. However, some features (e.g., ALSA) do not have the necessary kernel shims, so they will only work correctly from EABI binaries.

Some third party EABI toolchains (e.g., CodeSourcery 2005q3) use the old kernel interface via userspace shims in glibc. This is now obsolete and no longer supported by glibc.

(*) This is only true if the old-ABI compatibility option is disabled.

See [http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=3105/4 this article] for more details.
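The two conventions can be sketched as a small decoder. This is a simplified model, not the kernel's code: it assumes the standard ARM encoding in which the low 24 bits of a SWI instruction hold the immediate, and the names `OABI_SYSCALL_BASE` and `syscall_number` are chosen for this example.

```python
OABI_SYSCALL_BASE = 0x900000   # the __NR_SYSCALL_BASE value quoted in the text
SWI_IMMEDIATE_MASK = 0x00FFFFFF  # low 24 bits of a SWI instruction word


def syscall_number(swi_instruction: int, r7: int) -> int:
    """Return the system call number N for either convention.

    swi_instruction: the 32-bit instruction word that trapped
    r7: the value of register r7 at the time of the trap
    """
    immediate = swi_instruction & SWI_IMMEDIATE_MASK
    if immediate == 0:
        # EABI: "swi 0", with the number passed in r7; no need to decode further.
        return r7
    # Old ABI: the number is encoded in the instruction itself.
    return immediate - OABI_SYSCALL_BASE


# System call 20 issued both ways (0xEF is the "always" condition plus the SWI opcode).
print(syscall_number(0xEF900014, r7=0))   # old ABI: swi 0x900014
print(syscall_number(0xEF000000, r7=20))  # EABI: swi 0 with r7 = 20
```

When the old-ABI compatibility option is disabled, the kernel can skip reading the instruction word entirely and use r7 directly, which is where the efficiency gain comes from.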

Thumb interworking

The EABI includes Thumb interworking, which means that 16-bit Thumb and normal 32-bit ARM instructions can be mixed at function-level granularity. With current gcc this requires at least an armv4t core, because it uses the BX instruction, which does not exist in armv4 or earlier. Currently, mixing Thumb code and shared libraries only works on armv5t cores because support for armv4t interworking was omitted by mistake in the gcc-4.1.0 release.

Thumb interworking requires that every return and indirect function call execute a BX instruction (or LDR/LDM on armv5t) to set the core to the correct state. Paul Brook suggested using

tst lr, #1; moveq pc, lr; bx lr

as an alternative to BX, which should allow running on older, thumbless cores such as StrongARM, with the extra cost of two instructions per indirect call/function return. This seems desirable, both because of the many StrongARM processors present in the world and because it is in line with the run-on-minimum-hardware Debian way. A cpu-optimisation flag can be introduced to go back to the BX/LDR/LDM behaviour on non-StrongARM hosts. At some point in the future, when we decide to drop StrongARM support, this could become the default build.
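The behaviour of the three-instruction sequence can be modelled in a few lines. This is a sketch of the control flow only, assuming the usual convention that bit 0 of the return address selects Thumb state; the function name and return tuple are invented for the example.

```python
def return_sequence(lr: int):
    """Model "tst lr, #1; moveq pc, lr; bx lr".

    Returns (instruction that performs the return, target address, target state).
    """
    if lr & 1 == 0:
        # tst sets the Z flag, so moveq executes: a plain ARM return that
        # exists on every core, including ARMv4. The bx is never reached.
        return ("mov pc, lr", lr, "ARM")
    # Z is clear, moveq is skipped and bx runs: bit 0 selects Thumb state
    # and is stripped from the branch target.
    return ("bx lr", lr & ~1, "Thumb")


print(return_sequence(0x8000))  # caller was ARM code
print(return_sequence(0x8001))  # caller was Thumb code
```

On a thumbless core no caller can ever set bit 0 of the return address, so the BX instruction, which such a core does not implement, is never executed.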

Choice of minimum CPU

Debian policy is to run on the most hardware possible. Once compiler issues are sorted out, this means ARMv4 (StrongARM), selected by passing --with-arch=armv4 and/or --with-cpu=strongarm to gcc's configure script (by default it targets armv5t).

However, Thumb interworking capability is mandatory according to the ARM EABI spec, and the issues in the previous section leave a choice of standard function return sequences. These differ in speed, in which ARM architectures they run on, and in which architectures they allow Thumb interworking on.

As far as I can tell:

0.  mov pc,lr 

Is what GCC currently emits for -march=armv4. It works on ARMv4 and above but is only Thumb interworking-safe from ARMv7.

1.  bx lr 

Is what GCC emits for -march=armv4t. It works on ARMv4t and above and Thumb interworking is possible on ARMv4t and above. Excludes ARMv4 users hence not an option.

GCC needs modifying to implement any of the following choices.

2.  tst lr, #1; moveq pc, lr; bx lr 

Works on ARMv4 and above and Thumb interworking is possible on ARMv4t and above, but everyone gets two extra instructions they don't need at the end of every function.

3.  ldm/ldr 

Works on ARMv4 and above, but Thumb interworking is only possible on ARMv5t and above, excluding ARMv4t users from using Thumb code with Debian. GCC currently emits this for non-leaf functions on ARMv4 and ARMv5 (but not ARMv4t, where it uses BX, the only way to do interworking on v4t). Although a single instruction, this method may be slower than the three-instruction sequence because of the memory accesses it requires.

For applications that require minimum total code size, the two extra instructions per function are outweighed by being able to use Thumb, which generates object code that is 70% of the size of 32-bit ARM code (using gcc -Os in both cases).

Most projects currently (2006) under development for forthcoming products seem to be based on ARMv4t, which are the chips with the best price/performance ratio at the moment.

4. Drop Thumb interworking

A final option would be simply to compile the standard Debian repo --with-arch=armv4 --with-no-thumb-interwork. This would work on all processors without the dangers inherent in modifying GCC and, according to the GCC manual page, saves a slight size and speed overhead caused by being thumb-interworkable.

Projects that need to use Thumb to reduce code size would do better to build their own Debian repository that uses Thumb code everywhere, in the same way that people build repos optimised for their specific CPUs. They would gain far more space this way than by just compiling their largest apps in Thumb, linking them against ARM-code libraries and using them with a standard ARM-code Debian system.

There is significant discussion of the technical merits of these various schemes in the debian-arm mailing list thread [http://lists.debian.org/debian-arm/2006/06/msg00015.html Re: ARM EABI port: minimum CPU choice] of which the above is a partial summary.

Why a new port

In Debian, we want to ensure complete binary compatibility. Since the old ABI is not compatible with the new one, we can't allow packages built with the old ABI to link against new-ABI libs, or vice versa. So the options are:

0. Not an option!

Under no circumstances distribute EABI binaries as .arm.deb depending on current library package names!!!

1. Rename all library packages

This is an ABI transition that affects all architectures, and it has been done before (a.out -> ELF, C++ ABI).

  • + apt-get dist-upgrade for users is possible
  • - Requires insane amounts of work - every single library package needs to be renamed
  • - Requires a very long transition period, in which unstable will be broken for all archs.
    • The C++ ABI transition took about half a year; a full transition could thus take around 2 years
  • - Achieving consensus for such a transition on debian-devel would be very hard.
    • Non-ARM developers will object to doing this amount of work only for a minor arch. If arm gets dropped from the release archs, we can't even file RC bugs for the migration.
  • - Very invasive change, affecting every user and developer of Debian.

2. New arch

  • + Technically, since we drop FPA instruction support, and gcc dumpmachine triplet is different, we can argue we have a new arch
  • + Does not affect non-ARM users
  • + we can target EABI for armv4(t?)+ while we can keep the oldabi port for ARMv3 (RiscPC) and maybe armv4 (StrongARM) users.
  • + Allows using new instructions (thumb) and drops the old FPU instruction set
  • + Can be done quickly, does not affect other arch's release cycle
  • + requires less archive space during migration
  • - Current ARM users don't have an easy upgrade path

For the last point, a statically compiled ArchUpgrade tool could be created. This would also allow i386->amd64 style migrations.

3. ABI: field in control file

This was suggested as part of the Multiarch proposal. It is unknown whether it will actually become part of Debian.

  • + Reflects the package's ABI correctly, would help other transitions as well
  • - no working implementation
  • - no consensus on how to do it (apt developers want something more generic instead)
  • - might be hard to fit into current archive infrastructure
  • - makes dependency resolution hard

From these choices, we believe a new port is the best compromise.

4. conflicting libc packages

In this scenario, we create a libc6-eabi(-dev) package that has the EABI glibc and ld-linux.so.3. This package will conflict with libc6(-dev), and thus you cannot mix and match EABI and non-EABI binaries and libs.

  • + similar to the libc6.1 style packages on some archs
  • + requires modifying only glibc
  • - ugly
  • - most of the ARM port will remain uninstallable for a long time
  • - apt-get dist-upgrade will still not work, since it gives up quickly when lots of packages conflict

Let's not make perfect an enemy of good!!

Roadmap

Armel (EABI) will be released with etch+1 as it should be in good shape by then. That release will thus contain arm and armel. Arm will be dropped in etch+2, assuming that the above gcc changes to support armv4 CPUs in armel prove practical. If we cannot support armv4 in armel then arm will remain around until we drop v4/StrongARM support, i.e. the port falls into general disuse.

EABI status

The commercial ARM RealView C/C++ compiler was the first to support EABI, and usable EABI support came into GCC from version 4.1.0.

CodeSourcery provide [http://www.codesourcery.com/gnu_toolchains/arm/download.html GNU ARM toolchains]. The 2005Q3 release is a modified version of gcc-3.4.4 while 2006Q1 is from gcc-4.1.0. These toolchains produce EABI object code and the 2006Q1 release also uses the EABI Linux kernel interface.

EABI is supported in the ARM Linux kernel from version 2.6.16 and there is an optional compatibility feature to allow the running of old-ABI binaries with an EABI kernel. The inverse mechanism, to run EABI binaries in an old-ABI kernel, is not implemented and is said to be non-trivial to do.

Riku Voipio has built a booting [http://scratchbox.org/~rvoipio/eabi-rootfs.tgz EABI root filesystem] up to X as proof of concept, which seems stable, built with the CodeSourcery gcc 3.4 toolchain.

Koen Kooi has used OpenEmbedded to build a pure EABI root filesystem including a native toolchain, visible under [http://dominion.kabel.utwente.nl/koen/cms/working-native-eabi-toolchain]. The compiler is gcc-4.1.1 with glibc-2.4: the exact versions we need. The system boots and runs fine on armv5t and the C compiler seems to work well. However, the C++ compiler does not work because libstdc++ is not installed, and perl does not execute because libperl.so.5 is not installed either. Both problems can be solved by using ipkg to install them.

The Angstrom distro of OpenEmbedded has a public repository of ARM EABI ipackages compiled for armv5te and visible under [http://angstrom-distribution.org/unstable/feed/].

QEMU 0.8.1 can run ARM EABI systems, though when running with the 2.6.16 kernel it is mind-bogglingly slow on x86 processors of a few hundred megahertz. Using 2.6.17-rc3 or later fixes this anomaly. To run a single ARM EABI executable in qemu-user mode, [http://freaknet.org/martin/QEMU/ some patches] are required, though these are not complete yet.

Minimum versions of components with the first working EABI support are:

  • binutils - from 2.16.92 - already in Debian
  • gcc - gcc 4.1.0 (Thumb interworking on armv4t needs 4.1.1)
  • glibc - fully upstream in 2.4. Will also be in 2.3.7
    • Earlier glibcs (2.3.6?) support EABI userspace but had old-style syscalls to work with older kernels (2.6.8-2.6.13ish).
  • kernel - eabi support is present from 2.6.16.
  • dpkg, apt - patches will be submitted when port name consensus is achieved
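The minimum versions listed above lend themselves to a simple check. The sketch below is illustrative only: the `MINIMUM` table copies the numbers from the list, while the function names and the toolchain dictionaries are hypothetical. It ignores the special cases mentioned above, such as the planned glibc 2.3.7 backport and earlier glibcs with old-style syscalls.

```python
# Minimum versions with working EABI support, taken from the list above.
MINIMUM = {
    "binutils": (2, 16, 92),
    "gcc": (4, 1, 0),
    "glibc": (2, 4),
    "kernel": (2, 6, 16),
}


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as "2.6.16" into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def components_too_old(toolchain: dict) -> list:
    """Return the sorted names of components older than the EABI minimum."""
    return sorted(
        name
        for name, minimum in MINIMUM.items()
        if parse_version(toolchain[name]) < minimum
    )


# Hypothetical toolchains used only to exercise the check.
new_toolchain = {"binutils": "2.17", "gcc": "4.1.1", "glibc": "2.4", "kernel": "2.6.16"}
old_toolchain = {"binutils": "2.16.1", "gcc": "3.4.4", "glibc": "2.3.6", "kernel": "2.6.15"}

print(components_too_old(new_toolchain))
print(components_too_old(old_toolchain))
```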

Naming

At the Extremadura emdebian meeting, 12-16 April 2006, the name "armel" was chosen and the current unofficial "armeb" repository will be re-implemented using EABI before it enters Debian proper.

Strategy

The ultimate aim is a new standalone architecture, composed of three concrete components:

The chronological steps to bootstrap the new arch seem to be:

1) Make Debian packages of a cross-compiler targeting ARM EABI. This means gcc-4.1, glibc-2.4+glibc-ports-2.4, binutils-2.17 and linux-2.6.16. This can be compiled using [http://kegel.com/crosstool crosstool-0.42] and the patches and control files at [http://freaknet.org/martin/crosstool], and packaged with the scripts at the same location.

2) Make a package for the existing Debian experimental ARM arch of the Linux kernel compiled to use EABI internally, with run-old-ABI-binaries enabled, and test it with existing old-ABI Debian userland.

3) Cross-build essential and required EABI userland packages using dpkg-cross. A parallel effort is the [http://www.emdebian.org/slind.html slind project], which is busy improving dpkg-cross support within the Emdebian framework.

4) Make the Debian installer debootstrap work for the new arch.

5) Populate the new-arch repository(-ies) with the rest of the Debian packages.

Is there a "HOWTO Create a New Debian Arch" document?

[http://sourceware.org/ml/crossgcc/2006-05/msg00057.html On the crossgcc list], Michael K. Edwards says of the procedure to bootstrap a new Debian arch by doing all building in a QEMU chroot on a fast host:

  • You build your crosstools, you build your minimal chroot and a Canadian cross, then you qemu-chroot in and build a fully native toolchain. Instantiate a dpkg database with the host's dpkg and use equivs to fake up dpkg entries for your toolchain. Build your Required packages and real packages for build-essential, then construct a fresh qemu-chroot, this time with debootstrap and your pile o' packages. If you're paranoid, rebuild all your packages in this chroot and debootstrap afresh -- it's your first chance to test that your glibc built the Debian Way really works. Then start layering on applications, without worrying about whether they cross-compile easily.

Bootstrapping armel With Scratchbox/Maemo

Nokia released [http://www.maemo.org/downloads/releases.html#maemo20beta maemo 2.0 beta], with a debian/armel root filesystem.

This is based on the CodeSourcery 2005q3 toolchain, with non-EABI syscalls in glibc. Most of the FOSS apps, libs and tools are available as armel.deb from:

 deb     http://repository.maemo.org mistral-beta free non-free
 deb-src http://repository.maemo.org mistral-beta free non-free

Maemo 2.0 is based on sbox 0.9.8.7, which has an "armel"-patched dpkg.

There is some initial work on proper EABI setup that would be closer to the EABI system that would be used by Debian at:

  • http://scratchbox.org/~rvoipio/armel/

  • arm-linux-gnueabi-codesourcery-4.1-2006q1 - EABI gcc-4.1 + glibc 2.3.x; as usual, the target libc6.deb etc. are under the packages/ directory
  • scratchbox-devkit-cputransp_1.0.6~20060717armel - needed because it includes a QEMU updated to work with the new syscall convention and NPTL (incomplete)
  • scratchbox-devkit-debian-1.0.6~20060612armel - dpkg with armel support + latest debhelper

With these installed on top of [http://scratchbox.org/download/scratchbox-apophis/ Scratchbox Apophis], it is possible to quickly cross-compile a range of Debian packages. The idea is to use scratchbox to help compile enough packages to build native EABI gcc-4.1 packages, plus the rest of build-essential and those packages needed by debootstrap, in order to start a native buildd.

Note that these packages are intended for debian-testing (etch) and will not install over debian-stable (sarge). Also, after installing the packages, to get the libc6/libstdc++ armel debs installed on the target, run inside your sbox target:

(dpkg -i /scratchbox/compilers/arm-linux-gnueabi-codesourcery-4.1-2006q1-3/packages/*.deb)

Temporary location of armel .debs built with CroCoDiLe:

deb http://piipiip.net/~nchip/armel/repo/ ./

It's a mix of sarge and etch packages, selected for ease of bootstrapping. For example, modular X in etch is easier to bootstrap, while libselinux build-depends on python2.4 in etch. The base .deb files such as libc6 are under /scratchbox/compilers/toolchain-dir/packages until proper self-hosted packages are built.

The next stage would be to build an sbox toolchain using proper Debian sources instead of the CodeSourcery toolchain or a crosstool hack.

From the cpu devkit, select qemu-arm-0.8.0-sb; later versions of qemu have race condition problems.