The Hacking Way: Part 1. First Steps.
Table of Contents
1. Introduction
2. The Basics
2.1 The C standard library (libc)
2.2 Myth #1: AmigaOS4 behaves like UNIX
2.3 Myth #2: AmigaOS4 binaries are fat
2.4 Genuine ELF executables
3. PowerPC assembly
3.1 Registers
3.1.1 General-purpose registers
3.1.2 Some special registers
3.2 Instructions
3.3 Function prologue and epilogue
4. Writing programs in assembler
4.1 Assembler programming using libc
4.2 Assembler programming without libc
5. Hacking it for real
5.1 Linker scripts (ldscripts)
5.2 Getting rid of relocation
5.3 The ELF loader
5.4 What else can we do?
6. Final Words
7. Links
1. Introduction
Some years ago, I wanted to make the smallest possible executables on UNIX-ish operating systems (SunOS, Tru64, OS9, OpenVMS and others). As a result of my research I wrote a couple of small tutorials for various hacking-related magazines (like Phrack or x25zine). Doing the same on AmigaOS naturally became a topic of interest for me - even more so when I started seeing, in Amiga forums, questions like "Why are AmigaOS4 binaries bigger than they should be?" I therefore believe that producing small OS4 executables makes an interesting topic for an article. Further in the text I'll explain how ldscripts can help the linker make non-aligned binaries, and cover various other aspects of the topic. I hope that, at least for programmers, the article will be an interesting and thought-provoking read.
Before you go on, please note that it is assumed here that you have basic programming skills and understanding of C and assembler, that you are familiar with BSD syntax, know how UNIX and AmigaOS3/4 work, and that you have the PPC V.4-ABI and ELF specification at hand. But if you don't, there's no need to stop reading as I'll try to cover the basics where necessary.
2. The Basics
2.1 The C standard library (libc)
Libc is platform-independent in the sense that it provides the same functionality regardless of operating system - be it UNIX, Linux, AmigaOS, OpenVMS, AROS, whatever. The actual implementation may vary from OS to OS. For example, in UNIX the most popular implementation of the C standard library is glibc (the GNU C Library). But there are others: uClibc (for embedded Linux systems, without MMU), dietlibc (as the name suggests, it is meant to compile/link programs to the smallest possible size) or Newlib. Originally developed for a wide range of embedded systems, Newlib is the preferred C standard library in AmigaOS4 and is now part of the kernel.
On AmigaOS4, three implementations of libc are used: clib2, newlib and vclib. The GCC compiler supports clib2 and newlib, the VBCC compiler supports newlib and vclib.
clib2
Newlib
Newlib covers more than the ANSI C99 standard: it's an expanded library that also includes common POSIX functions (clib2 implements them as well). But certain POSIX functions - such as glob(), globfree(), or fork() - are missing; and while some of them are easy to implement, others are not - fork() being an example of the latter.
Newlib is also available as a shared object.
2.2 Myth #1: AmigaOS4 behaves like UNIX
Yet getting close to UNIX or Linux in terms of software or programming tools does not mean that AmigaOS4 behaves in the same way as regards, for example, library initialization, passing arguments or system calls. On AmigaOS4 there are no "system calls" as there are on UNIXes, where you simply load the arguments into registers and then use an instruction (like "int 0x80" on x86 Linux, "trap #0" on m68k Linux, or "sc" on some PPC/POWER CPU based OSes) that causes a software interrupt and enters the kernel in supervisor mode. The concept of AmigaOS is completely different. There is no kernel as such; Amiga's Kickstart is actually a collection of libraries (of which "kernel.kmod" is just one module - a new incarnation of the old exec.library). Also, an AmigaOS program, when calling a library function, won't enter supervisor mode but rather stays in user mode while the function is executed.

When you program in assembler under AmigaOS4, you cannot do much until you initialize and open all the needed libraries (unlike, for example, on UNIX where the kernel does all the necessary initialisation for you).
2.3 Myth #2: AmigaOS4 binaries are fat
Luckily, the size difference is only noticeable in small programs, like Hello World, where the resulting executable grows to 65KB. That of course looks unbelievable, as if something were wrong. But once you start programming for real and produce bigger programs, the code fills up the ELF segments as required, there's little need for padding, and so there's little size difference in the end. The worst-case scenario is ~64KB of extra padding, which only happens (as we said) in very small programs, or when you're out of luck and your code only just exceeds a boundary between two segments.
It is likely that a newer SDK will adapt binutils for AmigaOS4 and the padding will no longer be needed. Currently, to avoid alignment you can use the "-N" switch, which tells the linker to use an ldscript that builds non-aligned binaries. Check the SDK:gcc/ppc-AmigaOS/lib/ldscripts directory; all the files ending with an "n" (like “AmigaOS.xn” or “ELF32ppc.xbn”) are linker scripts that ensure non-aligned builds. Such a script will be used when the GCC compiler receives the “-N” switch. See the following:
7/0.RAM Disk:> type hello.c
#include <stdio.h>
main() { printf("aaaa"); }
6/1.Work:> gcc hello.c -o hello
6/1.Work:> strip hello
6/1.Work:> filesize format=%s hello
65k
6/1.Work:> hello
aaaa
6/1.Work:> gcc -N hello.c -o hello
6/1.Work:> strip hello
6/1.Work:> filesize format=%s hello
5480
6/1.Work:> hello
aaaa
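To get a feeling for where the extra kilobytes come from, here is a rough model of the padding arithmetic. It is only a sketch: it assumes a 64KB alignment (the figure quoted above), and the names SEGMENT_ALIGN and padding_for() are made up for this illustration. The real layout is decided by the ldscript, not by a single formula.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment alignment: 64KB, the figure quoted in the text. */
#define SEGMENT_ALIGN 0x10000u

/* Bytes of padding needed to move `offset` up to the next multiple of
   `align`. `align` must be a power of two. Returns 0 when `offset` is
   already aligned. */
uint32_t padding_for(uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (align - (offset & (align - 1))) & (align - 1);
}
```

Under this model, padding_for(5480, SEGMENT_ALIGN) yields 60056 bytes of filler to reach the next boundary, while an offset that sits just one byte past a boundary costs almost the full 64KB.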
2.4 Genuine ELF executables
A more detailed description of the ELF internals will be given later; all you need to know for now is that the executable ELF file contains headers (the main header, and headers for the various sections) and sections/segments. The ELF file layout looks like this:

The advantage of objects is that they are smaller and that relocations are always included. But there is a drawback as well: the linker will not tell you automatically whether all symbols have been resolved because an object is allowed to have unresolved references. (On the other hand, vlink could always detect unresolved references when linking PowerUP and MorphOS objects because it sees them as a new format.) This is why ELF shared objects cannot be used easily (though it’s still kind of possible using some hacks), and it explains why the OS4 team decided to go for real executables.
By specification, ELF files are meant to be executed from a fixed absolute address, and so AmigaOS4 programs need to be relocated (because all processes share the same address space). To do that, the linker is passed the -q switch ("keep relocations", the short form of --emit-relocs). Relocations are handled by the MMU, which will create a new virtual address space for each new process.
If you look at the linker scripts provided to build OS4 executables (in the SDK:gcc/ppc-AmigaOS/lib/ldscripts directory), you’ll find the following piece of code:
ENTRY(_start)
....
SECTIONS
{
PROVIDE (__executable_start = 0x01000000);
. = 0x01000000 + SIZEOF_HEADERS;
[...]
To perform a test, let’s see what happens if we build our binary without the "-q" switch (that is, without making the binary relocatable):
7/0.RAM Disk:> type test.c
#include <stdio.h>
main() { printf("aaaa"); }
shell:> gcc test.c -S -o test.s
shell:> as test.s -o test.o
shell:> ld test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a /SDK/newlib/lib/crtend.o
6/0.RAM Disk:> objdump -D --no-show-raw-insn --stop-address=0x10000d0 test | grep -A8 "_start"
010000b0 <_start>:
 10000b0: stwu r1,-64(r1)    #
 10000b4: mflr r0            # prologue (reserve 64 byte stack frame)
 10000b8: stw r0,68(r1)      #
 10000bc: lis r9,257         # 257 is loaded into the higher half-word (msw) of r9 (257 << 16)
 10000c0: stmw r25,36(r1)    # offset into the stack frame
 10000c4: mr r25,r3          # save command line pointer
 10000c8: mr r27,r13         # r13 can be used as small data pointer in the V.4-ABI, and it is also saved here
 10000cc: stw r5,20(r9)      # store r5 (DOSBase pointer) at address (257 << 16) + 20 = 0x01010014
Of course it is possible to make a working binary without relocation, if the program doesn’t need to relocate and you are lucky enough to have the 0x1000000 address free of important contents. And of course you can use a different address for the entry point, by hex-editing the binary or at build-time using self-made ldscripts. Making a non-relocatable binary will be discussed further in the text.
3. PowerPC assembly
3.1 Registers
3.1.1 General-purpose registers
r0 - volatile register that may be modified during function linkage
r1 - stack-frame pointer, always valid
r2 - system reserved register
r3 - command-line pointer
r4 - command-line length
r5 - DOSBase pointer
(* The contents of registers r3-r5 are only valid when the program starts.)
r6 - r10 - volatile registers used for parameter passing
r11 - r12 - volatile registers that may be modified during function linkage
r13 - small data area pointer register
r14 - r30 - registers used for local variables; they are non-volatile; functions have to save and restore them
r31 - preferred by GCC in position-independent code (e.g. in shared objects) as a base pointer into the GOT section; however, the pointer can also be stored in another register
3.1.2 Some special registers
lr - link register; stores the "ret address" (i.e. the address to which a called function normally returns)
cr - condition register
3.2 Instructions
b
bctr
lis
lis %r3,0x1234 ori %r3,%r3,0x5678
Later in the article you’ll notice that this kind of construction is used all the time.
mtlr
stwu
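The lis/ori pair shown above builds a 32-bit constant from two 16-bit halves. A small C model makes the arithmetic explicit. This is just a sketch of what the instructions compute; the function names (ppc_lis, ppc_ori, ppc_addi, ha16, lo16) are invented for this example. It also models the lis/addi variant, where the low half is sign-extended and the high half therefore has to be adjusted; that is what the GNU assembler's @ha and @l operators do.

```c
#include <stdint.h>

/* lis rD,hi : rD = hi << 16, low half-word cleared */
uint32_t ppc_lis(uint16_t hi) { return (uint32_t)hi << 16; }

/* ori rD,rS,lo : rD = rS | lo (the immediate is zero-extended) */
uint32_t ppc_ori(uint32_t rs, uint16_t lo) { return rs | lo; }

/* addi rD,rS,simm : rD = rS + simm (the immediate is sign-extended) */
uint32_t ppc_addi(uint32_t rs, int16_t simm)
{
    return rs + (uint32_t)(int32_t)simm;
}

/* @ha: high half-word, adjusted so that adding the sign-extended
   low half-word (@l) gives back the full address */
uint16_t ha16(uint32_t addr) { return (uint16_t)((addr + 0x8000u) >> 16); }

/* @l: low half-word, interpreted as a signed 16-bit value */
int16_t lo16(uint32_t addr) { return (int16_t)(uint16_t)(addr & 0xFFFFu); }
```

For example, ppc_ori(ppc_lis(0x1234), 0x5678) gives 0x12345678, and ppc_addi(ppc_lis(ha16(a)), lo16(a)) gives back a for any 32-bit address a.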
3.3 Function prologue and epilogue
    stwu %r1,-16(%r1)
    mflr %r0             # prologue, reserve 16 byte stack frame
    stw %r0,20(%r1)
    ...
    lwz %r0,20(%r1)
    addi %r1,%r1,16      # epilogue, restore back
    mtlr %r0
    blr
The prologue code generally opens a stack frame with a stwu instruction that decrements register r1 (the stack grows downwards) and stores the old value at the first address of the new frame. The epilogue code just loads r1 with the old stack value.
C programmers needn’t worry at all about the prologue and epilogue because the compiler will add them to their functions automatically. When you write your programs in pure assembler you can skip the prologue and the epilogue if you don’t need to keep the return address.
Plus, a new stack frame doesn’t need to be allocated for functions that do not call any subroutine. By the way, the V.4-ABI (application binary interface) defines a specific layout of the stack frame and stipulates that it should be aligned to 16 bytes.
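The 16-byte rule explains the frame sizes you see in the listings. Here is a simplified sketch of the calculation. It assumes a frame consisting only of the 8-byte header (back chain word plus LR save word), saved general-purpose registers and local data; floating-point saves, the parameter area and so on are ignored. frame_size() is a hypothetical helper, not something from the SDK.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_HEADER 8u    /* back chain word + LR save word */
#define FRAME_ALIGN  16u   /* alignment required by the V.4-ABI */

/* Size of a stack frame that saves `saved_gprs` non-volatile registers
   (4 bytes each) and holds `local_bytes` of local data, rounded up to
   the next multiple of 16. */
uint32_t frame_size(uint32_t saved_gprs, uint32_t local_bytes)
{
    assert(saved_gprs <= 18);   /* only r14..r31 are non-volatile */
    uint32_t raw = FRAME_HEADER + 4u * saved_gprs + local_bytes;
    return (raw + FRAME_ALIGN - 1u) & ~(FRAME_ALIGN - 1u);
}
```

With no saved registers the result is the 16-byte frame from the prologue above; saving four registers (for instance with a stmw starting at r28) needs 24 bytes, which rounds up to 32.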
4. Writing programs in assembler
--using libc (all initializations are done by crtbegin.o/crtend.o and libc is attached to the binary)
--the old way (all initializations - opening libraries, interfaces etc. - have to be done manually in the code)
The advantage of using libc is that you can run your code "out of the box" and that all you need to know is the correct offsets to the function pointers. On the minus side, the full library is attached to the binary, making it bigger. Sure, a size difference of ten or even a hundred kilobytes doesn’t play a big role these days – but here in this article we’re going down the old hacking way (that’s why we’re fiddling with assembler at all) so let’s call it a drawback.
The advantage of not using libc is that you gain full control of your program, you use only the functions you need, and the resulting binary will be as small as possible (a fully working binary can be as small as 100 bytes). The drawback is that you have to initialize everything manually.
We’ll first discuss assembler programming with the use of libc.
4.1 Assembler programming using libc
#include <stdio.h>

main()
{
printf("aaaa");
exit(0);
}
6/0.RAM Disk:> gcc -gstabs -O2 2.c -o 2
2.c: In function 'main':
2.c:6: warning: incompatible implicit declaration of built-in function 'exit'
6/0.RAM Disk:> GDB -q 2
(GDB) break main
Breakpoint 1 at 0x7fcc7208: file 2.c, line 4.
(GDB) r
Starting program: /RAM Disk/2
BS 656d6ed8
Current action: 2

Breakpoint 1, main () at 2.c:4
4 {
(GDB) disas
Dump of assembler code for function main:
0x7fcc7208 <main+0>:  stwu r1,-16(r1)
0x7fcc720c <main+4>:  mflr r0
0x7fcc7210 <main+8>:  lis r3,25875        ; that addr
0x7fcc7214 <main+12>: addi r3,r3,-16328   ; on our string
0x7fcc7218 <main+16>: stw r0,20(r1)
0x7fcc721c <main+20>: crclr 4*cr1+eq
0x7fcc7220 <main+24>: bl 0x7fcc7234 <printf>
0x7fcc7224 <main+28>: li r3,0
0x7fcc7228 <main+32>: bl 0x7fcc722c <exit>
End of assembler dump.
(GDB)
(GDB) disas printf
Dump of assembler code for function printf:
0x7fcc723c <printf+0>: li r12,1200
0x7fcc7240 <printf+4>: b 0x7fcc7244 <__NewLibCall>
End of assembler dump.
(GDB)
(GDB) disas exit
Dump of assembler code for function exit:
0x7fcc7234 <exit+0>: li r12,1620
0x7fcc7238 <exit+4>: b 0x7fcc7244 <__NewLibCall>
End of assembler dump.
(GDB)
(GDB) disas __NewLibCall
Dump of assembler code for function __NewLibCall:
0x7fcc7244 <__NewLibCall+0>:  lis r11,26006
0x7fcc7248 <__NewLibCall+4>:  lwz r0,-25500(r11)
0x7fcc724c <__NewLibCall+8>:  lwzx r11,r12,r0
0x7fcc7250 <__NewLibCall+12>: mtctr r11
0x7fcc7254 <__NewLibCall+16>: bctr
End of assembler dump.
(GDB)
6/0.RAM Disk:> objdump -d 1 | grep -A5 "<__NewLibCall>:"
01000280 <__NewLibCall>:
 1000280: 3d 60 01 01   lis r11,257
 1000284: 80 0b 00 24   lwz r0,36(r11)
 1000288: 7d 6c 00 2e   lwzx r11,r12,r0
 100028c: 7d 69 03 a6   mtctr r11
 1000290: 4e 80 04 20   bctr
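What __NewLibCall does in the dumps above is, in essence, a table lookup: it fetches the interface pointer, loads the word found at "interface + r12" and jumps to it. The following C model illustrates the idea on a host machine. It is not the real newlib interface: the table, the two demo functions and lookup_method() are hypothetical stand-ins, and the byte offset is converted to a slot index because host pointers are not necessarily 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*method_fn)(int);

/* On 32-bit PowerPC each jump-table slot is 4 bytes wide. */
#define SLOT_BYTES 4u

/* Hypothetical stand-ins for the functions behind the table. */
static int demo_twice(int x)  { return 2 * x; }
static int demo_negate(int x) { return -x; }

/* A mock interface with entries at byte offsets 1200 and 1620. */
#define DEMO_SLOTS (1620u / SLOT_BYTES + 1u)
static const method_fn demo_table[DEMO_SLOTS] = {
    [1200u / SLOT_BYTES] = demo_twice,
    [1620u / SLOT_BYTES] = demo_negate,
};

/* Model of "lwzx r11,r12,r0": fetch the function pointer stored
   `byte_offset` bytes into the table. */
method_fn lookup_method(const method_fn *table, size_t nslots,
                        size_t byte_offset)
{
    assert(byte_offset % SLOT_BYTES == 0);
    size_t slot = byte_offset / SLOT_BYTES;
    assert(slot < nslots);
    return table[slot];
}
```

Calling lookup_method(demo_table, DEMO_SLOTS, 1200)(21) runs demo_twice and returns 42. Note that the whole dispatch is an ordinary indirect jump, with no trap instruction involved.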
We will now remove the prologue (because we don’t need it in this case) and reorganize the code a bit:
.globl main
main:
    lis %r3,.msg@ha        #
    la %r3,.msg@l(%r3)     # printf("aaaa");
    bl printf              #
    li %r3,0               # exit(0);
    bl exit                #
.msg:
    .string "aaaa"
6/0.RAM Disk:> as test.s -o test.o
6/0.RAM Disk:> ld -N -q test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a /SDK/newlib/lib/crtend.o
6/0.RAM Disk:> strip test
6/0.RAM Disk:> filesize format=%s test
5360
6/0.RAM Disk:> test
aaaa
6/0.RAM Disk:>
.globl main
main:
    #printf("aaaa")
    lis %r3,.msg@ha        # arg1 part1
    la %r3,.msg@l(%r3)     # arg1 part2
    li %r12, 1200          # 1200 - pointer offset to function
    b __NewLibCall
    #exit(0)
    li %r3, 0              # arg1
    li %r12, 1620          # 1620 - pointer offset to function
    b __NewLibCall
.msg:
    .string "aaaa"
6/0.RAM Disk:> as test.s -o test.o
6/0.RAM Disk:> ld -N -q test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a /SDK/newlib/lib/crtend.o
6/0.RAM Disk:> strip test
6/0.RAM Disk:> filesize format=%s test
5336
6/0.RAM Disk:> test
aaaa
6/0.RAM Disk:>
.globl main
main:
    lis %r3,.msg@ha        #
    la %r3,.msg@l(%r3)     # printf("aaaa");
    li %r12, 1200
    lis %r11,26006
    lwz %r0,-25500(%r11)
    lwzx %r11,%r12,%r0     # __NewLibCall
    mtctr %r11
    bctr
    li %r3, 0
    li %r12, 1620          # exit
    lis %r11,26006
    lwz %r0,-25500(%r11)
    lwzx %r11,%r12,%r0     # __NewLibCall
    mtctr %r11
    bctr
.msg:
    .string "aaaa"
7/0.RAM Disk:> objdump -dr 1 | grep -A7 "<__NewLibCall>:"
01000298 <__NewLibCall>:
 1000298: 3d 60 01 01   lis r11,257
        100029a: R_PPC_ADDR16_HA INewlib
 100029c: 80 0b 00 24   lwz r0,36(r11)
        100029e: R_PPC_ADDR16_LO INewlib
 10002a0: 7d 6c 00 2e   lwzx r11,r12,r0
 10002a4: 7d 69 03 a6   mtctr r11
 10002a8: 4e 80 04 20   bctr
.macro OUR_NEWLibCALL
    lis %r11,INewlib@ha
    lwz %r0,INewlib@l(%r11)
    lwzx %r11,%r12,%r0
    mtctr %r11
    bctr
.endm

.globl main
main:
    lis %r3,.msg@ha
    la %r3,.msg@l(%r3)     # printf("aaaa");
    li %r12, 1200
    OUR_NEWLibCALL
    li %r3, 0
    li %r12, 1620          # exit(0);
    OUR_NEWLibCALL
.msg:
    .string "aaaa"
By the way, when you debug the binary, you'll notice that GCC has put a strange-looking instruction right before the call to a libc function: crxor 6,6,6 (which GDB shows as crclr 4*cr1+eq). This is done in compliance with the ABI specification, which says that before a variadic function is called, an extra instruction (crxor 6,6,6 or creqv 6,6,6) must be executed that sets bit 6 of the Condition Register (the EQ bit of field CR1) to either 0 or 1. The value depends on whether any arguments are passed in floating-point registers. If none are, crxor 6,6,6 is added in order to clear the bit to 0. If you call a variadic function with floating-point arguments, the call will be preceded by a creqv 6,6,6, which sets the bit to 1.
You may ask where on Earth we got the numerical values (offsets) for the libc functions, i.e. “1200” representing printf() and “1620” representing exit(). For newlib.library, there is no documentation, header files or an interface description in the official AmigaOS4 SDK so you have to find it all out yourself. There are a couple of ways to do it:
--1. Write the program in C and obtain the numbers by disassembling the code (using GDB or objdump). Not much fun but at least you can inspect what arguments are used and in which registers they are stored.
--2. If you only need the list of function offsets you can disassemble the LibC.a file using objdump:
shell:> objdump -dr SDK:newlib/lib/LibC.a
---- SNIP ----
Disassembly of section .text:

00000000 <realloc>:
   0: 39 80 01 64   li r12,356
   4: 48 00 00 00   b 4 <realloc+0x4>
        4: R_PPC_REL24 __NewLibCall

stub_realpath.o: file format ELF32-AmigaOS

Disassembly of section .text:

00000000 <realpath>:
   0: 39 80 0c 00   li r12,3072
   4: 48 00 00 00   b 4 <realpath+0x4>
        4: R_PPC_REL24 __NewLibCall

stub_recv.o: file format ELF32-AmigaOS
---- SNIP ----
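If you'd rather not read the offsets by eye, the first instruction of every stub can be decoded mechanically. "li r12,N" is an alias of "addi r12,0,N", that is primary opcode 14 with rD = 12 and rA = 0, so the offset is simply the low 16 bits of the instruction word. The helper below is a sketch with made-up names (be32, stub_offset); feeding it the words from a real LibC.a is left to the reader.

```c
#include <stdint.h>

/* Assemble a big-endian 32-bit word from the four bytes objdump prints. */
uint32_t be32(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Return the offset loaded by "li r12,offset", or -1 if `insn` is not
   such an instruction (or loads a negative value). */
int32_t stub_offset(uint32_t insn)
{
    uint32_t opcode = insn >> 26;            /* primary opcode, 14 = addi */
    uint32_t rd     = (insn >> 21) & 0x1Fu;  /* destination register */
    uint32_t ra     = (insn >> 16) & 0x1Fu;  /* rA = 0 turns addi into li */
    int32_t  imm    = (int16_t)(uint16_t)(insn & 0xFFFFu);

    if (opcode != 14u || rd != 12u || ra != 0u || imm < 0)
        return -1;
    return imm;
}
```

The bytes 39 80 01 64 from the realloc stub decode to 356 and 39 80 0c 00 from realpath to 3072, while the branch word 48 00 00 00 is rejected.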
4.2 Assembler programming without libc
* obtain SysBase (pointer to exec.library)
* obtain the exec.library interface
* IExec->Obtain()
* open dos.library and its interface (if you want to use dos.library functions)
* IExec->GetInterface()
... your code ...
* IExec->DropInterface()
* IExec->CloseLibrary()
* IExec->Release()
* exit(0)
There is a Hello World example written by Frank Wille for his assembler 'vasm'; I’ll adapt it for the GNU assembler ('as') in order to make the article related to one compiler. (Both the original and the adapted version can be found in the archive that comes with the article):
# ExecBase
.set ExecBase,4
.set MainInterface,632

# Exec Interface
.set Obtain,60
.set Release,64
.set OpenLibrary,424
.set CloseLibrary,428
.set GetInterface,448
.set DropInterface,456

# DOS Interface
.set Write,88
.set Output,96

.macro CALLOS reg,val   # Interface register, function offset
    lwz %r0,\val(\reg)
    mr %r3,\reg
    mtctr %r0
    bctrl
.endm

.text
.global _start
_start:
    mflr %r0
    stwu %r1,-32(%r1)
    stmw %r28,8(%r1)
    mr %r31,%r0

    # get SysBase
    li %r11,ExecBase
    lwz %r3,0(%r11)

    # get Exec-Interface
    lwz %r30,MainInterface(%r3)   # r30 IExec

    # IExec->Obtain()
    CALLOS %r30,Obtain

    # open dos.library and get DOS-Interface
    # IExec->OpenLibrary("dos.library",50)
    lis %r4,dos_name@ha
    addi %r4,%r4,dos_name@l
    li %r5,50
    CALLOS %r30,OpenLibrary
    mr. %r28,%r3   # r28 DOSBase
    beq release_exec

    # IExec->GetInterface(DOSBase,"main",1,0)
    mr %r4,%r28
    lis %r5,main_name@ha
    addi %r5,%r5,main_name@l
    li %r6,1
    li %r7,0
    CALLOS %r30,GetInterface
    mr. %r29,%r3   # r29 IDOS
    beq close_dos

    # IDOS->Output()
    CALLOS %r29,Output

    # IDOS->Write(stdout,"Hello World!\n",13)
    mr %r4,%r3
    lis %r5,hello_world@ha
    addi %r5,%r5,hello_world@l
    li %r6,hello_world_end-hello_world
    CALLOS %r29,Write

    # IExec->DropInterface(IDOS)
    mr %r4,%r29
    CALLOS %r30,DropInterface

close_dos:
    # IExec->CloseLibrary(DOSBase)
    mr %r4,%r28
    CALLOS %r30,CloseLibrary

release_exec:
    # IExec->Release()
    CALLOS %r30,Release

    # exit(0)
    li %r3,0
    mtlr %r31
    lmw %r28,8(%r1)
    addi %r1,%r1,32
    blr

.rodata
dos_name:
    .string "dos.library"
main_name:
    .string "main"
hello_world:
    .string "Hello World!"
hello_world_end:
6/0.Work:> as hello.s -o hello.o
6/0.Work:> ld -q hello.o -o hello
6/0.Work:> strip hello
6/0.Work:> filesize format=%s hello
4624
To obtain the numerical values that identify system functions, you need to study the interface description XML files that are provided in the AmigaOS4 SDK. For example, for exec.library functions you need to read the file “SDK:include/interfaces/exec.xml”. All interfaces contain a jump table. The offset for the first interface "method" is 60, the next one is 64 and so on. So you just open the appropriate interface description XML file, start counting from 60, and add +4 for any method that follows.
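The counting rule is easy to put into code. The sketch below assumes that you number the methods in the XML file from zero, in the order in which they appear; method_offset() and method_index() are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_METHOD_OFFSET 60u   /* offset of the first interface method */
#define SLOT_SIZE           4u    /* each jump-table entry is 4 bytes */

/* Jump-table offset of the method at zero-based position `index`. */
uint32_t method_offset(uint32_t index)
{
    return FIRST_METHOD_OFFSET + SLOT_SIZE * index;
}

/* Inverse: zero-based position of the method found at `offset`. */
uint32_t method_index(uint32_t offset)
{
    assert(offset >= FIRST_METHOD_OFFSET);
    assert((offset - FIRST_METHOD_OFFSET) % SLOT_SIZE == 0);
    return (offset - FIRST_METHOD_OFFSET) / SLOT_SIZE;
}
```

By this rule the values used in the listing above fit the pattern: 60 (Obtain) is position 0, 64 (Release) is position 1, and 424 (OpenLibrary) works out to position 91.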
5. Hacking it for real
5.1 Linker scripts (ldscripts)
What does all of that mean for us and how is it related to this article? Well, when you read the ldscripts documentation (see Links below), you can build your own ldscript that will only create the necessary sections. That is: we can produce a minimum working executable and thus get rid of parts that even 'strip' wouldn’t be able to remove.
So following the first-test example from the ldscript documentation, we’ll write our own script now:
SECTIONS
{
. = 0x00000000;
.text : { *(.text) }
}
shell:> as hello.s -o hello.o
shell:> ld -Tldscript -q -o hello hello.o
shell:> stat -c=%s hello
=66713
SIZEOF_HEADERS: Returns the size in bytes of the output file’s headers. You can use this number as the start address of the first section, to facilitate paging.
This looks like the thing we need, so:
SECTIONS
{
. = SIZEOF_HEADERS;
.text : { *(.text) }
}
shell:> as hello.s -o hello.o
shell:> ld -Tldscript -q -o hello hello.o
shell:> stat -c=%s hello
=1261
shell:> strip hello
shell:> stat -c=%s hello
=832
shell:> hello
Hello World!
shell:>
832 bytes in size, and it works!
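For a 32-bit ELF file the value of SIZEOF_HEADERS can be estimated by hand. The sketch below assumes that the linker counts just the ELF header plus the program headers, and uses the structure sizes from the ELF specification (52 and 32 bytes); sizeof_headers() is a hypothetical helper, not a binutils function.

```c
#include <stdint.h>

#define ELF32_EHDR_SIZE 52u   /* sizeof(Elf32_Ehdr) = 0x34 */
#define ELF32_PHDR_SIZE 32u   /* sizeof(Elf32_Phdr) = 0x20 */

/* Estimated SIZEOF_HEADERS for a 32-bit ELF file that carries
   `phnum` program headers. */
uint32_t sizeof_headers(uint32_t phnum)
{
    return ELF32_EHDR_SIZE + phnum * ELF32_PHDR_SIZE;
}
```

With a single program header this gives 0x54 (84 bytes), so the first section can start right after the headers instead of at an aligned address far into the file.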
5.2 Getting rid of relocation
Now, let's see what kind of sections our 832-byte binary has:
7/0.Work:> readelf -S hello
There are 7 section headers, starting at offset 0x198:

Section Headers:
  [Nr] Name        Type      Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]             NULL      00000000 000000 000000 00      0   0  0
  [ 1] .text       PROGBITS  00000054 000054 0000f8 00  AX  0   0  1
  [ 2] .rela.text  RELA      00000000 0002f8 000048 0c      5   1  4
  [ 3] .rodata     PROGBITS  0000014c 00014c 00001e 00   A  0   0  1
  [ 4] .shstrtab   STRTAB    00000000 00016a 00002e 00      0   0  1
  [ 5] .symtab     SYMTAB    00000000 0002b0 000040 10      6   3  4
  [ 6] .strtab     STRTAB    00000000 0002f0 000008 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
7/0.Work:>
As you can see there are some sections that should be relocated:
--1. .rela.text - relocations for .text.
--2. .rodata - data (our strings like "Hello World!", "dos.library", etc.)
So what about .rela.text and .rodata? Well, they can be removed by modifying our code a bit, to avoid any relocations (thanks to Frank again). We place the data in the .text section, together with the code, so the distance between the strings and the code is constant (kind of like base-relative addressing). With "bl initbase" we jump to the following instruction while the CPU places the address of this instruction into LR. This is the base address which we can use:
# non-relocated Hello World
# by Frank Wille, January 2012
# adapted for "as" by kas1e

# ExecBase
.set MainInterface,632

# Exec Interface
.set Obtain,60
.set Release,64
.set OpenLibrary,424
.set CloseLibrary,428
.set GetInterface,448
.set DropInterface,456

# DOS Interface
.set Write,88
.set Output,96

.macro CALLOS reg,val   # Interface register, function offset
    lwz %r0,\val(\reg)
    mr %r3,\reg
    mtctr %r0
    bctrl
.endm

.text
.global _start
_start:
    mflr %r0
    stw %r0,4(%r1)
    stwu %r1,-32(%r1)
    stmw %r28,8(%r1)

    # initialize data pointer
    bl initbase
initbase:
    mflr %r31   # r31 initbase

    # get Exec-Interface
    lwz %r30,MainInterface(%r5)   # r30 IExec

    # IExec->Obtain()
    CALLOS %r30,Obtain

    # open dos.library and get DOS-Interface
    # IExec->OpenLibrary("dos.library",50)
    addi %r4,%r31,dos_name-initbase
    li %r5,50
    CALLOS %r30,OpenLibrary
    mr. %r28,%r3   # r28 DOSBase
    beq release_exec

    # IExec->GetInterface(DOSBase,"main",1,0)
    mr %r4,%r28
    addi %r5,%r31,main_name-initbase
    li %r6,1
    li %r7,0
    CALLOS %r30,GetInterface
    mr. %r29,%r3   # r29 IDOS
    beq close_dos

    # IDOS->Output()
    CALLOS %r29,Output

    # IDOS->Write(stdout,"Hello World!\n",13)
    mr %r4,%r3
    addi %r5,%r31,hello_world-initbase
    li %r6,hello_world_end-hello_world
    CALLOS %r29,Write

    # IExec->DropInterface(IDOS)
    mr %r4,%r29
    CALLOS %r30,DropInterface

close_dos:
    # IExec->CloseLibrary(DOSBase)
    mr %r4,%r28
    CALLOS %r30,CloseLibrary

release_exec:
    # IExec->Release()
    CALLOS %r30,Release

    # exit(0)
    li %r3,0
    lmw %r28,8(%r1)
    addi %r1,%r1,32
    lwz %r0,4(%r1)
    mtlr %r0
    blr

dos_name:
    .string "dos.library"
main_name:
    .string "main"
hello_world:
    .string "Hello World!"
hello_world_end:
6/0.Work:> as hello.s -o hello.o
6/0.Work:> ld -Tldscript hello.o -o hello
6/0.Work:> strip hello
6/0.Work:> stat -c=%s hello
=644
6/0.Work:> hello
Hello World!
6/0.Work:>
6/0.Work:> readelf -S hello
There are 5 section headers, starting at offset 0x184:

Section Headers:
  [Nr] Name        Type      Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]             NULL      00000000 000000 000000 00      0   0  0
  [ 1] .text       PROGBITS  10000054 000054 00010e 00  AX  0   0  1
  [ 2] .shstrtab   STRTAB    00000000 000162 000021 00      0   0  1
  [ 3] .symtab     SYMTAB    00000000 00024c 000030 10      4   2  4
  [ 4] .strtab     STRTAB    00000000 00027c 000008 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
6/0.Work:>
5.3 The ELF loader
Let's briefly cover the parts an ELF executable contains:
* ELF Header
* Program (segments) header table
* Segments
* Sections header table
* optional sections (certain sections can sometimes come before the sections header table, like for example .shstrtab)
This is what our 644-byte Hello World example looks like, with the various parts defined by the ELF specification highlighted in different colours:

db 0x7f, "ELF"          ; magic
db 1,2,1                ; 32 bits, big endian, version 1
db 0,0,0,0,0,0,0,0,0    ; os info
db 0,2                  ; e_type (for executable=2)
db 0,0x14               ; 14h = powerpc
db 0,0,0,1              ; version (must always be set to 1)
dd 0x10000054           ; entry point (makes no sense on OS4)
dd 0x00000034           ; program header table file offset in bytes
dd 0x00000184           ; section header table file offset in bytes
db 0,0,0,0              ; e_flags - processor specific flags
db 0,0x34               ; e_ehsize - size of ELF header in bytes
db 0,0x20               ; e_phentsize - size in bytes of one entry of the program header table (all the entries are the same size)
db 0,2                  ; e_phnum - number of entries in the program header table
db 0,0x28               ; e_shentsize - section header size in bytes
db 0,5                  ; e_shnum - number of entries in the section header table
db 0,2                  ; e_shstrndx - section header table index of the entry associated with the section name string table
* magic (first 7 bytes): db 0x7f,"ELF", 0x01,0x02,0x01 (100% required)
* all the subsequent fields are not parsed at all and can contain any data, until the loader reaches the section header tables' file offset in bytes field (required)
* then again there can be any data, until e_phnum (the number of entries in the program header table, which is required as well)
* and then the next 8 bytes of info (4 fields) about section headers/sections are required
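To make the list above more tangible, here is a small C sketch that reads only those fields from an ELF32 header: the 7 magic bytes, the section header table offset, and the e_phnum to e_shstrndx block. The field offsets come from the ELF specification, the sample bytes are the header shown in the listing above, and all the function names (has_os4_magic, elf_shoff and so on) are made up for this illustration. It says nothing about how the real OS4 loader is implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Big-endian readers (AmigaOS4 ELF files are big-endian). */
static uint16_t be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* The 52-byte header from the listing above. */
static const uint8_t sample_header[52] = {
    0x7f, 'E', 'L', 'F', 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* e_ident */
    0, 2,                   /* e_type */
    0, 0x14,                /* e_machine */
    0, 0, 0, 1,             /* e_version */
    0x10, 0, 0, 0x54,       /* e_entry */
    0, 0, 0, 0x34,          /* e_phoff */
    0, 0, 0x01, 0x84,       /* e_shoff */
    0, 0, 0, 0,             /* e_flags */
    0, 0x34,                /* e_ehsize */
    0, 0x20,                /* e_phentsize */
    0, 2,                   /* e_phnum */
    0, 0x28,                /* e_shentsize */
    0, 5,                   /* e_shnum */
    0, 2                    /* e_shstrndx */
};

/* Check the 7 bytes said to be mandatory: 0x7f "ELF", 32-bit,
   big-endian, version 1. */
bool has_os4_magic(const uint8_t *hdr, size_t len)
{
    static const uint8_t magic[7] = { 0x7f, 'E', 'L', 'F', 1, 2, 1 };
    return len >= 52 && memcmp(hdr, magic, sizeof magic) == 0;
}

/* Field offsets inside Elf32_Ehdr, per the ELF specification. */
uint32_t elf_shoff(const uint8_t *hdr)     { return be32(hdr + 32); }
uint16_t elf_phnum(const uint8_t *hdr)     { return be16(hdr + 44); }
uint16_t elf_shentsize(const uint8_t *hdr) { return be16(hdr + 46); }
uint16_t elf_shnum(const uint8_t *hdr)     { return be16(hdr + 48); }
uint16_t elf_shstrndx(const uint8_t *hdr)  { return be16(hdr + 50); }
```

For the sample header, elf_shoff() returns 0x184 and elf_shnum() returns 5, which matches the "5 section headers, starting at offset 0x184" line printed by readelf.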
The ELF header is not the only place where you can insert (at least with the current version of the loader) your own data. After the ELF header there come program headers (i.e. headers that describe segments). In our particular case we have one program section header for the .text segment. And here comes the surprise: the AmigaOS4 ELF loader does not parse the program headers at all! Instead, only sections and section headers are parsed. Apparently, the OS4 loader does by itself something that on UNIXes is normally put in the ELF executable, where the loader just gets the data from it. But under AmigaOS4 this is not the case. Although the ELF binary produced by GCC is built correctly and according to specification, half of the sections and many fields are not used under OS4.
So the program section headers can be used entirely for your own needs. We can remove section names completely (and give them, for example, an "empty" name by writing 0 string-offset in the sh_name field of each section header entry). But .shstrtab must still be kept, with a size of 1 byte. A NULL section header can be reused too (you can see that a NULL section header comes after the .shstrtab section, so we have plenty of space). Check the file "bonus/unused_fields/hello" to see which areas can be reused (these are indicated by 0xAA bytes).
Now it's clear that we can manipulate sections (i.e. delete empty ones and those ignored by the ELF loader) and recalculate all the addresses in the necessary fields. To do that you will really need to dig into the ELF specification. For example, you can put the _start label in any suitable place (such as the ELF header, or right at the beginning of an ignored field) and then just put the adjusted address in the .strtab section offset field. This way you can save 8 bytes, so the size of our binary is now 636 bytes. Then there is the .symtab section at the end of the file, which is 48 bytes long. We can put it right in the place of .shstrtab (34 bytes in our case) and in the following part of the NULL section header (so as to squeeze the remaining 14 bytes in). Just like this:

In the bonus directory that comes with this article, you can try out an example binary the altered structure of which is depicted by the image above. In the binary, .strtab (the _start symbol) is moved to the program section header, and .symtab is moved on top of .shstrtab + the NULL section header (see directory "bonus/shift_sections").
6. Final Words
7. Links
ELF specification: http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf
PPC SYSV4 ABI.pdf: http://refspecs.linuxbase.org/elf/elfspec_ppc.pdf
Green Book (MPCFPE32B.pdf): http://www.freescale.com/files/product/doc/MPCFPE32B.pdf
GDB.txt: http://www.gnu.org/s/GDB/documentation/
Linker Scripts: http://sourceware.org/binutils/docs/ld/Scripts.html#Scripts
or SDK:Documentation/CompilerTools/ld.pdf , chapter 3.0 "Linker Scripts"
Submitted by kas1e on Thu, 2012-03-22 01:45

Is there a new location for the PPC SYSV4 ABI PDF specification? Trying the link here reports some kind of error about it having moved.
Just a heads up...