The Hacking Way: Part 1. First Steps.

by kas1e.
proofread and grammar corrections by Trixie.

Table of Contents
 
      1. Introduction
      2. The Basics
           2.1 The C standard library (libc)
           2.2 Myth #1: AmigaOS4 behaves like UNIX
           2.3 Myth #2: AmigaOS4 binaries are fat
           2.4 Genuine ELF executables
      3. PowerPC assembly
           3.1 Registers
               3.1.1 General-purpose registers
               3.1.2 Some special registers
           3.2 Instructions
           3.3 Function prologue and epilogue
      4. Writing programs in assembler
           4.1 Assembler programming using libc
           4.2 Assembler programming without libc
      5. Hacking it for real
           5.1 Linker scripts (ldscripts)
           5.2 Getting rid of relocation
           5.3 The ELF loader
           5.4 What else can we do?
      6. Final Words
      7. Links

1. Introduction

2. The Basics

2.1 The C standard library (libc)
clib2
Newlib
vclib

2.2 Myth #1: AmigaOS4 behaves like UNIX

IMAGE(http://kas1e.mikendezign.com/aos4/docs/hack1/image_1_thematic.png)

2.3 Myth #2: AmigaOS4 binaries are fat

7/0.RAM Disk:> type hello.c

#include <stdio.h>
main()
{
  printf("aaaa");
}

6/1.Work:> gcc hello.c -o hello
6/1.Work:> strip hello
6/1.Work:> filesize format=%s hello 
65k
6/1.Work:> hello
aaaa

6/1.Work:> gcc -N hello.c -o hello
6/1.Work:> strip hello
6/1.Work:> filesize format=%s hello 
5480
6/1.work:> hello
aaaa

2.4 Genuine ELF executables

IMAGE(http://kas1e.mikendezign.com/aos4/docs/hack1/image_2_elf_layout.png)

ENTRY(_start)
....
SECTIONS
{
 PROVIDE (__executable_start = 0x01000000); . = 0x01000000 + SIZEOF_HEADERS;
[...]

7/0.RAM Disk:> type test.c

#include <stdio.h>
main()
{
  printf("aaaa");
}

shell:> gcc test.c -S -o test.s
shell:> as test.s -o test.o
shell:> ld test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a  /SDK/newlib/lib/crtend.o

6/0.RAM Disk:> objdump -D --no-show-raw-insn --stop-address=0x10000d0 test | grep -A8 "_start"

010000b0 <_start>:
 
10000b0:       stwu    r1,-64(r1)    #
10000b4:       mflr    r0            # prologue (reserve a 64-byte stack frame)
10000b8:       stw     r0,68(r1)     #
 
10000bc:       lis     r9,257        # load 257 into the upper half-word of r9 (r9 = 257 << 16 = 0x01010000)
10000c0:       stmw    r25,36(r1)    # save r25-r31 into the stack frame, starting at offset 36
10000c4:       mr      r25,r3        # save the command-line pointer
10000c8:       mr      r27,r13       # r13 can be used as the small data pointer in the V.4 ABI; it is saved here as well
10000cc:       stw     r5,20(r9)     # store r5 (see section 3.1.1) at address (257 << 16) + 20 = 0x01010014
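
To double-check the arithmetic in the last comment, here is a minimal C sketch. It is not taken from the startup code; the helper names lis_value and effective_address are made up for illustration. It models what lis leaves in a register and how a load/store with a 16-bit displacement forms its target address.

```c
#include <stdint.h>

/* Value left in a register by `lis rD,imm`: the 16-bit immediate goes
   into the upper half-word, the lower half-word is cleared. */
uint32_t lis_value(uint16_t imm)
{
    return (uint32_t)imm << 16;
}

/* Target address of a load/store written as `d(rA)`: the base register
   plus the sign-extended 16-bit displacement. */
uint32_t effective_address(uint32_t base, int16_t disp)
{
    return base + (uint32_t)(int32_t)disp;
}
```

With these helpers, effective_address(lis_value(257), 20) gives 0x01010014, the address used by the stw at 10000cc.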

3. PowerPC assembly

3.1 Registers

3.1.1 General-purpose registers

r0 - volatile register that may be modified during function linkage

r1 - stack-frame pointer, always valid

r2 - system reserved register

r3 - command-line pointer

r4 - command-line length

r5 - ExecBase (SysBase) pointer

* The contents of registers r3 - r5 are only valid when the program starts.

r6 - r10 - volatile registers used for parameter passing

r11 - r12 - volatile registers that may be modified during function linkage

r13 - small data area pointer register

r14 - r30 - registers used for local variables; they are non-volatile, so functions have to save and restore them

r31 - preferred by GCC in position-independent code (e.g. in shared objects) as a base pointer into the GOT section; however, the pointer can also be stored in another register

3.1.2 Some special registers

lr - link register; stores the return address (i.e. the address to which a called function normally returns)

cr - condition register

3.2 Instructions

bctr

lis

lis %r3,0x1234
ori %r3,%r3,0x5678

Later in the article you’ll notice that this kind of construction is used all the time.
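
In C terms, the lis/ori pair simply glues two 16-bit halves into one 32-bit value. A rough sketch follows; load_imm32 is a hypothetical helper name, not something from the SDK.

```c
#include <stdint.h>

/* Mimics:  lis %r3,high   followed by   ori %r3,%r3,low */
uint32_t load_imm32(uint16_t high, uint16_t low)
{
    uint32_t r3 = (uint32_t)high << 16; /* lis: fill the upper half-word */
    r3 |= low;                          /* ori: OR in the lower half-word */
    return r3;
}
```

Calling load_imm32(0x1234, 0x5678) yields 0x12345678. ori suits the lower half because its immediate is not sign-extended, unlike the immediate of addi.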

mtlr

stwu

3.3 Function prologue and epilogue

stwu %r1,-16(%r1)    
mflr %r0             # prologue, reserve 16 byte stack frame
stw %r0,20(%r1)      
 
...
 
lwz %r0,20(%r1)      
addi %r1,%r1,16      #  epilogue, restore back
mtlr %r0              
blr
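
To make the offsets in this prologue and epilogue easier to follow, the C sketch below replays them on a tiny simulated stack. Everything in it (the Cpu struct, the 64-word memory, the function names) is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS]; /* simulated stack memory, 4 bytes per slot */
    uint32_t r1;               /* stack pointer, as a byte offset into mem */
    uint32_t lr;               /* link register */
} Cpu;

/* Returns the simulated memory word at byte address addr. */
static uint32_t *word_at(Cpu *cpu, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return &cpu->mem[addr / 4];
}

void prologue(Cpu *cpu)
{
    uint32_t old_sp = cpu->r1;
    cpu->r1 = old_sp - 16;             /* stwu r1,-16(r1): open a 16-byte frame */
    *word_at(cpu, cpu->r1) = old_sp;   /* ... and store the old r1 at 0(r1)     */
    uint32_t r0 = cpu->lr;             /* mflr r0                               */
    *word_at(cpu, cpu->r1 + 20) = r0;  /* stw r0,20(r1): old r1 + 4             */
}

void epilogue(Cpu *cpu)
{
    uint32_t r0 = *word_at(cpu, cpu->r1 + 20); /* lwz r0,20(r1)  */
    cpu->r1 += 16;                             /* addi r1,r1,16  */
    cpu->lr = r0;                              /* mtlr r0        */
}
```

Note that 20(r1) inside the new frame is the same location as 4(r1) of the caller's frame, which is where the saved link register ends up.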

4. Writing programs in assembler

4.1 Assembler programming using libc

#include <stdio.h>
 
main()
{
   printf("aaaa");
   exit(0);
}

6/0.RAM Disk:> gcc -gstabs -O2 2.c -o 2
2.c: In function 'main':
2.c:6: warning: incompatible implicit declaration of built-in function 'exit'
 
6/0.RAM Disk:> gdb -q 2

(gdb) break main
Breakpoint 1 at 0x7fcc7208: file 2.c, line 4.
(gdb) r
Starting program: /RAM Disk/2 
BS 656d6ed8
Current action: 2
 
Breakpoint 1, main () at 2.c:4
4       {
(gdb) disas
Dump of assembler code for function main:
0x7fcc7208 <main+0>:    stwu    r1,-16(r1)
0x7fcc720c <main+4>:    mflr    r0
0x7fcc7210 <main+8>:    lis     r3,25875         ; that addr
0x7fcc7214 <main+12>:   addi    r3,r3,-16328     ; on our string
0x7fcc7218 <main+16>:   stw     r0,20(r1)
0x7fcc721c <main+20>:   crclr   4*cr1+eq
0x7fcc7220 <main+24>:   bl      0x7fcc7234 <printf>
0x7fcc7224 <main+28>:   li      r3,0
0x7fcc7228 <main+32>:   bl      0x7fcc722c <exit>
End of assembler dump.
(gdb)

(gdb) disas printf
Dump of assembler code for function printf:
0x7fcc723c <printf+0>:  li      r12,1200
0x7fcc7240 <printf+4>:  b       0x7fcc7244 <__NewLibCall>
End of assembler dump.
(gdb)
 
(gdb) disas exit
Dump of assembler code for function exit:
0x7fcc7234 <exit+0>:    li      r12,1620
0x7fcc7238 <exit+4>:    b       0x7fcc7244 <__NewLibCall>
End of assembler dump.
(gdb)

(gdb) disas __NewLibCall
Dump of assembler code for function __NewLibCall:
0x7fcc7244 <__NewLibCall+0>:    lis     r11,26006
0x7fcc7248 <__NewLibCall+4>:    lwz     r0,-25500(r11)
0x7fcc724c <__NewLibCall+8>:    lwzx    r11,r12,r0
0x7fcc7250 <__NewLibCall+12>:   mtctr   r11
0x7fcc7254 <__NewLibCall+16>:   bctr
End of assembler dump.
(gdb)

6/0.RAM Disk:> objdump -d 1 | grep -A5 "<__NewLibCall>:"

01000280 <__NewLibCall>:
1000280:       3d 60 01 01     lis     r11,257
1000284:       80 0b 00 24     lwz     r0,36(r11)
1000288:       7d 6c 00 2e     lwzx    r11,r12,r0
100028c:       7d 69 03 a6     mtctr   r11
1000290:       4e 80 04 20     bctr
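
Read as C, the __NewLibCall stub is a jump through a table of function pointers: r12 carries a byte offset, the stub fetches the interface pointer, loads the function pointer stored at (interface + offset) and branches to it. The sketch below imitates that idea on the host machine. The table, the two stand-in functions and the name new_lib_call are all hypothetical, and the offsets are multiples of the host pointer size rather than the real values 1200 and 1620.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*LibFunc)(int);

/* Hypothetical stand-ins for two library functions. */
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* Hypothetical interface: a block of memory holding function pointers. */
static LibFunc interface_table[2] = { twice, negate };

/* Plays the role of the global interface pointer that the stub loads. */
static void *interface_ptr = interface_table;

/* r12_offset is a byte offset into the interface, as in `li r12,<offset>`. */
int new_lib_call(size_t r12_offset, int arg)
{
    LibFunc target;
    assert(r12_offset + sizeof target <= sizeof interface_table);
    /* lwzx r11,r12,r0: load the pointer stored at interface + offset */
    memcpy(&target, (char *)interface_ptr + r12_offset, sizeof target);
    /* mtctr r11 / bctr: jump to it */
    return target(arg);
}
```

Here new_lib_call(0, 21) reaches twice() and returns 42, while new_lib_call(sizeof(LibFunc), 5) reaches negate() and returns -5.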

   .globl main
main:
        lis %r3,.msg@ha          #
        la %r3,.msg@l(%r3)       # printf("aaaa");
        bl printf                #
 
        li %r3,0                 # exit(0);
        bl exit                  #  
 
.msg:
        .string "aaaa"
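
The @ha and @l operators split a 32-bit address into the two immediates seen in the GDB listing earlier (lis r3,25875 and addi r3,r3,-16328). Because addi and la sign-extend their immediate, the upper half has to be adjusted whenever the lower half is "negative". A small C sketch of that split, with made-up helper names:

```c
#include <stdint.h>

/* @ha: upper 16 bits, rounded up when bit 15 of the address is set,
   to compensate for the sign extension of the lower half. */
uint16_t addr_ha(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* @l: lower 16 bits, read as a signed value. */
int16_t addr_l(uint32_t addr)
{
    uint32_t low = addr & 0xffffu;
    return (int16_t)(low >= 0x8000u ? (int32_t)low - 0x10000 : (int32_t)low);
}

/* What lis + addi (or la) compute from the two halves. */
uint32_t addr_rebuild(uint16_t ha, int16_t l)
{
    return ((uint32_t)ha << 16) + (uint32_t)(int32_t)l;
}
```

Working backwards from the listing, addr_rebuild(25875, -16328) gives 0x6512C038, and splitting that value again returns exactly 25875 and -16328.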

6/0.RAM Disk:> as test.s -o test.o
6/0.RAM Disk:> ld -N -q test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a /SDK/newlib/lib/crtend.o
6/0.RAM Disk:> strip test 
6/0.RAM Disk:> filesize format=%s test
5360
6/0.RAM Disk:> test
aaaa
6/0.RAM Disk:>

   .globl main
main:
        #printf("aaaa")
 
        lis %r3,.msg@ha          # arg1 part1
        la %r3,.msg@l(%r3)       # arg1 part2
        li %r12, 1200            # 1200 - pointer offset to function
        b __NewLibCall
 
        #exit(0)
 
        li %r3, 0               # arg1
        li %r12, 1620           # 1620 - pointer offset to function
        b __NewLibCall          
 
.msg:
        .string "aaaa"

6/0.RAM Disk:> as test.s -o test.o
6/0.RAM Disk:> ld -N -q test.o -o test /SDK/newlib/lib/crtbegin.o /SDK/newlib/lib/LibC.a /SDK/newlib/lib/crtend.o
6/0.RAM Disk:> strip test 
6/0.RAM Disk:> filesize format=%s test
5336
6/0.RAM Disk:> test
aaaa
6/0.RAM Disk:>

   .globl main
main:
        lis %r3,.msg@ha          #
        la %r3,.msg@l(%r3)       # printf("aaaa");
        li %r12, 1200
 
        lis     %r11,26006
        lwz     %r0,-25500(%r11)
        lwzx    %r11,%r12,%r0      # __NewLibCall
        mtctr   %r11
        bctr
 
        li %r3, 0
        li %r12, 1620           # exit
 
        lis     %r11,26006
        lwz     %r0,-25500(%r11)
        lwzx    %r11,%r12,%r0      # __NewLibCall
        mtctr   %r11
        bctr
 
.msg:
        .string "aaaa"

7/0.RAM Disk:> objdump -dr 1 | grep -A7 "<__NewLibCall>:"

01000298 <__NewLibCall>:
 1000298:       3d 60 01 01     lis     r11,257
                        100029a: R_PPC_ADDR16_HA        INewlib
 100029c:       80 0b 00 24     lwz     r0,36(r11)
                        100029e: R_PPC_ADDR16_LO        INewlib
 10002a0:       7d 6c 00 2e     lwzx    r11,r12,r0
 10002a4:       7d 69 03 a6     mtctr   r11
 10002a8:       4e 80 04 20     bctr

.macro OUR_NEWLibCALL    
        lis     %r11,INewlib@ha
        lwz     %r0,INewlib@l(%r11)   
        lwzx    %r11,%r12,%r0     
        mtctr   %r11
        bctr
.endm
 
  .globl main
main:
        lis %r3,.msg@ha          
        la %r3,.msg@l(%r3)       # printf("aaaa");
        li %r12, 1200
 
        OUR_NEWLibCALL
 
        li %r3, 0
        li %r12, 1620           # exit(0);
 
        OUR_NEWLibCALL 
 
.msg:
        .string "aaaa"

1. Write the program in C and obtain the numbers by disassembling the code (using GDB or objdump). Not much fun, but at least you can inspect which arguments are used and in which registers they are stored.

2. If you only need the list of function offsets, you can disassemble the LibC.a file using objdump:

shell:> objdump -dr SDK:newlib/lib/LibC.a

---- SNIP ----
 
Disassembly of section .text:
 
00000000 <realloc>:
    0:	39 80 01 64 	li      r12,356
    4:	48 00 00 00 	b       4 <realloc+0x4>
			4: R_PPC_REL24	__NewLibCall
 
stub_realpath.o:     file format ELF32-AmigaOS
 
Disassembly of section .text:
 
00000000 <realpath>:
    0:	39 80 0c 00 	li      r12,3072
    4:	48 00 00 00 	b       4 <realpath+0x4>
	 		4: R_PPC_REL24	__NewLibCall
 
stub_recv.o:     file format ELF32-AmigaOS
 
---- SNIP ----
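
The offsets can also be read straight from the raw instruction bytes in this listing, since li r12,N is just addi r12,0,N with N in the lower 16 bits. A minimal decoding sketch in C follows; the function name decode_li_r12 is an assumption, not part of any tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 and stores the immediate in *offset if insn encodes
   `li r12,<simm>` (that is, addi r12,0,<simm>); returns 0 otherwise. */
int decode_li_r12(uint32_t insn, int32_t *offset)
{
    assert(offset != NULL);

    uint32_t opcode = insn >> 26;           /* primary opcode, 14 = addi */
    uint32_t rd     = (insn >> 21) & 0x1fu; /* destination register      */
    uint32_t ra     = (insn >> 16) & 0x1fu; /* source register, 0 for li */

    if (opcode != 14 || rd != 12 || ra != 0)
        return 0;

    uint32_t imm = insn & 0xffffu;          /* 16-bit signed immediate   */
    *offset = (imm & 0x8000u) ? (int32_t)imm - 0x10000 : (int32_t)imm;
    return 1;
}
```

Feeding it the words from the listing, 0x39800164 decodes to 356 (realloc) and 0x39800c00 to 3072 (realpath).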

4.2 Assembler programming without libc

# ExecBase
.set	ExecBase,4
.set	MainInterface,632
 
# Exec Interface
.set	Obtain,60
.set	Release,64
.set	OpenLibrary,424
.set	CloseLibrary,428
.set	GetInterface,448
.set	DropInterface,456
 
# DOS Interface
.set	Write,88
.set	Output,96
 
 
.macro CALLOS reg,val   # Interface register, function offset
	lwz %r0,\val(\reg)
	mr %r3,\reg
	mtctr %r0
	bctrl
.endm
 
	.text
 
	.global	_start
_start:
 
	mflr	%r0
	stwu	%r1,-32(%r1)
	stmw	%r28,8(%r1)
	mr	%r31,%r0
 
	# get SysBase
	li	%r11,ExecBase
	lwz	%r3,0(%r11)
 
	# get Exec-Interface
	lwz	%r30,MainInterface(%r3)	# r30 IExec
 
	# IExec->Obtain()
	CALLOS	%r30,Obtain
 
	# open dos.library and get DOS-Interface
	# IExec->OpenLibrary("dos.library",50)
	lis	%r4,dos_name@ha
	addi	%r4,%r4,dos_name@l
	li	%r5,50
	CALLOS	%r30,OpenLibrary
	mr.	%r28,%r3			# r28 DOSBase
	beq	release_exec
 
	# IExec->GetInterface(DOSBase,"main",1,0)
	mr	%r4,%r28
	lis	%r5,main_name@ha
	addi	%r5,%r5,main_name@l
	li	%r6,1
	li	%r7,0
	CALLOS	%r30,GetInterface
	mr.	%r29,%r3			# r29 IDOS
	beq	close_dos
 
	# IDOS->Output()
	CALLOS	%r29,Output
 
	# IDOS->Write(stdout,"Hello World!\n",13)
	mr	%r4,%r3
	lis	%r5,hello_world@ha
	addi	%r5,%r5,hello_world@l
	li	%r6,hello_world_end-hello_world
	CALLOS	%r29,Write
 
	# IExec->DropInterface(IDOS)
	mr	%r4,%r29
	CALLOS	%r30,DropInterface
 
close_dos:
	# IExec->CloseLibrary(DOSBase)
	mr	%r4,%r28
	CALLOS	%r30,CloseLibrary
 
release_exec:
	# IExec->Release()
	CALLOS	%r30,Release
 
	# exit(0)
	li	%r3,0
	mtlr	%r31
	lmw	%r28,8(%r1)
	addi	%r1,%r1,32
	blr
 
	.rodata
 
dos_name:
	.string	"dos.library"
main_name:
	.string	"main"
hello_world:
        .string "Hello World!"
hello_world_end:

6/0.Work:> as hello.s -o hello.o
6/0.Work:> ld -q hello.o -o hello
6/0.Work:> strip hello
6/0.Work:> filesize format=%s hello
4624

5. Hacking it for real

5.1 Linker scripts (ldscripts)

 SECTIONS
 {
   . = 0x00000000;
   .text           : { *(.text) }
 }

 shell:> as hello.s -o hello.o
 shell:> ld -Tldscript -q -o hello hello.o
 shell:> stat -c=%s hello
 =66713

SIZEOF_HEADERS: Returns the size in bytes of the output file’s headers. You can use this number as the start address of the first section, to facilitate paging.

This looks like the thing we need, so:

SECTIONS
 {
   . = SIZEOF_HEADERS;
   .text           : { *(.text) }
 }

 shell:> as hello.s -o hello.o
 shell:> ld -Tldscript -q -o hello hello.o
 shell:> stat -c=%s hello
 =1261
 
 shell:> strip hello
 shell:> stat -c=%s hello
 =832
 
 shell:> hello
 Hello World!
 shell:>

832 bytes in size, and it works!
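
As a rough sanity check of what SIZEOF_HEADERS evaluates to, the header size can be computed by hand: a 32-bit ELF header is 52 (0x34) bytes and each program header entry is 32 (0x20) bytes. The sketch below assumes the linker reserved room for a single program header at this stage; that count is a guess, not something stated by the tools.

```c
#include <stddef.h>

enum {
    ELF32_EHDR_SIZE = 52, /* size of the 32-bit ELF header             */
    ELF32_PHDR_SIZE = 32  /* size of one 32-bit program header entry   */
};

/* Bytes occupied by the ELF header plus the program header table. */
size_t elf32_sizeof_headers(size_t program_headers)
{
    return ELF32_EHDR_SIZE + program_headers * ELF32_PHDR_SIZE;
}
```

Under that assumption, elf32_sizeof_headers(1) is 84, i.e. 0x54.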

5.2 Getting rid of relocation

Now let's see what kind of sections our 832-byte binary has:

7/0.Work:> readelf -S hello
There are 7 section headers, starting at offset 0x198:
 
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000054 000054 0000f8 00  AX  0   0  1
  [ 2] .rela.text        RELA            00000000 0002f8 000048 0c      5   1  4
  [ 3] .rodata           PROGBITS        0000014c 00014c 00001e 00   A  0   0  1
  [ 4] .shstrtab         STRTAB          00000000 00016a 00002e 00      0   0  1
  [ 5] .symtab           SYMTAB          00000000 0002b0 000040 10      6   3  4
  [ 6] .strtab           STRTAB          00000000 0002f0 000008 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
 
7/0.Work:>

As you can see, there are some sections that should be relocated:

1. .rela.text - relocations for .text.
2. .rodata - data (our strings like "Hello World!", "dos.library", etc.)

# non-relocated Hello World
# by Frank Wille, January 2012
# adapted for "as" by kas1e
 
# ExecBase
.set	MainInterface,632
 
# Exec Interface
.set	Obtain,60
.set	Release,64
.set	OpenLibrary,424
.set	CloseLibrary,428
.set	GetInterface,448
.set	DropInterface,456
 
# DOS Interface
.set	Write,88
.set	Output,96
 
 
.macro CALLOS reg,val   # Interface register, function offset
	lwz %r0,\val(\reg)
	mr %r3,\reg
	mtctr %r0
	bctrl
.endm
 
	.text
 
	.global	_start
_start:
	mflr	%r0
	stw	%r0,4(%r1)
	stwu	%r1,-32(%r1)
	stmw	%r28,8(%r1)
 
	# initialize data pointer
	bl	initbase
initbase:
	mflr	%r31	# r31 initbase
 
	# get Exec-Interface
	lwz	%r30,MainInterface(%r5)	# r30 IExec
 
	# IExec->Obtain()
	CALLOS	%r30,Obtain
 
	# open dos.library and get DOS-Interface
	# IExec->OpenLibrary("dos.library",50)
	addi	%r4,%r31,dos_name-initbase
	li	%r5,50
	CALLOS	%r30,OpenLibrary
	mr.	%r28,%r3	# r28 DOSBase
	beq	release_exec
 
	# IExec->GetInterface(DOSBase,"main",1,0)
	mr	%r4,%r28
	addi	%r5,%r31,main_name-initbase
	li	%r6,1
	li	%r7,0
	CALLOS	%r30,GetInterface
	mr.	%r29,%r3	# r29 IDOS
	beq	close_dos
 
	# IDOS->Output()
	CALLOS	%r29,Output
 
	# IDOS->Write(stdout,"Hello World!\n",13)
	mr	%r4,%r3
	addi	%r5,%r31,hello_world-initbase
	li	%r6,hello_world_end-hello_world
	CALLOS	%r29,Write
 
	# IExec->DropInterface(IDOS)
	mr	%r4,%r29
	CALLOS	%r30,DropInterface
 
close_dos:
	# IExec->CloseLibrary(DOSBase)
	mr	%r4,%r28
	CALLOS	%r30,CloseLibrary
 
release_exec:
	# IExec->Release()
	CALLOS	%r30,Release
 
	# exit(0)
	li	%r3,0
	lmw	%r28,8(%r1)
	addi	%r1,%r1,32
	lwz	%r0,4(%r1)
	mtlr	%r0
	blr
 
dos_name:
	.string	"dos.library"
main_name:
	.string	"main"
hello_world:
	.string	"Hello World!"
hello_world_end:

 6/0.Work:> as hello.s -o hello.o
 6/0.Work:> ld -Tldscript hello.o -o hello
 6/0.Work:> strip hello
 6/0.Work:> stat -c=%s hello
 =644
 
 6/0.Work:> hello
 Hello World!
 6/0.Work:>

6/0.Work:> readelf -S hello
There are 5 section headers, starting at offset 0x184:
 
Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        10000054 000054 00010e 00  AX  0   0  1
  [ 2] .shstrtab         STRTAB          00000000 000162 000021 00      0   0  1
  [ 3] .symtab           SYMTAB          00000000 00024c 000030 10      4   2  4
  [ 4] .strtab           STRTAB          00000000 00027c 000008 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
 
6/0.Work:>
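
The trick used in this version (bl initbase / mflr %r31, then label-initbase offsets) boils down to "runtime base plus a constant distance", which keeps working wherever the code is loaded. A hedged C illustration follows; the blob layout, HELLO_OFFSET and the function names are invented for the example and have nothing to do with the real binary.

```c
#include <string.h>

/* Hypothetical code+data blob: the first 4 bytes stand in for code,
   the string sits at a fixed distance from the start. */
static const char blob[] = "CODEHello World!";

enum { HELLO_OFFSET = 4 }; /* plays the role of hello_world-initbase */

/* Equivalent of `addi %r5,%r31,hello_world-initbase`: take the base
   found at run time and add a distance known at link time. */
const char *locate_hello(const char *runtime_base)
{
    return runtime_base + HELLO_OFFSET;
}

/* Copies the blob to a different address and checks that the string
   is still found, without any relocation being applied. */
int works_after_move(void)
{
    char moved[sizeof blob];
    memcpy(moved, blob, sizeof blob);
    return strcmp(locate_hello(blob), "Hello World!") == 0
        && strcmp(locate_hello(moved), "Hello World!") == 0;
}
```

No absolute address appears anywhere in the lookup, which is why the linker has no relocation entries to emit for it.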

5.3 The ELF loader

* ELF Header
* Program (segments) header table
* Segments
* Section header table
* optional sections (certain sections, for example .shstrtab, can sometimes come before the section header table)

IMAGE(http://kas1e.mikendezign.com/aos4/docs/hack1/image_3_full_file_tumb.png)

  db 0x7f, "ELF"         ; magic
  db 1,2,1               ; 32 bits, big endian, version 1
  db 0,0,0,0,0,0,0,0,0   ; OS info
 
  db 0,2                 ; e_type (2 = executable)
  db 0,0x14              ; e_machine (0x14 = PowerPC)
  db 0,0,0,1             ; e_version (must always be set to 1)
  dd 0x10000054          ; e_entry - entry point (makes no sense on OS4)
  dd 0x00000034          ; e_phoff - program header table file offset in bytes
  dd 0x00000184          ; e_shoff - section header table file offset in bytes
  db 0,0,0,0             ; e_flags - processor-specific flags
  db 0,0x34              ; e_ehsize - size of the ELF header in bytes
 
 
  db 0,0x20              ; e_phentsize - size in bytes of one entry in the program header table (all entries are the same size)
  db 0,2                 ; e_phnum - number of entries in the program header table
 
  db 0,0x28              ; e_shentsize - size in bytes of one entry in the section header table
  db 0,5                 ; e_shnum - number of entries in the section header table
  db 0,2                 ; e_shstrndx - section header table index of the entry associated with the section name string table

     * magic (first 7 bytes): db 0x7f,"ELF", 0x01,0x02,0x01 (100% required)
     * all the subsequent fields are not parsed at all and can contain any data, until the loader reaches the section header table's file offset field (required)
     * then again there can be any data, until e_phnum (the number of entries in the program header table, which is required as well)
     * and then the next 8 bytes of info (4 fields) about section headers/sections are required
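
The fields named in this list can be pulled out of a header with a few lines of C. The sketch below is not the OS4 loader and makes no claim about how it is implemented; it only checks the 7 magic bytes and reads the big-endian fields at their standard ELF32 offsets. The struct and function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Big-endian readers, since the file is big endian (second "2" in the magic). */
static uint16_t be16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

typedef struct {
    uint32_t shoff;     /* e_shoff,     file offset 32 */
    uint16_t phnum;     /* e_phnum,     file offset 44 */
    uint16_t shentsize; /* e_shentsize, file offset 46 */
    uint16_t shnum;     /* e_shnum,     file offset 48 */
    uint16_t shstrndx;  /* e_shstrndx,  file offset 50 */
} LoaderFields;

/* Returns 1 and fills *out if the 7 magic bytes match, 0 otherwise. */
int read_loader_fields(const uint8_t hdr[52], LoaderFields *out)
{
    static const uint8_t magic[7] = { 0x7f, 'E', 'L', 'F', 1, 2, 1 };
    assert(hdr != NULL && out != NULL);

    if (memcmp(hdr, magic, sizeof magic) != 0)
        return 0;

    out->shoff     = be32(hdr + 32);
    out->phnum     = be16(hdr + 44);
    out->shentsize = be16(hdr + 46);
    out->shnum     = be16(hdr + 48);
    out->shstrndx  = be16(hdr + 50);
    return 1;
}
```

Run on the 52 header bytes listed above, it reports shoff = 0x184, phnum = 2, shentsize = 0x28, shnum = 5 and shstrndx = 2.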

IMAGE(http://kas1e.mikendezign.com/aos4/docs/hack1/image_4_elf_header.png)

IMAGE(http://kas1e.mikendezign.com/aos4/docs/hack1/image_5_moved_sections_tumb.png)

6. Final Words

7. Links

ELF specification: http://flint.cs.yale.edu/cs422/doc/ELF_Format.pdf
PPC SYSV4 ABI.pdf: http://refspecs.linuxbase.org/elf/elfspec_ppc.pdf
Green Book (MPCFPE32B.pdf): http://www.freescale.com/files/product/doc/MPCFPE32B.pdf
GDB.txt: http://www.gnu.org/s/GDB/documentation/
Linker Scripts: http://sourceware.org/binutils/docs/ld/Scripts.html#Scripts
or SDK:Documentation/CompilerTools/ld.pdf , chapter 3.0 "Linker Scripts"

P.S. You can also get the PDF version of this article (see the attachment).

Blog post type:

Tutorial

Attachment	Size
the_hacking_way._part_1._first_steps.zip	1.05 MB

Comments

Submitted by Belxjander on Mon, 2012-05-14 15:09

Is there a new location for the PPC SYSV4 ABI pdf specification?

as trying the link here reports some kind of error about it having moved

Just a heads up...
