Assembler and ABI Resources
Revision as of 18:22, 9 October 2011
The Assembler
The Free Pascal Compiler (FPC) translates Pascal source code into assembly language, which is then processed by an assembler running as a separate backend. Some other Pascal compilers generate object modules or executable programs directly, i.e. they do not require a separate assembler.
An assembler is itself an executable program that translates assembly language into an object module. In most cases the object modules are passed to a linker which then produces an executable program, although in some cases there are additional stages (code signing for secure operating systems, conversion to a binary for embedded systems and so on).
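The assemble-then-link sequence described above can be sketched as a pair of command lines. This is only an illustration: the file stem hello is hypothetical, the commands are built but not run, and a real link step normally also needs startup code and libraries, which are omitted here. Only the -o output option of as and ld is used.

```python
def build_commands(stem):
    """Return the assembler and linker command lines for one source file.

    `stem` is a hypothetical file name without extension; nothing is executed.
    """
    asm_file = stem + ".s"   # assembly language emitted by the compiler
    obj_file = stem + ".o"   # object module produced by the assembler
    return [
        ["as", "-o", obj_file, asm_file],  # assembler: .s -> .o
        ["ld", "-o", stem, obj_file],      # linker: .o -> executable
    ]


if __name__ == "__main__":
    for command in build_commands("hello"):
        print(" ".join(command))
```

Each list could be handed to subprocess.run() on a system where the GNU binutils are installed.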
The ABI
The interface between an executable program and the underlying operating system is referred to as the Application Binary Interface or ABI. This includes the CPU's operating mode (e.g. whether word and address sizes default to 32 or 64 bits), operand alignment, function calling conventions, system call numbers, and a selection of constants (e.g. file open modes) and structures (e.g. as returned by the stat() function). It is also usually considered to include the formats of object modules, executables and library files.
Obviously the ABI differs grossly between operating systems: in general a program compiled for Windows will not run on Linux and vice versa. In addition, there is a significant amount of variation between different "flavours" of related operating systems. For example, the system call numbers differ not only between SPARC Solaris and SPARC Linux but also between SPARC Linux and x86 Linux.
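As a small illustration of how system call numbers vary between ABIs of the same operating system, the sketch below compares a handful of Linux calls on 32-bit x86 and x86-64. These two ABIs are chosen here for illustration rather than the SPARC pair mentioned above; the table layout and function names are hypothetical, and only five calls are listed.

```python
# A few Linux system call numbers for two ABIs (illustrative subset only).
SYSCALL_NUMBERS = {
    "i386":   {"exit": 1,  "read": 3, "write": 4, "open": 5, "close": 6},
    "x86-64": {"exit": 60, "read": 0, "write": 1, "open": 2, "close": 3},
}


def syscall_number(abi, name):
    """Look up the number of system call `name` under `abi`."""
    return SYSCALL_NUMBERS[abi][name]


def differing_calls(abi_a, abi_b):
    """Return the sorted names of calls whose numbers differ between two ABIs."""
    table_a, table_b = SYSCALL_NUMBERS[abi_a], SYSCALL_NUMBERS[abi_b]
    return sorted(name for name in table_a
                  if name in table_b and table_a[name] != table_b[name])


if __name__ == "__main__":
    print(syscall_number("i386", "write"), syscall_number("x86-64", "write"))
    print(differing_calls("i386", "x86-64"))
```

Even in this short list every number differs, which is why system call tables have to be maintained per target rather than per operating system.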
Purpose of this note
In most cases FPC uses the GNU assembler (as or gas) as its backend. However, the assembly language syntax that it expects differs for each target CPU; the sections below give examples. The original motivation for this note was that the author (MarkMLl) needed to write an assembler reader for the MIPS processor and found no straightforward comparison of existing formats on which to base new code.
In addition, in some cases the details of the assembly language format or the ABI specification are available only to users registered with the relevant manufacturer. Where possible, links to unofficial mirrors are given below for casual reference.
Assembler source formats
Assembler source emitted by the compiler's code generator has to be (a subset of what is) acceptable to the assembler for the relevant target CPU. In addition, small portions of the RTL (e.g. prt0.as) are of necessity written in assembler, and some Pascal source files (e.g. syscall.inc) contain inline assembler which the compiler has to be able to parse before it is passed to the backend.
The list of CPUs below is taken from the compiler as of late 2011. Some of these are no longer supported, or exist merely as minimal stubs.
Alpha
This compiler exists only as a minimal stub.
ARM
This fragment is from FpSysCall alias FPC_SYSCALL6 in FPC's ./rtl/linux/arm/syscall.inc:
 asm
   stmfd sp!,{r4,r5,r6}
   ldr r4,param4
   ldr r5,param5
   ldr r6,param6
   bl FPC_SYSCALL
   ldmfd sp!,{r4,r5,r6}
 end;
This fragment is from ret_from_fork in Linux's ./arch/arm/kernel/entry-common.S:
 ENTRY(ret_from_fork)
   bl schedule_tail
   get_thread_info tsk
   ldr r1, [tsk, #TI_FLAGS] @ check for syscall tracing
   mov why, #1
   tst r1, #_TIF_SYSCALL_TRACE @ are we tracing syscalls?
   beq ret_slow_syscall
   mov r1, sp
   mov r0, #1 @ trace exit [IP = 1]
   bl syscall_trace
   b ret_slow_syscall
 ENDPROC(ret_from_fork)
Note that register names are r0, r1 etc. without a sigil, and that assignment is right-to-left.
AVR
This fragment is from ret_from_fork in Linux's ./arch/avr32/kernel/entry-avr32b.S:
 ret_from_fork:
   call schedule_tail

   /* check for syscall tracing */
   get_thread_info r0
   ld.w r1, r0[TI_flags]
   andl r1, _TIF_ALLWORK_MASK, COH
   brne syscall_exit_work
   rjmp syscall_exit_cont
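Note that register names are r0, r1 etc. without a sigil, and that assignment is right-to-left. Note also that AVR32 is a 32-bit architecture distinct from the 8-bit AVR family, so this fragment is not necessarily representative of 8-bit AVR assembler.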
i386
This fragment is from FpSysCall alias FPC_SYSCALL6 in FPC's ./rtl/linux/i386/syscall.inc:
 asm
   push %ebx
   push %edx
   push %esi
   push %edi
   push %ebp
   push %ecx
   cmp $0, sysenter_supported
   jne .LSysEnter
   movl %edx,%ebx // param1
   pop %ecx // param2
   movl param3,%edx // param3
   movl param4,%esi // param4
   movl param5,%edi // param5
   movl param6,%ebp // param6
   int $0x80
   jmp .LTail
 .LSysEnter:
   movl %edx,%ebx // param1
   pop %ecx // param2
   movl param3,%edx // param3
   movl param4,%esi // param4
   movl param5,%edi // param5
   movl param6,%ebp // param6
   call psysinfo
 .LTail:
   pop %ebp
   pop %edi
   pop %esi
   pop %edx
   pop %ebx
   cmpl $-4095,%eax
   jb .LSyscOK
   negl %eax
   call seterrno
   movl $-1,%eax
 .LSyscOK:
 end;
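The tail of the FPC fragment above (from cmpl $-4095,%eax to .LSyscOK) implements the usual Linux convention that a raw return value in the range -4095 to -1 is a negated error number. The following is a rough model of that logic, not FPC's code: the function name and the returned tuple are hypothetical, and the 32-bit register is imitated with masking.

```python
MASK32 = 0xFFFFFFFF  # model a 32-bit register with Python integers


def decode_syscall_result(eax):
    """Return (result, errno) for a raw 32-bit value left in eax by the kernel.

    Mirrors: cmpl $-4095,%eax / jb .LSyscOK / negl %eax / movl $-1,%eax
    """
    eax &= MASK32
    threshold = (-4095) & MASK32      # 0xFFFFF001 when viewed as unsigned
    if eax < threshold:               # jb: unsigned "below", so the call succeeded
        return eax, 0
    error_number = (-eax) & MASK32    # negl %eax gives the positive errno value
    return -1, error_number           # the fragment returns -1 after seterrno


if __name__ == "__main__":
    print(decode_syscall_result(3))
    print(decode_syscall_result((-2) & MASK32))
```

The comparison has to be unsigned: large successful results such as addresses have the top bit set and would otherwise be mistaken for errors.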
This fragment is from ret_from_fork in Linux's ./arch/x86/kernel/entry_32.S:
 ENTRY(ret_from_fork)
   CFI_STARTPROC
   pushl %eax
   CFI_ADJUST_CFA_OFFSET 4
   call schedule_tail
   GET_THREAD_INFO(%ebp)
   popl %eax
   CFI_ADJUST_CFA_OFFSET -4
   pushl $0x0202 # Reset kernel eflags
   CFI_ADJUST_CFA_OFFSET 4
   popfl
   CFI_ADJUST_CFA_OFFSET -4
   jmp syscall_exit
   CFI_ENDPROC
 END(ret_from_fork)
Note that register names are eax, ebx etc. with % as a sigil, and that assignment is left-to-right (the AT&T operand order that the GNU assembler uses by default on x86).
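The sigil and operand-order notes for ARM and i386 can be compared side by side with a small normaliser of the kind an assembler reader might start from. This is a minimal sketch under stated assumptions: the DIALECTS table, its field names and both functions are hypothetical, only these two dialects are covered, and "destination" simply means the operand in the destination position (instructions such as cmp or push have no real destination).

```python
# Hypothetical per-dialect rules, taken from the notes in the sections above.
DIALECTS = {
    "arm":  {"dest_first": True,  "sigil": "",  "comments": ("@",)},
    "i386": {"dest_first": False, "sigil": "%", "comments": ("#", "//")},
}


def split_top_level(text):
    """Split on commas that are not nested inside [], {} or ()."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch in "[{(":
            depth += 1
        elif ch in "]})":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def normalise(line, dialect):
    """Return (mnemonic, destination, sources) with register sigils removed."""
    rules = DIALECTS[dialect]
    code = line
    for marker in rules["comments"]:          # drop any trailing comment
        code = code.split(marker, 1)[0]
    fields = code.split(None, 1)
    if not fields:
        return None                           # blank or comment-only line
    mnemonic = fields[0]
    operands = split_top_level(fields[1]) if len(fields) > 1 else []
    if not rules["dest_first"]:
        operands.reverse()                    # put the destination first
    sigil = rules["sigil"]
    if sigil:
        operands = [op[len(sigil):] if op.startswith(sigil) else op
                    for op in operands]
    dest = operands[0] if operands else None
    return mnemonic, dest, operands[1:]


if __name__ == "__main__":
    print(normalise("mov r1, sp", "arm"))
    print(normalise("movl %edx,%ebx // param1", "i386"))
```

Both example lines yield the destination register first, which makes the difference in source syntax explicit: r1 is written first in the ARM line, while ebx is written last in the i386 line.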
IA-64
This compiler exists only as a minimal stub.
M68K
This compiler exists in FPC v1 but has never been ported to v2.
MIPS
PowerPC
PowerPC-64
SPARC
VIS
This compiler exists only as a minimal stub.
x86
Refer to i386 above.
x86-64
ABI references
The list of CPUs etc. is based on those found in the compiler (see above).
Alpha
ARM
AVR
i386
IA-64
M68K
MIPS
PowerPC
PowerPC-64
SPARC
VIS
x86
x86-64
Other resources
As a general point, there are some useful thoughts on binary disassembly at http://chdk.wikia.com/wiki/GPL_Disassembling for situations where IDA or an equivalent isn't available.