Difference between revisions of "Assembler and ABI Resources"

From Free Pascal wiki
Jump to navigationJump to search
m (More text.)
Line 17: Line 17:
 
Assembler source emitted by the compiler's code generator has to be (a subset of what is) acceptable to the assembler for the relevant target CPU. In addition, small portions of the RTL (e.g. prt0.as) are of necessity written in assembler, and some Pascal source files (e.g. syscall.inc) contain inline assembler which the compiler has to be able to parse before it is passed to the backend.
 
Assembler source emitted by the compiler's code generator has to be (a subset of what is) acceptable to the assembler for the relevant target CPU. In addition, small portions of the RTL (e.g. prt0.as) are of necessity written in assembler, and some Pascal source files (e.g. syscall.inc) contain inline assembler which the compiler has to be able to parse before it is passed to the backend.
  
 +
The list of CPUs below is taken from the compiler as of late 2011. Some of these are no longer supported, or exist merely as minimal stubs.
  
 
+
=== Alpha ===
 
+
=== ARM ===
 +
=== AVR ===
 +
=== i386 ===
 +
=== IA-64 ===
 +
=== M68K ===
 +
=== MIPS ===
 +
=== PowerPC ===
 +
=== PowerPC-64 ===
 +
=== SPARC ===
 +
=== VIS ===
 +
=== x86 ===
 +
=== x86-64 ===
  
 
== ABI references ==
 
== ABI references ==

Revision as of 10:45, 9 October 2011

The Assembler

The FPC Pascal Compiler translates Pascal source code into assembly language which is then processed by an assembler running as a separate backend. Some other Pascal compilers directly generate object modules or executable programs directly, i.e. they do not require a separate assembler.

An assembler is itself an executable program that translates assembly language into an object module. In most cases the object modules are passed to a linker which then produces an executable program, although in some there are additional stages (code signing for secure operating systems, conversion to a binary for embedded systems and so on).

The ABI

The interface between an executable program and the underlying operating system is referred to as the Application Binary Interface or ABI. This includes the CPU's operating mode (e.g. whether word and address sizes default to 32 or 64 bits), operand alignment, function calling conventions, system call numbers, and a selection of constants (e.g. file open modes) and structures (e.g. as returned by the stat() function). It is also usually considered to include the format of the object modules, executable and library files.

Obviously the ABI is grossly different between operating systems: in general a program compiled for Windows will not run on Linux and vice versa. In addition, however, there is a significant amount of variation between different "flavours" of related operating systems, for example not only are the system call numbers different between SPARC Solaris and SPARC Linux but they are different between SPARC Linux and x86 Linux.

Purpose of this note

In most cases FPC uses the GNU assembler (as or gas) as its backend. However, the assembly language syntax expected by this is different for each target CPU, sections below give examples of this. The original incentive for this was because the author (MarkMLl) found that he needed to write an assembler reader for the MIPS processor, and that there was no straightforward comparison of existing formats on which he could base new code.

In addition, in some cases the details of the assembly language format or the ABI specification are only available to users registered with the relevant manufacturer, where possible links to unofficial mirrors are given below for casual reference.

Assembler source formats

Assembler source emitted by the compiler's code generator has to be (a subset of what is) acceptable to the assembler for the relevant target CPU. In addition, small portions of the RTL (e.g. prt0.as) are of necessity written in assembler, and some Pascal source files (e.g. syscall.inc) contain inline assembler which the compiler has to be able to parse before it is passed to the backend.

The list of CPUs below is taken from the compiler as of late 2011. Some of these are no longer supported, or exist merely as minimal stubs.

Alpha

ARM

AVR

i386

IA-64

M68K

MIPS

PowerPC

PowerPC-64

SPARC

VIS

x86

x86-64

ABI references

Other resources

As a general point, there's some useful thoughts on binary disassembly at http://chdk.wikia.com/wiki/GPL_Disassembling for situations where IDA or equivalent aren't available.