Difference between revisions of "Writing portable code regarding the processor architecture"

From Free Pascal wiki
Revision as of 16:09, 4 May 2004

There are several main issues when writing code that is portable across processor architectures: endianness, alignment, and 32 vs. 64 bit processors.

Endianness

Endianness is the order in which the processor stores the bytes of values larger than a byte (e.g. a 16, 32 or 64 bit integer) in memory.

Generally there are two ways:

  1. Store the least significant byte at the lowest address
  2. Store the most significant byte at the lowest address

The first way is called little endian, and the second way big endian.

This is generally fixed per processor family, but some families (ARM, PPC) can run in either mode, depending on the mainboard they are attached to.

The best known little endian processor family is x86, the processor in PCs, and its sibling x86-64.

Typical big endian processors are PPC (usually; see the note above) and Motorola's m68k.

Since the TCP/IP protocols specify that multi-byte fields in their headers go over the wire in big endian order, this byte order is also referred to as network order.
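
A minimal sketch of converting a value to network order and back. It assumes the NtoBE and BEtoN functions of the FPC system unit; the program and variable names are made up for illustration.

```pascal
program NetworkOrderDemo;

var
  Host, Wire : longint;

begin
  Host := $11223344;
  // Native to big endian: a byte swap on little endian targets,
  // a no-op on big endian targets.
  Wire := NtoBE(Host);
  // Big endian back to native: the round trip restores the value
  // on either kind of target.
  if BEtoN(Wire) = Host then
    writeln('round trip ok');
end.
```

Converting at the boundary (just before sending, just after receiving) keeps the rest of the program free of byte order concerns.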

Endianness is important

  1. when exchanging data between different architectures
  2. when accessing data sometimes as (an array of) a larger type, like integer, and sometimes as (an array of) bytes

An example of the latter:

var
  x : ^longint;

begin
  new(x);
  x^ := 5;
  // Read only the byte at the lowest address of x^ and print it as a digit
  writeln(chr(ord(pchar(x)^) + 48));  // writes 5 or 0 depending on endianness
  dispose(x);
end.

On little endian machines (PCs), the above code writes 5 (since longint(5) is stored as 05 00 00 00 in memory), while on big endian machines (e.g. Power Macs) it writes 0 (since longint(5) is stored as 00 00 00 05 in memory).
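
One way to avoid this dependency is to extract the bytes with shifts instead of reading them from memory. The sketch below assumes the ENDIAN_LITTLE and ENDIAN_BIG defines that FPC sets for the target; the program and variable names are made up for illustration.

```pascal
program EndianDemo;

var
  Value : longword;
  Bytes : array[0..3] of byte;
  i     : integer;

begin
  // Compile-time information about the target's byte order
  {$ifdef ENDIAN_LITTLE}
  writeln('compiled for a little endian target');
  {$endif}
  {$ifdef ENDIAN_BIG}
  writeln('compiled for a big endian target');
  {$endif}

  Value := 5;
  // Shifts operate on the value, not on its memory layout, so this
  // always puts the most significant byte first.
  for i := 0 to 3 do
    Bytes[i] := (Value shr (8 * (3 - i))) and $FF;

  for i := 0 to 3 do
    write(Bytes[i], ' ');   // 0 0 0 5 on any target
  writeln;
end.
```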

Alignment

Some processors (e.g. Alpha or ARM) raise a hardware exception when data is badly aligned. Sometimes the OS catches the exception and fixes up the access by emulation, but this is very slow and should be avoided. Alignment requirements can also cause the same record to have different sizes on different targets, so always use sizeof(recordtype) for the size of a record. If you define a packed record, try to ensure that its fields are still naturally aligned, if possible.
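
The effect of padding on record size can be seen with a small sketch. The record types here are hypothetical, and the size of the unpacked record is deliberately not stated because it depends on the target and on the compiler's alignment settings.

```pascal
program RecordSizeDemo;

type
  // Hypothetical record types, for illustration only
  TPlainRec = record
    Tag   : byte;
    Value : longint;
  end;

  TPackedRec = packed record
    Tag   : byte;
    Value : longint;   // starts at offset 1, so it is not naturally aligned
  end;

begin
  // May include padding after Tag; the result is target dependent
  writeln('plain  : ', SizeOf(TPlainRec));
  // No padding: 1 + 4 = 5 bytes
  writeln('packed : ', SizeOf(TPackedRec));
end.
```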

To check whether the CPU requires proper alignment, test the FPC_REQUIRES_PROPER_ALIGNMENT define. On 32 bit CPUs this usually means that data up to a size of 4 bytes must be naturally aligned. If you have to access unaligned data, use the move procedure to copy it to an aligned location before processing it; move handles unaligned data properly.
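
A minimal sketch of reading a longint from an unaligned position in a byte buffer with move. The buffer contents and names are made up for illustration; the printed value depends on the target's endianness, so no expected output is given.

```pascal
program UnalignedReadDemo;

var
  Buf   : array[0..7] of byte;
  Value : longint;
  i     : integer;

begin
  for i := Low(Buf) to High(Buf) do
    Buf[i] := i;

  {$ifdef FPC_REQUIRES_PROPER_ALIGNMENT}
  writeln('this target requires proper alignment');
  {$endif}

  // Buf[1] is generally not 4-byte aligned, so it is not read through
  // a ^longint. Move copies the bytes into the aligned variable Value.
  Move(Buf[1], Value, SizeOf(Value));
  writeln(Value);
end.
```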

32 Bit vs. 64 Bit

To achieve maximum compatibility with older code, FPC doesn't change the size of predefined data types like integer, longint or word when changing from 32 to 64 bit. However, a pointer is 8 bytes on a 64 bit architecture, so constructs like longint(pointer(p)) cannot hold the full address and are doomed to crash on 64 bit architectures. To allow you to write portable code, the FPC system unit introduces the types PtrInt and PtrUInt, which are signed and unsigned integer types with the same size as a pointer.
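
A short sketch of using PtrUInt instead of longint when a pointer has to be treated as a number. The alignment check is only an illustrative use; the names are made up.

```pascal
program PtrIntDemo;

var
  P    : pointer;
  Addr : PtrUInt;

begin
  GetMem(P, 16);

  // PtrUInt has the same size as a pointer on every target,
  // so this typecast does not lose address bits.
  Addr := PtrUInt(P);

  writeln('pointer size : ', SizeOf(pointer));
  writeln('PtrUInt size : ', SizeOf(PtrUInt));

  if (Addr mod 4) = 0 then
    writeln('block is 4-byte aligned');

  FreeMem(P);
end.
```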

Calling conventions