Difference between revisions of "Mach-O"

From Free Pascal wiki
 
(7 intermediate revisions by 3 users not shown)
Line 1: Line 1:
from [http://en.wikipedia.org/wiki/Mach-O Wikipedia]:
+
{{Platform only|macOS}}
 +
{{LanguageBar}}
  
''Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically-loaded code, and core dumps. A derivation of the a.out format, Mach-O offered more extensibility and faster access to information in the symbol table.''
 
  
 +
''Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically-loaded code, and core dumps. A derivation of the a.out format, Mach-O offered more extensibility and faster access to information in the symbol table.'' Source: [http://en.wikipedia.org/wiki/Mach-O Wikipedia].
  
File Format Reference can be found [http://developer.apple.com/documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html here]
+
The Apple file format reference can be found [https://developer.apple.com/library/archive/documentation/Performance/Conceptual/CodeFootprint/Articles/MachOOverview.html here]
  
  
Following tools are used in Mac OS X to view Mach-O files:
+
The following tools are used in macOS to view Mach-O files:
 +
 
 +
'''otool''' - object file displaying tool
 +
 
 +
'''nm''' - display name list (symbol table)
  
[http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/otool.1.html otool] - object file displaying tool
 
  
[http://developer.apple.com/documentation/Darwin/Reference/ManPages/man1/nm.1.html nm] - display name list (symbol table)
 
  
 
== Objective-C segment ==
 
== Objective-C segment ==
Line 83: Line 86:
 
   super_class  : PChar;  // contains the pointer to super_class name
 
   super_class  : PChar;  // contains the pointer to super_class name
 
   name          : PChar;  // pointer to class name
 
   name          : PChar;  // pointer to class name
   version      : PChat;  // = 0 (for obj-c version 1?)
+
   version      : PChar;  // = 0 (for obj-c version 1?)
 
   
 
   
 
   info          : culong;  // CLS_CLASS for classes
 
   info          : culong;  // CLS_CLASS for classes
Line 89: Line 92:
 
   
 
   
 
   instance_size : culong;  // size of the instance:
 
   instance_size : culong;  // size of the instance:
                             // sizeof(objc_class) + VariablesSize (+ 8 bytes if meta-class)
+
                             // class: size of the class instance
 +
                            // meta-class: 48 bytes
 
                              
 
                              
 
   ivars        : Pobjc_ivar_list;      // virtual address of objc_ivar_list (stored in "__instance_vars" section)
 
   ivars        : Pobjc_ivar_list;      // virtual address of objc_ivar_list (stored in "__instance_vars" section)
Line 99: Line 103:
 
    
 
    
 
   cache        : Pobjc_cache;          // zero
 
   cache        : Pobjc_cache;          // zero
   protocols    : Pobjc_protocol_list;  // todo:
+
   protocols    : Pobjc_protocol_list;  // pointer to protocols list. (stored in "__cat_cls_meth" section)
 
  end;
 
  end;
  
Line 142: Line 146:
  
 
=== __protocol ===
 
=== __protocol ===
 +
 +
=== __cat_cls_meth (protocols list) ===
  
 
=== __cat_inst_meth ===
 
=== __cat_inst_meth ===
Line 154: Line 160:
 
  end.
 
  end.
  
gives a 30k (stripped) executable for Win, and 60k for Mac OS X.
+
gives a 30k (stripped) executable for Win, and 60k for macOS.
 +
 
 +
Jonas Maebe: ''It's because there was a bug in older versions of the Darwin linker that required adding ".reference" assembler directives for routines that have more than one assembler name (most of the compiler helpers in the RTL have that). This fixed the problem, but as a result they are never smart linked out. It's only a fixed overhead of that 30kb (most programs don't contain any extra routines with multiple assembler names)''
  
 +
FPC 3.3.1 (trunk r44876) produces a 64 bit binary of 427,920 bytes; add -XX to smart link and it is reduced to 55,656 bytes; now strip it and it's just 47,224 bytes (~46K) which is pretty reasonable.
  
Jonas Maebe: ''It's because there was a bug in older versions of the Darwin linker that required adding ".reference" assembler directives for routines that have more than one assembler name (most of the compiler helpers in the rtl have that). This fixed the problem, but as a result they are never smart linked out. It's only a fixed overhead of that 30kb (most programs don't contain any extra routines with multiple assembler names)''
+
[[Category: macOS]]

Latest revision as of 14:11, 17 May 2020


This article applies to macOS only.

See also: Multiplatform Programming Guide



Mach-O, short for Mach object file format, is a file format for executables, object code, shared libraries, dynamically-loaded code, and core dumps. A derivation of the a.out format, Mach-O offered more extensibility and faster access to information in the symbol table. Source: Wikipedia.

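The Apple file format reference can be found at https://developer.apple.com/library/archive/documentation/Performance/Conceptual/CodeFootprint/Articles/MachOOverview.html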


The following tools are used in macOS to view Mach-O files:

otool - object file displaying tool

nm - display name list (symbol table)


Objective-C segment

There is no official documentation for the __OBJC segment and its sections. The following information has been gathered from the cctools sources.

Structures strings

Some structures in sections contain name pointers. These names are stored in the c-strings section (segment: __TEXT; section: __cstring). The file offset for the string name can be evaluated in the following way:

string_file_offset := cstr_section.offset + (name_addr - cstr_section.addr);
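The formula can be tried out with a small stand-alone program. This is only a sketch: the TSectionInfo record and the StringFileOffset function are hypothetical names introduced here, and the addresses are made-up values rather than data read from a real Mach-O file.

```pascal
program CStringOffset;
{$mode objfpc}

type
  // Hypothetical subset of a section header: only the two fields
  // the formula needs.
  TSectionInfo = record
    addr   : QWord;  // virtual memory address of the section
    offset : QWord;  // file offset of the section
  end;

// Translates the virtual address of a name into a file offset,
// using the formula given in the text.
function StringFileOffset(const cstr: TSectionInfo; name_addr: QWord): QWord;
begin
  Result := cstr.offset + (name_addr - cstr.addr);
end;

var
  cstr: TSectionInfo;
begin
  // Made-up values for illustration only.
  cstr.addr   := $1F00;
  cstr.offset := $0F00;
  // A name at address $1F24 lies $24 bytes into the section,
  // so its file offset is $0F00 + $24 = $0F24 (3876 decimal).
  WriteLn(StringFileOffset(cstr, $1F24));
end.
```

The subtraction gives the position of the name inside the section, and adding the section's file offset moves that position into file coordinates.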

Sections

__image_info

The section contains only the image info record:

imageInfo = packed record
  version : uint32_t;  // zero
  flags   : uint32_t;  // for objc 1.0 - zero
end;

Flags values:

ImageInfo_F_and_C = $01;
ImageInfo_GC      = $02;
ImageInfo_GC_only = $04;
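Since the flag values are distinct bits, a flags field can be tested with a bitwise and. The sketch below assumes the flags may be combined; DescribeFlags and AddName are hypothetical helpers, not part of the objc headers.

```pascal
program ImageInfoFlags;
{$mode objfpc}

const
  ImageInfo_F_and_C = $01;
  ImageInfo_GC      = $02;
  ImageInfo_GC_only = $04;

// Appends a flag name to a comma-separated list.
procedure AddName(var s: string; const name: string);
begin
  if s <> '' then
    s := s + ', ';
  s := s + name;
end;

// Returns the names of all flag bits set in the value, or 'none'.
function DescribeFlags(flags: LongWord): string;
begin
  Result := '';
  if (flags and ImageInfo_F_and_C) <> 0 then AddName(Result, 'F_and_C');
  if (flags and ImageInfo_GC) <> 0 then AddName(Result, 'GC');
  if (flags and ImageInfo_GC_only) <> 0 then AddName(Result, 'GC_only');
  if Result = '' then
    Result := 'none';
end;

begin
  WriteLn(DescribeFlags(0));                                   // objc 1.0 image
  WriteLn(DescribeFlags(ImageInfo_GC or ImageInfo_GC_only));
end.
```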

__module_info

(The objc_module record is declared in the objc headers.)

The number of objc_module structures depends on the number of compiled .m files that contain object declarations. _symtab contains the number of classes and categories declared in the module.

objc_module = packed record
  version : culong; // version number = 7 
  size    : culong; // sizeof(objc_module)?
  name    : PChar;  // virtual memory address of the module name
                    // Usually mapped to NULL string
  _symtab : Symtab; // virtual memory address of a proper objc_symtab structure (in __symbols section)
end;

__symbols

The section contains the symbol table for a module. (The objc_symtab record is declared in the objc headers.)

 objc_symtab = record
   sel_ref_cnt : culong;  // zero
   refs        : PSEL;    // zero
   cls_def_cnt : cushort; // number of declared classes in the module
   cat_def_cnt : cushort; // number of declared categories
   defs: array [0..cls_def_cnt+cat_def_cnt-1] of Pointer; // array of virtual address of declarations
      // 0..cls_def_cnt-1                        : virtual addresses of class declarations (in "__class" section, can be empty)
      // cls_def_cnt..cls_def_cnt+cat_def_cnt-1  : virtual addresses of category declarations (can be empty)
 end;
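The split of the defs array into classes and categories can be illustrated as follows. This is a sketch: it takes the array bound cls_def_cnt+cat_def_cnt-1 from the declaration, DefKind is a hypothetical helper, and the counts are made up.

```pascal
program SymtabDefKinds;
{$mode objfpc}

// Classifies an index into objc_symtab.defs: class entries come first,
// category entries follow them.
function DefKind(index, cls_def_cnt, cat_def_cnt: Integer): string;
begin
  if (index >= 0) and (index < cls_def_cnt) then
    Result := 'class'
  else if (index >= cls_def_cnt) and (index < cls_def_cnt + cat_def_cnt) then
    Result := 'category'
  else
    Result := 'out of range';
end;

var
  i: Integer;
begin
  // Made-up module with 2 classes and 1 category: defs has 3 entries.
  for i := 0 to 3 do
    WriteLn(i, ': ', DefKind(i, 2, 1));
end.
```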

__class, __meta_class

These sections may be omitted if none of the modules declare any custom classes.

Both sections use the identical objc_class structure. (The objc_class record is declared in the objc headers.) Objective-C classes are declared in pairs with their meta-classes.

objc_class = record
  isa           : PChar;   // for class declaration: virtual address of meta-class declaration
                           // for meta-class declaration: virtual address of "NSObject" string?

  super_class   : PChar;   // contains the pointer to super_class name
  name          : PChar;   // pointer to class name 		
  version       : PChar;   // = 0 (for obj-c version 1?)

  info          : culong;  // CLS_CLASS for classes
                           // CLS_META  for meta-classes 

  instance_size : culong;  // size of the instance:
                           // class: size of the class instance 
                           // meta-class: 48 bytes 
                           
  ivars         : Pobjc_ivar_list;       // virtual address of objc_ivar_list (stored in "__instance_vars" section)
                                         // meta-classes don't have ivars list (=0)

  methodLists   : PPobjc_method_list;    // virtual address of objc_method_list
                                         // class declaration has instance methods list (stored in "__inst_meth" section)
                                         // meta-class declaration has class methods list (stored in "__cls_meth" section)
 
  cache         : Pobjc_cache;           // zero
  protocols     : Pobjc_protocol_list;   // pointer to protocols list. (stored in "__cat_cls_meth" section)
end;

__instance_vars

The section is optional and may be omitted if none of the classes declare instance variables.

The section consists of a number of objc_ivar_list structures. The number of structures depends on the number of declared classes that use instance variables. (objc_ivar_list is declared in the objc headers.)

objc_ivar_list = record
  ivar_count  : cint;   // number of variables in the list
  ivar_list   : array[0..ivar_count-1] of objc_ivar;  { variable length structure }
end;
objc_ivar = record
  ivar_name   : PChar; // vm addr of the variable name
  ivar_type   : PChar; // vm addr of obj-c variable encoded type 
  ivar_offset : cint;  // offset from the start of the instance. (the lowest ivar_offset is 40,
                       // because sizeof(objc_class) = 40, and objc_class is also part of the instance)
end;
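Because objc_ivar_list is variable length, a reader of the section has to compute each list's size from ivar_count before it can find the next list. The sketch below assumes a 32-bit layout (4-byte cint and pointers, consistent with sizeof(objc_class) = 40 quoted above) and assumes the lists are stored back to back; neither has been verified against a real binary here. IvarListByteSize is a hypothetical helper.

```pascal
program IvarListSize;
{$mode objfpc}

const
  // Assumption: 32-bit layout, every field is 4 bytes wide.
  FieldSize   = 4;
  IvarSize    = 3 * FieldSize;  // ivar_name, ivar_type, ivar_offset
  ListHdrSize = FieldSize;      // ivar_count

// Size in bytes of one objc_ivar_list holding ivar_count entries.
function IvarListByteSize(ivar_count: LongInt): LongInt;
begin
  Result := ListHdrSize + ivar_count * IvarSize;
end;

begin
  WriteLn(IvarListByteSize(0));  // header only: 4
  WriteLn(IvarListByteSize(3));  // 4 + 3 * 12 = 40
end.
```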

__inst_meth, __cls_meth

These sections are optional and may be omitted if none of the classes declare any methods.

Both sections have an identical format. They contain a number of objc_method_list records. The number of records depends on the classes declared in the modules.

objc_method_list = record
  obsolete     : Pobjc_method_list;  // not used, always zero
  method_count : cint;               // number of objc_method in method_list array 
  method_list  : array[0..method_count-1] of objc_method;	
end;
objc_method = record
  method_name   : SEL;   // virtual address of the selector name 
                         // (the selector name is stored, like other names, in the __TEXT __cstring section)
                         // selector name is short: i.e. "methodName:", and not "+[className methodName:]"
  method_types  : PChar; // obj-c encoded function parameters 
  method_imp    : IMP;   // virtual address of method implementation function entry point (in __TEXT __text section)
end;
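Since only the short selector name is stored, a tool that wants to print the long form has to combine it with the class name and the kind of method list it came from. A minimal sketch, using the usual Objective-C convention of "+" for class methods and "-" for instance methods; FullMethodName and the sample names are hypothetical.

```pascal
program MethodNames;
{$mode objfpc}

// Builds "+[className selName]" for entries from __cls_meth and
// "-[className selName]" for entries from __inst_meth.
function FullMethodName(const className, selName: string;
  isClassMethod: Boolean): string;
begin
  if isClassMethod then
    Result := '+'
  else
    Result := '-';
  Result := Result + '[' + className + ' ' + selName + ']';
end;

begin
  // Made-up class and selector names.
  WriteLn(FullMethodName('MyClass', 'methodName:', True));
  WriteLn(FullMethodName('MyClass', 'methodName:', False));
end.
```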

__protocol

__cat_cls_meth (protocols list)

__cat_inst_meth

__cat_cls_meth

Mach-O additional 30Kb size

FPC-built Mach-O executables are somewhat larger than those for the win32 target. For example:

begin
  writeln('hello world');
end.

gives a 30 KB (stripped) executable for Windows and a 60 KB one for macOS.

Jonas Maebe: It's because there was a bug in older versions of the Darwin linker that required adding ".reference" assembler directives for routines that have more than one assembler name (most of the compiler helpers in the RTL have that). This fixed the problem, but as a result they are never smart linked out. It's only a fixed overhead of that 30kb (most programs don't contain any extra routines with multiple assembler names)

FPC 3.3.1 (trunk r44876) produces a 64-bit binary of 427,920 bytes. Adding -XX to smart link reduces it to 55,656 bytes, and stripping it brings it down to 47,224 bytes (~46 KB), which is pretty reasonable.