Memory Management

From Free Pascal wiki
Revision as of 11:32, 10 August 2023 by Warfley

Pascal, by default, provides three kinds of memory management mechanisms for different datatypes and in different contexts: Local Lifetime, Manual Memory Management and Reference Counting.

The following sections explain these mechanisms in detail and outline some best practices for their use.

Local Lifetime

Local lifetime is the first form of memory management most programmers come into contact with. It is the default way in which variables and function parameters are handled. Here the compiler fully takes over the management of the variable's lifetime. Take for example this simple function:

procedure PrintTenth(i: Integer);
var
  x: Double;
begin
  x := i / 10;
  WriteLn(x);
end;

The compiler will create the memory for both the parameter i and the variable x when the function is called. Once the function ends, the memory will be freed. For functions this is usually done through the local program stack. When the function is called, the compiler-generated code reserves memory on the stack for all local variables and parameters. This block of memory is called a stack frame. Once the function returns, the whole stack frame is popped from the stack, thereby freeing the allocated memory.

Global variables are created on program startup and will be freed when the program is closed.

var
  GlobalX: Double;

procedure PrintTenth(i: Integer);
begin
  GlobalX := i / 10;
  WriteLn(GlobalX);
end;

Here GlobalX, as a global variable, is not bound by the lifetime of the function PrintTenth and will be available as long as the program is running.

Lastly, there are thread-local variables. These are global variables whose lifetime is bound to a thread. The memory is newly allocated for every thread started within the program, and is freed when its corresponding thread finishes or is killed.

threadvar
  ThreadedX: Double;

procedure PrintTenth(i: Integer);
begin
  ThreadedX := i / 10;
  WriteLn(ThreadedX);
end;

So if PrintTenth is called from two different threads, each call uses different memory for its thread-local ThreadedX, whereas two calls from the same thread use the same memory.
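
The per-thread behaviour described above can be illustrated with a minimal sketch. The class TWorker and the values used are hypothetical and chosen only for this illustration; the program assumes the standard Classes unit (and cthreads on Unix-like systems) is available.

```pascal
program ThreadVarDemo;

{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}  // thread support on Unix-like systems
  Classes;

threadvar
  ThreadedX: Double;

type
  // Hypothetical worker thread, used only for this illustration.
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TWorker.Execute;
begin
  // Writes to this thread's own copy of ThreadedX.
  ThreadedX := 99.0;
end;

var
  Worker: TWorker;
begin
  ThreadedX := 1.5;                 // the main thread's copy
  Worker := TWorker.Create(False);  // start the worker immediately
  Worker.WaitFor;                   // wait until the worker has finished
  Worker.Free;
  // The worker's assignment did not touch the main thread's copy.
  WriteLn(ThreadedX:0:1);           // prints 1.5
end.
```

If ThreadedX were declared with var instead of threadvar, both threads would share one variable and the program would print 99.0.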

Dangling Pointers

Local lifetime is a very easy and also very efficient way of memory management. Such memory can usually be used without much care, as the compiler takes care of all of it.

That said, there is a big limitation: data can only live as long as its context. This is not an issue for global variables because, by definition, they live as long as any code that could access them. With local variables, however, references to a variable may still exist after the function containing it has ended. Such a reference is called a dangling pointer.

Take the following program:

function Dangling: PInteger;
var
  x: Integer;
begin
  x := 42;
  Result := @x;
end;

var
  p: PInteger;
begin
  p := Dangling;
  WriteLn(p^);
end.

The function Dangling returns a pointer to a local variable. The problem is that as soon as the function ends, x no longer exists, so the pointer returned (and stored in p) points to already freed memory. These kinds of bugs can be hard to find because, due to the nature of the stack, the memory is still around and may not have been reused yet. In the example above, WriteLn(p^) will most likely still print 42, as nothing has overwritten the freed memory yet. But if another function is called in between:

function Dangling: PInteger;
var
  x: Integer;
begin
  x := 42;
  Result := @x;
end;

procedure Smack;
var
  x: Integer;
begin
  x := -1;
  WriteLn('X: ', x);
end;

var
  p: PInteger;
begin
  p := Dangling;
  Smack;
  WriteLn(p^);
end.

Now the function Smack has its own stack frame, which will typically overlap with the memory previously used by x in Dangling and still pointed to by p. The final WriteLn(p^) may therefore print -1, or some other value, instead of 42.

This makes such bugs quite hard to find: the code might initially work as expected, but later, once another function has been called, the results are suddenly completely different.

Best Practices

Generally, using variables with local lifetime is very easy and mostly safe. Due to the implementation through the stack it is also very efficient. It is therefore recommended to use local variables whenever possible, in preference to the other methods outlined in this article.

That said, the programmer should take a few precautions to avoid dangling pointers:

  • Never return the pointer to a local variable:
This is the most straightforward guideline to follow. Never assign the address of a local variable to the Result of a function.
Doing so is always an error.
  • Avoid taking pointers from local variables:
While it is quite easy to check whether a function returns a pointer to a local variable through its Result, there are other ways in which a pointer to a local variable can be used after the function that took that pointer has ended.
For example, the pointer may be written into a global variable or into an object. A common mistake is to put the pointer into the tag of an LCL control, e.g. in a ListView:
procedure TForm1.CreateItem(Data: TMyTreeData);
begin
  With ListView1.Items.Add do
  begin
    Caption := Data.Name;
    Tag := IntPtr(@Data);
  end;
end;
Because the Tag can hold a pointer for storing additional data, pointing it at a local variable is a common mistake. When the function ends, the local variable (or parameter, in this case) is destroyed, but the newly created list item lives on. If the tag is accessed later, the code dereferences a pointer to memory that is long gone, which results in a bug.
The best way to avoid this is to simply never take pointers to local variables if you can avoid it. Passing a variable as a var parameter to another function allows it to be referenced safely, as the var parameter only lives as long as the called function.
  • Know your lifetimes:
If you must take the pointer of a local variable, e.g. when you create an object that requires a pointer to a buffer, you must make sure that nothing that uses the pointer can outlive the function.
In the case of objects, this means that the object is freed within the same function that holds the local variable. In the case of global variables, the global variable must be reset before the function ends, or it must be guaranteed that the global variable is not touched afterwards.
This requires extensive knowledge of the code base and can be really hard to track. It should therefore be avoided whenever possible.
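
As a minimal sketch of the first two guidelines, the Dangling example can be rewritten so that no pointer to a local variable escapes. The names Answer and FillAnswer are hypothetical and introduced only for this illustration.

```pascal
program SafeAlternatives;

{$mode objfpc}

// Returns the value itself instead of a pointer to a local variable.
function Answer: Integer;
var
  x: Integer;
begin
  x := 42;
  Result := x;  // the value is copied out before x is freed
end;

// The caller owns the storage; the reference only lives for this call.
procedure FillAnswer(var Value: Integer);
begin
  Value := 42;
end;

var
  a, b: Integer;
begin
  a := Answer;
  b := 0;
  FillAnswer(b);
  WriteLn(a, ' ', b);  // prints 42 42
end.
```

In both variants the storage for the result belongs to the caller, so it cannot be freed before the caller is done with it.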

Notes

Some additional notes about local lifetime variables:

Writeable Constants

One big issue with global variables is the (lack of) scoping. A global variable is, as the name suggests, globally accessible. To avoid bugs through misuse, you may want to restrict access to a variable to a certain context, the so-called scope, without restricting its lifetime to that scope. This can be achieved with writable consts. These are locally declared variables with a global lifetime: they are allocated and initialized once, not on every call of the enclosing function, and they are not freed until the program ends.

function NextID: Integer;
const
  CurrentID: Integer = 0;
begin
  Result := CurrentID;
  CurrentID += 1;
end;

This function declares CurrentID as writable const, meaning it has a global lifetime, even though it is locally defined in NextID. Therefore every call to the function NextID uses the same memory for CurrentID, so NextID can "remember" the CurrentID from the last call. This allows this function to return a new ID, always being one more than the last time, in each consecutive call.
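
A short usage sketch shows the effect. The {$J+} directive (writable typed constants) is added here as an explicit assumption so that the example does not depend on the compiler's current settings, and Inc is used in place of += so that it does not depend on C-style operators being enabled.

```pascal
program NextIDDemo;

{$mode objfpc}
{$J+}  // assumption: explicitly allow assignments to typed constants

function NextID: Integer;
const
  CurrentID: Integer = 0;  // initialized once, keeps its value between calls
begin
  Result := CurrentID;
  Inc(CurrentID);
end;

begin
  WriteLn(NextID);  // prints 0
  WriteLn(NextID);  // prints 1
  WriteLn(NextID);  // prints 2
end.
```

Code outside of NextID cannot read or modify CurrentID, which is the scoping benefit over a plain global variable.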

Management Operators

For most types, local lifetime only manages the memory. Often, however, a programmer wants to associate code with the construction or destruction of that memory. For example, when a TFileStream is created, not only is the memory allocated, but a file is also opened. Similarly, when a TFileStream is freed, not only is the memory freed, but the file is also closed.

Previously this was only possible for classes, which have constructors and destructors tied to their memory lifetime. But classes can only be used with manual memory management or reference counting (through COM interfaces, see below), so this was not available for variables with local lifetime. This changed in FPC 3.2.0 with the introduction of management operators. They make it possible to write code that the compiler calls automatically when the memory is allocated (through the Initialize operator) and when it is freed (through the Finalize operator). See the wiki article for further information.
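
The following is a minimal sketch of how the Initialize and Finalize operators might be used, assuming FPC 3.2.0 or later with the advancedrecords mode switch. The record type TTraced and the procedure UseTraced are hypothetical names made up for this illustration.

```pascal
program ManagedRecordDemo;

{$mode objfpc}
{$modeswitch advancedrecords}

type
  // Hypothetical record type, for illustration only.
  TTraced = record
    Value: Integer;
    class operator Initialize(var r: TTraced);
    class operator Finalize(var r: TTraced);
  end;

// Called by the compiler when a TTraced variable comes into existence.
class operator TTraced.Initialize(var r: TTraced);
begin
  r.Value := 0;
  WriteLn('Initialize');
end;

// Called by the compiler when a TTraced variable goes away.
class operator TTraced.Finalize(var r: TTraced);
begin
  WriteLn('Finalize');
end;

procedure UseTraced;
var
  t: TTraced;  // local lifetime: no explicit create or free needed
begin
  t.Value := 7;
  WriteLn('Inside UseTraced: ', t.Value);
end;

begin
  UseTraced;
end.
```

Running this sketch should print Initialize, then the line from inside UseTraced, then Finalize, showing that both operators are tied to the local lifetime of t.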

Manual Memory Management

TODO

Reference Counting

TODO