Difference between revisions of "Defensive programming techniques"

Latest revision as of 12:06, 29 September 2021

How to catch and prevent Range Errors

Range errors are easy to introduce and sometimes hard to find. They can exist for years without being noticed. I have seen production units where range checks were deliberately turned off by adding {$R-} in a unit and nobody noticed this for years. When I compiled the code during a review with range checks on {$R+} I found a huge bug that potentially could crash a vital piece of software. Mind you, there can be reasons to turn range checks off but never for a whole unit or a whole program, unless it is fully tested for a release.

I will show you how to find range errors, how to debug them and how to prevent them. Defensive programming is important with ranges.
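
If range checks really must be off for a piece of code, the switch can be confined to just that piece with {$push} and {$pop}. The following is only a sketch: it assumes the old array[0..0] idiom as the reason for switching checks off, and the names dtp_0a, TOpenArray, POpenArray and FillSquares are made up for this illustration.

```pascal
program dtp_0a;
{$mode objfpc}{$R+}   // range checks stay on for the program as a whole

type
  // Hypothetical example type: the old "real length unknown at compile time" idiom.
  TOpenArray = array[0..0] of integer;
  POpenArray = ^TOpenArray;

{$push}{$R-}          // save the current switches, then turn range checks off
procedure FillSquares(p: POpenArray; Count: integer);
var
  i: integer;
begin
  for i := 0 to Count - 1 do
    p^[i] := i * i;   // with {$R+} this would fail as soon as i > 0
end;
{$pop}                // restore the saved switches: range checks are on again

var
  Buffer: array[0..3] of integer;
  i: integer;
begin
  FillSquares(POpenArray(@Buffer), Length(Buffer));
  for i := Low(Buffer) to High(Buffer) do   // checked again from here on
    write(Buffer[i]:3);
  writeln;
end.
```

Only FillSquares is compiled without range checks; everything before {$push} and after {$pop} keeps the {$R+} from the top of the file.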

The bug

Let's introduce you to a small piece of code with a range bug.

program dtp_1a;
{$mode objfpc}
var
  anArray:array[0..9] of integer; // ten elements
  i:integer;
begin
  for i := 1 to 10 do
  begin
    anArray[i] := i;
    write(anArray[i]:3);
  end;
end.

This code compiles without error, and on some systems it even runs without error:

$ fpc -glh dtp_1a.pas

Note: -glh combines -gl (line info in backtraces) and -gh (the heaptrc unit), so that an error is reported with its source line. Running the program yields:

dtp
  1  2  3  4  5  6  7  8  9 10

That may seem right, but it is wrong! It could just as well SEGFAULT, or worse, as you will know if you have spotted the bug.

Turn range checks on

Now let's see what happens when we compile with range checks:

 1  program dtp_1b;
 2  {$mode objfpc}
 3  {$R+}
 4  var
 5    anArray:array[0..9] of integer; // ten elements
 6    i:integer;
 7  begin
 8    for i := 1 to 10 do
 9    begin
10      anArray[i] := i;
11      write(anArray[i]:3);
12    end;
13  end.

$ fpc -glh dtp_1b.pas

You may not expect this code to compile if you discovered the error, but unfortunately it compiles without error or warning. The fun starts when you run it:

dtp
  1  2  3  4  5  6  7  8  9Runtime error 201 at $000101B8
  $000101B8  main,  line 10 of dtp.pas
  $00010124

No heap dump by heaptrc unit
Exitcode = 201

OK, we found a bug at line 10 of our program, and runtime error 201 means range check error. Useful, but not very, since we had to run the program to make it crash. Hardly acceptable. Furthermore, not every programmer will see what the bug is, since it occurs in a loop. Which is wrong: i, anArray[i], or both? Exactly when it goes wrong is not obvious to everyone either.

Both the FP textmode IDE and Lazarus are able to debug our program, so we set a breakpoint on line 10 and press F9 a couple of times. Note I also set a watch on i.

[Screenshot: dtp_1b.png]

So I pressed F9 ten times and hey presto, the error occurs when i becomes 10 and we try to access anArray[10]. But that means the actual error is on line 8, in the bounds of the for loop. We are over-indexing because the array runs from 0 to 9, not from 1 to 10.

Bug found and cause of bug found. But not fixed, remember we found it at runtime, not at compile time.

Note: To summarize, turning range checks on finds range errors at run time, but not always at compile time.
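
If the SysUtils unit is used, the same run-time error 201 arrives as an ERangeError exception instead of terminating the program outright, so it can at least be handled. A minimal sketch, not part of the original example series (the program name dtp_1b_except is made up):

```pascal
program dtp_1b_except;
{$mode objfpc}{$R+}
uses
  SysUtils;  // turns run-time errors into exceptions such as ERangeError
var
  anArray: array[0..9] of integer; // ten elements
  i: integer;
begin
  try
    for i := 1 to 10 do            // same bug as before: 10 is out of range
    begin
      anArray[i] := i;
      write(anArray[i]:3);
    end;
  except
    on E: ERangeError do
    begin
      writeln;
      writeln('Range error caught: ', E.Message);
    end;
  end;
end.
```

This still finds the bug only at run time. It makes the failure manageable, it does not remove it.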

Declare ranges and use low() and high()

Object Pascal has a nice feature that is a bit underused, but is very useful in our case, ranges. Basically, by declaring a range we can find range errors at compile time and that is exactly what we want.

program dtp_1c;
{$mode objfpc}{$R+}
var
  anArray: array[0..9] of integer; // ten elements
  i: 0..9; // range of 10 elements, same as array
begin
  for i := 1 to 10 do
  begin
    anArray[i] := i;
    write(anArray[i]:3);
  end;
end.

By declaring a range instead of an integer we will probably also see the discrepancy with the bounds of the for loop immediately, but that is not always the case, so let's try to compile the code:

[Screenshot: dtp_1c.png]

It does not work, as you can see. The code will not compile because we protected our index variable by applying a range to it. And that is exactly what we want: code that contains bugs should not compile.

Such code is a bit difficult to maintain, since we have to keep the array and the range in sync, but that is easy to fix with code like the following. Note that I also fixed the bug here, because the compiler found it and gave us a proper message that the range was wrong.

program dtp_1d;
{$mode objfpc}{$R+}
var
  anArray:array[0..9] of integer; // ten elements
  // if we change the array size this is automatically also correct.
  i: low(anArray)..high(anArray);
begin
  for i := 0 to 9 do   // can't write 10 here...
  begin
    anArray[i] := i;
    write(anArray[i]:3);
  end;
end.

For completeness you can also use it like this. If any size needs to change, simply change the type:

program dtp_1e;
{$mode objfpc}{$R+}
type
  TmyRange = 0..9;
var
  i:TMyRange;
  anArray:array[TmyRange] of integer; // ten elements
begin
  for i := Low(TMyRange) to High(TMyRange) do
  begin
    anArray[i] := i;
    write(anArray[i]:3);
  end;
end.

Note: To summarize: declaring a specific range can help you find range errors at compile time. Using low() and high() can prevent you from making range errors.
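
The same habit pays off in routines that must accept arrays of different sizes. The sketch below assumes an open array parameter; the names dtp_1g, FillWithIndex, Small and Shifted are invented for the illustration. Because one index variable has to serve two differently sized arrays here, it is a plain integer, and low() and high() do the protecting.

```pascal
program dtp_1g;
{$mode objfpc}{$R+}

// Works for an integer array of any size: the bounds are asked for, never typed in.
procedure FillWithIndex(var Values: array of integer);
var
  i: integer;
begin
  for i := Low(Values) to High(Values) do
    Values[i] := i;
end;

var
  Small: array[0..4] of integer;
  Shifted: array[10..12] of integer;
  i: integer;
begin
  FillWithIndex(Small);
  FillWithIndex(Shifted);   // inside the procedure the indexes run from 0 to 2

  for i := Low(Small) to High(Small) do
    write(Small[i]:3);
  writeln;
  for i := Low(Shifted) to High(Shifted) do   // out here they run from 10 to 12
    write(Shifted[i]:3);
  writeln;
end.
```

Neither loop mentions a literal bound, so changing the declaration of Small or Shifted cannot leave a stale 9 or 12 behind.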

Use for … in … do

Now, forget all the above. When it is possible, you should use for … in … do. The Pascal language has had low() and high() for many years, and as shown above they can prevent you from introducing range errors. Modern Pascal has a similar construct with a new syntax: for … in … do. This syntax simply iterates over all values in a collection of data like an array, but without an explicit index.

We can get rid of our bug by preventing it in the first place by removing the index altogether.

program dtp_1d;
{$mode objfpc}{$R+}
var
  anArray:array[0..9] of integer; // ten elements
  i:0..9;  // could use j, but this is for clarity.
  Item:integer; // Item is an integer here: it is not an index, but a value from the array
begin
  // data to show what for in do does
  for i := Low(anArray) to High(anArray) do anArray[i] := 100+i;
  for Item in anArray do  // for every integer value that is contained in the array
    write(Item:4); // writes the value of an array cell, this is not an index.
end.

Note: To summarize: with for … in … do you can safely iterate over a collection of data without using an explicit index and without the risk of range errors.
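
The construct is just as useful inside a routine that only needs to read the values. A small sketch, assuming an open array parameter (the names dtp_1h and Sum are invented for this illustration):

```pascal
program dtp_1h;
{$mode objfpc}{$R+}

// Adds up all values; there is no index, so there is no index to get wrong.
function Sum(const Values: array of integer): integer;
var
  Item: integer;
begin
  Result := 0;
  for Item in Values do
    Result := Result + Item;
end;

begin
  writeln(Sum([1, 2, 3, 4]));   // prints 10
end.
```

Item receives a copy of each element, so for … in … do suits reading. To change the elements themselves, an indexed loop with low() and high() is still the tool to use.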

Bonus: Using a range? You may want a set, too!

If you have declared a range, why not declare a set as well? This will give you a safe way of performing filters on a data collection like an array.

A simple example looks like this:

program dtp_1f;
{$mode objfpc}{$R+}
type
  TmyRange = 0..9;
var
  i:TMyRange;
  j:set of TMyRange;
  anArray:array[TmyRange] of integer; // ten elements
begin
  j:=[1,3,5,7,9];// odd elements
  for i in j do
  begin
    anArray[i] := i;
    write(anArray[i]:3);
  end;
end.

Ranges are powerful, sets are even more so! Together they make your code safe and readable.
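
Sets also support set arithmetic and membership tests, which keeps filters short. The following sketch builds on the TMyRange type from the example above; the names dtp_1i, TMyRangeSet, OddElements and EvenElements are assumptions made for the illustration.

```pascal
program dtp_1i;
{$mode objfpc}{$R+}
type
  TMyRange = 0..9;
  TMyRangeSet = set of TMyRange;
const
  OddElements: TMyRangeSet = [1, 3, 5, 7, 9];
var
  EvenElements: TMyRangeSet;
  i: TMyRange;
begin
  // Set difference: everything in the range that is not odd.
  EvenElements := [Low(TMyRange)..High(TMyRange)] - OddElements;

  for i in EvenElements do      // visits 0, 2, 4, 6, 8
    write(i:3);
  writeln;

  // Membership test: filter while walking the whole range.
  for i := Low(TMyRange) to High(TMyRange) do
    if i in OddElements then    // true for 1, 3, 5, 7, 9
      write(i:3);
  writeln;
end.
```

Both loops can only ever produce values inside TMyRange, so any array declared over TMyRange can be indexed with i safely.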

Conclusion

Range errors are common in every language and often hard to find, but if you are reading this you are probably using Pascal. With the right mindset, a Pascal programmer can write code in such a way that range errors hardly ever occur, because Pascal is strongly typed and has many features that help you prevent them:

  • use {$rangechecks on} or {$R+} during development and run your code. Turn it off only when you are sure there are no range errors, but keep protecting your code with ranges.
  • use ranges instead of integers for your indexes and think about ranges when writing your code! It will prevent you from introducing range errors and you will catch them at compile time.
  • use low() and high(), not 1 to 10 or 0 to 9, when you iterate over a data collection. Make it a habit.
  • use for … in … do if applicable; try to make that your first option!
  • use a set of a range to filter safely.

There is more to this subject, but if you follow these simple rules you will avoid bugs, and trust me: there is no speed penalty. A bit of “brains instead of fingers” will prevent this nasty category of bugs and keep you from spending more time debugging than coding!

How to prevent Overflow Errors, catch them and even misuse Overflow

The bug

Let's introduce you to a small piece of code with an overflow bug.

program dtp_2a;
{$mode objfpc}
var
  a:NativeInt = high(NativeInt);
begin
  a:= a + 1;
  writeln(a);
end.

Can you spot the bug? Concentrate, look again… Can you see it?

Now compile it with fpc dtp_2a.pas. Then run it:

$ ./dtp_2a
-2147483648  //depending on nativeint: this is 32 bit

It does not crash, it simply prints -2147483648. But is that correct? Of course not!

Now with overflow checks on:

program dtp_2a;
{$mode objfpc}{$overflowchecks on}
var
  a:NativeInt = high(NativeInt);
begin
  a:= a + 1;
  writeln(a);
end.

This code will compile, but when you run it, it stops with runtime error 215 (arithmetic overflow). See the Programmer's Guide on overflow checks.
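
As with range errors, using the SysUtils unit turns this run-time error into an exception, here EIntOverflow, which can be caught. A minimal sketch under that assumption (the program name dtp_2b is made up):

```pascal
program dtp_2b;
{$mode objfpc}{$overflowchecks on}
uses
  SysUtils;  // turns run-time error 215 into the EIntOverflow exception
var
  a: NativeInt = High(NativeInt);
begin
  try
    a := a + 1;      // overflows: a already holds the largest possible value
    writeln(a);      // not reached while overflow checks are on
  except
    on E: EIntOverflow do
      writeln('Overflow caught: ', E.Message);
  end;
end.
```

Catching the exception is a safety net, not a fix. The defensive fix is to make sure a cannot reach High(NativeInt) before adding to it.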

How to prevent Input and Output Errors (and how to catch them…)

How to use meaningful Assertions

To serve and protect: the story of try..finally

Do you know your String Type? Really?

[This should be written by Juha… not me…]

string is a devil with many faces: it can be ShortString, AnsiString or UnicodeString.
I make a habit of declaring the exact species of string I am using, especially in library code, but what if the code just says string? Well, here's a little utility function to obtain the string type you are actually working with:

//{$mode delphi}  // tkAString AnsiString
//{$mode delphi}{$H-} // tkSString ShortString
//{$mode delphiunicode} // tkUString Unicode string
//{$mode delphiunicode}{$H-} // tkSString ShortString
//{$mode objfpc} // tkSString ShortString 
//{$mode objfpc}{$H+} //tkAString AnsiString
//{$mode fpc}{$modeswitch result} // tkSString ShortString
//{$mode fpc}{$H+}{$modeswitch result} // tkAString AnsiString
// etc.

uses typinfo;
 
  function StringType(const s:string):TTypeKind;inline; 
  var info:PTypeInfo;
  begin
    info:=TypeInfo(s);
    Result := Info^.Kind;
  end;
 
var s: string = 'testme';
begin
  writeln('My string type is ',StringType(s));
end.

string depends on the mode and on the {$H} switch, and this little gem will tell you what kind of string you are dealing with.
That is not always obvious. Try experimenting with some of the mode settings and see what happens.
The result may not always be what you expected, so use this function as a debug utility. You can be sure it returns what string means in any given unit.
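
The habit of declaring the exact string type can be checked the same way. In this sketch the variable and program names are made up, and the kinds in the comments are what the mode table above leads one to expect; with explicit types they should not change with the mode or the {$H} switch.

```pascal
program dtp_3a;
{$mode objfpc}
uses
  typinfo;
var
  ShortText: ShortString = 'testme';
  AnsiText: AnsiString = 'testme';
  UnicodeText: UnicodeString = 'testme';
begin
  // Explicit types mean the same thing whatever the mode or {$H} setting is.
  writeln(PTypeInfo(TypeInfo(ShortText))^.Kind);    // expected: tkSString
  writeln(PTypeInfo(TypeInfo(AnsiText))^.Kind);     // expected: tkAString
  writeln(PTypeInfo(TypeInfo(UnicodeText))^.Kind);  // expected: tkUString
end.
```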