OpenMP support
What is OpenMP?
OpenMP is an API, accessed through language directives, for multithreaded programming; see also http://www.openmp.org. Currently, OpenMP syntax is defined only for C/C++ and Fortran. This page collects material to help settle on a Pascal syntax for it.
Pascal syntax for OpenMP
Proposal 1
Foreword
At first, I must admit that there are parts of the OpenMP specification I still don't understand. They did a terribly thorough job of throwing away all the common terms ever used in a multithreading context and inventing their own.
Syntax vs. Compiler directives
OpenMP for C and C++ is implemented using compiler directives, mainly for reasons of source-code compatibility (or: standards compliance). A conforming program is intended to behave the same regardless of whether the compiler compiling it supports those special pragmas or not.
For FreePascal I don't think this is the way to go, because first, it turns comments into code, and second, it makes the program far less readable. For C programs this doesn't seem to be an issue, if you get my meaning. But in my opinion, readability is a far more important concern than compatibility with older or different compilers. If all else fails, a preprocessor could be provided to strip out the parallel-specific stuff, as Marco has suggested. Note that you would need such a preprocessor for compiler directives too, because older FPC versions and Delphi don't skip unknown directives.
Well, enough talk; I'll start with the easier directives, which are luckily the more fundamental ones.
Ok, I got more input than I'd expected and less time than I'd wished. :) Anyway, against my own objections, the idea of enclosing the parallel code in (local) functions looks very appealing, so I've changed the example accordingly.
parallel
The parallel construct can only be applied to a structured block. In Pascal that means it would be enclosed in some sort of begin/end pair anyway, so, as has been suggested, we could use a (in this particular example non-local) function instead. Though I don't know yet whether this will conflict with other parts of the spec as this evolves. Let's try:
(Original example A.4.1.c of the OpenMP V2.5 specification):
procedure SubDomain (var x : array of Float;
                     istart : Integer;
                     ipoints : Integer);
var
   i : Integer;
begin
   for i := 0 to ipoints - 1 do
      x[istart + i] := 123.456;
end {SubDomain};
parallel procedure Sub (var x : array of Float);
// Variables declared here have private context.
// So each instance of the parallel function has its own set, as usual.
var
   iam : Integer;
   nt : Integer;
   ipoints : Integer;
   istart : Integer;
// Any variable access outside of the function's scope accesses the variable in
// a shared context.
// This might prove problematic, especially because it causes special semantics
// on the function's parameters, probably depending on the parameter mode or worse:
// on the calling convention actually used (call-by-value vs. call-by-reference).
begin // of (possibly) parallel section
   iam := OMP.Get_Thread_Num; // OMP library calls.
   nt := OMP.Get_Num_Threads;
   ipoints := Length (x) div nt;      // size of partition
   istart := iam * ipoints;           // starting array index
   if iam = Pred (nt) then
      ipoints := Length (x) - istart; // last thread may do more
   SubDomain (x, istart, ipoints);
end {Sub};
var
   arr : array[0 .. 9999] of Float;

begin // Main program
   Sub (arr);
end.
I don't like the idea of declaring variables inside the actual statements; this looks very unpascalish. Maybe we can find a way around it. --FPK 10:22, 26 July 2006 (CEST)
I agree with Florian that this is not the way to go. Why not require all parallelizable code to be in local functions? After all, that's almost what you are doing: declaring a local function. That would be a simple extension of the current syntax. You have access to all local variables; all you'd need is to add a parallel keyword to the local function declaration.
Ok, so what do you think about the changed example above? OpenMP really is about coarse-grained parallelism, so I indeed see no strong reason why parallel blocks shouldn't be enclosed in procedures. Parallel functions obviously make no sense, as every thread could return its own return value, but the block calling the parallel function can only evaluate one. I would have liked the notion of a local block, though (I'm quite used to it), but I seem to be the only one... --V.hoefler 21:03, 27 July 2006 (CEST)
How would a try/finally-style blocking approach work? (No begin needed.) Like:

Parallel
   SomeCodeHere;
   SomeCodeHereToo;
End; { Parallel Block }

Very simple, just as easy as when we first discovered exception handling. --Raid 19:11, 1 September 2009 (CEST)
parallel for
This is simply a parallel for-loop; there's nothing special to it. Although OMP 2.5 states that a for-loop's iteration variable is private in that construct, I consider this rather redundant: I can hardly imagine correctly behaving code with a shared loop iteration variable. The spec also places some restrictions on the allowed loop statements (no change of the iteration variable inside the loop, simple iteration constructs, ...), but these are already enforced by the language, so there's no need to elaborate on that much further.
(Example A.1.1c):
procedure a1 (n : Integer;
              const a : array of Float;
              var b : array of Float);
var
   i : Integer;
begin
   parallel for i := Succ (Low (a)) to High (a) do
      b[i] := (a[i] + a[i - 1]) / 2.0;
end {a1};
That's it. Now probably someone sees the reason why I wouldn't use the parallel keyword as a function modifier the way inline or cdecl are used, but rather prepend it to the function header itself. I think it's a more consistent usage of a new keyword.
-- V.hoefler 21:17, 27 July 2006 (CEST)
data sharing attributes
To me these seem like quite complex constructs, considering that most of the time you probably won't need them at all, because the default is fine and follows normal programming logic. So if anyone has an idea whether and why we need to support them explicitly, here's the place.
threadprivate
This attribute closely resembles what FreePascal already knows as threadvar, so I think we can even reuse this keyword here. I see a semantic issue, though:
The OMP2.5 specification states:
The values of the data in the threadprivate objects of threads other than the initial thread are guaranteed to persist between two consecutive parallel regions only if all the conditions hold:
and then follows a list of conditions which basically state that the number of threads in both sections must be the same.
So to write some simple pseudo code to demonstrate:
procedure Thread_Vars;
threadvar
   Count : Integer;
var
   i : Integer;
begin
   Count := 0; // initial state
   parallel for i := 1 to SOME_VALUE do
      Count := Count + 1;
   // Point A
   parallel for i := 1 to SOME_OTHER_VALUE do
      Count := Count + 2;
   // Point B
end {Thread_Vars};
Now, each iteration of the loop is executed in parallel, so each thread also gets its own copy of Count. At Point A after the loop, Count would equal 1, because if each loop iteration was executed by a single thread, the incrementing operation would have happened only once (per thread). Leaving aside the question of which copy is seen after the loop, things get more interesting: what value is seen at Point B?

Well, if I understood the specification correctly, the value would be 3 if and only if the actual values of the place-holding constants SOME_VALUE and SOME_OTHER_VALUE are equal. In any other case, the value of Count at Point B would be undefined.
Proposal 2: Using local functions
Instead of introducing new block types (like parallel), this proposal uses a nested procedure with a parallel modifier.
I think we could use a sequential keyword for the sub() procedure. More on talk page. -- MarkMLl 12:46, 9 December 2007 (CET)
procedure SubDomain (var x : array of Float;
                     istart : Integer;
                     ipoints : Integer);
var
   i : Integer;
begin
   for i := 0 to ipoints - 1 do
      x[istart + i] := 123.456;
end {SubDomain};

procedure Sub (var x : array of Float);

   procedure ParallelBlock; parallel;
   var
      iam : Integer;
      nt : Integer;
      ipoints : Integer;
      istart : Integer;
   begin
      iam := OMP.Get_Thread_Num; // OMP library calls.
      nt := OMP.Get_Num_Threads;
      ipoints := Length (x) div nt;      // size of partition
      istart := iam * ipoints;           // starting array index
      if iam = Pred (nt) then
         ipoints := Length (x) - istart; // last thread may do more
      SubDomain (x, istart, ipoints);
   end;

begin
   ParallelBlock;
end {Sub};

var
   arr : array[0 .. 9999] of Float;

begin // Main program
   Sub (arr);
end.
Proposal 3
parallel, future, and async keywords, as implemented in the Oxygene Pascal dialect. Oxygene uses features of the CIL (aka ".NET") framework to implement this. (IMHO this does not make the paradigm behind these keywords bad.) With FPC, this could be implemented in native code in the RTL.
See:
- http://wiki.oxygenelanguage.com/en/Parallel_Loops
- http://wiki.oxygenelanguage.com/en/Futures
- http://wiki.oxygenelanguage.com/en/Asynchronous_Statements
Proposal 4
Benefit from the efforts of Modula-2+ and Modula-2* and maybe use (or build upon) their ideas.