Difference between revisions of "Generics proposals"
m (typo) |
(Avoid "of" in between generic type and parameters) |
||
Line 267: | Line 267: | ||
end; | end; | ||
− | TCollection = TGenericCollection | + | TCollection = specialize TGenericCollection(TCollectionItem); |
− | TFieldDefs = TGenericCollection | + | TFieldDefs = specialize TGenericCollection(TFieldDef); |
TGenericList = generic(T: PtrInt) class(TObject) | TGenericList = generic(T: PtrInt) class(TObject) | ||
Line 274: | Line 274: | ||
end; | end; | ||
− | TList = TGenericList | + | TList = specialize TGenericList(Pointer); |
For procedures: | For procedures: | ||
Line 302: | Line 302: | ||
TList = generic(T) class | TList = generic(T) class | ||
private | private | ||
− | fHead: PListItem | + | fHead: specialize PListItem(T); |
− | fTail: PListItem | + | fTail: specialize PListItem(T); |
published | published | ||
procedure Add(Item: T); | procedure Add(Item: T); | ||
Line 310: | Line 310: | ||
procedure generic(T) TList.Add(Item: T); | procedure generic(T) TList.Add(Item: T); | ||
var | var | ||
− | node: PListItem | + | node: specialize PListItem(T); |
begin | begin | ||
New(node); | New(node); | ||
Line 331: | Line 331: | ||
end; | end; | ||
− | TAppleList = TList | + | TAppleList = specialize TList(TApple); |
var | var |
Revision as of 15:57, 9 November 2005
The first thing someone who knows thie stuff needs to do is explain what a generic is so we understand what the issue is.
Why
- Typesafety !
- Speed
- Readability
See "Already existing stuff" for possible but lacking ways right now.
Interested Parties
- Almindor - dannym - fpk - giantm - oliebol - plugwash - Synopsis - tom - PublicJoke - neli
Usage example (uncategorized)
TODO: combine the below stuff into one example using the <> syntax
(tom proposal)
type // explicit addition of "generic" keyword. Maybe something like "uses generic <...>" // better use the <>-brackets because otherwise it may cause confusion with arrays // in the actual instantiation abc = class(..) generic <a, b> end; var // because they're still unused ;) xyz : abc<String, Integer>; xyz2 : abc<AObj1, AObj2>; begin // don't repeat generics in instance creation (if not necessary) xyz := abc.Create; xyz2 := abc<descendant_of_Aobj1, descendant_of_Aobj2>.Create; end.
(dannym proposal new)
type GList = generic class of T public procedure Add(const stuff: T); end;
procedure GList.Add(const stuff: T); // note that here no 'of T' begin ... end;
var intlist: GList of Integer; begin intlist := GList.Create; // note that here the detail type is missing and // needs to be deducted from the var declaration //or intlist := typeof(intlist).Create; //intlist := GList of Integer.Create; <-- this is unclear if that works end;
(dannym proposal - old)
type GList<T> = class public procedure Add(const stuff: T); end;
procedure GList<T>.Add(const stuff: T); begin ... end;
var intlist: GList<Integer>; begin intlist := GList<Integer>.Create; end;
plugwash:
tmycollection=template(tobject)[x] tintegercollection=tmycollection[integer]
oliebol:
template blaat<t:integer>(a:t); var x : t; begin do something with a, and use t with it end;
note that we should have a generics section (like interface and implementation sections) containing the generics:
unit xyz; generics type GList...... procedure GList.Add(const value: T); .... end.
(eiffel)
http://paste.lisp.org/display/6236
(PublicJoke proposal)
unit Unit1(TMathType); interface type TMathClass = class public class function Add(a1, a2: TMathType): TMathType; end; implementation class function TMathClass.Add(a1, a2: TMathType): TMathType; begin Result:=a1+a2; end; end.
unit Unit2; type TMathType = Longword; interface implementation end.
program prog; uses Unit2, Unit1(Byte) as ByteMath in 'Unit1.pas', Unit1(TMathType) as PolyMath in 'Unit1.pas'; type TByteMathClass = ByteMath.TMathClass; TPolyMathClass = PolyMath.TMathClass; var b: Byte; p: TMathType; begin b:=TByteMathClass.Add(1, 2); p:=TPolyMathClass.Add(3, 4); end.
Usage example
If you add more syntax examples, please add equivalent to both. Two main syntaxes seem to be popular, "<>" which looks more or less like C++, c#, etc.. and adding a "generic" keyword which is more along the Pascal style of code. Go to Generics Vote to vote for one or the other syntax.
Using brackets: <>
type TGenericCollection<T: TCollectionItem> = class(TComponent) ...implement TCollection and use T end;
TCollection = TGenericCollection<TCollectionItem>; TFieldDefs = TGenericCollection<TFieldDef>;
TGenericList<T: PtrInt> = class(TObject) ...implement TList and use PtrInt size for code generation end;
TList = TGenericList<Pointer>;
Implementation of TGenericCollection can be compiled as if (T: TCollectionItem) were used.
Would this solve the circular dependency ? It seems so to me, because one knows at least as much as in current implementation of these classes, but I'm no compiler engineer.
For procedures:
function MyFunc<T: TTreeItem>(A: T): T; begin // do something with the tree item end;
function Max<T>(A, B: T): T; begin if A < B then Result := B else Result := A; end;
My restrictions won't allow implementing generic Max and Min, I guess. That really needs macro-alike stuff (for the compiler implementation).
Another example for linked lists: (ripped from mail to fpc-devel of dannym, but modified a bit)
type TListItem<T> = record Data: T; Next: TListItem; end; PListItem = ^TListItem; TList<T> = class private fHead: PListItem<T>; fTail: PListItem<T>; published procedure Add(Item: T); end; procedure TList<T>.Add(Item: T); var node: PListItem<T>; begin New(node); node^.Data := Item; node^.Next := nil; if Assigned(fTail) then begin fTail^.Next := node; end else begin fHead := node; end; fTail := node; end; type TApple = class end; TOrange = class end; TAppleList = TList<TApple>; var apples: TAppleList; apple: TApple; begin apples.Add(TApple.Create); // works apples.Add(TOrange.Create); // compile error apple := apples[0]; // works apple := apples[1]; // not applicable apple := apples[0] as TApple; // works, but unneccessary apple := apples[1] as TApple; // not applicable end;
Using the "generic" keyword
type TGenericCollection = generic(T: TCollectionItem) class(TComponent) ...implement TCollection and use T end;
TCollection = specialize TGenericCollection(TCollectionItem); TFieldDefs = specialize TGenericCollection(TFieldDef);
TGenericList = generic(T: PtrInt) class(TObject) ...implement TList and use PtrInt size for code generation end;
TList = specialize TGenericList(Pointer);
For procedures:
function generic(T: TTreeItem) MyFunc(A: T): T; begin // do something with the tree item end;
function generic(T) Max(A, B: T): T; begin if A < B then Result := B else Result := A; end;
Another example for linked lists: (ripped from mail to fpc-devel of dannym, but modified a bit)
type TListItem = generic(T) record Data: T; Next: TListItem; end; PListItem = ^TListItem; TList = generic(T) class private fHead: specialize PListItem(T); fTail: specialize PListItem(T); published procedure Add(Item: T); end; procedure generic(T) TList.Add(Item: T); var node: specialize PListItem(T); begin New(node); node^.Data := Item; node^.Next := nil; if Assigned(fTail) then begin fTail^.Next := node; end else begin fHead := node; end; fTail := node; end; type TApple = class end; TOrange = class end; TAppleList = specialize TList(TApple); var apples: TAppleList; apple: TApple; begin apples.Add(TApple.Create); // works apples.Add(TOrange.Create); // compile error apple := apples[0]; // works apple := apples[1]; // not applicable apple := apples[0] as TApple; // works, but unneccessary apple := apples[1] as TApple; // not applicable end;
Terms
type user:
- the function in the compiler that defines a variable or return type and thus 'uses a (already defined) type'.
immediate type:
- not generic and not specialized, i.e. 'normal type'
generic type:
- template for a class with some unspecified types, never to be filled in into this class type. Only by specialisation.
specialization:
- copy generic type class to new specialized type class
type parameters:
- T and Q, the unknown types for the generic TSomeGeneric<T,Q>
specialized type:
- a generic type with all type parameters know, written into a real class type (tree) and reused if possible
Changes
Changes in class/record definition representation
Each class definition representation has added fields for:
- class instantiation mode 0 immediate type 1 generic type, no instantiation, just generate specialized type 2 specialized type
- when mode 2: generic type uplink (XXXX, what, unitname, typename?) when mode 1: list of specialized types known so far (and where), XXXX list items when mode 0: nothing
Changes in 'type user' compilation (for class/record types only)
Each class type user will have to check mode of the class type.
For the used class type, If mode is immediate
- proceed as always till now.
If mode is specialized
- proceed as always till now - keep in mind the generic type for some checks later (and debugging).
If mode is generic, this:
- check 'list of specialized types' for the type parameter to use. If there is already a specialized type, use that. (given that it is compatible with 'compile parameters' XXXX)
- If not available, clone the generic type in the tree, and fill in the type params with the actual types to use. Generate machine code as normal. Remember the new specialized type for later in the list by the generic type and type params used.
(the generic type is best written to some ppu as parse tree so that it can be refetched for cloning somewhen. To keep it easy, initially it should be limited to having its .pas file around when wanting to use a generic)
type params
Generics store a list of the type params (names).
('T', 'Q')
And Specializations store a list of type mappings from the real types to the type params, in the way of
type T = Integer; Q = Boolean;
of course these are local to this class/record specialization.
Other things to note (for *later*)
What if a method implementation depends on the type used to know how the implementation should look like ?
procedure GList<T is TObject>.Add(const value: T); procedure GList<T is Integer>.Add(const value: T); ?
or more cryptic like
procedure GList<TObject>.Add(const value: T); procedure GList<Integer>.Add(const value: T); ?
or like
procedure GList<T>.Add(const value: Integer); procedure GList<T>.Add(const value: TObject);
the latter would need extra compiler support but would be nicer, overall. Potentially clashing with normal overloads though.
How to do internal storage classes
type PNode<T> = ^TNode<T>; TNode<T> = record prev,next: PNode<T>; data: T; end;
type GList<T> = class private head, tail: PNode<T>; cnt: Integer; public procedure Add(const stuff: T); end; ?
This implies that generics are best implemented both for records and for classes alike! Also implies that pointer types need to support generic too.
Also one specialization should chain other specializations (probably also when deriving from a generic).
How to derive from the generic type
It would be good if generic types could be derived from.
type TGenericEventyList<T> = class(TGenericList<T>) public procedure Add(value: T); override; end; ...
Interfaces should support generics too
type IBa<T> = interface procedure Add(value: T); end; TBa<T> = class(...,IBla<T>,IInterface) end;
(what to do about the guids of the specializations? probably just generate whatever, they dont have any real meaning after all)
Of course this has limitations to check, like
- the generic interface cannot be used, just its specializations
- the specialized interface cannot be used in a generic class interface list
- (the generic interface can be used in a generic class interface list)
- the generic interface itself is invalid to use anywhere (just as generic classes are) except for specializing and deriving from
- normal interfaces cannot derive from generic interfaces
- (generic interfaces can derive from generic interfaces) - given the type params are the same
- specialized interfaces cannot be derived from
Limitation of possible specialisations
type TGenericList<T: TObject or Integer> = ... end;
Probably useful...
What to do with class functions
I dont know a whole lot of how class functions are stored internally. What happens to class functions in generics?
'class of'
<Almindor> don't forget class of, we need class of compatibility for dynamic class factories
<Almindor> There are possible problems with 'class of'
Implementation details
Sanity checks
- Generic types cannot be instantiated from
- (Generic types can be derived from)
- Specialized types cannot be derived from
- Normal types cannot use generic types in its definition
- (Normal types can use specialized types in its definition)
- There are no half specialized types (one type param specified, the other not)
- prevent loops while 'unpacking' specializations (loop counter field in every specialization)
Things to note and not forget
- A class or record generic can use itself as type within its definition
Example:
type TBla<T> = class parent: TBla<T>; end;
- A class or record generic can use another generic as type within its definition
type TBla<T> = class parent: TBlo<T>; end;
Implementation steps
Extend parser to support '<T>' and '<T: bla or bla or bla>' and '<T,Q>' notation
Easy to do. Done.
Different ways to do it, one excluding the other
Way 1
Change class/record/object/pointertype/interface parser to mark if it is a generic, a specialization or normal
Mediate to do. Have done it for class/object and pointertype. Needs all kinds of clone functions for the defs and syms. fpk says he has patches. waiting.
insert dummy symbols (typesym + typedef) for the type params in the generic
Cannot be inserted into the object because parsing of types within objects has been disabled to make that possible:
type Row = Cardinal; TBla = class Row: Integer; end;
weird.. (function searchsym_type, is not possible to have type definitions in records, objects, parameters)
parse implementation section methods *for the generic*
Is harder than it sounds, because the class/object the method is member of has to be searched for the type symbol for the type params (and these *not* derefed but just used as is).
when parsing generics, use a dummy type in place of the type params
For that, defer the type checking of the compiler in that case to 'as late as possible' or 'never for the generic' <--- TODO... hard...
Change derivation handlers to do the sanity checks
Should pose no problems other than finding all the places.
Add error messages for it
Change pexpr (tnode creating)
<Synopsis+> see the do_resulttypepass() calls in pexpr.pas But the compiler needs to known for example if it allows a . after the type
- add a new tnode type for postfix operators (instead of resolving the operators into the fields/methods they reference) and defer type checking until later (just before binary generation - which is not done for the generic type).
Examples for postfix operators are:
- '[]' array member access (normal array, class default array property?)
- '.' record/class member method/variable access
- '^' pointer dereference
- (automatic pointer dereference)
<Synopsis> grep for m_auto and you'll find the places where it is used <Synopsis> only in pexpr.pas
Change ppu streaming routines to save the tree of generics to the ppu
Change type user to instantiate specialized types for generics when used
Somewhat hard to do. (main work here :))
For the instantiation
- there is a dummy type sym needed (ok),
- a new objectdef
- clone of all the syms and defs (at least proc and property) of the original class/object
- and replacing all the type params by the correct types.
- Store that into the module as def (not on disk).
- (<Synopsis+> an additional symtable shall be added to tmodule: tempsymtable; all temporary tdef's shall be stored in that symtable)
- Then resume compiling for the class/object.
Way 2
just keep the source code of the generic around and do blind replacing of the type params and then compile a specialization from that as 'source'
Notes
<Synopsis+> but the code to generate the specialization at runtime can be generated at the time the specialization is created. It doesn't need to be stored in ppu. all information is available to generate the tree at that time.
<Synopsis+> In the 2.1.x development series a rewrite of the ppu loading is planned to make it more clean and more maintainable. Currently it is risky to fix bugs without introducing new bugs
<plugwash-> It needs to be stored in some form so that the specialisations can be generated from it. <Synopsis+> but you can't store them in the ppu. Or with the same limitations as inlining. No references to the implementation symtable.
fpk has patches to make cloning defs possible.
01:28:24 <Synopsis+> users will complain; things like:
interface generictype declar implementation procedure helper; begin end constructor generictype.create begin helper end end.
Will not be allowed.
Another problem:
unit a; interface generictypeA implementation uses B end.
unit b; interface uses a implemenation begin generictypeA<integer>.craete end.
At the time that the implementation of B is parsed it needs already the code of generictypeA. but that is not yet parsed! The compiler doesn't know anything yet. It only knows that generictypeA exists.
<fpk-> dannym, Synopsis: we can simply forbid such code in generics which cause the problem
<fpk-> i.e. everything used by generics must be already included in the interface uses clause ... or forbid references to units used only in the implementation section :)
<Synopsis+> fpk had a good idea that can solve some of the issues: unit bla; interface ... generics ... implementation ... initialisation ... finalizatrion end.
<Synopsis+> The generics delcarations shall be put in a separate section before implementation is parsed
<dannym> at 'procedure TTest.Add(x: T);', T needs to be known. but it doesnt seem that the object symtable is searched for the class method parameters. correct ?
<Synopsis+> dannym: correct, object symtables can not contain types, because that created problems. there is special code for that: remove the objectsymtable before parsing type declarations.
{ for records, don't search the recordsymtable for the symbols of the types } oldsymtablestack:=symtablestack; symtablestack:=symtablestack.next;
Because you can have a field with the same name as a type
type Wrd = Word; R = Record Wrd : Wrd; end
this is allowed. To support this the record or object symtable is remvoed from the symtablestack when the type is parsed.
Already existing stuff
metaclasses
type TObjectClass = class of TObject; TTypedList = class private Fwanttype: TObjectClass; ... public constructor Create(itemtype: TObjectClass); procedure Add(item: TObject); ... end; constructor TTypedList.Create(itemtype: TObjectClass); begin Fwanttype := itemtype; ... end; procedure TTypedList.Add(item: TObject); begin assert(item.Class = Fwanttype); ... end;
... TSomeStuff = class end; ... Flist := TTypedList.Create(TSomeStuff);
Advantages
- already works
Disadvantages
- slow at runtime
- cannot do simple types
- need to cast all the time even for the supported types (i.e. when reading). eeek.
To be incorporated here
Chat logs
<Synopsis> http://www.eleforum.com/cgi-bin/eleweb_lift?action=3&script=wall&num=18&reversesort=true&type=today&starttime=00:00&endtime=23:59&startdate=01-02-05&enddate=01-02-05 <Synopsis> or <Synopsis> http://neli.hopto.org:3980/~micha/irc-fpc/2005/01/%23fpc.20050102.log