packages

From Free Pascal wiki
Revision as of 11:24, 9 July 2007 by Marcov (talk | contribs)

See also Shared Libraries

The FPC/Lazarus project unfortunately uses the term "package" ambiguously. In general a package is a collection of related units, but the precise way these units are related varies. The most common definitions are:

  1. A set of units in FPC that are treated together for installation purposes. This applies to both fpcmake and the newer fppkg package manager.
  2. A set of units in Lazarus (with design-time parts that require registration into the IDE). See Lazarus Packages
  3. A set of units compiled into a DLL with some extra features that make the package a logical part of the main program, instead of an "external" part. Also called a Delphi package. I call this a Library package from now on.

This article is meant as a brainstorm session and requirements analysis relating to the last definition, one of the missing pieces of Delphi compatibility. Most of it was put together from about an hour of browsing the web, looking at articles about fixing and modding the packages system.

Note that in time Lazarus packages might be implemented as Library (Delphi) packages.

Delphi packages

Since Delphi packages is a bad name (it refers both to the IDE angle and the principle), I think Library package would be a better designation. I'll at least use it for this article till something better comes up.

A Library package can be seen from an implementation point of view, in which case it is essentially a DLL with some extras, or from a language point of view, in which case it is a library that consists of a couple of units, where the library itself (the package) is also an entity of its own with respect to dependencies. (If you have a dependency on the library, then you have a dependency on all units in it.)

What do we need them for?

A library package makes it possible to transparently split a program into multiple binaries (exe + dll/so). The dynamic variant also allows for plugin systems. The transparency is the key point that sets library packages apart from normal libraries: packages allow splitting the program up by grouping units into library packages without much additional effort.

A secondary, Windows-specific use might be implementing COM components that fully integrate with the main program. This is because a COM component is usually implemented by registering a DLL (not an .EXE). A package is also a DLL, but it is still part of the main program. I mention this because the Delphi OO bridge package (Open Office connectivity pack by Clootie) might use this.

Implementation

Some implementation points:

  • If your system uses packages,
    • the RTL and other units the packages use, must also be in a package.
    • A unit can exist only once in one package (see Swart reference below), and probably only once in EXE+packages too.
    • On load of a package this condition (uniqueness of a unit) is checked, but possibly only when packages are dynamically loaded by the program (the rest being handled by normal DLL dependencies). Combined with the previous condition, this avoids problems like multiple definitions of the VMTs and of RTL state in general. Note also that this means a list of units per package must be accessible to the main program on load (?). It is not entirely clear what "module" means in a library package context. Probably only the library package itself, but maybe a little additional metadata or also e.g. a list of units and their CRCs.
      • So note that apps that load packages dynamically must have all relevant modules loaded. The following scenario would be interesting: package C depends on B and A, and B also depends on A. Now the exe is statically linked to package A (the RTL DLL), and then dynamically loads first B and then C. Does this work?
    • Since they are DLLs, due to the Windows DLL symbol resolving, packages can probably survive some minor patching (allowing implementations to be fixed), but in general they must be pretty much the same packages as the program was compiled with.
  • One of the additional features over normal shared libraries is Pascal-level initializers ("Initialize") and a module finalizer ("Finalize"). I'm not sure if they are library procedures that traverse compiler-generated tables, or if they are wholly compiler generated.
    • Note that this probably means that all units in a package are initialised and finalised at the same time. This means that a program built with packages might force a little bit more order on unit initialisation than exactly the same program built without (because it has the additional requirement that all units in a package are initialised in one sequence). This might have consequences for the compiler.
  • Some form of identification and dependency management. I'm not sure how much of these requirements is additional to normal library/DLL versioning, and how much is language dependent. It seems that for runtime loading there should be a _runtime_ check on the presence of all units the library depends on, maybe also a CRC check to verify they are the right ones.
  • Packages can use higher-level features (like ansistrings and classes) across package borders. This means they use the same memory manager. It is not known if this is automatically changed to the COM-compatible memory manager sharemem (e.g. in the RTL BPL creation) or if this is the normal suballocator.
  • Like DLLs, packages can be statically linked (so that their presence is required for startup), and programs can load additional packages using loadpackage (unloadpackage?). Since a package can only depend on existing packages, and not on the exe (the RTL is also in a package, remember!), this means that dynamically loaded packages can be crafted AFTER .exe generation, allowing for plugin systems.
  • Packages are loaded by a procedure called "safeloadlibrary", which is a loadlibrary wrapper that saves the FPU status word and disables some Windows error windows on failure to load. (?!?)
  • Relocation. A little thought experiment using the following realistic Delphi scenario: assume packages A and B both import packages RTL, C and D, but are compiled separately and used in one .EXE program. This means that the compiler can't guarantee that all BPLs already have a unique base address and address space (since A and B could be compiled without the other present). -> They are relocatable, though DLL loading will probably resolve this mostly transparently.
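
As a rough illustration of the dynamic case, this is what a plugin host looks like on the Delphi side. LoadPackage and UnloadPackage are the Delphi SysUtils routines; the file name myplugin.bpl is made up, and the program is assumed to be built with runtime packages so that host and plugin share one RTL. It is a sketch of the Delphi behaviour to match, not something that is claimed to work in FPC.

```pascal
program PluginHost;
{$APPTYPE CONSOLE}

uses
  SysUtils;  // Delphi: declares LoadPackage and UnloadPackage

var
  Handle: HMODULE;
begin
  // 'myplugin.bpl' is a hypothetical package file name.
  // LoadPackage raises an exception if the package cannot be loaded,
  // and runs the initialization of the units in the package.
  Handle := LoadPackage('myplugin.bpl');
  try
    Writeln('myplugin.bpl loaded');
  finally
    // Finalizes the units of the package and unloads the DLL.
    UnloadPackage(Handle);
  end;
end.
```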

References

What needs to be done

This roughly splits into three or four parts: (note that afaik Delphi supports all this)

  1. Be able to create and use packages on all platforms. (statically)
    1. RTL into a DLL
    2. mainprogram uses RTL, and can access all symbols it could without. (properties, vars, typed constants, functions, RTTI)
    3. Add another package to the mix. (and multiple in general)
  2. Be able to use packages dynamically
    1. Make sure the RTL checks that all packages required by a dynamically loaded package are loaded.
  3. Implement the language/semantic parts.
    1. language and compiler support for package dependencies (requires etc)
    2. Some way to handle versioning (PPL versioning?)
    3. Delphi's "build with packages" and the package list to link is signaled from outside the source. IOW cmdline parameters or a small file with data in the FPC case.
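
The check in step 2.1, together with the unit uniqueness rule from the implementation notes, can be modelled in a few lines. This is only a toy model under my own assumptions: packages are described by comma-separated name lists instead of real metadata, and TryLoad, LoadedPackages and LoadedUnits are invented names, not RTL routines. It walks through the A/B/C scenario mentioned earlier.

```pascal
program LoadCheckSketch;
{$ifdef FPC}{$mode objfpc}{$H+}{$else}{$APPTYPE CONSOLE}{$endif}

uses
  Classes, SysUtils;

var
  LoadedPackages, LoadedUnits: TStringList;

// Hypothetical loader check. Returns False (and registers nothing) if a
// required package is not loaded yet, or if one of the package's units is
// already provided by a package that was loaded earlier.
function TryLoad(const APackage, ARequired, AUnits: string): Boolean;
var
  Req, Provided: TStringList;
  i: Integer;
begin
  Result := False;
  Req := TStringList.Create;
  Provided := TStringList.Create;
  try
    Req.CommaText := ARequired;
    Provided.CommaText := AUnits;
    for i := 0 to Req.Count - 1 do
      if LoadedPackages.IndexOf(Req[i]) < 0 then
        Exit;                      // missing dependency
    for i := 0 to Provided.Count - 1 do
      if LoadedUnits.IndexOf(Provided[i]) >= 0 then
        Exit;                      // unit would exist twice
    LoadedPackages.Add(APackage);
    LoadedUnits.AddStrings(Provided);
    Result := True;
  finally
    Provided.Free;
    Req.Free;
  end;
end;

begin
  LoadedPackages := TStringList.Create;
  LoadedUnits := TStringList.Create;
  try
    Writeln(TryLoad('A', '', 'system,sysutils')); // the RTL package: accepted
    Writeln(TryLoad('C', 'A,B', 'unitc'));        // rejected, B not loaded yet
    Writeln(TryLoad('B', 'A', 'unitb'));          // accepted
    Writeln(TryLoad('C', 'A,B', 'unitc'));        // accepted now
    Writeln(TryLoad('D', 'A', 'unitb'));          // rejected, unitb is in B
  finally
    LoadedUnits.Free;
    LoadedPackages.Free;
  end;
end.
```

In this model, loading B and then C on top of a statically linked A works, while loading C first does not. A real loader would additionally have to compare unit CRCs or similar version data.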

The language support requires a new concept above the mere ppu level, that of a collection of ppus, which is cross-linked to the ppu level. This is because in a package the dependencies are administered at package level, while in the source they are administered at unit level. (The keywords of a "library" unit hint at this explicitly: "requires" is for package-level dependencies, and the unit list is the list of units the package provides.) However, instead of "requires" this can also be done using command-line parameters (compare toggling "build with packages" and specifying the package list in the Delphi IDE).
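
For comparison, this is roughly what the package-level source looks like in Delphi, where the requires clause lists other packages and the contains clause lists the units. MyPlugin and PluginUnit are placeholder names, and whether FPC should adopt exactly this syntax is an open question.

```pascal
package MyPlugin;        // hypothetical package name

requires
  rtl;                   // package-level dependency on the RTL package

contains
  PluginUnit in 'PluginUnit.pas';   // hypothetical unit in this package

end.
```

A dependency on MyPlugin is then a dependency on every unit in its contains list, while the uses clauses inside PluginUnit.pas stay at unit level.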

Lazarus and Library packages

Even though Delphi uses this system for its packages, that doesn't mean Lazarus should too. Borland was mostly focused on one platform (this was pre-Kylix) when they designed this.

Since we need packages regardless of what Lazarus does, Lazarus can evaluate at a later time what they use or do. They might e.g. decide against using packages for versioning reasons (they want to release more frequently than Delphi) or because it can't be implemented in a reasonable way on one of the core platforms Lazarus must support.

Versioning

A somewhat harder nut to crack is library versioning. Borland only released a dozen Delphi versions over more than a decade (updates are apparently kept compatible), which keeps the versioning problem somewhat manageable.

We all know the notorious Cygwin, but even Cygwin only releases an incompatible DLL every two months. Granted, currently we don't manage a semiannual release schedule, but we are also not THAT far from such a frequency.

I don't have any ready made proposals here. It will probably need evaluation after the initial implementation period, how easy it is to make incompatible versions.

A quick brainstorm at an FPC meeting came up with these possible problems:

  • Normal methods and procedures can be added, and the DLL lazy resolving (by name) will be robust against such changes. However, inserting VMT methods is dangerous.
  • (FPC <-> FPC-in-Lazarus-distro incompatibilities (?))
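
Why inserting a virtual method is dangerous can be shown in a single program by faking the two versions. TShapeV1 stands for the class layout the main program was compiled against and TShapeV2 for the layout shipped in a newer package; both classes and their methods are invented for this sketch. The unchecked typecast imitates old code that still uses the old VMT slot numbers, and it assumes (as I believe both FPC and Delphi do) that virtual methods get their slots in declaration order.

```pascal
program VmtSlotSketch;
{$ifdef FPC}{$mode objfpc}{$else}{$APPTYPE CONSOLE}{$endif}

type
  // Layout the main program was compiled against (hypothetical v1).
  TShapeV1 = class
    procedure Draw; virtual;
    procedure Rotate; virtual;
  end;

  // Layout in the newer package (hypothetical v2):
  // Resize was inserted before Rotate.
  TShapeV2 = class
    procedure Draw; virtual;
    procedure Resize; virtual;
    procedure Rotate; virtual;
  end;

procedure TShapeV1.Draw;   begin Writeln('V1.Draw');   end;
procedure TShapeV1.Rotate; begin Writeln('V1.Rotate'); end;
procedure TShapeV2.Draw;   begin Writeln('V2.Draw');   end;
procedure TShapeV2.Resize; begin Writeln('V2.Resize'); end;
procedure TShapeV2.Rotate; begin Writeln('V2.Rotate'); end;

var
  Obj: TShapeV2;
begin
  Obj := TShapeV2.Create;
  try
    // Old code asks for Rotate through the v1 view of the object, i.e.
    // through the second user-defined VMT slot. In the v2 layout that
    // slot is expected to hold Resize, so the wrong method runs.
    TShapeV1(Pointer(Obj)).Rotate;
  finally
    Obj.Free;
  end;
end.
```

Appending new virtual methods at the end of the class would keep the old slots stable in this toy case, but descendants that add their own virtual methods would still be shifted.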

PPUMove

ppumove is a standalone binary in the FPC distribution that creates a shared library from already compiled units. As far as I can see, this is already the beginning of a package system, except that it is manual (you create the packages yourself). This support and the makefile targets mostly date back to the late 0.99.x/1.0.x series (so from before fpcmake).

The RTL makefile (on Linux only?) has a "shared" target, and Florian says it should be possible to create a shared lib of the RTL by running make clean all CREATESHARED=1 on Linux. Non-x86_32 Linux needs the units to be compiled with PIC though, but the makefile seems to already add -Cg for this on non-x86_32.

See also