packages

From Free Pascal wiki
Revision as of 11:22, 16 September 2007 by Marcov (talk | contribs)
Jump to navigationJump to search

See also Shared Libraries

The FPC/Lazarus project unfortunately is a bit ambiguous with respect to the term packages. In general it is a collection of related units, but the precise way these units are related varies. The most common definitions:

  1. A set of units in FPC that are treated together for installation purposes. Both in the fpcmake as the newer fppkg packagemanager contexts.
  2. A set of units in Lazarus (with designtime parts that require registration into the IDE). See Lazarus Packages
  3. A set of units compiled into a DLL with some extra features that make the package a logical part of the main program, instead of an "external" part. Also called Delphi package. I call this Library package from now on.

This article is meant as a brainstorm session, and requirements analysis relating to the latter definition, one of the missing pieces of the Delphi compability. Most created from browsing an hour on the web, looking at pages about fixes and modding the Delphi packages system, and examining the linker backend of FPC.

Note that in time Lazarus packages might be implemented as Library (Delphi) packages.

Delphi packages

Since Delphi packages is a ambiguous name (it refers both to the IDE angle and the principle), I think Library package would be a better designation. I'll at least use it for this article till something better comes up.

A Library-package can be seen from an implementation point, then it is essentially a DLL with some extra's, or from a language point, then it is a library that consists out of a couple of units, and the library itself (the package) is also an own entity with respect to dependancies. (have a dependancy on the library, then you have a dependancy on all units in it).

What do we need them for?

A library package allows to transparently split a program in multiple binaries (exe + dll/so) . The dynamical variant also allows for plugin systems. The transparant issue is the key that sets library packages apart from normal libraries. Packages allow splitting the program up by grouping units into librarypackages without much additional effort. The transparancy is not 100%, unit dependancies may shift, and not any division of a codebase into packages might be possible due to unit initialization order requirements. However these cases will be rare.

A side Windows specific use might be implementing COM components that fully integrate with the mainprogram. This because a COM component is usually done by registering a DLL (not a .EXE). A package is also a DLL, but they are still part of the main program. I mention this because the Delphi OO bridge package (Open Office connectivity pack by Clootie) might use this.

Implementation

Some implementation points:

  • If your system uses packages,
    • the RTL and other all units the packages use, must also be in a package. In other words, a package can't USES a unit directly in the .EXE
    • A unit can exist only once in one package (see Swart reference below), and probably only once in EXE+packages too .
    • On load of a package this condition (uniqueness of an unit) is checked, but possibly only when packages are dynamically loaded by the program. (the rest being handled by normal dll dependancies) This combined with the previous condition avoids problems like multiple definitions of the VMTs and RTL state in general. Note also that this means a list of units per package must be accessable to the main program on load (?) It is not entire clear what "module" means in library package context. Probably only the librarypackage itself, but maybe a little additional metadata or also e.g. a list of units and their CRCs.
      • So note that apps that load packages dynamically, must have all relevant modules loaded. The following scenario would be interesting. Package C depends on B and A, and B also depends on A. Now the exe is statically (but shared) linked to pkg A (the RTL dll), and then dynloads first B and then C. Does this work?
    • Since they are DLLs, due to the Windows DLL symbol resolving, packages can probably survive some minor patching (allowing implementations to be fixed, probably the Delphi fixpacks (or however they call them) work this way), but in general it must be pretty much the same packages as the program was compiled with.
  • One of the additional features over normal shared libraries are pascal level initializer's ("Initialize") and a module finalizer ("Finalize"). I'm not sure if they are library procedures that traverse compiler generated tables, or if they are wholly compiler generated.
    • Note that probably means that all units in a package are initialised and finalised at the same time. This means that a program built with packages might force a little bit more order on unit initialisation than exactly the same program built without. (because it has the additional requirement that all units in a package are initialised in a sequence). This might have consequences for the compiler.
  • Some form of identification and dependancy management. I'm not sure how much these requirements are additional to normal library/DLL versioning, and what is more language dependant. It seems for runtime loading there should be a _runtime_ check on presence of all units the library depends on, maybe also an crc check to verify the right ones.
  • Packages can use higher level functions (like ansistrings and classes) over packages borders. This means they use the same memory manager. See separate paragraph on this
  • Like DLLs, packages can be statically linked (so that its presence is required for startup), and programs can load additional packages using loadpackage (unloadpackage?). Since a package can only depends on existing packages, and not on the exe (RTL also in package remember!), this means that dynloaded packages can be crafted AFTER .exe generation, allowing for plugin systems.
  • packages are loaded by a procedure called "safeloadlibrary" which is a loadlibrary wrapper that saves the FPU statusword and disables some windows errorwindows on failure to load. (?!?)
  • relocation, for this a little thought experiment using the following realistic Delphi scenario: Assume packages A and B both import packages RTL, C, D, but are compiled separately and used in one .EXE program. This means that the compiler can't guarantee all BPL's are already on a unique baseaddress and addresspace (since A and B could be compiled without the other present). -> they are relocatable, though probably DLL loading will resolve this transparantly mostly.

References

What needs to be done

This roughly splits into three or four parts: (note that afaik Delphi supports all this)

  1. Be able to create and use packages on all platforms. (statically)
    1. RTL into a DLL
    2. mainprogram uses RTL, and can access all symbols it could without. (properties, vars, typed constants, functions, RTTI)
    3. Add another package to the mix. (and multiple in general)
  2. Be able to use package dynamically
    1. Make sure the RTL checks all required packages for a dynloaded package are loaded.
  3. Implement the language/semantic parts.
    1. language and compiler support for package dependancies (requires etc)
    2. Some way to handle versioning (PPL versioning?)
    3. Delphi's "build with packages" and the package list to link is signaled from outside the source. IOW cmdline parameters or a small file with data in the FPC case.

The language support requires a new concept above mere ppu level, that of collection of ppus, that is crosslinked to ppu level. This because on package level, the dependancies are administrated on package level, and in the source on unit level. (the keywords of a "library" unit explicit hint on this "requires" is for package level dependancies, and the unit list is the unit list the package supports). However instead of "requires" this can also be done using cmdline parameters (toggling "build with packages" and specifying the package list in the Delphi IDE)

The system unit init problem (plugin units like memmanager)

In normal FPC programs, the system unit is inited first, and then the main program uses clauses is walked. This allows the user to specify units to initialize before any other unit. If the system unit is in the RTL package, the RTL package will init as a whole. If it is not in a package, it makes plugging units into the basic RTL difficult. So probably the RTL package is an exception in the rule that packages init as a whole. Of course the RTL package can't be loadlibraried, since it is always a requirement for program startup.

So most likely it works as follows:

  1. the system unit inits first, and must be (kept) hardened so that plugin units (like other memmanagers) can plug in. (it already is)
  2. Then units are inited in normal order as would it be a whole program, but a package initializes fully when the first unit would initialize.

Lazarus and Library packages

Even though Delphi uses this system for its packages, it doesn't mean Lazarus should also. Borland was mostly fixed on one platform (this is pre-Kylix) when they designed this.

Since we need packages regardless of what Lazarus does, Lazarus can evaluate at a later time what they use or do. They might e.g. decide against using packages out of versioning reasons (they want to release more frequent than Delphi) or because it can't be implemented in a reasonable way on one of the core platforms Lazarus must support.

Versioning

A somewhat harder nut to crack is library versioning. Borland only released a dozen delphi's over more than a decade (updates are apparantly kept compatible), which keeps the versioning problem somewhat managable for them.

We all know the notorious cygwin, but keep in mind that cygwin also only releases an incompatible DLL every six months, less in more recent times. Granted, currently we don't make a semiannual schedule, but we are also not THAT far from a comparable frequency, and thus similar versionitis.

I don't have any ready made proposals here. It will probably need evaluation after the initial implementation period, how easy it is to maintain compatible versions. However this has the potential to become a stifling compability hold on the minor release schedule.

During a quick brainstorm at a FPC meeting possible problems are:

  • normal methods and procedures can be added, and the DLL lazy resolving (on name) will be robust against such changes. However inserting VMT methods is dangerous.
  • (FPC <->FPC-in-Lazarus-distro incompabilities (?))

Compiling a package

That's easy a library unit is compiled, and "build with packages" is on, a library package will be created.

Compiling with packages

This is a bit harder. The problem is that in the source we find only unit names, and we must somehow translate this to packages. This means that a table must be constructed with unit to package mappings, and package dependancies.

Such a table can be constructed in two ways:

  1. Dynamically -> Scanning for library package header files (*.ppl) and examining them on startup if true.
  2. Statically -> packages must be registered.

I suspect delphi does this dynamically, since runtime packages don't need to be registered. This means that a scan for .ppls (*.dcp in Delphi) and some structures must be added, and the unit search mechanism must be changed to search in these structures first only IF "BUILD WITH PACKAGES" IS ON.

Extensions

Delphi FPC Description
*.dcp *.ppl (ppumove) Combined .ppu files for this library.
*.bpl, .so ? , .so The linkable library file.

Note that combining .ppu's into a ppl means that there won't be duplicate .ppu's (one for static, one for packages), so no need to complicate the dir structure.

Debugging

Our achilles' heel. Can't say much about this. Probably similar to dll/.so debugging.

PPUMove

ppumove is a standalone binary in the FPC distribution that creates a shared libs from already compiled units. As far as I can see, this is already the beginning of a package system, except that it is manual (create packages manually yourself). This support and the makefile targets mostly dates back to the late 0.99.x/1.0.x series. (already before fpcmake)

The RTL (on linux only?) makefile has a "shared" target, and Florian says it should be possible to create a shared lib of the rtl by running make clean all CREATESHARED=1 on Linux. Non x86_32 Linux needs the units to be compiled with PIC though, but the makefile seems to already add -Cg this for non x86_32.

See also