Difference between revisions of "Size Matters/de"

Revision as of 23:09, 24 January 2007


Introduction

This page is about the size of binaries. The size of binaries produced by Lazarus and FPC has caused some confusion in the past. This article aims to clear that up and, in particular, to show how binary size can be reduced without much effort.

Realistic sizes of binaries built with Lazarus/FPC

File sizes below 1 MB should not be a problem.
- Make sure your binaries were built with smartlinking enabled and that all debug information was subsequently removed with the strip command. In particular, make sure that ALL libraries were built with smartlinking enabled.
- Do NOT compress your binaries with UPX by default: UPX makes the files smaller, but they use more memory at run time. Before compressing binaries with UPX, make sure the target systems have enough RAM.
For small applications, the OS-dependent size of the RTL carries more weight. Still, standalone binaries of about 100 KB that do something useful are achievable, usually even below 50 KB.
- Under Windows, binaries of only 20 KB that use the Windows API are no problem.
- The SysUtils unit contains internationalisation, error message texts, exception handling routines and many other things that are always linked into the program when the unit is used. This can increase the size of the program by an estimated 40 to 100 KB.
Lazarus applications on Windows are about 500 KB, but can quickly grow beyond 1.5 MB depending on how many user interface components (widgets) are used.
- All in all, Lazarus applications are somewhat larger than the corresponding Delphi applications. This is the price paid for a high degree of platform independence and for the maintainability of the Lazarus project.
- Binary growth levels off once newly added components/units no longer require additional LCL code to be linked in.
- The size of the binaries depends decisively on the complexity of your program's user interface.
- In Lazarus applications (as in Delphi applications), a certain percentage of the binary consists of strings and tables.
Simple Lazarus binaries on Linux/FreeBSD are larger than the corresponding GCC binaries because they do not use shared libraries. (You can list the libraries a program uses from the console with "ldd ./programname".)
64-bit binaries are always larger than x86 binaries. Likewise, code generated for RISC platforms is larger than code generated for CISC.

Why are the binaries so large?

Answer: With the right settings, binaries compiled with Lazarus are not unusually large. (Note: "large" here means larger than described in the previous section.)

If you do get "large" binaries after compiling, the cause may be

a wrong configuration of the Free Pascal compiler,
unrealistic expectations about the size of binaries ;)
or that you are using FPC for purposes it was not designed for.

The last point is the least likely: FPC is highly flexible. The following paragraphs look at the possible causes of an overly large binary in more detail.

About frameworks

A framework greatly decreases the amount of work to develop an application.

This comes however at a cost, because a framework is not a mere library, but more a whole subsystem that deals with interfacing to the outside world. A framework is designed for applications that demand a lot of functionality, and the needs of applications that do not demand a lot of functionality are generally of no or little importance to the designers of the framework.

This means an empty application in a framework can be rather large.

This size of empty applications is not caused by compiler inefficiencies, but by framework overhead. The compiler will remove unused code automatically, but not all code can be removed automatically. The design of the framework determines what code the compiler will be able to remove at compile time.

Some frameworks cause very little overhead, some cause a lot. Expected binary sizes for empty applications on well-known frameworks:

No framework: +/- 25KB
MSEGUI: +/- 600KB
Lazarus LCL: +/- 1000KB
Free Vision: +/- 100KB
Key Objects Library: +/- 50KB

In short, choose your framework carefully. A powerful framework can save you a lot of time, but if space is tight, a smaller framework may be the better choice. But be sure you really need that smaller size. Many amateurs routinely pick the smallest framework, end up with unmaintainable applications, and give up.

Are large binaries bad?

Well, that depends on the magnitude, of course. But it is safe to say that hardly anybody should be worried about having binaries as big as a few MB, or even over ten MB for sizable applications.

However, there still are a few categories that might want to have some control over keeping binaries small.

the embedded programming world, obviously (and by that I do not mean embedded PCs, which still have tens of MBs)
people that really distribute daily by modem
contests and benchmarking (the notorious language shootout)

An often-cited misconception is that larger binaries run more slowly. In general this is not true, exotic last-cycle matters such as code cache lines aside.

Embedded

While Free Pascal is reasonably usable for embedded or system purposes, the final release engineering and tradeoffs are oriented more towards general application building. For really specialised purposes, people could set up a shadow project, much like the specialised versions of certain Linux distros. Burdening the already overloaded FPC team with such specialised needs is not an option, especially since half of the serious embedded users will roll their own anyway.

Modem distribution

The modem case is not just about "downloading from the Net" or "my shareware must be as small as possible". In my last job, for example, we did a lot of deployment to our customers and our own external sites via remote desktop over ISDN. But even with a 56k modem you can squeeze a MB through quite quickly.

Be careful not to abuse this argument to provide a rational foundation for an emotional opinion about binary size. If you make this point, also dig up statistics about the percentage of actual modem users your application has (most modem users don't download software from the net, but use e.g. magazine shareware CDs).
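To put "quite quickly" into numbers, here is a minimal sketch of the arithmetic. It assumes nominal line rates and ignores protocol overhead and modem compression, so real transfers will differ; the function and table names are made up for this illustration.

```python
# Rough transfer-time estimate for shipping a binary over a slow link.
# Assumption: nominal line rates, no protocol overhead, no modem compression.

ONE_MB = 1024 * 1024  # bytes

# Nominal line rates in bits per second (illustrative values).
LINKS = {
    "56k modem": 56_000,
    "ISDN, one B channel": 64_000,
}


def transfer_seconds(size_bytes, line_rate_bits_per_s):
    """Return the ideal time in seconds to send size_bytes over the link."""
    return size_bytes * 8 / line_rate_bits_per_s


if __name__ == "__main__":
    for name, rate in LINKS.items():
        print(f"{name}: {transfer_seconds(ONE_MB, rate):.0f} s per MB")
```

Under these assumptions one MB takes about two and a half minutes on a 56k modem, which is the scale to keep in mind before trading maintainability for a few hundred KB.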

Contests

Another reason to keep binaries small is language comparison contests (like the Language Shootout). However, this is more like solving a puzzle, and not really related to responsible software engineering.

Poor compiler configuration

I'm not going to explain every aspect of the compiler configuration at great length, since this is a FAQ, not the manual. This is meant as an overview only. Read the manuals and the buildfaq thoroughly for more background info.

Generally, there are several reasons why the binary would be bigger than expected. These are, in descending order of likelihood:

The binary still contains debug information.
The binary was not (fully) smartlinked
The binary includes units that have initialisation sections that execute a lot of code.
You link in complete (external) libraries statically.
Optimization is not (entirely) turned on.
The Lazarus project file (lpr) has package units in its uses section (Lazarus adds these automatically).

In the future, shared linking to a FPC and/or Lazarus runtime library might significantly alter this picture. Of course then you will have to distribute a big DLL with lots of other stuff in it and the resulting versioning issues. This is all still some time in the future, so it is hard to quantify what the impact on binary sizes would be.

Debug information

Free Pascal uses GDB as its debugger and LD as its linker. These work with a system of in-binary debug info, be it stabs or DWARF. People often see e.g. Lazarus binaries that are 40MB. The correct size should be about 6MB; the rest is debug info (and maybe 6 MB from not smartlinking properly).

Stabs debug info is quite bulky, but has the advantage that it is relatively independent of the binary format. In time it will be replaced by DWARF on all but the most legacy platforms.

There is often confusion with respect to the debuginfo, which is caused by the internal strip in a lot of win32 versions of the binutils. Also some versions of the win32 strip binary don't fully strip the debuginfo generated by FPC. So people toggle some (lazarus/IDE or FPC commandline) flag like -Xs and assume it worked, while it didn't. FPC has been adapted to remedy this, but this will only be in versions from 2006 or later.

So, when in doubt, always try to strip manually and, on Windows, preferably with several different strip binaries. Don't take this too far though, using shoddy binaries of doubtful origin to shave off another byte. Stay with generally released versions (cygwin/mingw and their better betas).

In time, when 2.1.1 goes gold, these kinds of problems might become rarer, especially on Windows, since the internal linker treats these cases more consistently. However, they may apply to people using non-core targets for quite some time to come.
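A manual check along these lines can be scripted. The following is only a sketch: it assumes a `strip` program is on the PATH (the `strip_cmd` parameter lets you try a different strip binary), it works on a temporary copy so the original file is left untouched, and the helper names are made up for this illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def saved_percent(before, after):
    """Percentage of the original size that stripping removed."""
    if before <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (before - after) / before


def strip_copy_sizes(binary, strip_cmd="strip"):
    """Strip a temporary copy of `binary`; return (size before, size after)."""
    with tempfile.TemporaryDirectory() as tmp:
        copy = os.path.join(tmp, os.path.basename(binary))
        shutil.copy2(binary, copy)
        before = os.path.getsize(copy)
        # strip modifies the file it is given in place
        subprocess.run([strip_cmd, copy], check=True)
        return before, os.path.getsize(copy)


if __name__ == "__main__":
    # Usage: python stripcheck.py path/to/binary [path/to/strip]
    cmd = sys.argv[2] if len(sys.argv) > 2 else "strip"
    before, after = strip_copy_sizes(sys.argv[1], cmd)
    print(f"{before} -> {after} bytes "
          f"({saved_percent(before, after):.1f}% removed)")
```

If the reported saving is large for a binary that was supposedly already stripped by a compiler or IDE flag, the flag did not do what you assumed.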

Smartlinking

The base principle of smartlinking is simple and commonly known: don't link in what is not used. This of course has a good effect on binary size.

However, the compiler is merely a program and doesn't have a magic crystal ball to see what is used, so the base implementation is more like this:

The compiler finely divides the code up into so-called "sections".
Then, basically, the linker determines which sections are used by applying the rule "if no label in the section is referenced, it can be removed".

There are some problems with this simplistic view:

virtual methods may be called implicitly via their VMTs. The GNU linker can't trace call sequences through these VMTs, so they must all be linked in;
tables for resource strings reference every string constant, and thus all string constants are linked in (one reason for sysutils being big).
symbols that are reachable from outside the binary (this is possible for non-library ELF binaries too) must be kept. This last limitation is necessary to e.g. avoid stripping exported functions from shared libraries.
Another such pain point is published functions and properties. References to published functions/properties can be constructed on the fly using string operations, and the compiler can't trace them. This is one of the downsides of reflection.
Because published properties and methods can be resolved by building their symbol names with string manipulation, they must be linked in if the class is referenced anywhere. Published code might in turn call private/protected/public code, and thus pull in a fairly large amount of code.

Another important side effect that is logical but often forgotten is that this algorithm will link in everything referenced in the initialization and finalization parts of units, even if no functionality from those units is used. So be careful what you USE.

Anyway, most problems with smartlinking stem from the fact that, for the smallest result, FPC generally requires "compile with smartlinking" to be on WHEN COMPILING EACH AND EVERY UNIT, EVEN THE RTL.
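The rule and its limitations can be modelled as a reachability walk over sections. The sketch below is an illustration of the principle only, not how LD or the FPC internal linker are implemented; all section names are hypothetical.

```python
from collections import deque


def live_sections(references, roots):
    """Return the set of sections reachable from the root sections.

    `references` maps a section name to the sections it refers to.
    """
    live = set()
    queue = deque(roots)
    while queue:
        section = queue.popleft()
        if section in live:
            continue
        live.add(section)
        queue.extend(references.get(section, ()))
    return live


# Hypothetical program; every name here is made up for illustration.
REFERENCES = {
    "main": ["TShape.Create", "VMT_TShape"],
    # The VMT refers to every virtual method, called or not.
    "VMT_TShape": ["TShape.Draw", "TShape.Area"],
    "TShape.Area": ["math.sqrt"],
    # Initialization code of a unit that is merely listed in a uses clause.
    "unit_init_logger": ["logger.open", "strings_table"],
    # Never referenced from a live section, so it can be dropped.
    "helper.unused": ["math.pow"],
}

# The entry point and the initialization code of every used unit are roots.
ROOTS = ["main", "unit_init_logger"]

if __name__ == "__main__":
    kept = live_sections(REFERENCES, ROOTS)
    everything = set(REFERENCES) | {t for ts in REFERENCES.values() for t in ts}
    print("kept:   ", sorted(kept))
    print("dropped:", sorted(everything - kept))
```

In this model `TShape.Area` and `math.sqrt` stay in the binary although `main` never calls them, because the VMT references them, and everything behind `unit_init_logger` stays because initialization code is always a root. Only `helper.unused` and what it alone references are dropped.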

The reason for this is simple. Until fairly recently, LD could only "smart" link at the granularity of an entire .o file. This means that a separate .o file must be crafted for each symbol (and then these tens of thousands of .o files are archived in .a files). This is a time (and linker memory) consuming task, so it is optional, and is only turned on for release versions, not for snapshots. People having problems with smartlinking are often using a snapshot whose RTL/FCL etc. were not compiled with smartlinking on. The only solution is to recompile the source with smartlinking (-CX) on. See the buildfaq for more info.

In the future this will improve, when the compiler emits smartlinking code by default, at least for the main targets. This is made possible by two distinct developments. First, the GNU linker LD can now smartlink at a finer granularity (at least on Unix) using --gc-sections. Second, the FPC internal linker has arrived (in the 2.1.1 branch) for all working Windows platforms (wince/win32/win64). Smartlinking using LD --gc-sections still has a lot of problems, because the exact assembler layout and numerous details with respect to tables must be researched. We often run into the typical problem with GNU development software here: the tools are barely tested (or sometimes not even implemented, see the DWARF standard) outside what GCC uses/stresses.

The internal linker can now smartlink Lazarus (17 seconds for a full smartlink on my Athlon64 3700+, using about 250MB memory), which is quite good, but it is Windows only and 2.1.1 only for now. The internal linker also opens the door to more advanced smartlinking that requires Pascal-specific knowledge, like leaving out unused virtual methods (20% code size on Lazarus examples, 5% on the Lazarus IDE as a rough first estimate), and being smarter about unused resourcestrings. This is all still in alpha, and the above numbers are probably too optimistic, since Lazarus is not working with these optimizations yet.

Initialization and finalization sections

If you include a unit in a USES section, even indirectly via a different unit, and the unit contains initialization or finalization sections, then that code and its dependencies are always linked in.

A unit for which this is important is sysutils. For Delphi compatibility, sysutils converts runtime errors to exceptions with a textual message. All the strings in sysutils together are a bit bulky. Nothing can be done about this, except removing a lot of initialisation from sysutils, which would make it Delphi incompatible. So this is more something for an embedded release, if such a team would ever volunteer.

Static binaries

One can also make fully static binaries on any OS, incorporating all libraries into the binary. This is usually done to ease deployment, but the tradeoff is huge binaries. Since this is wizard territory I only mention it for the sake of completeness. People who can do this hopefully know what they are doing.

Optimization

Optimization can also shave off a bit of code size, since optimized code is usually tighter (but only by tenths of a percent). Make sure you use -O3.

Lazarus lpr files

In Lazarus, if you add a package to your project/form, its registration unit is added to the lpr file. The lpr file is not normally open; if you want to edit it, first open it (via project -> view source). Then remove all the unnecessary units. Only Interfaces, Forms, and YOUR FORM units are required; anything else is useless there, but make sure you don't delete units that are only there to register things, such as image readers (jpeg) or testcases.

You can save up to megabytes AND some linking dependencies too if you use big packages (such as glscene).

UPX

The whole UPX cult is a funny thing that originates in a mindless pursuit of minimal binary sizes. In reality it is a tool with advantages and disadvantages.

The advantages are:

The decompression is easy for the user because it is self-contained
If, and only if, some size criterion applies to the binary size itself (and not to e.g. the binary in a zip), as in demo contests, it can save some space. However, especially in the lowest classes it might be worthwhile to code your compression yourself, because you can probably get the decompression code much tighter for binaries that don't stress all aspects of the binary format.
For rarely used applications or applications run off removable media the diskspace saving may outweigh the performance/memory penalties.
Many users don't know about UPX and judge applications on size (and yes this includes reviewers on shareware listings sites) so if other developers in the category use it you will look bloated if you do not follow suit.

The disadvantages are:

worse compression (and also the decompression engine must be factored into _EACH_ binary)
decompression must occur each time.
Since Windows XP and later feature a built-in decompressor for ZIP, the whole point of SFX goes away a bit.
Binaries that are internally compressed can't be memory-mapped by Windows, and must be loaded in their entirety. This means that the entire binary is loaded into VM space (memory+swap), including resources.

The last point can use some explanation: with normal binaries under Windows, all unused code remains in the .EXE, which is why Windows binaries are locked while running. Code is paged in 4k at a time as needed, and under low-memory conditions simply discarded (because it can be reloaded from the binary at any time). This also goes for (graphical/string) resources.

A compressed binary usually must be decompressed in its entirety, or compression ratio will hurt badly. So windows must decompress the whole binary on startup, and page the unused pages to the system swap, where they rot unused.
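The difference can be made concrete with a deliberately simplified model. It assumes 4k pages as mentioned above, ignores the size of the decompressor stub and any page sharing between processes, and uses made-up sizes; the function names are hypothetical.

```python
PAGE = 4096  # bytes; the 4k page size mentioned in the text


def pages(size_bytes):
    """Number of pages needed to hold size_bytes, rounded up."""
    return -(-size_bytes // PAGE)


def resident_bytes_plain(touched_bytes):
    """Memory-mapped binary: only the pages actually touched are loaded."""
    return pages(touched_bytes) * PAGE


def resident_bytes_compressed(uncompressed_size):
    """Internally compressed binary: the whole image is unpacked into
    memory or swap, whether or not the code is ever executed."""
    return pages(uncompressed_size) * PAGE


if __name__ == "__main__":
    size = 1_500_000   # hypothetical 1.5 MB application
    touched = 300_000  # hypothetical: code and resources used in one run
    print("plain:     ", resident_bytes_plain(touched), "bytes")
    print("compressed:", resident_bytes_compressed(size), "bytes")
```

In this model the uncompressed binary costs memory in proportion to what a run actually uses, while the compressed one always costs its full unpacked size, regardless of how small the file on disk has become.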

Unrealistic expectations

A lot of people simply look at the size of a binary and scream "bloat!". When you try to argue with them, they hide behind comparisons (but TP only produces...) and never really say 'why' they need the binary to be smaller at all costs. Some of them don't even realise that 32-bit code is ALWAYS bigger than 16-bit code, or that OS independence comes at a price, or ..., or ..., or...

As said earlier, with current HD sizes there is not much reason to keep binaries extremely small. FPC binaries being 10, 50 or even 100% larger than those of compilers of the previous millennium shouldn't matter much. A good indicator that these views are pretty emotional and not well founded is the overuse of UPX (see above), which is a typical sign of binary-size madness, since technically it doesn't make much sense.

So where is this emotion coming from, then? Is it just resisting change, or being control freaks? I never saw much justified cause, except that sometimes some of them were pushing their own inferior libraries and tried to gain ground against well-established libs based on size arguments. But this doesn't explain all cases, so I think the binary size thing is really the last "640k should be enough for anybody" artefact, even though it is not real, just mental.

A dead giveaway for that is that the number of realistic patches in this field is near zero, if not zero. It's all mailing-list discussion only, and trivial RTL mods that hardly gain anything and seriously hamper making real applications and compatibility (and I'm not a compatibility freak to begin with). There are no externally maintained cut-down RTLs, no patch sets etc., while it would be extremely easy.

Anyway, the few embedded people I know that use FPC intensively all have their own customized cut back libraries. For one person internationalisation matters (because he talks a language with accents), and exceptions do not, for somebody else requirements are different again. Each one has its own tradeoffs and choices, and if space is 'really' tight, you don't compromise to use the general release distro.

And yes, FPC could use some improvements here and there. But those shouldn't hurt "general programming", the multiplatform nature of FPC or the ease of use, and they must be realistic in manpower requirements. Complex things take time. Global optimizers don't fall from the sky ready-made.

Comparisons with GCC

Somewhat less unrealistic are comparisons with GCC. Even the developers routinely measure themselves (and FPC) against GCC. Of course GCC is a corporate-sponsored behemoth, which is also the open source world's favorite. Not all comparisons are reasonable or fair. Even compilers based on GCC don't support all the functionality of GCC's heavily sponsored C compiler.

Nevertheless, considering the differences in project size, FPC does a surprisingly good job. Speed is ok, except maybe for some cases of heavy scientific computation, binary sizes and memory use are sufficient or even better in general, and the number of platforms doesn't disappoint (though it is a pity that 'real' embedded targets are missing).

Another issue here is that Free Pascal generally links its own RTL statically, because the RTL is not ABI stable and would be unlikely to be on the target system already even if it were. GCC dynamically links against system libraries. This makes very small programs (in terms of source size) built with FPC have significantly larger binaries than those built with GCC. It's worth mentioning here that the binary size has nothing to do with the memory footprint of the program. FPC is usually much better in this regard than GCC.

Still, I think that considering the resources, FPC is doing extraordinarily well.

Comparisons with Delphi

In comparisons with Delphi one should keep in mind that 32-bit Delphi's design originates in a period when a lot of people DIDN'T even have a Pentium I, and a developer with 32MB RAM was a lucky one. Moreover, Delphi was not designed to be portable.

Considering this, Delphi scaled pretty well, though there is always room for improvement and for readjustments that correct historical problems and tradeoffs. (It is a pretty well known fact that a lot of assembler routines in newer Delphi versions were slower than their Pascal equivalents, because they were never updated for newer processors. Only the recent D2006 is said to have corrected this.)

Still, on the compiler front, FPC is slowly ceasing to be Delphi's poor cousin. The comparisons are head-on, and FPC 2.1.1 winning over Delphi is slowly becoming the rule, not the exception.

Of course that is only the base compiler. In other fields there is still enough work to do, though the internal linker helps a lot. The debugger won't be fun though :-) There is also lots of work to do in language interoperability (C++, Obj C, JNI) and shared libraries, even within the base system.

Comparisons with .NET/Java

Be very careful with comparisons to these JIT-compiled systems. JITed programs have different benchmark characteristics, and extrapolating results from benchmarks to full programs works differently too.

A JIT can do a great job sometimes (especially in small programs that mostly consist of a single tight loop), but this good result often doesn't scale. Overall, my experience is that statically compiled code is usually faster in most code that is not mainly bound by some highly optimizable tight loop, despite the numerous claims on the net to the contrary.

A fairly interesting quantitative source for this is this Shootout FAQ entry.

@@ Line 25: / Line 25: @@
 Answer: With <b>the right settings</b>, binaries compiled with Lazarus are not unusually large.
+(Note: "large" here means larger than described in the previous section.)
 If you do get "large" binaries after compiling, the cause may be
@@ Line 32: / Line 33: @@
 * or that you are using FPC for purposes it was not designed for.
-The last point is the least likely: FPC is highly flexible. The following paragraphs look at the possible causes of a binary that is perceived as too large in more detail.
+The last point is the least likely: FPC is highly flexible. The following paragraphs look at the possible causes of an overly large binary in more detail.
 === About frameworks ===