# FPC JVM/Language


# General information

This is a compiler-only port. That means that except for the system unit and a unit that imports the JDK classes, no other standard RTL or other units are available. Furthermore, even the system unit is quite limited in terms of the functionality that it provides (details are below).

Over time, it is possible to implement most of the standard RTL and other unit functionality in a way that it can be compiled using the JVM-targeted compiler. That is however not the goal of this initial port and bug reports about missing unit-level functionality are not useful at this point (but patches are of course welcome! :).

Compiling code for the JVM target is currently quite slow, because the Java assembler used by the compiler, Jasmin, is very slow. This is probably not so much because Jasmin is written in Java, but simply because it is a slow program.

The minimum JDK version required for running programs generated by the FPC JVM port is 1.5.

# Used terminology and conventions

• on the Pascal side/on the Java side: refers to code written in resp. Pascal or Java. A number of features are only available when programming in Pascal, because the Java language has no similar concept.
• implicit pointer types: a number of types that are not pointers in Pascal, are implemented on top of Java classes or arrays. This means that they are implicitly pointers to the actual data, since a class or array is also just a pointer in Java. While possibly counterintuitive, this means that variables of such types can actually be made to behave more like their counterparts on native targets than other types.
• fully qualified class names: the compiler currently does not support namespaces (such as org.freepascal or java.lang) as a syntactic element, nor does it support dotted unit names. This means that it is not possible to use identifiers such as java.lang.Object in the source code. All Pascal headers generated for imported classes abbreviate class names by taking the first letter of each part of the package name, followed by the full class name (e.g., java.lang.Object becomes JLObject). For classes or other types declared in Pascal source code, the regular unit.identifier syntax works. Nevertheless, all identifiers used below will appear using their fully qualified Java name, since the Pascal name can be derived from that while the opposite is not the case.
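The abbreviation rule for imported class names can be sketched as a small Java helper. This is only an illustration of the rule as stated above, not the code used by the header generator: the class `PascalHeaderNames` and the method `abbreviate` are hypothetical names, the upper-casing of the package letters is an assumption based on the JLObject example, and nested classes are not handled.

```java
final class PascalHeaderNames {
    // Hypothetical helper (not part of FPC or javapp): derives the abbreviated
    // Pascal name from a well-formed, fully qualified Java class name.
    static String abbreviate(String qualifiedName) {
        String[] parts = qualifiedName.split("\\.");
        StringBuilder name = new StringBuilder();
        // First letter of every package component, upper-cased
        // (assumption based on java.lang.Object -> JLObject).
        for (int i = 0; i < parts.length - 1; i++) {
            name.append(Character.toUpperCase(parts[i].charAt(0)));
        }
        // The class name itself is kept in full.
        name.append(parts[parts.length - 1]);
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("java.lang.Object")); // prints JLObject
    }
}
```

The sketch also shows why the mapping cannot be reversed: two different package names whose components start with the same letters produce the same prefix.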

# Base platform behaviour

Internally the Java/JVM target is treated by the compiler as a 32 bit target. This only means that arithmetic expressions are by default evaluated using 32 bit arithmetic, rather than by 64 bit arithmetic, just like for native 32 bit targets. The generated code will still run perfectly on 64 bit JVMs, and the compiler will also use the built-in 64 bit arithmetic opcodes of the JVM when performing 64 bit computations.

Reasons for this choice include:

• 32 bit arithmetic can be expressed more efficiently in Java bytecode than 64 bit arithmetic
• array indices are always 32 bit in the JVM
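What evaluating an expression with 32 bit arithmetic means in practice can be illustrated with a Java analogue. This is a sketch of the underlying JVM arithmetic only, assuming no overflow checking; the class and method names are made up for the example, and the exact behaviour of Pascal code should be checked against the compiler.

```java
final class ArithmeticWidth {
    // Both operands are 32 bit, so the addition is a 32 bit addition that
    // wraps around; only the wrapped result is widened to 64 bit.
    static long addIn32Bit(int a, int b) {
        return a + b;
    }

    // Widening one operand first forces a 64 bit addition.
    static long addIn64Bit(int a, int b) {
        return (long) a + b;
    }

    public static void main(String[] args) {
        System.out.println(addIn32Bit(Integer.MAX_VALUE, 1)); // prints -2147483648
        System.out.println(addIn64Bit(Integer.MAX_VALUE, 1)); // prints 2147483648
    }
}
```

As on native 32 bit targets, converting one operand to a 64 bit type before the operation is the usual way to request a 64 bit computation.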

The extended floating point type maps to double, just like it does on other targets that do not natively support 80 bit floating point.

# Language feature support

While the language supported by the JVM port is as close as possible to the one supported for native FPC targets, there are some differences due to the nature of the JVM platform. Any language feature not mentioned below can be expected to behave the same as it does on native FPC targets (except for accidental omissions). This behaviour is only guaranteed at the purely semantic level, and not in any way in terms of absolute or relative performance.

## General language features information

### Unsupported language features

• Turbo Pascal-style objects. Support may be added in the future, since they are very similar to records.
• Bitpacking, or indeed any other kind of feature that influences the data layout ({$packset xxx}, {$packrecords xxx}, {$packenum xxx}, ...).
• Class helpers. Possible to implement, but non-trivial. On the Java side, they would at best only be usable by explicitly calling the methods of these class helpers, since the Java language does not support automatically redirecting method calls from one class to another.
• Variants. Could be implemented, although using them in Java code would not be very convenient (Java does not support operator overloading).
• Delphi-style RTTI. Could probably be emulated.
• Resourcestring. Unknown how difficult/easy this would be to implement.
• Nested procedure variables. While both nested procedures and procedure variables are supported via emulation, the combination is not yet supported. This may be added in the future.
• Non-local goto. This will probably never be implemented (it may be theoretically possible via a combination of exceptions and intraprocedural gotos).
• Inline assembler. There is no support (yet?) for inline assembler, aka Java byte code.

### Partially supported language features

• System unit functionality. At this time, the system unit is extremely limited and misses support routines for several standard language features (including any kind of input/output, resource handling, and variants). Most such features can still be implemented in the future. The currently supported system routines are randomize/random, copy (on arrays and strings), halt, lo/hi, abs, sqr, odd, endian swapping routines, ror*/rol*/sar*/bsf*/bsr*, upcase/lowercase, runerror, insert, pos, delete, val, str and most math routines (cos, sin, etc).
• Pointers: it is possible to declare pointer types. It is however only possible to take the address of var/out/constref-parameters and of implicit pointer types. Pointer arithmetic is not supported. Indexing pointers to non-implicit pointer types as arrays is however supported when {$pointermath on} is activated. FIXME: currently it's always enabled.
• Variant records: the variant parts of variant records do not overlap in memory, and hence cannot be used to map the same data in different ways. As a result, they also do not save any memory unlike on native targets.
• Call-by-reference parameters (var/out/constref): the JVM does not support call-by-reference, nor taking the address of a local variable. For implicit pointer types, this is no problem since there the compiler always has a pointer to the actual data available. For other types, call-by-reference is emulated by the compiler via copy-in/copy-out, which means that changes are not immediately visible outside the called routine. The steps followed by the compiler are:
• construct an array of one element
• store the parameter value in the array (in case of var/constref)
• pass the array to the called routine
• the routine can change the value in the array
• on return, the potentially changed value is copied out of the array back into the original variable (in case of var/out)
• Untyped const/var/out parameters. These are supported in the same way as they are in Delphi.NET, see http://hallvards.blogspot.com/2007/10/dn4dp24-net-vs-win32-untyped-parameters.html (with the same limitations as regular var/out parameters on the JVM target)
• Include files. While include files will work fine at compile time, the Java class file format does not support referring to more than one source file. As a result, the compiler will only insert debugging line information for code in the main unit file. This limitation may be resolved in the future through the use of SMAP files as described in http://jcp.org/aboutJava/communityprocess/final/jsr045/index.html
• Resources. Currently, files specified in {$r xxx} directives will be copied without any further processing into a jar file with the same name as the program, under the directory org/freepascal/rawresources. If you add this jar file to the Java class path when executing the program, you can load the resource files using the JDK's built-in resource file helpers (see java.lang.Class.getResource() and java.lang.Class.getResourceAsStream()). Delphi-style resource support may be (partially) added in the future.
• Unit initialization code. If a unit is used from an FPC-compiled program, then the unit initialization code will be run on startup. If the main program is a Java program, then the unit initialization code will only run when the first method is entered that accesses a global constant or variable or calls a global procedure/function from that unit. Note: using classes, types or class variables from a unit does not by itself trigger executing the unit's initialization code. If you wish to manually trigger this, you can do so by adding a statement such as Object dummy = new unitName(); to your Java code.

### New language features

• {$namespace x.y.z} directive. While dotted unit names are not supported, a (global) namespace directive can be used to tell the compiler to put all definitions in the current unit under the specified Java package name.
• Formal class definitions. These do not exist in Java, but can be required to solve problems with circular references amongst Java classes. They are the same as their Objective-C equivalents, except that they use the class rather than objcclass keyword.
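The copy-in/copy-out emulation of call-by-reference parameters listed under the partially supported features can be sketched in Java, which is also roughly what Java code has to do by hand when calling such a routine. This is a minimal sketch under assumptions: `VarParamSketch`, `bump`, `counter` and `seenDuringCall` are hypothetical names, `bump` stands in for a Pascal routine with a `var` parameter of a 32 bit integer type, and the code the compiler actually generates may differ in detail.

```java
final class VarParamSketch {
    // Stand-in for a global Pascal variable.
    static int counter = 10;
    // Records what the routine saw in the global while it was running.
    static int seenDuringCall;

    // Stand-in for a Pascal routine "procedure bump(var x: longint)":
    // on the JVM the parameter arrives as a one-element array.
    static void bump(int[] x) {
        x[0] = x[0] + 1;          // the routine changes the value in the array
        seenDuringCall = counter; // the global itself has not changed yet
    }

    // The steps a caller performs for "bump(counter)".
    static void callBump() {
        int[] temp = new int[1];  // construct an array of one element
        temp[0] = counter;        // copy-in (var/constref)
        bump(temp);               // pass the array to the called routine
        counter = temp[0];        // copy-out on return (var/out)
    }

    public static void main(String[] args) {
        callBump();
        System.out.println(seenDuringCall); // prints 10
        System.out.println(counter);        // prints 11
    }
}
```

The value recorded inside the routine is still the old one, which is the visible difference with true call-by-reference: the update only reaches the original variable when the call returns.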

#### Ansistring and shortstring code page

The "ansi" code page is set by the Java program on startup. A default code page is chosen based on the environment on Linux and Windows. On Mac OS X, for legacy reasons the default code page is always MacRoman, which is almost never what you really want to use. See the Usage instructions for details on how to specify the default code page.

#### String conversions

The compiler will implicitly convert any string type to java.lang.String and vice versa.

### Type aliases and subrange types

The JVM supports neither type aliases nor subrange types. As a result, none of the marked type definitions in the following example will be visible to Java code. Of course, they still work as expected in Pascal code.

```pascal
type
  tc = class
  end;

  tenum = (ea,eb,ec);

  tc2 = tc; // Pascal-only
  intsub = 3..41; // Pascal-only
  enumsub = eb..ec; // Pascal-only
```

# Using JDK functionality

The JVM port of FPC includes a Java utility called javapp. It is based on the source code of the standard JDK program javap, which can be used to print the contents of a Java .class file.

The javapp utility can be used to create Pascal headers for compiled Java classes. The system unit contains the subset of the standard JDK classes that is required to implement standard language functionality. The rest of the standard classes that are part of the JDK 1.5 are available via the jdk15 unit. It includes all JDK classes from the java.*, javax.* and org.* hierarchies. Other classes from the JDK are not exported because they are platform-specific and/or can change between different JDK versions.

Some things to watch out for:

• All field names in the translated headers are prefixed by f. The reason is that Java is case-sensitive and the JDK often declares constants and fields with the same name (except for case) in a single class, which is invalid in Pascal.
• The javapp utility cannot handle circular references involving nested classes, because they cannot be expressed in Pascal. There is one such circular reference in the standard JDK, between java.awt.Window and java.awt.Dialog. This is currently worked around by declaring java.awt.Dialog as a formal class, which means that none of its methods, fields or nested classes are available.
• In case a field, constant or method name is a reserved Pascal keyword, it is escaped using &. You will have to do the same in your Pascal code to use them.

Run java -jar javapp.jar -help to see usage information.

# Using Android SDK functionality

The principle is the same as with the JDK, except that the javapp utility cannot fully automatically translate the Android SDK class hierarchy due to a circular class reference, a bug related to constants called create, and the use of an inner class. The FPC RTL for the Android/JVM target however includes a unit called androidr14 in which those problems are manually resolved. It contains header translations for all classes in R14 of the Android SDK (corresponds to Android 4.0) from the java.*, javax.*, org.*, junit.* and android.* hierarchies.

# Issues to watch out for

• Uninitialized values. The JVM requires that all variables are initialized before their first use, and similarly that function results are initialized before they are returned. Some specific situations that are likely to arise in existing Pascal code:
• Non-implicit pointer type variables passed to var-parameters must be initialized prior to the call. Alternatively, you can change such var-parameters into out-parameters.
• Even if a case-statement handles all possible situations that could arise while running the program (e.g., because it handles all possible values of an enumeration), you still have to initialize, in the else branch of that case-statement, any otherwise uninitialized variable that you use afterwards.
• If a variable is first initialized in a try block, the JVM will assume that an exception could happen before this initialization is executed and hence will consider it to be still uninitialized in the except and finally blocks, as well as after the try block.
• Uninitialized enumeration values. In Java, enumerations are classes. When constructing a new class with enumeration fields or when declaring a global variable whose type is an enumeration, FPC will automatically initialize them with the enumeration instance corresponding to the ordinal value 0. If no such enumeration exists, the field/variable will remain nil. The Java compiler does not perform any such initializations, which means that it is technically possible for Java code to pass nil pointers as enumeration parameters. This is not supported by FPC and if your code attempts to read such parameters it will abort with a java.lang.NullPointerException.
• Call-by-reference parameters. As described earlier, except for implicit pointer types all call-by-reference parameters (var, out, constref) are emulated via copy-in/copy-out. This means:
• updates to the parameter values will only be globally visible on return from the routine
• such parameters are inconvenient to use from Java code, because there the programmer will have to manually create the temporary arrays and copy the values in and/or out
• Calling constructors. The JVM requires that every constructor calls either another constructor for the same class, or an inherited constructor, before accessing any field or calling any method of the instance. If you do not call any other constructor at all, the compiler is required to insert a call to the parameterless constructor in the parent class. You are not allowed to call a constructor again on an already initialized instance to reinitialize it. Of all of these conditions, FPC currently only performs the automatic calling of the parent's parameterless constructor when required. It does not check the other requirements, which means you will get a run time error when you do not observe them.
• Classes with abstract methods. The JVM requires that classes containing one or more abstract methods are marked as abstract themselves. The compiler will do this for you if you do not do so yourself. However, unlike Pascal, the JVM will throw an exception when you try to instantiate an abstract class.
• Empty strings and dynamic arrays. In most cases, empty arrays and ansi/unicodestrings will be represented internally by a non-nil array/string with length 0. This is different from native FPC targets, where such arrays/strings are represented by nil pointers. This is a conscious choice to make it easier to interact with Java code, since in Java code a nil string will cause a NullPointerException rather than behaving like an empty string. The same goes for arrays. Assigning nil to dynamic arrays to initialize them with an empty array is still completely valid though (the compiler will convert such assignments to emptying the array).
• Synchronized. Access to Java's synchronized functionality is currently not yet exposed (neither the method modifier nor synchronized blocks). This will still be added in the future, in a way that also works for native targets.
• Known bugs. See FPC_JVM/Usage#Known_bugs.
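The items above on uninitialized enumeration values and on empty strings both come down to how nil (null in Java) references behave on the JVM. The following Java sketch shows that behaviour from the Java side; `NullVersusEmpty`, `Colour`, `ordinalOf` and `lengthOf` are hypothetical names chosen for the example, and the sketch says nothing about how FPC-generated code reports the error beyond the exception type named above.

```java
final class NullVersusEmpty {
    // In Java, an enumeration is a class and its values are instances.
    enum Colour { RED, GREEN }

    // Reading an enumeration parameter that is null fails with a
    // NullPointerException; a real instance works as expected.
    static String ordinalOf(Colour c) {
        try {
            return "ordinal " + c.ordinal();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    // A null string does not behave like an empty string in Java,
    // whereas a non-null string of length 0 does.
    static String lengthOf(String s) {
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(ordinalOf(Colour.RED)); // prints ordinal 0
        System.out.println(ordinalOf(null));       // prints NullPointerException
        System.out.println(lengthOf(""));          // prints length 0
        System.out.println(lengthOf(null));        // prints NullPointerException
    }
}
```

This is why representing empty strings and dynamic arrays as non-nil values of length 0 makes values passed to Java code safer to use, and why Java code should never pass null for an enumeration parameter of a Pascal routine.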