Difference between revisions of "ZSeries/Part 1"

From Free Pascal wiki

Revision as of 21:38, 17 January 2012

Introduction

Go back to zSeries | Onward to Part 2

My name is Paul Robinson. I am the chief programmer at Viridian Development Corporation, which has decided to develop a cross-compiler version of Free Pascal for the IBM 370/390/zSeries mainframe computer. I decided it would be a learning experience and would allow me to better understand how the Free Pascal compiler works. I also didn't particularly want to work with a compiler written in C (such as the GCC Pascal Compiler); I wanted to work with one written in Pascal.

There is an existing more-or-less open source compiler for the 370 architecture, for which I have a copy of the sources and run-time library. It was modified from an earlier, incrementally updated Pascal compiler called P6 or P7 (depending on which release it was). I think it was P6 when it was on the DECSYSTEM-20 mainframe back in the late 1970s and early 1980s (I have the source to that one too), and it may have been P7 by the time it was upgraded for the latest release for Control Data Corporation computers. (When Niklaus Wirth was creating the Pascal language at ETH in Zurich, that's what they had, so Pascal originally started with Control Data computers. In fact, because of the functionality provided by the existing Control Data libraries, Wirth's first attempt at a Pascal compiler was written in Fortran. I am not kidding.)

This version of the P6/P7 Pascal compiler (the internal comments refer to it as "Stepwise refinement of a Pascal Compiler") was developed by the Australian Atomic Energy Commission. The only problem with it is that it's over 25 years old and only supports Standard Pascal. No objects, no strings (you can do "array [1..256] of char" but concatenating two strings? Write your own function!), and it doesn't even come up to the level of capability of Turbo Pascal 3 for DOS. It's old, and I wanted to work with something more recent. I can, however, borrow from it to figure out how some Pascal keywords are translated into 370 code.

I also should have a copy of the Stanford Pascal Compiler sources (also over 20 years old), and I believe that the CBT tapes archive (a huge set of over 200 mainframe source code magnetic tapes originally collected by Connecticut Bank and Trust) may contain either a Pascal Compiler or some other compiler source I can use. It also contains lots of IBM 360/370 assembly language sources which will be useful. All these resources and others should help in doing this implementation.

Background on this project

I am aware, first, that this will be a huge undertaking; this is not a weekend project. I'm probably looking at a minimum of six months' work, possibly longer. I have a story. When I wasn't programming, I worked as a mobile notary. I was, then, a commissioned Notary Public for the Commonwealth of Virginia (I'm now commissioned in Virginia and Maryland), and one of the services that had developed was that, when people refinanced their mortgages, the company would send a notary right to the person's home or office. Well, one of the customers happened to discuss how they wanted to upgrade the application that their people ran to handle booking of services. It was a DOS application; they wanted to do more things with it and they wanted it to be a GUI program under Windows. At that time Free Pascal did not have the equivalent of Lazarus, so I unfortunately had to use Visual Basic to do so. (Please do not send me hate mail; I have to use what I have available.)

Well, anyway, I had several meetings with the customer and arranged terms, including price. I worked on it on a regular basis and implemented it through "accretion", in which you get part of it working and they can see how it's coming along. We had to change things along the way, with various fits and starts, but in the end they were very pleased with the program, which their employees use to book orders on their laptop computers. It wasn't a very big program in terms of what was going on, and they have it running on about 60 computers, but from when we first sat down to decide what they wanted until it was fully running and distributed to everyone took one solid year.

So the point of this shaggy dog story is that I'm aware that this is a long-term project and will probably take months. I am also aware that the compiler won't work when I first make changes because I have to learn where the main part of the compiler hands off control to the machine-specific and OS-specific routines so that it can ask that routine, "Hey, the user is implementing a FOR statement, you need to create the code to do this." or "Hey, the guy is declaring the start of a procedure that has an integer argument, here's the information on how it's supposed to be defined."

So anyway, I know there's a lot involved, there will be fits and starts, and things won't always go right. But it's a learning experience, and if you follow this as it progresses over the months, maybe you'll learn something too.

Before Getting Started

The normal distribution includes most of the sources, including the run-time library, but does not include the sources to the compiler itself because most people do not need them and they take up about another 40 MB. You need to obtain the zip/tar archive file for the compiler from the download location you're using for the rest of Free Pascal (probably SourceForge or a mirror), extract the compiler directory from that archive, and include it with the 2.6.0 source release.

The first thing to do is to create a new directory and copy the Compiler subdirectory (all of its files and all of its subdirectories) into it, in order not to contaminate the pristine sources of the current compiler. This compiler will be a cross-compiler: it will run on a PC and will generate an assembly language file for the 370 architecture using the standard High-Level Assembler syntax. That file will be uploaded to the target mainframe (real or simulated) and run through the High-Level Assembler there.

You would also want to copy the rtl directory (which is normally outside of the Compiler directory) because you may need to modify some files there. That will also require an i370 subdirectory for its run-time library, which might be different depending on which mainframe OS is targeted. We'll worry about that later.

A new i370 subdirectory will also be created within the Compiler directory for all the local files related to that architecture.
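The setup described above can be sketched as a short Python script. This is only an illustration under stated assumptions: the function name make_sandbox and both path arguments are hypothetical, and the directory names "compiler" and "rtl" are assumed to sit directly under the root of the unpacked 2.6.0 sources.

```python
import shutil
from pathlib import Path


def make_sandbox(fpc_root, sandbox):
    """Copy the compiler and rtl trees into a separate sandbox directory
    and add an empty i370 subdirectory to each copy.

    fpc_root and sandbox are hypothetical paths chosen by the caller.
    The pristine tree under fpc_root is only read, never modified.
    """
    fpc_root = Path(fpc_root)
    sandbox = Path(sandbox)
    created = []
    for name in ("compiler", "rtl"):
        # copytree creates the destination (and any missing parents)
        # and refuses to overwrite an existing copy.
        shutil.copytree(fpc_root / name, sandbox / name)
        target_dir = sandbox / name / "i370"
        target_dir.mkdir(exist_ok=True)
        created.append(target_dir)
    return created
```

Calling make_sandbox with the unpacked source directory and a fresh directory name returns the two new i370 directories. Because copytree raises an error when the destination already exists, running it twice will not silently clobber work already done in the sandbox.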

Note that the pages here will basically just walk through what was done. If a correction is more than a few lines, the reader will be directed to the replacement source file. Once the work is completed, a zip file containing all of the new or changed source files will be available.

Issues

There are a number of issues when doing this. The 370 has a number of quirks that set it apart from the Wintel architecture or Mac hardware:

  • It's big-endian
  • It uses non-IEEE floating point so it may have different limit values
  • While it has more registers (16 general-purpose registers, numbered 0 through 15), about 5 of these are generally not usable due to conventions or hardware requirements
  • The amount of memory you can directly address at one time is much smaller: an instruction can only reach about 4K (4096 bytes) past the base register it names, whether that is code or data. If you're working with two pieces of data, each may be up to 4K in size and still be worked with directly.
  • Depending on whether you target a 370, a 390 or a zSeries you may have access to a larger address space (a plain 370 uses 24-bit addresses, the 390 uses 31-bit addresses and zSeries adds 64-bit addresses) and to a much larger area than 4K. Since the target I'm going to be using is a 370, I have to restrict the code to a 4K "page" and 32-bit addressing (of which a plain 370 only uses the low 24 bits).
  • There are several different operating systems that could be targeted, such as
    • The MUSIC/SP emulator I'm using (not going to be very popular as MUSIC was essentially deprecated by its distributor, McGill University in Montreal)
    • a program running on a terminal under VM/370
    • a program running on the TSO timesharing system
    • a program running as a batch job on OS/VS1
    • a program potentially running as a screen application on the CICS terminal monitor (very similar to how Windows programs work, with a few gotchas) or
    • a program running on Linux/370.

This issue of where the program will run will be dealt with by using generic I/O instructions (basically private macros) and having an appropriate run-time library for the particular system.

  • The IBM 370/390/zSeries uses the EBCDIC character set, while PCs use ASCII (or Unicode). Unicode support may be available but I won't depend upon it. Where the program has to generate constants, only standard, known characters will be used because they translate fine from ASCII to EBCDIC. If there is anything important where the value matters, hexadecimal constants will be used.
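Three of the quirks listed above (byte order, the 4K reach past a base register, and EBCDIC) can be seen from the PC side with a few lines of Python. This is a minimal sketch, not part of the compiler: the function names are made up for illustration, and code page 037 is an assumption, since the EBCDIC code page actually in use depends on the target system.

```python
import struct

PAGE = 4096  # a 12-bit displacement reaches 0..4095 bytes past a base register


def fullword(value):
    """Pack a signed 32-bit integer the way the 370 stores it: big-endian."""
    return struct.pack(">i", value)


def base_and_displacement(offset):
    """Split a byte offset into (4K block number, displacement within it).

    A block number above 0 means one base register is not enough to
    reach the item directly.
    """
    return divmod(offset, PAGE)


def hex_constant(text):
    """Render text as an assembler hexadecimal constant in EBCDIC.

    Assumes code page 037 (Python's "cp037" codec).
    """
    return "DC    X'" + text.encode("cp037").hex().upper() + "'"


if __name__ == "__main__":
    print(fullword(1).hex())            # 00000001
    print(struct.pack("<i", 1).hex())   # 01000000 (what a PC stores)
    print(base_and_displacement(5000))  # (1, 904)
    print(hex_constant("HELLO"))        # DC    X'C8C5D3D3D6'
```

Writing the constant in hexadecimal means the bytes that reach the mainframe do not depend on whatever ASCII-to-EBCDIC translation the upload step applies to the rest of the file.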

Target Choice

I don't really have access to a real zSeries or 390. I have a 370 simulator running on my PC that has an operating system (MUSIC/SP) and an assembler, so the cross-compiler will be restricted to 370 instructions and 32-bit addressing (so the CPU target will be "i370"), with a potential for upgrading this if circumstances permit. I'll also declare the target operating system to be OS/VS1; since Pascal doesn't permit / in an identifier, it will be "osvs1".

It's entirely possible to design the compiler so that it targets a generic "vanilla" system like this, then add units for different mainframe operating systems as I figure out how, so that one can say "uses cics" or "uses vm" or "uses musicsp" (or God help us, "uses dosvse"!) and then have functionality such as ExecCICS(blah,blah,blah) and then read the public parameters or use functionality provided by that operating system. I think that's probably the way to go.

To Begin

This compiler is huge. It's hundreds of source files, and is going to be an enormous task. Where do you start? Well, you start with the main program of the command-line compiler, and you look at it. That file is pp.pas. Recursively follow the sources of every unit referenced by it or any unit they reference until every one has been done (I more-or-less explain this in Part 2) and then you know that you caught all the places you might need to declare, add or change something. (It also gives you at least a fleeting understanding of what each unit does.)
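The "follow every unit" walk can be partly automated. The sketch below is one assumed way to do it in Python and is not how the survey in Part 2 was carried out. All function names are hypothetical, and the parsing is deliberately crude: comments and compiler directives are blanked out before looking for uses clauses (so units named inside any conditional branch are all picked up), while string literals and include files are ignored.

```python
import re
from pathlib import Path

# Crude patterns: good enough for a first survey, not a real Pascal parser.
COMMENT_RE = re.compile(r"\{.*?\}|\(\*.*?\*\)|//[^\n]*", re.DOTALL)
USES_RE = re.compile(r"\buses\b(.*?);", re.IGNORECASE | re.DOTALL)


def units_used(source):
    """Return the lower-cased unit names from every uses clause in source."""
    text = COMMENT_RE.sub(" ", source)
    names = []
    for clause in USES_RE.findall(text):
        for item in clause.split(","):
            parts = item.split()
            if parts:
                # "foo in 'foo.pas'" still yields "foo"
                names.append(parts[0].lower())
    return names


def find_unit(name, search_dirs):
    """Look for name.pas or name.pp in each search directory."""
    for directory in search_dirs:
        for ext in (".pas", ".pp"):
            candidate = Path(directory) / (name + ext)
            if candidate.is_file():
                return candidate
    return None


def walk_units(main_file, search_dirs):
    """Map every unit reachable from main_file to its source path.

    Units whose source is not found in search_dirs map to None.
    """
    seen = {}
    pending = [Path(main_file)]
    while pending:
        path = pending.pop()
        for name in units_used(path.read_text(encoding="latin-1")):
            if name in seen:
                continue
            found = find_unit(name, search_dirs)
            seen[name] = found
            if found is not None:
                pending.append(found)
    return seen
```

Pointing walk_units at pp.pas in the sandbox, with the compiler directory and its subdirectories as search_dirs, gives a checklist of files to read. Units that map to None are the ones whose sources live somewhere else, for example in the run-time library.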

Note that line numbers indicated in any source file are from the version 2.6.0 compiler sources. As lines are added, the line numbers of everything below them will increase, so changes within a file will be described from top to bottom; that way each reference should still match the file as it stands at that point. Also, so as not to brand this as "Windows-centric", since the hope is to build a cross-compiler for i370 that could run on either Windows or Linux, directory separators in file names will be written as /.

Note that from this point on, all editing occurs in our "sandbox" directory separate from the original compiler. So let's get started, with Part 2 of this article.
