rochus-keller / LisaPascal

A parser and browser for the Lisa source code published by the Computer History Museum

This is a parser and code navigator for the Lisa Pascal dialect and associated assembler which I started to implement on January 22, 2023. The reason was that the Computer History Museum published the Apple Lisa source code on January 19, 2023 (see https://computerhistory.org/blog/the-lisa-apples-most-influential-failure/), and I was looking for a tool to study/analyze the code.

Lisa Pascal was an extension of the Pascal used on the Apple II and III, but not yet object-oriented. I found a specification of the language here which I used to implement the parser. During the project it turned out that the Lisa OS source tree (i.e. not only the Lisa Toolkit) also includes Clascal (~5%), and that a preprocessor had to be implemented as well to understand the code.

To develop and debug the grammar I used my EbnfStudio tool. I started from a Pascal grammar corresponding to the 1975 version of Wirth and Jensen's "Pascal User Manual & Report" which I found here. I then modified this grammar for EbnfStudio compatibility, fixed left recursion and LL(1) ambiguities, and then correlated the grammar with the mentioned "Pascal Reference Manual for the Lisa", which led to quite a few additions, removals and other changes. After writing a lexer and generating a parser using Coco/R (based on the grammar file generated by EbnfStudio) I adapted the grammar to the actual syntax used in the Lisa source files.
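The "fixed left recursion" step can be illustrated with a toy rule. This is only a sketch: the rule and the function name parse_expression are made up for illustration and are not taken from the Lisa grammar or from the Coco/R output. A rule such as expression ::= expression addop term | term cannot be handled by an LL(1) parser, whereas the equivalent expression ::= term { addop term } can, and a loop keeps the left associativity.

```python
# Toy illustration (not from the Lisa grammar): the left-recursive rule
#   expression ::= expression addop term | term
# would recurse forever in a top-down (LL) parser, so it is rewritten as
#   expression ::= term { addop term }
# which accepts the same token sequences.

def parse_expression(tokens):
    """Parse term { ('+' | '-') term } and return a nested tuple as AST."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("+", "-"):
            raise SyntaxError(f"operand expected at token {pos}")
        tok = tokens[pos]
        pos += 1
        return tok

    node = term()
    # The loop replaces the left recursion; one token of lookahead
    # (the operator) is enough to decide whether to continue.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        node = (op, node, term())  # building the tree here keeps left associativity
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return node
```

Calling parse_expression(["a", "-", "b", "+", "c"]) groups the subtraction first, as the original left-recursive rule would.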

The Lisa source code provided by the Computer History Museum contains 1092 plain text source files; further analysis revealed that 614 of them are Pascal and ~203 are MC68000 assembler files; another 126 files are EXEC scripts, and the remaining 144 are emails, documentation, protocols and text resources. About 120 files look like binary encoded sources and other texts which I have not inspected yet.

Cloc (http://cloc.sourceforge.net) counts 456 kSLOC, of which 408 kSLOC are Pascal, 45 kSLOC are Assembler and 4 kSLOC are EXEC scripts.

Lisa Code Navigator

LisaCodeNavigator Screenshot

Planned features

My primary goal was to implement tools, as I did e.g. for Oberon or Smalltalk, to analyze the Lisa source code and to check it e.g. for completeness and for the feasibility of compiling and running it on a Lisa emulator (such as https://lisa.sunder.net/).

  • Lisa Pascal and Clascal parser, adapted to the source code at hand
  • Preprocessor to include files and employ conditional code directives
  • Overlay file system to accommodate original structure and resolve dependencies
  • Highlighted code browser
  • Mark symbols and navigate from symbols to declarations across all files
  • BUSY & LeanQt build
  • Resolve qualifiers by type for navigation of record fields
  • Precompiled binary versions for main platforms
  • Module detail outline view
  • MC68000 assembler integrated with code model and symbol navigation

Features in evaluation

  • Class hierarchy outline view
  • Transpiler to Free Pascal for MC68000 code generation
  • Run the code on a Lisa emulator

Status on January 23, 2023

The parser works in principle: 381 of the 614 Lisa Pascal files found in the source tree parse without an error. Most of the other 233 Lisa Pascal files fail with the same parser error, which occurs because the files are either incomplete or contain alternative parts of the code in the same file. It turned out that the Lisa source code heavily depends on a preprocessor, which is controlled by source code directives of the form {$directive arguments}. There is e.g. a {$I filepath} directive to include other source files, or an {$IFC variable} directive to enable or mute sections of the file (like #ifdef/#endif in C). Without this preprocessor, which I have not implemented (yet), the mentioned syntax errors cannot be avoided. For example, I had to add a "non_regular_unit" to the grammar so that the parser can handle files which contain only a part of a program or a unit; but this is only a provisional fix, and to continue the project the implementation of the preprocessor is unavoidable.
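To get a feeling for the {$directive arguments} form, the following minimal sketch lists the directives found in a piece of source text. It is not the scanner used in this project; the regular expression, the function name find_directives and the sample path in the usage note are assumptions made for illustration, and only the brace form shown above is recognized.

```python
import re

# Matches a comment that starts with "{$", e.g. {$I somefile} or {$IFC debug}.
# Assumption: the directive name consists of letters only and the
# arguments run up to the closing brace.
DIRECTIVE = re.compile(r"\{\$([A-Za-z]+)\s*([^}]*)\}")

def find_directives(source):
    """Return (line_number, NAME, arguments) for each directive in source."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for m in DIRECTIVE.finditer(line):
            found.append((lineno, m.group(1).upper(), m.group(2).strip()))
    return found
```

Applied to a text containing {$I libos/syscall} on the first line and {$IFC debug} on the second, the function reports one I directive and one IFC directive together with their line numbers (the path libos/syscall is a made-up example).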

Status on January 26, 2023

The source tree is converted to a virtual file system using the original file names, considering Pascal files only. All program and unit files of this file system are parsed and include directives are resolved. Of the 360 program or unit files (with includes), 199 parse without an error. The Lisa Toolkit uses Clascal, which is not yet supported and leads to parsing errors.
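The combination of a virtual file system and include resolution can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the virtual file system is modelled as a plain dict from lower-cased original file names to text, names are compared case-insensitively, each {$I filepath} directive is assumed to stand alone on its line, and all names (expand_includes, vfs, missing) are hypothetical.

```python
import re

# {$I path}: whitespace is required after "I", so {$IFC ...} does not match.
INCLUDE = re.compile(r"\{\$I\s+([^}\s]+)\s*\}", re.IGNORECASE)

def expand_includes(name, vfs, missing, active=()):
    """Return the lines of `name` with include lines replaced by file contents.

    vfs:     dict mapping lower-cased virtual file names to source text
    missing: list that collects the names of include files not found
    active:  names currently being expanded, used to detect cycles
    """
    key = name.lower()
    if key in active:
        raise RecursionError(f"circular include of {name}")
    if key not in vfs:
        missing.append(name)  # report and continue instead of aborting
        return []
    out = []
    for line in vfs[key].splitlines():
        m = INCLUDE.search(line)
        if m:
            out.extend(expand_includes(m.group(1), vfs, missing, active + (key,)))
        else:
            out.append(line)
    return out
```

Collecting missing files in a list rather than failing on the first one makes it possible to parse the rest of the tree and to report all missing include files at the end.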

Status on January 28, 2023

Conditional compilation and the Clascal syntax are implemented and tested. The LISA_OS part of the tree (about 200 kSLOC) parses without syntax errors, apart from four missing include files. The full source tree (about 400 kSLOC measured with my tools) has 12 missing include files and 7 syntax errors. The 400 kSLOC parse on my machine in less than 7 seconds.
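The idea of conditional compilation can be sketched with a small line filter. Several simplifications are assumed here: the condition is a single variable name rather than an expression, each directive stands alone on its line, and the closing and alternative directives are taken to be {$ENDC} and {$ELSEC} (these two names are not mentioned in this text and should be checked against the Lisa Pascal manual). The function name apply_conditionals and the flags dict are hypothetical.

```python
import re

# Assumed directive names: IFC (from the text above), ELSEC and ENDC (assumption).
COND = re.compile(r"\{\$(IFC|ELSEC|ENDC)\b\s*([^}]*)\}", re.IGNORECASE)

def apply_conditionals(lines, flags):
    """Keep only the lines inside active conditional regions.

    flags maps lower-cased variable names to booleans; unknown names are False.
    """
    out = []
    stack = []     # one (parent_active, condition) pair per open {$IFC}
    active = True  # whether lines are currently emitted
    for line in lines:
        m = COND.search(line)
        if not m:
            if active:
                out.append(line)
            continue
        kind = m.group(1).upper()
        if kind == "IFC":
            cond = flags.get(m.group(2).strip().lower(), False)
            stack.append((active, cond))
            active = active and cond
        elif not stack:
            raise SyntaxError(f"{kind} without matching IFC")
        elif kind == "ELSEC":
            parent, cond = stack[-1]
            active = parent and not cond
        else:  # ENDC closes the innermost region
            parent, _ = stack.pop()
            active = parent
    if stack:
        raise SyntaxError("missing ENDC")
    return out
```

The stack stores whether the enclosing region was active, so a nested region inside a muted section stays muted regardless of its own condition.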

The CodeNavigator shows the overlay file system (i.e. the one assumed by the source files) with all programs and units; the files can be browsed in a syntax highlighted viewer; semantic navigation is pending.

Status on January 29, 2023

An AST (as far as required) is constructed from the syntax, and symbols are resolved (including imported modules) for cross-referencing. Source code navigation works (e.g. press the Ctrl key while moving the cursor over idents, and click to navigate to the declaration if an underline is visible), but not all idents are indexed yet. Symbol lookup is too slow in the current implementation; I will do it as in my Oberon+ IDE, with pointer comparisons instead of string comparisons, which requires an extra copy of the ident in lower case. The analysis of the about 400 kSLOC currently requires 11 seconds (instead of 7) on my machine.

Status on January 30, 2023

After a lot of debugging and fixing, source code navigation works as expected (apart from some symbols not yet indexed), and the "Used by" cross-reference list is implemented and can be used for navigation (double-click). Navigation history is also improved (ALT+Left, ALT+Right) and synced with the views. The indexer now considers the full syntax, including modules and imports, but qualifiers are not yet resolved. I also implemented internalized strings with comparisons by address instead of by string, and now indexing takes only ~8% more time than just parsing. There is now also a BUSY build file; see below how to use it.
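The principle behind internalized strings can be shown in a few lines. The project itself is written in C++ and compares addresses; the sketch below only mirrors the idea in Python, where the identity operator "is" plays the role of the pointer comparison. The class name SymbolTable and its method are hypothetical.

```python
class SymbolTable:
    """Interns identifiers so that equality checks become identity checks."""

    def __init__(self):
        self._pool = {}  # lower-case spelling -> the single shared string object

    def intern(self, ident):
        # Pascal identifiers are case-insensitive, so the lower-case copy
        # is used as the canonical spelling.
        key = ident.lower()
        # setdefault stores the key on first sight and returns the already
        # pooled object on every later call.
        return self._pool.setdefault(key, key)
```

After a = table.intern("WriteLn") and b = table.intern("WRITELN"), the test a is b succeeds without comparing any characters; the string comparison cost is paid once per occurrence when interning, not on every symbol lookup.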

Status on February 8, 2023

The MC68000 assembly language parser is implemented and integrated; Pascal procedures marked as external look for assembler implementations in the same virtual directory; the majority of names can be resolved that way (776 vs 534). I also implemented some convenience features like grayed-out sections in Pascal, a resizable browser font, marks for unresolved symbols, and more Clascal support. With this release, all features planned so far have been implemented.
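The lookup of external procedures in the same virtual directory could look roughly like the following sketch. The data layout (a list of external procedures and a dict of names defined per assembler file), the case-insensitive matching and all file and procedure names are assumptions for illustration; how the real code model collects the assembler names is not shown here.

```python
import posixpath

def resolve_externals(externals, asm_symbols):
    """Map each external Pascal procedure to an assembler file defining it.

    externals:   list of (pascal_file_path, procedure_name)
    asm_symbols: dict mapping assembler file path -> set of defined names
    Returns (resolved, unresolved); only assembler files in the same
    directory as the Pascal file are considered.
    """
    by_dir = {}  # lower-cased directory -> {lower-cased name: assembler file}
    for path, names in asm_symbols.items():
        table = by_dir.setdefault(posixpath.dirname(path).lower(), {})
        for name in names:
            table.setdefault(name.lower(), path)
    resolved, unresolved = {}, []
    for path, proc in externals:
        table = by_dir.get(posixpath.dirname(path).lower(), {})
        target = table.get(proc.lower())
        if target is None:
            unresolved.append((path, proc))
        else:
            resolved[(path, proc)] = target
    return resolved, unresolved
```

Restricting the search to one directory keeps the lookup cheap and avoids false matches between unrelated parts of the tree, at the price of leaving some names unresolved.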

Precompiled versions

The following precompiled versions are available at this time:

How to build the parser and code navigator

The executable can be built on all common platforms using regular Qt 5.x or using LeanQt with minimal dependencies.

The project includes the CodeNavigator.pro file which can be opened and built in Qt Creator or directly with qmake on the command line.

To build the Code Navigator using LeanQt and the BUSY build system (with no other dependencies than a C++98 compiler) instead, do the following:

  1. Create a new directory; we call it the root directory here
  2. Download https://github.com/rochus-keller/LisaPascal/archive/refs/heads/master.zip and unpack it to the root directory; rename the resulting directory to "LisaPascal".
  3. Download https://github.com/rochus-keller/LeanQt/archive/refs/heads/master.zip and unpack it to the root directory; rename the resulting directory to "LeanQt".
  4. Download https://github.com/rochus-keller/BUSY/archive/refs/heads/master.zip and unpack it to the root directory; rename the resulting directory to "build".
  5. Open a command line in the build directory and type cc *.c -O2 -lm -o lua or cl /O2 /MD /Fe:lua.exe *.c depending on whether you are on a Unix or Windows machine; wait a few seconds until the Lua executable is built.
  6. Now type ./lua build.lua ../LisaPascal (or lua build.lua ../LisaPascal on Windows); wait until the LisaCodeNavigator executable is built; you find it in the output subdirectory.

Support

If you need support or would like to post issues or feature requests, please use the GitHub issue list at https://github.com/rochus-keller/LisaPascal/issues or send an email to the author.

About

License: GNU General Public License v3.0

Languages

C++ 99.4%, QMake 0.6%