pts / insight-mininasm

compiling the Insight real-mode DOS 16-bit debugger with mininasm

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

compiling the Insight real-mode DOS 16-bit debugger with mininasm
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This project is a case study of porting NASM assembly source code to
mininasm, more specifically making the NASM source code of the Insight
debugger compilable with mininasm (and also remaining compilable with NASM),
with the goal of producing the same insight.com executable program with NASM
and mininasm, both identical to the file officially released.

This goal has been reached, the commit history of this project shows the
journey, and the remaining of this README explains the steps needed and
their reasons.

Relevant software:

* NASM: the Netwide Assembler (NASM), x86 (16-bit, 32-bit and 64-bit)
  assembler written in C: https://www.nasm.us/ .

  Program size for NASM 0.98.39 (released on 2005-01-20) is ~220 KiB for DOS
  8086, ~274 KiB for Linux i386 and ~325 KiB for Win32. (The large spread is
  because of the different integer sizes, C libraries, compilers and program
  file formats.)

* mininasm: NASM-compatible mini assembler for 8086, 186 and 286, able to
  run on DOS and on modern systems, written in C (with alternative
  implementation fork written in itself): https://github.com/pts/mininasm

  mininasm is indeed mininalistic: compiled program size is ~21 KiB for DOS
  8086 and Linux i386, and ~25 KiB for Win32.

  mininasm aims for compatibility with NASM 0.98.39 (released on 2005-01-20).

  The mininasm executable programs that are able to compile the modified and
  simplified Insight source code can be downloaded from
  https://github.com/pts/mininasm/releases/tag/v6

* Insight real-mode DOS 16-bit debugger, version 1.24, released on
  2009-01-11, written mostly in NASM, released under the GPL v2 license.

  Insight is a very small debugger (insight.com is 32935 bytes) for
  analyzing real-mode DOS programs. It features a 486 disassembler, an 8086
  assembler, ``Trace into'' and ``Step over;; functions, simple breakpoint
  handling, extended code or data navigation, simple color-highlighting, and
  a nice menu-driven interface comparable to Borland's Turbo Debugger.

  Project page with download including source code:

  * http://www.bttr-software.de/products/insight/
  * http://www.bttr-software.de/products/insight/insig124.zip

  The same package also available in the FreeDOS repositories, with some
  additonal FreeDOS metadata files and the original Insight 1.24 files
  (program, documentation and source), labeled as version 1.24a:

  * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.2/repos/pkg-html/insight.html
  * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/distributions/1.2/repos/devel/insight.zip
  * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/repositories/latest/pkg-html/insight.html
  * https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/repositories/latest/devel/insight.zip

  The difference between the original release (insig124.zip) and the FreeDOS
  archive (insight.zip):

  * Different capitalization of the filenames.
  * A bit different directory layout.
  * The `appinfo' directory is only in the FreeDOS archive, because it's
    specific to FreeDOS.

* kvikdos: a very fast headless DOS emulator for Linux (with KVM):
  https://github.com/pts/kvikdos

  kvikdos can optionally be used to run the DOS 8086 host version of NASM,
  to double check that it generates the same output as the Linux host
  version.

Step 1. Download the Insight release including the source code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
These files were downloaded:

* http://www.bttr-software.de/products/insight/insig124.zip
* https://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/repositories/latest/devel/insight.zip

Downloaded files were saved to:
https://github.com/pts/insight-mininasm/releases/tag/orig

The target executable program insight.com (32935 bytes) was saved to:
https://github.com/pts/insight-mininasm/releases/download/orig/insight.com

Source code and build scripts were extracted from the downloaded .zip
files, and saved to https://github.com/pts/insight-mininasm/releases/download/orig/insight_1.24_src.zip
, and also to this Git repository.

Step 2. Run all the build steps manually which produce input for NASM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The DOS tools mkhelp.com and uclpack.exe were run in DOSBox to generate data/page1.xlp and
data/page2.xlp.

The files unnecessary after the generation (e.g. mkhelp.com, uclpack.exe and
their source code, data/page1.hlp and data/page2.hlp) were removed.

Step 3. Change the source code so that it builds with NASM 0.98.39
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Backslashes (`\') were replaced with slashes (`/') in the pathnames in
actdata.inc, for Linux NASM compatibility.

The original insight.com (Insight version 1.24, 32935 bytes) was added, to
facilate comparisons in the future.

The DOS makefile was replaced with Linux compilation shell script
compile.sh. NASM optimization level was updated from `-O4' (buggy in NASM
0.98.39) to `-O9' (between to makefile and the shell script).

Compilation with NASM 0.98.39 `-O9' (tried on Linux and also on DOS)
produces insight.com identical to the release.

Detection and use of local ./nasm-0.98.39.li3 executable program was added
to compile.sh.

Detection and use of kvikdos and local nasm98.exe executable program was
added to compile.sh. This is for running the DOS 8086 host version of NASM
in a DOS emulator, to double check that it generates the same output as the
Linux host version.

NASM 0.98.39 executable programs for Linux i386 (also works on Linux amd64),
Win32 (works on 32-bit and 64-bit Windows) and DOS 8086 were added to
https://github.com/pts/insight-mininasm/releases/tag/nasm-0.98.39 .

Source files (and pregenerated source files) were moved from the data/
directory to the main repository directory.

The macro SCR_WIDTH was moved from the command line to insight.asm.

The macro COL_SCH was moved from the command line to insight.asm.

The now unused source files data/colors_b.inc and data/colors_c.inc were
removed.

Step 4. Change the source code so that it builds with mininasm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Local labels with prefix `@@' were renamed to prefix `.'. This made it
possible to drop the `-t' command-line switch of NASM, which mininasm
doesn't support. A small Perl scipt was used to automate this:

  $ perl -pi -e 's/[@][@]/[.]/g' *.asm *.inc

Source code of non-8086 code was changed:

* Very little part of Insight optionally takes advantage of CPU features 386
  and 586 (Pentium) processors, and it has corresponding blocks of assembly
  code under `cpu 386' and `cpu 586', something like this:

    cpu 386
    pushfd  ; db 0x66, 0x99
    pop eax  ; db 0x66, 0x58
    mov ecx, eax  ; db 0x66, 0x98, 0xc1
    cpu 8086

  Please note that Insight doesn't switch to 32-bit protected mode (`bits
  32'), so the `db 0x66' prefix is still used with these instructions.

  The example above was replaced with:

    db 0x66
    pushf  ; db 0x99
    db 0x66
    pop ax  ; 0x58
    db 0x66
    mov cx, ax  ; db 0x98, 0xc1

  All these instances were replaced manually, looking at the output of
  `ndisasm -b 16' for the actual `db' values.

  Most of the `db' values were instances of the 386 size prefix `db 0x66',
  and there was also the 586 `cpuid' instruction `db 0x0f, 0xa2'.

Multi-line macro `reg_pos' was replaced half-manually (using a short, costom
Perl script) with its expansions. That was needed because mininasm doesn't
support multi-line macros.

Single-line macros `BUILD' and `SERIAL_NUMBER' were manually replaced with
their string literal expansions. That was needed because mininasm supports
only integer-valued single-line macros.

Single-line macros in colors_k.inc were replaced with `equ' labels:

* Example:

    %define MAIN_BG BLACK ; Old, doesn't work in mininasm.
    MAIN_BG equ BLACK     ; New, works in mininasm
    %assign MAIN_BG BLACK ; This would also work in mininasm.

* That was necessary, because in mininasm, the value of a single-line macro
  defined with `%define' must be an integer literal (possibly prefixed with
  unary operators). This restriction is there because single-line macro
  facilities in mininasm are very limited, and they are implemented using
  (multiple) labels, and this restriction is only sane way to prevent
  deviation from NASM macro expansion semantics.

The empty value of the single-line macro BORLAND_MENU was replaced with the
value `1', because of the above restriction on `%define'.

Moved the non-integer part of the single-line macro `follow_stack' to the
expansion locations:

* Original:

    %define PaletteBuffer (window_buffer + STRING_BUF_SIZE)
    %define follow_stack PaletteBuffer
    cmp si, follow_stack  ; A few more like this in the source.

* Replacement:

    follow_stack_mmm equ window_buffer + STRING_BUF_SIZE
    cmp si, follow_stack_mmm

* The reason is the above restriction on `%define'.

Similarly to `follow_stack', the `+buffer' part of the macro values of
`CMD_X' and `OPER_X' was moved to the expansion locations, because of the
above restriction on `%define'.

The non-integer-valued single-line macros `jmpn' and `jmps' (convenience
shorthand for `jmp short') were repleaced with their expansions, because of
the above restrictions on `%define'.

Operator `==' in `%if' expressions was replaced with `-':

* Example original:

    %if SCR_WIDTH==80
      ...A
    %else
      ...B
    %endif

* Replacement:

    %if SCR_WIDTH-80  ; True if not equals.
      ...B
    %else
      ...A
    %endif

* That was necessary because mininasm doesn't support comparison or logical
  operators in `%if' expressions.

Default code when SCR_WIDTH has an unsupported value was removed.

The multi-line macros `save_act_regs', `restore_act_regs', `save_user_regs'
and `restore_user_regs' were manually replaced with their expansions. That
was necessary because mininasm doesn't support multi-line macros. The
actual manual replacement was quick and straightforward, because these were
simple macros without arguments, and there were only a few expansions.

The unnecessary `section .text' directive, which is not supported by
mininasm, was removed.

The `absolute udata_start' directive, was replaced with `absolute $',
because mininasm supports `absolute' only with `$'. Actually, this was one
of the few instances where esupport for a feature (`absolute $' this time)
was added to mininasm during this effort (of porting Insight to mininasm).

The source code of Insight modified as above compiles with mininasm.

Compilation with mininasm (and comparison of the output with the original
insight.com) was added to compile.sh.

During these steps several bugs were discovered and fixed in mininasm. Some
of the more interesting ones:

1. mininasm was using the global variable `global_label' for two different
   purposes, thus using local labels from a local label definition
   didn't work. First a workaround was implemented in
   https://github.com/pts/mininasm/commit/1312306f70a983a2c90f2945c15f561fe9289215
   which makes it an explicit failure to use local labels. Then the bug was
   fixed in
   https://github.com/pts/mininasm/commit/cc9afbe5ec38b69e30b61527496466d0f80b3ec7
   , which actually made the code shorter. This bugfix involved adding a
   helper function named strcatcmp (my_strcatcmp_far), which compares the
   concatenation of strings S1A and S1B to the string S2. By using this
   unusual comparison, S1A and S1B doesn't have to be concatenated in a
   temporary buffer (used to be `global_label'), but they can be used in
   their original location.

2. There was a bug in the convergence detection of the output size
   optimizer. mininasm reads the same input file in multiple pass, each time
   using the label values from the previous pass in forward-referenced
   labels. This if there are many unrelated size changes between the
   previous pass and the current pass before the current jump, then the
   jmp target of a strict short relative jump (8-bit signed delta, such as
   the `jcxz' instruction), may temporarily appear as if it doesn't fit the
   8 bits. In subsequent passes it would fit, but the buggy mininasm aborted
   too early, in the first pass where it didn't fit. The bug was fixed by
   moving these 8-bit jump target delte error reports to the last pass, like
   this:
   https://github.com/pts/mininasm/commit/a368ab6bc6854b154604d44cad11f39698a82522
   . A unit test was also added with more explanation.

   Discovering and fixing such bugs were one of the primary reasons for
   choosing a large program to be ported to mininasm. By using only
   boot-sector-size (at most 512 byte) test programs this bug wouldn't have
   been discovered.

   See
   https://retrocomputing.stackexchange.com/questions/25808/looking-for-an-open-source-dos-com-program-written-in-assembly
   for the quest of finding programs to port to mininasm.

   The first large test program to be ported to mininasm was
   minnnasm (with 3 `n's in the middle):
   https://github.com/pts/mininasm/blob/master/minnnasm.nasm , a port of
   mininasm itself, the DOS 8086 executable program minnnasm.com being ~15
   KiB at the time (now, with the new features and the bugfixes, it's closer
   to 21 KiB). It uncovered not only minnnasm bugs, but also quirks of NASM
   0.98.39, most of which mininasm now carefully reproduces by default.
   However, it was unlikely that minnnasm leads to discovery of many
   surprising bugs, because its source code was generated by the OpenWatcom
   C compiler, and thus its use of instructions and directives was very
   limited compared to assembly programs written organically.

__END__

About

compiling the Insight real-mode DOS 16-bit debugger with mininasm


Languages

Language:Assembly 99.6%Language:Shell 0.4%