dmsc / emu2

Simple x86 and DOS emulator for the Linux terminal.

PSP FCB1 and FCB2 not initialized in the loader

tsupplis opened this issue · comments

I am using emu2 very successfully with vintage DOS 1.x software, some of which still relies heavily on CP/M tricks. One of them is that DOS feeds the first two command-line arguments into the PSP (FCB1, FCB2). Unfortunately the algorithm DOS uses is convoluted, especially around special characters, but I have reached fairly comfortable parity with real DOS 1.x and modern DOS. It is actually more accurate than the only other DOS emulator I use, DOSBox.

Attached a proposed fix/implementation that works a treat for me. I am happy to propose a PR if you prefer. The fix is in my fork: https://github.com/tsupplis/emu2

It also fixes the default program size (psp[6,7])

the patched loader.c file
loader.zip

example:
fcbmap a*.b* c:cdd.dddddd

this is how first arguments are mapped in the PSP

00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 41 3F 3F  |  .............A?? 
3F 3F 3F 3F 3F 42 3F 3F  00 00 00 00 03 43 44 44  |  ?????B??.....CDD 
20 20 20 20 20 44 44 44  00 00 00 00 00 00 00 00  |       DDD........ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 
00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |  ................ 

btw, the following would be mapped the same way:

  • fcbmap a*.b* c:cdd.dddddd
  • fcbmap a*.b*;c:cdd.dddddd
  • fcbmap a*.b*yz;c:cdd.ddd
  • fcbmap a*.b*+c:cdd.dddddd
  • fcbmap ,a*.b*,,c:cdd.dddddd
  • ...

Hence the need for a tiny state machine.
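As a rough illustration of the well-formed case only, here is a minimal C sketch of how one argument could be turned into the drive, name and extension bytes seen in the dump above. The names `fcb_name`, `fill_field` and `parse_fcb_name` are hypothetical and are not taken from loader.c. The sketch treats only a space as a terminator, so it does not reproduce the separator handling (`;`, `+`, `,`) that the state machine is needed for.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical view of the first 12 bytes of an unopened FCB. */
struct fcb_name {
    unsigned char drive; /* 0 = default drive, 1 = A:, 2 = B:, 3 = C:, ... */
    char name[8];        /* blank padded, upper case */
    char ext[3];         /* blank padded, upper case */
};

/* Fill one blank-padded field. A '*' expands to '?' up to the end of the
 * field. Characters that do not fit are dropped. Stops at '.', ' ' or the
 * end of the string and returns a pointer to that position. */
static const char *fill_field(const char *s, char *field, size_t len)
{
    size_t i = 0;
    memset(field, ' ', len);
    while (*s && *s != '.' && *s != ' ') {
        if (*s == '*') {
            while (i < len)
                field[i++] = '?';
        } else if (i < len) {
            field[i++] = (char)toupper((unsigned char)*s);
        }
        s++;
    }
    return s;
}

/* Parse a single well-formed argument such as "c:cdd.dddddd".
 * Simplifying assumption: only a space ends the argument. */
const char *parse_fcb_name(const char *s, struct fcb_name *fcb)
{
    assert(s != NULL && fcb != NULL);
    fcb->drive = 0;
    if (isalpha((unsigned char)s[0]) && s[1] == ':') {
        fcb->drive = (unsigned char)(toupper((unsigned char)s[0]) - 'A' + 1);
        s += 2;
    }
    s = fill_field(s, fcb->name, sizeof fcb->name);
    if (*s == '.')
        s++;
    return fill_field(s, fcb->ext, sizeof fcb->ext);
}
```

With this sketch, `a*.b*` gives drive 0 with the name bytes `A???????` and extension `B??`, and `c:cdd.dddddd` gives drive 3 with `CDD` padded to eight characters and extension `DDD`, which matches the bytes at 0x5C and 0x6C in the dump.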
commented

Hi, thanks for your work!

Do you have a test program that uses the FCB1 and 2 from the PSP?

Also, did you see the implementation of INT 21h function 29h (parse filename into FCB)? Perhaps the common code that parses one filename could be factored out into a function in dosnames.c?

While writing this emulator I have found that there is little detailed information about all those FCB functions and the filename parsing rules...

About doing a pull-request, yes, if you do a pull-request I can comment there and then merge the results, but it is not necessary.

Have Fun!

Yes, I do: the original asm.com from MS-DOS 1. It takes a weird parameter in which each letter of the extension has a meaning. For example, test.ccx means output to c for the two files created and x for listing to the console.
I have attached the compiled binary, but you can get the source under the MIT License from the MS-DOS sources on GitHub:
https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/ASM.ASM
asm.zip

Hmm, you're correct... I should have taken a peek at 29h. But looking at the details, it parses file names correctly yet does not map exactly to the state machine I validated, in particular in the way it switches (or not) from the first to the second FCB. An example is ; and +, which move to the next field, and some invalid characters that make it into the FCB even though they are not valid in file names ("<|>" for example).

To finish, I just created a small C program that dumps the PSP. Unfortunately I did not record the regressions, emu2 vs DOS; I should have.
I will recreate my scenarios... give me a couple of days.

Ok so I get a couple of regressions digging further...
So ... going back to your 29h may be necessary...

Here is the mini test framework (Updated with dos 1.1 log as well)
fcbtest.zip

  • pspfcb.com/.c is a dummy .com binary that hex-dumps the two FCBs
  • fcbmap/.c is my standalone parser on macOS
  • test.bat/test run the tests on DOS/Unix

dosbox.log, dos.log and emu2.log contain the results.

very happy to help if you need anything else

I confirm hex2bin is working perfectly with the new PSP setting on 0x05-0x09.

Ok 😉 My regressions are DOS 1.1 vs DOS 2.0 ..... So I suppose we should look at 2.0 and not the vintage one ...

PCDOS 1.1:
image

PCDOS 7.01:
image

Looking at all of them, the parser is 1.1 compliant... Do you want me to fix it for DOS 2.x? That said, the FCB stuff is really a 1.1 thing...

OK, so I fixed the only regression I found against DOS 1.1 (the wrong FCB was set in one of the states). There are plenty of differences if I compare DOS 1.1 and modern DOS, but only in edge cases. DOSBox is... out of sync.

dos.log vs dos1.log
image

The last test harness:
fcbtest.zip

Last version of the fixed loader.c in the mapping function
loader.zip

Another small note on the function 38 change:
// TODO: Initialize PSP values: int 22h, 23h, 24h and parent PSP segment to 0
The MS-DOS code actually does that, so you are correct. Well spotted, again.
This also confirms the values for PSP bytes 5 to 9.

https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/MSDOS.ASM

NEWBASE: ; Interrupt call 38
        MOV     ES,DX
        LDS     SI,CS:DWORD PTR [SPSAVE]
        MOV     DS,[SI.CSSAVE]
        XOR     SI,SI
        MOV     DI,SI
        MOV     AX,DS:[2]
        MOV     CX,80H
        REP     MOVSW

SETMEM:

; Inputs:
;       AX = Size of memory in paragraphs
;       DX = Segment
; Function:
;       Completely prepares a program base at the 
;       specified segment.
; Outputs:
;       DS = DX
;       ES = DX
;       [0] has INT 20H
;       [2] = First unavailable segment ([ENDMEM])
;       [5] to [9] form a long call to the entry point
;       [10] to [13] have exit address (from INT 22H)
;       [14] to [17] have ctrl-C exit address (from INT 23H)
;       [18] to [21] have fatal error address (from INT 24H)
; DX,BP unchanged. All other registers destroyed.

        XOR     CX,CX
        MOV     DS,CX
        MOV     ES,DX
        MOV     SI,EXIT
        MOV     DI,SAVEXIT
        MOVSW
        MOVSW
        MOVSW
        MOVSW
        MOVSW
        MOVSW
        MOV     ES:[2],AX
        SUB     AX,DX
        CMP     AX,MAXDIF
        JBE     HAVDIF
        MOV     AX,MAXDIF
HAVDIF:
        MOV     BX,ENTRYPOINTSEG
        SUB     BX,AX
        SHL     AX,1
        SHL     AX,1
        SHL     AX,1
        SHL     AX,1
        MOV     DS,DX
        MOV     DS:[6],AX
        MOV     DS:[8],BX
        MOV     DS:[0],20CDH    ;"INT INTTAB"
        MOV     DS:(BYTE PTR [5]),LONGCALL
        RET
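To make the arithmetic in the HAVDIF part of the listing easier to follow, here is a small C sketch of how the far call stored in PSP bytes 5 to 9 could be computed. The struct and function names are hypothetical, and the values used for `ENTRY_POINT_SEG`, `MAX_DIF` and `LONG_CALL` are assumptions for illustration (the listing above does not show their definitions), so check them against the source before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define ENTRY_POINT_SEG 0x000Cu /* assumption: ENTRYPOINTSEG */
#define MAX_DIF         0x0FFFu /* assumption: MAXDIF, in paragraphs */
#define LONG_CALL       0x9Au   /* assumption: x86 far CALL opcode */

/* Hypothetical view of PSP bytes 5..9. */
struct call5_entry {
    uint8_t  opcode;  /* PSP[5]    */
    uint16_t offset;  /* PSP[6..7] */
    uint16_t segment; /* PSP[8..9] */
};

/* Mirrors the steps after "MOV ES:[2],AX": take the paragraphs between the
 * PSP segment and the first unavailable segment, cap them at MAX_DIF, then
 * store the size in bytes as the offset and a matching segment. */
struct call5_entry make_call5(uint16_t psp_seg, uint16_t end_seg)
{
    struct call5_entry e;
    uint16_t paras;

    assert(end_seg >= psp_seg);
    paras = (uint16_t)(end_seg - psp_seg);
    if (paras > MAX_DIF)
        paras = MAX_DIF;

    e.opcode  = LONG_CALL;
    e.offset  = (uint16_t)(paras << 4);              /* four SHL AX,1 */
    e.segment = (uint16_t)(ENTRY_POINT_SEG - paras); /* SUB BX,AX, 16-bit wrap */
    return e;
}

/* Real-mode linear address with wrap-around at 1 MiB. */
uint32_t linear_1mb(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}
```

Under these assumptions, the segment shrinks by exactly as much as the offset grows, so the far call always resolves to the same linear address once it wraps at 1 MiB, while the offset word doubles as the program size in bytes for CP/M-style programs.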
commented

Hi!

I did test: IBM DOS 1.0, 1.1, 2.0, 2.1, MS-DOS 3.3, 4.0, 5.0 and FreeDOS 1.1, attached is the log file for all the cases, I included the disk image used to run the tests.

fulltests.zip

As you see, the old DOS 1.0 and 1.1 are different between themselves, and from 2.0 up to 5.0 the result is always the same. FreeDOS gives bad data for the tests.

So now I don't know which result is the best to emulate. Does any program rely on the values returned by DOS before 2.0?

About the initial values in the PSP jump, I avoid reading code from Microsoft sources, because I want the emulator to be a new implementation, free of any license issues. I will write the values taken from a running DOS, I suppose that would work.

Have Fun!

Super. So may I suggest that I create another version of the command-line FCB function based on 2.0, and you choose one as a reference for your own implementation or reuse it? Would that suit you? My biased view is to use 1.1, because this feature has reduced value in modern DOS, but the 2.0-7.0 behavior is probably what should be adopted. In any case, whichever you choose, I think it does not matter too much as long as it works well for well-formed cases. In those cases even FreeDOS seems OK.

I understand your view on code pollution, but now that it is MIT licensed I think it helps a lot to align the interface and effects rather than copying the behavior.

BTW, thanks for this tool. It is really super useful. It works a treat with most of the legacy compilers I use in my software archiving projects, from MASM to Turbo x. I can now comfortably use those tools under a modern macOS and integrate them into standard dev practices. Very similar to zxcc for the Z80, which also works a treat.

There are a couple of other bugs that do not affect me too much but that I would be happy to investigate and report to you, like the need for an 81+ x 25+ terminal or some weirdness in the initial refresh of the screen due to your loop optimization. So if you are maintaining the software, I am happy to bring you a small flow of issues and suggested fixes.

commented

Hi!

> Super. So may I suggest that I create another version of the command-line FCB function based on 2.0, and you choose one as a reference for your own implementation or reuse it? Would that suit you? My biased view is to use 1.1, because this feature has reduced value in modern DOS, but the 2.0-7.0 behavior is probably what should be adopted.

If the difference is small (in the code), perhaps a flag could be added so the user can choose. But I hesitate to do that without an existing program that needs one behavior or the other.

I can try to write a minimal implementation factoring the function 29h, so you can enhance it from there.

> In any case, whichever you choose, I think it does not matter too much as long as it works well for well-formed cases. In those cases even FreeDOS seems OK.
>
> I understand your view on code pollution, but now that it is MIT licensed I think it helps a lot to align the interface and effects rather than copying the behavior.

I did not know it was MIT licensed; that certainly makes it safe at least to look at the code.

> BTW, thanks for this tool. It is really super useful. It works a treat with most of the legacy compilers I use in my software archiving projects, from MASM to Turbo x. I can now comfortably use those tools under a modern macOS and integrate them into standard dev practices. Very similar to zxcc for the Z80, which also works a treat.

It is great that you find it useful! Using old compilers (or other command line tools) was the target of the emulator.

It is named "emu2" because in Spanish you read it as "emu-dos" :)

> There are a couple of other bugs that do not affect me too much but that I would be happy to investigate and report to you, like the need for an 81+ x 25+ terminal or some weirdness in the initial refresh of the screen due to your loop optimization. So if you are maintaining the software, I am happy to bring you a small flow of issues and suggested fixes.

Yes, any bug report is welcome.

Terminal output is complicated because DOS applications do not follow a common way to access the screen: some apps use DOS writes to stdout, others use INT 10h (video BIOS) calls, and others simply write to video memory directly.

Another area that I want to enhance is keyboard input. There are still some applications that do not receive keystrokes, and I want to be able to detect applications that are busy-waiting for the keyboard and put the emulator to sleep, so that it does not use 100% of the CPU.

commented

Hi!

> Another small note on the function 38 change:
> // TODO: Initialize PSP values: int 22h, 23h, 24h and parent PSP segment to 0
> The MS-DOS code actually does that, so you are correct. Well spotted, again.
> This also confirms the values for PSP bytes 5 to 9.
>
> https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/MSDOS.ASM

After reading this https://stackoverflow.com/questions/16669352/call-5-interface-on-ms-dos I implemented the CP/M call by patching an INT 21h into address 0xC0 and adjusting the stack there; see commit e354112.

Now, an old BASIC-86 version from DOS 1 seems to work: basic-86.zip

This is cool. Thank you. I did not realize that MS BASIC was using CALL 5. I will monitor my old apps. I will look at the new FCB parser on Friday.

Hello.
So here is the new parser:
pak.zip
I also made a few tweaks to the test command and increased largely the number of tests.

It matches with 3 configs:

  • #define DOS11 = PCDOS 1.10
  • #define DOS20 = PCDOS 2.10
  • default = PCDOS 7.01 = MSDOS 6.22 = PCDOS 5.02 = PCDOS 4.01 = PCDOS 3.3

Other types of DOS are a bit at odds (DR-DOS, FreeDOS), and I ignored the poor PCDOS 1.0.

I have noticed one last little oddity, this time on the command line.
DOS versions above 6 handle double quotes; the others don't. I spotted that by adding a dump of the command line to the tool, and it shows the discrepancy before and after 6.
You behave like DOS before 6, but given that we are in a Unix world I don't think it is any concern.

commented

Hi!

> Hello.
> So here is the new parser:
> pak.zip
> I also made a few tweaks to the test command and increased largely the number of tests.

I looked at it a little; it would be better to discuss this as a pull request.

See at line 124:

#if defined(DOS11)
            case '+':
            case ';':
                offset = fcb2 + 1;
                state = 13;
                break;
#endif
#if defined(DOS11)
            case ':':
                offset = fcb2 + 1;
                state = 4;
                break;
#endif

I think both defines could be joined together.

Can you give a little description to what each state represents? Like, 0 is start of command line, 2 is parsing after the dot on FCB1, etc.

Thanks!

No issues, will do some clean up first.

Pull request: #10
Happy to make any changes from your review.

commented

Thanks!

I already did a first review, with a few ideas.

commented

Pushed your code.