PSP FCB1 and FCB2 not initialized in the loader
tsupplis opened this issue · comments
I am using emu2 very successfully with vintage DOS 1.x software, some of which still relies heavily on CP/M tricks. One of them is feeding the first two command-line arguments into the PSP (FCB1, FCB2). Unfortunately, the algorithm DOS uses is convoluted, especially around special characters, but I reached fairly comfortable parity with real DOS 1.x and modern DOS. It is actually more accurate than the only other DOS emulator I use: DOSBox.
Attached is a proposed fix/implementation that works a treat for me. I am happy to open a PR if you prefer. The fix is in my fork: https://github.com/tsupplis/emu2
It also fixes the default program size (psp[6,7]).
The patched loader.c file:
loader.zip
example:
fcbmap a*.b* c:cdd.dddddd
This is how the first two arguments are mapped in the PSP:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 41 3F 3F | .............A??
3F 3F 3F 3F 3F 42 3F 3F 00 00 00 00 03 43 44 44 | ?????B??.....CDD
20 20 20 20 20 44 44 44 00 00 00 00 00 00 00 00 |      DDD........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
btw, the following would be mapped the same way:
- fcbmap a*.b* c:cdd.dddddd
- fcbmap a*.b*;c:cdd.dddddd
- fcbmap a*.b*yz;c:cdd.ddd
- fcbmap a*.b*+c:cdd.dddddd
- fcbmap ,a*.b*,,c:cdd.dddddd
....
hence the need for a tiny state machine.
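To make the mapping above concrete, here is a minimal C sketch of the well-formed case only. It is not the state machine from the patched loader.c: the function names (`fill_field`, `parse_fcb`, `map_args_to_psp`) and the separator set are assumptions made for illustration, and it ignores the version-specific edge cases discussed in this thread. It relies only on the standard PSP layout, with FCB1 at offset 0x5C and FCB2 at offset 0x6C.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Assumed separator set for this sketch; real DOS versions differ. */
#define SEPARATORS " \t,;+="

/* Hypothetical helper: fill a fixed-width FCB field from s.
 * '*' pads the rest of the field with '?', short names are padded
 * with spaces, and characters beyond the field width are dropped. */
static const char *fill_field(const char *s, uint8_t *dst, int len)
{
    int i = 0;
    while (*s && *s != '.' && !strchr(SEPARATORS, *s)) {
        if (*s == '*') {
            while (i < len)
                dst[i++] = '?';
        } else if (i < len) {
            dst[i++] = (uint8_t)toupper((unsigned char)*s);
        }
        s++;
    }
    while (i < len)
        dst[i++] = ' ';
    return s;
}

/* Parse one argument into the first 12 bytes of an FCB:
 * drive byte, 8-byte name, 3-byte extension. */
static const char *parse_fcb(const char *s, uint8_t *fcb)
{
    assert(s != NULL && fcb != NULL);
    s += strspn(s, SEPARATORS);           /* skip leading separators */
    fcb[0] = 0;                           /* 0 = default drive */
    if (isalpha((unsigned char)s[0]) && s[1] == ':') {
        fcb[0] = (uint8_t)(toupper((unsigned char)s[0]) - 'A' + 1);
        s += 2;
    }
    s = fill_field(s, fcb + 1, 8);
    if (*s == '.')
        s++;
    return fill_field(s, fcb + 9, 3);
}

/* Map the first two arguments into FCB1 (0x5C) and FCB2 (0x6C). */
static void map_args_to_psp(const char *args, uint8_t *psp)
{
    const char *rest = parse_fcb(args, psp + 0x5C);
    parse_fcb(rest, psp + 0x6C);
}
```

Calling `map_args_to_psp("a*.b* c:cdd.dddddd", psp)` on a zeroed 256-byte buffer reproduces the FCB bytes shown in the dump, and the other listed variants give the same bytes in this simplified model.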
Hi, thanks for your work!
Do you have a test program that uses the FCB1 and 2 from the PSP?
Also, did you see the implementation of INT 21h function 29h (parse filename to FCB)? Perhaps the common code that parses one filename could be factored out into a function in dosnames.c?
While writing this emulator I have found that there is little detailed information about all those FCB functions and the filename parsing rules...
About doing a pull-request, yes, if you do a pull-request I can comment there and then merge the results, but it is not necessary.
Have Fun!
Yes, I do: the original asm.com from MS-DOS 1. It takes an unusual parameter in which each letter of the extension has a meaning. For example, test.ccx means output on c for the 2 files created and x for listing to the console.
I attached the compiled binary, but you can get the source under the MIT License from the MS-DOS sources on GitHub:
https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/ASM.ASM
asm.zip
Hmm, you're correct... I should have taken a peek at function 29h first. Looking at the details, though, it parses file names correctly but does not map exactly to the state machine I validated, in particular in how it decides whether to switch from the first FCB to the second. Examples are ; and +, which move to the next field, and some invalid characters that make it into the FCB even though they are not valid in file names ("<|>" for example).
To finish, I just created a small C program that dumps the PSP. Unfortunately I did not record the regressions between emu2 and DOS; I should have.
I will recreate my scenarios... give me a couple of days.
Ok so I get a couple of regressions digging further...
So ... going back to your 29h may be necessary...
Here is the mini test framework (Updated with dos 1.1 log as well)
fcbtest.zip
- pspfcb.com/.c is a dummy .com binary that hex-dumps the 2 FCBs
- fcbmap/.c is my standalone parser on macOS
- test.bat/test run the tests on DOS/Unix
dosbox.log, dos.log and emu2.log are the results.
very happy to help if you need anything else
I confirm hex2bin is working perfectly with the new PSP setting on 0x05-0x09.
Ok so I fixed the only regression I found on DOS 1.1 (wrong FCB set on one of the states). There are plenty if I compare dos 1.1 and modern dos but they are only on the edge cases. DOSBox is ... out of sync.
The last test harness:
fcbtest.zip
Last version of the fixed loader.c in the mapping function
loader.zip
Another small notice on the function 38 change:
// TODO: Initialize PSP values: int 22h, 23h, 24h and parent PSP segment to 0
Actually the MS-DOS code does do that, you are correct. Well spotted, again.
This also confirms the values for PSP 5 to 9 ....
https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/MSDOS.ASM
NEWBASE: ; Interrupt call 38
MOV ES,DX
LDS SI,CS:DWORD PTR [SPSAVE]
MOV DS,[SI.CSSAVE]
XOR SI,SI
MOV DI,SI
MOV AX,DS:[2]
MOV CX,80H
REP MOVSW
SETMEM:
; Inputs:
; AX = Size of memory in paragraphs
; DX = Segment
; Function:
; Completely prepares a program base at the
; specified segment.
; Outputs:
; DS = DX
; ES = DX
; [0] has INT 20H
; [2] = First unavailable segment ([ENDMEM])
; [5] to [9] form a long call to the entry point
; [10] to [13] have exit address (from INT 22H)
; [14] to [17] have ctrl-C exit address (from INT 23H)
; [18] to [21] have fatal error address (from INT 24H)
; DX,BP unchanged. All other registers destroyed.
XOR CX,CX
MOV DS,CX
MOV ES,DX
MOV SI,EXIT
MOV DI,SAVEXIT
MOVSW
MOVSW
MOVSW
MOVSW
MOVSW
MOVSW
MOV ES:[2],AX
SUB AX,DX
CMP AX,MAXDIF
JBE HAVDIF
MOV AX,MAXDIF
HAVDIF:
MOV BX,ENTRYPOINTSEG
SUB BX,AX
SHL AX,1
SHL AX,1
SHL AX,1
SHL AX,1
MOV DS,DX
MOV DS:[6],AX
MOV DS:[8],BX
MOV DS:[0],20CDH ;"INT INTTAB"
MOV DS:(BYTE PTR [5]),LONGCALL
RET
Hi!
I did test: IBM DOS 1.0, 1.1, 2.0, 2.1, MS-DOS 3.3, 4.0, 5.0 and FreeDOS 1.1, attached is the log file for all the cases, I included the disk image used to run the tests.
As you see, the old DOS 1.0 and 1.1 are different between themselves, and from 2.0 up to 5.0 the result is always the same. FreeDOS gives bad data for the tests.
So, now I don't know: which result is the best to emulate? Does any program rely on the values returned by DOS before 2.0?
About the initial values in the PSP jump, I avoid reading code from Microsoft sources, because I want the emulator to be a new implementation, free of any license issues. I will write the values taken from a running DOS, I suppose that would work.
Have Fun!
Super. So may I suggest: I can create another version of the command-line-to-FCB function based on 2.0, and you choose one as a reference for your own implementation, or reuse it. Would that suit you? My biased view is to use 1.1, because this feature has reduced value in modern DOS. But the 2.0-7.0 behavior is probably what should be adopted. In any case, whichever you choose, I think it does not matter too much as long as it works well for well-formed cases. In this case even FreeDOS seems OK.
I understand your view on code pollution, but now that it is MIT licensed I think it helps a lot to align the interface and effects rather than copying the behavior.
BTW, thanks for this tool. It is really super useful. It works a treat with most of the legacy compilers I use in my software archiving projects, from MASM to Turbo x. I can now comfortably use those tools under a modern macOS and integrate them into standard dev practices. Very similar to zxcc for the Z80, which also works a treat.
There are a couple of other bugs that do not affect me too much but I would be happy to investigate and report to you, like the need for a 81+ x 25+ terminal or some weirdness in the initial refresh of the screen due to your loop optimization, so if you maintain the software, happy to bring you a small flow of issues and suggested fixes.
Hi!
> Super. So may I suggest: I can create another version of the command-line-to-FCB function based on 2.0, and you choose one as a reference for your own implementation, or reuse it. Would that suit you? My biased view is to use 1.1, because this feature has reduced value in modern DOS. But the 2.0-7.0 behavior is probably what should be adopted.
If the difference is small (in the code), perhaps a flag could be added so the user can choose. But I hesitate to do that without an existing program that needs one behavior or the other.
I can try to write a minimal implementation factoring the function 29h, so you can enhance it from there.
> In any case, whichever you choose, I think it does not matter too much as long as it works well for well-formed cases. In this case even FreeDOS seems OK.
> I understand your view on code pollution, but now that it is MIT licensed I think it helps a lot to align the interface and effects rather than copying the behavior.
I did not know it was MIT licensed; that certainly makes it at least safe to look at the code.
> BTW, thanks for this tool. It is really super useful. It works a treat with most of the legacy compilers I use in my software archiving projects, from MASM to Turbo x. I can now comfortably use those tools under a modern macOS and integrate them into standard dev practices. Very similar to zxcc for the Z80, which also works a treat.
It is great that you find it useful! Using old compilers (or other command line tools) was the target of the emulator.
It is named "emu2" because in Spanish you read it as "emu-dos" :)
> There are a couple of other bugs that do not affect me too much but I would be happy to investigate and report to you, like the need for a 81+ x 25+ terminal or some weirdness in the initial refresh of the screen due to your loop optimization, so if you maintain the software, happy to bring you a small flow of issues and suggested fixes.
Yes, any bug report is welcome.
Terminal output is complicated because DOS applications do not follow a common way to access the screen: there are apps that use DOS write to stdout, other that use int10h (video BIOS) calls and others that simply write to the video memory directly.
Another area that I want to enhance is keyboard input: there are still some applications that do not receive keystrokes, and I want to be able to detect applications that busy-wait for the keyboard and put the emulator to sleep, so it does not use 100% of the CPU.
Hi!
> Another small notice on the function 38 change:
> // TODO: Initialize PSP values: int 22h, 23h, 24h and parent PSP segment to 0
> Actually the code of MSDOS does do that, you are correct. Well spotted, again.
> This also confirms the values for PSP 5 to 9: https://github.com/microsoft/MS-DOS/blob/master/v1.25/source/MSDOS.ASM
After reading this: https://stackoverflow.com/questions/16669352/call-5-interface-on-ms-dos I implemented the CP/M call by patching an INT 21h into address 0xC0 and adjusting the stack there; see commit e354112.
Now, an old BASIC-86 version from DOS 1 seems to work: basic-86.zip
This is cool, thank you. I did not realize that MS BASIC was using CALL 5. I will monitor my old apps. I will look at the new FCB parser on Friday.
Hello.
So here is the new parser:
pak.zip
I also made a few tweaks to the test command and greatly increased the number of tests.
It matches with 3 configs:
- #define DOS11=PCDOS1.10
- #define DOS20=PCDOS2.10
- default=PCDOS7.01=MSDOS6.22=PCDOS5.02=PCDOS4.01=PCDOS3.3
Other types of DOS are at odds a bit (DR-DOS,FreeDOS) and I ignored the poor PCDOS1.0
I have noticed a last little weirdness, this time on the command line.
DOS versions above 6 handle double quotes; the others don't. I spotted that by adding a dump of the command line to the tool, which shows the discrepancy between versions before and after 6.
emu2 currently behaves like DOS below 6, but given that we are in a Unix world, I don't think it is any concern.
Hi!
> Hello.
> So here is the new parser:
> pak.zip
> I also made a few tweaks to the test command and greatly increased the number of tests.
Looked at it a little, it would be better to discuss this as a pull-request.
See at line 124:
#if defined(DOS11)
case '+':
case ';':
offset = fcb2 + 1;
state = 13;
break;
#endif
#if defined(DOS11)
case ':':
offset = fcb2 + 1;
state = 4;
break;
#endif
I think both defines could be joined together.
Can you give a short description of what each state represents? For example, 0 is the start of the command line, 2 is parsing after the dot in FCB1, etc.
Thanks!
No issues, will do some clean up first.
Thanks!
I already did a first review, with a few ideas.
Pushed your code.