dmsc / emu2

Simple x86 and DOS emulator for the Linux terminal.

Selection of ansi codepage to use during emulation

RastislavKish opened this issue

Hi there,
I have just tried emulating with Emu2 and it works very nicely. Thanks to its textual output, it is the only DOS emulator I know of that is accessible to screen readers.
I ran into a text encoding problem while playing a Slovak text arcade game. I guess the game uses a Central European codepage, while Emu2 uses some other one, perhaps Western European?
Those are just guesses, but what is certain is that special characters come out garbled: for example, σ instead of ň, τ instead of š, £ instead of ť, etc.
Is there something I could do about this? Is Emu2 responsible for the encoding translation, or does it just send plain bytes to the console, to be handled by the system?

Thanks for any help, and for the amazing program!

commented

Hi!

There are two different issues here:

  • The output to the screen currently uses codepage 437 (the original IBM PC / US codepage), with a simple translation table from DOS bytes to Unicode. I will modify the code to support other common codepages; I can test with CP850, as that was used here. Which codepage do you need supported?

  • The keyboard input currently does not support international characters at all: it passes all Unicode characters through untranslated. I will need to add a simple mechanism to go from the Unicode code point to the correct DOS byte. Doing this will probably take more time.
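
The two directions described above (DOS byte to Unicode for the screen, Unicode to DOS byte for the keyboard) could be sketched roughly as follows. This is only an illustration, not emu2's actual code: the names `cp_entry`, `cp437_sample`, `dos_to_unicode`, `unicode_to_dos` and `utf8_encode` are made up here, and the table holds just three CP437 entries instead of a full 256-entry mapping.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of a sparse codepage table: a DOS byte and its Unicode code point.
 * (Hypothetical layout; a real table would simply have 256 entries.) */
struct cp_entry { uint8_t byte; uint16_t unicode; };

/* Three CP437 entries only, enough to illustrate the idea. */
static const struct cp_entry cp437_sample[] = {
    { 0x9C, 0x00A3 }, /* £ */
    { 0xE5, 0x03C3 }, /* σ */
    { 0xE7, 0x03C4 }, /* τ */
};
#define CP437_SAMPLE_LEN (sizeof cp437_sample / sizeof cp437_sample[0])

/* Screen direction: DOS byte -> Unicode code point.
 * Bytes below 0x80 are treated as plain ASCII in this sketch;
 * bytes missing from the sample table map to U+FFFD. */
static uint16_t dos_to_unicode(const struct cp_entry *t, size_t n, uint8_t b)
{
    if (b < 0x80)
        return b;
    for (size_t i = 0; i < n; i++)
        if (t[i].byte == b)
            return t[i].unicode;
    return 0xFFFD;
}

/* Keyboard direction: Unicode code point -> DOS byte,
 * or -1 if the table has no byte for that character. */
static int unicode_to_dos(const struct cp_entry *t, size_t n, uint16_t u)
{
    if (u < 0x80)
        return (int)u;
    for (size_t i = 0; i < n; i++)
        if (t[i].unicode == u)
            return t[i].byte;
    return -1;
}

/* Encode a code point below U+10000 as UTF-8 for a UTF-8 terminal.
 * Returns the number of bytes written (1 to 3). */
static int utf8_encode(uint16_t u, unsigned char out[3])
{
    if (u < 0x80) {
        out[0] = (unsigned char)u;
        return 1;
    }
    if (u < 0x800) {
        out[0] = (unsigned char)(0xC0 | (u >> 6));
        out[1] = (unsigned char)(0x80 | (u & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (u >> 12));
    out[1] = (unsigned char)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (u & 0x3F));
    return 3;
}
```

With a table like this, supporting another codepage only means swapping the table; the lookup and the UTF-8 encoding stay the same.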

Thanks for your report, have fun!

Hi,
thanks for the quick response!
I believe the codepage used by this game is 852. I have manually checked a few of the garbled letters and it seems to be the case.
The Kamenický encoding was also quite popular for Czech and Slovak, as it was usable on a wider range of devices thanks to the way it arranges its special characters. I don't currently have a game in this encoding, but there is a big online archive of old Slovak and Czech DOS text games where it will most likely occur. It would thus be great to have support for it as well, but I don't know whether it is supported by localization libraries such as C++'s std::locale.
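
The CP852 guess is consistent with the three garbled characters reported at the start of the thread: each pair shares one byte value, which reads as the garbled glyph under CP437 and as the intended letter under CP852. A small sketch of that check, with the byte values taken from the standard CP437 and CP852 tables (the names `cp_pair`, `reported` and `byte_explaining` are made up for this example):

```c
#include <stddef.h>
#include <stdint.h>

/* What one byte means under each of the two codepages discussed. */
struct cp_pair { uint8_t byte; uint16_t cp437; uint16_t cp852; };

static const struct cp_pair reported[] = {
    { 0xE5, 0x03C3, 0x0148 }, /* shown as σ, meant as ň */
    { 0xE7, 0x03C4, 0x0161 }, /* shown as τ, meant as š */
    { 0x9C, 0x00A3, 0x0165 }, /* shown as £, meant as ť */
};

/* Returns the byte that CP437 renders as `seen` and CP852 renders
 * as `wanted`, or -1 if no entry in the table explains the pair. */
static int byte_explaining(uint16_t seen, uint16_t wanted)
{
    for (size_t i = 0; i < sizeof reported / sizeof reported[0]; i++)
        if (reported[i].cp437 == seen && reported[i].cp852 == wanted)
            return reported[i].byte;
    return -1;
}
```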

Also, with std::locale it was quite easy to convert a UTF-8 string to the desired codepage; I did this once and it worked very well. Sadly, I don't know whether C has a similar library in its standard, so I can't advise here, but I hope you'll find something similarly efficient.

Thanks again for your effort!

commented

Hi!

I added support for setting the codepage with the EMU2_CODEPAGE environment variable. The CP850, CP852 and Kamenický codepages are included, but you can also use any mapping table.
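
Selecting a built-in table from an environment variable might look roughly like the sketch below. Only the variable name EMU2_CODEPAGE comes from this thread; the accepted values ("437", "850", "852", "kamenicky"), the enum and the function names are assumptions made for illustration, and emu2's real parsing (including how a custom mapping table is given) may differ.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical identifiers for the built-in tables. */
enum codepage { CP_437, CP_850, CP_852, CP_KAMENICKY, CP_UNKNOWN };

/* Map a codepage name to a built-in table.
 * An unset or empty name falls back to CP437, the previous default. */
static enum codepage pick_codepage(const char *name)
{
    /* The value spellings here are assumed, not taken from emu2. */
    static const struct { const char *name; enum codepage id; } known[] = {
        { "437", CP_437 },
        { "850", CP_850 },
        { "852", CP_852 },
        { "kamenicky", CP_KAMENICKY },
    };

    if (name == NULL || *name == '\0')
        return CP_437;
    for (size_t i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(name, known[i].name) == 0)
            return known[i].id;
    return CP_UNKNOWN; /* e.g. a custom table would be handled by the caller */
}

/* Read the selection from the environment. */
static enum codepage codepage_from_env(void)
{
    return pick_codepage(getenv("EMU2_CODEPAGE"));
}
```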

Please test the code on the add-codepages branch (https://github.com/dmsc/emu2/tree/add-codepages). I will merge it if it works OK, and you can also help me by reviewing the README.

Keyboard input and screen output should be working now.

Have Fun!

Hi,
wow, nice! I have compiled it and both encodings seem to work.
I don't know yet whether the input works as well, as this game doesn't require typing diacritics, but I'll test that tomorrow with another game.

Anyway, thanks for the very nice and quick job!

commented

Ok, will merge and close this.