dpethes / rerogue

Tools to extract data from Star Wars: Rogue Squadron 3D

Home Page: http://satd.sk/web/rs/

Speech MORT format codec parsing/decoding

JackCarterSmith opened this issue · comments

Hi,

Okay everybody, it's a big one!
Speech in RS is recorded in a proprietary codec named "MORT"; the "MORT" tag can be found in the header of the files inside the "speech" file in data.dat. Although it is used in a lot of Factor5's games, the MORT audio codec does not seem to be fully decoded for now...

I haven't found anything more complete than a reverse-engineered toolset for the N64 version (https://github.com/jombo23/N64-Tools).
Despite my efforts to recompile the tool myself, the wav output seems broken...

IDA/Ghidra doesn't get me further than the speech file RAM allocation and some pointer/config data. I don't think the data is "complex" to decode, but the decoder functions seem fragmented across the main program... Pretty hard to extract.

I'll keep track of my "quest" in this issue; reverse engineering of RS doesn't have a lot of topics on the web x)

By adapting some of @SubDrag's code into a standalone testing tool, I was able to convert the MORT subfile into a "horrible" sound...

However, the general shape of the audio signal doesn't look like noise to me (not "random" enough). By stretching the sound, it was not too difficult to recognize a cry.
I wasn't that far off when I said the sound was ugly!

I'm looking into adjustments to the wav generation and digging further into the C/asm code to better understand the MORT compression scheme.

BE (big-endian) <- be aware that this is a rare BE file among the other LE files!

Header | Track list | Track data


Header (4B):
unsigned int [4B]: count of tracks

Track list (count of tracks x4B)
repeat_for( count of tracks ) {
	byte [1B]: flags - only the first bit is used, it switches between two sound speed factors (0: 0.2 / 1: 0.35)
	unsigned int [3B]: track offset from beginning of file
}

Track data (file size - header size - track list size, in bytes):
MORT compressed data [xB]: audio data compressed using Factor5's audio codec.
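
For anyone who wants to try the layout above, here is a minimal Python sketch of a track list reader. It is not taken from rerogue or MORTDecoder; the names `TrackEntry`, `parse_track_list` and `track_bytes` are made up for illustration. It assumes that "the first bit" means the lowest bit of the flags byte, and that tracks are stored back to back in list order (so a track ends where the next one starts), neither of which is confirmed above.

```python
import struct
from typing import List, NamedTuple


class TrackEntry(NamedTuple):
    flag: int    # lowest bit of the flags byte (assumed to be "the first bit")
    offset: int  # 24-bit track offset from the beginning of the file


def parse_track_list(data: bytes) -> List[TrackEntry]:
    """Read the big-endian header and track list of a "speech" file."""
    (count,) = struct.unpack_from(">I", data, 0)  # header: count of tracks
    entries = []
    for i in range(count):
        # One 4-byte entry: 1 flags byte followed by a 3-byte big-endian offset
        flags, hi, lo = struct.unpack_from(">BBH", data, 4 + 4 * i)
        entries.append(TrackEntry(flags & 1, (hi << 16) | lo))
    return entries


def track_bytes(data: bytes, entries: List[TrackEntry], index: int) -> bytes:
    """Slice one track out of the file.

    Assumption: tracks are contiguous and in list order, so a track ends
    where the next one starts (or at the end of the file for the last one).
    """
    start = entries[index].offset
    end = entries[index + 1].offset if index + 1 < len(entries) else len(data)
    return data[start:end]
```

Each slice returned by `track_bytes` should start with the "MORT" tag if the assumptions hold, which makes for a quick sanity check.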

beepbeeporsomething.zip

I don't have many notes on the speech compression. It was part of the MusyX package and iirc there was a special microcode by F5 for N64's audio chip that handled sound and maybe MORT decoding as well?
Anyway, great effort!

Indeed, I read that on the N64 version there was a microcode injected by the game cartridge.
From what I know about the architecture of the N64, the GPU does contain a programmable part for signal processing (like the shaders on our graphics cards today).

From what I could see and hear about Factor5's ways, I would be inclined to think that it handles the multi-channel support of the MusyX engine, allowing SFX, soundtracks and voices to be mixed with less memory/computing power.
Just a hypothesis, but one that seems to agree with what I have observed in the PC code, interviews and remarks on various forums about Factor5's games.

Happy to share! I'm not a good Pascal programmer :P

Little update:

The "bit flag" in the header corresponds to the sample rate (0 => 8000 Hz and 1 => 16000 Hz).
This matches the values mentioned in a dev interview 👍
I've adapted my debug tool to take it as an argument: the first argument is the MORT file manually extracted from the speech file and the second is 1 or 0 depending on the flag.

I haven't tested it on a Linux system but I think it should work...
MORTDecoder.zip

Thanks for the code. I might give it a try and clean it up somewhat and put it in repo afterwards.

I've just added the main part to handle input and to drive the "big" MORT decoder. The bulk of the cleanup should be there.
I'm not certain about the 4096 empty bytes of offset before the data at the decoder input, but it doesn't work without them.

So the N64 Sound Tool can rip audio and speech from all MORT games. Rip is a strong word, it's kind of running through disassembled (at low level) code translated to C++ almost directly to spit out the output.

It would be nice if someone cleaned this all up and made an encoder though, instead of the sort of hacky way it's done.

Now anyways, what progress have you made here, or what are you trying to accomplish that's different? Other than a couple imperfect sounds here and there, and sampling rate being off due to games all having their own unique methods, 100% of N64 Sound can be ripped; I have no known misses.

We are trying to reverse the RS data formats on the PC version (the N64 and PC code are similar though: some N64 functions/elements are present on PC).

As you said, the parser acts like a "black box", with data going in and a stream coming out. I would like to clean it up to get more human-readable code and to write a note on the MORT codec algorithm. Perhaps we can make a MORT encoder after that...

For now, I can extract all the quotes/speech from the PC data with the "speech" file as input. I've continued to search for the MORT parser in the PC code, but with no result. Maybe I should try a signature-sniffing approach...

Sounds good, good luck. I did this before Ghidra - using Ghidra might not be a bad idea for it to make the code a little cleaner, once you find it on PC - looks like Ghidra doesn't map that well to the pure ASM rip here.

It's a complex algorithm, with multiple stages, and something that happened long ago can impact something much later, and it's streaming. It also seems to grab different chunks/amounts of sound per framerate. It sure would be interesting to see this thing decoded for real and how it works.

If you're parsing this on PC version, MORT decoder must exist...

Maybe try and find these tables:
table8004867C[0x00] = 0xC7;
table8004867C[0x01] = 0xD7;
table8004867C[0x02] = 0xE3;
table8004867C[0x03] = 0xE7;
table8004867C[0x04] = 0xF1;
table8004867C[0x05] = 0xF3;
table8004867C[0x06] = 0xF5;
table8004867C[0x07] = 0xF7;

There are also constants in Function80048B3C:
0x2B33
0x4E66
0x6600
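
A rough Python sketch of such a signature search, in case it helps. It is only an illustration: `find_all` and `find_signatures` are hypothetical names, and it assumes the table is stored as 8 consecutive bytes and the constants as 16-bit values in either byte order. On PC the table could just as well be stored as wider words, and 2-byte patterns will produce many false hits, so the results are only candidates to check in a disassembler.

```python
from typing import Dict, List

# Table values quoted above, assumed here to be stored as consecutive bytes
TABLE_8004867C = bytes([0xC7, 0xD7, 0xE3, 0xE7, 0xF1, 0xF3, 0xF5, 0xF7])
CONSTANTS = (0x2B33, 0x4E66, 0x6600)


def find_all(haystack: bytes, needle: bytes) -> List[int]:
    """Return every offset (overlapping included) where needle occurs."""
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = haystack.find(needle, pos + 1)
    return hits


def find_signatures(image: bytes) -> Dict[str, List[int]]:
    """Collect candidate offsets for the table and the 16-bit constants."""
    results = {"table_bytes": find_all(image, TABLE_8004867C)}
    for value in CONSTANTS:
        for order in ("little", "big"):
            key = f"0x{value:04X}_{order}"
            results[key] = find_all(image, value.to_bytes(2, order))
    return results
```

The offsets are file offsets into whatever bytes you pass in (for example the contents of the game executable read with `open(path, "rb").read()`), not virtual addresses.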

See this commit to fix the starting too early sound issue: jombo23/N64-Tools@4fe1073

See if you can make the appropriate adjustments to fix the issue. Both MORT sounds that you posted above exactly match the N64 ones, and they are shown properly as 8000/16000 in N64 Sound Tool. The sample rate is at MORT header +0x6. So for example:
4D4F525404E43E80
0x3E80 is 16000 in the sample you showed.
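
As a quick check of the example above, a small Python sketch that reads the rate from a MORT header. `mort_sample_rate` is a made-up helper name, and reading the field as a 16-bit big-endian value is an assumption based only on this one example; the two bytes at +0x4 (0x04E4 here) are left uninterpreted.

```python
import struct


def mort_sample_rate(track: bytes) -> int:
    """Return the sample rate stored at offset +0x6 of a MORT stream."""
    if track[:4] != b"MORT":
        raise ValueError("missing MORT tag")
    # Assumption: 16-bit big-endian field, as in the 0x3E80 example
    (rate,) = struct.unpack_from(">H", track, 6)
    return rate


header = bytes.fromhex("4D4F525404E43E80")
print(mort_sample_rate(header))  # prints 16000
```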

So are you ripping it properly now, or is it still mostly garbage that just sounds like a real sound? The MORT samples you posted above match the N64 ones exactly, so you should be able to rip these perfectly.
matchingn64rip.zip
Here is their rip from N64 - yours should byte for byte match wav?

MORT is fully software decoding btw - not using microcode on N64. And FYI the ASM rip is from Pokemon Stadium US 1.0 on N64 (addresses match that). I assume you are trying to find the algorithm on PC to use ghidra/decompile though.

It's already great work, given that it is the only usable code for the MORT codec that I could find until now!

That's the point! From what I know of Factor5's methods, they are largely optimization oriented, so I won't be surprised to find some "tricks" with the data like "pattern reuse" or other such things.
But thanks for the tips about the constant values of the MORT decoder, I can use them to locate the portion of code that processes it! I'll keep track of my progress in this topic, but it takes time.

Yes, I've got a clean sound after setting the sample rate to 8000 or 16000 depending on the specific header in the PC version.
NiceRogue.zip
I only have the "index" numbers of the tracks, the addresses are different between N64 and PC. But you got 16000 tracks in the N64 version?
I compared the files and they are perfectly the same 💯 (except for the trailing "smpl" that I've voluntarily truncated as it's useless; I suppose it's used by the engine as a generic property? I don't remember if it's the same for all tracks...).

I was wondering about it for the N64, where it would have been possible/useful, but yes, on PC the decoder is necessarily present if there are MORT files in the game data. I hope I can find it soon to get a basis for comparison with the instructions extracted from the N64 asm.

EDIT: I've located the 0x2B33, 0x4E66, 0x6600 parameters with Ghidra at function offset 0x5bce13. I see similarities in the structure of the function calls with the ripped N64 version.

So the actual N64 game outputs raw 16-bit sound data. My toolchain spits out wav files, so that's a wav chunk. smpl is used for loops. Though this game doesn't have loops in there as far as I know, so it's all nulls, and not useful. OK good luck! Note that Ghidra kind of...lumps functions together, so it's not a trivial 1:1 mapping. Especially if you're comparing x86_64, but anyways, you have reference output, which should help hopefully. Good luck! I really would love to understand the algorithm, and have an encoder, but no small feat.
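
For reference, a minimal Python sketch of wrapping decoded 16-bit samples into a wav file with the standard `wave` module. This is not how N64 Sound Tool does it; `pcm16_to_wav` is a hypothetical helper, and mono output is an assumption. The `wave` module only writes the fmt and data chunks, so no "smpl" chunk is produced, and a byte-for-byte comparison against a wav that carries one would need to compare the data chunks only.

```python
import struct
import wave


def pcm16_to_wav(samples, sample_rate, dest):
    """Write signed 16-bit samples as a mono wav file.

    samples: iterable of ints in [-32768, 32767]
    sample_rate: e.g. 8000 or 16000
    dest: a file name or a writable binary file object
    """
    samples = list(samples)
    # wav stores 16-bit PCM as little-endian signed integers
    frames = struct.pack("<%dh" % len(samples), *samples)
    with wave.open(dest, "wb") as out:
        out.setnchannels(1)  # assumption: speech tracks are mono
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit
        out.setframerate(sample_rate)
        out.writeframes(frames)
```

If the decoder output is raw big-endian bytes (as it would be in N64 memory), it has to be byte-swapped before being written this way.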

Loops? Interesting...
That's the most complex part of the RS reverse engineering process, no doubt. But, as you say, also the most awesome part! It takes a lot of time to extract, test and compare isolated code; the MORT encoder should be the conclusion of this "quest".
May the force be with us!

I've finished my first pass on the main class. I removed a lot of redundant variables and added loops where necessary.

It can still process the Rogue data correctly; I don't know if it still works with other N64 games, @SubDrag.

Some parts still seem unclear... Perhaps I should try to clean it up once more.
MORTDecoder.zip

Yeah it works on all MORT games on N64. If you get something pretty high level would be interesting to see, but it's definitely a long shot to support an encoder, but maybe possible if you spend enough effort!

Ah? Did you try the new "middle-level" sources I posted with my answer?
Yeah, the encoder is more of a bonus challenge for me; the big one is to clearly understand the MORT encoding through decoder analysis!
And I'm a very big fan of F5's work. Old tech certainly, but it's a good XP ref.

I didn't try it, but presumably you tested if it matches the output; if so it's a valid update.