panzi / mediaextract

Extracts media files (AVI, Ogg, Wave, PNG, ...) that are embedded within other files.

Home Page:http://panzi.github.com/mediaextract/

Almost perfect

Frotty opened this issue · comments

Thanks for this great tool. It allowed me to extract many textures and sounds from an old game I am trying to remake.

However, as the title suggests, it's not perfect yet: some textures and sounds in the game are, for some reason, not found by the tool.
Here you can see several textures that I was able to extract.
image
However, the two most used objects in the game, the rocks and unids seen in this animation, are not extracted.
image

The same goes for sounds. Many are there, but not all.

Do you know what could be the reason for this?
I can provide you with further info if needed.

Maybe the game archive stores raw image and audio data without a BMP/Wav/whatever file header? What game is it and where can I get it? From time to time I like to look at game archive file formats.

Hi, you can download it here: http://www.autofish.net/shrines/rox/rox_1_4_setup.zip
All the textures I found were inside rox.pak and the editor exe.

I reverse engineered enough of the file format to dump the files. The animation shown in the GIF is Gems.bmp (colored and animated). Use this tool to dump the files: https://github.com/panzi/roxpak

Strangely, I do get 3 more BMP files this way than with mediaextract. I'll have to investigate that another day.

Wow, amazing! Thanks a lot, I will try it when I get home today.

On another note: how did you do this? I'm kinda interested in this topic as well, but I wouldn't know how to start other than by looking at it with a hex editor. If you ever have some spare time, I'd like to be enlightened or pointed the right way (also in German if preferred).

Looking at it in a hex editor is exactly what I did. It's guessing, then writing a script that uses the guess and adapting it until it works.

I assumed it uses little-endian because it's a Windows program. I assumed most sizes would be 32-bit numbers. I saw that the first 4 bytes (32 bits) are a number that could be a file count (correct guess). Then there is a little bit of "garbage", and then there is obviously a RIFF Wave file. Because I know how RIFF works, I could determine the size of that file, and then I looked for that size within the "garbage" before the file data. I also recognized that this part obviously contains a file name, but strangely encoded using 3 bytes per character, so I searched for the length of the file name before it. I found all these things, and playing around also made me realize that, in addition to all that, there is always a null byte after the file data. That was enough to write this script to dump the contained files. There are still fields in the "garbage" whose purpose I don't know, so I can't create such an archive yet.
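To illustrate the "determine the RIFF size, then look for that number in the bytes before it" step, here is a minimal Python sketch. It is not the code from roxpak.py: the helper names `riff_total_size` and `find_u32_le` and the sample bytes are made up for this example. The only format fact it relies on is that the 32-bit little-endian field after the `RIFF` magic counts the bytes that follow it.

```python
import struct


def riff_total_size(data: bytes, offset: int) -> int:
    """Return the total byte length of the RIFF file starting at offset."""
    if data[offset:offset + 4] != b"RIFF":
        raise ValueError("no RIFF magic at offset")
    # The size field counts everything after itself, so the whole file
    # is 8 bytes longer (4 bytes magic + 4 bytes size field).
    (chunk_size,) = struct.unpack_from("<I", data, offset + 4)
    return chunk_size + 8


def find_u32_le(data: bytes, value: int, start: int, end: int) -> list[int]:
    """List every offset in data[start:end] where value is stored as a
    32-bit little-endian integer."""
    needle = struct.pack("<I", value)
    hits = []
    pos = data.find(needle, start, end)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1, end)
    return hits


# Made-up sample, NOT the real rox.pak layout:
# a file count, a size field, then a tiny RIFF file.
riff = b"RIFF" + struct.pack("<I", 4) + b"WAVE"
blob = struct.pack("<I", 1) + struct.pack("<I", len(riff)) + riff

riff_at = blob.find(b"RIFF")
size = riff_total_size(blob, riff_at)
candidates = find_u32_le(blob, size, 0, riff_at)
print(size, candidates)
```

Each offset in `candidates` is only a guess at where a size field might live. The guess still has to be checked against the second, third, ... entry of the archive before trusting it.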

I looked at the remaining field, found out how it's calculated and thus roxpak.py can now also create archives.

I have now also fixed the extraction of BMP files with mediaextract.

Impressive, thanks for the fix and the explanation!

Btw, what was the issue with mediaextract? From the commit I can only guess that the BMPs had slightly different specs/headers?

I was too strict in validating the header fields. Bitmaps have no really good file magic, so I also look at some header fields to check whether they contain values that make sense for a BMP file. Apparently I didn't get that 100% right and was too restrictive.
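For readers wondering what such a header check can look like, here is a rough Python sketch. It is not mediaextract's actual validation (that is written in C and may check different fields). The function name `looks_like_bmp` and the choice of checks are assumptions for illustration, based on the standard BMP file header and DIB header layouts.

```python
import struct

# DIB header sizes of the common BMP variants (BITMAPCOREHEADER up to
# BITMAPV5HEADER). Rejecting everything else is itself a strictness choice.
KNOWN_DIB_HEADER_SIZES = {12, 40, 52, 56, 64, 108, 124}
KNOWN_BITS_PER_PIXEL = {1, 4, 8, 16, 24, 32}


def looks_like_bmp(data: bytes, offset: int = 0) -> bool:
    """Heuristically decide whether a BMP file starts at offset."""
    if len(data) - offset < 30:
        return False
    # The magic is only two bytes, so it matches random data quite often.
    if data[offset:offset + 2] != b"BM":
        return False
    file_size, _reserved, pixel_offset, dib_size = struct.unpack_from(
        "<IIII", data, offset + 2)
    if dib_size not in KNOWN_DIB_HEADER_SIZES:
        return False
    if dib_size == 12:
        # BITMAPCOREHEADER stores width and height as 16-bit values.
        width, height, planes, bpp = struct.unpack_from(
            "<HHHH", data, offset + 18)
    else:
        # Later headers use signed 32-bit values; a negative height
        # means the rows are stored top-down.
        width, height, planes, bpp = struct.unpack_from(
            "<iiHH", data, offset + 18)
    if width <= 0 or height == 0:
        return False
    if planes != 1 or bpp not in KNOWN_BITS_PER_PIXEL:
        return False
    # Pixel data cannot start inside the headers or beyond the file.
    if pixel_offset < 14 + dib_size or pixel_offset > file_size:
        return False
    return True


# Hand-built 1x1 24-bit BMP: 14-byte file header, 40-byte info header,
# one padded pixel row of 4 bytes.
sample = (b"BM" + struct.pack("<IIII", 58, 0, 54, 40)
          + struct.pack("<iiHH", 1, 1, 1, 24) + bytes(24) + bytes(4))
is_bmp = looks_like_bmp(sample)
print(is_bmp)
```

Every extra condition lowers the number of false positives but risks rejecting unusual yet valid files, which is the trade-off described above.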