Reversal of DKII Binary File Formats



Wyrmcast
March 19th, 2014, 05:11
This thread is devoted to reverse engineering the DKII level and data binary file formats for modding. Feel free to contribute.



Binary Data - Level Map File Terrain Slabs (Work in Progress)


The first byte, I GUESS, is the slab type.

The second byte seems to be the owner of the slab.

The third and fourth bytes are unknown to me, but I guess they
are always 01 and 00 (maybe junk data or separators/spacers)?



Hex Code = Slab Type


01 02 01 00 = Impenetrable Rock Slab owned by Neutral Player

02 02 01 00 = Rock Slab owned by Neutral Player

03 02 01 00 = Dirt Path Slab owned by Neutral Player

04 02 01 00 = Water Slab owned by Neutral Player

05 02 01 00 = Lava Slab owned by Neutral Player

06 02 01 00 = Gold Slab owned by Neutral Player

07 02 01 00 = Gems Slab owned by Neutral Player

08 00 01 00 = Claimed Path Slab owned by Unknown

08 01 01 00 = Claimed Path Slab owned by Good Player

08 02 01 00 = Claimed Path Slab owned by Neutral Player

08 03 01 00 = Claimed Path Slab owned by Keeper 1

08 04 01 00 = Claimed Path Slab owned by Keeper 2

08 05 01 00 = Claimed Path Slab owned by Keeper 3

08 06 01 00 = Claimed Path Slab owned by Keeper 4

08 07 01 00 = Claimed Path Slab owned by Keeper 5

08 08 01 00 = Claimed Path Slab owned by Unknown

09 03 01 00 = Reinforced Wall Slab owned by Keeper 1

0A 02 01 00 = Treasury owned by Neutral Player

0B 02 01 00 = Lair owned by Neutral Player

0C 02 01 00 = Portal Slab owned by Neutral Player

0D 02 01 00 = Hatchery owned by Neutral Player

0E 01 01 00 = Dungeon Heart Slab owned by Good Player

0E 02 01 00 = Dungeon Heart Slab owned by Neutral Player

0E 03 01 00 = Dungeon Heart Slab owned by Keeper 1

0F 02 01 00 = Library owned by Neutral Player

10 02 01 00 = Training Room owned by Neutral Player

11 02 01 00 = Wooden Bridge owned by Neutral Player

12 02 01 00 = Guard Room owned by Neutral Player

13 02 01 00 = Workshop owned by Neutral Player

14 02 01 00 = Prison owned by Neutral Player

15 02 01 00 = Torture Chamber owned by Neutral Player

16 02 01 00 = Temple owned by Neutral Player

17 02 01 00 = Graveyard owned by Neutral Player

18 02 01 00 = Casino owned by Neutral Player

19 02 01 00 = Combat Pit owned by Neutral Player

1A 02 01 00 = Stone Bridge owned by Neutral Player

1C 01 01 00 = Hero Gate (Final) owned by Good Player

1E 02 01 00 = Edge of Map Slab owned by Neutral Player

1F 02 01 00 = Unclaimed Mana Vault Slab owned by Neutral Player

20 03 01 00 = Claimed Mana Vault Slab owned by Keeper 1

21 01 01 00 = Hero Gate (2x2) owned by Good Player

22 01 01 00 = Hero Gate (Front End 3D Level) owned by Good Player

23 01 01 00 = Hero Lair Slab owned by Good Player

24 01 01 00 = Hero Stone Bridge owned by Good Player

25 01 01 00 = Hero Gate (3x1) owned by Good Player

28 02 01 00 = Mercenary Hero Portal Slab owned by Neutral Player
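
The list above can be turned into a small lookup. This is only a minimal sketch in C, assuming the first byte is the slab type and the second byte is the owner, as guessed above. The function names `dk2_owner_name` and `dk2_slab_name` are made up for illustration, and only a handful of slab types from the list are included.

```c
#include <stddef.h>
#include <stdint.h>

/* Owner ids as observed in the list above; 0 and 8 are still unknown. */
static const char *dk2_owner_name(uint8_t owner)
{
    static const char *const names[] = {
        "Unknown", "Good Player", "Neutral Player",
        "Keeper 1", "Keeper 2", "Keeper 3", "Keeper 4", "Keeper 5"
    };
    return owner < sizeof names / sizeof names[0] ? names[owner] : "Unknown";
}

/* A few slab types from the list above; extend as needed. */
static const char *dk2_slab_name(uint8_t type)
{
    switch (type) {
    case 0x01: return "Impenetrable Rock";
    case 0x02: return "Rock";
    case 0x03: return "Dirt Path";
    case 0x04: return "Water";
    case 0x05: return "Lava";
    case 0x06: return "Gold";
    case 0x07: return "Gems";
    case 0x08: return "Claimed Path";
    case 0x09: return "Reinforced Wall";
    case 0x0E: return "Dungeon Heart";
    default:   return NULL; /* not covered by this sketch */
    }
}
```

For example, the entry 08 03 01 00 would come out as "Claimed Path" owned by "Keeper 1".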

werkt
March 19th, 2014, 15:21
The map element block on disk looks as follows:

struct Element {
    BYTE idTerrain; /* lookup into Terrain kwd with id (BYTE at Terrain Block 0x1d6) */
    BYTE idPlayer;  /* lookup into Players kld with id (BYTE at Player Block 0xa8) */
    BYTE flags;     /* only a '2' bit here is interpreted to do anything special at load; 1 may indicate 'valid', but it is not interpreted as such */
    BYTE unused;    /* as you suspected */
};

The flags 2 bit being set assigns an as yet unknown flag in the runtime map element (0x1000 in WORD at 8, the lower 24 of which correspond to the Tag Id of the player owner).
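
As a rough illustration of reading one such element from raw bytes and testing the '2' bit, here is a minimal C sketch. It uses fixed-width types instead of BYTE so it compiles without Windows headers. The names `element_from_bytes`, `element_has_special_flag` and `ELEMENT_FLAG_SPECIAL` are hypothetical, since the actual meaning of the flag is still unknown.

```c
#include <stdint.h>

/* Mirrors the on-disk struct above, with fixed-width types. */
struct Element {
    uint8_t idTerrain;
    uint8_t idPlayer;
    uint8_t flags;
    uint8_t unused;
};

/* Hypothetical name for the '2' bit described above. */
#define ELEMENT_FLAG_SPECIAL 0x02u

/* Build an Element from four consecutive bytes of the map block. */
static struct Element element_from_bytes(const uint8_t raw[4])
{
    struct Element e = { raw[0], raw[1], raw[2], raw[3] };
    return e;
}

/* Returns nonzero if the '2' bit is set in flags. */
static int element_has_special_flag(const struct Element *e)
{
    return (e->flags & ELEMENT_FLAG_SPECIAL) != 0;
}
```

Copying byte by byte avoids relying on the compiler's struct layout or alignment.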

I've been upside down inside of this executable for the past year and a half. What else do you want to know?

werkt
March 19th, 2014, 21:31
I have most of the code execution for level/resource loading mapped out. I will be posting a kwd/kld loading framework to github after some cleanup, with all that I know about the on disk game structures.

If you happen to know of any particular crash locations (any trace information would be invaluable), assuming we're talking about 1.7 (I don't exactly know all the conventions used for executable versioning by the community), let me know. If I can reproduce a crash, I can probably tell you why it's happening.

And I'm just doing a manual decompile into C++, not any sort of binary reuse. 52k lines and counting, I'm pretty sure I can see the matrix at this point.

werkt
March 23rd, 2014, 01:31
https://github.com/werkt/kwd - Enjoy. Consider the unannotated structures an exercise left for the reader.

Dark Light
March 25th, 2014, 21:33
I've already kind of done all this, all you had to do was ask.

werkt
March 25th, 2014, 23:51
Dark Light, where is your code?

Dark Light
March 26th, 2014, 00:33
I'll post my editor and outline the format when I get a chance.

Dark Light
March 26th, 2014, 00:43
Here is the format that I found when I created my editor.

Map ->
(0,1) - (n,1)
(0,n) - (n,n)

8 - 11: Size of file w*h*4+36
20 : height
24 : width
32 - 33: Size of map w*h*4
Little-endian

Things to note is that there is a 36 byte header. The map itself uses 32 bit values: the first byte (or last, depending on which way you want to look at it) is the block type, and the second byte is the owner of the block: 1 good, 2 hero (I think; it's the same order as in the official editor), then 3, 4, 5 and 6 are players (again, same order as the editor). The third and fourth bytes are unused as a result of the values being 32 bit. Also note that there are two mana vaults.
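
A minimal C sketch of reading a map laid out this way, assuming the offsets listed above (height at 20, width at 24, map size at 32, all read as little-endian 32 bit values) and assuming tiles are stored row by row. The names `MapInfo`, `map_read_info` and `map_tile` are made up for illustration. The size stored at offset 8 is not checked here.

```c
#include <stddef.h>
#include <stdint.h>

#define DK2_MAP_HEADER_SIZE 36u

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct MapInfo {
    uint32_t width;
    uint32_t height;
};

/* Returns 0 on success, -1 if the buffer does not match the layout above. */
static int map_read_info(const uint8_t *buf, size_t len, struct MapInfo *out)
{
    if (len < DK2_MAP_HEADER_SIZE)
        return -1;
    out->height = read_le32(buf + 20);
    out->width = read_le32(buf + 24);

    /* The product of two 32-bit values always fits in 64 bits. */
    uint64_t tiles = (uint64_t)out->width * out->height;
    if (tiles > UINT32_MAX / 4u)
        return -1;
    uint32_t tile_bytes = (uint32_t)tiles * 4u;

    if (read_le32(buf + 32) != tile_bytes)
        return -1;
    if ((uint64_t)tile_bytes + DK2_MAP_HEADER_SIZE != (uint64_t)len)
        return -1;
    return 0;
}

/* Pointer to the 4-byte entry for tile (x, y), assuming row-major order. */
static const uint8_t *map_tile(const uint8_t *buf, const struct MapInfo *m,
                               uint32_t x, uint32_t y)
{
    if (x >= m->width || y >= m->height)
        return NULL;
    return buf + DK2_MAP_HEADER_SIZE + ((size_t)y * m->width + x) * 4u;
}
```

The first byte of the returned entry would then be the block type and the second the owner.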

werkt
March 26th, 2014, 03:15
Dark Light wrote: "Things to note is that there is a 36 byte header."

This is what my code deals with, in addition to providing the rest of the asset structures.

The keeper world/level data format is organized as follows:

field:
    uint32_t code;          /* table below */
    uint32_t size;          /* size in bytes of content */
    uint8_t content[size];  /* content */

The code above is n * 10 + m, where n is one of the following values (m is described below):

10 = Map
11 = Terrain
12 = Rooms
13 = Traps
14 = Doors
15 = KeeperSpells
16 = CreatureSpells
17 = Creatures
18 = Players
19 = Things
20 = (also Things, but no 200 code)
21 = Triggers
22 = Level
23 = Variables
24 = Objects
25 = EffectElements
26 = Shots
27 = Effects

Each block is followed by either another block, or the end of file.

For m in n * 10 + m, the content is usually as follows:

0 = total size of the field content (e.g. 250, 4, 2048 would appear in a 2048 byte file with EffectElements)
1 = a header with a count of the related items, along with some other version- or timestamp-related fields (e.g. 251, 28, 67 would appear in a file with 67 defined EffectElements). Kudos to anyone who discovers what the rest of the fields are; they're definitely meaningless (uninterpreted) at asset load time by the game executable.
2 = the content of items, concatenated. The size of individual items here is (usually) the content size / item count from m = 1

Notable exceptions are Things (with a variety of field codes, see kwdThings in my code) and Triggers, with separate codes for actions and whens.
These codes usually follow one another in sequence, making up the "36 byte header" present in the map and many other files. The sequence n * 10 + 0, 4, <size>; n * 10 + 1, 28, <count>, <24 more bytes>; n * 10 + 2, <size - 36>, <content>; EOF describes most of the asset files completely.
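
A minimal C sketch of walking this chain of fields in a buffer, assuming little-endian values as noted earlier in the thread. The names `Field`, `field_next`, `field_block` and `field_part` are made up for this example and are not taken from the kwd repository. The n/m split is only meaningful for the regular blocks, not for Things or Triggers.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct Field {
    uint32_t code;          /* n * 10 + m */
    uint32_t size;          /* size of content in bytes */
    const uint8_t *content; /* points into the caller's buffer */
};

/*
 * Reads the field starting at *offset and advances *offset past it.
 * Returns 1 if a field was read, 0 at a clean end of buffer,
 * and -1 if the buffer is truncated.
 */
static int field_next(const uint8_t *buf, size_t len, size_t *offset,
                      struct Field *out)
{
    if (*offset == len)
        return 0;
    if (*offset > len || len - *offset < 8)
        return -1;
    out->code = read_le32(buf + *offset);
    out->size = read_le32(buf + *offset + 4);
    if (len - *offset - 8 < out->size)
        return -1;
    out->content = buf + *offset + 8;
    *offset += 8 + (size_t)out->size;
    return 1;
}

/* Split a code into the block id n and the sub-code m. */
static uint32_t field_block(uint32_t code) { return code / 10; }
static uint32_t field_part(uint32_t code)  { return code % 10; }
```

Calling `field_next` in a loop until it returns 0 visits every field in the file. A return of -1 means the sizes do not add up.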

werkt
April 4th, 2014, 06:27
Pulled GIM and patched for this.

Have not had any luck reproducing, tried a myriad of options and settings. I drop there, he drops, I drop more, he drops more, I drop it all, battle ensues, win, never a crash.

Incidentally, I did find a crash in 1.7 that does not appear in GIM related to combat pit building.
The same problem does not appear to happen with 1.51, however the "stamp room" effect I noticed does not play in GIM as the room progresses through the tiles.
I've mapped out the path to that crash, and was able to recognize a path along a player update routine, so I bet I can pick out a crash if you can get me an address (running DK2 in any debugger should be able to at least get you that much for your crash).

werkt
April 6th, 2014, 11:57
Windows 7, with Compatibility mode for 2000. 98 Compatibility mode would not start the game.

That address is outside of the range of the executable's identity mapping - 0x00401000 -> 0x007ae400. I'm assuming there's a relocation base there, otherwise the crash is in a loaded dll (entirely possible if this is a ddraw problem, but as you indicated it applied over multiple graphics settings, I would expect gameplay code to be the source).

I know I'm reaching here, but is there any chance of getting a video of the crash (from you or anyone who might be able to reproduce it)?

tonihele
July 23rd, 2014, 23:12
Hi!

I'm also interested in reversing the file formats, to try to make an Open Dungeon Keeper 2 using the old assets (requiring a license to the original game). I know how to extract the WAD files with different tools, but it would be great to be able to do this on-the-fly in the code. The meshes are somewhat reversed, but I have no idea how to handle the WAD archives... Anyone got info on them?

mefistotelis
July 24th, 2014, 18:17
tonihele wrote: "I have no idea how to handle the WAD archives... Anyone got info on them?"

http://keeper.lubiki.pl/html/dk2_tools_other.php

There's an unfinished tool there with C source; if I remember correctly, it can read uncompressed WADs (some have compression).

Also, I think "Dragon unpacker" is open source and can read uncompressed WADs too; it's made in Pascal, I think.

werkt
July 29th, 2014, 05:36
For accessing wad files:

[Attachments 1479 and 1480]

The decompress routine here was extracted by someone else - managed to find the post and its hosting disappeared, hopefully it will be a bit more permanent here - so to the unnamed author I say thanks.

This is an older interface that I was using to do some model loading, which searches linearly for the file name to be loaded.

This will neatly open compressed and uncompressed wads (.txt added because forum file manager does not allow .c/.h). Enjoy!

tonihele
July 31st, 2014, 06:49
Very nice, thank you. I tried to read the Dragon Unpacker code, but lost my patience :) I used to code Object Pascal for many years when I was a kid (20 years ago). And it is still easy to read but I found it clumsy to find the bits I needed. Last time I touched C code was 10 years ago. Now it is only JAVA. If I manage to convert the code, I'll release the source here if someone should ever need it. For loading the models (and initially creating the game, or trying) I would use JMonkeyEngine. The models will prove more difficult however...

werkt
July 31st, 2014, 17:38
I have been writing up documentation for the entire set of file formats used by the game. I am not much of a typesetter, but my preliminary version has a full reference sheet for the model formats.

Feel free to use any information from this guide, but credit me either as a member of this forum, or with my name contained in the document. The full version will be released in latex source form via github eventually. If anyone has any suggestions for content, I will entertain them and accept content.

Enjoy!

[Attachment 1483]

tonihele
August 3rd, 2014, 20:08
Thank you for the excellent guide! I've managed to extract the meshes WAD file and read the imp KMF, although I haven't really managed to import the model into the engine I'm using, and there is no GROP or ANIM support yet. I found it hard enough to try to use those unsigned numbers in Java (I'm using 7; 8 is not officially supported by the engine) :) The code is messy and not really refactored, wrongly commented (I don't really understand bit functions anymore), etc. The usual. But here it is, if someone wants to use it or something.
https://github.com/tonihele/OpenDungeonKeeper

tonihele
August 7th, 2014, 19:10
@werkt Did you have the other formats covered? Or anyone else? I got the impression you cracked the compression on the textures. I've successfully extracted the textures, but they are still in an unknown format. I found your code in the texture formats thread, but I think it was incomplete: many externs without any values. Or did I just not understand it...? Thanks.

werkt
August 8th, 2014, 00:40
The incompleteness of the code was due to tables that are present in the executable (DKII.exe silver 1.7), whose copyright status I was unsure of. Extracting the values at the positions given by their names, types and sizes will make things work. I recommend using IDA freeware to extract the data at those locations.

The image format is actually a variant of EA's TQI format, itself based on IDC, detailed at http://wiki.multimedia.cx/index.php?title=Electronic_Arts_TQI, with similar segment code implemented for ffmpeg at http://ffmpeg.org/doxygen/trunk/eaidct_8c_source.html.

tonihele
August 13th, 2014, 10:05
werkt wrote: "The image format is actually a variant of EA's TQI format, itself based on IDC, detailed at http://wiki.multimedia.cx/index.php?title=Electronic_Arts_TQI, with similar segment code implemented for ffmpeg at http://ffmpeg.org/doxygen/trunk/eaidct_8c_source.html."

When you say variant, what is the difference? Or could I just use the TQI format extraction?

werkt
September 10th, 2014, 14:41
tonihele wrote: "When you say variant, what is the difference? Or could I just use the TQI format extraction?"

Family/work induced obligatory delay here. There are minor differences in how the composition happens per-8x8 block that I have not properly decomposed - the TQI and compressed extraction routines share similar looking code routes, but are completely different sections of the executables. I will be trying sometime soon to extract each in their own vacuum in deference to the ffmpeg routines to figure out how to properly answer "what's different?"

tonihele
September 11th, 2014, 17:42
I'll be happiest if you could provide me with a ready code snippet :)

tonihele
September 20th, 2014, 08:42
About those ANIMATIONs in KMFs: my animation models look like car wrecks, not in any way recognizable. I think it is due to the vertex coordinates. I understood that if I read the data from a certain frame, it should be a complete model pose. In your document, @werkt, you described the coordinates as 10-bit fixed-point numbers:

The anim vector structure encodes a fixed-point
3 dimensional coordinate in 10 bits per dimension. This coordinate vector is
then scaled via ../../HEAD/scale and added to the ../../HEAD/pos basis to
transform into world space.

Bits Meaning
0:9 Z
10:19 Y
20:29 X

It seems I'm unable to read these correctly. I've tried countless ways, like reading it as a number (0-1023) / 1000 * scale, putting the binary point between the first and second bit, etc. Nope. How can I read these, what is the magic? I'm pretty sure it is the coordinates. I attached some data of me trying to parse the imp's pickaxe blade from the anim.

3rd geometry (pickaxe blade), indices

ANIM: Imp_Claim(land)start.kmf          MESH: Imp.kmf
GEOM index: 255                         GEOM index: 93
(0.22934061, 0.309546, -0.083577394)    (-0.22019999, -0.0014, -0.3139)
GEOM index: 250                         GEOM index: 92
(0.2296967, 0.23725769, -0.0800164)     (-0.21329999, -0.0016, -0.2902)
GEOM index: 245                         GEOM index: 91
(0.20655021, 0.1653255, 0.08735061)     (-0.113, -0.0013, -0.30539998)
GEOM index: 260                         GEOM index: 94
(0.2296967, 0.241887, 0.11939961)       (-0.3053, -0.0022, -0.2527)
GEOM index: 245                         GEOM index: 91
(0.20655021, 0.1653255, 0.08735061)     (-0.113, -0.0013, -0.30539998)
GEOM index: 250                         GEOM index: 92
(0.2296967, 0.23725769, -0.0800164)     (-0.21329999, -0.0016, -0.2902)
GEOM index: 255                         GEOM index: 93
(0.22934061, 0.309546, -0.083577394)    (-0.22019999, -0.0014, -0.3139)
GEOM index: 260                         GEOM index: 94
(0.2296967, 0.241887, 0.11939961)       (-0.3053, -0.0022, -0.2527)

werkt
September 20th, 2014, 16:14
The coordinates are signed. Fixed point arithmetic extraction for geom's coords:


x = (((coord >> 20) & 0x3ff) - 0x200) / 511.0f;
y = (((coord >> 10) & 0x3ff) - 0x200) / 511.0f;
z = (((coord >> 0) & 0x3ff) - 0x200) / 511.0f;
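
Wrapped into a self-contained function, the extraction might look like the C sketch below. The struct `Vec3` and the functions `anim_component` and `anim_decode` are hypothetical names. Treating the HEAD scale as a single float and the HEAD pos as a 3-vector is an assumption based on the guide text quoted earlier. Note the cast to int: if coord is an unsigned type, subtracting 0x200 without the cast would wrap around instead of going negative.

```c
#include <stdint.h>

struct Vec3 { float x, y, z; };

/* Extract one 10-bit component and recentre it around zero, as in the lines above. */
static float anim_component(uint32_t coord, unsigned shift)
{
    /* Cast to int before subtracting so the result can go negative. */
    int raw = (int)((coord >> shift) & 0x3ffu) - 0x200;
    return (float)raw / 511.0f;
}

/*
 * Decode a packed anim vector (Z in bits 0:9, Y in 10:19, X in 20:29),
 * then apply scale and add the pos basis (assumed semantics).
 */
static struct Vec3 anim_decode(uint32_t coord, float scale, struct Vec3 pos)
{
    struct Vec3 v;
    v.x = anim_component(coord, 20) * scale + pos.x;
    v.y = anim_component(coord, 10) * scale + pos.y;
    v.z = anim_component(coord, 0) * scale + pos.z;
    return v;
}
```

With a scale of 1 and a zero pos, each component falls roughly in the range -1 to 1.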

tonihele
September 22nd, 2014, 07:14
Thank you! Now to fix JME's pose animation system... Phew, Java is powerful, but when it comes to multimedia... MP2 is also "impossible" to play with any library. So much groundwork needs to be laid down.

MaxHayman
September 24th, 2014, 20:08
Oh nice work guys. I wrote a reader for Dk1 files and this format: http://keeperklan.com/threads/3570-Text-based-map-file-format
I should add Dk2 support to it when I get time :P

tonihele
October 12th, 2014, 08:32
Nice. If you have any information about the Shots.kwd map file, I would greatly appreciate it. Seems werkt has missed this one, or forgot to commit.

werkt
September 10th, 2015, 03:05
werkt never misses anything. KWD and the full assets are next on my list for documentation.

Without any preamble and lots of warnings about my terrible typesetting, here's my latest copy of the guide, including KCS files and Text Assets.

[Attachment 1612]

As always, enjoy!

Trass3r
October 9th, 2015, 08:53
Hi werkt, do you have an IDA database for DK2?
Or exported changes you could share like
http://keeper.lubiki.pl/html/dk_keeperfx_devel.php?

werkt
October 15th, 2015, 19:39
Trass3r wrote: "Hi werkt, do you have an IDA database for DK2?
Or exported changes you could share like
http://keeper.lubiki.pl/html/dk_keeperfx_devel.php?"

I'd prefer not to share my entire database, secret sauce and all. What are you after in particular?

mefistotelis
October 16th, 2015, 16:22
werkt wrote: "I'd prefer not to share my entire database, secret sauce and all."

You used different terms for genitals, sexual acts and deviations to name functions and structures, didn't you?

We all do that.


You may try exporting the function names into .map file - it's easy to handle/truncate, and can be re-applied to IDA database with loadmap plugin:
https://github.com/mefistotelis/ida-pro-loadmap

YourMaster
October 16th, 2015, 18:26
I did it even in stuff I designed for customers.

tonihele
October 17th, 2015, 12:19
Oh boy :) Hey, since the conversation here is lively again: has anyone had a crack at the sound playing mechanics? We can load the .MAP files, but we don't know any of the parameter meanings. Any help would be much appreciated. I don't think they are on anyone's top priority list, and perhaps rightfully so.

mefistotelis
October 17th, 2015, 13:49
tonihele wrote: "Has anyone had a crack at the sound playing mechanics?"

The sound library which loads sound bank files and plays samples with 3D coords is used in other EA games too.

If we had a full list of games which use the same library, we could check whether any of them was released (in release or beta) with debug symbols.

I know that "F1-2002" uses the same format, as "MAStudio 2002" (with sound bank editor somewhere in options) was made for that game.

tonihele
October 20th, 2015, 18:10
As said, if it helps, we have the *Sfx.Map file format cracked (well, hopefully; there is no verification, but it parses). It is available as Java code on our GitHub page. For *Bank.Map we used your specs. I wonder whether it would be possible to work out some of the parameters by ear alone.

Trass3r
December 30th, 2019, 02:37
werkt wrote: "I'd prefer not to share my entire database, secret sauce and all. What are you after in particular?"

What's so secret about it?
And what did you do with those 50k lines of decompiled code?