
[PMD WiiWare] data1 and data2 Binary Archive Information

Platinum Lucario


Pokémon Mystery Dungeon (WiiWare)
data1_XXXX.bin and data2_XXXX.bin archive information

The data1 and data2 files are binary archives. The "XXXX" in the file name is the game code, which varies depending on the version (e.g. data2_WPAJ.bin).

Since data1 and data2 archive binaries are compressed with the AT7 compression algorithm, the archives must be decompressed before any data can be extracted.

Once decompressed, the data1 and data2 files are structured as follows:

Part 1 - Pointer and file name table
Part 2 - Actual data of all files

Part 1 - Pointer and file name table

This table contains a pointer to every file in the archive, along with its file name.

Each table row consists of:

Offset     Endianness  Contents
0x00-0x03  Big Endian  Data location offset
0x04-0x07  Big Endian  File size
0x08-0x1B  Big Endian  Namespace data
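
As a rough illustration of the row layout above, here is a minimal Python sketch that unpacks one table row. It assumes the rows are stored back to back, 28 bytes each (two 4-byte values plus the 20-byte namespace field), which the post does not state outright. The names ROW and parse_row are made up for this example, and the sample row is built by hand (the offset 0x20 is hypothetical).

```python
import struct

# Assumed layout from the table above: two big-endian uint32 values
# followed by a 20-byte name field (offsets 0x08 through 0x1B inclusive).
ROW = struct.Struct(">II20s")  # 4 + 4 + 20 = 28 bytes per row


def parse_row(table: bytes, index: int):
    """Return (data_offset, file_size, raw_name_field) for one table row.

    Assumes rows are packed back to back starting at the beginning of `table`.
    """
    return ROW.unpack_from(table, index * ROW.size)


# Hypothetical row built by hand for illustration, not read from a real archive.
sample = struct.pack(">II20s", 0x20, 0x0562, b"dgMapTxt00.txt")
offset, size, name_field = parse_row(sample, 0)
print(hex(offset), hex(size), name_field)
```

How many rows the table holds, and where it ends, would still need to be worked out from a real archive.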








Namespace data

This field holds the file name, which must be in ASCII. A file name can be at most 19 characters long, because every name must end with a 00 byte inside the namespace data in order to function properly. Anything after that 00 byte, through to offset 0x1B, is what we'll refer to as "junk text".
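
To make that concrete, here is a small Python sketch that splits a 20-byte namespace field into the real file name and whatever follows the first 00 byte. The function name split_name_field is just something made up for this example, and the sample field is typed in by hand.

```python
FIELD_SIZE = 20  # offsets 0x08 through 0x1B inclusive


def split_name_field(field: bytes):
    """Split a namespace field into (file_name, leftover_bytes)."""
    if len(field) != FIELD_SIZE:
        raise ValueError("namespace field must be exactly 20 bytes")
    name, terminator, leftover = field.partition(b"\x00")
    if not terminator:
        raise ValueError("no 00 terminator found in the namespace field")
    # File names are ASCII only, so decoding fails loudly on anything else.
    return name.decode("ascii"), leftover


# Hand-written sample field: a name, its terminator, then leftover bytes.
sample_field = b"bg.swd\x00t.sed" + b"\x00" * 8
file_name, leftover = split_name_field(sample_field)
print(file_name, leftover)
```

Everything in leftover is ignored when the file name is read; only the bytes before the first 00 count.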

Junk text is the remains of a previous file name that occupied the entry before it was overwritten by a different one, either through renaming or through removal of a file during development. So essentially it is a development leftover. It's possible to write a program that replicates the same kind of junk data through file namespace overwriting.

For example, if we have a file called "adev_app_icon.tex" and rename it to "app_icon.tex", the namespace would look like this: "app_icon.tex .tex". Alternatively, if we renamed or removed "bg_event.sed", the file name in the table row below it, "bg.swd", would take its place and overwrite the text (and file offset data) on that table row, so the namespace would look like "bg.swd t.sed". If the file was then renamed again to "b.swd", it would be "b.swd  t.sed", which looks like "62 2E 73 77 64 00 00 74 2E 73 65 64 00 00 00 00 00 00 00 00" in hex.

This means that adding a new file or renaming a file in the archive has the potential to create junk data, because the new name overwrites the text of an existing entry while leaving some of it behind.
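
The overwriting behaviour described above can be sketched in Python like this. It is only a model of the idea (write the new name plus its 00 terminator over the start of the field and leave the rest alone), not the actual tool used during development. The name overwrite_name is made up for the example.

```python
FIELD_SIZE = 20  # size of the namespace field in bytes


def overwrite_name(field: bytes, new_name: str) -> bytes:
    """Write new_name plus a 00 terminator over the start of the field,
    leaving every byte after the terminator untouched."""
    encoded = new_name.encode("ascii") + b"\x00"
    if len(encoded) > len(field):
        raise ValueError("file name is longer than 19 characters")
    return encoded + field[len(encoded):]


# Start with a clean entry for bg_event.sed, padded with 00 bytes.
field = b"bg_event.sed".ljust(FIELD_SIZE, b"\x00")
field = overwrite_name(field, "bg.swd")  # entry is taken over by bg.swd
field = overwrite_name(field, "b.swd")   # then renamed again to b.swd
hex_dump = field.hex(" ").upper()
print(hex_dump)
```

Running this on the bg_event.sed example gives the same 20 bytes as the hex dump quoted above.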

Part 2 - Actual data

Each file's data is laid out in this area as follows:

First - File data
Second - Line break filling

Line break filling

This filling occurs between files to show that they are separate data. It can be either 16 or 32 bytes long. An FF byte marks the end of the file data and the start of the filling; the rest of the filling is 00 bytes, and after that the next file begins. The end of the entire data1 or data2 archive is marked with an FF byte as well, but without the 00 bytes after it.
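
Putting the pointer table and the data area together, a file could be pulled out roughly like this in Python. This is a sketch only: it assumes the offset in a table row is measured from the start of the decompressed archive, and the function name extract_file and the tiny test archive are made up for illustration.

```python
def extract_file(archive: bytes, offset: int, size: int) -> bytes:
    """Return one file's data and check that an FF byte follows it."""
    end = offset + size
    if end >= len(archive):
        raise ValueError("file data runs past the end of the archive")
    if archive[end] != 0xFF:
        raise ValueError("expected an FF byte right after the file data")
    return archive[offset:end]


# Hand-made stand-in for the data area: two tiny files, the second one
# followed only by the final FF byte that ends the archive.
fake_archive = b"ABC" + b"\xff" + b"\x00" * 28 + b"DE" + b"\xff"
first = extract_file(fake_archive, 0, 3)
second = extract_file(fake_archive, 32, 2)
print(first, second)
```

The FF check is just a sanity test; the file size from the table is what actually decides where the data ends.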


MegaMinerd: Each file is terminated with FF, then padded with 00 to an offset divisible by 32. Evidence: files have many different lengths, for example dgMapTxt00.txt has a length of 0x0562.

However, every pointer is divisible by 32.

Also, I believe 00 00 00 00 7C is the data root terminator and the data root is then also padded to an offset divisible by 32.
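
If MegaMinerd's rule holds, the filling length can be worked out from the file's offset and size. Here is a minimal Python sketch of that arithmetic. The names ALIGN, next_file_offset and filling_for are made up for the example, and the starting offset 0x20 is hypothetical (only the 0x0562 length comes from the post).

```python
ALIGN = 32  # every pointer is said to be divisible by 32


def next_file_offset(offset: int, size: int) -> int:
    """Offset of the next file: data, one FF byte, then 00 padding up to 32."""
    end = offset + size + 1  # +1 for the FF terminator
    return (end + ALIGN - 1) // ALIGN * ALIGN


def filling_for(offset: int, size: int) -> bytes:
    """Build the FF + 00 filling that would follow a file."""
    length = next_file_offset(offset, size) - (offset + size)
    return b"\xff" + b"\x00" * (length - 1)


# dgMapTxt00.txt has a length of 0x0562; the offset 0x20 is hypothetical.
nxt = next_file_offset(0x20, 0x0562)
fill = filling_for(0x20, 0x0562)
print(hex(nxt), len(fill))
```

Under this rule the filling could be anywhere from 1 to 32 bytes long. A file whose data ends exactly on a 32-byte boundary would get the full 32 bytes, and one ending 16 bytes short of a boundary would get 16.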

32-byte filling data:

FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

16-byte filling data:

FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00



MegaMinerd - For most of the research into filling data

Edited by Platinum Lucario
Updated with MegaMinerd's information