Pmd2 PX Compression

The PX compression format, for a lack of better name, is a custom format used notably by the Pokémon Mystery Dungeon : Explorers of Time/Darkness/Sky games. Most of the files using this compression method are contained within either PKDPX or AT4PX containers.

Overview of the Format

The format itself is nothing wildly complex. It revolve around using what we'll call a command byte.

The command byte is what will tell the decompressor what to do with the data that comes after it. The command byte can hold information for up to 8 "operations" to do on the data that follows. Each operation is represented as a bit. And each operation may use one or more byte(s) of data after the command byte.

If the highest bit is 1, we copy the first following byte as-is. If its 0, we compare the value of the high nybble of the following byte to the list of control flags. And depending on which flag match, or if none match, we'll know what to do next.

Note that, those control flags are always "0n" where "n" is an hexadecimal value from 0 to F.
Also, note that, these flags are computed on a file by file basis! They're "tailor-made" for each individual file.

We discuss this more in details in the other sections below.

Decompression

To make explaining things easier, you could imagine that the bytes we're decompressing come from a FIFO queue. And also imagine that we put the decompressed bytes into a Double Ended Queue or deque for short. (Even if in practice those are probably too inefficient to use in this case!)

To decode the command byte, we simply look at the value of all of its 8 BITS, one at a time. From the highest bit to the lowest. So lets say we're in a loop that isolate the value of a particular bit each turn. Something like:

   uint8 mask = 0x80
   loop while( mask > 0 )
   {
       uint8 bitval = mask & cmdbyte //Bitwise AND
       
       if( ! inqueue.isempty() ) //Make sure our input data is not empty, because its not guaranteed we 
       {                         // won't reach the end of file while handling a command byte!
           ... // Handle the cases written below
       }
       else
       {
           break; //If we reach the end of file, stop the loop!
       }
       
       mask = mask >> 1 //Bitshift right by 1
   }

On each turn of that loop, we'll take the value of the current bit, and depending on its state, we'll decide what to do based on these criteria:

If the Bit is 1

If the BIT we've isolated from the command byte is 1, then we pop the next BYTE from the input queue, and push it as-is at the back of the output deque!

If the Bit is 0

Otherwise if the BIT we've isolated is 0, we'll pop the next BYTE from the input queue. Then, we'll try to find whether the high nybble of the BYTE we just read is the same value as one of our control flags.

We also want to keep the value of the low nybble for later.
We'll refer to the high nybble and low nybble we got here as "nbhigh" and "nblow" respectively.

We got 2 possible cases :

If one of our control flag match the high nybble "nbhigh" of the BYTE we just read, then it means we're inserting a pattern of 4 nybbles or 2 bytes into the output deque. Go to Inserting Byte Pattern.
- We'll refer to the index of the control flag we got as "ctrlflagindex".
If none of our control flags match the high nybble "nbhigh" of the BYTE we just read, it means we're copying a sequence of bytes from our decompressed output so far. Go to Copying a Sequence.

Inserting Byte Pattern

Then its either one of these:

If the index of the control flag "ctrlflagindex" is the first one in the control flag table, at index 0, we calculate the value of the 2 bytes we'll push at the back of the output deque this way:


    byte1 = nblow << 4 BitwiseOR nblow
    byte2 = byte1
    //Then we just push those two bytes, and we're done!

Otherwise, for any other "ctrlflagindex" flag index, we have a few other cases to take into account. They all have in common using the value of "nblow". Got to Inserting a Special Byte Pattern.
- We'll add a new variable called "basenybbleval" to make things easier to understand. And we'll put the value of "nblow" into it right away.

Inserting a Special Byte Pattern

First depending on the control flag index, do one of the following :

If "ctrlflagindex" is 1. We increment the value of all 4 nybbles. Or in our example, the value we use to store the base value of all nybble "basenybbleval".

       
        basenybbleval = basenybbleval + 1

Otherwise, if "ctrlflagindex" is 5. We decrement the value of all 4 nybbles. Or in our case, the value we use to store the base value of all nybble "basenybbleval".


    basenybbleval = basenybbleval - 1

From here, lets put the value of "basenybbleval" into 4 separate variables, each holding the value of the individual 4 nybbles. Lets just name them, "nybble0", "nybble1", "nybble2", "nybble3", and put the current value of "basenybbleval" into all of them. (You should use an array or something, as they're only named like this for clarity's sake)

Then, depending on the control flag index, do one of the following :

If "ctrlflagindex" is between 1 and 4 inclusively. Then substract 1 from the nybble corresponding to the value of (ctrlflagindex - 1). For example, if (ctrlflagindex - 1) is 0, we subtract 1 from "nybble0". If its 3, we subtract 1 from "nybble3", and so on!
Otherwise, we add 1 to the nybble corresponding with the value of (ctrlflagindex - 5). If (ctrlflagindex - 5) is 0 we add one to "nybble0" and so on!

Now, all that is left to do is to assemble the 4 nybbles into 2 bytes, and push those 2 bytes to the back of the output deque!

   byte1 = nybble0 << 4 BitwiseOR nybble1
   byte2 = nybble2 << 4 BitwiseOR nybble3

We just push those two to the back of the output stack, and we're done !

Copying a Sequence

To figure out the offset from the end of our output deque where we begin reading the bytes sequence to copy, we'll need to pop the next byte from the input queue and do the following operations:

   int16 offset = (((-0x1000) + nblow) << 8) BitwiseOR inqueue.pop()

"offset" will contain a negative integer value smaller than the current size of the output queue. Just go from the end of the output deque towards the beginning of it, by the absolute value of "offset". This will get you the beginning position of the sequence of bytes to append to the output later on. We'll call this position "seqbeg".

Then, to get the amount of bytes to copy starting from "seqbeg", we take the value of "nbhigh" and add 3.

Knowing this, we can now copy the sequence and push it into the back of the output deque.

After Handling the Bit

After going through those conditions, we know that the next BYTE we pop from the input queue will be a new command byte guaranteed! (Unless we reached the end of the file) So we just have to repeat this loop using this new command byte! However, keep in mind that, there are no guaranty that there are enough bytes in the input to decompress for each bits in a command byte! So its important to check if we reached the end of file each times we handle a new bit from the command byte !

This sums it pretty much!

Compression

Compression is a bit trickier. The byte patterns and the copy as-is operations are fairly easy to implement, but the string search and managing the control flags gets a little complicated!

The tool that compressed the data originally used a sliding window algorithm, or something you could call a lookback buffer. This window/buffer is the range within which the compressor will look for matching strings. That ranges starts from the position of the byte the algorithm is currently handling, to either the beginning of the data, or at most 4096 bytes towards the beginning of the data.

For the format of PX compression used in PMD2's files, the buffer's/window's range seems the fixed at 4096 bytes. A smaller buffer/window could be used, but not a bigger one, because of the way the data is stored in compressed file.
The shortest matching string possible is 3 bytes, and the longest is 18 bytes.

A word of warning about matching strings. Keep in mind that since the high nybble of the data byte is used both to possibly match a control flag or to indicate the length of the sequence to copy, you have a limited amount of possible lengths. You have to reserve 9 values out of the 15 possible for the control flags. Or else some sequence lengths will trigger a 4 bytes pattern replacement instead of a sequence copy, and the resulting decompressed data will be incorrect. So the decompressor/compressor can only use sequences/strings that have a length that matches one of 7 different lengths..

This means, you have to pick the most optimal values to indicate the possible lengths of sequences to copy, and only then pick the control flags from the remaining values ! It is suggested to at least keep 0x0 and 0xF reserved as sequence lengths, as in several cases they appear to be the most used/flexible. But this is in no way a rule.

The algorithm used in this discussion will be a 2 pass algorithm. The first one builds a list of operations, then based on the results, the control flags are determined. After that, the operations are all encoded, and inserted into a PKDPX or AT4PX container. This makes it much easier to determine how to optimally use the limited value range to use for lengths of sequences to copy.

1st Pass

First, its important to find a good way of storing all the required info for each operations, and preserve the order to process them in! This could be a simple vector/deque of little structures such as this one:

       /*********************************************************************************
           compOp
               Stores an operation to insert into the output buffer.
       *********************************************************************************/
       struct compOp
       {
           ePXOperation type;              //The operation to do
           uint8_t      highnybble,        //The value of the compressed high nybble if applicable
                        lownybble,         //The value of the compressed low nybble 
                        nextbytevalue;     //value of the compressed next byte if applicable
           void reset()
           {
               type          = ePXOperation::COPY_ASIS;
               highnybble    = 0;
               lownybble     = 0;
               nextbytevalue = 0;
           }
       };

Where the types of operations are represented using an enum such as this one:

       /*********************************************************************************
           All the possible operations that can be done to compress data!
           Entries 0 to 8 correspond to their respective ctrl flag indexes!
       *********************************************************************************/
       static enum struct ePXOperation : int8_t
       {
           COPY_ASIS                                 =-1,
           COPY_NYBBLE_4TIMES                        = 0,
           COPY_NYBBLE_4TIMES_EX_INCRALL_DECRNYBBLE0 = 1,
           COPY_NYBBLE_4TIMES_EX_DECRNYBBLE1         = 2,
           COPY_NYBBLE_4TIMES_EX_DECRNYBBLE2         = 3,
           COPY_NYBBLE_4TIMES_EX_DECRNYBBLE3         = 4,
           COPY_NYBBLE_4TIMES_EX_DECRALL_INCRNYBBLE0 = 5,
           COPY_NYBBLE_4TIMES_EX_INCRNYBBLE1         = 6,
           COPY_NYBBLE_4TIMES_EX_INCRNYBBLE2         = 7,
           COPY_NYBBLE_4TIMES_EX_INCRNYBBLE3         = 8,
           COPY_SEQUENCE                             = 9,
       };

Then, for the first pass, we simply iterate over each bytes.

We'll refer to the byte we're currently iterating on as "curbyte"

To determine the optimal operation to use, attempt to find which one out of the 3 methods of compression is the best in this case :

Try finding a matching string of at least 3 bytes, starting from "curbyte", within the lookback buffer / sliding window. And from those matches, try to find those that match more than 3 bytes, up to the maximum of 18. Keep the first longest you find. And keep in mind you can only have sequences matching one out of the 7 different lengths you have. So pick those lengths wisely!
Try finding if the 2 bytes starting from "curbyte" could be compressed by doing the opposite of one of these methods:
1. Copy low nybble 4 times, increment the 4 nybbles by 1, subtract 1 from nybble#0, and make 2 bytes with those.
2. Copy low nybble 4 times, subtract 1 from nybble#1, and make 2 bytes with those.
3. Copy low nybble 4 times, subtract 1 from nybble#2, and make 2 bytes with those.
4. Copy low nybble 4 times, subtract 1 from nybble#3, and make 2 bytes with those.
5. Copy low nybble 4 times, decrement the 4 nybbles by 1, add 1 to nybble#0, and make 2 bytes with those.
6. Copy low nybble 4 times, add 1 to nybble#1, and make 2 bytes with those.
7. Copy low nybble 4 times, add 1 to nybble#2, and make 2 bytes with those.
8. Copy low nybble 4 times, add 1 to nybble#3, and make 2 bytes with those.
Try finding if the 2 bytes starting from "curbyte" contain the exact same nybble repeated 4 times.

If all 3 tests fails, you should "copy as-is" the current byte.

Do this for the whole file, and once it has been completely processed, look at the sequence lengths you've used, and pick the remaining values between 0x0 to 0xF for your control flags. For example, if you're using as your possible sequence lengths:

 0x0, 0xF, 0x2, 0x3, 0x5, 0x7, 0xA

Your control flags would be:

 0x1, 0x4, 0x6, 0x8, 0x9, 0xB, 0xC, 0xD, 0xE

We'll keep those in an array we'll refer to as "ctrlflags".

2nd Pass

For this part, you want to handle operations by batches of 8.

Look at your 8 operations, and make a command byte from those, setting each bit to 0, unless its a "Copy as-is" operation.
Then encode the operation into one or two bytes, depending on the operation type.
Finally just write the command byte followed by the encoded bytes in order!

Extra Examples

To make understanding the command byte, think of it as an array of boolean, where each index matches another array containing small arrays of bytes.

This example might make things clearer:

   0xFD, 0x53, 0x49, 0x52, 0x30, 0x24, 0x38, 0xE0, 0x30 //In this byte sequence, 0xFD is the command byte!
   1111 1101 //Here's 0xFD's value in binary
   
   //Here are what bytes in the sequence each bit in the command byte correspond to:
   [1]111 1101 corresponds to 0x53
   1111 110[1] corresponds to 0x30
   1111 [1]101 corresponds to 0x24

Here's how this whole sequence is decoded(debug output from ppmd_unpx.exe ):

 -> Command Byte 0xfd
     Bit is 1 : Copy 0x53 as is!
     Bit is 1 : Copy 0x49 as is!
     Bit is 1 : Copy 0x52 as is!
     Bit is 1 : Copy 0x30 as is!
     Bit is 1 : Copy 0x24 as is!
     Bit is 1 : Copy 0x38 as is!
     Bit is 0 : appending 2 bytes.. 0 0 
     Bit is 1 : Copy 0x30 as is!

Which results in :

 53 49 52 30 24 38 00 00 30

Now take this one, which comes right after the one above:

 0x0F, 0x0F, 0xFC, 0xE0, 0xE0, 0x1F, 0xFC, 0xEC, 0x01, 0xF4, 0x88,

Which is decoded as such(debug output from ppmd_unpx.exe ):

 -> Command Byte 0xf
     Bit is 0 : appending sequence..
       (highnybble: 0x0, lownybble: 0xf, calculatedoffset: 0xfffc( decimal -4 ) )
       38 00 00 
     Bit is 0 : appending 2 bytes.. 0 0 
     Bit is 0 : appending 2 bytes.. 0 0 
     Bit is 0 : appending sequence..
       (highnybble: 0x1, lownybble: 0xf, calculatedoffset: 0xfffc( decimal -4 ) )
       00 00 00 00 
     Bit is 1 : Copy 0xec as is!
     Bit is 1 : Copy 0x1 as is!
     Bit is 1 : Copy 0xf4 as is!
     Bit is 1 : Copy 0x88 as is!

And results in this :

 38 00 00 00 00 00 00 00 00 00 00 EC 01 F4 88

Credits

A big thanks to Zhorkenwho coined many of the terms used here, and figured out the header and the compression format!