Pmd2 SIR0: Difference between revisions

From ProjectPokemon Wiki
Revision as of 19:34, 16 February 2015


The SIR0 format is a pretty common wrapper file format. It's also fairly simple. When it gets loaded into memory, its magic number turns from SIR0 to SIRO, and all the pointers in the entire file are modified to be relative to the NDS's memory.

File Structure

Overview

Offset | Length | Endianness | Type | Name | Description
0x00 | 16 | - | - | Header | The SIR0 header.
After Header | Varies | - | - | Content Data | The data wrapped by the SIR0.
After Content Data | Varies | - | - | Content Padding | Some 0xAA padding bytes inserted to align the next section on 16 bytes. May be omitted completely if not required.
After Content Padding | Varies | - | - | Pointer Offsets List | A list containing the offsets to every pointer in the entire file.
After Pointer Offsets List | Varies | - | - | End of File Padding | Some 0xAA padding bytes to make the file end on a size divisible by 16 bytes with no leftovers.
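
Both padding entries in the table follow the same rule: 0xAA bytes are added until the size is a multiple of 16. A minimal sketch of that rule follows. The names PaddingLength and AppendPadding are hypothetical helpers made up for this illustration, not part of any existing tool.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of padding bytes needed so that
// currentSize becomes a multiple of alignment (16 for SIR0).
std::size_t PaddingLength( std::size_t currentSize, std::size_t alignment = 16 )
{
    // The outer modulo makes the result 0 when the size is already aligned.
    return ( alignment - ( currentSize % alignment ) ) % alignment;
}

// Hypothetical helper: append 0xAA bytes until the buffer size is a multiple of 16.
void AppendPadding( std::vector<uint8_t> & data )
{
    data.insert( data.end(), PaddingLength( data.size() ), static_cast<uint8_t>(0xAA) );
}
```

For instance, a 6-byte pointer offsets list would receive 10 bytes of padding, which matches the example list shown further down.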

Header

SIR0 header. (Total length 16 bytes)
Offset | Length | Endianness | Type | Name | Description
0x00 | 4 | big | - | Magic Number | The 4 ASCII characters for "SIR0" (0x53 0x49 0x52 0x30)
0x04 | 4 | little | uint32 | Pointer to Content's Header | A pointer to the header of the data the SIR0 contains. If there are no headers, it points to the first byte after the SIR0 header.
0x08 | 4 | little | uint32 | Pointer to Pointer Offsets List | A pointer to the Pointer Offsets List located after the contained data.
0x0C | 4 | - | - | Null | 4 bytes of zeros.
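
To make the header layout concrete, here is a minimal sketch that reads the two pointers from a file already loaded into a byte vector. The struct SIR0Header and the functions ReadLE32 and ParseSIR0Header are hypothetical names chosen for this example. It assumes the file is unmodified on-disk data, so the magic number is still "SIR0".

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical struct mirroring the header table; field names are illustrative.
struct SIR0Header
{
    uint32_t ptrContentHeader;  // value stored at 0x04
    uint32_t ptrPtrOffsetsList; // value stored at 0x08
};

// Read a little-endian uint32 starting at position pos.
inline uint32_t ReadLE32( const std::vector<uint8_t> & data, std::size_t pos )
{
    return  static_cast<uint32_t>(data[pos])
         | (static_cast<uint32_t>(data[pos + 1]) << 8)
         | (static_cast<uint32_t>(data[pos + 2]) << 16)
         | (static_cast<uint32_t>(data[pos + 3]) << 24);
}

SIR0Header ParseSIR0Header( const std::vector<uint8_t> & file )
{
    if( file.size() < 16 )
        throw std::runtime_error( "File too small to hold a SIR0 header" );

    // Magic number: the ASCII characters 'S' 'I' 'R' '0'
    if( file[0] != 0x53 || file[1] != 0x49 || file[2] != 0x52 || file[3] != 0x30 )
        throw std::runtime_error( "Magic number is not SIR0" );

    SIR0Header hdr;
    hdr.ptrContentHeader  = ReadLE32( file, 0x04 );
    hdr.ptrPtrOffsetsList = ReadLE32( file, 0x08 );
    return hdr;
}
```

The encoded pointer offsets list can then be read starting at ptrPtrOffsetsList.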

Pointer Offsets List

This list is what makes the SIR0 container what it is. It's a "compressed" list of the offsets of all the pointers stored in the file. This includes both the SIR0 structure and the contained structure. The game uses this list to change the value of each pointer in the file after it has been loaded in memory, so they're relative to NDS memory. The list will always begin with 04 04, as those are the encoded offsets of the 2 pointers in the SIR0 header.

If a byte has its highest bit (1000 0000) set to 1, then we have to "append" the next byte. Here are the 3 possible cases, using example values:

0x80 0x81 0x82 0x75 => (0x80 & 0x7F) << 21 | (0x81 & 0x7F) << 14 | (0x82 & 0x7F) << 7 | 0x75
0x80 0x81 0x12      => (0x80 & 0x7F) << 14 | (0x81 & 0x7F) << 7  | 0x12
0x80 0x06           => (0x80 & 0x7F) << 7  |  0x06

Note that, since the offsets are stored as 32-bit integers, chaining more than 4 bytes is impossible. Also note that, while decoding, we get rid of the highest bit's value using the bitmask 0x7F!
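
The three cases above can be checked with a small helper. This is only a sketch: DecodeOneValue is a hypothetical name, and it assumes the input holds exactly one encoded value (any number of flagged bytes followed by one unflagged byte).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: decode a single encoded value from its bytes.
uint32_t DecodeOneValue( const std::vector<uint8_t> & encoded )
{
    uint32_t value = 0;
    for( uint8_t curbyte : encoded )
    {
        // Make room for 7 more bits, then append the low 7 bits of this byte.
        value = ( value << 7 ) | ( curbyte & 0x7Fu );

        // Highest bit clear: this was the last byte of the value.
        if( ( curbyte & 0x80u ) == 0 )
            break;
    }
    return value;
}
```

With the example values, DecodeOneValue({0x80, 0x81, 0x82, 0x75}) gives 0x4175, DecodeOneValue({0x80, 0x81, 0x12}) gives 0x92, and DecodeOneValue({0x80, 0x06}) gives 0x06.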

If the byte's highest bit (1000 0000) is set to 0, we use the byte as is, still applying the 0x7F (0111 1111) bitmask.

An encoded byte with a value of 0 indicates the end of the list, but only when the byte that came before didn't have its highest bit set to 1 (1000 0000).
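
The end-of-list rule can be sketched as a simple scan. FindPtrOffsetListEnd is a hypothetical helper written for this illustration. It returns the index of the terminating null byte, or the size of the input if no terminator is found.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: locate the null byte that ends the encoded list.
std::size_t FindPtrOffsetListEnd( const std::vector<uint8_t> & bytes )
{
    bool lastHadBitFlag = false; // did the previous byte have its highest bit set?
    for( std::size_t i = 0; i < bytes.size(); ++i )
    {
        // A null byte right after a flagged byte is data, not the terminator.
        if( bytes[i] == 0 && !lastHadBitFlag )
            return i;
        lastHadBitFlag = ( bytes[i] & 0x80u ) != 0;
    }
    return bytes.size(); // no terminator found
}
```

For example, in the sequence 04 81 00 04 00 the first 00 belongs to the value 0x80 (encoded as 81 00), so the list ends at index 4 and not at index 2.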

Also, each time we decode an offset using either of the above, we add its value to the sum of all the previous offsets to get the actual offset. This is why the list starts with 04 04, and not 04 08 for example: the values are added to each other as we process them.

Example:

04 04 92 0C 14 00 AA AA AA AA AA AA AA AA AA AA

This list comes from the "/FONT/frame0.wte" file. We ignore all 0xAA bytes, as they are padding.

Here are the calculations:

4
4 + 4
4 + 4 + ( ((0x92 & 0x7F) << 7) | 0xC )
4 + 4 + ( ((0x92 & 0x7F) << 7) | 0xC ) + 0x14

And the results:

0x4
0x8
0x914
0x928

Those are the offsets, relative to the beginning of the file "/FONT/frame0.wte", at which pointers are stored.
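
Once the offsets are known, a loader can patch the pointers found at those positions. The sketch below assumes that "relative to NDS memory" means adding the address the file was loaded at to each stored little-endian 32-bit value. That is an assumption made for illustration, not something verified against the game's code. RelocatePointers and baseAddress are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: add baseAddress to every pointer listed in ptrOffsets.
// Assumes each pointer is a little-endian uint32 stored in the file buffer.
void RelocatePointers( std::vector<uint8_t> & file,
                       const std::vector<uint32_t> & ptrOffsets,
                       uint32_t baseAddress )
{
    for( uint32_t off : ptrOffsets )
    {
        // Skip offsets that would read past the end of the buffer.
        if( static_cast<std::size_t>(off) + 4 > file.size() )
            continue;

        // Read the stored pointer (little endian).
        uint32_t value =  static_cast<uint32_t>(file[off])
                       | (static_cast<uint32_t>(file[off + 1]) << 8)
                       | (static_cast<uint32_t>(file[off + 2]) << 16)
                       | (static_cast<uint32_t>(file[off + 3]) << 24);

        value += baseAddress;

        // Write the relocated pointer back (little endian).
        file[off]     = static_cast<uint8_t>( value         & 0xFFu );
        file[off + 1] = static_cast<uint8_t>( (value >> 8)  & 0xFFu );
        file[off + 2] = static_cast<uint8_t>( (value >> 16) & 0xFFu );
        file[off + 3] = static_cast<uint8_t>( (value >> 24) & 0xFFu );
    }
}
```

With the example above, ptrOffsets would be {0x4, 0x8, 0x914, 0x928}.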

Some Code:
Here's a little C++11 code snippet to decode the string of bytes.

#include <vector>
#include <cstdint>
//...

   std::vector<uint32_t> DecodeSIR0PtrOffsetList( const std::vector<uint8_t>  &ptroffsetslst )
   {
       std::vector<uint32_t> decodedptroffsets;
       decodedptroffsets.reserve( ptroffsetslst.size() ); //worst case scenario: every encoded byte is its own offset
       auto     itcurbyte      = ptroffsetslst.begin();
       auto     itlastbyte     = ptroffsetslst.end();
       uint32_t offsetsum      = 0; //Sum of all the values decoded so far, to obtain offsets relative to the file, and not to the previous offset
       uint32_t buffer         = 0; //temp buffer to assemble longer offsets
       bool     LastHadBitFlag = false; //Whether the byte read on the previous turn of the loop had the bit flag indicating to append the next byte
       
       //A null byte only ends the list when the previous byte didn't have its highest bit set
       while( itcurbyte != itlastbyte && ( LastHadBitFlag || (*itcurbyte) != 0 ) )
       {
           const uint8_t curbyte = *itcurbyte; //read inside the loop, so an empty input is never dereferenced
           //Ignore the highest bit, using the 0x7F bitmask, as it's reserved. Then append the byte's value to the buffer.
           buffer |= curbyte & 0x7Fu; 
       
           if( (0x80u & curbyte) != 0 ) 
           {
               LastHadBitFlag = true;
               //If the highest bit is 1, bitshift the current buffer left, to make room for the next byte.
               buffer <<= 7u;
           }
           else
           {
               LastHadBitFlag = false;
               //If we don't need to append, add the value of the current buffer to the offset sum so far, 
               // and add that sum to the output vector. Then clear the buffer.
               offsetsum += buffer;
               decodedptroffsets.push_back(offsetsum);
               buffer = 0;
           }
        
           ++itcurbyte;
       }
  
       //Return by value: the vector is moved or the copy is elided automatically, and std::move here would only prevent copy elision
       return decodedptroffsets;
   }

Here's another one to encode the pointer offsets just as they would be at the end of the SIR0 file.

#include <vector>
#include <cstdint>
//...
 
   void EncodeSIR0PtrOffsetList( const std::vector<uint32_t> &listoffsetptrs, std::vector<uint8_t> & out_encoded )
   {
       uint32_t offsetSoFar = 0; //the last offset encoded, so each offset is stored as the difference with the previous one
       for( const auto & anoffset : listoffsetptrs )
       {
           uint32_t offsetToEncode        = anoffset - offsetSoFar; //only the lowest 28 bits (4 x 7) of this difference can be encoded
           bool     hasHigherNonZero      = false; //This tells the loop whether it needs to encode null bytes, if at least one higher byte was non-zero
           offsetSoFar = anoffset; //set the value to the latest offset, so we can properly subtract it from the next offset.
           //Encode the integer as up to 4 bytes of 7 bits each, from the highest to the lowest
           for( int32_t i = 4; i > 0; --i )
           {
               uint8_t currentbyte = ( offsetToEncode >> (7 * (i - 1)) ) & 0x7Fu;
               
               if( i == 1 ) //the lowest byte to encode is special
               {
                   //It's the last byte to append, so leave the highest bit at 0, and always write it, even if it's null
                   out_encoded.push_back( currentbyte );
               }
               else if( currentbyte != 0 || hasHigherNonZero ) //Any byte but the lowest one: write it if it's not null OR if we have encoded a higher non-null byte before
               {
                   //Set the highest bit to 1, to signify that the next byte must be appended
                   out_encoded.push_back( currentbyte | 0x80u ); 
                   hasHigherNonZero = true;
               }
           }
       }
       //Append the closing 0
       out_encoded.push_back(0);
   }