Pmd2 SIR0: Difference between revisions
Psy commando (talk | contribs) (Forgot about the End of File Data Block. I got no data on it, but if anyone does, please add to it !) |
Psy commando (talk | contribs) (Added new discoveries on the SIR0 format!) |
Revision as of 10:28, 2 January 2015
The SIR0 format is a pretty common wrapper file format. It's also fairly simple.
When a SIR0 file gets loaded into memory, its magic number turns from SIR0 to SIRO, and all the pointers in the file are modified to be relative to the NDS's memory instead of to the start of the file.
File Structure
Overview
Offset | Length | Endianness | Type | Name | Description |
---|---|---|---|---|---|
0x00 | 16 | | | Header | The SIR0 header. |
After Header | Varies | | | Content Data | The data wrapped by the SIR0. |
After Content Data | Varies | | | Content Padding | Some 0xAA padding bytes inserted to align the next section on 16 bytes. May be omitted completely if not required. |
After Content Padding | Varies | | | Pointer Offsets List | A list containing the offsets of every pointer in the entire file. |
After Pointer Offsets List | Varies | | | End of File Padding | Some 0xAA padding bytes that make the file size an exact multiple of 16 bytes. |
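The two padding rows above follow the same rule, which can be sketched as a one-line computation. The function name `PaddingTo16` is made up for this illustration and is not taken from the game or from any tool.

```cpp
#include <cstddef>

// Number of 0xAA bytes needed so that `size` becomes a multiple of 16.
// Returns 0 when `size` is already aligned, which matches the case
// where the padding is omitted completely.
std::size_t PaddingTo16(std::size_t size)
{
    return (16 - (size % 16)) % 16;
}
```

For instance, a 6-byte pointer offsets list that starts on a 16-byte boundary would need `PaddingTo16(6)`, that is 10, padding bytes after it.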
Header
Offset | Length | Endianness | Type | Name | Description |
---|---|---|---|---|---|
0x00 | 4 | big | | Magic Number | The 4 ASCII characters for "SIR0" (0x53 0x49 0x52 0x30). |
0x04 | 4 | little | uint32 | Pointer to Content's Header | A pointer to the header of the data the SIR0 contains. If there is no header, it points to the first byte after the SIR0 header. |
0x08 | 4 | little | uint32 | Pointer to End of File Data | A pointer to a block of data located after the contained data. |
0x0C | 4 | | | Null | 4 bytes of zeros. |
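As a rough illustration of the header table, here is a minimal sketch that reads the two pointers from a file already loaded into a byte vector. The names `SIR0Header`, `ReadLE32` and `ParseSIR0Header` are hypothetical, and the sketch assumes C++17 for `std::optional`. It only checks the on-disk "SIR0" magic, not the in-memory "SIRO" variant.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical struct; the fields mirror the header table.
struct SIR0Header
{
    uint32_t ptrContent; // 0x04: pointer to the content's header
    uint32_t ptrEndData; // 0x08: pointer to the end of file data
};

// Reads a little-endian uint32 from 4 consecutive bytes.
inline uint32_t ReadLE32(const uint8_t* p)
{
    return  static_cast<uint32_t>(p[0])        |
           (static_cast<uint32_t>(p[1]) << 8)  |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}

// Returns the parsed header, or nothing if the buffer is shorter than
// the 16-byte header or does not start with the "SIR0" magic number.
std::optional<SIR0Header> ParseSIR0Header(const std::vector<uint8_t>& file)
{
    static const uint8_t magic[4] = { 0x53, 0x49, 0x52, 0x30 }; // "SIR0"
    if (file.size() < 16 || std::memcmp(file.data(), magic, 4) != 0)
        return std::nullopt;
    return SIR0Header{ ReadLE32(&file[0x04]), ReadLE32(&file[0x08]) };
}
```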
Pointer Offsets List
This list is what makes the SIR0 container what it is! It's a "compressed" list of the offsets of all the pointers in the file, both those of the SIR0 structure and those of the contained structure. The game uses this list to adjust each pointer in the file after the file has been loaded into memory. The list always begins with 04 04, as those are the offsets of the 2 pointers in the SIR0 header.
A byte with a value of 0 indicates the end of the list.
If a byte has its highest bit (1000 0000) set to 1, then we have to "chain" the next byte. Here are the possible cases, using example values:
0x80 0x81 0x82 0x75 => (0x80 & 0x7F) << 21 | (0x81 & 0x7F) << 14 | (0x82 & 0x7F) << 7 | 0x75
0x80 0x81 0x12 => (0x80 & 0x7F) << 14 | (0x81 & 0x7F) << 7 | 0x12
0x80 0x06 => (0x80 & 0x7F) << 7 | 0x06
Each byte contributes 7 bits, so every additional chained byte shifts the earlier ones left by 7 more bits. Note that, since the offsets are stored as 32-bit integers, chaining more than 4 bytes is impossible. Also note that we get rid of the highest bit's value using the bitmask 0x7F!
If a byte's highest bit (1000 0000) is set to 0, it is the last (or only) byte of the value and we use it as is. Applying the 0x7F bitmask to it changes nothing.
Also, each time we compute a value using the above, we add it to the sum of all the previous values to get the actual offset. This is why the list starts with 04 04, and not 04 08 for example: the values are added to each other as we process them.
Example:
04 04 92 0C 14 00 AA AA AA AA AA AA AA AA AA AA
This list comes from the "/FONT/frame0.wte" file. We ignore all 0xAA bytes, as they are padding.
Here are the calculations:
4
4 + 4
4 + 4 + ( ((0x92 & 0x7F) << 7) | 0x0C )
4 + 4 + ( ((0x92 & 0x7F) << 7) | 0x0C ) + 0x14
And the results:
0x4
0x8
0x914
0x928
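To check the example in the other direction, here is a minimal sketch of an encoder that turns absolute offsets back into the chained-byte form. The function name `EncodeSIR0PointerList` is hypothetical. It assumes the offsets are strictly increasing and that each difference fits in 28 bits (4 chained bytes); how the game's own tools build the list is not known here.

```cpp
#include <cstdint>
#include <vector>

// Encodes absolute pointer offsets (assumed strictly increasing) into
// the chained-byte list described above, ending with a 0 byte.
std::vector<uint8_t> EncodeSIR0PointerList(const std::vector<uint32_t>& offsets)
{
    std::vector<uint8_t> out;
    uint32_t previous = 0;
    for (uint32_t offset : offsets)
    {
        // Stored values are relative to the previous pointer's offset.
        const uint32_t delta = offset - previous;
        previous = offset;

        // Emit 7-bit groups from most to least significant,
        // skipping the leading groups that are all zero.
        bool started = false;
        for (int shift = 21; shift >= 0; shift -= 7)
        {
            const uint8_t group = static_cast<uint8_t>((delta >> shift) & 0x7F);
            if (shift == 0)
            {
                out.push_back(group); // last byte: highest bit clear
            }
            else if (started || group != 0)
            {
                out.push_back(static_cast<uint8_t>(group | 0x80)); // more bytes follow
                started = true;
            }
        }
    }
    out.push_back(0x00); // end of list marker
    return out;
}
```

Feeding it the results above (0x4, 0x8, 0x914, 0x928) gives back the bytes 04 04 92 0C 14 00 from the example, without the 0xAA padding.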
Some Code: Here's a little snippet (untested as of now) to decode the string of bytes. It's mainly meant to help understand the process, not to be the most efficient or sensible algorithm.
 #include <cstdint>
 #include <deque>
 #include <vector>
 
 std::vector<uint32_t> Handle_SIR0_PointerList( const std::deque<uint8_t> & pointerlistbytes )
 {
     std::vector<uint32_t> result;
     uint32_t buffer  = 0;     //temp buffer to assemble longer offsets
     uint32_t offset  = 0;     //sum of all the values decoded so far
     bool     chained = false; //true if the previous byte had its highest bit set
 
     for( uint8_t curbyte : pointerlistbytes )
     {
         //A 0 byte ends the list. Assumption: inside a chain, a 0 byte is data
         // (0x81 0x00 encodes 0x80) and not the end marker.
         if( curbyte == 0 && !chained )
             break;
 
         buffer  = (buffer << 7) | (curbyte & 0x7F);
         chained = (curbyte & 0x80) != 0;
 
         if( !chained )
         {
             offset += buffer; //each value is relative to the previous offset
             result.push_back(offset);
             buffer = 0;
         }
     }
     return result;
 }
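Once the offsets are decoded, the adjustment the game is described as doing after loading could look roughly like the sketch below. This is a guess at the general idea, not the game's actual routine: it assumes every pointer is a 32-bit little-endian value and that the adjustment is a plain addition of the load address. The name `RelocateSIR0Pointers` and the `base` parameter are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds `base` to every 32-bit little-endian pointer located at the
// given offsets. `base` stands for the address the file was loaded at.
void RelocateSIR0Pointers(std::vector<uint8_t>& file,
                          const std::vector<uint32_t>& pointerOffsets,
                          uint32_t base)
{
    for (uint32_t off : pointerOffsets)
    {
        // Skip offsets that would read past the end of the buffer.
        if (static_cast<std::size_t>(off) + 4 > file.size())
            continue;

        uint32_t value =  static_cast<uint32_t>(file[off])
                       | (static_cast<uint32_t>(file[off + 1]) << 8)
                       | (static_cast<uint32_t>(file[off + 2]) << 16)
                       | (static_cast<uint32_t>(file[off + 3]) << 24);
        value += base;

        // Write the adjusted pointer back in little-endian order.
        file[off]     = static_cast<uint8_t>( value        & 0xFF);
        file[off + 1] = static_cast<uint8_t>((value >> 8)  & 0xFF);
        file[off + 2] = static_cast<uint8_t>((value >> 16) & 0xFF);
        file[off + 3] = static_cast<uint8_t>((value >> 24) & 0xFF);
    }
}
```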