Decrypting .pkm files; help?


Alright, so I'm developing a My Pokemon Ranch save editor. I've figured everything out thus far except how to decrypt the .pkm files. As of now, I allow exporting Pokemon for Pokesav to handle any editing, but I'd love to at least be able to display the icon for each Pokemon in the box. Right now they're all egg icons, and you have to export and view in Pokesav to see which Pokemon you're editing (which makes changing the depositor for a specific Pokemon pretty danged hard).

I've seen the breakdown of the algorithm here and on Bulbapedia, but no matter what I do, I can't get results. I've tried reversing the bytes within each 16-bit word for big/little endian changes, and nothing. I thought this might be the best place to go for this, as there's really no example data to go off of anywhere online that I could find. Chances are I'm misreading the text or misprogrammed something, so if someone who knows this algorithm could help me through a decryption, I'd be quite thankful.

So, here's an example of mine (code broken up as my program uses it):

CODE:

{"6C4BF71F","0000","39AC",

"95E1","747E","D327","B69D","523F","F719","2A57","2CC0",

"FD8C","023C","5755","8CC2","7D33","92EB","888C","1B54",

"9723","9C8E","D706","9DDA","CDF4","32DD","B2AC","0124",

"5276","FCFA","90A7","ADEB","89F3","A23A","B2A0","3210",

"CC26","813E","1056","F4C7","F4A4","8A95","D584","540D",

"3AE7","04B7","8968","C8C1","C22A","64CE","9769","2AB5",

"B092","0E2C","50A5","DA78","4085","D63D","8E5B","F692",

"F672","C612","4F9C","B2FE","9418","37B1","CD85","2847"};

WHAT IT IS:

A Golem (pokemon #76, = 0x4C)

Holding a Blue Scarf (dunno the hex for this)

TID: 20202 = 0x4EEA

SID: 4916 = 0x1334

Level: 31 = 0x1F

XP: 24294 = 0x5EE6

IVs: 15/18/15/0/6/18 => 0x7c9e0348, I think?

So, step 0 would be to find out the block order, I guess:

pv = 0x6C4BF71F = 1816917791

((pv >> 0xD) & 0x1F) % 24

(221791 & 0x1F) % 24

31 % 24

7

So the block order is BADC.
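The lookup above can be sketched in a few lines of Python. This is a minimal sketch that assumes the 24 block orders are simply the permutations of ABCD in alphabetical order (which agrees with index 7 being BADC); `block_order` and `BLOCK_ORDERS` are names made up for illustration.

```python
from itertools import permutations

# Assumption: the 24 block orders are the permutations of "ABCD" in
# alphabetical order, so index 0 is ABCD and index 7 is BADC.
BLOCK_ORDERS = ["".join(p) for p in permutations("ABCD")]

def block_order(pv: int) -> str:
    """Return the block order string for a 32-bit personality value."""
    index = ((pv >> 0xD) & 0x1F) % 24
    return BLOCK_ORDERS[index]

print(block_order(0x6C4BF71F))  # BADC (index 7)
```

Note that the result depends entirely on how the four PID bytes were read into `pv`; a different byte order gives a different index.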

Next is what I assume I'm getting wrong:

Seeding rand with the checksum:

rand(X[n]): X[n+1] = (0x41C64E6D * X[n] + 0x6073) % 0x100000000

rand(chksm) = rand (0x39AC) =

(0x41C64E6D * 0x39AC + 0x6073) % 0x100000000

(0xED158B2F63C + 0x6073) % 0x100000000

0xED158B356AF % 0x100000000

0x58B356AF

And we take the upper 16 bits of that 32-bit result, so X[1] = 0x58B3.

Now, I found a Japanese site that says to seed rand with the PID, not the checksum. I'm assuming what I've read here is correct, though, and used the checksum. (Even with the PID used, however, the result is incorrect.)

So, XORing X[1] with the first encrypted word (0x95E1) gives us 0xCD52.

We use the output of rand(1) for rand(2):

(0x41C64E6D * 0x58B3 + 0x6073) % 0x100000000

(0x16CA289E4E37 + 0x6073) % 0x100000000

0x16CA289EAEAA % 0x100000000

0x289EAEAA

X[2] = 0x289E

So the second word decrypted is 0x289E XOR 0x747E = 0x5CE0. (I tried using the full 4-byte output as well. Still didn't work.)
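For checking the hand arithmetic, here is a small sketch of one generator step. It also shows where two readings of the algorithm diverge: feeding only the 16-bit output back in (as in the calculation above) versus carrying the full 32-bit state forward, which is how the `rand()` function posted in the reply further down behaves. `lcg_step` is an illustrative name, not something from the thread.

```python
MASK32 = 0xFFFFFFFF

def lcg_step(state: int) -> int:
    """One step of X[n+1] = (0x41C64E6D * X[n] + 0x6073) mod 2**32."""
    return (0x41C64E6D * state + 0x6073) & MASK32

x1 = lcg_step(0x39AC)
print(hex(x1), hex(x1 >> 16))   # 0x58b356af 0x58b3

# Reseeding with only the 16-bit output reproduces the hand calculation:
print(hex(lcg_step(x1 >> 16)))  # 0x289eaeaa

# Carrying the full 32-bit state forward leads to a different next state:
x2 = lcg_step(x1)
print(hex(x2), hex(x2 >> 16))
```

Python integers never overflow, so the explicit mask is what keeps the state at 32 bits.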

Continuing this, we get the decrypted pkm data as:

B:

cd525ce04782bcba9679275cf07b001e

e2ff8da89b21589825994a4e3e70a1a3

A:

1c72e0da7408411203a7d4ba1393e604

7bf8c275546382343dd70d322f6fff62

D:

db3a8382ca683817833a4c4cf4457290

fca5517fb39facdbe6eca6a66897e055

C:

b7f5ecc2673c30262a05299d2a54124e

d2369da6f2cc472180d51e5c6aa308de

Now, since the order is BADC, I expect to see either 0x004C or 0x4C00 at the beginning of the A Block...yet what's there is 0x1c72. None of the other values show up either. Obviously, there's something wrong here. Can anyone help me figure out what it is? I'm completely lost.

Link to comment
Share on other sites

You should use "& 0xFFFFFFFF" instead of "% 0x100000000", btw. It'll do the same thing, it's just more efficient, assuming the compiler doesn't make that optimization for you.

Each time rand() is called, seed is set to the result of 0x41C64E6D * seed + 0x6073. The result of rand(), however, is shifted down by 16 bits before being used. So you can define rand() as:

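#include <stdint.h>

/* 32-bit generator state; unsigned arithmetic wraps modulo 2^32 on its own. */
uint32_t seed;
void srand(uint32_t newSeed) {
 seed = newSeed;
}
/* Advance the full 32-bit state, then return only its upper 16 bits. */
uint32_t rand() {
 seed = 0x41C64E6D * seed + 0x6073;
 return seed >> 16;
}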

To decrypt:

srand(checksum >> 16);

Then, starting at offset 8 of the Pokemon data, XOR every 2 bytes with the result of rand().
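Put together, the description above amounts to something like the following Python sketch. It assumes the input is the 8-byte header followed by the encrypted words, that each 16-bit word is read little-endian, and that `checksum` is passed in as the plain 16-bit value. None of those details are spelled out in this post, and `decrypt_words` is an illustrative name.

```python
import struct

def decrypt_words(data: bytes, checksum: int) -> bytes:
    """XOR every 2-byte word from offset 8 onward with successive rand() outputs.

    Assumptions: words are little-endian and `checksum` is the 16-bit seed.
    """
    seed = checksum & 0xFFFF
    out = bytearray(data)                       # header (first 8 bytes) is left as-is
    for offset in range(8, len(out) - 1, 2):
        seed = (0x41C64E6D * seed + 0x6073) & 0xFFFFFFFF   # keep full 32-bit state
        (word,) = struct.unpack_from("<H", out, offset)
        struct.pack_into("<H", out, offset, word ^ (seed >> 16))
    return bytes(out)

# XOR is symmetric, so applying the function twice restores the input.
sample = bytes(8) + bytes.fromhex("95E1747E")
once = decrypt_words(sample, 0xAC39)
print(decrypt_words(once, 0xAC39) == sample)  # True
```

Since the same routine both encrypts and decrypts, a round trip like this is a cheap self-check, though it cannot tell you whether the byte order or seed is the right one.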

Link to comment
Share on other sites

To decrypt:

srand(checksum >> 16);

Then, starting at offset 8 of the Pokemon data, XOR every 2 bytes with the result of rand().

Wait, the checksum is only 16 bits long. (checksum >> 16) would result in 0, right? Or am I just to seed the first round of rand with the checksum with no modifications, as I've been doing?

So then, if the value of seed is 11 nybbles long before being returned, the >> 16 would mean that a 7-nybble result is returned. Am I correct? That might be the issue (the & 0xffffffff ensures a truncation to 8 nybbles). I'd love to see some intermediary results. Is there a way you could whip up a quick program that prints the in-between steps? I'd love to pinpoint the problem in my code where I'm in error without tearing my hair out.

Link to comment
Share on other sites

Wait, the checksum is only 16 bits long. (checksum >> 16) would result in 0, right? Or am I just to seed the first round of rand with the checksum with no modifications, as I've been doing?

The checksum is stored in the Pokemon data as XXXX0000. The method used for encryption only uses the 2 upper bytes, so it needs to be shifted down.

So then, if the value of seed is 11 nybbles long before being returned, the >> 16 would mean that a 7-nybble result is returned. Am I correct? That might be the issue (the & 0xffffffff ensures a truncation to 8 nybbles).

I have no idea what you're referring to by "nybbles". The seed used by the rand function is 4 bytes, but in the case of Pokemon encryption, only 2 bytes are given to it.

I'd love to see some intermediary results. Is there a way you could whip up a quick program that prints the in-between steps? I'd love to pinpoint the problem in my code where I'm in error without tearing my hair out.

I could, but I can't say that I want to, sorry.

Link to comment
Share on other sites

From http://projectpokemon.org/wiki/Pokemon_NDS_Structure :

0x00-0x03 Personality value (Also known as the PID)

0x04-0x05 Unused

0x06-0x07 Checksum

The checksum is 16 bits.
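As a sketch of how those three header fields could be pulled out of raw bytes with Python's `struct` module: the sample bytes are the first eight from the dump at the top of the thread, assuming the hex strings there are the bytes in file order. The wiki excerpt doesn't state a byte order, so both readings are shown.

```python
import struct

header = bytes.fromhex("6C4BF71F000039AC")  # first 8 bytes of the dump above

# "<IHH": little-endian uint32 (PID), uint16 (unused), uint16 (checksum).
pid, unused, checksum = struct.unpack("<IHH", header)
print(hex(pid), hex(unused), hex(checksum))  # 0x1ff74b6c 0x0 0xac39

# ">IHH": the same fields read big-endian instead.
pid_be, _, checksum_be = struct.unpack(">IHH", header)
print(hex(pid_be), hex(checksum_be))         # 0x6c4bf71f 0x39ac
```

The two readings give different PIDs and checksums, so everything downstream (block order, seed) changes with the byte order chosen here.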

Also, nybbles are half of a byte (don't look at me like that, I didn't make up the name!). So, 0xEF is made of the nybbles E and F. Essentially, I was asking what that shift by 16 bits meant mostly since the output of that function, when done by hand, doesn't always give the same length response. I assume it's meant to be constricted to 4 bytes before the shift, though, so just ignore that part.

Also, I assume you've never done the decryption yourself then? Because honestly, all it would take is a few lines over a program that already does it (to print out the results). That should take like 5 minutes. I don't need source code or anything, if that's a big deal. I know the theoretical algorithm, and it doesn't seem to work; I can only hope someone is kind enough to help me sort this out. :(

EDIT: I guess you have done this before (quite a bit, actually!), seeing the link in your sig. Is it really too much to ask, though? I mean this in all honesty, I'm not trying to be facetious or rude. :o I don't want to be bothersome, but my rand() output matches all other examples I can find on this site (including the one on the first page of your topic, by Morfeo, about the B-A-C-D Method), and I can't for the life of me figure out why nothing corresponds.

Link to comment
Share on other sites

Just because they have the data there doesn't mean it's guaranteed to be accurate. Yes, there are two zeroed bytes with the checksum, but if you change those bytes to non-zero, it'll cause problems. So I simply consider it part of the checksum. Considering modifying the two zeroed bytes causes problems, they're not unused as documented on the wiki.

Also, nybbles are half of a byte (don't look at me like that, I didn't make up the name!). So, 0xEF is made of the nybbles E and F.

I assumed this is what you meant, but I don't know why you'd be measuring anything in half-bytes.

Essentially, I was asking what that shift by 16 bits meant mostly since the output of that function, when done by hand, doesn't always give the same length response. I assume it's meant to be constricted to 4 bytes before the shift, though, so just ignore that part.

That's correct, the output is uint32, so anything beyond 32 bits is simply truncated. There was some assumption that you'd be familiar with C++-like concepts, sorry.

I guess you have done this before, seeing the link in your sig. Is it really too much to ask, though? I mean this in all honesty, I'm not trying to be facetious or rude. :o

The problem is that I'm a very avid proponent of not doing the work for people. I realize you're not entirely asking for that, but you have more than enough information here to do what you want. Also, just as a note, you shouldn't take anything I'm saying as hostile or negative, because I assure you that's not what's going through my head. I just don't fill my posts with smileys to help show that.

Link to comment
Share on other sites

Heh, that assumption went over my head. I've got Java, Python, Perl, and a few other minor languages under my belt, but C and C++ have escaped me thus far. I understand what you mean, but I am literally at a roadblock. I have mulled through my program over and over and have made no headway, even with all the information there. May I ask at least a few more things, though? I don't think I'll get any help besides your tips, which I may not sound appreciative of, but I very much am. (Also, apologies for the over-use of smilies; I put them in for the reason you stated, as lots of cues in expression and tone are lost via plain text over the web.)

1) Will I see a 0x004C signifying that it's a Golem (since he's #76)? That's the key I'm usually looking for in my output.

2) Are any of the words flipped due to endian-ness before their use in calculations?

3) Can you think of any other place I might ask this request?

If I ever figure this out, I'm definitely going to provide some more thorough documentation on this. I'm a little disappointed that no one's published a hard example to show.

Link to comment
Share on other sites

Also, apologies for the over-use of smilies; I put them in for the reason you stated, as lots of cues in expression and tone are lost via plain text over the web.

There's no need to apologize. I wasn't implying that I'm annoyed by other people using them.

1) Will I see a 0x004C signifying that it's a Golem (since he's #76)? That's the key I'm usually looking for in my output.

At the appropriate offset for the species value, yes.

2) Are any of the words flipped due to endian-ness before their use in calculations?

No, they're all stored in little endian, which is what PCs read and write with.
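Byte order is easy to get wrong when working by hand, so one quick sanity check is to decrypt just the first word under both readings and see which one yields a plausible species number. The sketch below does that with the checksum bytes (39 AC) and the first encrypted word (95 E1) from the dump at the top of the thread, assuming those hex strings are the raw bytes in file order. `first_word` is an illustrative helper.

```python
def first_word(checksum_bytes: bytes, word_bytes: bytes, byteorder: str) -> int:
    """Decrypt the first encrypted word using the given byte order."""
    seed = int.from_bytes(checksum_bytes, byteorder)
    word = int.from_bytes(word_bytes, byteorder)
    seed = (0x41C64E6D * seed + 0x6073) & 0xFFFFFFFF
    return word ^ (seed >> 16)

checksum_bytes = bytes.fromhex("39AC")
word_bytes = bytes.fromhex("95E1")

print(hex(first_word(checksum_bytes, word_bytes, "big")))     # 0xcd52
print(hex(first_word(checksum_bytes, word_bytes, "little")))  # 0x4c
```

With this sample, the little-endian reading gives 0x004C (76, Golem), while the big-endian reading gives the 0xCD52 from the hand calculation in the first post.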

3) Can you think of any other place I might ask this request?

It's always more beneficial to your own knowledge to figure it out for yourself. Ask more questions, I have no problem answering them. To answer this quoted question though, no, I don't.

If I ever figure this out, I'm definitely going to provide some more thorough documentation on this. I'm a little disappointed that no one's published a hard example to show.

There essentially is with multiple open source projects available. Again though, I haven't done anything like this because of what I said before. I refuse to do the work for someone.

Link to comment
Share on other sites

I can agree that self-discovery is indeed a driving factor in personal growth, but there's a time when frustration sets in that wholly overshadows the will to find answers by oneself (and so the will completely fades and the project dies out...). I'm not usually one to ask for help, but this is perplexing me far beyond what I feel it should be. In all honesty, I feel like something was left out of the written algorithms. I can't come up with a reason why nothing else works. I've done the calculations by hand only to get the same failing results as my programs get.

At that point, I'd consider this more at an angle of education than simply doing the work for another. Honestly, why did you make your own program? One could argue that you should have merely shown others how to write their own code to create what you distribute yourself. While most of my passions are indeed self-taught, passion alone is not enough for anyone to overcome all challenges.

But I digress, and you seem staunch in your standing. I will continue to struggle, it seems.

One more hint I hope to gain: when seeding the next round of rand, do you use the 4-byte result of the last round or the shifted 2-byte value? While I've done both with failing results, I wouldn't mind knowing for certain which it is. Morfeo's display used the full 4-byte result, so I am leaning toward that.

Link to comment
Share on other sites


At that point, I'd consider this more at an angle of education than simply doing the work for another.

There's a lot of skepticism that I place toward people I don't know. In almost every case, people make little or no effort to figure something out for themselves and just want answers. That infuriates me. I'm not saying that's what you're doing, and it's pretty obvious that it isn't, so I agree that it'd be more educational. If you're really that desperate, I suppose I can give a coded example in a PM.

Honestly, why did you make your own program? One could argue that you should have merely shown others how to write their own code to create what you distribute yourself. While most of my passions are indeed self-taught, passion alone is not enough for anyone to overcome all challenges.

I thought about that after I made my post. You're right, it is essentially doing the work for people with the program, but not quite in the same way. I don't think I could accurately explain my views on the matter well enough to make any sense, so I'm not even going to try. Maybe it's just genuinely contradicting, I don't know.

One more hint I hope to gain: when seeding the next round of rand, do you use the 4-byte result of the last round or the shifted 2-byte value? While I've done both with failing results, I wouldn't mind knowing for certain which it is. Morfeo's display used the full 4-byte result, so I am leaning toward that.

As you can see by my example rand() function, it sets the seed to the 4-byte result and returns a 2-byte result. The next call deals with what "seed" was set to.

One thing that may be ruining your efforts: the blocks aren't reorganized prior to decrypting. This only occurs after decryption.
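The ordering point can be sketched like this: decrypt first, then split the 128 decrypted bytes into four 32-byte blocks and put them back in ABCD order. The sketch assumes the order string names the blocks in the sequence they appear in the file (so BADC means block B comes first); `unshuffle` is an illustrative name.

```python
def unshuffle(decrypted: bytes, order: str) -> bytes:
    """Rearrange four 32-byte blocks stored in `order` (e.g. "BADC") into ABCD."""
    if len(decrypted) != 128 or sorted(order) != list("ABCD"):
        raise ValueError("expected 128 bytes and a permutation of ABCD")
    blocks = {name: decrypted[i * 32:(i + 1) * 32] for i, name in enumerate(order)}
    return b"".join(blocks[name] for name in "ABCD")

# Dummy data: each block is filled with its own letter, stored in BADC order.
shuffled = b"".join(ch.encode() * 32 for ch in "BADC")
print(unshuffle(shuffled, "BADC")[::32])  # b'ABCD'
```

Running this on still-encrypted data would only move ciphertext around, which is why the XOR pass has to come first.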

Link to comment
Share on other sites

There's a lot of skepticism that I place toward people I don't know. In almost every case, people make little or no effort to figure something out for themselves and just want answers. That infuriates me. I'm not saying that's what you're doing, and it's pretty obvious that it isn't, so I agree that it'd be more educational. If you're really that desperate, I suppose I can give a coded example in a PM.

Thanks, it would help a lot. Honestly, I really only need maybe the first 3 rounds (words) just so I can check that the algorithm is correct. Once that starts going, the rest ought to follow.

As you can see by my example rand() function, it sets the seed to the 4-byte result and returns a 2-byte result. The next call deals with what "seed" was set to.

Yeah, I've been using the 4-byte result.

One thing that may be ruining your efforts: the blocks aren't reorganized prior to decrypting. This only occurs after decryption.

Yeah, the Data Structure page mentions that. I've been looking toward the right block for the fabled 0x004C.

I can understand your defense, believe me. A lot of work is required to really pound out the details for some of this stuff. I remember really struggling with some of my other projects (most notoriously, finding some random snippets of text for a translation that were, for whatever reason, split up from the rest with like 12 bytes between each letter), and it kind of threw out that exploratory feeling just telling another translator where they were. I imagine you have a lot on your plate as well, and I can't thank you enough for putting up with me and setting aside some time to help. I'm just somewhat baffled that such a simple calculation is throwing me for a loop. I normally don't get caught this badly.

In the hopes of Python somehow screwing up some calculation with its odd 'L' character after some bytes, I reworked my original Java program. It, however, returned the same result. Drat. If it helps, I can relay one of those to you via PM to look over. They're not the most elegant, but they reflect my hand-calculations. Let me know if you'd want a peek at those.

Link to comment
Share on other sites

I'm just somewhat baffled that such a simple calculation is throwing me for a loop. I normally don't get caught this badly.

I'd imagine it happens to everyone. It certainly does to me. I can't tell you how long it took me to wrap my head around the logic for the PID finder initially.

In the hopes of Python somehow screwing up some calculation with its odd 'L' character after some bytes, I reworked my original Java program. It, however, returned the same result. Drat. If it helps, I can relay one of those to you via PM to look over. They're not the most elegant, but they reflect my hand-calculations. Let me know if you'd want a peek at those.

I have no experience in Java, and my only Python experience is from modding a game that used it for what I guess would be considered modules. I doubt I'd have trouble understanding either, though, so feel free to post if you want.

Link to comment
Share on other sites

  • 1 month later...

Excuse my lack of knowledge, but I got a little lost at the contents of this topic. OK, figuring the hex string for the moves was easy enough.

Let me say, I know hex. In years gone I did a lot of game memory hacking on the original Playstation with Gran Turismo 2 and even had some scripts on a Web site to generate Action Replay Codes (with their attendant encryption) from game data in "English". I know bit "functions" such as And (if bits are 1 on both arguments) and Or (1 on either) and I presume Xor (1 on either but not both?).

Is there somewhere with an example decryption taken at a slightly lower level that I may be able to follow? The reason I ask is that I like to put the PP values into Pokegen and currently have the moves in Excel, for ease; given a decent insight into the workings of the encryption, I could create a spreadsheet to generate an AR code from the Pokemon data that would automatically pick up the PP value. I do have the hex functions active in my spreadsheet and do use them.
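For reference, And, Or and Xor behave as described above (Xor is 1 where exactly one of the two bits is 1), and Xor is the operation the encryption relies on: applying the same key word twice returns the original value. A tiny illustration in Python, with arbitrary example values:

```python
a, b = 0b1100, 0b1010

print(format(a & b, "04b"))  # 1000  (1 where both bits are 1)
print(format(a | b, "04b"))  # 1110  (1 where either bit is 1)
print(format(a ^ b, "04b"))  # 0110  (1 where exactly one bit is 1)

# XOR is its own inverse, so encrypting and decrypting are the same operation.
key = 0xE1D9
plain = 0x004C
cipher = plain ^ key
print(hex(cipher), hex(cipher ^ key))  # 0xe195 0x4c
```

In a spreadsheet the same idea applies word by word, provided the key stream is generated with 32-bit wraparound at each step.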

Link to comment
Share on other sites
