About this blog

Thoughts and tales from the mind of @evandixon

Entries in this blog

evandixon

By night and by weekend I'm an admin here at Project Pokémon (when I'm not busy saving Hyrule or doing other stuff). But by day, I'm a software developer for a certain company. This company has an internal legacy system written in Classic ASP (a really old language used to make websites) that we have to maintain, and releasing changes to it is never fun. Today we released a month's worth of changes, far more than we like to release at one time. (We couldn't release sooner because reasons; although, we probably should have added a feature switch. Lesson learned.)

Within this legacy system, there's a page that is designed to be printed out. Users then scan an I2of5 barcode on it with a handheld scanner, making it easier to continue their work. After we updated the system, however, we got reports that the barcodes wouldn't scan anymore. Barcodes that were printed before this update worked fine. We reprinted a barcode that had been working, and sure enough, the new release generated a different barcode from the same source text.

I had my hands all over the page with this barcode, but I knew I hadn't touched the barcode itself. Thanks to source control, we were able to use git blame and look at the entire code path responsible for generating this I2of5 barcode. This kind of barcode is generated by encoding some text into what looks like garbage, but when that garbage is displayed in a particular font, it looks and scans like a barcode. Unfortunately, nothing in that code path had changed for years, and since our release a month earlier had gone out without any (related) issues, we had to find out what else had changed. Comparing the output of the old and new code confirmed that the barcodes were different. The new barcode's garbage-looking text had some special characters in it called "replacement characters", which are inserted to indicate that something's wrong with the character encoding (more on that in a bit). We inspected the HTTP headers and HTML metadata, and there was no difference, so it wasn't a matter of the web browser interpreting the raw text data incorrectly. So we had to try something else.
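To make "replacement character" concrete, here is a minimal Python sketch. The byte values are made up for illustration and are not taken from our barcode font, and Latin-1 stands in for whichever single-byte "extended ASCII" code page is actually in play. It only shows what happens when bytes in the 128-255 range are decoded under the wrong encoding.

```python
# Hypothetical barcode payload: bytes in the 128-255 range, similar in spirit
# to what a font-based barcode encoder might emit. The values are invented.
raw = bytes([0xC9, 0xE5, 0xA5, 0xCA])

# Decoded as a single-byte code page (Latin-1 here, as an assumption),
# every byte maps to exactly one character.
as_single_byte = raw.decode("latin-1")

# Decoded as UTF-8, these bytes do not form valid sequences, so the decoder
# substitutes U+FFFD (the replacement character) where decoding fails.
as_utf8 = raw.decode("utf-8", errors="replace")

print(repr(as_single_byte))
print(repr(as_utf8))
```

The single-byte decoding can be turned back into the original bytes, while the UTF-8 decoding cannot: once a replacement character appears, the original byte value is gone, which is why a barcode font has nothing sensible left to draw.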

We used a git bisect (or rather a manual version of it, because of a bunch of branching weirdness - don't ask) and eventually found the exact commit that introduced the issue. The only thing that changed about the page in question was that a new ASP file was included. This ASP file, along with the ones it in turn includes, is simply a container for functions and classes and is not intended to write anything to the page. To debug what was causing the problem, we removed parts of this included file until finally there was nothing left. We tried removing the reference to this file altogether, and that made the problem go away. We put the reference back with the file still empty, and to double check, we deleted that file and re-created it to make sure it really was empty. Turns out including an empty text file causes issues with our barcode!

What is a text file? It's just a file containing bytes that programs interpret as text. But there are different kinds of text. ASCII text interprets one byte as one character. One byte can have a value from 0-255, and while standard ASCII only has characters for 0-127, the various "extended ASCII" code pages have characters for 128-255 too. There are also encodings of Unicode, where multiple bytes can be used for a single character, which is useful for characters that aren't found in English. How is a program to know what kind of text is in a text file though? One way is by adding extra bytes to the beginning of the file to indicate the text encoding: the Byte Order Mark (BOM). Using HxD (a hex editor), I found that these "empty" text files weren't in fact empty: they contained the three BOM bytes (EF BB BF) that mark the file as UTF-8, a Unicode encoding that is identical to ASCII for byte values 0-127 but uses multi-byte sequences for everything else, built from bytes in the 128-255 range. By including this "empty" file, we were including the BOM, which changed the character encoding of the entire page, completely changing the meaning of the text that's used to display the barcodes (because it uses extended ASCII characters for some reason).
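If you don't have a hex editor handy, a few lines of Python can do the same check. This is only a sketch: the file name `empty_include.asp` and both helper functions are hypothetical, and the demo fakes an editor saving an empty file as "UTF-8 with BOM" by writing the three BOM bytes directly.

```python
import codecs
import tempfile
from pathlib import Path


def has_utf8_bom(path: Path) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark (EF BB BF)."""
    with path.open("rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def strip_utf8_bom(path: Path) -> bool:
    """Remove a leading UTF-8 BOM in place. Returns True if one was removed."""
    data = path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True


# Demo with a hypothetical include file that looks empty in a text editor.
with tempfile.TemporaryDirectory() as tmp:
    include = Path(tmp) / "empty_include.asp"
    include.write_bytes(codecs.BOM_UTF8)   # the only content is the BOM itself
    size_before = include.stat().st_size   # size on disk while the BOM is present
    had_bom = has_utf8_bom(include)
    removed = strip_utf8_bom(include)
    size_after = include.stat().st_size    # size on disk after stripping

print(size_before, had_bom, removed, size_after)
```

Reading the file in binary mode is the important part. Opening it as text lets the decoder hide or reinterpret the BOM, which is exactly how an "empty" file ends up not being empty.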

Thankfully, this whole release didn't go as badly as it could have. Because of our Blue/Green deployment setup, we weren't under as much pressure as we otherwise could have been. We only had to revert to the previous version twice, and the release only took 8 hours (because of testing and debugging other issues*).

* Bonus material: when using PowerShell to create a virtual application in IIS, note that "userName" is case sensitive in some versions of IIS. In 7.5, it's fine to use "username", but in 8.5, that case sensitivity will cause setting the application credentials to silently fail. (Not that we encountered a situation exactly like this one or anything.)