~ 22 Apr 2010, 03:02
Two or three weeks ago I stumbled upon a strange (yet rare) bug in WinRAR's password handling. I abused WinRAR by giving it a password in Cyrillic letters (the bug doesn't appear if you stick to the usual 7-bit ASCII set). The whole experience boils down to the old saying: "Just because a program is written by a patriotic Russian guy doesn't mean it is Unicode-safe" :)
So, let me get slightly into the details... Did you know that you can use non-ASCII characters for archive passwords in WinRAR? Well, I didn't know either, so I went to find out. This happened while I was putting the finishing touches on my RAR password cracker (which I recently sped up by 10-20%, etc.). I started with WinRAR and created a test archive, using the word «България» as the password. WinRAR happily accepted the password and created the archive. On decryption, it also happily accepted the same password (and rejected some random other one). The console-mode utility for win32 also seemed to work in a cmd.exe window.
Step 2: I transferred my test RAR to the Linux box, where this time it didn't decrypt (it said my password was wrong). That was strange. Initially I thought that the author of the `unrar' utility had made some implementation mistake in the Unicode string handling. However, after a bit of debugging, it turned out that this was not the issue: my Fedora 11 terminal was passing the UTF-8 string correctly and all the underlying machinery worked flawlessly. The console `unrar' utility was actually doing the right job! So, after a bit of tinkering with the Windows CLI again, I noticed that whenever I typed Cyrillic characters in the terminal, question marks appeared instead. So, in cmd.exe, "България" would equal "????????". When I tested the latter "password" (the eight question marks), the archive decrypted successfully.
The cause of this strange behaviour turned out to be hidden in Control Panel->Regional and Language Options->"Language for non-Unicode programs". When I set it to "Bulgarian", both programs (WinRAR and the CLI `rar') behaved correctly, with "България" no longer being equal to "????????".
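To make the effect concrete, here is a minimal Python sketch of what such a lossy conversion looks like. I am assuming the password gets squeezed through the system's legacy code page, with every unmappable character replaced by "?"; this is not WinRAR's actual code. The helper name `to_codepage` is made up for the illustration, and cp1252 / cp1251 stand in for a "Western" and a "Bulgarian" setting respectively.

```python
# Sketch only: imitates a lossy Unicode -> legacy code page conversion.
# Assumption: unmappable characters are replaced with "?", which is how
# Python's errors="replace" behaves when encoding.

def to_codepage(password: str, codepage: str) -> bytes:
    """Encode a password into a legacy code page, replacing what doesn't fit."""
    return password.encode(codepage, errors="replace")

# cp1252 (Western European) has no Cyrillic letters at all.
western = to_codepage("България", "cp1252")

# cp1251 (Cyrillic) can represent every letter of the password.
cyrillic = to_codepage("България", "cp1251")

print(western)                                   # every letter was replaced
print(cyrillic.decode("cp1251") == "България")   # nothing was lost
```

With the Western code page all eight letters collapse into the same byte, while the Cyrillic code page keeps the password intact, which matches what I saw before and after changing the setting.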
You have probably guessed what the bug is by now. Consider a clueless dude sitting in front of an all-default Windows computer (where the aforementioned Windows setting is not "Bulgarian"). The dude creates an archive and sets the hyper-strong password "ХакерЩрасе". But in reality, the password he actually sets for the archive turns out to be the trivially breakable "??????????"!
Ironically, it is evident from the unrar utility's source code that the author (who is Russian, btw) has tried hard to support non-ASCII characters... but his program turns out not to be Unicode-safe anyway.
The bogus interpretation of Cyrillic symbols as question marks hints at a subtler problem with WinRAR's security: the password input dialog should detect such "patently weak" passwords and (at least) warn the user, so he or she can consider using a stronger password. This way, hidden implementation problems like the one I described would be detected as a simple side effect.
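A check of that kind could be as simple as the following Python sketch. The function name `looks_patently_weak` and the rule it applies (an empty password, or one made of a single repeated character) are my own illustrative choices, not anything WinRAR does; a real dialog would probably want more heuristics. The point is that the check has to run on the password as the program actually ends up seeing it, after any lossy conversion.

```python
# Sketch only: one possible "patently weak" heuristic, applied to the
# password as the program actually sees it after conversion.

def looks_patently_weak(password: str) -> bool:
    """True for an empty password or one consisting of a single repeated character."""
    return len(set(password)) <= 1

typed = "България"

# Assumption: the lossy step replaces unmappable characters with "?",
# imitated here with a round trip through a non-Cyrillic code page.
seen_by_program = typed.encode("cp1252", errors="replace").decode("cp1252")

print(looks_patently_weak(typed))            # the password as the user typed it
print(looks_patently_weak(seen_by_program))  # the password after mangling
```

The typed password passes the check, but the mangled one does not, so a warning at this point would have exposed the conversion problem without the dialog knowing anything about code pages.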