Recent revelations around security on the Web have left me shaken, not stirred. If I can’t trust people in positions of power to respect the concept of ‘innocent until proven guilty’, it strikes me that withdrawal of my trust would be a rather rational response.
To that end, I’ve been consumed this past week with figuring out how to password-encrypt certain files I’m generating that I’d rather people spend effort on if they stumble across them. I also freely admit that my inner Loki is mighty pleased by picturing the delightful dilemma others may face in trying to square off my ‘secrets most mundane’ against the inevitable sunk cost fallacies they’ll need to concoct to justify the discovery time.
I’ve settled on symmetric key encryption via Advanced Encryption Standard (AES) because a) I’m a cheap-arse, b) I’m too lazy to go the extra effort of grokking asymmetric key encryption, and c) AES has a good rep for the less challenging arena of symmetric key encryption.
Now that I have a working implementation, I thought I’d write something that I wish I had whilst I was getting up to speed. I’ve discovered 5 things that you should know if you’d like to implement AES in Java.
1. The Java SDK limits you to AES-128 by default.
Java’s SDK has been capable of offering ‘strong’ encryption ever since Java 1.4.2, when the Java Cryptography Extension (JCE) was merged into the base library. Now, ‘strong’ encryption in Java is limited to the AES-128 variant. Java makes the distinction between ‘strong’ and ‘unlimited’ because certain countries impose import-control restrictions that limit the amount of cryptography they’ll allow.
If you want to build applications that allow ‘unlimited’ encryption from the Java SDK, you’ll need to download the Unlimited Strength Jurisdiction Policy Files. And yes, you’ll need to ship those unlimited policy jars as part of your final deployment.
Now, before you get your panties in a twist, even Bruce Schneier considers AES-128 adequate enough. Yes, AES-256 is a bit stronger, but 128 will ‘do’ for a while yet despite what’s been thrown at it.
But… but… as a developer, I’m (now) totally aware that the difference between AES-128 and AES-256 is a single constant I use to tell the byte-array that holds my key how long to be. It’s exactly the same effort to code, so why wouldn’t I go with stronger? Ah the stories one may weave!
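To make that ‘single constant’ concrete, here’s a minimal sketch using only the standard JCE classes. The class and method names (AesKeyLength, maxAllowedBits, newKey) are mine, invented for illustration. It asks the runtime how large an AES key the installed policy files permit, then generates a key no bigger than that.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of any library.
final class AesKeyLength {
    // The single constant that separates AES-128 from AES-256.
    static final int KEY_BITS = 128; // 256 needs the 'unlimited' policy

    // Largest AES key size (in bits) that the installed policy permits.
    static int maxAllowedBits() throws NoSuchAlgorithmException {
        return Cipher.getMaxAllowedKeyLength("AES");
    }

    // Generates a random AES key of the requested size.
    static SecretKey newKey(int bits) throws NoSuchAlgorithmException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(bits);
        return generator.generateKey();
    }

    public static void main(String[] args) throws Exception {
        // Never ask for more than the policy allows.
        int bits = Math.min(KEY_BITS, maxAllowedBits());
        SecretKey key = newKey(bits);
        System.out.println("Policy allows up to " + maxAllowedBits() + " bits");
        System.out.println("Key is " + key.getEncoded().length + " bytes long");
    }
}
```

If the policy check reports only 128, asking for a 256-bit key will fail at the point where you initialise the cipher, which is a confusing place to find out.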
2. Alternatives EXIST to the Java SDK for CRYPTO.
My favourite is the Legion of the Bouncy Castle (what an absolutely fantastic name for a cryptography library). And, because we’re not (yet) pandering to US cryptography law here in Australia, there’s no distinction drawn between ‘strong’ and ‘unlimited’. Crippled Crypto cannot bunk in a Bouncy Castle!
Also, it implements the whole of Java’s blessed JCE interface, giving it an air of ‘appropriate fit’ with respect to how Java prefers its crypto rolled.
Finally, because it’s been registered as an Australian charity, I get tax-deductions for my donations! Oh God! This just keeps getting better and better!
There are others (like Jasypt), but, well.. I came for the crypto, but stayed for the LOLs.
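Because these libraries plug in behind the JCE interface, swapping one in is mostly a matter of naming the provider you want. Here’s a rough sketch of that idea using only the standard library. The class name ProviderCheck and the fallback behaviour are my own assumptions for illustration. "BC" is the name Bouncy Castle registers under once its jar is on the classpath and the provider has been added. If it isn’t installed, the sketch falls back to whatever the JVM offers by default.

```java
import javax.crypto.Cipher;
import java.security.Provider;
import java.security.Security;

// Hypothetical helper class, not part of any library.
final class ProviderCheck {
    // Name Bouncy Castle registers itself under once installed, e.g. via
    // Security.addProvider(new BouncyCastleProvider()) with its jar present.
    static final String BOUNCY_CASTLE = "BC";

    // Returns an AES cipher from the named provider if it is registered,
    // otherwise from whichever provider the JVM prefers.
    static Cipher aesCipher(String providerName) throws Exception {
        String transformation = "AES/CBC/PKCS5Padding";
        Provider provider = Security.getProvider(providerName);
        return provider != null
                ? Cipher.getInstance(transformation, provider)
                : Cipher.getInstance(transformation);
    }

    public static void main(String[] args) throws Exception {
        Cipher cipher = aesCipher(BOUNCY_CASTLE);
        System.out.println("AES supplied by: " + cipher.getProvider().getName());
    }
}
```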
3. As a Black Box, AES is remarkably easy to use.
AES isn’t a trivial algorithm if you want to grok how it does what it does. However, it’s dead easy to understand from the outside. Here’s what I wish someone had told me before I started trawling web sites on AES trying to grok how to make use of it:
- It needs just three pieces of information:
  - A key. The number of bits in the key designates its strength. AES-128 literally means AES with a 128 bit key. AES-256 means a key with 256 bits.
  - An Initialisation Vector (IV), helping to randomise the encryption as it starts.
  - The content you want encrypted.
- So long as you use exactly the same key/IV pair, you’ll be able to decrypt whatever you’ve encrypted.
- The IV must be the same size as the size of a block of data in AES. This is ALWAYS 128 bits. JCE ciphers rely on byte-arrays, and because a Java byte is the same size everywhere, the IV will always be 16 bytes long (128 bits / 8 bits per byte).
- As per the above, your AES key needs to be a 32 element byte array if you want AES-256. Dial it back to 16 bytes for AES-128. Dead-easy.
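Seen from the outside, the whole black box fits in a handful of lines. This is a sketch rather than my final implementation: the class name AesBlackBox and its methods are made up for illustration, and the CBC mode with PKCS5 padding is an assumption on my part, picked because it’s a common transformation that takes an IV.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical helper class, not part of any library.
final class AesBlackBox {
    static final int KEY_BYTES = 16; // 16 for AES-128, 32 for AES-256
    static final int IV_BYTES = 16;  // always one AES block: 128 bits

    static byte[] randomBytes(int length) {
        byte[] bytes = new byte[length];
        new SecureRandom().nextBytes(bytes);
        return bytes;
    }

    // Runs the cipher in the given mode with the supplied key/IV pair.
    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] input)
            throws Exception {
        // Assumed transformation: AES in CBC mode with PKCS5 padding.
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(input);
    }

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) throws Exception {
        return run(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    static byte[] decrypt(byte[] key, byte[] iv, byte[] encrypted) throws Exception {
        return run(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = randomBytes(KEY_BYTES);
        byte[] iv = randomBytes(IV_BYTES);
        byte[] secret = "secrets most mundane".getBytes(StandardCharsets.UTF_8);

        byte[] encrypted = encrypt(key, iv, secret);
        byte[] decrypted = decrypt(key, iv, encrypted); // same key/IV pair
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

Change KEY_BYTES to 32 and the same code does AES-256, provided your policy files allow it.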
4. Take Great Care in Consuming Web Examples.
So, I’m a novice when it comes to security with the software I write. It turns out there are a lot of ways to do cryptography code wrong, and the popular examples out there are living proof of what ‘wrong’ looks like.
I’m not claiming that what I’ve built is tight. Only that it’s got fewer issues than the example code I started from. Also, please don’t take my supplying of this link as criticism. I found worse in my hunt, and this one allowed me a great bootstrap. It focuses just on the AES algorithm, which is good for seeing the particular AES tree in the forest of crypto code that you need to ultimately have written. Would I deploy with it? No. Am I grateful it showed only what was needed as I was de-nubefying? Yes.
However, there are things I now know that I feel you should be aware of before embarking on your own virgin AES journey:
- Use SecureRandom whenever you need to make things random. The basic Random class falls short. Specifically, password salt and IV should both be ‘very random’.
- Don’t reuse randomised IVs, salts, etc. across multiple encryptions. Every time you encrypt, you should also re-randomise those things that should be random.
- If you’re encrypting via a user-supplied password, do not just re-use the password as the AES key. Salt and hash the password into the key so dictionary, brute-force and rainbow attacks become far less likely to succeed.
- Consider salt size. You want it big enough that as random numbers go, it stands a good chance of being truly unique. I’ve seen between 16 and 32 bytes as a rule of thumb here.
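The salting and hashing advice above can be sketched with the JDK’s built-in PBKDF2 support. To be clear about assumptions: PBKDF2 with HMAC-SHA256 is one standard way to ‘salt and hash’ a password into a key, not necessarily what any given example uses, and the class name PasswordKey, the 16 byte salt and the iteration count are illustrative choices of mine.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.SecureRandom;

// Hypothetical helper class, not part of any library.
final class PasswordKey {
    static final int SALT_BYTES = 16;      // rule of thumb: 16 to 32 bytes
    static final int KEY_BITS = 128;       // 256 for AES-256
    static final int ITERATIONS = 100_000; // illustrative work factor

    // A fresh salt per encryption, drawn from SecureRandom (never Random).
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Salts and hashes the password into raw AES key bytes.
    // The same password and salt always yield the same key.
    static byte[] deriveKey(char[] password, byte[] salt) throws Exception {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } finally {
            spec.clearPassword(); // wipe the spec's copy of the password
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = newSalt();
        byte[] key = deriveKey("correct horse".toCharArray(), salt);
        System.out.println("Derived a " + key.length + " byte key");
    }
}
```

The deliberately slow iteration count is what makes brute-force and dictionary attacks expensive, and the per-encryption salt is what makes precomputed tables useless.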
5. Decryption Context Matters. Think about it!
Ultimately, to decrypt something you’ve encrypted via AES, you are going to need the same key/IV pair that you used initially. Consider what you are going to do with that ‘context’ over the elapsed time between your encryption and decryption events.
The answer you come up with depends entirely on what you are trying to do with encryption in the first place. I can’t answer for you what you should do here. However, I can briefly describe my own situation and reasoning for the final solution.
I am aiming for the password-encryption of a save-file, created via a project that I’ve released open-source. I have no interest in the code also managing the storage of the password anywhere. The user (me) must consider password storage external to the code. I also can’t rely on keeping the exact storage and retrieval tricks secret by locking the source-code up somewhere.
When it comes time to decrypt, by applying the same salt and hashing algorithm to the original password, the code is capable of deriving the original key. The password salt will need storage over time, most especially because the advice is that it be random per encryption event. The Initialisation Vector will also need storage over time for exactly the same reason.
So… the password is the missing external piece of the puzzle required to derive a valid decryption. Also, I don’t want to be reliant on anything other than the encrypted file and the code for decryption. Note that I’m actively choosing a trade-off between convenience and security here. Models involving storage of the salt and Initialisation Vector away from the encrypted content might be more secure, so I must be happy that the convenience of local storage is worth the reduced security.
It just so happens that I’m (mostly) fine with this tradeoff. I’m only “mostly” fine here because by storing the salt with the file, a rainbow table based on the salt can be derived once an attacker knows the file format. It’s been mentioned that hiding the salt and IV doesn’t add a great deal of security anyway, so I’m all flippy-floppy about this particular decision.
Anyway, convenience wins and I store the password salt and Initialisation Vector within the file along with the encrypted content. I store all three data-types the same way, so casual inspection won’t drop visible clues to where one starts and another stops.
In storing context, you might be tempted to do funky stuff like, oh, interleaving the salt and IV data, etc. Ultimately, whatever you do though is obfuscation that MUST be un-obfuscated for a successful decryption (skilled crackers know that you must unobfuscate again, and it helps to know ahead of time that the game is winnable). A common theme I’ve seen in my poring over software and security is ‘Don’t think you can out-obfuscate a determined unobfuscator’.
Because the code is open-source, it won’t make a lick of difference because the obfuscation code would be available, so simple is best given the lack of extra benefit. I just glue the salt and IV to the encrypted content and save it all to file.
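For the curious, the gluing can be as plain as the sketch below. The class name SaveFileLayout, the salt-then-IV-then-content ordering and the fixed 16 byte sizes are assumptions for illustration, not a description of my actual file format.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper class, not part of any library.
final class SaveFileLayout {
    static final int SALT_BYTES = 16; // assumed fixed salt size
    static final int IV_BYTES = 16;   // one AES block

    // Glues salt, IV and encrypted content into one array: salt | IV | content.
    static byte[] pack(byte[] salt, byte[] iv, byte[] encrypted) {
        return ByteBuffer.allocate(salt.length + iv.length + encrypted.length)
                .put(salt)
                .put(iv)
                .put(encrypted)
                .array();
    }

    // The fixed sizes are what let decryption find each piece again.
    static byte[] salt(byte[] packed) {
        return Arrays.copyOfRange(packed, 0, SALT_BYTES);
    }

    static byte[] iv(byte[] packed) {
        return Arrays.copyOfRange(packed, SALT_BYTES, SALT_BYTES + IV_BYTES);
    }

    static byte[] encrypted(byte[] packed) {
        return Arrays.copyOfRange(packed, SALT_BYTES + IV_BYTES, packed.length);
    }
}
```

On decryption, you’d pull the salt back out, re-derive the key from the password, and then feed the recovered IV and content to the cipher.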
The weak-point that matters is the password itself. By choosing not to deal with its storage within the code, and using a salted hash to derive a key, I’m happy that a strong, memorised password will lead to a good enough encryption experience for my password-locked encrypted file.
Now, if you’re looking to do something different, I doubt my reasoning here would still fully apply. At this stage, I suggest you pore over content from OWASP and Information Security Stack Exchange that seems relevant to the kind of application you’re considering building.
And there we have it. All the nuanced AES learning I picked up whilst getting a working implementation going. If you’re about to embark on your own AES journey, I hope this grants you fairer sailing.
Finally, if you’re interested in the end-result, here’s my contribution of an example implementing AES in Java with the BouncyCastle library.