Detecting Bad OpenSSL Usage

by William Wang, UCLA

OpenSSL is one of the most popular cryptographic libraries out there; even if you aren’t using C/C++, chances are your programming language’s biggest libraries use OpenSSL bindings as well. It’s also notoriously easy to mess up due to the design of its low-level API. Yet many of these mistakes fall into easily identifiable patterns, which raises the possibility of automated detection.

As part of my internship this past winter and spring, I’ve been prototyping a tool called Anselm, which lets developers describe and search for patterns of bad behavior. Anselm is an LLVM pass, meaning that it operates on an intermediate representation of code between source and compilation. Anselm’s primary advantage over static analysis is that it can operate on any programming language that compiles to LLVM bitcode, or any closed-source machine code that can be converted backwards. Anselm can target any arbitrary sequence of function calls, but its original purpose was to inspect OpenSSL usage so let’s start there.

OpenSSL

The design of OpenSSL makes it difficult to understand and work with for beginners. It has a variety of inconsistent naming conventions across its library and offers several, arguably too many, options and modes for each primitive. For example, due to the evolution of the library there exist both high level (EVP) and low level methods that can be used to accomplish the same task (e.g. DSA signatures or EC signing operations). To make this worse, their documentation can be inconsistent and difficult to read.

In addition to being difficult to use, other design choices make the library dangerous to use. The API inconsistently returns error codes, pointers (with and without ownership), and demonstrates other surprising behavior. Without rigorously checking error codes or defending against null pointers, unexpected program behavior and process termination can occur.

So what types of errors can Anselm detect? It depends on what the developer specifies, but that could be anything from mismanaging the OpenSSL error queue to reusing initialization vectors. It’s also important to remember that these are heuristics, and misidentifying both good and bad behavior is always possible. Now, let’s get into how the tool works.

Function Calls

While the primary motivation of this project was to target OpenSSL, the library itself doesn’t actually matter. One can view OpenSSL usage as a sequence of API calls, such as EVP_EncryptUpdate and EVP_EncryptFinal_ex. But we could easily replace those names with anything else, and the idea remains the same. Hence bad behavior is a pattern of any function calls (not just OpenSSL’s) which we would like to detect.

My main approach was to search through all possible paths of execution in a function, looking for bad sequences of API calls. Throughout this post, I’ll be using OpenSSL’s symmetric encryption functions in my examples. Let’s consider EVP_EncryptUpdate, which encrypts a block of data, and EVP_EncryptFinal_ex, which pads the plaintext before a final encryption. Naturally, they should not be called out of order:

EVP_EncryptFinal_ex(ctx, ciphertext + len, &len);...EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, plaintext_len);

This should also be flagged, since the bad sequence remains a possibility:

EVP_EncryptFinal_ex(ctx, ciphertext + len, &len);...if (condition) {  EVP_EncryptUpdate(ctx, ciphertext, &len, plaintext, plaintext_len);}

I worked with LLVM BasicBlocks, which represent a list of instructions that always execute together. BasicBlocks can have multiple successors, each reflecting different paths of execution. A function, then, is a directed graph of many BasicBlocks. There is a single root node, and any leaf node represents an end of execution.

Finding all possible executions amounts to performing a depth-first search (DFS) starting from the root node. However, notice that the graph can contain cycles; this is analogous to a loop in code. If we performed a blind DFS, we could get stuck in an infinite loop. On the other hand, ignoring previously visited nodes can lead to missed behavior. I settled this by limiting the length of any path, after which Anselm stops exploring further (this can be customized).

One issue remains, which is that performing DFS over an entire codebase can be very time-consuming. Even if our exact pattern is simple, it still needs to be matched against all possible paths generated by the search. To solve this, I first prune the graph of any BasicBlock that does not contain any relevant API calls. This is done by pointing any irrelevant node’s predecessors to each of its successors, removing the middleman.

In practice, this dramatically reduces the complexity of a graph for faster path-finding: entire if statements and while loops can be eliminated without any consequences! It also makes any path limit much more reasonable.

Matching Values

Although solely examining function calls is a good start, we can do better. Consider OpenSSL contexts, which are created by EVP_CIPHER_CTX_new and must be initialized with algorithm, key, etc. before actual use. In the following situation, we want every context to be initialized by EVP_EncryptInit_ex:

EVP_CIPHER_CTX *ctx1 = EVP_CIPHER_CTX_new();EVP_CIPHER_CTX *ctx2 = EVP_CIPHER_CTX_new();EVP_EncryptInit_ex(ctx1, EVP_aes_256_cbc(), NULL, key, iv);

EVP_EncryptInit_ex always follows EVP_CIPHER_CTX_new, yet ctx2 is obviously not initialized properly. A more precise pattern would be, “Every context returned from EVP_CIPHER_CTX_new should later be initialized in EVP_CIPHER_CTX_new.”

I addressed this by matching arguments and return values — checking whether they pointed to the same LLVM Value object in memory. Contexts are a prime situation to match values, but we can use the same technique to detect repeated IVs:

EVP_EncryptInit_ex(ctx1, EVP_aes_256_cbc(), NULL, key1, iv);EVP_EncryptInit_ex(ctx2, EVP_aes_256_cbc(), NULL, key2, iv);

Internally, Anselm uses regex capture groups to perform this analysis; every execution path is a string of function calls and Value pointers, while a bad behavior is defined by some regex pattern.

Pattern Format

Towards the end of my internship, I also defined a format for developers to specify bad behaviors, which Anselm translates into a (somewhat messy) regex pattern. Every line begins with a function call, followed by its return value and arguments. If you don’t care about a value, use an underscore. Otherwise, define a token which you can use elsewhere. Hence a rule banning repeat IVs would look like this:

EVP_EncryptInit_ex _ _ _ _ _ ivEVP_EncryptInit_ex _ _ _ _ _ iv

Since the iv token is reused, Anselm constrains its search to only match functions which contain the same Value pointer at that argument position.

I also defined a syntax to perform negative lookaheads, which tells Anselm to look for the absence of specific function calls. For example, if I wanted to prevent any context from being used before initialized, I would prepend an exclamation mark like so:

EVP_CIPHER_CTX_new ctx! EVP_EncryptInit_ex _ ctx _ _ _ _EVP_EncryptUpdate _ ctx _ _ _ _

In English, this pattern identifies any calls to EVP_CIPHER_CTX_new and EVP_EncryptUpdate that do not have EVP_EncryptInit_ex sandwiched in between.

Final Notes

With its current set of tools, Anselm is capable of interpreting a wide range of function call patterns and searching for them in LLVM bitcode. Of course, it’s still a prototype and there are improvements to be made, but the main ideas are there and I’m proud of how the project turned out. Thanks to Trail of Bits for supporting these types of internships — it was a lot of fun!


*** This is a Security Bloggers Network syndicated blog from Trail of Bits Blog authored by Trail of Bits. Read the original post at: https://blog.trailofbits.com/2020/05/29/detecting-bad-openssl-usage/