How AWS SideTrail verifies key AWS cryptography code

We know you want to spend your time learning valuable new skills, building innovative software, and scaling up applications — not worrying about managing infrastructure. That’s why we’re always looking for ways to help you automate the management of AWS services, particularly when it comes to cloud security. With that in mind, we recently developed SideTrail, an open source, program analysis tool that helps AWS developers verify key security properties of cryptographic implementations. In other words, the tool gives you assurances that secret information is kept secret. For example, developers at AWS have used this tool to verify that an AWS open source library, s2n, is protected against cryptographic side-channels. Let’s dive a little deeper into what this tool is and how it works.

SideTrail’s secret sauce is its automated reasoning capability, a mathematical proof-based technology, that verifies the correctness of critical code and helps detect errors that could lead to data leaks. Automated reasoning mathematically checks every possible input and reports any input that could cause an error. Full detail about SideTrail can be found in a scientific publication from VSTTE, a prominent software conference.

Automated reasoning is also useful because it can help prevent timing side-channels, a process by which data may be inadvertently revealed. Automated reasoning is what powers SideTrail to be able to detect the presence of these issues. If a code change causes a timing side-channel to be introduced, the mathematical proof will fail, and you’ll be notified about the flaw before the code is released.

Timing side-channels potentially allows attackers to learn about sensitive information, such as encryption keys, by observing the timing of messages on a network. For example, let’s say you have some code that checks a password as follows:


for (i = 0; i < length; ++i) { if (password[i] != input[i]) { send("bad password"); }
}

The amount of time until the reply message is received depends on which byte in the password is incorrect. As a result, an attacker can start guessing what the first character of the password is:

  • a*******
  • b*******
  • c*******

When the time to receive the error message changes, the first letter in the password is revealed. Repeating this process for each of the remaining characters turns a complex, exponential guessing challenge into a comparatively simple, linear one. SideTrail identifies these timing gaps, alerting developers to potential data leaks that could impact customers.

Here’s how to get started with SideTrail:

  1. Download SideTrail from GitHub.
  2. Follow the instructions provided with the download to setup your environment.
  3. Annotate your code with a threat model. The threat model is used to determine:
    • Which values should be considered secret. For example, a cryptographic key would be considered a secret, whereas the IP address of a public server wouldn’t.
    • The minimum timing difference an attacker can observe.
  4. Compile your code using SideTrail and analyze the report (for full guidance on how to analyze your report, read the VSTTE paper).

How SideTrail proves code correctness

The prior example had a data leak detected by SideTrail. However, SideTrail can do more than just detect leaks caused timing side-channels: it can prove that correct code won’t result in such leaks. One technique that developers can use to prevent data loss is to ensure that all paths in a program take the same amount of time to run no matter what a secret value may be. SideTrail helps to prove that such code is correct. Consider the following code that implements this technique:


int example(int pub, int secret) { if (secret > pub) { pub -= secret; } else { pub += secret; } return pub;
}

Following the steps outlined in the prior section, a developer creates a threat model for this code by determining which values are public and which are secret. After annotating their code accordingly, they then compile it using SideTrail, which verifies that all paths in the code take the same amount of time. SideTrail verifies the code by leveraging the industry-standard Clang compiler to compile the code to LLVM bitcode. Working at the LLVM bitcode level allows SideTrail to work on the same optimized representation the compiler does. This is important because compiler optimizations can potentially affect the timing behavior of code, and hence the effectiveness of countermeasures used in that code. To maximize the effectiveness of SideTrail, developers can compile code for SideTrail using the same compiler configuration that’s used for released code.

The result of this process is a modified program that tracks the time required to execute each instruction. Here is how the result looks, inclusive of the annotations SideTrail includes (in red) to indicate the runtime of each operation. For more detail on the LLVM annotations, please read the VSTTE paper referenced earlier in this post.


int example(int pub, int secret, int *cost) { *cost += GT_OP_COST; if (secret > pub) { *cost += SUBTRACTION_OP_COST; pub -= secret; } else { *cost += ADDITION_OP_COST; pub += secret; } return pub;
} 

If you’re wondering how this works in practice, you can inspect the SideTrail model by examining the annotated LLVM bitcode. Guidance on inspecting the SideTrail model can be found in the VSTTE paper.

To check whether the above code has a timing side-channel, SideTrail makes two copies of the code, symbolically executes both copies, and then checks whether the runtime differs. SideTrail does this by generating a wrapper and then using an automated reasoning engine to mathematically verify the correctness of this test for all possible inputs. If automated reasoning engine does this successfully, you can have confidence that the code does not contain a timing side-channel.

Conclusion

By using SideTrail, developers will be able to prove that there are no timing side channels present in their code, providing increased assurance against data loss. For example, Amazon s2n uses code-balancing countermeasures to mitigate this kind of attack. Through the use of SideTrail, we’ve proven the correctness of these countermeasures by running regression tests on every check-in to s2n, allowing AWS developers to build on top of this library with even more confidence, while delivering product and technology updates to customers with the highest security standards available.

More information about SideTrail and our other research can be found on our Provable Security hub. SideTrail is also available for public use as an open-source project.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Daniel Schwartz-Narbonne

Daniel is a Software Development Engineer in the AWS Automated Reasoning Group. Prior to joining Amazon, he earned a PhD at Princeton, where he developed a software framework to debug parallel programs. Then, at New York University, he designed a tool that automatically isolates and explains the cause of crashes in C programs. When he’s not working, you might find Daniel in the kitchen making dinner for his family, in a tent camping, or volunteering as an EMT with a local ambulance squad.