Is the programming language you use to write software a matter of national security? The US White House Office of the National Cyber Director (ONCD) thinks so. On February 26, it issued a report urging programmers to move to memory-safe programming languages for all code. For those legacy codebases that can't easily be ported, it suggests enforcing memory-safe practices.
Some of the biggest exploits of the internet era have come from unsafe memory practices in languages that permit them, like C and C++. In this article, we'll take a look at what makes memory unsafe, the consequences, and the languages and practices that mitigate those risks.
Unsafe memory
Every variable you create and assign data to gets stored in memory. How that memory is allocated is a low-level function of programming languages. Some languages automatically allocate memory when a variable is created or assigned and deallocate it when it's no longer needed, usually through some sort of garbage collection mechanism. Other languages, like C and C++, require you to manually allocate memory for variables and deallocate it when they're no longer needed. This offers a lot of freedom and possibility, but it also opens up some pretty gnarly issues and exploits.
The issues fall into two categories: spatial and temporal. Spatial memory errors happen when you try to read or write data outside of what's been allocated. If you allocate an array of 20 items, what happens when you try to read the 21st? Memory-safe languages will throw an error, but unsafe ones may let you read the adjacent memory, regardless of what's stored there. These are out-of-bounds errors.
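In Rust, for example, that 21st-item read is caught instead of silently returning adjacent memory. A minimal sketch (array contents here are just illustrative):

```rust
fn main() {
    let items = [0u8; 20]; // an array of 20 items

    // Indexing past the end (`items[20]`) would panic at runtime rather
    // than read adjacent memory. The non-panicking `get` makes the bounds
    // check explicit by returning an Option instead of a raw value.
    assert_eq!(items.get(19), Some(&0)); // the 20th (last valid) item
    assert_eq!(items.get(20), None);     // the "21st item": no access at all
    println!("out-of-bounds read prevented");
}
```

Either way, the program never reads memory it doesn't own: the choice is between a hard stop (a panic) and an explicit `None` the code must handle.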
Temporal errors happen when a program tries to access memory after it has been deallocated. Pointers in particular cause these problems: a pointer stores the memory address of a value, so if that value is deallocated and the memory freed for other uses, the pointer still holds the old address. Dereferencing it reads whatever new data now lives at that address, a flaw known as a use-after-free bug.
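Rust rejects most dangling references at compile time, and where a pointer-like handle to shared data is needed, the standard library makes the "is it still alive?" check explicit. A small sketch using `Rc` and `Weak`:

```rust
use std::rc::{Rc, Weak};

fn main() {
    let data = Rc::new(String::from("secret"));
    let handle: Weak<String> = Rc::downgrade(&data);

    // While an owner is alive, upgrading the weak handle succeeds.
    assert!(handle.upgrade().is_some());

    drop(data); // the last owner goes away; the allocation is freed here

    // After the free, the handle cannot reach the old memory:
    // upgrade() returns None instead of acting as a dangling pointer.
    assert!(handle.upgrade().is_none());
    println!("no use-after-free possible");
}
```

The equivalent C pattern, a raw pointer held past `free()`, would happily hand back whatever now occupies that address.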
As we mentioned above, some of the biggest vulnerability events of the past were memory-safety issues. The Heartbleed bug, disclosed in OpenSSL in 2014, allowed malicious actors to read the memory of servers running SSL and steal tons of secret info, including X.509 certificates, usernames and passwords, instant messages, and emails. More recently, the 2023 BLASTPASS vulnerability allowed black hats to compromise iPhones using a malicious image. Unlike a lot of security holes, neither required any action from the victim to enable the compromise.
Both of these were out-of-bounds errors. Heartbleed stemmed from improper input validation: the length stated in a heartbeat request was never checked against the payload actually sent, so the server read past the end of the buffer and returned whatever memory lay beyond it. BLASTPASS used a malicious image to overflow a buffer and execute arbitrary code. While sanitizing inputs would have helped, a memory-safe language wouldn't have allowed reads or writes past the buffer's bounds in the first place.
Memory-safe in practice
The ONCD report advises programmers to migrate to memory-safe languages, but only mentions Rust by name. An earlier NSA cybersecurity report lists C#, Go, Java®, Ruby™, and Swift® as well. While these languages can't absolutely guarantee safety (Rust, for one, will let you explicitly mark code `unsafe`), they do have features that make memory-safe code much easier to write.
Rust has what's called a borrow checker, which ensures that no reference outlives the data it refers to. When you create a variable and assign it a value, the necessary memory is allocated and the variable owns that data; if you assign it to another variable, ownership moves, and the original variable can no longer be used. When the owning variable falls out of scope, say at the end of the function in which it was created, the memory is freed automatically, so there's no chance of a use-after-free bug.
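The ownership rules above can be seen in a few lines (variable names are just illustrative):

```rust
fn main() {
    let s = String::from("owned data"); // `s` owns this heap allocation
    let t = s;                          // ownership moves to `t`

    // Using `s` now would be a compile error: the value has moved, so
    // there is never more than one owner of the same memory.
    // println!("{}", s); // error[E0382]: borrow of moved value: `s`

    println!("{}", t);
} // `t` falls out of scope here; the memory is freed, no `free()` call needed
```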
To prevent this from being too restrictive, Rust also allows for borrowing in other frames by using references (similar to pointers in C and C++). The difference is that the compiler guarantees a reference can't outlive the variable it borrows from. It takes a little bit of time to get the hang of Rust (and to stop fighting the borrow checker), but it's grown into a language that developers love: it topped our survey as the most loved language every year since 2016 and was the most admired this year.
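Borrowing in practice looks like this (the function here is a made-up example):

```rust
// `len_of` borrows the string through a reference,
// so the caller keeps ownership of the data.
fn len_of(text: &str) -> usize {
    text.len()
}

fn main() {
    let message = String::from("hello");
    let n = len_of(&message); // lend a reference instead of moving ownership
    println!("{} is {} bytes", message, n); // `message` is still usable here
}
```

Unlike a C pointer, the reference can't be stashed somewhere that outlives `message`; the borrow checker would reject that at compile time.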
But suppose you can't rewrite your entire C or C++ codebase. There are ways to be memory-safe there, too. The snarky version is `#define malloc(x) NULL`. The helpful version, as posted by user15919568 on Stack Overflow, includes the following:
- Always NULL out pointers when freeing memory to avoid use-after-free and double-free bugs.
- Always perform bound checks to avoid OOB (out-of-bounds) read and OOB write vulnerabilities.
- Try not to use recursion, or use it only when you know its limits, to prevent stack-exhaustion and heap-exhaustion vulnerabilities.
- If you suspect a pointer could be NULL at any time, always check it before using it to avoid NULL pointer dereference vulnerabilities.
- Use multi-thread hardening mechanisms to avoid race conditions leading to memory-safety bugs.
- Always initialize pointers and variables, especially if they may be used or accessed without a prior value assignment.
- Always ensure a string is properly NULL-terminated to avoid out-of-bounds reads and other memory-safety issues.
- Make sure copying functions, especially those using loops, are properly designed not to overrun by one byte into an adjacent buffer or variable (an off-by-one vulnerability).
- Carefully select types and casts to avoid problems like integer overflows.
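For comparison, memory-safe languages often build that last check into the integer API itself. In Rust, overflow panics in debug builds rather than silently wrapping, and checked arithmetic surfaces the overflow as a value the code must handle (the lengths here are illustrative):

```rust
fn main() {
    let len: u8 = 200;
    let extra: u8 = 100;

    // A plain `len + extra` overflows a u8 (panics in debug builds,
    // wraps in release). `checked_add` reports the overflow instead.
    match len.checked_add(extra) {
        Some(total) => println!("total length: {}", total),
        None => println!("overflow detected, rejecting input"),
    }
}
```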
Even when coding in memory-safe languages and writing safe code, the ONCD recommends using formal methods to check for potential vulnerabilities. These include static analysis, which examines how a program would behave without executing it, and assertion-based testing, which uses boolean assertions that hold true unless there is a bug in the program. By including these checks in build and deploy pipelines, developers can minimize the chances of anything unsafe making it into production.
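Assertion-based testing can be as simple as encoding an invariant directly in the code. A small sketch in Rust, using a hypothetical bounded-copy helper:

```rust
/// Hypothetical helper: copy at most `dst.len()` bytes from `src`,
/// returning the number of bytes actually copied.
fn bounded_copy(dst: &mut [u8], src: &[u8]) -> usize {
    let n = src.len().min(dst.len());
    dst[..n].copy_from_slice(&src[..n]);
    n
}

fn main() {
    let mut buf = [0u8; 4];
    let copied = bounded_copy(&mut buf, b"heartbeat");

    // Assertion-based testing: this boolean invariant holds unless there
    // is a bug. A broken bounds check fails here, during testing, rather
    // than becoming an out-of-bounds write in production.
    assert!(copied <= buf.len());
    assert_eq!(&buf, b"hear"); // only the first 4 bytes were copied
    println!("copied {} bytes", copied);
}
```

Running assertions like these in a CI pipeline turns a silent memory bug into a loud, early test failure.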
Protecting data in software
It's no surprise that, after so many high-profile exploits, governments now treat the fundamentals of programming as a national security issue. C and C++ have been the workhorses of foundational software that runs on anything, even bare metal. While their manual memory management allows for some high-performance tricks, it also opens the door to misuse and new exploits.
That said, computer science and software development have developed a culture of knowledge-sharing, and progress moves ever forward. As new languages develop and new concepts are implemented, will there be a new set of recommendations issued in a decade about some exploit we haven’t even seen yet?