Blue Team Rust: What is "Memory Safety", Really?

07-29-2020 | A brief technical primer on Rust's key security feature, with embedded-specific context.

08-02-2020 update: Reddit discussion here, Hacker News here. Appreciate all the community feedback!

Rust Memory Safety

Tools shape both their user and their result. Paradigms of C and C++ have molded generations of systems programmers, the ubiquity and staying power of both languages is a testament to their utility. But the resultant software has suffered decades of memory corruption CVEs.

Rust, as a compiled language without garbage collection, supports what have traditionally been C/C++ domains. This includes everything from high-performance distributed systems to microcontroller firmware. Rust offers an alternative set of paradigms, namely ownership and lifetimes. If you've never tried Rust, imagine pair programming alongside a nearly-omniscient but narrowly-focused perfectionist. That's what the borrow checker, a compiler component implementing the ownership concept, can sometimes feel like. In exchange for the associated learning curve, we get memory safety guarantees.

Like many in the security community, I've been drawn to Rust by the glittering promise of a safe alternative. But what does "safe" actually mean on a technical level? Is the draw one of moths to flame, or does Rust fundamentally change the game?

This post is my attempt to answer these questions, based on what I've learned so far. Memory safety is a topic knee-deep in operating system and computer architecture concepts, so I have to assume formidable prior systems security knowledge to keep this post short-ish. Whether you're already using Rust or are just flirting with the idea, hope you find it useful!

What exactly are the "memory safety guarantees"? In terms of exploitability?

Let's start with the good news. Rust largely prevents a major vector for information leakage and malicious code execution:

Stack protection: Classic stack-smashing is now an exception and not memory corruption; attempts to write past the end of a buffer will trigger a panic instead of leading to buffer overflow. Regardless of panic handling logic (e.g. panic_reset), your application could still be subject to Denial of Service (DoS) attacks. This is why fuzzing Rust is still worthwhile. But, as the panic prevents attacker-controlled stack corruption, you won't fall victim to Arbitrary or Remote Code Execution (ACE and RCE, respectively). Attempts to read past the end of a buffer are similarly stopped, so no Heartbleed-style bugs. Enforcement is dynamic: the compiler inserts runtime bounds checks where necessary, incurring small performance overhead. Bounds checks are more effective than the stack cookies a C compiler might insert because they still apply when indexing linear data structures, an operation that's easier to get right with Rust's iterator APIs.
Heap protection: Bounds checks and panic behavior still apply to heap-allocated objects. In addition, the ownership paradigm eliminates dangling pointers, preventing Use-After-Free (UAF) and Double-Free (DF) vulnerabilities: heap metadata is never corrupted. Memory leaks (meaning never freeing allocations, not over-reading data) are still possible if a programmer creates cyclical references. Compile-time static analysis does the enforcement, soundly reasoning about abstract states representing all possible dynamic executions. There is no runtime cost. Effectiveness is maximal: the program simply can't enter a bad state.
References are always valid and variables are initialized before use: safe Rust doesn't allow manipulation of raw pointers, ensuring that pointer dereferences are valid. This means no NULL dereferences for DoS and no pointer manipulation for control flow hijack or arbitrary read/write. The Option type facilitates error handling when NULL is a concept the programmer wishes to logically express. These are compile-time guarantees, courtesy of ownership and lifetimes. A similar compile-time guarantee ensures variables can't be read until they've been initialized. Use of uninitialized variables is a warning in most modern C compilers; that aspect isn't novel. But ensuring valid dereferences certainly is.
Data races are completely eliminated: Rust's ownership system ensures that any given variable can only have one writer (e.g. a mutable reference) at any given program point, but an unlimited number of readers (e.g. immutable references). In addition to enabling memory safety, this scheme solves the classic readers-writers concurrency problem. Thus Rust eliminates data races, sometimes without the need for synchronization primitives or reference counting - but not race conditions in general. Data race prevention reduces opportunities for concurrency attacks.

All good things in life come with a caveat. Let's look at the fine print:

Not all Rust code is memory safe: Satisfying the compiler's analyses is part of what makes implementing certain data structures, like doubly-linked lists, challenging in Rust. Moreover, certain low-level operations, like Memory Mapped I/O (MMIO), are difficult to fully analyze for safety. Blocks of code marked as unsafe are manually-designated "blindspots" for the analyses, relaxing safety-specific checks since the programmer vouches for their correctness. This includes parts of Rust's standard library, for which CVE numbers have been assigned, and, by extension, any external libraries called via C Foreign Function Interface (CFFI). Furthermore, researchers have found that ownership's automatic destruction can create new (meaning unique to Rust) UAF and DF patterns in unsafe code. Hardened allocators, which check heap consistency invariants dynamically, aren't entirely obsolete. Memory safety guarantees apply broadly, not universally.
unsafe drops memory safety guarantees for a limited scope and doesn't eliminate all checks: unsafe isn't a free-for-all. Type, lifetime, and reference checks are still active; high-risk operations have explicit APIs (e.g. get_unchecked). The CFFI boundary may be a weak link. While memory corruption is possible with unsafe, the possibility is constrained to small portions of your codebase - around 1% of a typical Rust library by one estimate. From a security audit perspective, that's a colossal reduction in attack surface for a major bug class. Think of unsafe as a small Trusted Computing Base (TCB) in a larger system.
Interior mutability can push borrow checks to runtime: the interior mutability pattern allows multiple mutable aliases to a single memory location so long as they're not in use simultaneously. It's a sidestep of the borrow checker, a fallback when the problem can't be reframed in an idiomatic way for powerful compile-time guarantees. Safe wrappers for unsafe APIs (e.g. Rc<RefCell<T>>, Arc<Mutex<T>>) verify exclusivity at runtime, incurring a performance penalty and introducing the potential to panic. I couldn't find metrics on how widely used this pattern is or isn't, but would again recommend fuzzing for probabilistic panic detection.

To be honest, decades of hardware, OS, and compiler-level defenses have hardened C and C++ deployments. Memory corruption 0-days aren't exactly low-hanging fruit. Yet Rust still feels like a significant step forward and a noteworthy improvement to the security posture of performance-critical software. Even though the unsafe escape hatch must exist, memory corruption - a large and vicious bug class - is largely eliminated.

So is Rust the new messiah, sent to save us from the hell of remote shell? Definitely not. Rust won't stop command injection (e.g. part of an input string ending up as an argument to execve). Or misconfiguration (e.g. fallback to an insecure cipher). Or logic bugs (e.g. forgetting to verify user permissions). No general purpose programming language will make your code inherently secure or formally correct. But at least you don't have to worry about these kinds of mistakes and maintaining complex, invisible memory invariants throughout your Rust codebase.

OK, what about embedded systems? Aren't those super vulnerable?

Let's assume "embedded" means no OS abstractions; the software stack is a single, monolithic binary (e.g. AVR or Cortex-M firmware) or part of the OS itself (e.g. kernel or bootloader). Rust's #![no_std] attribute facilitates developing for embedded platforms. #![no_std] Rust libraries typically forsake dynamic collections (like Vec and HashMap) for portability to baremetal environments (no memory allocator, no heap). The borrow checker barrier is minimal without dynamic memory, so prototyping ease remains roughly equivalent to embedded C - albeit with fewer supported architectures.

The resource-constrained and/or real-time embedded systems #![no_std] targets often lack modern mitigations like a Memory Protection Unit (MPU), No eXecute (NX), or Address Space Layout Randomization (ASLR). We're talking about a lawless land where memory is flat and no one can hear you segfault. But Rust still gives us that sweet, sweet bound check insurance when running baremetal without an allocator. That's noteworthy because it might be the first and last line of defense in an embedded scenario. Just remember that low-level interaction with hardware will probably require some amount of unsafe code, in which memory access without bounds check is opt-in.

For x86/x64, stack probes are also inserted by the Rust compiler to detect stack overflow. At present, this feature doesn't apply to #![no_std] or other architectures - although creative linking solutions have been suggested. Stack probes, often implemented via guard pages, prevent exhausting stack space due to unending recursion. Bounds checks, on the other hand, prevent stack or heap-based buffer overflow bugs. It's a subtle distinction, but an important one: for security, we typically care far more about the latter.

Keep in mind that memory, from the perspective of Rust, is a software abstraction. When the abstraction ends, so do the guarantees. If physical attacks (side-channel attacks, fault injection, chip decapsulation, etc.) are part of your threat model, there is little reason to believe that language choice offers any protection. If you forgot to burn in the appropriate lock bits, shipped with a debug port exposed, and a symmetric key for firmware decryption/authentication is sitting in EEPROM: an attacker in the field won't need a memory corruption bug.

That's all cool, but how about day-to-day development?

Dependency management isn't as glamorous as exploits with catchy marketing names or new-age compiler analyses to prevent them. But if you've ever been responsible for production infrastructure, you know that patch latency is often the one metric that counts. Sometimes it's your code that is compromised, but more often it's a library you rely on that puts your systems at risk. This is an area where Rust's package manager, cargo, is invaluable.

cargo enables composability: your project can integrate 3rd party libraries as statically-linked dependencies, downloading their source from a centralized repository on first build. It makes dependency maintenance easier - including pulling the latest patches, security or otherwise, into your build. No analogue in the C or C++ ecosystems provides cargo's semantic versioning, but managing a set of git submodules could have a similar effect.

Unlike C/C++ submodule duck tape, the aforementioned composability is memory safe in Rust. C/C++ libraries pass struct pointers around with no enforced contract for who does the cleanup: your code might free an object the library already freed - a reasonable mistake - creating a new DF bug. Rust's ownership model provides a contract, simplifying interoperability across APIs.

Finally, cargo provides first-class test support, an omission modern C and C++ are often criticized for. Rust's toolchain makes the engineering part of software engineering easier: testing and maintenance is straightforward. In the real world, that can be as important for overall security posture as memory safety.

Hold on...didn't we forget about integer overflows?

Not exactly. Integer overflow isn't a memory safety issue categorically, it'd almost certainly have to be part of a larger memory corruption bug chain to facilitate ACE. Say the integer in question was used to index into an array prior to write of attacker-controlled data - safe Rust would still prevent that write.

Regardless, integer overflows can lead to nasty bugs. cargo uses configurable build profiles to control compilation settings, integer overflow handling among them. The default debug (low optimization) profile includes overflow-checks = true, so the binary output will panic on integer overflow where the developer hasn’t made it explicit (e.g. u32::wrapping_add). Unless overwritten, release (high optimization) mode does the opposite: silent wrap-around is allowed, like C/C++, because removing the check is better for performance. Unlike C/C++, integer overflow is not undefined behavior in Rust; you can reliably expect two's complement wrap.

If performance is priority number one, your test cases should strive for enough coverage of debug builds to catch the majority of integer overflows. If security is priority number one, consider enabling overflow checks in release and taking the availability hit of a potential panic.

Takeaway

Memory safety is not a new idea, garbage collection and smart pointers have been around for a while. But sometimes it's the right implementation of an existing good idea that makes for a novel great idea. Rust's ownership paradigm - which implements an affine type system - is that great idea, enabling safety without sacrificing predictable performance.

Now I [begrudgingly] aim to be pragmatic, not dogmatic. There are perfectly valid reasons to stick with a mature vendored HAL and C toolchain for a production embedded project. Many existing C/C++ code bases should be fuzzed, hardened, and maintained - not re-written in Rust. Some library bindings, those of the z3 solver being one example, greatly benefit from the dynamic typing of an interpreted language. In certain domains, languages with Hoare logic pre and post conditions might justify the productivity hit (e.g. Spark Ada). Physical attacks are typically language agnostic. In summary: no tool is a panacea.

That disclaimer aside, I can't remember the last time a new technology made me stop and take notice quite like Rust has. The language crystallizes systems programming best practices in the compiler itself, trading development-time cognitive load for runtime correctness. Explicit opt-in (e.g. unsafe, RefCell<T>) is required for patterns that put memory at risk. Mitigation of a major bug class feels like a legitimate shift left: a notable subset of exploitable vulnerabilities become compile time errors and runtime exceptions. Ferris has a hard shell.

Read a free technical book! I'm fulfilling a lifelong dream and writing a book. It's about developing secure and robust systems software. Although a work-in-progress, the book is freely available online (no paywalls or obligations): https://highassurance.rs/

Tiemoko Ballo