In this blog post we introduce a new kind of attack against embedded systems relying on XiP (eXecute in Place) that exploits TOCTOU at hardware level in order to bypass secure boot.
Time-of-Check to Time-of-Use (TOCTOU) vulnerabilities pose a significant threat to embedded systems, particularly when dealing with external flash memory. Flash memory serves as a non-volatile storage medium for firmware, allowing for the storage of code and data even when the system is powered off. However, the temporal gap inherent in TOCTOU vulnerabilities introduces a risk that malicious actors can exploit.
In the realm of flash memory where the firmware is usually stored in an embedded system, TOCTOU vulnerabilities often involve the checking of a condition, such as firmware integrity, at one point in time and subsequently using that result later during execution. This time gap creates an opportunity for attackers to manipulate the system, leading to unauthorized code execution or data tampering.
For instance, an attacker might clone the external flash memory, modify its contents, and revert to the original flash during the execution of a critical process, such as secure boot. If the system checks the integrity of the firmware only at the start of the boot process but fails to revalidate during execution, it creates a window for the attacker to introduce malicious code.
This concern is not limited to any specific platform, and it underscores the broader challenge in securing embedded systems where external non-volatile storage plays a pivotal role in the overall functionality and security of the device.
eXecute in Place (XiP), a prevalent approach involving direct code execution from flash memory, magnifies the implications of TOCTOU vulnerabilities. This is evident in the temporal gap between the verification of conditions, such as firmware integrity, and the actual execution of code during XiP.
Consider a scenario where the system conducts an initial integrity check during boot but continues to execute code directly from flash thereafter. In this context, an attacker exploiting a TOCTOU vulnerability could subtly manipulate flash content after the initial check, leading to unauthorized code execution at runtime.
However, the current technique of emulating flash memory to explore and address TOCTOU vulnerabilities introduces a layer of complexity. The intricacies of replicating the temporal behavior, statefulness, and protocol-based features of flash memory in an emulation setting contribute to the challenges researchers face. This complexity underscores the need for advanced tooling to accurately emulate flash memory and assess the security robustness of systems employing XiP strategies against evolving threats in the embedded space.
To combat the challenges posed by Time-of-Check to Time-of-Use (TOCTOU) vulnerabilities in embedded systems. This approach utilizes two flash memory modules making use of the embedded Chip Select (CS) line. In contrast to complex flash emulation, this method enables a straightforward yet robust dynamic switch between normal and potentially altered firmware at runtime.
Here’s how it works: initially, the system operates using one flash memory module, validated during boot. However, the clever use of the CS line facilitates a smooth transition to the second flash memory module at runtime. This exploits the temporal gap between integrity checks and execution, allowing for a dynamic switch without the need for intricate emulation.
The vulnerability not only affects the Chips which relies on XiP but also on filesystems which loads the pages on request. For example a smart device which loads the firmware from a NAND flash to verify it and loads the linux/rtos filesystem dynamically. This can theoretically be manipulated by using this technique as NAND flash supports Chip Select. Maybe it is possible with DRAM as well?? I will leave it upto the reader to experiment
The ESP32 employs Execute-in-Place (XiP), a key strategy that enhances both performance and security in microcontroller applications. XiP allows the ESP32 to execute code directly from external flash memory, eliminating the need to copy it into RAM. This not only conserves RAM space but also facilitates the handling of larger firmware sizes, a critical consideration in resource-constrained embedded systems.
In the architecture of ESP32 memory, the careful allocation and utilization of IRAM, DROM, and IROM plays a pivotal role: Instruction RAM (IRAM) resides in the Internal SRAM0 region, dedicated to storing portions of the application requiring rapid execution.
On the other hand, Data Stored in Flash (DROM) is a repository for constant, read-only data that the linker places into a region mapped to the MMU flash cache. This serves as a reference point, distinct from the executable code residing in IROM.
Code executed from flash (IROM) becomes the default residence for functions not explicitly placed into IRAM. Leveraging the Flash MMU mechanism, IROM allows direct code execution from flash, enhancing operational efficiency. This optimizes execution speed and manages memory resources within the ESP32 framework.
The table below shows the memory mapping of the external flash for Data and Instruction.
For the sake of PoC esp-idf/examples/protocols/http_server/simple demo application is used. It is a simple web server which connects to the Wi-Fi and prints a text “Hello world” in the webpage.
Initially, the system was configured with a validated firmware in flash memory 1 with secure boot functionality. This ensured a secure and validated boot process, confirming the integrity of the firmware during system initialization.
The flash memory 2 is the same copy of the flash memory 1 but with the text being rendered being modified to “Hacko World” instead of “Hello World” which breaks the secure boot chain as the integrity is invalidated. While secure boot performed admirably by rejecting the boot attempt from Flash memory 2.
Now the ESP32 is let to boot using flash memory 1 and allowed to connect to the Wi-Fi and then a transition from flash memory 1 to flash memory 2 using Chip Select line was enacted, exposing a notable susceptibility. The ESP32 executed content from the modified flash memory 2, thereby compromising the integrity of the secure boot process. This occurrence underscores the inherent risks associated with Time-of-Check to Time-of-Use (TOCTOU) vulnerabilities, wherein the temporal gap between initial validation and subsequent execution creates an exploitable window.
Demo Video
In the video below, we demonstrate the attack by replacing the text returned by the HTTP server running on this ESP32 target.
The vulnerability exists on both Secure Boot V1 and V2. This vulnerability was reported to Espressif, and they acknowledged the vulnerability. Their proposed mitigation was to use flash encryption, which fixes this issue to some extent, since flash encryption used in ESP32 is AES XTS and each page have their own encryption key. You can’t patch the binary without breaking the page boundary but our opinions are that it doesn’t mitigate it fully if the malicious flash could randomly execute code based on a false cipher text block inside a specific page which is meant to be a security constrained
This is also applicable to all SoC/MCU vendors that support eXecute-in-place vulnerability and doesn’t support XiP encryption or user doesn’t enable them by default.
Research supported by Certainity.
At ONEKEY, we offer comprehensive hardware penetration testing services that go beyond typical checklists and employ innovative, research-based approaches to find vulnerabilities. Our team of experts is dedicated to uncovering vulnerabilities in your hardware systems and providing you with actionable recommendations to improve your security posture. We utilize the latest tools and techniques to identify weaknesses in your hardware, and our team has years of experience in the field.
Don’t wait until it’s too late. Contact ONEKEY today to schedule a hardware penetration test and ensure the security of your organization’s hardware systems.