Emulation Fuzzing versus Hardware-in-the-Loop Fuzzing

This blog post is based on data from Muench et al.’s NDSS 2018 paper “What You Corrupt Is Not What You Crash: Challenges in Fuzzing Embedded Devices”.

Embedded systems often operate with minimal hardware protection. In many cases, memory corruption vulnerabilities do not result in the clear-cut crashes that traditional desktop systems exhibit. This is primarily due to two factors:

Lack of an MMU: Many embedded devices do not have a Memory Management Unit (MMU), which on desktop systems helps trigger segmentation faults and other crash signals.

Absence of Sanitization: Desktop software can be built and tested with runtime sanitizers (such as AddressSanitizer) that turn memory errors into immediate, reported faults. Embedded firmware rarely includes such defensive measures.

As a consequence, many memory corruption–based vulnerabilities remain “silent” on embedded devices, making them difficult to detect with hardware-in-the-loop (HITL) fuzzing methods that rely solely on observing crashes.

Embedded Device Classifications

Embedded devices can be grouped into three types. The classification is useful because fault behavior differs between them:

Type-I: General Purpose OS-Based Devices: Devices that run a general-purpose operating system (for example, Embedded Linux) with minimal modifications.

Example: A router such as the Linksys EA6300v1.

Type-II: Embedded OS-Based Devices: Devices that use a specialized embedded operating system designed for low computational power. These systems may have a logical separation between kernel and application code but often lack advanced features like an MMU.

Example: An IP camera running uClinux.

Type-III: Devices Without an OS Abstraction: Devices that run monolithic firmware without a traditional operating system, meaning there is no MMU or built-in protection against memory corruptions.

Example: A development board such as the STM Nucleo-L152RE.

Are Memory Corruptions Detected?

The paper triggered various artificial memory corruption vulnerabilities on each device type and tabulated the observed behavior. The results highlight that while desktop systems generally produce immediate, observable crashes, many embedded devices (especially Type-II and Type-III) may not. Each test ended in one of the following outcomes. The marker after each name indicates how visible the outcome is to a monitor that only watches for crashes: visible (✓), visible only with additional effort (!), or not visible (✗).

Observable Crash (✓): The device stops execution, often with an error message. In some cases, the crash details are minimal or “opaque.”

Reboot (✓): The device immediately reboots. For Type-III (monolithic firmware) devices, there is no distinction between a crash and a reboot; for Type-I and Type-II, only specific services may crash while the rest of the system continues to operate.

Hang (!): The target becomes unresponsive—often stuck in an infinite loop—without an immediate crash.

Late Crash (!): The system continues running for a noticeable period before eventually crashing (for example, when the connection terminates).

Malfunctioning (✗): The process continues execution but produces incorrect results or wrong data.

No Effect (✗): Despite the underlying memory corruption, there are no observable side-effects, and the device appears to run normally.
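
The outcome categories above can be sketched as a small classifier. This is only an illustration: the `Observation` fields, the `LATE_THRESHOLD_S` cut-off, and the `classify` function are hypothetical names chosen for this example, not part of the paper or of any fuzzing product.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    OBSERVABLE_CRASH = "observable crash"
    REBOOT = "reboot"
    HANG = "hang"
    LATE_CRASH = "late crash"
    MALFUNCTION = "malfunctioning"
    NO_EFFECT = "no effect"


# Markers as used in the list above.
MARKER = {
    Outcome.OBSERVABLE_CRASH: "✓",
    Outcome.REBOOT: "✓",
    Outcome.HANG: "!",
    Outcome.LATE_CRASH: "!",
    Outcome.MALFUNCTION: "✗",
    Outcome.NO_EFFECT: "✗",
}


@dataclass
class Observation:
    """What a hypothetical test harness recorded for one fuzz input."""
    crashed: bool          # device or service stopped with an error
    rebooted: bool         # boot banner seen again after the input
    responsive: bool       # device still answers a probe
    crash_delay_s: float   # seconds between input and crash (0 if none)
    output_correct: bool   # response matched the expected one


LATE_THRESHOLD_S = 1.0  # assumed cut-off between "immediate" and "late"


def classify(obs: Observation) -> Outcome:
    """Map harness observations to one of the six outcome categories."""
    if obs.rebooted:
        return Outcome.REBOOT
    if obs.crashed:
        if obs.crash_delay_s > LATE_THRESHOLD_S:
            return Outcome.LATE_CRASH
        return Outcome.OBSERVABLE_CRASH
    if not obs.responsive:
        return Outcome.HANG
    if not obs.output_correct:
        return Outcome.MALFUNCTION
    return Outcome.NO_EFFECT


if __name__ == "__main__":
    silent = Observation(crashed=False, rebooted=False, responsive=True,
                         crash_delay_s=0.0, output_correct=True)
    outcome = classify(silent)
    print(outcome.value, MARKER[outcome])
```

Note that the last two branches need an oracle for correct output, which a harness that only watches for crashes does not have. That is why those outcomes carry the ✗ marker.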

What HITL Fuzzing Overlooks

HITL fuzzing provides valuable insights by testing on real hardware. However, in embedded environments that lack features like an MMU or comprehensive sanitization, many memory corruption scenarios may pass undetected:

Buffer Overflows on Microcontrollers: A buffer overflow might overwrite critical variables such as connection states without triggering an immediate system fault. HITL testing may miss these issues since the corrupted state does not necessarily lead to a crash.

Function Pointer Overwrites: In some cases, memory corruption alters function pointers. Without the protective measures of modern operating systems, the altered pointers may lead to unexpected behavior later in operation rather than an immediate segmentation fault.

Silent Communication Failures: Lightweight protocols often run on bare-metal systems. An out-of-bounds write might silently corrupt data used in subsequent communications, making the error difficult to reproduce and diagnose during HITL testing.
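
To make the first scenario concrete, here is a toy model of a flat address space with no MMU. It is a simulation in Python, not firmware. The memory layout (`BUF_ADDR`, `STATE_ADDR`), the state values, and the `handle_packet` function are all assumptions made up for the illustration.

```python
import struct

# Toy model of MMU-less RAM: one flat bytearray, no guard pages.
RAM = bytearray(64)

BUF_ADDR = 0     # 16-byte receive buffer (hypothetical layout)
BUF_SIZE = 16
STATE_ADDR = 16  # 4-byte connection state stored directly after the buffer

STATE_UNAUTHENTICATED = 0
STATE_AUTHENTICATED = 1


def set_state(value: int) -> None:
    struct.pack_into("<I", RAM, STATE_ADDR, value)


def get_state() -> int:
    return struct.unpack_from("<I", RAM, STATE_ADDR)[0]


def unchecked_copy(payload: bytes) -> None:
    """Mimics a memcpy() into the buffer with no check against BUF_SIZE."""
    for i, byte in enumerate(payload):
        RAM[BUF_ADDR + i] = byte


def handle_packet(payload: bytes) -> str:
    unchecked_copy(payload)
    return "OK"  # the device replies normally whether or not it overflowed


if __name__ == "__main__":
    set_state(STATE_UNAUTHENTICATED)
    # 16 filler bytes, then 4 bytes that land on the connection state.
    reply = handle_packet(b"A" * BUF_SIZE + struct.pack("<I", STATE_AUTHENTICATED))
    print(reply, get_state())
```

From the outside, the only observable result is the normal `OK` reply, even though the connection state has flipped to authenticated. A monitor that waits for a crash has nothing to react to.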

Emulation Fuzzing Is Better

Consider the following scenarios where emulation-based fuzz testing proves beneficial:

Network Packet Processing: A microcontroller handling network packets might suffer a buffer overflow that corrupts a connection state. While HITL testing may not reveal any immediate crash, an emulator can track the corrupted state and identify the potential for unauthorized access.

Control Flow Deviations: Overwriting a function pointer in a jump table might not lead to an instant crash but could cause misdirected execution later. Emulation can capture these deviations, allowing developers to understand and rectify the issue before it’s exploited.
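
Both scenarios can be sketched with a toy emulator that reports every memory write and indirect call to a monitor. This is a simplified illustration under stated assumptions, not the paper's implementation and not the API of any real emulator: `ToyEmulator`, `Monitor`, the addresses, and the program-counter values are all hypothetical.

```python
class Monitor:
    """Collects findings instead of waiting for the target to crash."""

    def __init__(self):
        self.findings = []
        self.guarded = []          # (start, end, allowed_pcs)
        self.entry_points = set()  # legitimate indirect-call targets

    def guard(self, start, size, allowed_pcs):
        """Only code at `allowed_pcs` may write to [start, start + size)."""
        self.guarded.append((start, start + size, frozenset(allowed_pcs)))

    def on_write(self, pc, addr, size):
        for start, end, allowed in self.guarded:
            overlaps = addr < end and addr + size > start
            if overlaps and pc not in allowed:
                self.findings.append(("guarded-write", pc, start))

    def on_call(self, pc, target):
        if target not in self.entry_points:
            self.findings.append(("bad-call-target", pc, target))


class ToyEmulator:
    """Stand-in for an emulator that exposes write and call hooks."""

    def __init__(self, monitor, ram_size=64):
        self.ram = bytearray(ram_size)
        self.monitor = monitor

    def mem_write(self, pc, addr, data):
        if addr < 0 or addr + len(data) > len(self.ram):
            raise IndexError("write outside emulated RAM")
        self.monitor.on_write(pc, addr, len(data))
        self.ram[addr:addr + len(data)] = data  # the write still happens

    def indirect_call(self, pc, target):
        self.monitor.on_call(pc, target)
        # A real emulator would continue execution at `target` here.


# Hypothetical memory layout and code addresses.
BUF_ADDR, STATE_ADDR, STATE_SIZE = 0, 16, 4
PC_COPY_LOOP, PC_AUTH_ROUTINE, PC_DISPATCH = 0x100, 0x200, 0x300
HANDLERS = {0x400, 0x440}

monitor = Monitor()
monitor.guard(STATE_ADDR, STATE_SIZE, allowed_pcs={PC_AUTH_ROUTINE})
monitor.entry_points.update(HANDLERS)
emu = ToyEmulator(monitor)

emu.mem_write(PC_AUTH_ROUTINE, STATE_ADDR, b"\x00" * 4)  # legitimate update
emu.mem_write(PC_COPY_LOOP, BUF_ADDR, b"A" * 20)         # overflow into state
emu.indirect_call(PC_DISPATCH, 0x400)                    # legitimate handler
emu.indirect_call(PC_DISPATCH, 0x41414141)               # corrupted pointer

for finding in monitor.findings:
    print(finding)
```

The design point is that the monitor flags the corrupting write and the bad call target at the moment they happen, while the emulated firmware keeps running as it would on hardware. Which regions to guard and which call targets count as legitimate must come from knowledge of the firmware, for example symbols or allocator hooks. This sketch assumes that information is available.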

What This Means for HITL Fuzzing

For security teams using HITL fuzzing products, the key takeaway is that relying solely on crash detection (or simple liveness checks) may leave many vulnerabilities undiscovered. In scenarios where the device lacks an MMU or does not incorporate runtime sanitization, critical memory corruption vulnerabilities might only manifest as subtle malfunctions, late crashes, or even no immediately observable effect at all.

In such cases, emulation-based fuzzing can significantly increase the detection rate of memory corruptions, because an emulator can observe memory accesses and internal state rather than only externally visible crashes.