GreyNoise rarely needs detailed firmware analysis: if it’s happening on the internet, we already see it. However, when we do need to investigate vulnerabilities in embedded devices, things can get complicated very quickly if no information is publicly available. It can be fun and insightful to learn these skills for the rare cases where we need them.
In late October 2022, we became aware of CVE-2022-41140, a buffer overflow and remote code execution vulnerability in D-Link routers, which D-Link had been notified of on February 17th, 2022. Noting the months-long turnaround time, we decided this was a good chance to perform a learning and discovery exercise.
On March 13th, 2023 we became aware of CVE-2023-24762, a command injection vulnerability in D-Link DIR-867 devices. This recent CVE spurred us to share some of our internal documentation regarding a research spike into D-Link devices.
This blog aims to explain the process of gaining a foothold in firmware or a physical device for vulnerability research, up to the point of having a debuggable interface. It uses existing proof-of-concept code for (yet another) D-Link vulnerability, CVE-2022-1262, and gives strong hints at suspect areas of code, but don’t expect to find any new ready-to-fire exploits buried in the contents below.
What Vulnerability?
D-Link was notified of CVE-2022-41140, a buffer overflow vulnerability, on February 17th, 2022. By November 15th, 2022, no additional information had been published, which prompted us to look for available hints about the nature of the vulnerability. That gap says something about the current state of public vulnerability tracking. We start our investigation with a simple Google search for the CVE and find two relevant links:
- https://www.zerodayinitiative.com/advisories/ZDI-CAN-13796/
- https://supportannouncement.us.dlink.com/announcement/publication.aspx?name=SAP10291
The Zero Day Initiative describes the vulnerability as follows:
(…) flaw exists within the lighttpd service, which listens on TCP port 80 by default. The issue results from the lack of proper validation of the length of user-supplied data prior to copying it to a fixed-length stack-based buffer.
The D-Link Technical Support page provides more detailed information:
(…) a 3rd party security research team reported Buffer Overflow & RCE vulnerabilities in the Lighttpd software library utilized in DIR-867, DIR-878, and DIR-882/DIR-882-US router firmware.
A stack-based buffer overflow in the prog.cgi binary in D-Link DIR-867. A crafted HTTP request can cause the program to use strcat() to create a overly long string on a 512-byte stack buffer. Authentication is not required to exploit this vulnerability.
Additionally, the D-Link support page provides a table of the affected models:
Model | Affected FW | Fixed FW | Last Updated |
DIR-867 | v1.30B07 & Below | Under Development | 03/04/2022 |
DIR-878 | v1.30B08-Hotfix & Below | v1.30b08_Beta_Hotfix | 04/01/2022 |
DIR-882-US | v1.30B06-Hotfix & Below | Under Development | 03/04/2022 |
From this information, we can derive that the vulnerability is triggered by an HTTP request to TCP port 80, which will hit the lighttpd service and route to the prog.cgi binary resulting in an overflow on a 512-byte stack buffer.
We can also derive that, at the time of the advisory, a fix was available for some hardware models (the DIR-878) but still under development for others.
How to trigger the vulnerability?
The D-Link support pages provide links to download firmware images for the DIR-878, including base firmware versions like v1.30B08 as well as security hotfix versions like v1.30B08 Hotfix_04b.
Since we can access firmware images from both before and after the security patch for CVE-2022-41140, we will work through the steps in the sections that follow.
Obtain copies of prog.cgi
We start by downloading a known vulnerable version of the firmware for a model that also offers a patched version. We download DIR-878_REVA_FIRMWARE_v1.30B08.zip and extract the firmware image DIR_878_FW1.30B08.bin.
We run the file command to quickly determine if it’s a commonly known file type. Unfortunately, this returns generic information.
Next, we use a more specialized tool, binwalk, which assists in searching binary images for embedded files and executable code. Again, this produces no results.
A handy feature of binwalk is the -E (--entropy) command line flag, which lets you measure the entropy, or “randomness”, of a file.
As an example, here is an entropy graph of 1024 bytes of Lorem ipsum:
And here is an entropy graph of DIR_878_FW1.30B08.bin
As you can see, the entropy of our firmware image is very high. Typically, this is indicative that a file is in a compressed archive format or is encrypted. Since neither file nor binwalk identified it as a compressed archive format, it’s reasonable to assume that it may be encrypted.
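To make the idea of entropy concrete, here is a minimal Python sketch of the Shannon entropy calculation that such a graph is based on. It is not binwalk's implementation; the function names and the 1024-byte block size are assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of -p * log2(p) over every byte value that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def block_entropies(data: bytes, block_size: int = 1024) -> List[float]:
    """Entropy of each consecutive block, one value per point on a graph."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Plain English text tends to score well below 8 bits per byte, while compressed or encrypted data sits close to 8 across the whole file, which is the pattern described above.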
If you believe a file is encrypted, it’s always a good idea to take a peek at the bytes at the start of the file, just in case there’s an identifiable file header:
At the start of the file is a 4-byte sequence that maps to the ASCII characters “SHRS”.
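The same header check can be scripted. The following is a small Python sketch; the helper names are hypothetical, and the only fact taken from the analysis above is that the image starts with the four ASCII bytes “SHRS”.

```python
def read_magic(path: str, length: int = 4) -> bytes:
    """Return the first `length` bytes of the file at `path`."""
    with open(path, "rb") as f:
        return f.read(length)


def looks_like_shrs(path: str) -> bool:
    """True if the file begins with the ASCII bytes 'SHRS'."""
    return read_magic(path) == b"SHRS"
```

Calling looks_like_shrs("DIR_878_FW1.30B08.bin") on the downloaded image would be one way to triage a batch of firmware files before attempting decryption.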
A quick Google search for “SHRS firmware” turns up relevant results, indicating that we’re on the right track.
- https://github.com/0xricksanchez/dlink-decrypt/blob/master/dlink-dec.py
- https://0x00sec.org/t/breaking-the-d-link-dir3060-firmware-encryption-recon-part-1/21943
- https://0x00sec.org/t/breaking-the-d-link-dir3060-firmware-encryption-static-analysis-of-the-decryption-routine-part-2-1/22099
After a bit of reading, we can determine that D-Link does indeed encrypt some of its firmware, which is identifiable by the “SHRS” header. The blog posts linked above go into depth on how the author obtained a copy of the imgdecrypt binary and reverse engineered it to determine how to decrypt the firmware, producing the linked Python script.
Since we will be dealing with encryption again later in this blog, we won’t go into depth on this specific layer of encryption. Our firmware can be decrypted with:
Taking our decrypted firmware and running it through binwalk again, we can see that some file signatures are recognized.
Since file signatures were recognized, we can recursively extract them using the -e (--extract) and -M (--matryoshka) command line flags.
This creates nested folders for each extracted layer of the file, ultimately resulting in a cpio-root folder containing the root filesystem for the firmware.
The desired prog.cgi file is located exactly where those familiar with *nix directory structures would expect it to be. However, for completeness, the file can be located by name using:
Now we have a copy of the entire root filesystem, including prog.cgi.
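Locating a file by name inside the extracted tree can also be sketched in Python. This is only an illustration: the function name is hypothetical, and the root directory you pass in depends on where binwalk placed the extracted cpio-root folder on your machine.

```python
import os
from typing import List


def find_by_name(root: str, filename: str) -> List[str]:
    """Return every path under `root` whose final component equals `filename`."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    return sorted(matches)
```

For example, find_by_name("cpio-root", "prog.cgi") would list every copy of prog.cgi in the extracted root filesystem.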
Repeating the same steps on the patched firmware sets us up for the next step.
Patch Diffing
In the previous step, we obtained an unpatched and patched copy of prog.cgi. We’ll rename them prog_old.cgi and prog_new.cgi, respectively, to help keep track.
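Before diffing, it can be worth confirming that the two copies really are different files. A minimal sketch, assuming the two binaries sit in the current directory under the names above; the helper names are hypothetical.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_differ(path_a: str, path_b: str) -> bool:
    """True if the two files have different SHA-256 digests."""
    return sha256_of(path_a) != sha256_of(path_b)
```

If files_differ("prog_old.cgi", "prog_new.cgi") returned False, the patched firmware would not have touched this binary and the diffing effort would belong elsewhere.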
BinDiff is a comparison tool for binary files that helps vulnerability researchers and engineers quickly find differences and similarities in disassembled code.
For this blog, we’ll be using Binary Ninja with the BinDiff Viewer Plugin. Roughly comparable free alternatives, such as Ghidra and its plugins, also exist.
Following the relevant plugin steps to generate a bindiff, we open old/new and begin to look for functions whose similarity score is very high but not exactly 1.00, indicating that a small change such as a patch may have been made.
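The filtering step can be expressed in a few lines of Python. This sketch assumes the matched function pairs have already been exported as (old name, new name, similarity) tuples; that tuple layout, the 0.90 lower bound, and the function name are assumptions for illustration, not part of BinDiff or the plugin.

```python
from typing import Iterable, List, Tuple

# (old function name, new function name, similarity score between 0.0 and 1.0)
Match = Tuple[str, str, float]


def patch_candidates(matches: Iterable[Match], low: float = 0.90) -> List[Match]:
    """Keep pairs that are very similar but not identical, most similar first."""
    candidates = [m for m in matches if low <= m[2] < 1.0]
    return sorted(candidates, key=lambda m: m[2], reverse=True)
```

Identical functions (score 1.00) are dropped because nothing changed in them, and heavily rewritten functions fall below the lower bound, leaving a short list worth reading by hand.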
Uses of strcat()
Using our list of similar (but not exact duplicate!) functions, we work our way down the list, looking for uses of strcat() that have changed between old/new. In this example, the main function:
Old
New
Here we can see that the old binary used strcat() and the new binary has a different set of logic.
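The strcat() function appends the source string to the end of the destination string and stores the result in the destination. It performs no bounds checking, so the destination buffer must already be large enough to hold both strings plus the terminating NUL byte.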
A quick check of the destination var_20c shows that its size is 0x200, or 512 bytes. For a sanity check, we can list all uses of strcat() throughout the binary.
There are 22 uses of strcat(). After reviewing them, only the usage within main operates on a 512-byte buffer.
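To illustrate why a strcat() into a 512-byte stack buffer is dangerous, here is a toy Python model of the arithmetic involved. It is only a sketch: the real function does not report anything and simply writes past the end of the buffer, and the helper name is hypothetical.

```python
BUF_SIZE = 0x200  # 512 bytes, the size of the destination buffer noted above


def strcat_overflow(dest: bytes, src: bytes, buf_size: int = BUF_SIZE) -> int:
    """Return how many bytes a C strcat(dest, src) would write past the buffer.

    `dest` and `src` are the string contents without their NUL terminators.
    strcat() leaves len(dest) + len(src) bytes in the buffer plus one NUL.
    """
    needed = len(dest) + len(src) + 1
    return max(0, needed - buf_size)
```

With an empty destination, a 511-byte source fits exactly (511 bytes plus the NUL), while anything longer spills onto whatever sits next to the buffer on the stack.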
We now have a reasonable candidate for the location of the vulnerability.
Debugging with Emulation
Now that we have a reasonable candidate for a vulnerable code path, the next step is to start determining what conditions are required to actually reach the vulnerable code path. While wiser minds may be able to determine these conditions without needing a debugger, it’s always a safe bet to make getting a debugging interface a priority.
We want to run the necessary components and attach a debugging interface to a running program.
First, we need to determine the attributes of the file we would like to emulate. The file command we used earlier can be used to identify important information about the architecture the binary is meant to run on.
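The fields that matter here live in the first few bytes of the ELF header, so they can also be read directly. The sketch below parses the standard ELF identification bytes and the e_machine field; the function name and the small machine-name table are assumptions and cover only a handful of architectures.

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> dict:
    """Decode word size, byte order and machine type from an ELF header.

    `header` must contain at least the first 20 bytes of the file.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])              # EI_CLASS
    endian = {1: "little", 2: "big"}.get(header[5])   # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognized ELF class or data encoding")
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)  # e_machine
    name = EM_NAMES.get(machine, "unknown (%d)" % machine)
    return {"bits": bits, "endian": endian, "machine": name}
```

Reading the first 20 bytes of a binary with open(path, "rb") and passing them to describe_elf() gives the word size, byte order, and machine type needed to pick a matching emulator.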
QEMU’s user-mode emulation is an easy way to run binaries built for another architecture but the same operating system as the host. In this case, we want qemu-mipsel-static, which is provided by the qemu-user-static package.
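As a rough sketch of what an invocation might look like, the helper below builds a command line for QEMU user-mode emulation. The -L option (prefix used to locate the ELF interpreter and libraries) and the -g option (wait for a GDB connection on a port) are standard qemu-user options; the helper name, the root filesystem directory, and the binary path are placeholders, and a real firmware binary will usually need more environment set up before it runs cleanly.

```python
import os
from typing import List, Optional


def qemu_user_argv(
    rootfs: str,
    binary: str,
    gdb_port: Optional[int] = None,
    emulator: str = "qemu-mipsel-static",
) -> List[str]:
    """Build an argv list for running `binary` from `rootfs` under qemu-user.

    `binary` is a path relative to the extracted root filesystem.
    """
    argv = [emulator, "-L", rootfs]
    if gdb_port is not None:
        # Pause at startup until a debugger attaches to this TCP port.
        argv += ["-g", str(gdb_port)]
    argv.append(os.path.join(rootfs, binary.lstrip("/")))
    return argv
```

The resulting list can be handed to subprocess.run(); with gdb_port set, a MIPS-capable GDB can then connect using target remote on that port.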
However, we need to know what to run.
There are init scripts that run when a system boots, and we can find the relevant one in /etc_ro/rcS:
It’s best to start at the top and work your way down, searching online for anything unfamiliar. In order, the script does the following:
- Filesystems are mounted
- /var/run folder is created if it doesn’t exist
- A script to create device (/dev) links is run
- The Message Of The Day (motd) is written to the device console
- A binary to manage reading/writing to non-volatile random-access memory (nvram) is started in the background
- A binary init_system is run with the start command
- A telnet daemon is started
/var/log folder is created if it d