Introduction
Before we dive into the technical details, we want to tip our hats to the teams behind binwalk, ubi_reader, jefferson, and yaffshiv and express our respect and admiration for the work they have put into these projects over the years and for all their contributions to the security community. Without them, and many other great projects, security analysis of IoT devices would not be where it is today. With the fading maintenance of binwalk, we too were inspired to contribute to the security community and open source our internal extraction framework, unblob. Our objective with this blog post is to summarize some of the pitfalls of dealing with untrusted data and to raise awareness about path traversal security issues and the impact they may have.
With that out of the way, let’s dive in!
As detailed in my Black Alps talk, over the summer of 2022 we audited the code bases of multiple third-party extractors that unblob relies on and identified multiple issues, ranging from logic bugs leading to extraction failures to path traversals. In the process, I learned a lot about the many different ways you can end up with a path traversal in Python.
Around October 2022, I had the realization that if all those third-party dependencies are suffering from some variation of these insecure coding patterns, binwalk may be too. So, I started looking and soon enough found a path traversal within the PFS filesystem extractor. I then found a way to gain remote code execution by abusing binwalk’s plugin system over lunch at hardwear.io with Mücahid.
As explained in the pull request I sent on October 26th, I took the liberty to report [it] in the open since #556 was fixed that way and I did not find any security/coordinated disclosure policy or contact info. At the time of publication, the vulnerability has yet to be patched.
Path Traversal in Binwalk
Affected vendor & product | ReFirm Labs binwalk |
Vendor Advisory | None at this time. |
Vulnerable version | 2.1.2b through 2.3.3 included |
Fixed version | None at this time. |
CVE IDs | CVE-2022-4510 |
Impact (CVSS) | 7.8 (high) AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H |
Credit | Q. Kaiser, ONEKEY Research Lab |
Summary
A path traversal vulnerability was identified in ReFirm Labs binwalk from version 2.1.2b through 2.3.3 (inclusive). This vulnerability allows remote attackers to execute arbitrary code on affected installations of binwalk. User interaction is required to exploit this vulnerability in that the target must open the malicious file with binwalk using extract mode (-e option).
The bug
PFS is an obscure filesystem format found in some embedded devices. The only public documentation comes from a tool named pfstool written by Peter Lekensteyn.
A PFS extractor plugin was merged into binwalk in 2017 with commit d023454, and a path traversal mitigation attempt was introduced with commit 58d1d92 on the same day. This commit introduced the following change:
 def extractor(self, fname):
     fname = os.path.abspath(fname)
+    out_dir = binwalk.core.common.unique_file_name(os.path.join(os.path.dirname(fname), "pfs-root"))
+
     try:
         with PFS(fname) as fs:
             # The end of PFS meta data is the start of the actual data
-            data = open(fname, 'rb')
+            data = binwalk.core.common.BlockFile(fname, 'rb')
             data.seek(fs.get_end_of_meta_data())
             for entry in fs.entries():
-                self._create_dir_from_fname(entry.fname)
-                outfile = open(entry.fname, 'wb')
-                outfile.write(data.read(entry.fsize))
-                outfile.close()
+                outfile_path = os.path.join(out_dir, entry.fname)
+                if not outfile_path.startswith(out_dir): # this branch will never be taken
+                    binwalk.core.common.warning("Unpfs extractor detected directory traversal attempt for file: '%s'. Refusing to extract." % outfile_path)
+                else:
+                    self._create_dir_from_fname(outfile_path)
+                    outfile = binwalk.core.common.BlockFile(outfile_path, 'wb')
+                    outfile.write(data.read(entry.fsize))
+                    outfile.close()
             data.close()
     except KeyboardInterrupt as e:
         raise e
The issue is that os.path.join, on line 16, does not normalize the path it builds: any ../ sequence is kept verbatim. The joined path therefore still starts with out_dir, and the condition on line 17 is never true for a ../ traversal. Here’s an example of that behavior:
>>> os.path.join("/tmp", "../etc/passwd")
'/tmp/../etc/passwd'
>>> os.path.abspath(os.path.join("/tmp", "../etc/passwd"))
'/etc/passwd'
By crafting a valid PFS filesystem with filenames containing the ../ traversal sequence, we can force binwalk to write files outside of the extraction directory.
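The behavior can be reproduced safely inside a throwaway directory. The following is a minimal sketch, not binwalk code: naive_extract is a hypothetical helper that mimics the join-then-startswith check, and the file name and payload are made up for illustration.

```python
import os
import tempfile


def naive_extract(out_dir, name, payload):
    """Hypothetical helper: join, then compare prefixes without normalizing."""
    target = os.path.join(out_dir, name)
    if not target.startswith(out_dir):
        raise ValueError("traversal attempt: %s" % target)
    with open(target, "wb") as handle:
        handle.write(payload)
    # Return the resolved location so the caller can see where the file landed.
    return os.path.realpath(target)


with tempfile.TemporaryDirectory() as sandbox:
    sandbox = os.path.realpath(sandbox)
    out_dir = os.path.join(sandbox, "pfs-root")
    os.makedirs(out_dir)
    # The check passes, yet the file is created next to pfs-root, not inside it.
    written = naive_extract(out_dir, "../escaped.txt", b"hello")
    escaped = os.path.dirname(written) != out_dir
    print(written, "escaped:", escaped)
```

The check does not raise because the unnormalized string still begins with the extraction directory, even though the operating system resolves it to the parent directory.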
Our fix
Our fix simply introduces a call to os.path.abspath on line 8 so that the built path is normalized before it is checked.
--- a/src/binwalk/plugins/unpfs.py
+++ b/src/binwalk/plugins/unpfs.py
@@ -104,7 +104,7 @@ class PFSExtractor(binwalk.core.plugin.Plugin):
             data = binwalk.core.common.BlockFile(fname, 'rb')
             data.seek(fs.get_end_of_meta_data())
             for entry in fs.entries():
-                outfile_path = os.path.join(out_dir, entry.fname)
+                outfile_path = os.path.abspath(os.path.join(out_dir, entry.fname))
                 if not outfile_path.startswith(out_dir):
                     binwalk.core.common.warning("Unpfs extractor detected directory traversal attempt for file: '%s'. Refusing to extract." % outfile_path)
                 else:
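For readers writing their own extractors, a stricter containment check can compare whole path components instead of string prefixes. This is a sketch under our own assumptions, not the change submitted upstream: is_within_directory is a hypothetical helper, and /tmp/pfs-root is only an example path.

```python
import os


def is_within_directory(base_dir, candidate):
    """Return True if candidate resolves to base_dir or a path beneath it."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, candidate))
    # commonpath compares whole path components, not raw string prefixes.
    return os.path.commonpath([base, target]) == base


out_dir = "/tmp/pfs-root"
# A plain prefix test accepts a sibling directory that shares the prefix.
sibling = os.path.abspath(os.path.join(out_dir, "../pfs-root-evil/file"))
print(sibling.startswith(out_dir))
print(is_within_directory(out_dir, "../pfs-root-evil/file"))
print(is_within_directory(out_dir, "etc/config"))
print(is_within_directory(out_dir, "/etc/passwd"))
```

On a POSIX system the prefix test on the first print accepts the sibling path, while the component-wise check rejects it, accepts the nested relative name, and rejects the absolute one. Using os.path.realpath also resolves symbolic links that already exist on disk, which os.path.abspath does not.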
Exploitation Strategy
There are plenty of ways to get remote command execution from a path traversal (e.g., by overwriting .ssh/authorized_keys to obtain password-less SSH access, or by overwriting ~/.bashrc to execute arbitrary commands on the next login), but I wanted something that was environment agnostic and relied on what’s already there. Enter binwalk plugins.
Since the early days of binwalk, users have been able to define their own plugins using binwalk’s API. As indicated in the documentation:
“Activating a plugin is as simple as dropping it in binwalk’s plugin directory $HOME/.config/binwalk/plugins/. The plugin will then be loaded on all subsequent binwalk scans.”
So, if we exploit the path traversal to write a valid plugin at that location, binwalk will immediately pick it up and execute it while it’s still scanning the malicious file. On top of that, the PFS extractor will take care of creating all required directories if they do not exist, so we don’t need to expect anything from the system we’re running on.
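On the defensive side, it can be worth checking whether anything unexpected lives in that directory. The following is a minimal sketch that only relies on the plugin location quoted above; list_user_plugins is a hypothetical helper name and the .py filter is our own assumption about what a plugin file looks like.

```python
import os


def list_user_plugins(home_dir=None):
    """List Python files in binwalk's per-user plugin directory, if it exists."""
    home = home_dir or os.path.expanduser("~")
    plugin_dir = os.path.join(home, ".config", "binwalk", "plugins")
    if not os.path.isdir(plugin_dir):
        return []
    return sorted(
        os.path.join(plugin_dir, name)
        for name in os.listdir(plugin_dir)
        if name.endswith(".py")
    )


for path in list_user_plugins():
    print("user plugin present:", path)
```

Any entry you did not install yourself deserves a closer look, keeping in mind that the plugin shown in this post removes itself after running.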
This is the plugin I ended up writing. The plugin executes twice since it does not define an explicit MODULE attribute that defines its purpose (e.g., signature scan, entropy calculation, compression stream identification). I take advantage of that behavior to make it clean up after itself.
import binwalk.core.plugin
import os
import shutil

class MaliciousExtractor(binwalk.core.plugin.Plugin):
    """
    Malicious binwalk plugin
    """

    def init(self):
        if not os.path.exists("/tmp/.binwalk"):
            # First execution: run the command and leave a marker file
            os.system("id")
            with open("/tmp/.binwalk", "w") as f:
                f.write("1")
        else:
            # Second execution: remove the marker, the plugin, and its bytecode cache
            os.remove("/tmp/.binwalk")
            os.remove(os.path.abspath(__file__))
            shutil.rmtree(os.path.join(os.path.dirname(os.path.abspath(__file__)), "__pycache__"))
Crafting a malicious PFS file is left as an exercise to the reader.
Demo
Here’s a video demo of the exploit:
Future Work
The “D-Link RomFS” plugin is probably affected by a similar vulnerability but the format, which is actually eCOS RomFS, is not parsed properly (see this PR for a fix). I did not want to load two opposing format constructs in my brain just to come up with a proof-of-concept. As a former colleague of mine would have said: CBA.
Key Takeaways
As a security industry, every now and then we need to look in the mirror and validate the security of our own technology stack. This becomes especially critical in forensic analysis and reverse engineering, where we are commonly faced with untrusted, potentially malicious files.
While the path traversals described in this article have the potential to void any reverse engineering efforts and to tamper with evidence collected, they also demonstrate the importance of sandboxing analysis environments to limit the impact of such vulnerabilities. Especially with the rise of automated extraction and analysis tools relying on tools like binwalk (e.g., FACT, ofrak, EMBA), it’s important for developers and users of those solutions to be aware of the risks.
Timeline
2022-10-24 – Attempted to get in touch with ReFirm Labs, but found no security policy and their domains were down.
2022-10-26 – Decided to send a pull request with the fix (https://github.com/ReFirmLabs/binwalk/pull/617) so that it could be immediately integrated.
2022-11-17 – Live demo of the exploit during our talk at Black Alps.
2023-01-24 – Since the CPE of the latest binwalk vulnerability states microsoft:binwalk and ReFirm Labs was acquired in 2021, we reported it to MSRC. It turns out MSRC does not consider it a Microsoft product and the CPE was chosen this way by VulDB.
2023-01-25 – Since we’re a CNA and we’re not seeing any movement on the repository, we decided to create a dedicated CVE so that users are aware of the issue.
2023-01-31 – ONEKEY releases its advisory
Python Path Traversal Code Patterns
All of the code examples provided below are illustrations of the insecure code patterns observed in the affected projects. You can click on the link provided in each description to open the pull request highlighting the actual code.
ubi_reader – no path traversal verification at all
Affected vendor & product | jrspruitt:ubi_reader |
Vulnerable version | < 0.8.5 |
Fixed version | 0.8.5 |
CVE IDs | CVE-2023-0591 |
Impact (CVSS) | 5.5 (medium) AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N |
Credit | Q. Kaiser, ONEKEY Research Lab |
As seen in ubi_reader, the code does not attempt to protect against traversal.
import os

extraction_dir = "/tmp"
for filename in filenames:
    extraction_path = os.path.join(extraction_dir, filename)
jefferson – no path traversal verification at all
Affected vendor & product | sviehb:jefferson |
Vulnerable version | < 0.4.1 |
Fixed version | 0.4.1 |
CVE IDs | CVE-2023-0592 |
Impact (CVSS) | 5.5 (medium) AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N |
Credit | Q. Kaiser, ONEKEY Research Lab |
A similar pattern with a slightly different signature, observed in jefferson.
import os

extraction_dir = "/tmp"
for filename in filenames:
    extraction_path = os.path.join(os.getcwd(), extraction_dir, filename)
yaffshiv – misunderstanding os.path.join’s argument precedence
Affected vendor & product | devttys0:yaffshiv |
Vulnerable version | <= 0.1 |
Fixed version | None |
CVE IDs | CVE-2023-0593 |
Impact (CVSS) | 5.5 (medium) AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N |
Credit | Q. Kaiser, ONEKEY Research Lab |
The code assumes that filename does not start with a forward slash. Observed in yaffshiv.
import os

extraction_dir = "/tmp"
for filename in filenames:
    file_path = os.path.join(extraction_dir, filename)
    # Only '..' is checked, so an absolute filename passes unnoticed
    if '..' in file_path:
        raise Exception("Path traversal attempt, aborting.")
Whenever a later argument of os.path.join is an absolute path, all previous components are discarded and joining continues from that argument.
>>> os.path.join("/tmp", "home/traversal")
'/tmp/home/traversal'
>>> os.path.join("/tmp", "/home/traversal")
'/home/traversal'
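A name-level check therefore has to reject absolute names as well as traversal components. The following is a minimal sketch and not the code used by any of the projects discussed here: is_safe_member_name is a hypothetical helper, and posixpath is used so that the behavior is the same on every platform.

```python
import posixpath


def is_safe_member_name(name):
    """Lexically reject empty names, absolute names, and any '..' component."""
    if not name or name.startswith("/"):
        return False
    # Compare whole components so that a name like 'notes..txt' stays valid.
    return ".." not in name.split("/")


for candidate in ("home/traversal", "/home/traversal", "a/../../b", "notes..txt"):
    joined = posixpath.join("/tmp", candidate)
    print(candidate, "->", joined, "safe:", is_safe_member_name(candidate))
```

Comparing components rather than searching for the substring avoids rejecting legitimate names that merely contain two dots. A purely lexical check like this one says nothing about symbolic links, so it is best paired with a check on the resolved destination path.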
binwalk’s unpfs – misunderstanding os.path.join’s lack of resolution
The code assumes that os.path.join returns a normalized absolute path, which it doesn’t.
import os

extraction_dir = "/tmp"
for filename in filenames:
    outfile_path = os.path.join(extraction_dir, filename)
    if not outfile_path.startswith(extraction_dir):  # never True for a "../" traversal
        raise Exception("Path traversal attempt, aborting.")
This is what it looks like:
>>> os.path.join("/tmp", "../etc/passwd")
'/tmp/../etc/passwd'
>>> os.path.abspath(os.path.join("/tmp", "../etc/passwd"))
'/etc/passwd'