4.5 Case Studies

4.5.3 XSS in WiFi Enabled SD Cards?

SD cards are often more complex than one would imagine. Most SD cards actually contain a processor which runs firmware. This processor often manages functions such as the flash memory translation layer and wear leveling. Security issues have been previously shown on such SD cards [202].

Some SD cards have an embedded WiFi interface with a full-fledged web server.

This interface allows direct access to the files on the SD card without ejecting it from the device in which it is inserted. It also allows administration of the SD card configuration (e.g., WiFi access points).

Large Scale Security Analysis of Embedded Devices’ Firmware

Figure 4.4: Correlation engine and shared self-signed certificates clustering. (Diagram labels: Analysis & Reports Database; HTTPS Ecosystem Scans; ZMap IP addresses; Private RSA keys with cracked passphrase; Common Vulnerable Components; VendorB and VendorC sharing the same private RSA key; Device1 and Device2 from different vendors sharing the same self-signed SSL certificate.)

We manually found a Cross Site Scripting (XSS) vulnerability in one of these web interfaces, which consists of a Perl-based web application. As this web application does not have platform-specific binary bindings, we were able to load the files inside a similar Boa web server on a PC and confirm the vulnerability.

Once we found the exact Perl files responsible for the XSS, we used our correlation engine based on fuzzy hashes. With this we automatically found another SD card firmware that is vulnerable to the same XSS. Even though the Perl files were slightly different, they were clearly identified as similar by the fuzzy hash. This correlation would not have been detected by a normal checksum or by a regular hash function.
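The difference between a regular hash and a similarity measure can be illustrated with a minimal sketch. It does not reproduce the fuzzy hashing algorithm used by the correlation engine; as a stand-in, it assumes a simple Jaccard similarity over overlapping byte windows ("shingles"). The Perl snippets and the names `sha256_hex`, `shingles` and `similarity` are hypothetical and exist only for this illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Regular cryptographic hash: a small edit gives an unrelated digest."""
    return hashlib.sha256(data).hexdigest()


def shingles(data: bytes, size: int = 8) -> set:
    """Set of all overlapping byte windows of length `size`."""
    if len(data) < size:
        return {data}
    return {data[i:i + size] for i in range(len(data) - size + 1)}


def similarity(a: bytes, b: bytes, size: int = 8) -> float:
    """Jaccard similarity of the two shingle sets, between 0.0 and 1.0.

    This is a stand-in for a fuzzy hash comparison, not the real algorithm.
    """
    sa, sb = shingles(a, size), shingles(b, size)
    return len(sa & sb) / len(sa | sb)


# Hypothetical Perl CGI file and a slightly modified copy of it.
ORIGINAL = b"\n".join([
    rb"#!/usr/bin/perl",
    rb"use CGI;",
    rb"my $q = CGI->new;",
    rb"my $dir = $q->param('dir');",
    rb'print "Content-type: text/html\n\n";',
    rb'print "<h1>Listing of $dir</h1>\n";',
])
VARIANT = ORIGINAL.replace(b"Listing of", b"Contents of") + b"\n# v2"

print("same sha256:", sha256_hex(ORIGINAL) == sha256_hex(VARIANT))
print("similarity :", round(similarity(ORIGINAL, VARIANT), 2))
```

The digests of the two files differ, so an exact-match lookup would miss the relation, while the similarity score stays high because most byte windows are shared.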

The process is visualized in Figure 4.5. The file (*) was found vulnerable. Subsequently, we identified correlated files based on fuzzy hashing. Some of them were related to the same firmware or a previous version of the firmware of the same vendor (in red). Also, fuzzy hash correlation identified a similar file in a firmware from a different vendor (in orange) that is vulnerable to the same weakness. It further identified some non-vulnerable or non-related files from other vendors (in green).
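The propagation step described above can be sketched as a graph traversal. The sketch assumes that correlation results are available as (file, file, score) triples and that propagation follows every edge whose score reaches a chosen threshold; the file names, scores, threshold and the function `propagate` are all hypothetical. The result is only a list of candidates: as the non-vulnerable correlated files show, each candidate still has to be confirmed.

```python
from collections import defaultdict, deque
from typing import Iterable, Set, Tuple

# (file_a, file_b, correlation score); the score scale is an assumption.
Edge = Tuple[str, str, int]


def propagate(seed: str, edges: Iterable[Edge], threshold: int) -> Set[str]:
    """Return files reachable from `seed` over edges scoring >= threshold."""
    graph = defaultdict(set)
    for a, b, score in edges:
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen = {seed}
    queue = deque([seed])
    while queue:  # breadth-first traversal of the strong edges
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen - {seed}


# Hypothetical correlation results for a vulnerable seed file.
EDGES = [
    ("vendorA/fw1/list.pl", "vendorA/fw1/list_old.pl", 96),
    ("vendorA/fw1/list.pl", "vendorA/fw0/list.pl", 88),
    ("vendorA/fw0/list.pl", "vendorB/fw3/browse.pl", 71),
    ("vendorB/fw3/browse.pl", "vendorC/fw9/index.pl", 35),
]

for candidate in sorted(propagate("vendorA/fw1/list.pl", EDGES, threshold=60)):
    print("candidate for confirmation:", candidate)
```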

Those findings are reported as CVE-2013-5637 and CVE-2013-5638. We were also able to confirm this vulnerability and extend the list of affected versions for one of these vendors.

However, such manual vulnerability confirmation does not scale. Therefore, we integrated some of the static analysis tools [16,44,85,105,139] into our scalable framework. We also developed dynamic analysis techniques that scale. In Chapter 5 we show how these techniques integrate into our framework, and we demonstrate their effectiveness by finding vulnerabilities in real-world firmware images.

4.6 Future Work

We plan to continue collecting new data and extend our analysis to all the firmware images we downloaded so far. Moreover, we want to extend our system with more sophisticated static analysis techniques that allow a more in-depth study of each firmware image. This approach shows a lot of potential and, besides the few previously mentioned case studies, it can lead to new interesting results such as the ones presented in Chapter 3.

Figure 4.5: Fuzzy hash clustering and vulnerability propagation. A vulnerability was propagated from a seed file (*) to other two files from the same firmware and three files from the same vendor (in red) as well as one file from another vendor (in orange). Also four non-vulnerable files (in green) have a strong correlation with vulnerable files. Edge thickness displays the strength of correlation between files. (Diagram labels: Same Vendor; Same Firmware.)

4.7 Summary

In this chapter we presented a large-scale static analysis of embedded firmware images. We showed that a broader view on firmware is not only beneficial, but actually necessary for discovering and analyzing vulnerabilities of embedded devices. Our study helps researchers and security analysts to put the security of particular devices in context, and allows them to see how known vulnerabilities that occur in one firmware reappear in the firmware of other manufacturers. The summarized datasets are available at http://firmware.re/usenixsec14.

In the next two chapters we describe several improvements to our framework. In Chapter 5, we attempt to emulate the firmware images by running the unpacked firmware inside the QEMU emulator. We do this to allow a scalable dynamic and static analysis. We show the effectiveness of the improvement by performing scalable analysis of embedded web interfaces within several hundreds of firmware images. Then, in Chapter 6, we apply machine learning to classify and label unknown firmware images. We also use multi-score fusion at the HTTP level to fingerprint embedded online devices. With these improvements, we partially address the “Building a Representative Dataset”, “Firmware Identification”, “Scalability and Computational Limits”, and “Results Confirmation” challenges presented in Section 4.2.

Chapter 5

Dynamic Firmware Analysis at Scale: A Case Study on Embedded Web Interfaces

5.1 Introduction

During the past few years, embedded devices have become more connected, forming what is called the Internet of Things (IoT). Such devices are often put online by composition: a communication interface is attached to an existing (insecure) device. Most of these devices lack the user interface of desktop computers (e.g., keyboard, video, mouse), but nevertheless need to be administered. Although some devices rely on custom protocols such as “thick” clients or even legacy interfaces (e.g., telnet), the web quickly became the universal “de facto” administration interface. Therefore, the firmware of these devices often embeds a web server running web applications that range from simple to fairly complex. For the rest of this chapter, we will refer to these as embedded web interfaces.

It is well known that making secure web applications is a difficult task. In particular, researchers showed that more than 70% of vulnerabilities are hosted in the (web) application layer [171]. Attackers, who are familiar with this fact, use a variety of techniques to exploit web applications. Well known vulnerabilities, such as SQL injection [62] or Cross Site Scripting (XSS) [199], are still frequently exploited and constitute a significant portion of the vulnerabilities discovered each year [71]. Additionally, vulnerabilities such as Cross Site Request Forgery (CSRF) [45], command injection [188], and HTTP response splitting [142] are also quite often present in web applications.

Given such a track record of security problems in both embedded systems and web applications, it is natural to expect the worst from embedded web interfaces. However, as we will discuss, those vulnerabilities are not easy to discover, analyze and confirm.

Analysis of embedded web interfaces. While there are solutions that can be used during the design phase of the software [128,153,178,179], it is also important to discover and patch existing vulnerabilities before they are found and abused “in the wild” by attackers. One way to do so is to use a “white box” approach, using static analysis of the source code [44,85,93,140]. Another technique is dynamic analysis, where the web interface is typically exercised against a number of known attack patterns [48,58].
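The idea of exercising an interface against known attack patterns can be sketched as follows, under strong simplifying assumptions. Instead of sending HTTP requests, the sketch calls a Python function that stands in for one page of a web interface; the two example pages, the payload list and the function `reflected_payloads` are hypothetical. It only checks whether a payload comes back unescaped, which is one simple indicator of reflected XSS and not what the cited tools implement.

```python
import html
from typing import Callable, Dict, List

# A page handler maps request parameters to the HTML it would return.
Handler = Callable[[Dict[str, str]], str]

# Two well-known XSS test strings, used here as example attack patterns.
PAYLOADS = [
    "<script>alert(1)</script>",
    '"><img src=x onerror=alert(1)>',
]


def vulnerable_page(params: Dict[str, str]) -> str:
    """Hypothetical page that echoes the parameter without escaping."""
    return "<h1>Listing of " + params.get("dir", "") + "</h1>"


def escaped_page(params: Dict[str, str]) -> str:
    """Same page, but the parameter is HTML-escaped before output."""
    return "<h1>Listing of " + html.escape(params.get("dir", "")) + "</h1>"


def reflected_payloads(handler: Handler, param: str,
                       payloads: List[str] = PAYLOADS) -> List[str]:
    """Return the payloads that appear verbatim in the handler's response."""
    return [p for p in payloads if p in handler({param: p})]


print("vulnerable page:", reflected_payloads(vulnerable_page, "dir"))
print("escaped page   :", reflected_payloads(escaped_page, "dir"))
```

In a real setting the handler call would be replaced by a request to the running web interface, which is why this kind of analysis requires the interface to be functional.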

Unfortunately, those tools are either inefficient or difficult to use for detecting vulnerabilities of embedded web servers [48,112]. Performing static analysis on embedded web interfaces is a rather simple task once the firmware has been unpacked. However, one main limitation of this approach is that the web interfaces often rely on various languages (e.g., PHP, CGIs, custom server-side languages), but the static analysis tools are usually designed for a particular one. In addition, many static analysis tools are merely “glorified greps” and have a large number of False Positives (FP), which makes them difficult to use reliably in a large-scale study. On the other hand, dynamic analysis tools [110,130] are more generic as they are less sensitive to the server-side language used. Nevertheless, they require the web interface to be functional. Unfortunately, it is difficult to create an environment that can perfectly emulate firmware images for various devices based on a variety of computing architectures and hardware designs.
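A small sketch may help show why a grep-like check produces false positives. It assumes a hypothetical rule that flags every Perl `print` statement interpolating a variable into a double-quoted string; the rule, the sample script and the function `grep_scan` are illustrative only and do not correspond to any of the cited tools.

```python
import re
from typing import List, Tuple

# Hypothetical rule: a Perl print of a double-quoted string containing $var.
PATTERN = re.compile(r'print\s+".*\$\w+.*"')

# Hypothetical Perl CGI fragment. Line 2 prints attacker-controlled input,
# line 4 prints a value that is a constant set by the script itself.
SAMPLE = "\n".join([
    "my $dir = $q->param('dir');",
    'print "<h1>Listing of $dir</h1>";',
    "my $count = 3;",
    'print "<p>$count entries</p>";',
])


def grep_scan(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for every line matching the rule."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if PATTERN.search(line)
    ]


for number, line in grep_scan(SAMPLE):
    print(f"line {number}: {line}")
```

Both `print` lines are reported, although only the first one outputs request data. Because the rule matches text and does not track where a value comes from, the second report is a false positive.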

Scalable dynamic analysis of embedded web interfaces. The easiest way to perform dynamic analysis is to run it on a live device. However, acquiring devices to dynamically analyze them is not scalable. Also, it is ethically questionable, if not illegal, to test devices one does not own (e.g., devices on the Internet). Another option would be to extract the web interface files from a device and load them in a test environment, like an Apache web server. Unfortunately, a large majority of the embedded web interfaces use native CGIs, bindings to local architecture-dependent tools or custom web server features which cannot be easily reproduced in a different environment (Section 5.2.4).

Emulating the firmware is an elegant method to perform dynamic analysis of a system, since it does not require the physical device to be present and can be completely performed in a controlled environment. Sadly, emulation of unknown devices is not easy because an embedded firmware expects specific hardware to be fully present, such as peripherals or memory layouts. Previous work improved the emulation of firmware images by forwarding I/O to the hardware [203], an approach that applies to many kinds of embedded systems, even those with monolithic firmware images. In a different approach [141], Linux-based embedded systems are emulated with a custom kernel that forwards ioctl requests to the embedded device that runs the original kernel. Those techniques achieve a rather good emulation, but require the presence of the original device, which does not scale. We observe that, in Linux-based embedded systems, the interaction with the hardware is usually performed from the kernel. Moreover, web interfaces often do not interact with the hardware or this interaction is indirect.

Figure 5.1: Overview of the analysis framework.

In this chapter, we propose a partial emulation of firmware images by replacing their kernel with a stock kernel (targeting the same architecture) and emulating the whole userland of the firmware using a hypervisor, such as QEMU [106]. We chroot the unpacked firmware and start the init program, the init scripts or, sometimes, directly the web server. Once (and if) the web server is up and operational, we use dynamic analysis tools to discover vulnerabilities in the system.

This approach has the advantage of being rather automated and generic. However, a complete emulation is very slow, and some firmware images and tools do not run properly.
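The steps above can be sketched as a small helper that only builds command lines and does not run anything. Everything specific in it is an assumption made for illustration: the candidate paths, the QEMU machine types and console names, the `root=/dev/sda1` kernel argument, and the function names `qemu_command`, `pick_entry_point` and `chroot_command`. It is not the framework's actual implementation.

```python
import os
import tempfile
from typing import List, Optional

# Hypothetical candidate paths, tried in the order given in the text:
# the init program, then init scripts, then the web server itself.
INIT_PROGRAMS = ["sbin/init", "bin/init"]
INIT_SCRIPTS = ["etc/init.d/rcS", "etc/rc.local"]
WEB_SERVERS = ["usr/sbin/httpd", "usr/sbin/lighttpd", "usr/sbin/boa"]

# Assumed per-architecture settings: QEMU binary, machine, serial console.
QEMU_TARGETS = {
    "arm": ("qemu-system-arm", "versatilepb", "ttyAMA0"),
    "mips": ("qemu-system-mips", "malta", "ttyS0"),
}


def qemu_command(arch: str, kernel: str, disk: str) -> List[str]:
    """Build a QEMU command that boots a stock kernel with a disk image."""
    binary, machine, console = QEMU_TARGETS[arch]
    return [binary, "-M", machine, "-kernel", kernel, "-hda", disk,
            "-append", f"root=/dev/sda1 console={console}", "-nographic"]


def pick_entry_point(rootfs: str) -> Optional[str]:
    """Return the first candidate present in the unpacked root filesystem."""
    for rel in INIT_PROGRAMS + INIT_SCRIPTS + WEB_SERVERS:
        # lexists also accepts symlinks (e.g. init pointing to busybox)
        # whose absolute targets only resolve inside the chroot.
        if os.path.lexists(os.path.join(rootfs, rel)):
            return "/" + rel
    return None


def chroot_command(rootfs: str, entry: str) -> List[str]:
    """Command to run inside the emulated host: chroot and start `entry`."""
    return ["chroot", rootfs, entry]


# Usage on a throwaway directory that mimics an unpacked firmware.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "usr", "sbin"))
    open(os.path.join(rootfs, "usr", "sbin", "httpd"), "w").close()
    entry = pick_entry_point(rootfs)
    print(" ".join(qemu_command("arm", "zImage", "host-rootfs.img")))
    if entry is not None:
        print(" ".join(chroot_command(rootfs, entry)))
```

Keeping command construction separate from execution makes it possible to log or inspect what would be run for each firmware image before any emulation starts.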