• Aucun résultat trouvé

5.2 Exploring Techniques to Analyze Web Interfaces of Firmware Images 60

5.2.4 Running Web Interfaces

Dynamic analysis of web applications require a function web interface. There are different ways to launch the web interface that is present in the firmware of an embedded system, however, none of them are perfect. Some methods are very accurate but infeasible in our setup, such as emulating the firmware in a perfect emulator (which is not available). Other methods are much less accurate, like extracting the web application files and serving them from a generic web server.

Therefore, we evaluated different possibilities and describe their advantages and drawbacks (summarized in Figure5.2).

Hosting Web Interfaces Non-Natively

A straightforward way to launch a web interface from a firmware is to extract it (i.e., the document root) and launch it under a web server on an analysis environment, without trying to emulate the original web server and firmware.

For this, we need to identify the web server used by the firmware (e.g., boa, minihttpd, thttpd, lighttpd, apache) and to locate its original configura-tion. Then the web application is located (described in Section5.3.2), extracted and “transplanted” to thehosting environment. The hosting environment is set up using the same web server (or a very similar version) and the configuration files extracted from the firmware. The main advantage of this technique is that it does not require emulation, which dramatically simplifies the deployment and thus, it is easy to automate.

However, this approach has many limitations. For example, it is not possible to handle platform dependent binaries and CGIs. We analyzed the document roots within the 1580 firmware candidates for emulation and found that 57% out of those were using binary CGIs or were in some way bound to the platform2. In

2This is a lower bound as we did not count web scripts calling local system utilities, e.g.,

Large Scale Security Analysis of Embedded Devices’ Firmware

addition to this, the firmware images often use either customized web servers, or versions which are not available on normal systems (e.g.,uClinuxhas a custom version of the commonboa web server). Those often support options which are nontrivial to reproduce under a normal system without profoundly modifying the web server used. As a consequence, the chances of properly emulating the web interfaces using this technique is very low and after careful evaluation on a few firmware samples we decided not to use it in our framework.

Firmware and Web Interface Emulation

To emulate a firmware (or part of it) we need to detect the CPU architecture it is intended to execute on. Also, depending on the chosen emulation method, a generic kernel and a file system for that architecture could be required. Detecting the CPU architecture of each root filesystem is not trivial. For example, some firmware packages contain files for various architectures (e.g., ARM and MIPS).

Sometimes, vendors package two different firmware blobs into a single firmware update package. The firmware installer then picks the right architecture during the upgrade based on the detected hardware. In such cases, we try to emulate this filesystem with each detected architecture. We detect the architecture of each executable in a firmware using ELF headers or statistical opcode distribution for raw binaries. We then decide on the architecture by counting the number of architecture specific binaries it contains. Once we detected the right architecture, we use the QEMU emulator for that particular architecture. There are different possibilities for emulating the firmware images, which we now compare.

Perfect emulation. Ideally we would have a complete firmware (including the bootloader, kernel,. . . ) and a QEMU setup that perfectly emulates the hardware for which the firmware was designed. However, QEMU only emulates few plat-forms for each CPU architecture. Moreover, perfectly emulating unknown hard-ware is impossible, especially considering that hardhard-ware devices can be arbitrarily complex. In addition to this, hardware in embedded devices is often custom and its documentation is in general not available. It is therefore infeasible to adapt the emulator, and even less to apply this at a large scale.

Original kernel and original filesystem on a generic emulator. Reusing the kernel of the firmware could lead to a quite accurate emulation, in particular because it may export interfaces for some custom devices that are needed to properly emulate the system. Unfortunately, kernels for embedded systems are often customized and hence, do not support a wide range of peripherals. There-fore, using the original kernel is unlikely work very well. Furthermore, the original

using thesystem()call.

5.2. EXPLORING TECHNIQUES TO ANALYZE WEB INTERFACES OF

FIRMWARE IMAGES 65

kernel is often missing from the firmware image under analysis. For instance, we found only 107 kernels out of 1925 original firmware images. Consequently, for most of the embedded systems, the kernels have to be extracted from the device (e.g., via JTAG, flash dumping). We therefore did not try to use the original kernels.

Generic kernel and original filesystem on a generic emulator. We can eas-ily find or build a complete generic kernel for the architecture of a device3. Such a kernel will then boot properly on the generic emulator. The lack of the origi-nal kernel can reduce the accuracy of the emulation. However, we estimate this limitation does not affect significantly the web interface emulation and hence, the coverage of the dynamic analysis. Another drawback is the raw usage of the original firmware filesystem. As these firmware images are built differently by different vendors for varying devices, we cannot fully rely on the presence of required utilities, e.g.,SSH, SCP, netstat. Without such basic tools, it be-comes problematic to do automated management of the emulated hosts. We have evaluated this technique on a few selected firmware instances. Though we could build and install generic support tools, there are more elegant solutions which we are going to present.

Generic kernel and a generic filesystem, chrooting the firmware to ana-lyze. We first boot a generic Debian Linux system and generic kernel inside the QEMU emulator. Once this generic environment is up and operational, we copy the root filesystem of the firmware into the emulated environment. We then ch-root to the firmware’s filesystem and execute (inside the chch-root) the shell (e.g., /bin/sh) or the init binary (e.g., /sbin/init). Finally, in the chroot environ-ment we start the web server’s binary along with the web interface docuenviron-ment root and web configuration.

The main advantage of this approach is that it can provide almost complete em-ulation of the web interfaces of the embedded devices and is scalable. There are few drawbacks of this approach. First, emulating the system is not very fast; em-ulation is one order of magnitude slower than native execution. Additionally, the emulator environment setup and cleanup introduces a significant overhead. This approach also limits the number of platforms to those supported by theDebian Ports project. Luckily, the Debian project supports many common embedded architectures (e.g., arm, mips, powerpc).

Unfortunately, with this approach we cannot fully emulate the peripherals and specific kernel modules of the embedded devices. However, few firmware images and a limited part of embedded web interfaces actually interact directly with the

3We used the pre-compiled Debian Squeeze kernels and systems fromhttps://people.

debian.org/~aurel32/qemu/

Large Scale Security Analysis of Embedded Devices’ Firmware

peripherals. One such example is a web page that performs a firmware upgrade which in turn requires access to flash or NVRAMmemory peripherals.

In summary, this is the approach that we found to be most scalable and that provided the best results in starting the web interfaces for analysis. At the same time, this approach is a balanced trade-off between the emulation accuracy, complexity and speed (see Figure 5.2).

Architectural chroot. One way to improve the performance and emulation management aspects of our framework is by using architectural chroot [12], or QEMU static chroot. This technique uses chroot to emulate an environment for architectures other than the architecture of the running host itself. This basically relies on the Linux kernel’s ability to call an interpreter to execute an ELF executable for a foreign architecture. Registering the userland QEMU as an interpreter allows to execute arm Linux ELF executables on an x86_64 Linux system. Our approach is to do userland emulation only with architectural root (Figure 5.4). We have successfully tested this technique on a few firmware instances. This approach is fast and easy to manage (compared to the system QEMU emulation), however, it is quite fragile and requires significant additional work to be more reliable. In addition, while this approach has the advantage of improving emulation speed, it is unlikely to improve the number of firmware packages we can emulate in the end. Therefore, we did not use this technique in our experiments. We plan to explore this direction in more details in our future work.