Executing the Payload from the Heap

One limitation of the preceding technique is that if you want to call any subrou-tines that take pointer arguments, you need to be able to calculate or guess the address of the attack string in memory. A fl exible technique that overcomes the non-executable stack and Library Randomization, allowing you to execute an arbitrary existing payload without having to guess volatile memory addresses, would be ideal. On Mac OS X x86 10.4 and 10.5, Apple has made only the stack segments truly non-executable, not the other writable memory regions such as the data and heap segments. Copying the payload to the heap and transferring control to it there would allow you to use an arbitrary existing payload without modifi cation. In this section we will describe Dino Dai Zovi’s technique for overcoming Leopard’s Library Randomization and non-executable stack in an arbitrary stack-buffer-overfl ow exploit.

To do this, the technique takes advantage of several limitations of Leopard’s Library Randomization. Although Leopard randomizes the load address of most shared libraries and frameworks on the system, it notably does not randomize the base address of the dynamic linker itself, dyld. The dyld executable image is always loaded at the same base address, 0x8fe00000. In addition, since dyld cannot depend on any other libraries, it includes the code for any library functions that it needs within its own text segment. These two properties make it very useful for return-to-libc-style exploits because they can make use of the standard library functions at fi xed known locations in dyld’s text segment. With some creativity, an attacker can take advantage of this to create a return-into-libc attack string that copies the exploit payload into the heap and executes it directly from there.

One of the most interesting library functions available in dyld’s text segment is setjmp(). The setjmp() and longjmp() functions are used to implement non-local transfers of control by saving and restoring the execution environment,

95363c07.indd 176

95363c07.indd 176 1/25/09 4:41:47 PM1/25/09 4:41:47 PM

Chapter 7 ^■ Exploiting Stack Overflows 177 respectively. In practice, the execution environment is the signal context and values of the nonvolatile registers. Here are the declarations of the functions on Mac OS X from /usr/include/setjmp.h and _setjmp.s in the Libc source code.

#include <setjmp.h>

As you can see, the jmp_buf argument to setjmp is just an array of machine words. The technique is based on returning to the setjmp() function and then returning within the jmp_buf to execute the values of controlled registers as machine-code instructions. Since we know which registers’ contents are over-written with values from our attack string, we can return to known offsets from the jmp_buf pointer to execute those values as CPU instructions.

We will explain the execute-payload-from-heap stub by following its control fl ow through each jump. We begin with the fi rst jump, when the vulnerable function in the target process uses its overwritten return address to return into the setjmp() subroutine.

Step 1: Return to setjmp()

The stub’s fi rst jump simulates a call to setjmp() with an address of writable memory somewhere in the target process address space. Again, since dyld is loaded at a known location, we will use an address of some writable memory in its data segment for our jmp_buf parameter. After setjmp() executes, it will pop its return address from our attack string, which is set to the address in our jmp_buf where the value of the EBP register is stored.

95363c07.indd 177

95363c07.indd 177 1/25/09 4:41:47 PM1/25/09 4:41:47 PM

Step 2: Return to jmp_buf[JB_EBP]

Most subroutine prologs save the caller’s frame pointer onto the stack. When a stack buffer overfl ows, it will overwrite the frame pointer before it overwrites the return address. This means that the value of the EBP register can be speci-fi ed in the attack string. When the vulnerable program returns from setjmp to jmp_buf[JB_EBP], it executes a four-byte fragment of chosen machine code, as shown here:

00000000 90 nop ; Change to int3 to debug 00000001 59 pop eax ; Adjust stack pointer 00000002 61 popa ; Restore all registers 00000003 C3 ret ; Return into next jump

This code fragment executes the popa instruction to restore all register values from the attack string on the stack. The popa instruction pops successive values from the stack into the EDI, ESI, and EBP registers, skips one for ESP, and then pops values into the EBX, EDX, ECX, and EAX registers. Before executing popa, the fragment executes a single pop instruction to adjust the stack pointer so that the second code fragment is loaded into the proper registers by the popa instruction. Finally, it executes a return instruction to execute the next jump, simulating a call to setjmp() again.

Step 3: Return to setjmp() Again

The second simulated call to setjmp() executes with more controlled registers due to the fact that the popa instruction loaded all of their values from the attack string. This call to setjmp() also requires an address of writable memory in the target address space, but there is no need for it to be different from the address we used in the fi rst call to setjmp(). Leopard’s setjmp implementation saves only the nonvolatile general-purpose registers (EBX, EDI, ESI, and EBP), of which EDI, ESI, and EBP are stored sequentially in the jmp_buf. The attack string fi lls those registers with machine code in order to execute a 12-byte frag-ment of chosen machine code.

Just as before, after setjmp() executes, it pops its return address from the attack string. This time the return address is set to the address of jmp_buf[JB_EDI] to execute a 12-byte fragment of chosen machine code.

Step 4: Return to jmp_buf[JB_EDI]

On an architecture like x86, where the instruction encoding is extremely space effi cient, 12 bytes of machine code is enough space to execute a few actions.

The second machine-code fragment loads a pointer to the payload in the attack

95363c07.indd 178

95363c07.indd 178 1/25/09 4:41:47 PM1/25/09 4:41:47 PM

Chapter 7 ^■ Exploiting Stack Overflows 179 string and stores it on the stack such that it would be used as the fi rst parameter to the next called subroutine. The value is written directly to the stack instead of pushing so that it does not overwrite the next return address. The assembly code for this 12-byte fragment is shown below.

00000000 90 nop ; Set to int3 to debug 00000001 58 pop eax ; Adjust stack pointer 00000002 89E0 mov eax,esp ; Load addr of payload 00000004 83C00C add eax,byte +0xc ; from attack string 00000007 89442408 mov [esp+0x8],eax ; as subr parameter 0000000B C3 ret ; Return to next jump

Step 5: Return to strdup()

The C standard library function strdup() takes a string pointer as an argument, copies the source string to a newly allocated heap buffer, and returns the newly allocated copy. In Leopard, unlike the memory used for the stack segment that is protected by hardware NX, the memory used for the heap segment is execut-able. The stub uses strdup() to copy an arbitrary payload from the attack string on the stack into heap memory where it may be freely executed.

Step 6: Return to EAX

After strdup() fi nishes executing, it pops its return address from the attack string. On the x86 architecture, the return value of a function is passed in the EAX register. Since the ultimate goal is to execute the payload now stored in the heap buffer that EAX points to, the stub needs to fi nd a way to transfer control to the memory that EAX points to. To do this, the stub returns to a register-indirect jump or call instruction at a known location in memory. Again, since dyld is always loaded at a known address, we can use one of these instructions from within it. Later in this chapter, in the section “Finding Useful Instruction Sequences,” we discuss how to fi nd these instruction sequences and how to choose a reliable one. By using the address of a register-indirect jump to EAX for the return address from strdup(), the stub fi nally transfers control into the actual exploit payload.

Step 7: Execute Payload

At this point the target process will begin executing the exploit payload from the heap. The stack pointer will point to the original attack string on the stack, which can be safely overwritten by the payload since it is executing from the heap seg-ment and does not need to be careful not to overwrite itself in memory.

95363c07.indd 179

95363c07.indd 179 1/25/09 4:41:47 PM1/25/09 4:41:47 PM

The Complete exec-payload-from-heap Stub

Finally, we will demonstrate the exec-payload-from-heap stub in a simple exploit. The exploit prints the attack string to its standard output, so it can be used against smashmystack.x86 with the following command.

% ./smashmystack.x86 `./exec-payload-from-heap.rb`

The exploit is a short Ruby script as shown below.

#!/usr/bin/env ruby

# Simple proof-of-concept exploit for smashmystack.x86

# using the exec-payload-from-heap technique.

Chapter 7 ^■ Exploiting Stack Overflows 181

# The actual payload to execute

# payload = “\xCC” * 4

# Create the stub

stub = make_exec_payload_from_heap_stub()

# The final attack string with stub and payload puts “A” * 1032 + stub + payload

Dans le document The Mac (Page 194-199)