• Aucun résultat trouvé

Shedding Light on objc_msgSend Calls

Dans le document The Mac (Page 163-168)

The IDC script helped demystify some of the calls to objc_msgSend, but in many cases it didn’t help, as in the example in Figure 6-7. In these cases, you still end up with a bunch of calls to objc_msgSend, where at fi rst glance, it is not obvious where they go. To make matters worse, due to this calling mechanism, you lose out on useful cross-reference information; see Figure 6-14. In this fi gure, only one cross-reference exists, and it is a data cross-reference (to the Obj-C struc-tures). This makes tracing code execution diffi cult. This is true even for calls that

95363c06.indd 145

95363c06.indd 145 1/25/09 4:41:27 PM1/25/09 4:41:27 PM

used fi xed offsets such that fi xobjc.idc made it easier to read; the cross-references are still broken. In this way, IDA Pro is reduced to a GUI for otool.

Figure 6-14: An Obj-C method typically has no CODE cross-references since it is called via a data structure by objc_msgSend.

Luckily, you can oftentimes fi x these defi ciencies; you just need to do some-thing a little more precise. On the surface, this seems like a pretty straightfor-ward problem to fi x because the information needed to resolve which function to call is passed as the fi rst and second arguments to objc_msgSend. However, in reality it is slightly more complicated. These arguments often are passed through many registers and stack values before ending up as an argument, which would require complicated slicing of these values through the code.

(Actually, Hotchkies and Portnoy have a script that tries to do exactly this, with limited success.) Instead of doing this analysis, you can utilize the ida-x86emu emulator for IDA Pro, written by Chris Eagle. This tool, from a given spot in the binary, emulates the x86 processor as it acts on emulated registers and an emu-lated stack and heap. In this way, the program’s fl ow can be analyzed without running the code. This plug-in was designed to help reverse-engineer malicious and other self-modifying code. However, the emulation is useful in this case because you can emulate entire functions and then whenever objc_msgSend is called you can fi nd the values that are used as arguments to the function. We do make one simplifi cation; the method presented here emulates each func-tion in isolafunc-tion—i.e., you do not emulate the funcfunc-tions called from within the analyzed function. For the most part this inexact analysis is suffi cient since you care only about arguments to this one function. This simplifi cation saves time and overhead, but has the drawback of being somewhat inaccurate. For example, if one of the arguments to objc_msgSend is passed as a parameter to a function, you will not be able to identify it. For most cases, though, this technique is suffi cient.

You want to go through each function, emulate it, and record the arguments to objc_msgSend. ida-x86emu is designed as a GUI to interact with IDA Pro. So

95363c06.indd 146

95363c06.indd 146 1/25/09 4:41:27 PM1/25/09 4:41:27 PM

Chapter 6 Reverse Engineering 147

you need to make some changes to it. For the code in its entirety, please consult www.wiley.com/go/machackershandbook. What follows are some of the most important changes that need to be made.

First you want to execute the code when ida-x86emu normally throws up its GUI window, so replace the call to CreateDialog with a call to your code. Then iterate through each function, and for each function emulate execution for all instructions within it. This code is shown here. Note that you will not necessar-ily go down every code path, so some calls to objc_msgSend may be missed.

void do_execute_single_function(unsigned int f_start, unsigned int f_end){

int iFuncCount = get_func_qty();

msg(“Functions to process: %d\n”, iFuncCount);

for(int iIndex = 0; iIndex < iFuncCount; iIndex++) {

do_execute_single_function(pFunc->startEA, pFunc->endEA);

} else {

So far you haven’t done anything except automate how the emulator works.

ida-x86emu has C++ code that emulates each (supported) instruction. The only change you need to make is how the CALL instruction is handled:

95363c06.indd 147

95363c06.indd 147 1/25/09 4:41:27 PM1/25/09 4:41:27 PM

get_func_name(cpu.eip + disp, buf, sizeof(buf));

if(!strcmp(buf, “objc_msgSend”)){

// Get name from ascii components

unsigned int func_name = readMem(esp + 4, SIZE_DWORD);

unsigned int class_name = readMem(esp, SIZE_DWORD);

get_ascii_contents(func_name, get_max_ascii_length(func_name, ASCSTR_C, false), ASCSTR_C, buf, sizeof(buf));

if(class_name == -1){

strcpy(bufclass, “Unknown”);

} else {

get_ascii_contents(class_name, get_max_ascii_length(class_name, ASCSTR_C, false), ASCSTR_C, bufclass, sizeof(bufclass));

}

strcpy(buf2, “[“);

strcat(buf2, bufclass);

strcat(buf2, “::”);

for ( bool ok=xb.first_to(func_name, XREF_ALL); ok; ok=xb.next_to() )

{

char buffer[64];

get_segm_name(xb.from, buffer, sizeof(buffer));

if(!strcmp(buffer, “__inst_meth”) || !strcmp(buffer,

Chapter 6 Reverse Engineering 149 This code runs only when the name of the function being called is objc_

msgSend. It then reads the values of the two arguments to the function stored on the stack and gets the strings at those addresses. In the case, when the code doesn’t have the class information (for example, if this were an argument to the function being emulated), it uses the string Unknown. It then builds a string that describes the function really being called and adds a comment to the IDA Pro database if it cannot determine the exact location of the function.

The way it tries to determine the function relies on the mechanics of the Obj-C runtime library. It starts at the ASCII string, which describes the function that needs to be called—for example, set_integer:. It looks at any cross-references to this string and tries to fi nd one in a section called either __inst_method or __cat_inst_method. If it fi nds one there, it knows that these particular structures are arranged such that the third dword points to the code for the function, as you saw earlier in this chapter. In particular, this data structure references the code. So the plug-in looks for any references to any code in the __text section.

If it fi nds one, it knows it has located the code associated with the string. When it can carry out these steps, it knows the address of the executable code that will eventually be called via objc_msgSend. In this case it can place appropri-ate references in the IDA Pro database. With the addition of these cross-references, when viewing the disassembly it is possible to view and navigate to the functions being called.

If this method of looking up the code associated with the string fails (for example, if the code were located in a different binary), then the ASCII string is placed as a comment next to the call to objc_msgSend. Finally, the program sets the function’s return value to be the name of the class being used, for future reference by the emulator.

To use this plug-in, make sure it is located in the plug-in directory of IDA Pro. Then, when the binary being disassembled is ready, press Alt+F8, the key sequence originally used to activate the ida-x86emu plug-in. This should add cross-references and comments to many of the calls to objc_msgSend; see Figure 6-15.

The cross-references also make backtracing calls much easier. Compare Figure 6-16 to Figure 6-14.

95363c06.indd 149

95363c06.indd 149 1/25/09 4:41:27 PM1/25/09 4:41:27 PM

Figure 6-15: Calls to objc_msgSend are either commented with their destination or have cross-references added.

Figure 6-16: This function now has three code cross-references listed as to where it is called.

Dans le document The Mac (Page 163-168)