Malware Tutorial Analysis 23: Tracing Kernel Data Using Data Breakpoints

Learning Goals:

  1. Use WinDbg for kernel debugging
  2. Apply data tracing and hardware data breakpoints to analyze data flow
  3. Understand how rootkits set up and hide a driver module

Applicable to:

  1. Operating Systems
  2. Assembly Language
  3. Operating System Security

1. Introduction
This tutorial continues the analysis presented in Tutorial 20. We show how Max++ performs another round of driver infection, and how it sets up and hides an infected driver. We also study how to use hardware data breakpoints to trace the use of data and kernel data structures. Our analysis starts from _+37AF.

2. Lab Configuration
In general we follow the instructions in Section 2 of Tutorial 20. Here we only remind you of several important configuration steps:
(1) You need a separate image named “Win_Notes” to record and comment the code. You don’t need to run the malware on this instance; it is used only to record your observations in the .udd file. To do this, you have to modify the control flow of IMM so that it does not crash on .sys files. See Section 2 of Tutorial 20 for details. Jump to 0x100037AF to start the analysis.
(2) The second image, “Win_DEBUG”, has to be run in DEBUG mode, with a WinDbg instance attached from the host system over the COM port. In other words, we are doing kernel debugging here.
(3) Set a breakpoint “bu _+37af” in WinDbg to intercept the driver entry function.
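The two images show the same code at different addresses: IMM uses 0x100037AF where WinDbg uses _+37af. The following Python sketch converts between the two views. The helper names are hypothetical, and the kernel-side base 0xFAEAC000 is an assumption derived from the stack trace in Section 3 (where faeaefea is reported as _+0x2fea); it will likely differ on your machine.

```python
# Hypothetical helpers for translating addresses between the two lab images.
IMM_BASE = 0x10000000   # assumption: base seen in the Win_Notes (IMM) image
KD_BASE = 0xFAEAC000    # assumption: faeaefea - 0x2fea, from the kp output in Section 3


def imm_to_kd(addr, imm_base=IMM_BASE, kd_base=KD_BASE):
    """Map an address from the IMM notes image to the live kernel image."""
    return addr - imm_base + kd_base


def kd_to_offset(addr, kd_base=KD_BASE):
    """Return the module-relative offset that WinDbg prints as _+offset."""
    return addr - kd_base


if __name__ == "__main__":
    print(hex(imm_to_kd(0x100037AF)))     # 0xfaeaf7af
    print(hex(kd_to_offset(0xFAEADA32)))  # 0x1a32
```

Recompute KD_BASE from any frame of your own session before relying on the results.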

3. Data Breakpoints and Tracing File Name
We now continue the analysis after Tutorial 21, beginning at _+37AF. Figure 1 shows the first few instructions. As the figure shows, the first section of the code builds a collection of names.

Figure 1. Copy and Manipulate Strings

At 0x100037BF, it copies the string “\??\C2CAD…\Snifer67” to the area pointed to by EDI. A data dump in WinDbg yields the following. Clearly, the EDI value (the starting address of the string) is 0xFAFAF9F8 (which is ESP+34 at this moment).

kd> db fafaf9f8
fafaf9f8  5c 00 3f 00 3f 00 5c 00-43 00 32 00 43 00 41 00  \.?.?.\.C.2.C.A.
fafafa08  44 00 39 00 37 00 32 00-23 00 34 00 30 00 37 00  D.9.7.2.#.4.0.7.
fafafa18  39 00 23 00 34 00 66 00-64 00 33 00 23 00 41 00  9.#.4.f.d.3.#.A.
fafafa28  36 00 38 00 44 00 23 00-41 00 44 00 33 00 34 00  6.8.D.#.A.D.3.4.
fafafa38  43 00 43 00 31 00 32 00-31 00 30 00 37 00 34 00  C.C.1.2.1.0.7.4.
fafafa48  5c 00 4c 00 5c 00 53 00-6e 00 69 00 66 00 65 00  \.L.\.S.n.i.f.e.
fafafa58  72 00 36 00 37 00 00 00-14 fb 57 80 00 f3 c4 e1  r.6.7.....W.....
fafafa68  00 52 2e 81 00 20 2f 81-00 10 00 00 d8 fa 57 80  .R... /.......W.
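Reading a wide-character string out of a db dump by eye is error-prone. The following Python sketch parses the hex columns of the dump above and decodes them as UTF-16LE up to the first null terminator. The function names are hypothetical; only the hex bytes are taken from the dump.

```python
import re

# Hex columns copied from the db output above (the ASCII column is optional).
DUMP = """
fafaf9f8  5c 00 3f 00 3f 00 5c 00-43 00 32 00 43 00 41 00
fafafa08  44 00 39 00 37 00 32 00-23 00 34 00 30 00 37 00
fafafa18  39 00 23 00 34 00 66 00-64 00 33 00 23 00 41 00
fafafa28  36 00 38 00 44 00 23 00-41 00 44 00 33 00 34 00
fafafa38  43 00 43 00 31 00 32 00-31 00 30 00 37 00 34 00
fafafa48  5c 00 4c 00 5c 00 53 00-6e 00 69 00 66 00 65 00
fafafa58  72 00 36 00 37 00 00 00-14 fb 57 80 00 f3 c4 e1
"""

# One db line: 8-digit address, then 16 bytes separated by a space or '-'.
LINE_RE = re.compile(
    r"^\s*[0-9a-fA-F]{8}\s+((?:[0-9a-fA-F]{2}[ -]){15}[0-9a-fA-F]{2})"
)


def dump_to_bytes(text):
    """Collect the raw bytes from every well-formed db line."""
    out = bytearray()
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            out += bytes.fromhex(match.group(1).replace("-", " "))
    return bytes(out)


def read_wide_string(raw):
    """Decode UTF-16LE text up to the first 16-bit null terminator."""
    for i in range(0, len(raw) - 1, 2):
        if raw[i] == 0 and raw[i + 1] == 0:
            return raw[:i].decode("utf-16-le")
    return raw.decode("utf-16-le", errors="replace")


if __name__ == "__main__":
    print(read_wide_string(dump_to_bytes(DUMP)))
```

The same helper can be pointed at the second dump in this section to recover the driver path.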

Similarly, you can infer that the second string, generated by the swprintf at 0x100037DB (in Figure 1), is “\systemroot\system32\drivers\raspppoe” (this is the name of the randomly picked driver). The name could change in every run.

The challenge is that, looking at the notes window alone, we cannot infer where these two strings are used. We have to use WinDbg data breakpoints to figure out where these file/service names are used.

Let’s take the second string as an example. By analyzing the input parameters of swprintf (the second highlighted area in Figure 1), we know that the second string is located at 0xFAFB7A78, as shown in the following dump. (In the run captured here, the randomly picked driver happens to be kbdclass.sys, so the dump reads “\systemroot\system32\drivers\kbdclass.sys”.) We can then set a data breakpoint on it: ba r4 fafb7a78, which watches for any read or write access to the 4 bytes starting at fafb7a78.

kd> db fafb7a78
fafb7a78  5c 00 73 00 79 00 73 00-74 00 65 00 6d 00 72 00  \.s.y.s.t.e.m.r.
fafb7a88  6f 00 6f 00 74 00 5c 00-73 00 79 00 73 00 74 00  o.o.t.\.s.y.s.t.
fafb7a98  65 00 6d 00 33 00 32 00-5c 00 64 00 72 00 69 00  e.m.3.2.\.d.r.i.
fafb7aa8  76 00 65 00 72 00 73 00-5c 00 6b 00 62 00 64 00  v.e.r.s.\.k.b.d.
fafb7ab8  63 00 6c 00 61 00 73 00-73 00 2e 00 73 00 79 00  c.l.a.s.s...s.y.
fafb7ac8  73 00 00 00 77 7a 56 80-10 0d 00 e1 c4 06 00 00  s...wzV.........
fafb7ad8  a8 7b fb fa 10 0d 00 e1-01 00 00 00 c4 06 00 00  .{..............
fafb7ae8  00 00 00 00 20 0d 00 e1-88 2d 00 e1 f9 ba 13 81  .... ....-......
kd> ba r4 fafb7a78
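On x86, a data breakpoint like this one is backed by the processor's debug registers: one of DR0 to DR3 holds the watched address, and DR7 holds the enable bit, the access type, and the length. The following Python sketch computes the DR7 bits for a given slot. It is only an illustration of the hardware encoding; the function name is hypothetical, and the assumption that the local enable bit is the one being set is ours, not something the WinDbg output confirms.

```python
# x86 DR7 field encodings (access type and length of the watched range).
RW_BITS = {"e": 0b00, "w": 0b01, "r": 0b11}   # execute, write, read-or-write
LEN_BITS = {1: 0b00, 2: 0b01, 4: 0b11}        # watched length in bytes


def dr7_bits(slot, access, size, address):
    """Return the DR7 bits describing one hardware breakpoint.

    slot    -- debug address register index, 0..3 (DR0..DR3)
    access  -- 'e', 'w' or 'r', as in the WinDbg ba command
    size    -- 1, 2 or 4 bytes
    address -- watched address; must be aligned to size
    """
    if slot not in range(4):
        raise ValueError("x86 provides only four debug address registers")
    if access == "e" and size != 1:
        raise ValueError("execute breakpoints must use size 1")
    if address % size != 0:
        raise ValueError("address must be aligned to the breakpoint size")
    enable = 1 << (slot * 2)                   # assumption: local enable bit
    rw = RW_BITS[access] << (16 + slot * 4)
    length = LEN_BITS[size] << (18 + slot * 4)
    return enable | rw | length


if __name__ == "__main__":
    # Equivalent of "ba r4 fafb7a78" placed in slot 0.
    print(hex(dr7_bits(0, "r", 4, 0xFAFB7A78)))  # 0xf0001
```

Two practical consequences follow from the encoding: only four such breakpoints can be active at once, and the watched address must be aligned to the size, which 0xFAFB7A78 is for a 4-byte watch.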

Now resume execution (g). The breakpoint is hit at nt!RtlInitUnicodeString+0x1b. If you run kp at this point (to show the call stack), you might not get the right sequence of frames, as shown in the following.

kd> g
Sun Mar 25 20:26:39.359 2012 (UTC – 4:00): Breakpoint 1 hit
804d92c2 66f2af          repne scas word ptr es:[edi]

kd> Kp
ChildEBP RetAddr 
fafb7970 faeaefea nt!RtlInitUnicodeString+0x1b
WARNING: Stack unwind information not available. Following frames may be wrong.
fafb79b4 faeaf808 _+0x2fea
fafb7c7c 805a399d _+0x3808
fafb7d4c 805a3c73 nt!IopLoadDriver+0x66d
fafb7d74 804e426b nt!IopLoadUnloadDriver+0x45
fafb7dac 8057aeff nt!ExpWorkerThread+0x100
fafb7ddc 804f88ea nt!PspSystemThreadStartup+0x34
00000000 00000000 nt!KiThreadStartup+0x16

In this case, we want to step out of RtlInitUnicodeString. The Step Out command (Shift+F11) does not work here, because Max++ does not follow the standard C calling conventions. We have to press F10 (step over) patiently. After around 10 steps, we reach _+1a32, as shown below.

kd> p
804d92df 5f              pop     edi
kd> p
804d92e0 c20800          ret     8
kd> p
faeada32 33c0            xor     eax,eax

_+1a32 is part of a function in Max++ that is responsible for constructing an instance of _OBJECT_ATTRIBUTES (where “\systemroot\system32\drivers\raspppoe” serves as the ObjectName).
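To make the two structures concrete, here is a Python sketch that models the 32-bit layouts of UNICODE_STRING and OBJECT_ATTRIBUTES with ctypes and mimics what RtlInitUnicodeString computes. Scanning the buffer for the terminating null is why the data breakpoint fired on the repne scas instruction. The helper names, the address 0xFAFB7990 used for the UNICODE_STRING, and the Attributes value 0x40 (OBJ_CASE_INSENSITIVE) are assumptions for illustration; the figure, not this sketch, shows what Max++ actually passes.

```python
import ctypes


class UNICODE_STRING32(ctypes.Structure):
    """32-bit layout; the pointer is modeled as a plain 32-bit integer."""
    _fields_ = [
        ("Length", ctypes.c_uint16),         # bytes, excluding the terminator
        ("MaximumLength", ctypes.c_uint16),  # bytes, including the terminator
        ("Buffer", ctypes.c_uint32),
    ]


class OBJECT_ATTRIBUTES32(ctypes.Structure):
    _fields_ = [
        ("Length", ctypes.c_uint32),
        ("RootDirectory", ctypes.c_uint32),
        ("ObjectName", ctypes.c_uint32),     # pointer to a UNICODE_STRING
        ("Attributes", ctypes.c_uint32),
        ("SecurityDescriptor", ctypes.c_uint32),
        ("SecurityQualityOfService", ctypes.c_uint32),
    ]


def init_unicode_string(buffer_addr, text):
    """Mimic RtlInitUnicodeString: measure the string, record the buffer."""
    raw = text.encode("utf-16-le")
    us = UNICODE_STRING32()
    us.Length = len(raw)
    us.MaximumLength = len(raw) + 2
    us.Buffer = buffer_addr
    return us


def init_object_attributes(name_addr, attributes):
    """Fill an OBJECT_ATTRIBUTES the way InitializeObjectAttributes would."""
    oa = OBJECT_ATTRIBUTES32()
    oa.Length = ctypes.sizeof(OBJECT_ATTRIBUTES32)
    oa.ObjectName = name_addr
    oa.Attributes = attributes
    return oa


if __name__ == "__main__":
    # Path as it appears in the dump of 0xFAFB7A78 in this section.
    name = init_unicode_string(
        0xFAFB7A78, r"\systemroot\system32\drivers\kbdclass.sys")
    oa = init_object_attributes(0xFAFB7990, 0x40)  # both values hypothetical
    print(name.Length, name.MaximumLength, hex(oa.Length))
```

In WinDbg, the same fields can be inspected with dt on the structure address once you have located it on the stack.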

Figure 2. The Function Which Calls RtlInitUnicodeString

Tracing again from _+1a32, we find that the program flow jumps to _+23e9 (which reads the contents of the driver file and puts them in a collection of locked virtual pages).

Challenge 1. Finish the above analysis and provide a detailed report on how the “\systemroot\system32\drivers\raspppoe” string is used.

4. Virtual Pages
We continue the analysis. At _+3803, Max++ calls another function located at _+23C8, which reads the contents of a file and puts them in virtual pages. There are some interesting technical details here. Figure 3 shows the first part of its body. In the first highlighted area, it constructs an instance of _OBJECT_ATTRIBUTES that carries the file name “\systemroot\system32\drivers\raspppoe”, as discussed in Section 3 (how to trace the use of data). Then Max++ opens the file and queries its standard file information. When all operations succeed, it proceeds to the creation of virtual pages.

Figure 3. First Part of _+23C8

We continue to the second part of function _+23C8 (as shown in Figure 4). In driver code, you often have to lock the physical pages behind your virtual addresses so that the contents in RAM are not paged out to disk by the OS. The intention of this part of the code is clear: it first requests virtual pages (see the first highlighted area), and the descriptor of these pages is saved in a data structure of type _MDL (stored at 8121c970). Once that succeeds, it asks the system to map the locked pages (see MmMapLockedPagesSpecifyCache). Then Max++ reads the infected driver file into these pages (starting at address 0xf7649000). If you dump the data starting at 0xf7649000, you will find that it really is a binary executable: the first two bytes are 4D 5A (“MZ”), the magic number of the DOS header, which dd displays as the little-endian dword 00905a4d.

kd> dd f7649000
f7649000  00905a4d 00000003 00000004 0000ffff
f7649010  000000b8 00000000 00000040 00000000
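Because dd prints little-endian dwords, the “MZ” signature appears reversed as 00905a4d. The following Python sketch converts the first dd line above back into bytes and checks the DOS magic. The function name is hypothetical.

```python
import struct

# First line of the dd output above.
DD_LINE = "f7649000  00905a4d 00000003 00000004 0000ffff"


def dd_line_to_bytes(line):
    """Return (address, raw bytes) for one line of WinDbg dd output."""
    addr_text, *dwords = line.split()
    raw = b"".join(struct.pack("<I", int(d, 16)) for d in dwords)
    return int(addr_text, 16), raw


if __name__ == "__main__":
    address, raw = dd_line_to_bytes(DD_LINE)
    print(hex(address), raw[:2])  # 0xf7649000 b'MZ'
```

Using db instead of dd on the same address shows the bytes in memory order directly.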

Figure 4. Second Part of _+23C8

Now comes the interesting part (see the last highlighted area of Figure 4). Once the file contents of the infected driver are read, Max++ immediately releases the physical pages (for virtual address 0xf7649000). This is counter-intuitive: wouldn’t Max++ want to use this data later? It’s your job to figure it out.

Challenge 3. Use the same data tracing trick and set two data breakpoints: one for the _MDL (in our case, 0x8121c970) and one for the starting address of the infected driver executable data (in our case, 0xf7649000). Try to figure out whether these pages of malicious binary executable are really used or not. In summary, you have to answer the question: why does Max++ release the pages in Figure 4?

Challenge 4. Analyze the function of _+22C3.

5. Infection of Driver Again and the Use of Virtual Pages
At _+3889, Max++ calls function _+2D9F. We now analyze this function (as shown in Figure 5). It is used to infect a driver file (the file name is given as the first parameter in its stack frame). The function first creates a section object on the file, then copies memory from the pages described by an MDL into the file, and finally flushes the contents back to the file.
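If the section-object pattern is unfamiliar, a user-mode analogy may help when reading Figure 5: a section object plays roughly the role of a memory-mapped file, so writing into the mapped view and flushing it changes the file on disk. The Python sketch below shows that pattern with the standard mmap module on a scratch file. It is an analogy only; the function name is hypothetical and nothing here reproduces the kernel calls in the figure.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, data, offset=0):
    """Map a file, copy bytes into the view, and flush them to disk."""
    with open(path, "r+b") as handle:
        with mmap.mmap(handle.fileno(), 0) as view:  # 0 maps the whole file
            view[offset:offset + len(data)] = data    # the "memcpy" step
            view.flush()                              # the "flush" step


if __name__ == "__main__":
    fd, scratch = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * 16)
        os.close(fd)
        write_through_mapping(scratch, b"BBBB", offset=4)
        with open(scratch, "rb") as handle:
            print(handle.read())
    finally:
        os.remove(scratch)
```

When tracing the real function, look for the same three phases: creating the mapping, the copy, and the flush.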

Figure 5. Infect Driver fips.sys

Challenge 5. Use the data tracing technique to determine where the malicious file content comes from.

6. Final Set Up of Malicious Disk Driver
In Tutorial 22, we showed how a malicious disk driver is used to simulate file requests on “\??\C2CAD…” using a file called “12345678.sav”. In the following, we show how this driver is configured by copying attributes from the real disk driver.

Figure 6. Wiring and Copying of Driver Object

The first part (the first highlighted area in Figure 6) adjusts the DriverSection field of the infected driver object. It is a basic linked-list operation that removes the infected driver from the list of loaded modules. Notice that the DriverSection field (offset 0x14) points to an _LDR_DATA_TABLE_ENTRY. You can use WinDbg to verify this.
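The unlink operation can be sketched in a few lines of Python. The model below is a circular doubly linked list with Flink/Blink pointers, which is how the entries in the loaded module list are chained together. The class, the helper names, and the module names are hypothetical, and the sketch shows only the standard unlink; whether Max++ adjusts any other list is not something Figure 6 lets us claim here.

```python
class ListEntry:
    """Minimal stand-in for a LIST_ENTRY embedded in a module record."""

    def __init__(self, name):
        self.name = name
        self.flink = self   # forward link
        self.blink = self   # backward link


def insert_tail(head, entry):
    """Append entry just before the list head (circular list)."""
    entry.flink = head
    entry.blink = head.blink
    head.blink.flink = entry
    head.blink = entry


def unlink(entry):
    """Bypass entry; its own links are left untouched."""
    entry.blink.flink = entry.flink
    entry.flink.blink = entry.blink


def walk(head):
    """Enumerate the list the way a module lister would."""
    names = []
    current = head.flink
    while current is not head:
        names.append(current.name)
        current = current.flink
    return names


if __name__ == "__main__":
    head = ListEntry("<list head>")
    modules = [ListEntry(n) for n in ("first.sys", "infected.sys", "last.sys")]
    for module in modules:
        insert_tail(head, module)
    unlink(modules[1])
    print(walk(head))  # ['first.sys', 'last.sys']
```

After the unlink, any tool that enumerates modules by walking the list no longer sees the entry, even though the driver's code is still in memory.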

Next, in the second highlighted area of Figure 6, Max++ copies all the attributes from the original \Driver\Disk object to the infected driver (in the comments of this run it is .serial; the name could change between sessions). Only one attribute of the infected driver remains its own: the major function _+2bDE. At this point, Max++ has set up the infected disk driver and has hidden it from the loaded module list.