Introduction
Every now and then I encounter malware that steals or manipulates data from other software running in a user's environment. "Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software" has a couple of lines about a topic called "hooking". The book discusses two techniques that are often used in Windows user-mode rootkits, namely IAT hooking and inline hooking. Recently I found an article discussing software interposition, which is another word for hooking software functions, and found it quite interesting to read that on Linux one can pretty easily intercept symbols that other software depends on. There are various ways to perform these kinds of hooks, and in this blog post I will discuss hooking in a Linux environment rather than a Windows one. While these techniques are often implemented in malware, they can also be used to debug software, gather statistics, look for memory leaks, and support malware analysis.
Shared Library
Software developers rely on dynamically linked libraries, which are sets of functions you can include in your own code. On a typical glibc-based Linux system the C standard library lives in a shared object called libc.so.6. Functions like printf(), rand() and so on are defined there for our pleasure to use, and are shared among the different programs that rely on them. When source code is built, the dependencies between your program and the library are set up through a process called linking. The linker takes the object files created by the compiler and combines them into a single executable, and is thus responsible for placing the references to the wanted symbols. These references go into certain sections inside the binary, which are specified by the operating system and the file format (in this case ELF). At runtime each symbol is resolved by the first loaded library that provides it.
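To see which shared objects a running process has actually loaded, glibc offers dl_iterate_phdr(). The following is only a minimal sketch and assumes a glibc-based Linux system; the helper names print_object and list_loaded_objects are made up for illustration.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object. The name is empty for the
 * main program itself. Returning 0 tells the loader to keep iterating. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data) {
    int *count = data;
    (void)size; /* unused */

    printf("%2d: %s\n", *count,
           (info->dlpi_name && info->dlpi_name[0]) ? info->dlpi_name
                                                   : "(main program)");
    (*count)++;
    return 0;
}

/* Hypothetical helper: prints every loaded object and returns how many
 * objects were reported. */
int list_loaded_objects(void) {
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Calling list_loaded_objects() from a program's main() and running it once normally and once with LD_PRELOAD set should show the preloaded library as an extra entry.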
When compiling a shared library it is important to use the -shared and -fPIC compile options, since shared libraries should be position-independent code (PIC). With PIC the library can be mapped at any memory address without being modified, so each executable can map it into its own address space and call into it properly, regardless of the address the library ends up at.
Compiling and linking are often mentioned as one and the same thing because both happen at build time, but they are quite different steps.
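As a small illustration of those options, here is a sketch of a one-function library. The file names square.c and main.c, the function square() and the library name libsquare.so are all made up for this example.

```c
/*
 * square.c: a tiny library with a single exported function.
 *
 * Build it as a position-independent shared object:
 *   gcc -shared -fPIC -o libsquare.so square.c
 * Link a program against it and run it (main.c is assumed to call square()):
 *   gcc main.c -L. -lsquare -o main
 *   LD_LIBRARY_PATH=. ./main
 */
int square(int x) {
    return x * x;
}
```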
For a better understanding of libraries, static libraries and shared libraries, see "Dynamic Linking in Linux and Windows".
Function interposition
Function interposition, often referred to as hooking, is a technique to intercept dynamic library functions by writing a wrapper library that takes the place of the symbol the program actually wanted. It can be used by developers to debug their programs, collect all sorts of runtime statistics and so on. It is very often implemented in malicious software, and it can be used to analyse malicious software in a different manner than I am used to. As mentioned above, a symbol is resolved by the first library that provides it, so if we can get our wrapper in between the software and the standard C library, we should be able to intercept the symbols used by a program. So, how does one perform this kind of magic?
First off, Linux has the LD_PRELOAD environment variable, which could very easily be described as "dynamic linker preload". It tells the dynamic loader to load the libraries listed in the variable before all others, including the standard libraries. The variable is documented in the manual for the dynamic linker (man ld.so). At this point my security alarm went off, since Linux allows this so easily, but the environment variable has some limitations. For example, we cannot freely inject into programs that run with elevated privileges (setuid or setgid bits set): in that secure-execution mode the loader ignores preload paths containing slashes and only preloads libraries from the standard search directories that themselves have the setuid bit set.
With all this in mind I could easily write a simple library that replaces the rand() function, which is an easy first choice (examples are to be found HERE).
The function rand() is declared in stdlib.h as "int rand(void);". The replacement could look something like this:
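Whether a process is subject to these setuid/setgid restrictions can be checked from inside the process. The sketch below is only an illustration and assumes glibc on Linux, where getauxval(AT_SECURE) reports the secure-execution flag; the helper names in_secure_mode and report_preload are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Returns 1 if the kernel flagged this process for secure execution
 * (for example a setuid or setgid binary), otherwise 0. */
int in_secure_mode(void) {
    return getauxval(AT_SECURE) != 0;
}

/* Prints the current value of LD_PRELOAD and whether the process is
 * running in secure-execution mode. */
void report_preload(void) {
    const char *preload = getenv("LD_PRELOAD");

    printf("LD_PRELOAD : %s\n", preload ? preload : "(not set)");
    printf("secure mode: %s\n", in_secure_mode() ? "yes" : "no");
}
```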
#include <stdlib.h>

/*
 * Compile:
 *   gcc -shared -fPIC -o rand.so 02-rand_replace.c
 * Run:
 *   LD_PRELOAD=$PWD/rand.so ./randy
 */
int rand(void) {
    return 1; // fixed rand
}
But this isn't very sophisticated at all, as we have no way to get back the real value of rand(). This could clearly cause errors at runtime if the software relies on the real behaviour of a hooked symbol, as it most of the time does ;)
C Dynamic Linking API
Fortunately there is a C API for dynamic linking (declared in dlfcn.h), which makes it possible for us to work with dynamic libraries at runtime. By reading the documentation I found out that by creating a wrapper and using the dlsym() function I could fetch the address of the original function. The documentation states the following:
"The RTLD_NEXT flag is useful to navigate an intentionally created hierarchy of multiply-defined symbols created through interposition. For example, if a program wished to create an implementation of malloc() that embedded some statistics gathering about memory allocations, such an implementation could use the real malloc() definition to perform the memory allocation-and itself only embed the necessary logic to implement the statistics gathering function."
So by using dlsym() with the RTLD_NEXT flag and the name of the function to hook, we can perform our function interposition, do whatever we want with the data, and return the original result if that's desirable. In the new rand() function I declare a function pointer, old_rand, to hold the address of the rand() that would otherwise have been called, and assign the result of dlsym() to it. Because I use the RTLD_NEXT flag I need to either compile with -D_GNU_SOURCE or define _GNU_SOURCE inside the source code before the includes, as in the code below.
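#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>

/*
 * Compile:
 *   gcc 03-hooking.c -o hook.so -shared -fPIC -ldl
 * Run:
 *   LD_PRELOAD=$PWD/hook.so ./randy
 */
int rand(void) {
    int (*old_rand)(void);
    int res;

    // Look up the next definition of rand() after this library (normally libc's)
    old_rand = (int (*)(void))dlsym(RTLD_NEXT, "rand");
    if (old_rand == NULL) {
        fprintf(stderr, "hook: could not find the real rand()\n");
        abort();
    }

    printf("Hooked on rand()\nBad boying starts here...\n\n\n");

    res = old_rand(); // call the real rand() and pass its result through
    return res;
}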
With the library above I can easily interpose on rand() and perform whatever actions I want at the point where the "badboying" starts. I decided to call it badboying because I was relating this to earlier malware analysis, but one could perform all sorts of actions in between these lines of code.
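As one harmless example of such an action, the hook can simply count how often rand() is called. This is a sketch under a few assumptions: the file name count-rand.c and the names real_rand, rand_calls and report_rand_calls are invented here, the destructor attribute is a GCC/Clang extension, and the counter is not thread-safe.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Compile (hypothetical file name):
 *   gcc count-rand.c -o count.so -shared -fPIC -ldl
 * Run:
 *   LD_PRELOAD=$PWD/count.so ./randy
 */
static int (*real_rand)(void);    /* cached address of the next rand() */
static unsigned long rand_calls;  /* number of intercepted calls */

int rand(void) {
    if (real_rand == NULL) {
        /* Resolve the original only once instead of on every call. */
        real_rand = (int (*)(void))dlsym(RTLD_NEXT, "rand");
        if (real_rand == NULL) {
            fprintf(stderr, "count: could not find the real rand()\n");
            abort();
        }
    }
    rand_calls++;
    return real_rand();
}

/* Runs when the library is unloaded or the process exits normally. */
__attribute__((destructor))
static void report_rand_calls(void) {
    fprintf(stderr, "rand() was called %lu times\n", rand_calls);
}
```

Caching the pointer in a static variable avoids a dlsym() lookup on every call, which matters if the hooked function is called often.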
Okay, how fun was that? Well, the examples were not that fun, but this technique seems quite powerful, and my immediate thought was to perform these kinds of actions on a malware sample, although I would believe it is possible to detect this kind of hooking and avoid it completely. Sounds like a new blog topic.
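One rough idea of how such detection might look, as a sketch rather than a robust check: ask the dynamic linker which shared object currently provides a symbol and see whether it is the expected C library. The helper name symbol_provider is hypothetical, and dladdr() is a GNU extension, so this assumes glibc.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

/* Returns the path of the object that provides the first definition of
 * `name` in the global search order, or NULL if it cannot be determined. */
const char *symbol_provider(const char *name) {
    Dl_info info;
    void *addr = dlsym(RTLD_DEFAULT, name);

    if (addr == NULL || dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}
```

With a preloaded hook in place, symbol_provider("rand") would be expected to return the path of the hook library instead of the C library. A hook could of course interpose dlsym() or dladdr() as well, so this is far from bulletproof.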
References
http://www.symantec.com/connect/articles/dynamic-linking-linux-and-windows-part-one
http://www.yolinux.com/TUTORIALS/LibraryArchives-StaticAndDynamic.html
http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
http://www.jayconrod.com/posts/23/tutorial-function-interposition-in-linux
http://tldp.org/HOWTO/Program-Library-HOWTO/dl-libraries.html