Sometimes it is necessary to alter the functionality of a routine, or to collect some data from a malfunctioning routine, for debugging. This is straightforward as long as we have access to the source code. But what if we don't have access to the source code, or changing the source code is not feasible? With dynamic libraries, it is very easy to intercept any call to a routine of choice and do whatever we wish in that routine, including calling the real routine the client intended to call.
In simple words, the hacker (who writes the interposing library, in this context) writes a new library with the exact interfaces of the routines that (s)he wishes to intercept, and preloads the new library before starting up the application. It works well as long as the targeted interfaces are not protected. On Solaris, with the linker's -Bsymbolic option or the Sun Studio compiler's -xldscope=symbolic option, all symbols of a library can be made non-interposable (those symbols are called protected symbols, since no one else can interpose on them).
If the targeted routine is interposable, the dynamic linker simply passes control to the first symbol it encounters that matches the function call (callee). With the preloaded library in force, the hacker gets control over the routine. At this point, it is up to the hacker whether to pass control to the actual routine that the client intended to call. If the intention is just to collect data and let go, the interposer collects the required data and then passes control to the actual routine with the help of libdl routines. Note that control has to be passed explicitly to the actual routine; as far as the dynamic linker is concerned, its job is done once it passes control to the function (the interposer, in this case). If the idea is to completely change the behavior of the routine (writing a new routine with the new behavior is easy, but the library and its clients would have to be rebuilt to make use of it), the new implementation becomes part of the interposing routine and control is never passed to the actual routine. In the worst case, a malicious hacker can intercept data that is supposed to be confidential (e.g., passwords, account numbers) and do further harm at will.
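As a rough sketch of the second case described above (replacing a routine outright), the interposer below makes rand() return a fixed value, which can help reproduce a failing run. The choice of rand() and the constant 42 are illustrative assumptions and not part of the original example.

```c
#include <stdlib.h>

/* Hypothetical interposer: replaces rand() outright and never
 * forwards the call to the implementation in libc. */
int rand(void) {
    return 42;   /* fixed value, chosen only for illustration */
}
```

Because the real routine is never needed here, the interposer does not have to look it up through libdl at all.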
[Off-topic] To guard against such attacks, it is recommended to make most symbols local in scope, with the help of linker-supported map files or the compiler-supported linker scoping mechanism. Read http://developers.sun.com/tools/cc/articles/symbol_scope.html to learn more about linker scoping.
The technique described above is commonly referred to as library interposition; as we can see, it is quite useful for debugging, collecting run-time data, and performance tuning of an application.
It would be more interesting to see an interceptor in action. So, let's build a very small library with only one routine, fopen(). The idea is to collect the number of calls to fopen() and to find out which files are being opened. Our interceptor simply prints a message on the console, with the name of the file to be opened, every time the application calls fopen(). Then it passes control to the fopen() routine of libc. For this, we first need the signature of fopen(). fopen() is declared in stdio.h as follows:
FILE *fopen(const char *filename, const char *mode);
Here is the source code for the interposer:
% cat interceptfopen.c
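#include <stdio.h>
#include <dlfcn.h>

/* Interposer: must have exactly the same signature as fopen() in libc. */
FILE *fopen(const char *filename, const char *mode) {
    /* Pointer to the real fopen(); resolved once, on the first call. */
    static FILE *(*actualfunction)(const char *, const char *);

    if (!actualfunction) {
        actualfunction = (FILE *(*)(const char *, const char *)) dlsym(RTLD_NEXT, "fopen");
    }

    /* Adjacent string literals are concatenated; the message spans two output lines. */
    printf("\nfopen() has been called. file name = %s, mode = %s \n"
           "Forwarding the control to fopen() of libc", filename, mode);

    /* Hand the call over to the real fopen() and return its result. */
    return actualfunction(filename, mode);
}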
% cc -G -o libfopenhack.so interceptfopen.c
% ls -lh libfopenhack.so
-rwxrwxr-x 1 build engr 3.7K May 19 19:02 libfopenhack.so*
actualfunction is a function pointer to the actual fopen() routine, which is in libc. dlsym is part of libdl, and the RTLD_NEXT argument directs the dynamic linker (ld.so.1) to find the next reference to the specified function, using the normal dynamic linker search sequence.
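The lookup described above can fail, for example when no later object in the search order defines the symbol. The following is a minimal sketch of a lookup with error reporting. The helper name lookup_real_fopen and the typedef fopen_fn are hypothetical, and the _GNU_SOURCE define is an assumption that applies only to glibc-based systems, where RTLD_NEXT is hidden without it; Solaris does not need it.

```c
#define _GNU_SOURCE   /* assumption: exposes RTLD_NEXT on glibc; not needed on Solaris */
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical typedef for a pointer to a function shaped like fopen(). */
typedef FILE *(*fopen_fn)(const char *, const char *);

/* Hypothetical helper: returns the next definition of fopen() in the
 * dynamic linker's search order, or NULL if none is found. */
fopen_fn lookup_real_fopen(void) {
    static fopen_fn real;   /* cached after the first successful lookup */

    if (real == NULL) {
        void *sym;

        (void) dlerror();   /* clear any stale error state */
        sym = dlsym(RTLD_NEXT, "fopen");
        if (sym == NULL) {
            const char *err = dlerror();
            fprintf(stderr, "dlsym(RTLD_NEXT, fopen) failed: %s\n",
                    err != NULL ? err : "unknown error");
            return NULL;
        }
        real = (fopen_fn) sym;
    }
    return real;
}
```

An interposer built on such a helper could return NULL from its own fopen() when the lookup fails, instead of calling through a null function pointer.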
Let's proceed to write a simple C program that writes a string to a file and then reads it back.
% cat fopenclient.c
#include <stdio.h>

int main () {
    FILE * pFile;
    char string[30];

    /* Write a string to the file. */
    pFile = fopen ("myfile.txt", "w");
    if (pFile != NULL) {
        fputs ("Some Random String", pFile);
        fclose (pFile);
    }

    /* Read the string back and print it. */
    pFile = fopen ("myfile.txt", "r");
    if (pFile != NULL) {
        if (fgets (string, sizeof(string), pFile) != NULL) {
            printf("\nstring = %s", string);
        }
        fclose (pFile);
    } else {
        perror("fopen()");
    }

    return 0;
}
% cc -o fopenclient fopenclient.c
% ./fopenclient
string = Some Random String
With no interceptor, everything works as expected. Now let's introduce the interceptor and collect the data at run time.
% setenv LD_PRELOAD ./libfopenhack.so
% ./fopenclient
fopen() has been called. file name = myfile.txt, mode = w
Forwarding the control to fopen() of libc
fopen() has been called. file name = myfile.txt, mode = r
Forwarding the control to fopen() of libc
string = Some Random String
% unsetenv LD_PRELOAD
As we can see from the above output, the interceptor received the calls to fopen() instead of the actual implementation in libc. The advantages of this technique are evident from this simple example; it is up to the hacker whether to use or abuse the flexibility of symbol interposition.
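The interposer shown earlier prints one message per call, so the number of calls has to be counted from the output. A possible variation, sketched below, keeps a running count inside the interposer and writes to stderr so the messages do not mix with the application's own output. The counter name fopen_calls and the message format are hypothetical, and the _GNU_SOURCE define is an assumption needed only on glibc-based systems.

```c
#define _GNU_SOURCE   /* assumption: exposes RTLD_NEXT on glibc; not needed on Solaris */
#include <stdio.h>
#include <dlfcn.h>

/* Hypothetical counter; not part of the original interposer. */
unsigned long fopen_calls;

FILE *fopen(const char *filename, const char *mode) {
    static FILE *(*real_fopen)(const char *, const char *);

    if (real_fopen == NULL) {
        real_fopen = (FILE *(*)(const char *, const char *)) dlsym(RTLD_NEXT, "fopen");
        if (real_fopen == NULL) {
            return NULL;   /* no further definition of fopen() was found */
        }
    }

    fopen_calls++;   /* not thread-safe; adequate for a single-threaded client */
    fprintf(stderr, "fopen() call #%lu: file name = %s, mode = %s\n",
            fopen_calls, filename, mode);

    /* Forward the call to the next fopen() in the search order. */
    return real_fopen(filename, mode);
}
```

A multi-threaded client would need an atomic counter or a lock around the increment; that is left out here to keep the sketch short.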
Suggested Reading:
- Debugging and Performance Tuning with Library interposers
- Profiling and Tracing Dynamic Library Usage Via Interposition