Thursday, April 20, 2006

Solaris: NULL pointer bugs & /usr/lib/0@0.so.1 library

Some programmers assume that a NULL character pointer is the same as a pointer to an empty string. However, dereferencing a NULL pointer (i.e., reading location 0x00000000 in the address space of a 32-bit process) results in a segmentation fault on Solaris, and the process dies with signal SEGV. On Solaris, application text typically loads at 0x00010000. The address space between 0x00000000 and 0x0000FFFF (64K in all) is not used. I still need to find out why. [Update: 04/26/06] According to Chris, the first few pages were intentionally left unmapped to catch poorly written code. Thanks, Chris.
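
To make the distinction concrete, here is a minimal sketch in C. The helper name describe() is made up for illustration and does not come from any library. A NULL pointer has no storage behind it, whereas an empty string is a valid pointer to a single '\0' byte, so only the latter can safely be handed to routines such as strlen().

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): classify a string pointer. */
const char *describe(const char *s)
{
    if (s == NULL)
        return "NULL pointer";      /* points at nothing; must not be dereferenced */
    if (s[0] == '\0')
        return "empty string";      /* valid storage holding only the terminator */
    return "non-empty string";
}
```

Calling describe(NULL) takes the first branch without ever reading through the pointer, while describe("") reads one valid byte and takes the second.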

The following simple C program dereferences a NULL pointer and crashes as a result:
 % cat strlen.c
 #include <stdio.h>
 #include <string.h>

 int main()
 {
  char *string = NULL;

  printf("\nString length = %d", strlen(string));
  return (0);
 }

 % cc -g -o strlen strlen.c

 % ./strlen
 Segmentation Fault (core dumped)

 % dbx strlen core
 Reading strlen
 core file header read successfully
 Reading ld.so.1
 Reading libc.so.1
 Reading libdl.so.1
 Reading libc_psr.so.1
 program terminated by signal SEGV (no mapping at the fault address)
 0xff2b44e4: strlen+0x0080: ld [%o1], %o2
 Current function is main
  8 printf("\nString length = %d", strlen(string));


 (dbx) print $o1
 $o1 = 0

 (dbx) where
  [1] strlen(0x0, 0x0, 0x36a2c, 0x7efefeff, 0x81010100, 0xff3cdc4c), at 0xff2b44e4
 =>[2] main(), line 8 in "strlen.c"

 (dbx) whatis string
 char *string;

 (dbx) examine string
 0x00000000: 0x00000000

 (dbx) regs
 current frame: [2]
 g0-g1 0x00000000 0x00000000 0x00000000 0xff2b4464
 g2-g3 0x00000000 0x00000000 0x00000000 0x00000000
 g4-g5 0x00000000 0x00000000 0x00000000 0x00000000
 g6-g7 0x00000000 0x00000000 0x00000000 0x00000000
 o0-o1 0x00000000 0x00000000 0x00000000 0x00000000
 o2-o3 0x00000000 0x00036a2c 0x00000000 0x7efefeff
 o4-o5 0x00000000 0x81010100 0x00000000 0xff3cdc4c
 o6-o7 0x00000000 0xffbff8b8 0x00000000 0x00010c14
 l0-l1 0x00000000 0x00000000 0x00000000 0x00010c70
 l2-l3 0x00000000 0xff342070 0x00000000 0x00000000
 l4-l5 0x00000000 0x00000000 0x00000000 0x00000000
 l6-l7 0x00000000 0x00000000 0x00000000 0xff3ee7c4
 i0-i1 0x00000000 0x00000001 0x00000000 0xffbff984
 i2-i3 0x00000000 0xffbff98c 0x00000000 0x00020c00
 i4-i5 0x00000000 0x00000000 0x00000000 0x00000000
 i6-i7 0x00000000 0xffbff920 0x00000000 0x000107d0
 y 0x00000000 0x00000000
 ccr 0x00000000 0xfe400006
 pc 0x00000000 0x00010c14:main+0x14 call strlen [PLT] ! 0x20cdc
 npc 0x00000000 0xff2b44e8:strlen+0x84 inc 4, %o1

Observe that dbx shows six arguments for strlen() rather than the single argument the routine actually takes (the address of the string). These six values are the contents of registers %o0 to %o5, which hold the outgoing arguments. You can match the arguments to strlen() in the call stack against the o0-o5 rows of the regs output. Note that the registers %i0 to %i5 hold the incoming arguments.

How to prevent the crash?

The recommended way is to fix the source code. The quick and dirty way is to preload the Solaris-specific /usr/lib/0@0.so.1 library into the process address space. /usr/lib/0@0.so.1 is a user compatibility library that Sun started shipping with Solaris 2.6. It provides a mechanism that maps location 0x00000000 (for 32-bit processes) to a valid address containing the value 0. By default, address 0x00000000 (0x0 for short) of the virtual address space is not mapped.

For example:
 % setenv LD_PRELOAD_32 /usr/lib/0@0.so.1

 % ./strlen
 String length = 0

See the man page of the run-time linker, ld.so.1, for more details.
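
The source-level fix recommended above might look like the following minimal sketch. The wrapper name safe_strlen() is hypothetical (it is not a libc function), and treating NULL as an empty string is an assumption about what the caller wants; some programs would rather report an error instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper (not part of libc): treat a NULL pointer as an empty string. */
size_t safe_strlen(const char *s)
{
    /* Only call strlen() when there is real storage to read. */
    return (s != NULL) ? strlen(s) : 0;
}
```

In the example program, the call would then become printf("\nString length = %lu", (unsigned long) safe_strlen(string)); the cast is there because strlen() returns size_t, which %d does not match portably.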

Acknowledgements:
Jim Fiori

Comments:

  1. Hi Giri. The story I always heard was that the decision to leave the first page of memory unmapped was made intentionally to catch poorly written code. For example, there may be string pointers inside uninitialized malloced blocks. The string pointers might be NULL 99% of the time, but as soon as malloc returns nonzeroed memory (because a block was previously used and freed) then blammo! Your code is trying to read garbage. It's better to find such code and initialize the variables in question to a pointer to an empty string.

  2. Hi Giri/Chris,

    The /usr/lib/0@0.so.1 fix works fine for the strlen(NULL) call,
    but not for the STL string class.
    If I use string(NULL), I get an abort and a core dump on the Solaris box, with or without /usr/lib/0@0.so.1.
    Do you have any fix for this defect?

    Thanks,
    Raghu.Yadav
