Old Posts:
09.04
10.04
11.04
12.04
01.05
02.05
03.05
04.05
05.05
06.05
07.05
08.05
09.05
10.05
11.05
12.05
01.06
02.06
03.06
04.06
05.06
06.06
07.06
08.06
09.06
10.06
11.06
12.06
01.07
02.07
03.07
04.07
05.07
06.07
08.07
09.07
10.07
11.07
12.07
01.08
02.08
03.08
04.08
05.08
06.08
07.08
08.08
09.08
10.08
11.08
12.08
01.09
02.09
03.09
04.09
05.09
06.09
07.09
08.09
09.09
10.09
11.09
12.09
01.10
02.10
03.10
04.10
05.10
06.10
07.10
08.10
09.10
10.10
11.10
12.10
01.11
02.11
03.11
04.11
05.11
07.11
08.11
09.11
10.11
11.11
12.11
01.12
02.12
03.12
04.12
05.12
06.12
07.12
08.12
09.12
10.12
11.12
12.12
01.13
02.13
03.13
04.13
05.13
06.13
07.13
08.13
09.13
10.13
11.13
12.13
01.14
02.14
03.14
04.14
05.14
06.14
07.14
09.14
10.14
11.14
12.14
01.15
02.15
03.15
04.15
06.15
09.15
12.15
01.16
03.16
04.16
05.16
06.16
07.16
08.16
09.16
12.16
01.17
02.17
03.17
04.17
06.17
07.17
08.17
09.17
10.17
12.17
01.18
02.18
03.18
04.18
05.18
06.18
07.18
08.18
09.18
11.18
12.18
01.19
02.19
05.19
06.19
08.19
10.19
11.19
05.20
10.20
11.20
12.20
09.21
11.21
12.22
Wednesday, October 26, 2005
C/C++: About __FILE__ & __LINE__ Macros
__FILE__ & __LINE__ macros specified in C & C++ standards are useful to provide some debugging information, during run-time. During pre-processing stage of compilation, __FILE__
macro expands to the name of the file being compiled, in the form of a C string constant; and __LINE__
expands to the line number of the source file that is being compiled, in the form of an integer constant.
Run-time anomalies are not uncommon in production environment. Some developers check for pre-, post-conditions for a block of code with the help of assertions. Depending on the severity of the failure/violation/deviation, the program (application) may either crash or continue to run (of course, at the will of the developer who wrote that piece of code). However to minimize errors/failures during run-time, it is often necessary to log the run-time error messages for a later analysis. So, the developer can introduce some statements with __FILE__ & __LINE__
macros, at appropriate places in the source code to log the exact location of the failure, if an inconsistency was detected during run-time.
Here's an example of poorly written C code, that fails to do proper exception handling. However it logs enough information, to locate the failed statement easily.% cat -n file-line-directives.c
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int argc, char **argv)
5 {
6 FILE *fd;
7 char *errinfo = NULL;
8
9 errinfo = (char *) malloc (sizeof (char) * 50);
10
11 fd = fopen (argv[1], "r");
12 if (fd == NULL)
13 {
14 sprintf (errinfo, "Error at line %d of %s", (__LINE__ - 3), __FILE__);
15 perror (errinfo);
16 free (errinfo);
17 }
18 else
19 {
20 printf ("file descriptor = %d", fileno(fd));
21 fclose (fd);
22 }
23
24 return (0);
25 }
Compiling this code with Sun Studio C (or C++) compiler produces the following result, at run-time:% cc -o directives file-line-directives.c
% ./directives <- needs an argument ie., file name
Error at line 11 of file-line-directives.c: Bad address
% ./directives /tmp/nonexistingfile <- need an existing file name, to open it up for reading
Error at line 11 of file-line-directives.c: No such file or directory
% ls /tmp/*.log
/tmp/activity.log
%./directives /tmp/activity.log
file descriptor = 3
Note:
Since these pre-defined macros are part of the standard(s), they are available with all C and C++ standard compliant compilers.
___________________
Technorati tags: C | C++ | Programming
Monday, October 24, 2005
My Favorite Music V
Here comes another long list of my favorite singles. Previous posts of this series are at:
I |
II |
III |
IV
3 Doors Down - Let Me Go Abba - Dancing Queen Ace Of Base - The Sign Aerosmith - Angel Aha - Take On Me Alannah Myles - Black Velvet Anita Baker - Giving You The Best I Got Beck - E-pro Bee Gees - Night Fever Belinda Carlisle - Heaven Is A Place On Earth Bette Midler - Wind Beneath My Wings Blondie - Heart of Glass Blondie - One Way or Another Blues Traveler - Run Around Bobby Brown - Every Little Step Bobby McFerrin - Dont Worry Be Happy Bon Jovi - Livin On A Prayer Bonnie Raitt - Good Man Good Woman Bonnie Raitt - Nick Of Time Brian McKnight - Back At One Bryan Adams - Have You Really Ever Loved A Woman Bryan Adams - Summer of 69 Buggles - Video Killed The Radio Star CC Music Factory - Groove Of Love (Whats This Word Called Love) Cars - Just What I Needed Chaka Khan - I Feel For You Chemical Brothers - Shake Break Bounce Cher - Believe Cheryl Lynn - I Got To Be Real Club Nouveau - Lean on Me Counting Crows - Big Yellow Taxi Crazy Town - Butterfly Cyndi Lauper - Girls Just Wanna Have Fun Cyndi Lauper - Time After Time DHT - Listen to Your Heart Dave Mathews Band - American Baby Dee Lite - Groove Is In The Heart Dexys Midnight Runners - ComeOn Eileen Dire Straits - Money for Nothing Divinyls - Touch Myself Don Henley - End Of Innocence Duran Duran - Hungry Like the Wolf EMF - Unbelievable Earth Wind Fire - Lets Groove Elastica - Connection Emotions - Best Of My Love Enigma - Sadness Eric Clapton - Badge Eric Clapton - Tears In Heaven Europe - Final Countdown Eurythmics - Here Comes The Rain Again Everything But The Girl - Missing Faith Hill - This Kiss Fontella Bass - Rescue Me Gloria Gaynor - I Will Survive Goo Goo Dolls - Iris Gorillaz - Feel Good Inc Green Day - Wake Me Up When Septtember Ends Gwen Stefani - Cool Haddaway - What Is Love J Geils Band - Centerfold Jack Johnson - Flake Janet Jackson - Escapade Jesus Jones - Real Real Real Jesus Jones - Right Here Right Now Jimmy Cliff - I Can See Clearly Now Joan Jett - I Love Rock'n'Roll Joe Cocker - With A Little Help From My Friends John Mellencamp - Small Town John Waite - Missing You Journey - Lights Keith Urban - You Will Think Of Me Kim Appleby - Dont Worry Kim Wilde - You Keep Me Hanging On
|
Kinks - You Really Got Me LeAnn Rimes - How Do I Live Leann Womack - I Hope You Dance Lenny Kravitz - American Woman Lenny Kravitz - Lady Linda Ronstadt - Dont Know Much Lipps Inc - Funky Town Liz Phair - Why Cant I Madonna - Angel Madonna - Borderline Madonna - Just Like a Prayer Marc Anthony - I Need To Know Marvin Gaye - I Heard It Through The Grapevine Match Box 20 - 3AM Melissa Etheridge - Come to My Window Men At Work - Down Under Michael Bolton - How Am I Supposed To Live Michael Bolton - When A Man Loves A Woman Mike And The Mechanics - The Living Years Modern English - I Melt With You Moira Jane's Cafe - Definition of Sound Montell Jordan - This Is How We Do It Naked Eyes - Always Something To Remind Me Nine Inch Nails - Collector Nine Inch Nails - Only Nine Inch Nails - With Teeth OMD - If You Leave Pat Benatar - Hit Me with Your Best Shot Pat Benatar - Love is a Battle Field Peter Gabriel - In Your Eyes Petshop Boys - Eastern Boys Western Girls Prince - 1999 Prince - I Would Die 4 You Prince - Kiss Prince - Let's Go Crazy Prince - Little Red Corvette Prince - Purple Rain Prince - Raspberry Beret Queen - Bohemian Rhapsody Red Hot Chilli Peppers - Under The Bridge Rednex - Cotton Eye Joe Rick James - Super Freak RodStewart - Maggie May (cool guitar bits at the end) Roxette - The Look Santana/Michelle Branch - Game Of Love Savage Garden - Truly Madly Deeply Shania Twain - I Feel Like A Woman Sinead O'Connor - Nothing Compares Sixpence None The Richer - Kiss Me Soft Cell - Tainted Love Soho - Hippy Chick Soul II Soul - Back To Life Spin Doctors - Two Princes Tasmin Archer - Sleeping Satellite The Corrs - Breatheless The Cranberries - Dreams The Four Tops - Reach Out I'll Be There The Judds - Love Can Build A Bridge The Supremes - Baby Love Thomas Dolby - She Blinded Me With Science Tom Petty - Free Falling Tommy Lee - Good Times Toni Braxton - Unbreak My Heart Tracy Chapman - Fast Car Train - Get to Me U2 - Hold Me Thrill Me Kiss Me U2 - Vertigo UB40 - Red Red Wine Unchained Melody - My Love My Darling Us3 - Cantaloop Wall Of Voodoo - Mexican Radio War (What is it Good For) - Rush Hour Soundtrack Weezer - Beverly Hills When in Rome - The Promise Wilson Phillips - Hold On |
________________
Technorati tag:
Music
Friday, October 21, 2005
Solaris: pthread_attr_getstack() broken?
Apparently there seems to be some confusion
1 2 around when the structure
pthread_attr_t gets updated. In one of the above threads, it appears that the initiator believes that the attribute structure must be updated at least when a thread was created by calling
pthread_create()
.
Let's start with an example:(Taken from:
pthread_attr_getstack(), pthread_attr_getstackaddr() thread -- for simplicity, I just removed the error checking code)
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
void* Proc(void* param)
{
sleep(3);
return 0;
}
int main(int argc, char *argv[])
{
pthread_attr_t Attr;
pthread_t Id;
void *Stk;
size_t Siz;
int err;
pthread_attr_init(&Attr);
pthread_attr_getstacksize(&Attr, &Siz);
pthread_attr_getstackaddr(&Attr, &Stk);
printf("Default: Addr=%08x Size=%d\n", Stk, Siz);
pthread_create(&Id, &Attr, Proc, NULL);
pthread_attr_getstackaddr(&Attr, &Stk);
pthread_attr_getstacksize(&Attr, &Siz);
printf("Stack : Addr=%08x Size=%d\n", Stk, Siz);
return (0);
}
On Solaris, this piece of code produces a result of:
Default: Addr=00000000 Size=0
Stack : Addr=00000000 Size=0
This result may surprise (and
confuse) some of the users, who expects a valid base address and a valid size for the stack. However on Solaris, a value of NULL for stack address, and 0 for stack size indicates that the new thread will have system defined stack address and system allocated stack size (1M for 32-bit and 2M for 64-bit processes) respectively. Other default values for attribute (
Attr
, in the example) is mentioned in
Solaris 10 Multithreaded Programming Guide. To understand when the interfaces
pthread_attr_getstackaddr()
and
pthread_attr_getstacksize()
, return valid values, first we need to know a little about attribute object.
What is an Attribute object?Attributes are a way to specify behavior that is different from the default. Any variable of type
pthread_attr_t
can be used specify the new behavior. Attribute object can be specified, when a thread is created with
pthread_create()
or when a synchronization variable is initialized. Note that these attributes cannot be altered while the thread is in use.
Now it all makes sense for the OS to return NULL and 0 for base address and size of a stack. Since the POSIX specification didn't specify the default values to be returned, the output for the same program might be different on different platforms (see previous paragraphs for the default values chosen by Sun).
Let's modify the C code a little to place a 3M stack, at a different location than the system defined address. (The newly introduced code is in green color)
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
void* Proc(void* param)
{
sleep(3);
return 0;
}
int main(int argc, char *argv[])
{
pthread_attr_t Attr;
pthread_t Id;
void *Stk;
size_t Siz;
int err;
size_t stksize = 0x300000;
void *stack;
pthread_attr_init(&Attr);
pthread_attr_getstacksize(&Attr, &Siz);
pthread_attr_getstackaddr(&Attr, &Stk);
printf("Default: Addr=%08x Size=%d\n", Stk, Siz);
stack = mmap(NULL, stksize, PROT_WRITE|PROT_READ, MAP_ANON|MAP_SHARED, -1, 0);
pthread_attr_setstack(&Attr, stack, stksize);
pthread_create(&Id, &Attr, Proc, NULL);
pthread_attr_getstackaddr(&Attr, &Stk);
pthread_attr_getstacksize(&Attr, &Siz);
printf("Stack : Addr=%08x Size=%d\n", Stk, Siz);
return (0);
}
Here's the output from the above program:
Default: Addr=00000000 Size=0
Stack : Addr=fee80000 Size=3145728
Moral of the story:
pthread_attr_getstack()
is not broken; and the structure
pthread_attr_t (ie., the attribute object) gets updated only if we update it, with the help of pthread_attr_setxxxx() interfaces defined in pthread.h ie.,
% grep "pthread_attr_set*" /usr/include/pthread.h
int pthread_attr_setstack(pthread_attr_t *, void *, size_t);
int pthread_attr_setstacksize(pthread_attr_t *, size_t);
int pthread_attr_setstackaddr(pthread_attr_t *, void *);
int pthread_attr_setdetachstate(pthread_attr_t *, int);
int pthread_attr_setscope(pthread_attr_t *, int);
int pthread_attr_setinheritsched(pthread_attr_t *, int);
int pthread_attr_setschedpolicy(pthread_attr_t *, int);
int pthread_attr_setschedparam(pthread_attr_t *_RESTRICT_KYWD,
int pthread_attr_setguardsize(pthread_attr_t *, size_t);
Otherwise system is going to return the default values for all pthread_attr_getxxxx() calls.
________________
Technorati tag:
Programming |
POSIX |
Solaris
Friday, October 14, 2005
Handling SIGFPE
Couple of days back I was asked to look into simple C code that tries to get into
division by zero problem deliberately. Signal handler was installed, and there's some code to catch signal SIGFPE (Floating Point Exception), and to print simple message when the relevant code in the signal handler is called. Here's the code (courtesy:
Dheeraj):
% cat fpe.c
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
void signal_handler (int signo, siginfo_t *si, void *data) {
switch (signo) {
case SIGFPE:
fprintf(stdout, "Caught FPE\n");
break;
default:
fprintf(stdout, "default handler\n");
}
}
int main (void) {
struct sigaction sa, osa;
unsigned int b = ULONG_MAX;
sa.sa_flags = SA_ONSTACK | SA_RESTART | SA_SIGINFO;
sa.sa_sigaction = signal_handler;
sigaction(SIGFPE, &sa, &osa);
b /= 0x0;
return b;
}
During run-time, the system (OS) throws SIGFPE once the statement
b /= 0x0;
gets executed. Since the handler is available for this signal, it should print
Caught FPE once, on console and then return from
main()
. Strangely enough, the floating point exception was caught multiple times as though it was in an infinite loop, and the process didn't exit.
% cc -o fpe fpe.c
"fpe.c", line 25: warning: division by 0
% ./fpe
Caught FPE
Caught FPE
Caught FPE
Caught FPE
Caught FPE
Caught FPE
Caught FPE
^C
It turns out to be the expected behavior; and it appears that when a floating point instruction traps due to the occurrence of an unmasked floating point exception, the hardware leaves the instruction pointer pointing to the beginning of the same instruction. This explains the reason for the multiple SIGFPE's from the same process (and from the same instruction).
Now the developer has the following choices:
- Abort the program
- Modify the operands of the instruction, so the exception will not occur; then continue by re-executing that instruction. Doing so, supplies a result for the trapping instruction
--And/Or--
- Update the instruction pointer (PC), so the execution continues at the next instruction (nPC)
I chose the final one, and simply updated the program counter with the next instruction, as follows. New code is in green color.
% cat fpe.c
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <ucontext.h>
void signal_handler (int signo, siginfo_t *si, void *data) {
ucontext_t *uc;
uc = (ucontext_t *) data;
switch (signo) {
case SIGFPE:
fprintf(stdout, "Caught FPE\n");
uc->uc_mcontext.gregs[REG_PC] = uc->uc_mcontext.gregs[REG_nPC];
break;
default:
fprintf(stdout, "default handler\n");
}
}
int main (void) {
struct sigaction sa, osa;
unsigned int b = ULONG_MAX;
sa.sa_flags = SA_ONSTACK | SA_RESTART | SA_SIGINFO;
sa.sa_sigaction = signal_handler;
sigaction(SIGFPE, &sa, &osa);
b /= 0x0;
return b;
}
% cc -o fpe fpe.c
"fpe.c", line 30: warning: division by 0
% ./fpe
Caught FPE
uc points to the user context, defined by the structure
ucontext_t
. The user context includes the contents of the calling processes' machine registers, the signal mask, and the current execution stack.
uc_mcontext
is a member of the structure
ucontext_t
, of type
mcontext_t
.
gregs
, general register set is a member of structure
mcontext_t
.
gregs[REG_PC] holds the PC of the current instruction, and
gregs[REG_nPC] holds the PC of the next instruction.
Now it is obvious that
uc->uc_mcontext.gregs[REG_PC] = uc->uc_mcontext.gregs[REG_nPC];
statement increments the program counter . Since the user context got manipulated a bit, the process will be able to continue with the next instruction.
----
This code works "as is" on SPARC, since REG_nPC is available on SPARC. To make it work with other processors, the code needs to be changed a little bit.
________________
Technorati tag:
Programming |
Solaris |
SPARC
Thursday, October 06, 2005
Sun Studio: Investigating memory leaks with dbx
Some time back I have had a blog entry with the title
Sun Studio: Investigating memory leaks with Collector/Analyzer. It is possible to check memory leaks {along with memory use and memory access checking} with the Sun Studio debugger tool,
dbx
, as well. This blog post attempts to show the steps involved in generating the memory leak report, with a simple example (taken from Collector/Analyzer example).
Runtime checking with dbx
dbx
uses the word
runtime checking (RTC) for detecting runtime errors, like memory leaks and memory access errors. Extensive material about RTC is available in the
Debugging a Program With dbx document, under chapter
Using Runtime Checking.
For the runtime checking feature to work properly, the process must have
rtcaudit.so
preloaded when it starts.
To preload rtcaudit.so:
% setenv LD_AUDIT <path-to-rtcaudit-lib>/rtcaudit.so
Turn off mutex tracking with
dbxenv mt_sync_tracking off
, to avoid running into the bug
thread_db synchronization tracking causes cond_wait failure and hangs. Even if you don't, tt forces you to turn it off, anyway.
Unset
LD_AUDIT
once the data collection is done.
Here's an example, with annotated commentary:
% more memleaks.c <= source file
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
void allocate() {
int *x;
char *y;
x = (int *) malloc(sizeof(int) * 100);
y = (char *) malloc (sizeof(char) * 200);
printf("\nAddress of x = %u, y = %u", x, y);
x = (int *) malloc (sizeof(int) * 25);
y = (char *) malloc (sizeof(char) * 25);
printf("\nNew address of x = %u, y = %u\n", x, y);
free (y);
}
void main() {
while (1) {
allocate();
sleep(1);
}
}
% which cc <= Sun Studio 10 C compiler
/software/SS10/SUNWspro/prod/bin/cc
% cc -g -o memleaks memleaks.c <= Compile the code with debug (-g
) flag
% setenv LD_AUDIT /software/SS9/SUNWspro/prod/lib/dbxruntime/rtcaudit.so <= Pre-load rtcaudit library
% ./memleaks
Address of x = 134621928, y = 134622336
New address of x = 134622544, y = 134614272
Address of x = 134622656, y = 134623064
New address of x = 134623272, y = 134614272
...
...
In another window:
% ps -eaf | grep memleaks
techno 11174 10744 0 19:39:09 syscon 0:00 ./memleaks
% dbx - 11174 <= Attach the process to the debugger, dbx
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.4' in your .dbxrc
Reading -
Reading ld.so.1
Reading libc.so.1
Reading rtcaudit.so
Reading libmapmalloc.so.1
Reading libgen.so.1
Reading libdl.so.1
Reading libm.so.2
Reading libc_psr.so.1
Reading rtcboot.so
Reading librtc.so
(dbx) dbxenv mt_sync_tracking off <= Turn off mutex tracking
(dbx) check -leaks <= Turn on memory leak checking
leaks checking - ON
RTC: Enabling Error Checking...
RTC: Running program...
(dbx) cont <= Resume program execution
Address of x = 134623384, y = 134623792
New address of x = 134624000, y = 134614272
Address of x = 134624112, y = 134624520
New address of x = 134624728, y = 134614272
Address of x = 134624840, y = 134625248
New address of x = 134625456, y = 134614272
Address of x = 134625568, y = 134625976
New address of x = 134626184, y = 134614272
Address of x = 134626296, y = 134626704
New address of x = 134626912, y = 134614272
^C <= Interrupt the execution with Ctrl-C, to get intermediate leak report
dbx: warning: Interrupt ignored but forwarded to child.
signal INT (Interrupt) in ___nanosleep at 0xed1bc7dc
0xed1bc7dc: ___nanosleep+0x0004: ta 8
Current function is main
24 sleep(1);
(dbx) showleaks <= showleaks
reports new memory leaks since the last showleaks
command
Checking for memory leaks...
Actual leaks report (actual leaks: 15 total size: 3500 bytes)
Total Num of Leaked Allocation call stack
Size Blocks Block
Address
========== ====== =========== =======================================
2000 5 - allocate < main
1000 5 - allocate < main
500 5 - allocate < main
Possible leaks report (possible leaks: 0 total size: 0 bytes)
(dbx) cont <= Continue the execution of the program
Address of x = 139208, y = 139632
New address of x = 139856, y = 139984
Address of x = 140040, y = 140464
New address of x = 140688, y = 135824
Address of x = 140816, y = 141240
New address of x = 141464, y = 136656
Address of x = 141592, y = 142016
New address of x = 142240, y = 137488
^C <= Interrupt the execution with Ctrl-C, to get intermediate leak report
dbx: warning: Interrupt ignored but forwarded to child.
signal INT (Interrupt) in ___nanosleep at 0xed1bc7dc
0xed1bc7dc: ___nanosleep+0x0004: ta 8
Current function is main
24 sleep(1);
(dbx) showleaks <= showleaks
reports new memory leaks since the last showleaks
command
Checking for memory leaks...
Actual leaks report (actual leaks: 12 total size: 2800 bytes)
Total Num of Leaked Allocation call stack
Size Blocks Block
Address
========== ====== =========== =======================================
1600 4 - allocate < main
800 4 - allocate < main
400 4 - allocate < main
Possible leaks report (possible leaks: 0 total size: 0 bytes)
(dbx) showleaks -a <= showleaks -a
show all the leaks generated so far,
not just the leaks since the last showleaks
command
Checking for memory leaks...
Actual leaks report (actual leaks: 27 total size: 6300 bytes)
Total Num of Leaked Allocation call stack
Size Blocks Block
Address
========== ====== =========== =======================================
3600 9 - allocate < main
1800 9 - allocate < main
900 9 - allocate < main
Possible leaks report (possible leaks: 0 total size: 0 bytes)
(dbx) showleaks -v <= Verbose report, since the last showleaks
command. Default: Non verbose report
Checking for memory leaks...
Actual leaks report (actual leaks: 12 total size: 2800 bytes)
Memory Leak (mel):
Found 4 leaked blocks with total size 1600 bytes
At time of each allocation, the call stack was:
[1] allocate() at line 9 in "memleaks.c"
[2] main() at line 23 in "memleaks.c"
Memory Leak (mel):
Found 4 leaked blocks with total size 800 bytes
At time of each allocation, the call stack was:
[1] allocate() at line 10 in "memleaks.c"
[2] main() at line 23 in "memleaks.c"
Memory Leak (mel):
Found 4 leaked blocks with total size 400 bytes
At time of each allocation, the call stack was:
[1] allocate() at line 14 in "memleaks.c"
[2] main() at line 23 in "memleaks.c"
Possible leaks report (possible leaks: 0 total size: 0 bytes)
(dbx) func allocate <= Change the current function to allocate
(dbx) list 9 <= List line number 9 of function allocate
9 x = (int *) malloc(sizeof(int) * 100);
(dbx) list 10
10 y = (char *) malloc (sizeof(char) * 200);
(dbx) list 14
14 x = (int *) malloc (sizeof(int) * 25);
(dbx) func main
(dbx) list 23
23 allocate();
(dbx)
In short, memory leak report can be generated as follows with the help of RTC feature of dbx:
- Compile your program with
-g
(debug) flag, to get source line numbers in the runtime checking error messages. However it is not mandatory to compile with -g
to do runtime checking (RTC)
- Pre-load RTC library, rtcaudit:
setenv LD_AUDIT <path-to-rtcaudit-lib>/rtcaudit.so
- Load the program with dbx or attach the running process to dbx with
dbx - <pid>
- Turn off mutex tracking with
dbxenv mt_sync_tracking off
-- I have no clue why dbx didn't like mutex tracking on; and there's no relevant documentation anywhere in Sun documentation site. Perhaps Chris Quenelle can explain this
- Turn on the runtime checking for memory leaks with
check -leaks
command
- Run the program with
run
command, if it is not running already
- Occasionally interrupt the execution with Ctrl-C, and get the leak report with any of
showleaks, showleaks -a, showleaks -v
commands
- Detach the process, once the data has been collected
- Unset the LD_AUDIT variable.
unsetenv LD_AUDIT
__________________
Technorati tags:
Sun Studio |
dbx
Tuesday, October 04, 2005
Dynamic TSB support in Solaris 10
Recently at our partner (ISV, short for Independent Software Vendor) site, I was asked to look into some unexpected {huge} performance improvement {compared to Solaris 9} in running their application on a UltraSPARC based server running Solaris 10. The application was compiled/built on Solaris 8. Although Solaris 10 has many improvements, the mileage of the customer applications vary depending on the nature of the application being run. The application under discussion is a userland, CPU intensive financial application, written in C++. Apparently I didn't have much information about the way the application was compiled, and on which platform (hardware & software). So I made one of the Sun Fire v480's, a dual boot server with Solaris 9 and 10 installed on two partitions. I've installed the ISV's application on the third partition, and did a quick test by loading up the machine with virtual users until the average CPU consumption was about 85%. Interestingly I haven't seen phenomenal improvement as the ISV observed, but decent enough (~2.88%) gain that lead me to investigate further.
Here's the trapstat output from Solaris 9 env:
cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
ttl | 1339705 7.7 12031 0.5 | 869027 6.3 86899 4.7 |19.2
ttl | 1371385 7.9 12165 0.5 | 931897 6.8 93874 5.1 |20.3
ttl | 1261136 7.2 11227 0.5 | 862982 6.3 86420 4.7 |18.7
ttl | 1334286 7.7 12201 0.5 | 871144 6.4 90464 4.9 |19.4
ttl | 1423610 8.2 14101 0.6 | 957773 7.1 105544 5.7 |21.6
ttl | 1399334 8.1 14120 0.6 | 973754 7.2 110116 6.0 |21.9
ttl | 1478324 8.5 13310 0.6 | 975822 7.2 104689 5.7 |21.9
ttl | 1416840 8.1 12698 0.5 | 962725 7.1 98593 5.3 |21.0
ttl | 1464161 8.4 13149 0.6 | 974467 7.2 105842 5.8 |21.9
ttl | 1412006 8.2 13685 0.6 | 915772 6.9 107461 5.9 |21.5
Average time spent in virtual to physical memory translatations: 20.74%
trapstat output from Solaris 10 env:
cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----
ttl | 1449113 8.7 5045 0.2 | 1015584 7.5 35621 1.7 |18.1
ttl | 1504522 9.0 5771 0.3 | 1056137 7.9 39809 1.9 |19.0
ttl | 1372013 8.2 4824 0.2 | 965968 7.2 33577 1.6 |17.2
ttl | 1366566 8.2 5194 0.2 | 988130 7.3 34719 1.6 |17.3
ttl | 1433062 8.6 5170 0.2 | 1006544 7.4 34607 1.6 |17.9
ttl | 1463364 8.8 5403 0.2 | 1023112 7.6 37313 1.7 |18.3
ttl | 1356094 8.1 4904 0.2 | 979501 7.3 34212 1.6 |17.2
ttl | 1497592 9.0 5816 0.3 | 1060080 7.9 39844 1.9 |19.0
ttl | 1468445 8.8 6166 0.3 | 1079617 8.1 42968 2.0 |19.2
ttl | 1505277 9.0 5737 0.3 | 1062101 7.8 39025 1.8 |18.9
Average time spent in virtual to physical memory translatations: 18.21%
In Solaris 9 env, the OS has spent 2.53% CPU cycles more {compared to Solaris 10}, in serving TLB/TSB misses. A closer look at the trapstat outputs reveals that Solaris 10 has less burden of serving TSB misses {for data}. There's about 3.64% difference between Solaris 9 & 10's dTSB miss%; but Solaris 10 spent ~0.75% more cycles in serving dTLB misses, compared to Solaris 9, which leaves us 2.89% (ie., 3.64% - 0.75%). Surprisingly this number (2.89%) matched with the gain (2.88%) I've observed by running the same application on both Solaris 9 & 10.
Solaris 10's dynamic TSB supportSince I ran the same application on same hardware with different versions of Solaris, I can directly attribute the improvement in performance to Solaris 10. It is no brainer to quickly realize that this is the result of the algorithmic changes to dynamic TSB support in Solaris 10.
On Solaris 9 and prior versions, depending on the physical memory installed on the machine, the system allocates a fixed number of TSBs with size 128KB or 512KB, at boot time; and since the number is fixed, all processes have to share those TSBs. Due to the limited (only 2) number of supported TSB sizes, any process that needs a TSB of size somewhere between 128 & 512, say 256KB, may either experience a miss (for eg., if the translation was done in a 128KB TSB) or wastes some memory (for eg, if the translation was done in 512KB).
Prior to version 10, Solaris is lacking the flexibility of using the right TSB size, for the right process. Recent versions of UltraSPARC chips can support TSBs of eight different sizes (8K, 16K, 32K, 64K, 128K, 256K, 512K, 1024K or 1M). By sticking to only 128K and 512K TSB's, Solaris 9 and prior versions couldn't take the advantage of the hardware capability quite efficiently.
Solaris 10 overcomes those drawbacks mentioned above by creating a TSB on the fly, per the needs of the process. Here's the corresponding RFE to fix the issues which were seen until Solaris 9:
Integrate support for Dynamic TSBs.
Now it makes more sense for me to mention about the 3% reduction in memory footprint per user, in my test runs.
To complete the original story of huge performance difference between Solaris 9 & 10, I gave them a check list to make sure they are doing apples-to-apples comparison; but I never heard from 'em back. Anyway here's the check list that I sent:
- On which hardware (US II/III/IV/..), the application was built?
This is extremely important to know, because building on US II, and running the binary on later processors (US III, III+, ..) will have significant impact on the overall performance of the application. For eg., I have seen nearly 4% performance difference in CPU utilization with some application that was built on US II, and ran on US III+ machine, with similar work loads.
- Is it the same binary that was run on both Solaris 9 & 10?
- Check the difference(s) in the run-time environments of both experiments. Have the tests been conducted on the same kind of hardware ? With same #processors? With same load? etc.,
- Make sure to use the same {application & OS} tunables, in both experiments
- Which version of Solaris 10 is in use to test the application?
The later versions of Solaris 10 (also called Solaris Express builds), enable large pages for data and instructions by default, if the OS thinks doing so is beneficial. Large pages for data (MPSS) is already introduced in Solaris 9; now Solaris 11 extends it to instructions. So, if Solaris Express bits are being used (less likely though; but there's a possibility), there is almost 13 to 15% improvement (based on the trapstat data shown above) in CPU utilization with no effort from the users.
- Make sure large pages are either enabled or not enabled on both (S9 & S10) platforms
- If the application binary is not the same, check the changes in the application, that could improve performance significantly. Also check the changes in compiler flags
________________
Technorati tag:
Solaris
2004-2019 |
|
|