Thursday, May 26, 2005
Solaris: 32-bit, fopen() and max number of open files
Last Friday I was assigned to look into an issue where an application is unable to write to files once it has been up for more than a week. It is a 32-bit application running on Solaris (SPARC platform), and the error message says "too many open files". With a little effort, we found that all of those errors are due to calls to fopen() from the application.
A little background on stdio's fopen(): fopen() is part of the stdio API. For a 32-bit application, the stdio library's FILE structure stores the underlying file descriptor in an unsigned char (8 bits), limiting the range of file descriptors that can be opened as FILEs to 0-255 inclusive.
A commonly known problem (perhaps a "fact") is that when 32-bit stdio is used in large applications on Solaris, the 255 limit on the number of open files is frequently reached. File descriptors are allocated by the operating system starting at 0 and in numerical order. Descriptors 0, 1 and 2 are opened for every process at startup, as stdin, stdout and stderr. The open() system call can also be used to open files from a C program. Both open and fopen use file descriptors taken from the total number of file descriptors allowed by the environment. The system also allocates descriptors from the same pool for calls to popen(), socket(), accept() and any other system call that returns a descriptor. That is, the same pool of file descriptors is shared by fopen, open, popen, accept, socket and the rest. So, if the application makes numerous calls to these functions, and assuming the descriptors are not closed promptly, it is very likely that a call to fopen will fail even before the application reaches the 253 (256 - 3 = 253) FILEs it is otherwise permitted to have open.
However, if the program uses open/popen/socket/accept exclusively, it will be able to open as many files/pipes/sockets/connections as the current soft limit allows. There are actually two per-process limits: the soft limit and the hard limit. The soft limit is the number of files a process can open by default; the hard limit is the maximum number of files a process can open after raising the soft limit.
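As a quick illustration (a sketch of mine, not part of the original post), a process can query both limits and raise the soft limit up to the hard limit with getrlimit()/setrlimit():

#include <stdio.h>
#include <sys/types.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;

    /* query the current soft and hard limits for open file descriptors */
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return (1);
    }
    printf("soft limit = %lu, hard limit = %lu\n",
        (unsigned long) rl.rlim_cur, (unsigned long) rl.rlim_max);

    /* a process may raise its soft limit up to, but not beyond, the hard limit */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return (1);
    }
    return (0);
}

Note that raising the soft limit does not help 32-bit fopen(), which is still confined to descriptors 0-255.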
The following C program illustrates the limit on the number of files opened with fopen():
% cat fopen.c
#include <stdio.h>
#include <errno.h>
#define MAXFOPEN 275
int main() {
    FILE *fps[MAXFOPEN];
    char fname[15];
    int i, j;

    /*
     * Test total number of fopen()'s which can be completed
     */
    for (i = 0; i < MAXFOPEN; i++) {
        sprintf(fname, "fopen_%d", i);
        if ((fps[i] = fopen(fname, "w+")) == NULL) {
            perror("fopen failed");
            break;
        }
    }
    printf("fopen() completes: %d\n", i);

    /*
     * Close the file descriptors
     */
    for (j = 0; j < i; j++) {
        if (fclose(fps[j]) == EOF) {
            perror("fclose failed");
        }
    }
    return (0);
}
% cc -o fopen fopen.c
% file fopen
fopen: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required,
dynamically linked, not stripped
% ./fopen
fopen failed: Too many open files
fopen() completes: 253
How to resolve this issue:
Make it a 64-bit binary; it allows the program to have up to 65536 open FILEs.
% cc -xarch=v9 -o fopen fopen.c
% file fopen
fopen: ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not
stripped
% ./fopen
fopen() completes: 275
As we can see, fopen() was able to overcome the 253 open files limitation with the 64-bit executable.
Note: to use 64-bit binaries, the processor and the OS must support them.
Since I cannot re-compile the code at the customer site, the option of creating 64-bit binaries has been ruled out.
A closer look at the output of lsof (LiSt Open Files) gave me a clue that most of the open files are actually TCP sockets/connections.
% grep TCP openfiles.log
app 6913 giri 11u IPv4 0x30013e4d3c0 0t0 TCP *:49152 (LISTEN)
app 6913 giri 12u IPv4 0x30011722200 0t0 TCP *:49153 (LISTEN)
app 6913 giri 13u IPv4 0x300126e8680 0t0 TCP *:49154 (LISTEN)
app 6913 giri 14u IPv4 0x30010faf300 0t0 TCP *:49155 (LISTEN)
app 6913 giri 15u IPv4 0x300082c2d00 0t0 TCP *:49156 (LISTEN)
app 6913 giri 16u IPv4 0x30011e3a180 0t0 TCP *:49157 (LISTEN)
app 6913 giri 17u IPv4 0x30018e36700 0t0 TCP *:1571 (LISTEN)
app 6913 giri 18u IPv4 0x30009bad900 0t0 TCP *:49158 (LISTEN)
app 6913 giri 43u IPv4 0x3001aa0a700 0t0 TCP as7:44232->as7:49156 (ESTABLISHED)
app 6913 giri 46u IPv4 0x30011cff800 0t0 TCP as7:1571->as3:27025 (ESTABLISHED)
app 6913 giri 49u IPv4 0x3000ce48d40 0t0 TCP as7:1571->as3:27026 (ESTABLISHED)
app 6913 giri 51u IPv4 0x300199d3980 0t722051 TCP as7:44238->repo:1521 (ESTABLISHED)
app 6913 giri 52u IPv4 0x30014d40c40 0t793865 TCP as7:44239->repo:1521 (ESTABLISHED)
app 6913 giri 55u IPv4 0x300197db340 0t0 TCP as7:1571->as3:27027 (ESTABLISHED)
app 6913 giri 56u IPv4 0x30011b5f800 0t675177 TCP as7:44243->repo:1521 (ESTABLISHED)
app 6913 giri 57u IPv4 0x30012853880 0t0 TCP as7:1571->as3:27028 (ESTABLISHED)
app 6913 giri 58u IPv4 0x30011d94d00 0t723190 TCP as7:44244->repo:1521 (ESTABLISHED)
app 6913 giri 62u IPv4 0x30016d5b240 0t0 TCP as7:1571->as3:27029 (ESTABLISHED)
app 6913 giri 63u IPv4 0x3001126d9c0 0t575246 TCP as7:44247->repo:1521 (ESTABLISHED)
app 6913 giri 64u IPv4 0x3000a825900 0t0 TCP as7:1571->as3:27030 (ESTABLISHED)
...
...
app 6913 giri 250u IPv4 0x300139899c0 0t0 TCP as7:1571->as3:27076 (ESTABLISHED)
app 6913 giri 251u IPv4 0x30017fc4700 0t0 TCP as7:1571->as3:27077 (ESTABLISHED)
app 6913 giri 252u IPv4 0x30011c3b900 0t403370 TCP as7:44390->repo:1521 (ESTABLISHED)
app 6913 giri 253u IPv4 0x3000cd32c40 0t445290 TCP as7:44391->repo:1521 (ESTABLISHED)
app 6913 giri 257u IPv4 0x30017f640c0 0t0 TCP as7:1571->as3:27078 (ESTABLISHED)
app 6913 giri 258u IPv4 0x300141f1280 0t0 TCP as7:1571->as3:27079 (ESTABLISHED)
So, to find the actual number of calls to fopen(), I created a simple interposing library with only one interface, which interposes on the actual fopen() function.
% cat logfopen.c
#include <dlfcn.h>
#include <stdio.h>
#include <stdarg.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/errno.h>
#include <thread.h>
#include <synch.h>
#include <fcntl.h>
FILE *fopen(const char *filename, const char *mode) {
    FILE *fd;
    static void * (*func)();

    if (!func) {
        func = (void *(*)()) dlsym(RTLD_NEXT, "fopen");
        if (func == NULL) {
            (void) fprintf(stderr, "dlsym(): %s\n", dlerror());
            return (0);
        }
    }

    fd = func(filename, mode);
    if (fd != NULL) {
        fprintf(stderr, "\nfopen(): fd = %d filename = %s mode = %s",
            fileno(fd), filename, mode);
    } else {
        fprintf(stderr, "\nfopen() failed; returned NULL. "
            "Tried to open %s with mode: %s", filename, mode);
    }
    return (fd);
}
Interestingly, the interposer caught only two calls to fopen() during a 10-minute real-world simulation run of the application. This was confirmed by running the truss tool.
% grep fopen stderrout.log
fopen(): fd = 32 filename = /export/home/oracle/network/names/.sdns.ora mode = r
/export/home/C/liblogfopen.so:fopen+0x3c
fopen(): fd = 32 filename = /export/home/oracle/network/admin/tnsnames.ora mode = r
/export/home/C/liblogfopen.so:fopen+0x3c
% grep fopen truss.log
6913/14@14: -> libc:fopen(0xe48f6178, 0xe48f627c, 0x1, 0x61)
6913/14@14: <- libc:fopen() = 0
6913/14@14: -> libc:fopen(0xe48f8ad0, 0xe48f8bd4, 0x0, 0x61)
6913/14@14: <- libc:fopen() = 0xfdae884c
This observation made my job a little simpler. To resolve this particular customer issue, the application just needs to reserve the low-numbered file descriptors (<= 255) for use by fopen().
How to reserve the file descriptors?
Use the file control function fcntl() with the F_DUPFD command to obtain the lowest file descriptor greater than or equal to 256 that is not already associated with an open file. fcntl() takes the OS-assigned file descriptor and returns a new file descriptor greater than or equal to the value passed as the third argument. That is, once fcntl() successfully returns, we have two file descriptors pointing to the same open file. Since the intention is to keep as many low-numbered descriptors as possible available for fopen(), and since we don't need two descriptors, we can safely close the OS-assigned one.
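Here is a minimal sketch of that idea (my illustration, not the actual fix shipped to the customer): a helper that moves an OS-assigned descriptor to 256 or above with fcntl(F_DUPFD) and gives the low-numbered one back to the OS.

#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: if fd is below 256, duplicate it onto the lowest
 * free descriptor >= 256 and close the original, so that descriptors
 * 0-255 stay available for stdio's fopen(). Returns the (possibly new)
 * descriptor, or the original one if the duplication fails. */
static int
move_fd_above_255(int fd)
{
    int newfd;

    if (fd < 0 || fd >= 256)
        return (fd);            /* nothing to do */

    newfd = fcntl(fd, F_DUPFD, 256);
    if (newfd == -1)
        return (fd);            /* out of descriptors: keep the original */

    (void) close(fd);
    return (newfd);
}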
In fact, database management systems like Oracle, Sybase and Informix addressed the fopen issue by employing this technique of reserving the low-numbered file descriptors for exclusive use by the stdio routines.
As changing the application code is not feasible, this can be done very easily with an interposing library that provides interfaces for open(), popen(), socket(), accept() and so on. (A brief introduction to library interposition is available at: Solaris: hijacking a function call (interposing).) The interfaces of the interposing library catch all calls to open(), popen(), socket(), accept(), etc. even before the actual implementation receives the call, duplicate each file descriptor with the help of fcntl() to get a new descriptor that is >= 256, and return the OS-assigned descriptor to the pool of available file descriptors, i.e., to the OS.
Interposing code for socket():
% cat fopenfix.c
#include <stdio.h>
#include <dlfcn.h>
#include <ucontext.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
int socket(int domain, int type, int protocol) {
    int sd, newsd = -1;
    static void * (*func)();

    if (!func) {
        func = (void *(*)()) dlsym(RTLD_NEXT, "socket");
    }

    sd = (int) func(domain, type, protocol);

    if (sd == -1) {             /* socket() failed; nothing to remap */
        return (sd);
    }

    if (sd < 256) {
        /* duplicate onto a descriptor >= 256 and give the low one back */
        newsd = fcntl(sd, F_DUPFD, 256);
        if (newsd == -1) {
            fprintf(stderr, "\nfcntl() failed. Cannot return %d to OS", sd);
            return (sd);
        }
        close(sd);
        return (newsd);
    }

    return (sd);
}
The same functionality has to be replicated for the other interfaces:
int open(const char *path, int oflag, ...);
int accept(int s, struct sockaddr *addr, void *addrlen);
...
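For instance, an accept() interposer along the same lines might look like the sketch below (my illustration, not the actual library used at the customer site; the prototype follows the one listed above and should match the system headers, and error handling is kept minimal):

#include <dlfcn.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

int accept(int s, struct sockaddr *addr, void *addrlen) {
    int sd, newsd;
    static int (*func)(int, struct sockaddr *, void *);

    if (!func) {
        func = (int (*)(int, struct sockaddr *, void *))
            dlsym(RTLD_NEXT, "accept");
    }

    sd = func(s, addr, addrlen);

    /* move descriptors below 256 out of the way of stdio's fopen() */
    if (sd != -1 && sd < 256) {
        newsd = fcntl(sd, F_DUPFD, 256);
        if (newsd != -1) {
            close(sd);
            return (newsd);
        }
    }
    return (sd);
}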
After preloading the interposing library, all the file descriptors returned by open, popen, socket, accept, etc. were mapped to descriptors >= 256, leaving enough room for fopen() to have nearly 253 open files.
_______________________
Because of issues like this, it is recommended to use open() and its family of functions for file handling, instead of stdio's fopen() and its subordinates. It is good practice for developers to think about problems like this and handle them properly (just as Oracle/Informix/.. did) during the early stages of the design and development of an application.
References and suggested reading:
- SunSolve document: Maximum number of open files
- Solaris: Hijacking a function call (interposing)
Friday, May 20, 2005
Behavior of Sun C++ Compiler While Compiling Templates
When the Sun C++ compiler finds a template declaration in a header (.h) file and needs the definition of that template, it automatically searches for a .cc, .C, .cpp, etc. file with the same base name. If such a file exists, it is automatically included in the current compilation; note that it is not compiled separately. Deferring the search to the point of need lets compilations proceed faster on average, because template definitions that are not used don't have to be processed. This compiler behavior means that some source code organizations won't work with the Sun compiler.
For example, the compilation of the following driver program fails with a multiple declaration error for a variable. The driver program calls a function template, multiply, which in turn uses a class template, Array. The definitions of multiply and Array are in different source (.cpp) files. Since there is a dependency between the Array and multiply templates, the compiler tries to include both source files into the compilation unit, and hence the failure.
%cat array.h
#ifndef _ARRAY_H_
#define _ARRAY_H_
const int ArraySize = 20;
template <class Type> class Array {
private:
    Type* data;
    int size;
public:
    Array(int sz = ArraySize);
    int GetSize();
};
#endif // _ARRAY_H_
% cat array.cpp
static const char file_id [ ] = "$Header: array.cpp 1 04/05/05 1:35p Giri $";
#include "array.h"
template <class Type> Array<Type>::Array(int sz) {
    size = sz;
    data = new Type[size];
}

template <class Type> int Array<Type>::GetSize() {
    return size;
}
% cat multiply.h
#include "array.h"
int AnyNumber;
template <class Number>
Number multiply(Number original);
% cat multiply.cpp
static const char file_id [ ] = "$Header: multiply.cpp 1 04/05/05 1:35p Giri $" ;
template <class Number>
Number multiply( Number original ) {
    Array<int> IntArray;
    int size = IntArray.GetSize();
    return (size * original);
}
% cat driver.cpp
#include "multiply.h"
#include <stdio.h>
int main( ) {
    printf("\n ** %d **\n", multiply(50));
}
% CC -o driver driver.cpp
"array.cpp", line 1: Error: file_id is initialized twice.
"array.cpp", line 1: Error: Multiple declaration for file_id.
2 Error(s) detected.
% truss -f -o truss.log CC -o driver driver.cpp
% cat truss.log | egrep "multiply|array"
24813: open("multiply.h", O_RDONLY) = 6
24813: open("multiply.cpp", O_RDONLY) = 5
24813: open("array.h", O_RDONLY) = 6
24813: open("array.cpp", O_RDONLY) = 5
Note that both multiply.cpp and array.cpp are syntactically correct; the problem shows up only when they are both pulled into the same compilation unit. The behavior described above is documented in the Sun C++ User's Guide, in the Compiling Templates chapter.
The Sun C++ User's Guide suggests employing a definitions-separate template compilation model, which can be described as follows: if file x.h has any template declarations, a file called x.cc or x.C or x.cpp, etc. must contain the definitions of those templates, and nothing else: no #include directives, and no definitions of anything other than the templates declared in x.h.
To comply with the definitions-separate template compilation model, the array.cpp and multiply.cpp files can be modified as follows:
% cat array.cpp
template <class Type> Array<Type>::Array(int sz) {
    size = sz;
    data = new Type[size];
}

template <class Type> int Array<Type>::GetSize() {
    return size;
}
% cat multiply.cpp
template <class Number>
Number multiply( Number original ) {
    Array<int> IntArray;
    int size = IntArray.GetSize();
    return (size * original);
}
Now the compilation of driver.cpp should succeed, thanks to the definitions-separate template compilation model.
% CC -o driver driver.cpp
%./driver
** 1000 **
As we can see, the compilation succeeds and the driver program prints the expected result on the console.
If the source code organization does not follow this model, you can use the compiler option -template=no%extdef. This option tells the compiler not to look for template definitions in associated files. With this option the compilation succeeds, but the linking may fail. For example, compiling the original source files with -template=no%extdef fails during the linking phase with the following error:
% CC -o driver -template=no%extdef driver.cpp
Undefined first referenced
symbol in file
__type_0 multiply(__type_0) driver.o
ld: fatal: Symbol referencing errors. No output written to driver
Moral of the story: rely on the "definitions separate template compilation" model as suggested by the documentation, not on temporary workarounds.
If source code changes are not feasible, carefully guarding the common interfaces, variable names, etc. with #ifdef directives will do the trick, and the compilation and eventually the linking succeed.
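As a rough illustration (a sketch of mine, not from the original post), a shared guard macro around the clashing file_id strings in the two .cpp files lets whichever file is pulled into the translation unit first define the variable, while the second definition disappears:

// array.cpp -- guard the identification string with a shared macro
#ifndef FILE_ID_DEFINED
#define FILE_ID_DEFINED
static const char file_id [ ] = "$Header: array.cpp 1 04/05/05 1:35p Giri $";
#endif

// multiply.cpp -- same guard; only the first definition included into a
// translation unit survives
#ifndef FILE_ID_DEFINED
#define FILE_ID_DEFINED
static const char file_id [ ] = "$Header: multiply.cpp 1 04/05/05 1:35p Giri $";
#endif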
Acknowledgements:
Steve Clamage of Sun Microsystems
Thursday, May 19, 2005
Solaris: hijacking a function call (interposing)
Sometimes it is necessary to alter the functionality of a routine, or to collect some data from a malfunctioning routine for debugging. That works well as long as we have access to the source code. But what if we don't have access to the source code, or changing it is not feasible? With dynamic libraries it is very easy to intercept any call to a routine of our choice and do whatever we wish in that routine, including calling the real routine the client intended to call.
In simple words, the hacker (who writes the interposing library, in this context) writes a new library with the exact interfaces of the routines that (s)he wishes to intercept, and preloads the new library before starting up the application. This works as long as the targeted interfaces are not protected. On Solaris, with the linker's -Bsymbolic option or the Sun Studio compilers' -xldscope=symbolic option, all symbols of a library can be made non-interposable (those symbols are called protected symbols, since no one else can interpose on them). If the targeted routine is interposable, the dynamic linker simply passes control to whatever symbol it encounters first that matches the function call (the callee). With the preloaded library in force, the hacker gets control of the routine. At this point it is up to the hacker whether to pass control to the actual routine the client intended to call. If the intention is just to collect data and let the call go through, the required data can be collected and control passed to the actual routine with the help of the libdl routines. Note that control has to be passed explicitly to the actual routine; as far as the dynamic linker is concerned, it is done with its job once it passes control to the function (the interposer in this case). If the idea is to completely change the behavior of the routine (it is easy to write a new routine with the new behavior, but then the library and its clients have to be re-built to make use of it), the new implementation becomes part of the interposing routine and control is never passed to the actual routine. In the worst case, a malicious hacker can intercept data that is supposed to be confidential (e.g., passwords, account numbers, etc.) and do more harm at will.
[Off-topic] To guard against such attacks, it is recommended to make most of the symbols local in scope, with the help of linker-supported map files or the compiler-supported linker scoping mechanism. Read http://developers.sun.com/tools/cc/articles/symbol_scope.html to learn more about linker scoping.
The above technique is commonly referred to as library interposition; as we can see, it is quite useful for debugging, collecting run-time data, and performance tuning of an application.
It would be more interesting to see an interceptor in action. So let's build a very small library with only one routine, fopen(). The idea is to count the calls to fopen() and to find out which files are being opened. Our interceptor simply prints a message on the console with the name of the file to be opened every time there is a call to fopen() from the application, and then passes control to the fopen() routine of libc. For this, we first need the signature of fopen(). fopen() is declared in stdio.h as follows:
FILE *fopen(const char *filename, const char *mode);
Here is the source code for the interposer:
% cat interceptfopen.c
#include <stdio.h>
#include <dlfcn.h>
FILE *fopen(const char *filename, const char *mode) {
    FILE *fd = NULL;
    static void *(*actualfunction)();

    if (!actualfunction) {
        actualfunction = (void *(*)()) dlsym(RTLD_NEXT, "fopen");
    }

    printf("\nfopen() has been called. file name = %s, mode = %s \n"
        "Forwarding the control to fopen() of libc", filename, mode);

    fd = actualfunction(filename, mode);
    return (fd);
}
% cc -G -o libfopenhack.so interceptfopen.c
% ls -lh libfopenhack.so
-rwxrwxr-x 1 build engr 3.7K May 19 19:02 libfopenhack.so*
actualfunction is a function pointer to the actual fopen() routine, which lives in libc. dlsym is part of libdl, and the RTLD_NEXT argument directs the dynamic linker (ld.so.1) to find the next reference to the specified function using the normal dynamic linker search sequence.
Let's proceed to write a simple C program that writes and reads a string to and from a file.
% cat fopenclient.c
#include <stdio.h>
int main () {
    FILE * pFile;
    char string[30];

    pFile = fopen ("myfile.txt", "w");
    if (pFile != NULL) {
        fputs ("Some Random String", pFile);
        fclose (pFile);
    }

    pFile = fopen ("myfile.txt", "r");
    if (pFile != NULL) {
        fgets (string, 30, pFile);
        printf("\nstring = %s", string);
        fclose (pFile);
    } else {
        perror("fopen(): ");
    }
    return 0;
}
% cc -o fopenclient fopenclient.c
% ./fopenclient
string = Some Random String
With no interceptor, everything works as expected. Now let's introduce the interceptor and collect the data at run-time.
% setenv LD_PRELOAD ./libfopenhack.so
% ./fopenclient
fopen() has been called. file name = myfile.txt, mode = w
Forwarding the control to fopen() of libc
fopen() has been called. file name = myfile.txt, mode = r
Forwarding the control to fopen() of libc
string = Some Random String
% unsetenv LD_PRELOAD
As we can see from the above output, the interceptor received the calls to fopen() instead of the actual implementation in libc. The usefulness of this technique is evident even from this simple example, and it is up to the hacker to take advantage of, or abuse, the flexibility of symbol interposition.
Suggested Reading:
- Debugging and Performance Tuning with Library interposers
- Profiling and Tracing Dynamic Library Usage Via Interposition
Wednesday, May 18, 2005
Sun C/C++: Reducing symbol scope with Linker Scoping feature
This article was published on the Sun developers' portal at http://developers.sun.com/tools/cc/articles/symbol_scope.html. I have been working on this article for more than 3 months, and I am glad to have learned quite a few new things from the extensive feedback of Lawrence Crowl and Steve Clamage of the Sun C/C++ compiler team.
Keywords:
Linker Scoping, Global, Symbolic, Hidden, __global, __symbolic, __hidden, __declspec, dllexport, dllimport, xldscope, xldscoperef, linker map files
Friday, May 13, 2005
Csh: "Arguments too long" error
Symptom:
C-shell fails to execute commands with arguments using wildcard characters.
eg.,
% \rm -rf *
Arguments too long
% ls -l | wc
8462 76151 550202
The reason for this failure is that the wildcard expansion exceeded a C-shell limitation. The command in this example evaluates to a very long argument list; it overwhelmed the csh limit of 1706 on the maximum number of arguments to a command for which filename expansion applies.
Workarounds:
- Use multiple commands, or
- Use the xargs utility
% \rm -rf *
Arguments too long
% ls | xargs rm -rf
% ls
%
Or:
% \rm -rf *
Arguments too long
% find . -name "*" | xargs rm -rf
% ls
%
From Jerry Peek's Handle Too-Long Command Lines with xargs: xargs reads a group of arguments from its standard input, then runs a UNIX command with that group of arguments. It keeps reading arguments and running the command until it runs out of arguments. The shell's backquotes do the same kind of thing, but they give all the arguments to the command at once; that is the main reason for the "Arguments too long" error when the shell reaches its limits.
Reference: man page of csh
Thanks to Chris Quenelle for suggesting the xargs workaround
Thursday, May 12, 2005
CPU hog with connections in CLOSE_WAIT
A couple of days back I got a call at our partner's site to look into an issue where one process (a server) is hogging all the processing power, with absolutely no load on the system. The server process is running on Solaris 9.
% prstat 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
2160 QAtest 941M 886M cpu0 0 0 80:03:57 99% myserver/41
28352 patrol 7888K 6032K sleep 59 0 4:49:37 0.1% bgscollect/1
24720 QAtest 1872K 1656K cpu3 59 0 0:00:00 0.0% prstat/1
59 root 4064K 3288K sleep 59 0 0:27:56 0.0% picld/6
2132 QAtest 478M 431M sleep 59 0 0:15:45 0.0% someserver.exe/901
I started off with my favorite tool, truss, and found that the recv() system call is being called tons of times with no corresponding send().
% truss -c -p 2160
^Csyscall seconds calls errors
time .001 115
lwp_park .001 51 24
lwp_unpark .000 23
poll .002 34
recv 61.554 2512863
-------- ------ ----
sys totals: 61.561 2513086 24
usr time: 12.008
elapsed: 68.350
Interestingly, the return value of all the recv() calls is 0 (EOF). A return value of 0 is an indication that the other end has nothing more to write and is ready to close the socket (connection).
% head /tmp/truss.log
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
2160/222: recv(59, 0x1F4CB410, 32768, 0) = 0
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
2160/222: recv(59, 0x1F4CB410, 32768, 0) = 0
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
2160/222: recv(59, 0x1F4CB410, 32768, 0) = 0
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
2160/222: recv(59, 0x1F4CB410, 32768, 0) = 0
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
2160/222: recv(59, 0x1F4CB410, 32768, 0) = 0
2160/216: recv(294, 0x277C9410, 32768, 0) = 0
A typical recv() call will be like this:
recv(55, 0x05CCB010, 4096, 0) = 2958
Then I collected the network statistics and found quite a number of connections in the CLOSE_WAIT state:
% netstat -an
...
127.0.0.1.54356 127.0.0.1.9810 49152 0 49152 0 ESTABLISHED
127.0.0.1.9810 127.0.0.1.54356 49152 0 49152 0 ESTABLISHED
127.0.0.1.54687 127.0.0.1.9810 49152 0 49152 0 ESTABLISHED
127.0.0.1.9810 127.0.0.1.54687 49152 0 49152 0 ESTABLISHED
...
127.0.0.1.9710 127.0.0.1.55830 49152 0 49152 0 CLOSE_WAIT
127.0.0.1.9810 127.0.0.1.57701 49152 0 49152 0 CLOSE_WAIT
127.0.0.1.9710 127.0.0.1.59209 49152 0 49152 0 CLOSE_WAIT
127.0.0.1.9810 127.0.0.1.60694 49152 0 49152 0 CLOSE_WAIT
127.0.0.1.9810 127.0.0.1.61133 49152 0 49152 0 CLOSE_WAIT
127.0.0.1.9810 127.0.0.1.61136 49152 0 49152 0 CLOSE_WAIT
...
(Later I realized that these half-closed socket connections had been lying there for more than two days.)
2160/216: recv(294, 0x277C9410, 32768, 0) = 0 <- from truss
The next step is to find out the state of the network connection with socket id 294. The pfiles utility on Solaris reports information for all open files in each process. It makes sense to use this utility, as a socket descriptor is nothing but a file descriptor. (On UNIX, everything is mapped to a file, including raw devices.)
% pfiles 2160
2160: /export/home/QAtest/572bliss/web/bin/myserver
Current rlimit: 1024 file descriptors
...
294: S_IFSOCK mode:0666 dev:259,0 ino:35150 uid:0 gid:0 size:0
O_RDWR
sockname: AF_INET 127.0.0.1 port: 9710
peername: AF_INET 127.0.0.1 port: 59209
Now it is fairly easy to identify the connection with the port numbers reported in the pfiles output:
% netstat -an | grep 59209
127.0.0.1.9710 127.0.0.1.59209 49152 0 49152 0 CLOSE_WAIT
A closer look at the other socket ids from truss indicated that the server is continuously trying to read data from connections that are in the CLOSE_WAIT state. Here are the corresponding TCP statistics:
% netstat -s
TCP tcpRtoAlgorithm = 4 tcpRtoMin = 400
tcpRtoMax = 60000 tcpMaxConn = -1
tcpActiveOpens =4593219 tcpPassiveOpens =2259153
tcpAttemptFails =4036987 tcpEstabResets = 20254
tcpCurrEstab = 75 tcpOutSegs =1264739589
tcpOutDataSegs =645683085 tcpOutDataBytes =1480883468
tcpRetransSegs =682053 tcpRetransBytes =759804724
tcpOutAck =618848538 tcpOutAckDelayed =40226142
tcpOutUrg = 351 tcpOutWinUpdate =155203
tcpOutWinProbe = 3278 tcpOutControl =18622247
tcpOutRsts =8970930 tcpOutFastRetrans = 60772
tcpInSegs =1622143125
tcpInAckSegs =443838358 tcpInAckBytes =1459391481
tcpInDupAck =3254927 tcpInAckUnsent = 0
tcpInInorderSegs =1462796453 tcpInInorderBytes =550228772
tcpInUnorderSegs = 12095 tcpInUnorderBytes =10680481
tcpInDupSegs = 60814 tcpInDupBytes =30969565
tcpInPartDupSegs = 29 tcpInPartDupBytes = 19498
tcpInPastWinSegs = 66 tcpInPastWinBytes =102280302
tcpInWinProbe = 2142 tcpInWinUpdate = 3092
tcpInClosed = 1218 tcpRttNoUpdate =391989
tcpRttUpdate =441925010 tcpTimRetrans =185795
tcpTimRetransDrop = 456 tcpTimKeepalive = 8077
tcpTimKeepaliveProbe= 3054 tcpTimKeepaliveDrop = 0
tcpListenDrop = 18265 tcpListenDropQ0 = 0
tcpHalfOpenDrop = 0 tcpOutSackRetrans =255744
Apparently one end of the connection (the server, in this scenario) ignored the 0-length read (EOF) and kept trying to read data from the connection as if it were still a duplex connection.
But how to check if the other end has really closed the connection? According to the man page of recv:
Upon successful completion, recv() returns the length of the message in bytes. If no messages are available to be received and the peer has performed an orderly shutdown, recv() returns 0. Otherwise, -1 is returned and errno is set to indicate the error.
So, a simple check on the return value of recv() would do. Just to make sure that the other end really intends to close the connection and is not sending null strings (very unlikely, though), try this: after a series of EOFs (i.e., return value 0) from recv(), try to write some data to the socket. It will result in a "connection reset" (ECONNRESET) error. A subsequent (second) write results in a "broken pipe" (EPIPE) error. After that, it is safe to assume that the other end has closed the connection.
I suggested that the responsible engineer check the return value of recv() and close the connection when it is safe to do so (see above).
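A minimal sketch of that fix (my illustration, not the application's actual code) might look like this: treat a 0 return from recv() as EOF and close the socket instead of retrying forever.

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical read loop: drain a socket and close it as soon as the
 * peer signals EOF, instead of spinning on 0-byte reads. */
static void
drain_socket(int sd)
{
    char buf[32768];
    ssize_t n;

    for (;;) {
        n = recv(sd, buf, sizeof (buf), 0);
        if (n > 0) {
            /* ... hand the data to the application ... */
            continue;
        }
        if (n == 0) {
            /* peer performed an orderly shutdown: connection is half-closed */
            (void) close(sd);   /* lets TCP leave the CLOSE_WAIT state */
            break;
        }
        if (errno == EINTR)
            continue;           /* interrupted; retry */
        fprintf(stderr, "recv: %s\n", strerror(errno));
        (void) close(sd);
        break;
    }
}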
About the CLOSE_WAIT state:
CLOSE_WAIT means the other end of the connection has been closed while the local end is still waiting for the application to close. That's normal. But an indefinite CLOSE_WAIT state usually indicates an application-level bug. TCP connections move from the ESTABLISHED state to the CLOSE_WAIT state after receiving a FIN from the remote system, but before a close has been called by the local application.
The CLOSE_WAIT state signifies that the endpoint has received a FIN from the peer, indicating that the peer has finished writing, i.e., it has no more data to send. This is indicated by a 0-length read on the input. The connection is now half-closed, a simplex (one-way) connection; the receiver of the FIN still has the option of writing more data. The state can persist indefinitely, as it is a perfectly valid, synchronized TCP state. The peer should be in FIN_WAIT_2 (i.e., sent FIN, received ACK, waiting for FIN). It's only the application's fault if it ignores the EOF (0-length read) and persists as if the connection were still a duplex connection.
Note that an application that only intends to receive data and not send any might close its end of the connection, which leaves the other end in CLOSE_WAIT until the process at that end is done sending data and issues a close. (But that's not the case in this scenario.)
State diagram for the closing phase of a TCP connection:
Server Client
| Fin |
CLOSE_WAIT|<-------------- | FIN_WAIT_1
| |
| Ack |
|--------------->| FIN_WAIT_2
| |
| |
| |
| |
| |
| |
| Fin |
LAST_ACK |--------------->| TIME_WAIT
| |
| Ack |
|<-------------- |
CLOSED | |
| |
Reference:
Sun Alert document: TCP: Why do I have tcp connections in the CLOSE_WAIT state?
Suggested reading:
RFC 793 Transmission Control Protocol
Saturday, May 07, 2005
Solaris: Mounting a CD-ROM manually
- Get the device name (in cxtydzsn format) associated with the CD drive
- Mount the device associated with the CD-ROM:
% mount -F hsfs -o ro /dev/dsk/c1t0d0s0 /cdrom
  (ensure that the mount point /cdrom exists before running the mount command)
  -F specifies the type of the file system on which to operate; High Sierra File System (HSFS) is the file system for CD-ROMs
  -o specifies the file system options; ro stands for read-only (the default is rw, read-write)
- Check the file system
Wednesday, May 04, 2005
C/C++: global const variables, symbol collisions & symbolic scoping
(Most of the following is "generic" C/C++. Sun Studio compilers were used to compile the code and to propose a solution to the symbol collision problem.)
The way C++ handles global const variables is different from C. In C++, a global const variable that is not explicitly declared extern has internal (static) linkage. In C, global const variables have external linkage by default, and global variables can be declared more than once. As long as there is at most a single initialization for a given variable, the linker resolves all of the repeated declarations into a single entity, and the initialization takes place when the program starts up, before entry to the main function.
This can be illustrated with a simple program that produces different results when compiled with the C and C++ compilers:
% cat mylib.h
const float libraryversion = 2.2;
float getlibversion();
int checklibversion();
% cat mylib.c
#include <stdio.h>
#include "mylib.h"
float getlibversion() {
    printf("\nmylib.c: libraryversion = %f", libraryversion);
    return (libraryversion);
}

int checklibversion() {
    float ver;

    ver = getlibversion();
    printf("\nmylib.c: ver = %f", ver);
    if (ver < 2.0) {
        return (1);
    } else {
        return (0);
    }
}
% cat thirdpartylib.h
extern const float libraryversion = 1.5;
float getlibversion();
% cat thirdpartylib.c
#include <stdio.h>
#include "thirdpartylib.h"
float getlibversion() {
    printf("\nthirdparty.c: libraryversion = %f", libraryversion);
    return (libraryversion);
}
% cat versioncheck.c
#include <stdio.h>
#include "mylib.h"
int main() {
    printf("\n** versioncheck.c: libraryversion = %f", libraryversion);
    int retval = 0;
    retval = checklibversion();
    if (retval) {
        printf("\n** Obsolete version being used .. Can\'t proceed further! **\n");
    } else {
        printf("\n** Met the library version requirement .. Good to Go! ** \n");
    }
    return (0);
}
Case 1:
Compile with Sun Studio C compiler:
% cc -G -o libmylib.so mylib.c
% cc -G -o libthirdparty.so thirdpartylib.c
% cc -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
** versioncheck.c: libraryversion = 2.200000
thirdparty.c: libraryversion = 2.200000
mylib.c: ver = 2.200000
** Met the library version requirement .. Good to Go! **
From this output it appears to work as expected, even though there is a symbol collision between the libmylib and libthirdparty load modules over the libraryversion symbol.
Case 2:
Compile with Sun Studio C++ compiler:
% CC -G -o libmylib.so mylib.c
% CC -G -o libthirdparty.so thirdpartylib.c
% CC -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
** versioncheck.c: libraryversion = 2.200000
thirdparty.c: libraryversion = 1.500000
mylib.c: ver = 1.500000
** Obsolete version being used .. Can't proceed further! **
The inherent symbol collision was exposed when the code was compiled with the C++ compiler. It is well known that global const variables such as libraryversion in this example are bound to cause problems.
The following is an alternative implementation of the above example, which shows consistent behavior when compiled with C and C++ compilers.
% cat mylib_public.h
float getlibversion();
int checklibversion();
% cat mylib_private.h
#include "mylib_public.h"
const float libversion = 2.2;
% cat mylib.c
#include <stdio.h>
#include "mylib_private.h"

float getlibversion() {
    printf("\nmylib.c: libversion = %f", libversion);
    return (libversion);
}

int checklibversion() {
    float ver;

    ver = getlibversion();
    printf("\nmylib.c: ver = %f", ver);
    if (ver < 2.0) {
        return (1);
    } else {
        return (0);
    }
}
% cat versioncheck.c
#include <stdio.h>
#include "mylib_public.h"
int main() {
    int retval = 0;
    retval = checklibversion();
    if (retval) {
        printf("\n** Obsolete version being used .. Can\'t proceed further! **\n");
    } else {
        printf("\n** Met the library version requirement .. Good to Go! ** \n");
    }
    return (0);
}
Since we cannot control the 3rd-party implementation, it was kept intact in this example.
Case 1:
Compile with Sun Studio C compiler:
% cc -G -o libmylib.so mylib.c
% cc -G -o libthirdparty.so thirdpartylib.c
% cc -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
thirdparty.c: libraryversion = 1.500000
mylib.c: ver = 1.500000
** Obsolete version being used .. Can't proceed further! **
Case 2:
Compile with Sun Studio C++ compiler:
% CC -G -o libmylib.so mylib.c
% CC -G -o libthirdparty.so thirdpartylib.c
% CC -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
thirdparty.c: libraryversion = 1.500000
mylib.c: ver = 1.500000
** Obsolete version being used .. Can't proceed further! **
Now, with the new implementation, the behavior of the code is the same, and as expected, with both the C and C++ compilers.
The remainder of this post proposes a solution, common to both C and C++, to resolve the symbol collision itself. With C++, symbol collisions can also be minimized using namespaces.
Symbolic (protected) scope
All symbols of a library get symbolic scope when the library is built with Sun Studio's -xldscope=symbolic compiler option. Symbolic scoping is more restrictive than global linker scoping: all references within a library that match definitions within the library bind to those definitions. Outside of the library, the symbol appears as though it were global. That is, the link-editor first tries to find the definition of the symbol being used within the same shared library. If found, the symbol is bound to that definition at link time; otherwise the search continues outside the library, as is the case with global symbols. This explanation holds for functions; for variables there is the extra complication of copy relocations.
Let's see how symbolic scoping works in practice by compiling the same code again with the -xldscope=symbolic option.
Case 1:
Compile with Sun Studio C compiler:
% cc -G -o libmylib.so -xldscope=symbolic mylib.c
% cc -G -o libthirdparty.so thirdpartylib.c
% cc -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
mylib.c: libraryversion = 2.200000
mylib.c: ver = 2.200000
** Met the library version requirement .. Good to Go! **
Case 2:
Compile with Sun Studio C++ compiler:
% CC -G -o libmylib.so -xldscope=symbolic mylib.c
% CC -G -o libthirdparty.so thirdpartylib.c
% CC -o vercheck -lthirdparty -lmylib versioncheck.c
% ./vercheck
mylib.c: libraryversion = 2.200000
mylib.c: ver = 2.200000
** Met the library version requirement .. Good to Go! **
With symbolic (protected) scoping, the reference to the symbol libraryversion was bound to its definition within the load module libmylib, and the program showed the intended behavior.
However, the main drawback of -xldscope=symbolic is that it also applies to the C++ implementation symbols. These implementation interfaces often must remain global within a group of similar dynamic objects, as one interface must interpose on all the others for the correct execution of the application. Because of this, blanket use of -xldscope=symbolic is strongly discouraged.
Sun Studio compilers (version 8 or later) provide a declaration specifier called __symbolic; using the __symbolic specifier only on the symbols that need symbolic scope (protected symbols) is the recommended approach.
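For example (a sketch of mine, not from the original article), mylib.h could mark just its own symbols for protected scope instead of building the whole library with -xldscope=symbolic:

/* mylib.h -- hypothetical variant using the __symbolic declaration
 * specifier (Sun Studio 8 or later); only these symbols get protected
 * scope, while the rest of the library keeps its default scoping */
__symbolic const float libraryversion = 2.2;
__symbolic float getlibversion();
int checklibversion();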
Monday, May 02, 2005
C/C++: Printing Stack Trace with printstack() on Solaris
libc on Solaris 9 and later provides a useful function called printstack, which prints a symbolic stack trace to the specified file descriptor. This is useful for reporting errors from an application at run-time. If the stack trace appears corrupted, or if the stack cannot be read, printstack() returns -1.
Programmatic example:
% more stacktrace.c
#include <stdio.h>
#include <ucontext.h>

int callee(int file) {
    printstack(file);
    return (0);
}

int caller() {
    int a;

    a = callee(fileno(stdout));
    return (a);
}

int main() {
    caller();
    return (0);
}
% cc -o stacktrace stacktrace.c
% ./stacktrace
/tmp/stacktrace:callee+0x18
/tmp/stacktrace:caller+0x22
/tmp/stacktrace:main+0x14
/tmp/stacktrace:0x6d2
The printstack() function uses dladdr1() to obtain symbolic symbol names. As a result, only global symbols are reported as symbol names by printstack().
% CC -o stacktrace stacktrace.c
% ./stacktrace
/tmp/stacktrace:__1cGcallee6Fi_i_+0x18
/tmp/stacktrace:__1cGcaller6F_i_+0x22
/tmp/stacktrace:main+0x14
/tmp/stacktrace:0x91a
The stack trace from a C++ program will have all the symbols in their mangled form. So, as of now, programmers may need to write their own wrapper functions to print the stack trace in demangled form.
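One possible wrapper (a sketch of mine, not from the original post) is to pipe the printstack() output through an external demangling filter; here I assume a c++filt utility is available in the PATH:

#include <stdio.h>
#include <ucontext.h>

/* Hypothetical wrapper: feed the symbolic stack trace to a demangling
 * filter such as c++filt, falling back to the mangled trace on stderr
 * if the filter cannot be started. */
void print_demangled_stack(void) {
    FILE *filter = popen("c++filt", "w");   /* assumption: c++filt is in PATH */

    if (filter == NULL) {
        printstack(fileno(stderr));         /* mangled, but better than nothing */
        return;
    }
    printstack(fileno(filter));             /* demangled trace goes to stdout */
    pclose(filter);
}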
There has been an RFE (Request For Enhancement) filed against Solaris' libc to print the stack trace in demangled form when printstack() is called from a C++ program. It will be released as a libc patch for Solaris 8, 9 & 10 some time in the near future.
% elfdump -CsN.symtab libc.so | grep printstack
[5275] 0x00052629 0x00000051 FUNC GLOB D 0 .text _printstack
[6332] 0x00052629 0x00000051 FUNC WEAK D 0 .text printstack
Since the object code is automatically linked with libc during the creation of an executable or a dynamic library, the programmer need not specify -lc on the compile line.
Suggested Reading: man page of walkcontext or printstack