Mandalika's scratchpad [ Work blog @Oracle | My Music Compositions ]



Wednesday, January 12, 2005
 
C/C++: External linkage with extern "C"

[Updated: 04/07/2006] More accurate information is available in a better format at:
Mixed-Language Programming and External Linkage
__________________

It is common practice to mix code written in one programming language with
code written in another. But the developer needs to take some additional care to
make such programs work; otherwise the build ends up with link errors about unresolved
symbols. Let's discuss the problem and its solution with a simple example.

Assume that we're writing C++ code and wish to call a C function from it.

bpte4500s001:/sunbuild1/giri/testcases/%cat greet.h
char *greet();

bpte4500s001:/sunbuild1/giri/testcases/%cat greet.c
#include "greet.h"

char *greet()
{
return ((char *) "Hello!");
}

bpte4500s001:/sunbuild1/giri/testcases/%cc -G -o libgreet.so greet.c

bpte4500s001:/sunbuild1/giri/testcases/%ls -l libgreet.so
-rwxrwxr-x 1 build engr 2788 Jan 12 12:21 libgreet.so*

Let's try to call the C function "greet()" from a C++ program

bpte4500s001:/sunbuild1/giri/testcases/%cat mixedcode.cpp
#include <iostream>

extern char *greet();

int main() {
char *greeting = greet();
std::cout << greeting << "\n";
return (0);
}

Note:
The "extern" keyword declares a variable or function and specifies that it has
external linkage i.e., its name is visible from files other than the one in which
it's defined

bpte4500s001:/sunbuild1/giri/testcases/%CC -lgreet mixedcode.cpp
Undefined                       first referenced
 symbol                             in file
char*greet()                        mixedcode.o
ld: fatal: Symbol referencing errors. No output written to a.out

Although the C++ code is linked with the dynamic library "libgreet.so", which holds the
implementation of greet(), linking failed with an undefined symbol error. What
went wrong?

The reason for the link error is that a typical C++ compiler mangles (encodes) some
of the symbols (for example, function names) to support function overloading. So the
symbol "greet" is changed to something else, depending on the mangling algorithm
implemented in the compiler, and the object file does not contain the plain
symbol "greet" anywhere. The symbol table section of the mixedcode.o object
file confirms this. Let's have a look at the symbol tables of libgreet.so and
mixedcode.o:

bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s libgreet.so

Symbol Table Section: .symtab
index value size type bind oth ver shndx name
...
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS libgreet.so
...
[37] 0x00000268 0x00000004 OBJT GLOB D 0 .rodata _lib_version
[38] 0x000102f3 0x00000000 OBJT GLOB D 0 .data1 _edata
[39] 0x00000228 0x00000028 FUNC GLOB D 0 .text greet
[40] 0x0001026c 0x00000000 OBJT GLOB D 0 .dynamic _DYNAMIC

bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s mixedcode.o

Symbol Table Section: .symtab
index value size type bind oth ver shndx name
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS mixedcode.cpp
[2] 0x00000000 0x00000000 SECT LOCL D 0 .rodata
[3] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF
__1cDstd2l6Frn0ANbasic_ostream4Ccn0ALchar_traits4Cc____pkc_2_

[4] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF __1cFgreet6F_pc_
[5] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF __1cDstdEcout_
[6] 0x00000010 0x00000050 FUNC GLOB D 0 .text main
[7] 0x00000000 0x00000000 NOTY GLOB D 0 ABS __fsr_init_value

bpte4500s001:/sunbuild1/giri/testcases/%dem __1cFgreet6F_pc_

__1cFgreet6F_pc_ == char*greet()

char*greet() has been mangled to __1cFgreet6F_pc_ by the Sun Studio 9 C++ compiler.
That's why the static linker (ld) couldn't match the reference in the object file with
the unmangled symbol "greet" in libgreet.so. What's the solution to this problem?

The solution is to disable name mangling for the function, so that we can call
external C functions from C++ code. This can be done by prepending extern "C" to the
declaration of the function to be called from C++ code.

syntax:
extern "C" <function declaration>

Or, if we have more than one C function to be called from C++, put the
function declarations within an extern "C" block

extern "C" {
<function declaration>
<function declaration>
...
<function declaration>
}

The linkage directive extern "C" tells the compiler to inhibit the default encoding
(name mangling) of a function name for a particular function

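Notes:
1) Functions declared as extern "C" cannot be overloaded with one another: at most one function with a given name can have C linkage
2) extern "C" can only appear at namespace (global) scope; it does not apply to class member functions
3) Keep your own extern "C" declarations after the last #include, so that unrelated headers are not accidentally wrapped in an extern "C" block
4) It is possible to apply a linkage directive to all the declarations in a header file. This
is useful if we wish to use a C library, whose header was not written with C++ in mind, in a C++ program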

extern "C" {
#include "mylibrary.h"
}

Please do not use extern "C" when including standard C header files because these
header files already contain extern "C" directives

So let's modify the source of mixedcode.cpp a bit, and recompile the program

bpte4500s001:/sunbuild1/giri/testcases/%cat mixedcode.cpp
#include <iostream>

extern "C" char *greet();

int main() {
char *greeting = greet();
std::cout << greeting << "\n";
return (0);
}

bpte4500s001:/sunbuild1/giri/testcases/%CC -lgreet mixedcode.cpp
bpte4500s001:/sunbuild1/giri/testcases/%./a.out
Hello!

It works!! Let's have a look at the symbol table of mixedcode.o again

bpte4500s001:/sunbuild1/giri/testcases/%CC -c -lgreet mixedcode.cpp
bpte4500s001:/sunbuild1/giri/testcases/%elfdump -s mixedcode.o

Symbol Table Section: .symtab
index value size type bind oth ver shndx name
[0] 0x00000000 0x00000000 NOTY LOCL D 0 UNDEF
[1] 0x00000000 0x00000000 FILE LOCL D 0 ABS mixedcode.cpp
[2] 0x00000000 0x00000000 SECT LOCL D 0 .rodata
[3] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF
__1cDstd2l6Frn0ANbasic_ostream4Ccn0ALchar_traits4Cc____pkc_2_

[4] 0x00000000 0x00000000 FUNC GLOB D 0 UNDEF greet
[5] 0x00000000 0x00000000 NOTY GLOB D 0 UNDEF __1cDstdEcout_
[6] 0x00000010 0x00000050 FUNC GLOB D 0 .text main
[7] 0x00000000 0x00000000 NOTY GLOB D 0 ABS __fsr_init_value

As expected, the function name "greet" was not mangled by the C++ compiler, and hence
the linker could find the symbol in the library and was able to build the executable.

Please note that an extern "C" declaration does not specify everything that must be
done to allow C and C++ to be mixed. Name mangling is commonly part of the problem to
be solved, but it is only a part. There are other issues with mixing languages, and
additional steps are needed to resolve them. For example, on some systems, C and C++
functions are called in different ways (they use different calling conventions). If
the declaration and definition don't match, the program may crash or show abnormal
behavior. For a broader description of these issues and their solutions, please read
"Linkage Specification" in "The C++ Programming Language".

Suggested Reading:
1) "Linkage Specification" of "The C++ Programming Language"
2) C++ name mangling - http://technopark02.blogspot.com/2004/11/c-name-mangling.html



Monday, January 10, 2005
 
Life cycle of a C/C++ program

1) Write the program


eg.,

#include <stdio.h>

int main (int argc, char *argv[]) {
printf ("Hello World");
return 0;
}

2) Compile the program

When the program is compiled, the compiler finds that the current compilation unit (i.e., the simple C program in this case) has no implementation for printf(), and therefore produces an entry in the object file's symbol table marking printf() as an `unresolved reference'. The compiler generates an object file (.o file) if there are no errors in the program

3) Link the object file(s)

The next phase is to link the object files to produce an executable. During linking, the static linker (ld) sees the unresolved reference to printf() and searches the available libraries for an implementation of printf(). In general this will be found in the C library (libc on Solaris). Now, the linker has two options:

* It (the linker) can take the printf() implementation from the library and copy it into the final executable. The linker then searches the printf() implementation for other unresolved references, and again consults the libraries for resolution. This process is repeated until all references to symbols are resolved. This is known as static linking

* If the C library is realized as a `shared library', the linker can simply put a reference to the C library into the final executable. The linker still performs symbol resolution checking as above, to determine if the reference to the printf() function necessitates further references to other (shared or non-shared) libraries. This is known as dynamic linking

4) Run the executable

What happens when you run the executable, depends on whether it was linked statically or dynamically:

* A statically linked executable is self-contained. It is loaded into memory. The entry point, whose designation is system dependent (for example, the `_start' symbol on Solaris), is found and called. This entry function, usually provided by the compiler or a library, performs some setup and initialization and then calls the user-defined main() function, and the instructions inside main() are executed

* In a dynamically linked executable, after the executable binary is loaded into memory, the dynamic linker (ld.so.1) takes control first. It reads the references to dynamic libraries recorded by the static linker, and loads those libraries into memory. It then performs symbol resolution again and updates all references to symbols in the shared libraries to point to their actual locations, which can only be determined at runtime, because the shared libraries might be loaded at different memory locations each time the executable is run

The dynamic linker also has the option to abort the execution if the dynamic linking fails - for example if the shared library has been modified, and a reference known to the static linker isn't available any more



Friday, January 07, 2005
 
2s complement

In general, we (human beings) express negative numbers by placing a minus (-) sign at the left end of the number. Similarly, while representing integers in binary format, we can let the left-most bit be the sign bit. If the left-most bit is a zero, the integer is positive; if it is a one, it is negative. To make it easy to design computers which do integer arithmetic, integers should obey the following rules:

(1) Zero is positive and -0 = 0
(2) The top-most bit should tell us the sign of the integer.
(3) The negative of a negative integer is the original integer, i.e., -(-55) is 55.
(4) x - y should give the same result as x + -y. That is, 8 - 3 should give us the same result as 8 + -3.
(5) Negative and positive numbers shouldn't be treated in different ways when we do multiplication and division with them.

A simple and elegant way to represent integers which obeys these rules is called 2s complement. The 2s complement of an integer is calculated by changing all bits of integer from 1 to 0 & 0 to 1, then adding 1 to the result.

eg., The 2s complement representation of -55 is 1100 1001

0011 0111 <- binary representation of 55 (8-bit)
1100 1000 <- the 1s complement; change 1's to 0's and 0's to 1's
+1
---------
1100 1001 <- the 2s complement (-55)
---------
Now let's calculate -(-55), i.e., 55

1100 1001 <- binary representation of -55 (8-bit)
0011 0110 <- the 1s complement of -55
+1
---------
0011 0111 = 55
---------
The above example verifies rule (3); the rest of the rules can be verified similarly


Thursday, January 06, 2005
 
Binary compatibility

What is it?


"Binary compatibility" (BC) is the ability of one machine to run software that was written for another without having to change or recompile the software

BC of an Operating System:

Binary compatibility of an OS is the ability to run application(s) that were built for one version of OS, on later versions of OS without having to change or rebuild the application; but the same application may not run on earlier versions of the operating system

For example, if a software company wants to ship their product for Solaris 8, 9 & 10 platforms, the product should be built on the earliest version of Solaris ie., Solaris 8, so that it will run on all three versions

BC of a Compiler:

The compiler rule is similar. An old binary can be linked into a program built with a newer compiler. But a binary created by a new compiler cannot be linked into a program built with an older compiler

Eg. Company X creates some libraries using Sun Studio 8 and some with Sun Studio 9. Company X can link all of them into a program built with Sun Studio 10. But a library built with Sun Studio 9 might not work in a program built with Sun Studio 8

Why do we need it?

The reason for the compatibility rules is that software developers are careful to preserve old interfaces in newer releases, but reserve the right to create new interfaces. When a program is built with a newer compiler or on a newer OS, the program might depend on a new interface that is not available with older compilers or older OS versions

Acknowledgements:
Steve Clamage, Sun Microsystems



Wednesday, January 05, 2005
 
Solaris: Tips

* To approximate the amount of memory being used by the kernel (the 8192 below assumes 8 KB pages, as on SPARC):
kstat | grep pp_kernel | awk '{ print ($2*8192)/(1024*1024), "MB"; }'

* Kernel memory usage breakdown:
echo "::memstat" | mdb -k (as "root")

Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     122064               953   12%
Anon                       147000              1148   15%
Exec and libs               14329               111    1%
Page cache                  54188               423    5%
Free (cachelist)           547198              4274   55%
Free (freelist)            115649               903   12%

Total                     1000428              7815

* Using alternative ld.so (on Solaris 9 or later versions):
LD_LIBRARY_PATH=/path/as/needed /path/to/alternative/ld.so.1 /path/to/executable

* To force 64-bit linker use:
export LD_ALTEXEC=/usr/ccs/bin/sparcv9/ld

* Stack trace using mdb:
echo '::stack'|mdb core

* Use prstat -m to determine whether the system has a memory shortage. Scan the processes and check the amount of time they spend waiting on page-ins; this is reported as the data fault microstate, shown in the DFL column of prstat -m. As a rule of thumb, if processes are spending more than 5% of their wall clock time in data faults, there is a memory shortage. This method gives a true indication of how much slower a process is running due to memory pressure


