Mandalika's scratchpad [ Work blog @Oracle | Stock Market Notes | My Music Compositions ]



Sunday, March 26, 2006
 
SAXParseException: Element "web-app" does not allow "sometag" here

A couple of days back, I was asked to look into a web server issue at one of our partner sites. According to them, they had packaged and deployed the web application per the instructions in the Sun Java System Web Server (aka iPlanet Web Server) documentation -- yet they couldn't access their application from a web browser. They gave me a clue: they had noticed the following error during web server startup:

[23/Mar/2006:04:07:29] failure (11038): WEB4220: The web application [/mywebapp] is unavailable because of errors during startup. Please check the logs for errors

The first thing I did was check the actual log file (<webserver_root>/<server_instance>/logs/errors) for a more detailed error message, and found the one I was looking for:
[23/Mar/2006:04:07:26] info (10896): WEB0100: Loading web module in virtual server [https-v490s001] at [/mywebapp]
[23/Mar/2006:04:07:26] info (10896): WEB0100: Loading web module in virtual server [https-v490s001] at [/search]
[23/Mar/2006:04:07:28] info (10896): CORE3282: stdout: PARSE error at line 27 column -1
[23/Mar/2006:04:07:28] info (10896): CORE3282: stdout: org.xml.sax.SAXParseException: Element "web-app" does not allow
"mime-mapping" here.

[23/Mar/2006:04:07:28] failure (10896): ContextConfig[/mywebapp] WEB3524: Parse error in application web.xml
It clearly says that the problem is with the mime-mapping tag in mywebapp's web.xml file. The last few lines of web.xml looked like this:
        ...
...
<welcome-file-list>
<welcome-file>default.jsp</welcome-file>
</welcome-file-list>
<mime-mapping>
<extension>xsd</extension>
<mime-type>text/xml</mime-type>
</mime-mapping>
</web-app>
The real problem is the order of the welcome-file-list and mime-mapping tags in web.xml: the mime-mapping tag must appear before welcome-file-list. Swapping the welcome-file-list and mime-mapping tags fixed the issue, and the web application is now accessible from a web browser.
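With the two elements swapped into the order the deployment descriptor expects, the tail of web.xml would look like this (a minimal sketch based on the fragment above):

```xml
<mime-mapping>
    <extension>xsd</extension>
    <mime-type>text/xml</mime-type>
</mime-mapping>
<welcome-file-list>
    <welcome-file>default.jsp</welcome-file>
</welcome-file-list>
</web-app>
```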

The key is knowing the required order of the tags we define in web.xml. Every web.xml file must conform to the XML DTD for a Servlet web-app (war) module in order for the web server to load the application properly. The last published DTD is available on Sun Microsystems' web site at: XML DTD for a Servlet 2.3 web-app (war) module. The following piece of the DTD helped me resolve the issue:
<!--
The web-app element is the root of the deployment descriptor for
a web application.
-->
<!ELEMENT web-app (icon?, display-name?, description?, distributable?,
context-param*, filter*, filter-mapping*, listener*, servlet*,
servlet-mapping*, session-config?, mime-mapping*, welcome-file-list?,
error-page*, taglib*, resource-env-ref*, resource-ref*, security-constraint*,
login-config?, security-role*, env-entry*, ejb-ref*, ejb-local-ref*)>


Wednesday, March 22, 2006
 
Updated C/C++ articles on Sun Developer Network (SDN)

In an effort to clean up outdated content, the SDN/Sun Studio team had all the published articles reviewed one more time. Since two of the articles are under my name, they forwarded the new feedback and asked me to make the changes as I saw fit. The updated content is live now, and is available at the following URLs:
  1. Mixed-Language Programming and External Linkage
    Thanks to Lawrence Crowl (Sun) for the corrections; and also for suggesting a better solution for the example.


  2. Reducing Symbol Scope with Sun Studio C/C++
    Thanks to Mukesh Kapoor (Sun) for reading this lengthy article, and catching an unnoticed error in one of the programming examples.



Saturday, March 18, 2006
 
Solaris: Better scalability with libumem

Scalability issues with standard memory allocator

It is a known fact that multi-threaded applications do not scale well with the standard memory allocator, because the heap is a bottleneck. When multiple threads simultaneously allocate or de-allocate memory, the allocator serializes them. Therefore, as more threads are added, more threads wait on the allocator, and the wait time grows longer, resulting in increasingly slower execution times. Due to this behavior, programs that make intensive use of the allocator can actually slow down as the number of processors increases. Hence the standard malloc works well in single-threaded applications, but poses serious scalability issues for multi-threaded applications running on multi-processor (SMP) servers.

Solution: libumem, a userland slab allocator

Sun started shipping libumem, a userland slab (memory) allocator, with Solaris 9 Update 3. libumem provides faster and more efficient memory allocation by using an object caching mechanism. Object caching is a strategy in which frequently allocated and freed memory is cached, so the overhead of re-creating the same data structure(s) is reduced considerably. In addition, per-CPU sets of caches (called magazines) improve the scalability of libumem by allowing a far less contentious locking scheme when requesting memory from the system. Due to this object caching strategy, the application runs faster, with lower lock contention among multiple threads.

libumem satisfies small requests from a set of fixed-size caches. That means, if a request is made to allocate 20 bytes, libumem rounds it up to the nearest cache buffer size (24 bytes on the SPARC platform) and returns a pointer to the allocated block. As these requests add up, the extra memory that was not requested by the application but was allocated by libumem is wasted, leading to internal fragmentation. Also, libumem uses 8 bytes of every buffer it creates to keep metadata about that buffer. For these reasons, there will be a slight increase in the per-process memory footprint.

More interesting information about libumem can be found in the article Magazines and Vmem: Extending the Slab Allocator to Many CPUs and Arbitrary Resources.

libumem can also be used to catch memory management bugs in an application, such as memory leaks and heap corruption. The article Identifying Memory Management Bugs Within Applications Using the libumem Library has detailed steps for catching memory management bugs, with lucid explanations and examples.

Quick tip:
Run "truss -c -p <pid>", and stop the data collection with Ctrl-C (^C) after some time, say 60 seconds. If you see a large number of system calls such as lwp_park, lwp_unpark, and lwp_mutex_timedlock, it is an indication that the application is suffering from lock contention, and hence may not scale well. Consider linking your application with the libumem library, or pre-loading libumem at run-time, for better scalability.
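Either approach is a one-liner on Solaris. A sketch, where myserver stands in for your application binary (a hypothetical name):

```shell
# Option 1: link the application with libumem at build time
cc -o myserver myserver.c -lumem

# Option 2: interpose libumem on an existing 32-bit binary at run-time
LD_PRELOAD=libumem.so.1 ./myserver

# For 64-bit binaries, use the 64-bit preload variable instead
LD_PRELOAD_64=libumem.so.1 ./myserver
```

The run-time interposition route needs no rebuild, which makes it convenient for a quick before/after scalability comparison.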



Tuesday, March 14, 2006
 
Solaris: DTrace script for getting call stacks

The following DTrace script really helped me nail down a lock contention issue I was looking into at work. This simple script records all the call stacks, up to 60 frames deep, whenever a call is made to the lwp_*() system calls, explicitly or implicitly. At the end (i.e., when we press ^C), it dumps all the stack traces along with the number of times each call stack was executed. It also prints the duration (in seconds) for which the data was collected, and the IDs of the active LWPs.
% cat lwp.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        start = timestamp;
}

syscall:::entry
/execname == "execname"/
{
        @s[probefunc] = count();
}

syscall::lwp_*:entry
/execname == "execname"/
{
        @c[curthread->t_tid] = count();
        @st[ustack(60)] = count();
}

END
{
        printf("Ran for %d seconds\n\n", (timestamp - start) / 1000000000);

        trunc(@s, 5);
        printa(@s);

        printf("\n%-10s %-10s\n", "LWP ID", "COUNT");
        printa("%-10d %@d\n", @c);

        printa(@st);
}
This script can easily be modified to obtain the call stacks for any kind of system call, by replacing "lwp_*" with the actual function name. Also, "execname" has to be replaced with the name of the process being examined.
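Assuming the script is saved as lwp.d and "execname" has been replaced with the target process name, it can be run like this (as root, since DTrace requires elevated privileges; the collected data prints after ^C):

```shell
# make the script executable and run it via its #! interpreter line
chmod +x lwp.d
./lwp.d

# equivalently, pass the script to dtrace directly
dtrace -s lwp.d
```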



Friday, March 03, 2006
 
Solaris: Resource Controls - Physical Memory

Solaris Zones: Resource Controls - CPU explains the steps to control CPU resources on any server running Solaris 10 or later. It is also possible to restrict the physical memory usage of a process, or of all processes owned by a user. This can be done either in a local zone or in the global zone on Solaris 10 and later. Note that Solaris 9 and later versions can be used for capping physical memory.

The goal of this blog entry is to show the simple steps for restricting the total physical memory utilization of all processes owned by a user called giri to 2G (total physical memory installed: 8G), in a local zone called v1280appserv.
 v1280appserv:/% prtconf | grep Mem
 prtconf: devinfo facility not available
 Memory size: 8192 Megabytes

To achieve the physical memory cap, we start by creating a project for the user giri. A project is a grouping of processes that are subject to a set of constraints. To define the physical memory cap for a project, add the rcap.max-rss attribute to the newly created project. rcap.max-rss indicates the total amount of physical memory, in bytes, that is available to all processes in the project. Project creation and establishing the physical memory cap can be combined into one simple step, as shown below:
 % projadd -c "App Serv - Restrict the physical memory usage to 2G" -K "rcap.max-rss=2147483648" \
     -U giri appservproj
where: appservproj is the name of the project.

This appends an entry to the /etc/project file.
 % cat /etc/project
 system:0::::
 user.root:1::::
 ...
 appservproj:100:App Serv - Restrict the physical memory usage to 2G:giri::rcap.max-rss=2147483648

The -l option of projects lists all the configured projects, along with detailed information about each project.
 % projects -l
 system
  projid : 0
  comment: ""
  users : (none)
  groups : (none)
  attribs:
 user.root
  projid : 1
  comment: ""
  users : (none)
  groups : (none)
  attribs:
 ...
 ...
 appservproj
  projid : 100
  comment: "App Serv - Restrict the physical memory usage to 2G"
  users : giri
  groups : (none)
  attribs: rcap.max-rss=2147483648

Now associate the project appservproj with user giri, by appending the following line to the /etc/user_attr file:
        giri::::project=appservproj
 % cat /etc/user_attr
 ...
 adm::::profiles=Log Management
 lp::::profiles=Printer Management
 root::::auths=solaris.*,solaris.grant;profiles=Web Console Management,All;lock_after_retries=no
 giri::::project=appservproj
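To verify that the association took effect, check giri's default project with id -p in a fresh login session, or launch a process under the project explicitly with newtask (both are standard Solaris commands; startappserver is a hypothetical launcher name):

```shell
# show the default project for the current user (run as giri)
id -p

# or start a process under the project explicitly
newtask -p appservproj ./startappserver
```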

Finally, enable the resource capping daemon, rcapd, if it is not already running. The rcapd daemon enforces resource caps on collections of processes; it supports per-project physical memory caps, which is what we need.
 % ps -ef | grep rcapd
  root 21160 21097 0 18:10:36 pts/4 0:00 grep rcapd

 % rcapadm -E

 % pgrep -l rcapd
 21164 rcapd

That's about it. When the resident set size (RSS) of the collection of processes owned by user giri exceeds its cap, rcapd takes action and reduces the total RSS of the collection to 2G; the excess memory is paged out to the swap device. The following run-time statistics indicate that the physical memory cap is effective -- observe the total RSS under project appservproj, and the paging activity in the vmstat output.
 % prstat -J
  PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
  21584 giri 555M 381M sleep 59 0 0:01:19 7.4% siebmtshmw/73
  21580 giri 519M 391M sleep 59 0 0:01:16 5.7% siebmtshmw/73
  21576 giri 547M 372M sleep 59 0 0:01:19 5.7% siebmtshmw/73
  21591 giri 519M 372M sleep 59 0 0:01:14 3.5% siebmtshmw/75
  21565 giri 209M 119M sleep 59 0 0:00:16 0.4% siebprocmw/9
  21549 giri 5560K 3080K sleep 49 0 0:00:00 0.1% prstat/1
  21620 giri 4776K 3728K cpu1 59 0 0:00:00 0.1% prstat/1
  21564 giri 162M 111M sleep 59 0 0:00:07 0.1% siebmtsh/10
  ...
  ...

 PROJID NPROC SIZE RSS MEMORY TIME CPU PROJECT
  100 14 3232M 2052M 26% 0:06:20 23% appservproj
  3 8 62M 28M 0.3% 0:01:08 0.1% default

 Total: 22 processes, 396 lwps, load averages: 1.56, 1.07, 0.63

 % vmstat 2
  kthr memory page disk faults cpu
  r b w swap free re mf pi po fr de sr s0 s1 s3 -- in sy cs us sy id
  0 0 0 16985488 6003688 3 6 8 5 4 0 0 1 0 0 0 340 280 230 4 1 96
  0 0 0 14698304 3809408 38 556 6523 0 0 0 0 643 164 0 0 6075 4301 4899 78 12 10
  6 0 0 14691624 3792080 25 604 5922 0 0 0 0 573 168 0 0 5726 5451 3269 93 6 0
  11 0 0 14680984 3770464 360 831 6316 0 0 0 0 882 191 0 0 7276 4352 3010 75 11 15
  7 0 0 14670192 3765472 211 747 5725 0 0 0 0 865 178 0 0 7428 4349 3628 73 13 14
  13 0 0 14663552 3778280 8 300 1493 0 0 0 0 2793 101 0 0 16703 4485 2418 68 7 25
  14 0 0 14659352 3825832 12 154 983 0 0 0 0 3202 104 0 0 18664 4147 2208 56 4 40
  20 0 0 14650432 3865952 4 157 1009 0 0 0 0 3274 116 0 0 19295 4742 1984 70 6 25
  6 0 0 14644240 3909936 2 119 858 0 0 0 0 3130 81 0 0 18528 3691 2025 54 5 42
  18 0 0 14637752 3953560 1 121 662 0 0 0 0 3284 70 0 0 18475 5327 2297 95 5 0

The rcapd daemon can be monitored with the rcapstat tool.
 % rcapstat
  id project nproc vm rss cap at avgat pg avgpg
  ...
  ...
  100 appservproj 14 2603M 1962M 2048M 0K 0K 0K 0K
  100 appservproj 14 2637M 1996M 2048M 0K 0K 0K 0K
  100 appservproj 14 2645M 2005M 2048M 0K 0K 0K 0K
  100 appservproj 14 2686M 2042M 2048M 0K 0K 0K 0K
  100 appservproj 14 2706M 2063M 2048M 24K 0K 24K 0K
  id project nproc vm rss cap at avgat pg avgpg
  100 appservproj 14 2731M 2071M 2048M 61M 0K 38M 0K
  100 appservproj 14 2739M 2001M 2048M 0K 0K 0K 0K
  100 appservproj 14 2751M 2016M 2048M 0K 0K 0K 0K
  100 appservproj 14 2771M 2036M 2048M 0K 0K 0K 0K
  100 appservproj 14 2783M 2049M 2048M 880K 0K 744K 0K
  100 appservproj 14 2796M 2054M 2048M 15M 0K 6576K 0K
  100 appservproj 14 2824M 2030M 2048M 0K 0K 0K 0K
  100 appservproj 14 2832M 2047M 2048M 0K 0K 0K 0K
  100 appservproj 14 2875M 2090M 2048M 33M 0K 21M 0K
  100 appservproj 14 2895M 1957M 2048M 21M 0K 21M 0K
  100 appservproj 14 2913M 1982M 2048M 0K 0K 0K 0K
  100 appservproj 14 2951M 2040M 2048M 0K 0K 0K 0K
  100 appservproj 14 2983M 2081M 2048M 20M 0K 1064K 0K
  100 appservproj 14 2996M 2030M 2048M 55M 0K 33M 0K
  100 appservproj 14 3013M 2052M 2048M 4208K 0K 8184K 0K
  100 appservproj 14 3051M 2100M 2048M 52M 0K 56M 0K
  100 appservproj 14 3051M 2100M 2048M 0K 0K 0K 0K
  100 appservproj 14 3064M 2078M 2048M 30M 0K 36M 0K
  100 appservproj 14 3081M 2099M 2048M 51M 0K 56M 0K
  100 appservproj 14 3119M 2140M 2048M 52M 0K 48M 0K
  ...
  ...

 % rcapstat -g
  id project nproc vm rss cap at avgat pg avgpg
  100 appservproj 14 3368M 2146M 2048M 842M 0K 692M 0K
 physical memory utilization: 50% cap enforcement threshold: 0%
  100 appservproj 14 3368M 2146M 2048M 0K 0K 0K 0K
 physical memory utilization: 50% cap enforcement threshold: 0%
  100 appservproj 14 3368M 2146M 2048M 0K 0K 0K 0K
 physical memory utilization: 50% cap enforcement threshold: 0%
  100 appservproj 14 3380M 2096M 2048M 48M 0K 44M 0K
  ...

To disable the rcapd daemon, run the following command:
 % rcapadm -D

For more information and examples, see:
  1. System Administration Guide: Solaris Containers-Resource Management and Solaris Zones
  2. Brendan Gregg's Memory Resource Control demos



