Consider the following real world scenario (copied from an e-mail message that I got from one of our partners):
[...]
one process is waiting for a very long time in a send
system call. It is sending on a valid fd but the question we have is that, is there any way to find who is on the other end of that fd? We want to know to which process is that message being sent to. [...]
Here is how I proceed in finding the other end of the socket, and the state of the socket connection with Mozilla's Thunderbird mail client in one end of the socket connection:
- Get the process id of the application
% prstat
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
22385 mandalik 180M 64M sleep 49 0 0:05:15 0.1% thunderbird-bin/5
- Run
pfiles
on the pid - it prints a list of open files including open sockets (pfiles
is the Solaris supported equivalent of lsof
utility).
% pfiles 22385
22385: /usr/lib/thunderbird/thunderbird-bin -UILocale C -contentLocale C
Current rlimit: 512 file descriptors
...
...
33: S_IFSOCK mode:0666 dev:280,0 ino:31544 uid:0 gid:0 size:0
O_RDWR|O_NONBLOCK
SOCK_STREAM
SO_SNDBUF(49152),SO_RCVBUF(49640)
sockname: AF_INET 192.168.1.2 port: 60364
peername: AF_INET 192.18.39.10 port: 993
...
...
- Locate the socket id and the corresponding sockname/port#, peername/port# in the output of
pfiles pid
(see step #2).
Here my assumption is that I know the socket id I'm interested in. In the above output, 33 is the socket id. One end of the socket is bound to port 60364 on the local host 192.168.1.2; and the other end of the socket is bound to port 993 on the remote host 192.18.39.10.
- Run
netstat -a | egrep "<sockport#>|<peerport#>
(get the port numbers from step 3); and check the state of the socket connection. If you see anything other than ESTABLISHED
, it indicates trouble.
% netstat -a | egrep "60364|993"
solaris-devx-iprb0.60364 mail-sfbay.sun.com.993 48460 0 49640 0 ESTABLISHED
If you want to see the host names in numbers (IP addresses), run netstat
with option -n
.
% netstat -an | egrep "60364|993"
192.168.1.2.60364 192.18.39.10.993 49559 0 49640 0 ESTABLISHED
Now since we know both ends of the socket, we can easily get the state of the socket connection at the other end by running netstat -an | egrep '<localport#>|<remoteport#>
.
If the state of the socket connection is CLOSE_WAIT
, have a look at the following diagnosis: CPU hog with connections in CLOSE_WAIT.
Finally to answer
... which process is that message being sent to ... part of the original question:
Follow the above steps and find the remote host (or IP) and remote port number. To find the corresponding process id on the remote machine to which the other half of the socket belongs to, do the following:
- Login as
root
user on the remote host.
-
cd /proc
- Run
pfiles * | egrep "^[0-9]|sockname" > /tmp/pfiles.txt
.
-
vi /tmp/pfiles.txt
and search for the port number. If you scroll few lines up, you can see the process ID, name of the process along with its argument(s).
_____________________
Technorati tags:
Solaris |
OpenSolaris |
Networking
Here are the deals on GPS Navigation devices at verious stores on Black Friday (Day After ThanksGiving: Nov 23, 2007). Listed in a table for your convenience.
MIR in Misc Info Column = Mail-In Rebate
Store Name |
Description of the Item |
Price |
Misc Info |
Circuit City | MIO Portable Navigation GPS(MIO) | $99.99 | |
Linens & Things | Amcor A3900 GPS Navigation System (Amcor) | $99.99 | MIR |
Office Depot | Maxx Digital Portable GPS System (Maxx Digital) | $99.99 | |
Best Buy | TomTom ONE LE GPS(TomTom) | $119.99 | |
Circuit City | Magellan Roadmate 1200 Portable GPS(Magellan) | $124.99 | |
Staples | TomTom ONE 3rd Edition Portable GPS(TomTom) | $124.99 | |
WalMart | Garmin StreetPilot C330 GPS(Garmin) | $128.88 | |
Kohl's | GPS Navigation System(Unknown) | $129.99 | |
Office Depot | Tom Tom ONE 3.5" Portable GPS (TomTom) | $129.99 | |
Radio Shack | Megellan Maestro 3100 GPS(Magellan) | $129.99 | |
Radio Shack | TomTom ONE Third Edition GPS(TomTom) | $139.99 | |
CompUSA-TG | Magellan Maestro 3100 GPS Navigation System (Magellan) | $147.99 | |
Target | Magellan Maestro 3100 Auto GPS(Magellan) | $149.00 | |
CompUSA | TomTom One 3rd Edition GPS Navigation System (TomTom) | $149.99 | |
Costco.com | Magellan Triton 500 Handheld Outdoor GPS(Magellan) | $149.99 | |
Radio Shack | Navigon 2100 GPS w/ Traffic Service(Navigon) | $149.99 | |
Radio Shack | Mio C320 GPS(MIO) | $149.99 | |
JC Penney | Invion 3.5" Text To Voice Touchscreen GPS System(Invion) | $169.88 | |
Best Buy | Garmin Nuvi 200 GPS(Garmin) | $169.99 | |
Circuit City | Garmin GPS(Garmin) | $199.99 | |
Toys R Us | Magellan Maestro 4000 GPS(Magellan) | $219.99 | |
CompUSA | Garmin nuvi 200W GPS Navigation System (Garmin) | $249.99 | |
Office Max | Magellan Maestro 4000 GPS Auto Navigation System(Magellan) | $249.99 | |
Sears | Magellan GPS Maestro 4000(Magellan) | $249.99 | |
Office Depot | Tom Tom XLS 4.3" Portable GPS (TomTom) | $259.99 | |
Circuit City | Garmin GPS Navigation w/ Slim Profile(Garmin) | $299.99 | |
CompUSA | Magellan Maestro 4210 GPS Navigation System (Magellan) | $299.99 | |
Radio Shack | Magellan Maestro 4040 GPS(Magellan) | $299.99 | |
Best Buy | Garmin StreetPilot C550 GPS(Garmin) | $329.99 | |
Linens & Things | Garmin nuvi 200W Voice-Prompted GPS System (Garmin) | $349.99 | |
Best Buy | Garmin nuvi 660 GPS(Garmin) | $399.99 | |
CompUSA | HP iPAQ 310 Travel Companion GPS System (HP) | $399.99 | |
CompUSA | Garmin StreetPilot C580 GPS Navigation System (Garmin) | $399.99 | |
Office Depot | Tom Tom 100 4.3" Portable GPS (TomTom) | $449.99 | |
Labels: best, black friday, buy, deal, friday, Garmin, GPS, Magellan, Navigation, sale, thanksgiving sale, TomTom
First things first - Oracle 11g Release 1 for Solaris/SPARC is available now; and can be downloaded from
here.
In some Siebel 8.0 environments where Oracle database is being used, customers might be noticing intermittent Siebel object manager crashes under high loads when the work is actively being done by tons of LWPs with less number of object managers. Usually the call stack might look something like:
/export/home/siebel/siebsrvr/lib/libsslcosd.so:0x4ad24
/lib/libc.so.1:0xc5924
/lib/libc.so.1:0xba868
/export/home/oracle/lib32/libclntsh.so.10.1:kpuhhfre+0x8d0 [ Signal 11 (SEGV)]
/export/home/oracle/lib32/libclntsh.so.10.1:kpufhndl0+0x35ac
/export/home/siebel/siebsrvr/lib/libsscdo90.so:0x3140c
/export/home/siebel/siebsrvr/lib/libsscfdm.so:0x677944
/export/home/siebel/siebsrvr/lib/libsscfdm.so:__1cQCSSLockSqlCursorGDelete6M_v_+0x264
/export/home/siebel/siebsrvr/lib/libsscfdm.so:__1cJCSSDbConnOCacheSqlCursor6MpnMCSSSqlCursor__I_+0x734
/export/home/siebel/siebsrvr/lib/libsscfdm.so:__1cJCSSDbConnRTryCacheSqlCursor6MpnMCSSSqlCursor_b_I_+0xcc
/export/home/siebel/siebsrvr/lib/libsscfdm.so:__1cJCSSSqlObjOCacheSqlCursor6MpnMCSSSqlCursor_b_I_+0x6e4
/export/home/siebel/siebsrvr/lib/libsscfdm.so:__1cJCSSSqlObjHRelease6Mb_v_+0x4d0
/export/home/siebel/siebsrvr/lib/libsscfom.so:__1cKCSSBusComp2T5B6M_v_+0x790
/export/home/siebel/siebsrvr/lib/libsscacmbc.so:__1cJCSSBCBase2T5B6M_v_+0x640
/export/home/siebel/siebsrvr/lib/libsscasabc.so:0x19275c
/export/home/siebel/siebsrvr/lib/libsscfom.so:0x318774
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:__1cWCSSSWEFrameMgrInternalPReleaseBOScreen6M_v_+0x10c
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cWCSSSWEFrameMgrInternalJBuildView6MpkHnLBUILDSCREEN_pnSCSSSWEViewBookmark_ipnJCSSBusObj_pnPCSSDrilldownDef_p1pI_I_+0x18b0
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cWCSSSWEFrameMgrInternalJBuildView6MrpnKCSSSWEView_pkHnLBUILDSCREEN_pnSCSSSWEViewBookmark_ipnJCSSBusObj_pnPCSSDrilldownDef_p4_I_+0x50
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cPCSSSWEActionMgrUActionBuildViewAsync6MpnbACSSSWEActionBuildViewAsync_pnQCSSSWEHtmlStream_pnNWWEReqModInfo_rpnJWWECbInfo__I_+0x38c
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cPCSSSWEActionMgrODoPostedAction6MpnQCSSSWEHtmlStream_pnNWWEReqModInfo_rpnJWWECbInfo__I_+0x104
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cPCSSSWEActionMgrSCheckPostedActions6MpnQCSSSWEHtmlStream_pnKCSSSWEArgs_pnNWWEReqModInfo_rpnJWWECbInfo_ri_I_+0x378
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cWCSSSWEFrameMgrInternalSInvokeAppletMethod6MpnQCSSSWEHtmlStream_pnKCSSSWEArgs_pnNWWEReqModInfo_rpnJWWECbInfo_rnOCSSStringArray__I_+0x12f8
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cSCSSSWECmdProcessorMInvokeMethod6MpnQCSSSWEHtmlStream_pnKCSSSWEArgs_pnNWWEReqModInfo_rpnJWWECbInfo__I_+0x88c
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cSCSSSWECmdProcessorP_ProcessCommand6MpnQCSSSWEHtmlStream_pnNWWEReqModInfo_rpnJWWECbInfo__I_+0x860
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cSCSSSWECmdProcessorOProcessCommand6MpnUCSSSWEGenericRequest_pnVCSSSWEGenericResponse_rpnNWWEReqModInfo_rpnJWWECbInfo__I_+0x9bc
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:
__1cSCSSSWECmdProcessorOProcessCommand6MpnRCSSSWEHttpRequest_pnSCSSSWEHttpResponse_rpnJWWECbInfo__I_+0xd8
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:__1cSCSSServiceSWEIfaceHRequest6MpnMCSSSWEReqRec_pnRCSSSWEResponseRec__I_+0x404
/export/home/siebel/siebsrvr/lib/libsscaswbc.so:__1cSCSSServiceSWEIfaceODoInvokeMethod6MpkHrknOCCFPropertySet_r3_I_+0xa80
/export/home/siebel/siebsrvr/lib/libsscfom.so:__1cKCSSServiceMInvokeMethod6MpkHrknOCCFPropertySet_r3_I_+0x244
/export/home/siebel/siebsrvr/lib/libsstcsiom.so:__1cOCSSSIOMSessionTModInvokeSrvcMethod6MpkHp13rnISSstring__i_+0x124
/export/home/siebel/siebsrvr/lib/libsstcsiom.so:
__1cOCSSSIOMSessionMRPCMiscModel6MnMSISOMRPCCode_nMSISOMArgType_LpnSCSSSISOMRPCArgList_4ripv_i_+0x5bc
/export/home/siebel/siebsrvr/lib/libsstcsiom.so:
__1cOCSSSIOMSessionJHandleRPC6MnMSISOMRPCCode_nMSISOMArgType_LpnSCSSSISOMRPCArgList_4ripv_i_+0xb98
/export/home/siebel/siebsrvr/lib/libsssasos.so:__1cJCSSClientLHandleOMRPC6MpnMCSSClientReq__I_+0x78
/export/home/siebel/siebsrvr/lib/libsssasos.so:__1cJCSSClientNHandleRequest6MpnMCSSClientReq__I_+0x2f8
/export/home/siebel/siebsrvr/lib/libsssasos.so:0x8c2c4
/export/home/siebel/siebsrvr/lib/libsssasos.so:__1cLSOMMTServerQSessionHandleMsg6MpnJsmiSisReq__i_+0x1bc
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cNsmiMainThreadUCompSessionHandleMsg6MpnJsmiSisReq__i_+0x16c
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cM_smiMessageQdDOProcessMessage6MpnM_smiMsgQdDItem_li_i_+0x93c
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cM_smiMessageQdDOProcessRequest6Fpv1r1_i_+0x244
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cN_smiWorkQdDueuePProcessWorkItem6Mpv1r1_i_+0xd4
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cN_smiWorkQdDueueKWorkerTask6Fpv_i_+0x300
/export/home/siebel/siebsrvr/bin/siebmtshmw:__1cQSmiThrdEntryFunc6Fpv_i_+0x46c
/export/home/siebel/siebsrvr/lib/libsslcosd.so:0x5be88
/export/home/siebel/siebsrvr/mw/lib/libmfc400su.so:__1cP_AfxThreadEntry6Fpv_I_+0x100
/export/home/siebel/siebsrvr/mw/lib/libkernel32.so:__1cIMwThread6Fpv_v_+0x23c
/lib/libc.so.1:0xc57f8
Setting the Siebel environment variable
SIEBEL_STDERROUT to 1 shows the following heap dump in StdErrOut directory under Siebel enterprise logs directory.
% more stderrout_7762_23511113.txt
********** Internal heap ERROR 17112 addr=35dddae8 *********
***** Dump of memory around addr 35dddae8:
35DDCAE0 00000000 00000000 [........]
35DDCAF0 00000000 00000000 00000000 00000000 [................]
Repeat 243 times
35DDDA30 00000000 00000000 00003181 00300020 [..........1..0. ]
35DDDA40 0949D95C 35D7A888 10003179 00000000 [.I.\5.....1y....]
35DDDA50 0949D95C 0949D8B8 35D7A89C C0000075 [.I.\.I..5......u]
...
...
******************************************************
HEAP DUMP heap name="Alloc environm" desc=949d8b8
extent sz=0x1024 alt=32767 het=32767 rec=0 flg=3 opc=2
parent=949d95c owner=0 nex=0 xsz=0x1038
EXTENT 0 addr=364fb324
Chunk 364fb32c sz= 4144 free " "
EXTENT 1 addr=364f8ebc
Chunk 364f8ec4 sz= 4144 free " "
EXTENT 2 addr=364f7d5c
Chunk 364f7d64 sz= 4144 free " "
EXTENT 3 addr=364f6d04
Chunk 364f6d0c sz= 4144 recreate "Alloc statemen " latch=0
ds 2c38df34 sz= 4144 ct= 1
...
...
EXTENT 406 addr=35ddda54
Chunk 35ddda5c sz= 116 free " "
Chunk 35dddad0 sz= 24 BAD MAGIC NUMBER IN NEXT CHUNK (6)
freeable assoc with mark prv=0 nxt=0
Dump of memory from 0x35DDDAD0 to 0x35DDEAE8
35DDDAD0 20000019 35DDDA5C 00000000 00000000 [ ...5..\........]
35DDDAE0 00000095 0000008B 00000006 35DDDAD0 [............5...]
35DDDAF0 00000000 00000000 00000095 35DDDB10 [............5...]
...
...
EXTENT 2080 addr=d067a6c
Chunk d067a74 sz= 2220 freeable "Alloc statemen " ds=2b0fffe4
Chunk d068320 sz= 1384 freeable assoc with mark prv=0 nxt=0
Chunk d068888 sz= 4144 freeable "Alloc statemen " ds=2b174550
Chunk d0698b8 sz= 4144 recreate "Alloc statemen " latch=0
ds 1142ea34 sz= 112220 ct= 147
223784cc sz= 4144
240ea014 sz= 884
28eac1bc sz= 900
2956df7c sz= 900
1ae38c34 sz= 612
92adaa4 sz= 884
2f6b96ac sz= 640
c797bc4 sz= 668
2965dde4 sz= 912
1cf6ad4c sz= 656
10afa5e4 sz= 656
2f6732bc sz= 700
27cb3964 sz= 716
1b91c1fc sz= 584
a7c28ac sz= 884
169ac284 sz= 900
...
...
Chunk 2ec307c8 sz= 12432 free " "
Chunk 3140a3f4 sz= 4144 free " "
Chunk 31406294 sz= 4144 free " "
Bucket 6 size=16400
Bucket 7 size=32784
Total free space = 340784
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
Chunk 949f3c8 sz= 100 perm "perm " alo=100
Permanent space = 100
******************************************************
Hla: 255
ORA-21500: internal error code, arguments: [17112], [0x35DDDAE8], [], [], [], [], [], []
Errors in file :
ORA-21500: internal error code, arguments: [17112], [0x35DDDAE8], [], [], [], [], [], []
----- Call Stack Trace -----
NOTE: +offset is used to represent that the
function being called is offset bytes from
the _PROCEDURE_LINKAGE_TABLE_.
calling call entry argument values in hex
location type point (? means dubious value)
-------------------- -------- -------------------- ----------------------------
F2365738 CALL +23052 D7974250 ? D797345C ? DD20 ?
D79741AC ? D79735F8 ?
F2ECD7A0 ?
F286DDB8 PTR_CALL 00000000 949A018 ? 14688 ? B680B0 ?
F2365794 ? F2ECD7A0 ? 14400 ?
F286E18C CALL +77460 949A018 ? 0 ? F2F0E8D4 ?
1004 ? 1000 ? 1000 ?
F286DFF8 CALL +66708 949A018 ? 0 ? 42D8 ? 1 ?
D79743E0 ? 949E594 ?
...
...
__1cN_smiWorkQdDueu CALL __1cN_smiWorkQdDueu 1C8F608 ? 18F55F38 ?
30A5A008 ? 1A98E208 ?
FDBDF178 ? FDBE0424 ?
__1cQSmiThrdEntryFu PTR_CALL 00000000 1C8F608 ? FDBE0424 ?
1AB6EDE0 ? FDBDF178 ? 0 ?
1500E0 ?
__1cROSDWslThreadSt PTR_CALL 00000000 1ABE8250 ? 600140 ? 600141 ?
105F76E8 ? 0 ? 1AC74864 ?
__1cP_AfxThreadEntr PTR_CALL 00000000 0 ? FF30A420 ? 203560 ?
1AC05AF0 ? E2 ? 1AC05A30 ?
__1cIMwThread6Fpv_v PTR_CALL 00000000 D7A7DF6C ? 17F831E0 ? 0 ? 1 ?
0 ? 17289C ?
_lwp_start()+0 ? 00000000 1 ? 0 ? 1 ? 0 ? FCE6C710 ?
1AC72680 ?
----- Argument/Register Address Dump -----
Argument/Register addr=d7974250. Dump of memory from 0xD7974210 to 0xD7974350
D7974200 0949A018 00014688 00B680B0 F2365794
D7974220 F2ECD7A0 00014400 D7974260 F286DDB8 800053FC 80000000 00000002 D79743E0
D7974240 4556454E 545F3231 35303000 32000000 F23654D4 F2365604 00000000 0949A018
D7974260 00000000 0949A114 0000000A F2ECD7A0 FC873504 00001004 F2365794 F2F0E8D4
D7974280 0949A018 00000000 F2F0E8D4 00001004 00001000 00001000 D79742D0 F286E18C
D79742A0 F6CB7400 FC872CC0 00000004 00CDE34C 4556454E 545F3437 31313200 00000000
D79742C0 00000000 00000001 D79743E0 00000000 F2F0E8D4 00001004 00001000 00007530
D79742E0 00007400 00000001 FC8731D4 00003920 0949A018 00000000 000042D8 00000001
D7974300 D79743E0 0949E594 D7974330 F286DFF8 D7974338 F244D5A8 00000000 01000000
D7974320 364FBFFF 01E2C000 00001000 00000000 00001000 00000000 F2ECD7A0 F23654D4
D7974340 000000FF 00000001 F2F0E8D4 F25F9824
Argument/Register addr=d797345c. Dump of memory from 0xD797341C to 0xD797355C
D7973400 F2365738
...
...
Argument/Register addr=1eb21388. Dump of memory from 0x1EB21348 to 0x1EB21488
1EB21340 00000000 00000000 FE6CE088 FE6CE0A4 0000000A 00000010
1EB21360 00000000 00000000 00000000 00000010 00000000 00000000 00000000 FEB99308
1EB21380 00000000 00000000 00000000 1C699634 FFFFFFFF 00000000 FFFFFFFF 00000001
1EB213A0 00000000 00000000 00000000 00000000 00000081 00000000 F16F1038 67676767
1EB213C0 00000000 FEB99308 00000000 00000000 FEB99308 FEB46EF4 00000000 00000000
1EB213E0 00000000 00000000 FEB99308 00000000 00000000 00000001 0257E3B4 03418414
1EB21400 00000000 00000000 FEB99308 FEB99308 00000000 00000000 00000000 00000000
1EB21420 00000000 00000000 00000000 00000000 1EB213B0 00000000 00000041 00000000
1EB21440 0031002D 00410036 00510038 004C0000 00000000 00000000 00000000 00000000
1EB21460 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1EB21480 00000109 00000000
----- End of Call Stack Trace -----
Although I'm not sure what exactly is the underlying issue for the core dump, my suspicion is that there is some memory corruption in Oracle client's code; and the Siebel Object Manager crash is due to the Oracle bug
5682121 - Multithreaded OCI clients do not mutex properly for LOB operations. The fix to this particular would be available in Oracle 10.2.0.4 release; and is already available as part of Oracle 11.1.0.6.0. In case if you notice the symptoms of failure as described in this blog post,
upgrade the Oracle client in the application-tier to Oracle 11gR1 and see if it brings stability to the Siebel environment.
________________
Technorati tags:
Siebel |
CRM |
Oracle |
Solaris |
SPARC