on 12-30-2013 11:38 PM
Hi experts,
When I tried to connect to a Sybase ASE instance "ASE1570_S1", I got the following error:
[sybase@rhel64-ase-tgt ~]$ isql -Usa -Psybase -SASE1570_S1
CT-LIBRARY error:
ct_connect(): user api layer: internal Client Library error: Read from the server has timed out.
CT-LIBRARY error:
ct_connect(): network packet layer: internal net library error: Net-lib operation timed out
I can see the following warning in the error log /opt/sybase/errorlogs/ASE1570_S1.log:
00:0001:00000:00000:2013/12/30 15:10:29.54 kernel Warning: The internal timer is not progressing. If this message is generated multiple times, report to Sybase Technical Support and restart the server (alarminterval=-1162).
00:0001:00000:00000:2013/12/30 15:20:29.54 kernel Warning: The internal timer is not progressing. If this message is generated multiple times, report to Sybase Technical Support and restart the server (alarminterval=-7162).
The Adaptive Server version is:
Adaptive Server Enterprise/15.7/EBF 21341 SMP SP101 /P/x86_64/Enterprise Linux/ase157sp101/3439/64-bit/FBO/Thu Jun 6 16:08:18 2013
Any hint would be highly appreciated.
Jerry,
Were you able to resolve this problem? I see it states 'Not Answered'. I am having a similar issue and was looking for some answers.
I have a two node Sybase 15.7 cluster, but only one node will start. I can start either database first, but the second will not come online.
I get:
Connecting to master database...
Warning: Internal timer not progressing...
It never fails, but it also never comes online.
Thanks
Hi,
Regarding this error message:
[sybase@rhel64-ase-tgt ~]$ isql -Usa -Psybase -SASE1570_S1
CT-LIBRARY error:
ct_connect(): user api layer: internal Client Library error: Read from the server has timed out.
CT-LIBRARY error:
ct_connect(): network packet layer: internal net library error: Net-lib operation timed out
Have you checked whether the SYBASE_ASE and SYBASE_OCS environment variables are set correctly?
This behaviour can also be caused by other factors, such as network issues:
-- Have your network administrator check the system for any problems, including dropped network packets.
-- On UNIX platforms, check the value of the tcp ip abort interval. If your client application is waiting for the server to return results from a large or time-consuming query, the operating system may disconnect the client.
-- Consider disabling tcp no delay.
-- Check the operating system parameter for the number of file descriptors on both the client and server machines. On most UNIX machines, you can get this value by issuing the limit or ulimit -a command. If you need to open more connections than the number of file descriptors allowed, ask the UNIX system administrator to set this parameter to a higher value.
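As a rough illustration of the file descriptor check on Linux, the sketch below compares a process's open-file limit with the number of descriptors it currently holds. The helper name fd_report is hypothetical, and it assumes a Linux /proc filesystem and permission to read the target process's entries.

```shell
#!/bin/sh
# Sketch only: fd_report is a hypothetical helper, not a Sybase tool.
# It relies on the Linux /proc filesystem.

fd_report() {
    pid="$1"
    # Soft "Max open files" limit as seen by the running process
    # (field 4 of that line in /proc/<pid>/limits).
    limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    # Number of file descriptors the process currently has open.
    used=$(ls "/proc/$pid/fd" | wc -l)
    echo "pid=$pid open_files_limit=$limit in_use=$used"
}

# Example call (commented out): pass the PID of the dataserver process.
# fd_report <pid of dataserver>
```

If in_use is close to open_files_limit, the limit is worth raising; if it is far below, file descriptors are probably not the cause.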
Regards,
Claude
Hi Claude,
Thanks for your reply. I can confirm that $SYBASE_ASE and $SYBASE_OCS are set correctly:
[sybase@rhel64-ase15 ~]$ echo $SYBASE_ASE
ASE-15_0
[sybase@rhel64-ase15 ~]$ echo $SYBASE_OCS
OCS-15_0
Since I connect to the data server from the same host using isql, I checked the "sybase" user's ulimit:
[sybase@rhel64-ase15 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 30510
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
In short, I don't think this is an environmental issue, because the ASE server that is now unresponsive used to function normally.
Hi Jerry,
I think Bret explained what the problem could be here, and you confirmed it with "dataserver process is using more than 95 %CPU". When the engine or engines are very close to 100% user busy, tasks may not get scheduled, which means the deferred queue is not processed and the connection handler is not scheduled. The first leads to "The internal timer is not progressing ...." and the second leads to the client getting "ct_connect(): user api layer: internal Client Library error: Read from the server has timed out".
As Bret said, a couple of shared memory dumps will help to diagnose this problem, which is best done through a support case. This note describes how to collect a manual shared memory dump:
https://support.wdf.sap.corp/sap/support/notes/1940109
You need to be logged in to the Service Marketplace to open this link.
Another question: is this running in a virtual environment? And which kernel mode are you using in ASE 15.7?
You could also start by investigating this at the operating system level, checking what is taking the 100%.
1. How to identify the threads with the highest CPU usage in the dataserver process on Linux:
top -H -p <pid>
or, to save the output to a file:
top -n 10 -b -H -p <pid of dataserver> > top-output.txt
About the -b and -n options, from the Linux top command man page:
-b : Batch mode operation
Starts top in "Batch mode", which could be useful for sending output from top to other programs or to a file. In this mode, top will not accept input and runs until the iterations limit you have set with the -n command-line option or until killed.
-n : Number of iterations limit as: -n number
Specifies the maximum number of iterations, or frames, top should produce before ending.
2. Find the thread stacks. You can use nsd -stacks or an OS debugger.
On Linux, pstack may not exist. pstack is a symlink to gstack, which is part of the gdb package, so gstack can be used instead.
This could give some idea of which thread has the highest user busy utilization and what it is doing at that time.
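To tie the two steps together, here is a minimal sketch that picks the busiest thread out of top's batch output. The helper name busiest_thread is hypothetical, and it assumes top's default column layout (thread ID in field 1, %CPU in field 9); if you have customised the columns, adjust the field numbers.

```shell
#!/bin/sh
# Sketch only: busiest_thread is a hypothetical helper.
# It reads `top -b -H` output on stdin and prints "<thread id> <%CPU>"
# for the row with the highest %CPU, assuming top's default columns:
# PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

busiest_thread() {
    awk '
        $1 == "PID" { in_rows = 1; next }   # column header starts the thread rows
        NF == 0     { in_rows = 0 }         # blank line ends the thread rows
        in_rows && NF >= 12 && $1 ~ /^[0-9]+$/ {
            if (tid == "" || $9 + 0 > max) { max = $9 + 0; tid = $1 }
        }
        END { if (tid != "") print tid, max }
    '
}

# Example calls (commented out): replace <pid of dataserver> with the real PID.
# LC_ALL=C top -b -H -n 1 -p <pid of dataserver> | busiest_thread
# gstack <pid of dataserver> > stacks.txt
```

The thread ID that top reports is the LWP number, so you can look for the matching "LWP" entry in the gstack output to see what that thread was doing.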
Niclas
In general, the warning indicates that the server's engines are being kept too busy to process the deferred queue. The issue can resolve itself if an engine finds cycles to work on the deferred queue.
Were there any other errors in the errorlog around this time, particularly error 814 or timeslice errors? If you are seeing 814, the issue is probably due to a known bug, CR 750807, which is fixed in 15.7 SP120.
Example:
00:0002:00000:00102:2013/11/06 22:17:14.61 server Error: 814, Severity: 20, State: 7
00:0002:00000:00102:2013/11/06 22:17:14.61 server Keep count of buffer '0x2a6682628' in cache 'default data cache' holding logical page '725505' in database 'foo' has become negative.
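A quick way to check the errorlog for these messages is a grep along the following lines. This is only a sketch: scan_errorlog is a hypothetical helper name, and the patterns are taken from the messages quoted in this thread, so the exact wording in your errorlog may differ.

```shell
#!/bin/sh
# Sketch only: scan_errorlog is a hypothetical helper.
# It prints, with line numbers, errorlog lines that mention error 814,
# timeslice errors, or the internal timer warning.

scan_errorlog() {
    log="$1"
    grep -nE 'Error: 814,|timeslice|internal timer is not progressing' "$log"
}

# Example call (commented out), using the errorlog path from the original post:
# scan_errorlog /opt/sybase/errorlogs/ASE1570_S1.log
```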
Hi Jerry,
I concur, no signs of other errors in the errorlog. So the most likely explanation is that something is keeping the engines too busy to get the connection handler scheduled before your connection attempt times out, or to work on the internal timer from the deferred queue.
Is this server involved in any replication using dbcc logtransfer replication (as opposed to Rep Agent replication)? There is another bug fixed in 15.7 SP120 that results in high CPU use under that condition, which might result in your symptoms, CR 745325. If you aren't doing any replication, don't worry about it.
We would normally try to look at cpu busy issues with sp_sysmon, but if you can't log in, you can't run sp_sysmon. The next thing I'd try using to diagnose this would be a couple of manual sybmon memdumps taken perhaps a minute apart while this was happening. For directions on how to take a manual memdump, please see https://support.wdf.sap.corp/sap/support/notes/1940109. You should open a support instance to have the memdumps looked at.
-bret
Hi Bret,
No, I'm not doing any replication.
In the output of the "top" command on my VM, I can see the dataserver process using more than 95 %CPU all the time.
BTW, I cannot open the link you provided; it looks like an intranet URL. As for the support instance, since my company is an SAP partner, can we open one with a partner account?