terminate session with ase database

suznCB
Participant
0 Kudos

Dears,

Please note that we use Sybase ASE 15.7 for our production database,

with number of user connections = 1000.

While users are connected to the database through our internal applications,

one or more users are suddenly disconnected, with nothing recorded in the error log,

and only about 600 users are connected at that time.

How can I investigate this and find the real cause of the sessions being terminated?

- It is not a network issue.

- We don't have an audit database.

Please advise.

Thank you

Accepted Solutions (0)

Answers (3)

former_member89972
Active Contributor
0 Kudos

What is common to the group of end users who lost their connections?

Geo-location? Then it points to a problem in end-to-end connectivity (i.e. most probably a network-related issue).

ASE drops an inactive or unreachable connection only after the TCP timeout set at OS level expires, and it normally leaves a line in the error log about this.

Other reasons may be

- tempdb for the set of users not being available or full (but again this will result in lines in error log)

- Locks running out (leaves lines in error log)

The best option would be to use auditing or to have a periodic job that monitors master..sysprocesses, master..monProcess etc.
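As a rough illustration of the periodic-job idea, here is a minimal Python sketch that compares two snapshots of master..sysprocesses and reports the sessions that disappeared between polls. It assumes the rows have already been fetched by some client and turned into dictionaries; the function name vanished_sessions is hypothetical, and the sample rows in the usage note are invented. Sessions are keyed on (spid, loggedindatetime) because a spid can be reused by a later login.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def vanished_sessions(previous: Iterable[Row], current: Iterable[Row]) -> List[Row]:
    """Return rows of the previous snapshot that are missing from the current one.

    A session is identified by (spid, loggedindatetime), so a spid that was
    reused by a new login still counts as a vanished session.
    """
    def key(row: Row):
        return (row["spid"], row["loggedindatetime"])

    still_connected = {key(row) for row in current}
    return [row for row in previous if key(row) not in still_connected]
```

A usage example with invented rows:

    before = [{"spid": 11, "loggedindatetime": "09:00", "hostname": "pc1"},
              {"spid": 12, "loggedindatetime": "09:05", "hostname": "pc2"}]
    after = [{"spid": 11, "loggedindatetime": "09:00", "hostname": "pc1"}]
    vanished_sessions(before, after)   # the row for spid 12

Normal logouts show up in the result as well, so the output is only useful when matched against the times at which users report being disconnected.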

HTH

Avinash

suznCB
Participant
0 Kudos

Thank you dear for replying.

kindly find my comment below on Mr Bret's answer, and I will add:

I checked tempdb after receiving a complaint from a user, but it was not full. What do you think about user log cache size?

Please find the attached file in my previous comment. I think there may be something wrong with Large IO denied, or with the packet size?

Did you mean running select * from master..sysprocesses every 30 minutes, for example?

Best Regards

former_member188958
Active Contributor
0 Kudos

How have you become aware of the disconnects?
Are users complaining, if so exactly what did they experience?
Is the spid gone but locks owned by the spid left behind (this issue is called "phantom locks")?
Any common factors for the disconnects (particular users, particular applications, particular time of day)?

-bret

suznCB
Participant
0 Kudos

Thank you for replying dear,

Yes, users complain that while working in an application (which is used by all users of our company) they are disconnected with one of the following messages,

different messages for different users:

1- connection has been marked dead.

2- ct-connect() network packet layer, internal net library error, Net-lib protocol driver call to connect two endpoints fail.

3- cannot reconnect to database, transaction not connected.

4- Select error, ct_send() user api layer , external error this routine cannot be called because another command structure has result pending.

As I mentioned, not all users experience the issue, and I double-checked the network and found no issue with it. For example, with two users in branch A, which is in a remote location, user1 may be disconnected while user2 can continue working. What surprised me is that there is no hint of the disconnects in the error log. Why?

Sorry, but I don't know about phantom locks!

Also, there are no common factors; the issue occurs randomly.

Please note that a lot of users were disconnected today. I ran sp_sysmon at the end of the day; please find it attached and advise whether there is any hint in it: sysmon.txt

Best Regards

former_member188958
Active Contributor

You only need to worry about phantom locks if you are experiencing them. You can check for them with this query:

select distinct spid from master..syslocks where spid not in (select spid from master..sysprocesses)
go

Although you aren't finding any problems with the network, the client errors do still make me think something is going wrong in the network layer. The "driver call to connect two endpoints fail" message is raised when there is a failure to connect during the initial login. "cannot reconnect" suggests a problem occurring after a connection has been started.

Have a look at sp_who when this happens.

a) are there processes stuck in a state where "cmd" = "MAINTENANCE TOKEN"?
b) do you find processes logged in that map to the complaining user? (if "joeuser" complains, run "sp_who joeuser")
What is the status (running, sleeping, etc.) and cmd for those processes?
[These may simply be other connections from that user, but might also be orphaned connections.]

In general ASE will only detect a broken connection if it tries to send something to the client and gets an error.
If ASE is in "awaiting command" state, it will just keep waiting. (the network layer should eventually clean up these orphaned connections based on the KEEPALIVE / KEEPIDLE settings, which often have a default of 2 hours). I would expect to see a "host disconnected" type of message in the log when the connection was closed this way.
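To make the KEEPALIVE / KEEPIDLE point concrete, here is a minimal Python sketch of what those settings look like on a single TCP socket. This is not ASE code and not how ASE itself is configured (ASE relies on the OS-level defaults); the function name enable_keepalive and the parameter values are assumptions chosen for illustration. The per-socket tuning options are only set where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_seconds=600, interval_seconds=60, probes=5):
    """Turn on TCP keepalive probes for one socket.

    idle_seconds:     idle time before the first probe is sent
    interval_seconds: delay between unanswered probes
    probes:           unanswered probes before the connection is dropped
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are not defined on every platform, so look them up
    # defensively and skip the ones that are missing.
    tuning = (
        ("TCP_KEEPIDLE", idle_seconds),
        ("TCP_KEEPINTVL", interval_seconds),
        ("TCP_KEEPCNT", probes),
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

With a long idle time such as the 2-hour default mentioned above, a connection whose peer has silently gone away can sit in "awaiting command" for that whole period before the OS reports it as dead.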

Anything unusual at all in the ASE errorlog within a day of when this issue is happening?

-bret

suznCB
Participant
0 Kudos

Dear Bret, thanks in advance.

I will try your suggestions if we experience the problem again, though I think we won't.

Please note what happened recently; it sheds some light on the issue:

yesterday, there were complaints about 2 issues:

1- When users tried (through their connection to the production database) to get data from another server, which was added to the production database as a remote server (I can see it in the master..sysservers table),

they got the following error:


At that time, I logged in as a Windows user to the server that hosts our Sybase production server, tried to ping that remote server from the dsedit tool, and got the error message below, which was also recorded in the database error log:

ping to the server x.x.x.x:6000 using protocol tcp

Net-lib protocol driver call to connect two endpoints failed to connect to server, Error is 10055 An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.

But pinging from the dsedit tool on my PC was fine.

I believe the problem was on the production Sybase server, because pinging in the opposite direction (from the remote server to production) was fine from dsedit on that remote server.

I googled this error and found that it is a Windows issue caused by the OS running out of memory for TCP buffers, and that the suggested solutions were to reboot the server to release some resources or to add some TCP/IP parameters to the registry.

But before restarting the Windows server, I ran netstat in cmd and got more than 2000 results in TIME_WAIT state.

2- The second complaint was that some users tried to connect to the database server twice, and others three times, without establishing a connection; on the next try they succeeded.

I restarted the Windows server (which in turn restarted the Sybase services). After that the ping (for the first issue) worked fine, and so far no user has complained about disconnects or about problems establishing a connection to the database.
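For anyone who wants to watch the TIME_WAIT count mentioned above before it gets that high, here is a minimal Python sketch that tallies TCP connection states from saved netstat output. It assumes the Windows "netstat -an" column layout (Proto, Local Address, Foreign Address, State); the function name count_tcp_states is hypothetical. Error 10055 is the Winsock code WSAENOBUFS, so a steadily growing TIME_WAIT count is one thing worth tracking alongside it.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in Windows-style `netstat -an` text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected layout: TCP  <local address>  <foreign address>  <state>
        # Header lines and UDP lines (which have no state) are skipped.
        if len(fields) >= 4 and fields[0] == "TCP":
            counts[fields[3]] += 1
    return counts
```

Usage, assuming the output was saved with "netstat -an > netstat.txt":

    with open("netstat.txt") as f:
        print(count_tcp_states(f.read()).most_common())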

Many thanks to you.

madhvi_pai
Advisor
0 Kudos

Hi Suzn,

Are there any corresponding messages in the SM21 system log or any ST22 short dumps raised during the random disconnects?

Thanks,

Madhvi Pai

SAP Product Support

suznCB
Participant
0 Kudos

Thanks for replying dear,

I think SM21 or ST22 are related to SAP, aren't they?

I don't have any SAP application; I just work with Sybase ASE.