
Background job fails during ASCS (standalone) failover in cluster

Nov 29, 2016 at 08:09 PM


Hello,

We have configured an HP Serviceguard cluster on one of our lab systems. Details are below.

HP SG version - 12.00.50

SGeSAP version - 06.00.80

Cluster nodes

labusnzdtsg01.dfdev.jnj.com (SG cluster VM1) - ASCS

labusnzdtsg02.dfdev.jnj.com (SG cluster VM2) - DB + ERS

App servers (not in cluster)

labusnzdtsga1.dfdev.jnj.com (App Server)

labusnzdtsga2.dfdev.jnj.com (App Server)

Oracle version - 12c

Oracle client version - Client Shared Library 64-bit - 12.1.0.2.0

SAP version - NW 7.40 SP7

SAP kernel version - 742 Patch 429

When labusnzdtsg01 is powered off, the ASCS package fails over to the other cluster node, labusnzdtsg02, and ASCS starts there. During this failover we observed the following:

1. Lock entries remain intact.

2. Dialog sessions go on hold and reconnect once ASCS is available on labusnzdtsg02.

3. A background job (an SGEN job running in parallel on both app servers) was cancelled with the error below:

M Thu Nov 17 12:43:56 2016
M *** ERROR => ThIRqSendWithReplyInline: reply rq_id 80634 for request with rq_id 77100 contains error [thRequest.c 3457]
M {root-id=0050569906DD1EE6AB9D9A86C7763ABF}_{conn-id=00000000000000000000000000000000}_0
M error code REQ_RC_MS_ERROR
M partner address (BY_NAME-labusnzdtsga2_S30_00
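
To check whether other jobs hit the same error during a failover, the work process traces can be scanned for entries like the one above. The following is only a minimal sketch: it assumes the trace has one "M ..." entry per line, as a work process trace normally does, and the names SAMPLE and find_errors are hypothetical helpers, not part of any SAP tool.

```python
import re

# Sample copied from the trace excerpt in this post. In practice the text
# would be read from a work process trace file (the location is an assumption).
SAMPLE = """\
M Thu Nov 17 12:43:56 2016
M *** ERROR => ThIRqSendWithReplyInline: reply rq_id 80634 for request with rq_id 77100 contains error [thRequest.c 3457]
M {root-id=0050569906DD1EE6AB9D9A86C7763ABF}_{conn-id=00000000000000000000000000000000}_0
M error code REQ_RC_MS_ERROR
M partner address (BY_NAME-labusnzdtsga2_S30_00
"""

# A line holding only a timestamp, e.g. "M Thu Nov 17 12:43:56 2016"
TIMESTAMP = re.compile(r"^M\s+(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\s*$")
# The first line of an error entry, capturing the reporting function
ERROR = re.compile(r"^M\s+\*\*\* ERROR => (\w+):")
# The follow-up line that names the error code
CODE = re.compile(r"^M\s+error code (\w+)")


def find_errors(text):
    """Return one dict per ERROR entry: timestamp, function, error code."""
    results = []
    last_timestamp = None
    current = None
    for line in text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            last_timestamp = match.group(1)
            continue
        match = ERROR.match(line)
        if match:
            current = {
                "timestamp": last_timestamp,
                "function": match.group(1),
                "code": None,
            }
            results.append(current)
            continue
        match = CODE.match(line)
        if match and current is not None:
            # Attach the code to the most recent ERROR entry
            current["code"] = match.group(1)
    return results


if __name__ == "__main__":
    for entry in find_errors(SAMPLE):
        print(entry["timestamp"], entry["function"], entry["code"])
```

Filtering the result for entries whose code is REQ_RC_MS_ERROR and comparing their timestamps with the failover window would show whether the cancellations line up with the period in which the message server was unavailable.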

We do not expect the background job to fail. Please let us know whether this is standard system behaviour and, if not, what the solution is.

Thanks & Regards,


3 Answers

Basis LER Dec 14, 2016 at 08:28 PM

Hello Experts,

Any comments on this question?

Reagan Benjamin
Dec 14, 2016 at 11:08 PM

Background jobs should not fail during a failover of ASCS to the second node unless the application server where the job runs crashes. I would first check whether this happens for all jobs or only for the job triggered by transaction SGEN.

Basis LER Feb 07, 2017 at 09:50 AM

Hi Reagan,

Thanks for the reply.

You are correct: logically, the job should not fail.

We have tested only SGEN and no other job.

However, given the system behaviour, we expect other jobs to fail as well.

SAP also replied that any job that tries to connect to another application server instance during an ASCS failover is bound to be cancelled.

Any other thoughts?

Thanks & Regards,

Tejas
