SAP BW with Oracle Data Guard (FSFO).

Hello Oracle/SAP gurus.
We are facing an issue with an Oracle DG configuration in conjunction with SAP BW 7.31.
Oracle 12c, Oracle Client 12v2, SAP kernel 721_EXT 700 (DBSL 718) or 722_EXT 200 (DBSL 211).
Problem in detail:
After a switchover (or failover) and role transition from primary to standby, the Oracle service starts correctly on the ex-standby once the new primary opens. Active SAP work processes receive "ORA-16456 switchover to standby in progress or completed", with dumps in st22 or just errors in sm21 (the DB can't always write the dump), and then reconnect to the new primary (selects with TAF, this part works very well). Let's assume WPs nr50 and nr55 have reconnected after error ORA-16456. Users continue working in the system, usually in RSA1, and if some action starts on the affected WP50 or WP55, strange errors appear one by one ...
1) ORA-25408: can not safely replay call (why? The WP has already reconnected to the DB).
2) sql error 3113 performing FET on table RSCRT_RDA_REQ (or any other BW table)
3) ORA-01460: unimplemented or unreasonable conversion requested
4) And finally, sessions running on the affected WPs hang with "SQL*Net more data from client". I can see them in st04 sessions (some obvious selects), and I can see the WP running and showing the same select, but it never ends. This is followed by ORA-03137: TTC protocol internal error on the DB side.
All other WPs seem to be working well; many data loads etc. are finishing...

Does anyone run BW on Data Guard? How does switchover/failover work in your environment?
Thanks,

Best Regards, Sergo.

3 Answers

  • Nov 28, 2016 at 06:11 PM

    Hey Sergo,

    do you remember our chat on LinkedIn and my opinion about SAP and RAC (TAF implementation)? Now you see and feel what I meant, as this has the same root cause :-)

    Point 1) Normal. For more details please check MOS ID #1268046.1

    Point 2 & 3) Sounds like an OCI or DBSL bug or a TCP/IP (timeout) issue, but if this scenario is reproducible it should be pretty easy to troubleshoot with SQL*Net and DBSL traces.

    Point 4) Sounds like a SAP kernel or DBSL bug, or a subsequent fault of the OCI issue from points 2 & 3.

    However, just do a SQL*Net trace and the root cause (component) can almost certainly be isolated. In general nothing uncommon here, as SAP's (supported) implementation dates from around the year 2000 :-)

    Regards

    Stefan

  • Nov 30, 2016 at 09:09 AM

    Hi Stefan,

    thank you for the answer! Yes, I remember everything we chatted about :)
    Thank you for pointing to the SQL*Net traces, we will try to check using them. The problem is that we can't easily reproduce the error,
    and frankly speaking, even if we are able to, it produces gigs of traces, and I'm not sure it is feasible to find something meaningful in them :)


    We tried the latest DBSL patch currently available for download, same situation (a message to SAP was opened more than a week ago).

    About "ORA-25408: can not safely replay call": it is expected during or just after a switchover, but in our case the error can appear 5-10 minutes after the WP has already reported that the reconnect happened.
    P.S. As one of the possible bugs, we have found the following one on Metalink -->

    Bug 18263924 - ORA-3137 (varying arguments) / ORA-1460 (usually with ORA-1002) on the Database When Using Multi-Threaded OCI Application (Doc ID 18263924.8)


    It looks similar to our problem, but there are some questions about this patch.
    Can an SAP application be called a "Multi-Threaded OCI Application"?
    Inside the bug note there is the following explanation -->
    Note:

    Bug 18263924 is an Oracle client-only fix. Applying it on the database server does not fix this problem.

    But then an interim fix for Linux x86_64 is provided for the database side?
    This interim fix can't be applied on top of our currently installed patch, because it has some conflict :)
    Best Regards, Sergo.

    • Hey Sergo,

      > and frankly speaking, even if we are able to, it produces gigs of traces, and I'm not sure it is feasible to find something meaningful in them :)

      No problem. SQL*Net trace has a log rotation function, so you can split the output into several trace files and rotate them. It does not need to be enabled all the time, just at switchover/failover time, and then you can go right to the corresponding trace file.

      > About "ORA-25408: can not safely replay call": it is expected during or just after a switchover, but in our case the error can appear 5-10 minutes after the WP has already reported that the reconnect happened.

      Yes, but I guess this is going to be the first action after the reconnect, right? Works as designed :-)

      Please do not guess - just trace :-)

      Regards

      Stefan
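
To make the "go right into the corresponding trace file" step less painful, here is a minimal sketch (not an SAP or Oracle tool) that scans a directory of rotated trace files for the ORA codes mentioned in this thread and counts them per file. The function names, the `*.trc` file pattern and the assumption that the error appears in the trace as literal `ORA-nnnnn` text are all illustrative; SQL*Net packet dumps may wrap or encode the text differently, so treat the result as a hint for which file to open first.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "ORA-03137" as well as the short form "ORA-3137".
ORA_CODE = re.compile(r"ORA-(\d{1,5})")

# Codes reported in this thread; adjust to whatever you are hunting for.
THREAD_CODES = {16456, 25408, 3113, 1460, 3137}


def scan_lines(lines, codes=THREAD_CODES):
    """Yield (line_number, code, text) for every matching ORA code.

    Pass codes=None to report every ORA-nnnnn occurrence.
    Note: plain "sql error 3113" without the ORA- prefix is not matched.
    """
    for lineno, line in enumerate(lines, start=1):
        for match in ORA_CODE.finditer(line):
            code = int(match.group(1))
            if codes is None or code in codes:
                yield lineno, code, line.rstrip()


def summarize_trace_dir(trace_dir, pattern="*.trc", codes=THREAD_CODES):
    """Count matching ORA codes per trace file in trace_dir.

    The "*.trc" pattern is an assumption; use the naming of your own
    SQL*Net or work process trace files.
    """
    counts = Counter()
    for path in sorted(Path(trace_dir).glob(pattern)):
        # errors="replace" keeps the scan going on binary-looking dumps.
        with path.open(errors="replace") as handle:
            for _lineno, code, _text in scan_lines(handle, codes):
                counts[(path.name, code)] += 1
    return counts
```

Calling `summarize_trace_dir("/path/to/traces")` (a placeholder path) returns a `Counter` keyed by `(file name, ORA code)`, so the file written around the switchover usually stands out. Rotation itself is configured on the Oracle Net side (for the client, settings such as `TRACE_FILELEN_CLIENT` and `TRACE_FILENO_CLIENT` in sqlnet.ora; check the Net Services reference for your release).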

  • Dec 15, 2016 at 02:10 PM

    Hi Stefan, all

    we were able to find a workaround: disabling TAF helped in our case, and in any case there was not much benefit from it.

    About the issue itself, we don't have time now to test it deeply (SAP support tossed our ticket around for 3 weeks before we finally reached
    the guys who were really aware of what is going on), and the system is closed for further tests.
    We would like to test it on the next test system, but it takes some time to build a similar environment.

    Recent tests have shifted the suspicion from the DBSL to the Oracle client side (but not enough traces there).

    Meanwhile, with Oracle 12.2 SAP can start supporting Application Continuity; let's wait for news from SAP.

    BR, Sergo.
