
Any way to make transactions not complete until log records make it to the repserver?

(note, I'm using the word "primary" here for what Sybase typically calls the "active" database)

We occasionally want to do a "switch active" for a routine hardware failure or OS problem (tell the repserver to use what *was* the warm standby db as the new primary db)

But with the typical replication setup, transactions in the old primary database's transaction logs could be lost if we switch active while the old primary server is down.

Is there any way to set things on a primary ASE database so that transactions in that database don't commit until they've finished replicating into the repserver?

I understand the tradeoffs here: a problem in the repserver could block access to the primary db (or, at least, slow it down). And this isn't guaranteeing data made it to the replicate dbs, just to the repserver. But I think it's a feature that would be useful in certain environments.

Thanks in advance
Ben Slade


3 Answers

  • Aug 01, 2017 at 02:28 PM

    If you've got the money, can wade through the convoluted setup, and don't mind the synchronous 2-phase commit ... you could implement the ASE/SRS HA/DR solution (aka synchronous rep; aka streaming rep). 'course, that's going a bit further than what you want (ie, you have to wait for the txn to complete on the secondary database before it's marked 'complete' in the primary database).


    The next idea would be to beef up the repagent/SRS connectivity and throughput by tweaking the repagent and SRS configs. Repagent configs would include 'priority' (set to 0?), 'retry timeout' (set to 1?), 'scan timeout' (set to 1?). SRS configs ... you'll likely need to opt for the SRS/ASO option to allow for use of some of the newer performance-related configs. And of course, work with the hardware/network guys to make sure you've got the shortest/fastest network hop from repagent to repserver.
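    A rough sketch of the repagent-side tweaks described above, using `sp_config_rep_agent` on the primary ASE server. The database name `pdb` is a placeholder and the values are the aggressive settings suggested here, not tested recommendations:

    ```sql
    -- Run in the primary ASE server; 'pdb' is a placeholder database name.
    use pdb
    go
    sp_config_rep_agent pdb, 'priority', '0'        -- highest RepAgent scheduling priority
    go
    sp_config_rep_agent pdb, 'retry timeout', '1'   -- retry the SRS connection after 1 second
    go
    sp_config_rep_agent pdb, 'scan timeout', '1'    -- re-scan the log every second when idle
    go
    -- Restart the RepAgent so all the changes take effect:
    sp_stop_rep_agent pdb
    go
    sp_start_rep_agent pdb
    go
    ```

    Check the settings afterwards with `sp_config_rep_agent pdb` (no further arguments) to confirm which ones took effect dynamically.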


    Of course, none of the above really matters if you've got a batch process spitting out several 10's of millions of DML commands in a short period of time, with those more-important txns getting stranded in the pdb after you perform that unplanned switch.

    In this scenario you may have to live with the fact that you're going to lose some transactions in the event of an unplanned switch (active).

    However, what you might want to consider is whether or not you can break your transactions into categories, eg, must-get-replicated-at-all-cost, would-really-like-to-make-sure-these-get-replicated-but-not-the-end-of-the-world-if-they-dont, and these-can-be-easily-resumed-in-the-new-primary-db. If you can do this then you might want to look at multipath replication, with the main objective being to route the 'high priority' transactions over their own path (so they're not held back/up by huge-txn-count maintenance and batch operations).

    'course, multipath rep also requires the SRS/ASO option, but it's a lot easier to set up and doesn't suffer from the performance overhead of synchronous 2-phase commits (aka, ASE/SRS HA/DR always-on, synchronous rep).
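    A sketch of what setting up a dedicated high-priority path might look like with `sp_replication_path` (ASE 15.7+ RepAgent multipath support plus the SRS/ASO option). The database name `pdb`, repserver name `PRS`, the credentials, and the table name are all placeholders; check the exact argument order against your ASE version's docs before using:

    ```sql
    -- Sketch only: add a second RepAgent path into the same repserver.
    use pdb
    go
    sp_replication_path pdb, 'add', 'hi_pri_path', 'PRS', 'ra_user', 'ra_pwd'
    go
    -- Bind the must-get-replicated-at-all-cost tables to the new path;
    -- everything else stays on the default path with the batch traffic:
    sp_replication_path pdb, 'bind', 'table', 'dbo.critical_txns', 'hi_pri_path'
    go
    ```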


    • Yeah, the ASE/SRS HA/DR solution is too heavy/expensive for what I'm looking for. I just want synchronous from the primary ASE server to the first point in the repserver where it's safely stored (even if it hasn't necessarily made it through all the steps in the repserver).

      My understanding is that, someday, the rep agents will start using a new communications protocol related to the SRS HA/DR implementation, and that new protocol would be used even if you're not using SRS HA/DR. Maybe at that point some new rep agent options will show up.

      Anyway, thanks for the detailed writeup.

  • Jul 31, 2017 at 07:10 PM

    SAP/Sybase Warm Standby replication is an asynchronous mechanism.

    So the active (primary) and standby (replicate) are loosely connected via the replication server in between.

    As such there is always a chance of losing a few transactions on the active side if the switch is abrupt.

    To minimize transaction loss in a controlled switch you can

    - first cut off all user connections on active side (no new DML coming in),

    - have your mechanism (or rs_ticket) send a last transaction across (so all txns have gone from active to standby),

    - verify that last TXN arrived (Standby matches active),

    and then do the active standby switch.
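    The steps above might look something like this from isql. The database name `pdb`, the logical connection `LDS.ldb`, the new active server/database `NEWACT.newdb`, and the ticket text are all placeholders for your own names:

    ```sql
    -- 1. On the active ASE: stop new DML coming in, eg by restricting the db
    --    (or by locking the application logins with sp_locklogin).
    sp_dboption pdb, 'dbo use only', true
    go

    -- 2. Still on the active side: send a marker through the replication path.
    use pdb
    go
    rs_ticket 'final_switch_marker'
    go

    -- 3. On the standby: poll until the ticket arrives, showing the active
    --    side's log has drained all the way through the repserver.
    select * from rs_ticket_history where h1 = 'final_switch_marker'
    go

    -- 4. In the repserver (isql into SRS): perform the switch.
    switch active for LDS.ldb to NEWACT.newdb
    go
    ```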




  • Aug 23, 2017 at 06:16 PM

    Answering my own question....

    Could you:

    • open a transaction in a primary db.
    • do a million DML commands (or 1 DML command that affects a million rows) in that transaction
    • hold the transaction open
    • on a separate connection, run rs_ticket in that primary database
    • once the rs_ticket command makes it to the replicate database (shows up in rs_ticket_history), commit the transaction in the first connection?

    I ran a quick test and it looks like the rs_ticket command causes the open transaction's log entries from the first connection to get sent to the repserver, even though they don't get applied to the replicate db until the commit transaction command gets replicated.
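    For reference, the test sequence sketched out as SQL. The database `pdb`, table `big_table`, and ticket text are placeholders:

    ```sql
    -- Connection 1 (primary db): open a big transaction and leave it open.
    use pdb
    go
    begin tran big_load
    update big_table set col1 = col1 + 1   -- stand-in for the million-row DML
    go
    -- ...do NOT commit yet...

    -- Connection 2 (primary db): send a marker after the DML has been logged.
    use pdb
    go
    rs_ticket 'big_load_flushed'
    go

    -- Connection 3 (replicate db): poll until the marker arrives, which shows
    -- the repserver has read past the still-open transaction's log records.
    select * from rs_ticket_history where h1 = 'big_load_flushed'
    go

    -- Back on connection 1: now commit; only at this point will the repserver
    -- apply the big transaction to the replicate db.
    commit tran big_load
    go
    ```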


    • Agree with Mark.

      RepAgent just sends the transaction log records (committed or not) to the stable queue in chronological order, to optimize data transfer.

      When the time comes to apply these on the replicate side, only commit order is honored.

      So rs_ticket, or any other single-row committed insert/update in a replicated user table, will be applied on the replicate side before the "big open transaction" that is still waiting to commit/rollback.

      You could write a tandem transaction where rs_ticket (or an equivalent single-row committed insert/update) reaches the replicate first; after checking for it on the replicate side, you commit your application transaction on the primary.

      Just note this is still an asynchronous operation and subject to replication latency.

      We use a similar mechanism for switching our DSS reporting application, which shifts connections from REPLICATE to PRIMARY based on a time-stamped row replicated every minute and arriving with reasonable latency.