Former Member

Why would a HANA DB disconnect take place during the backup copy procedure?

We run ECC 6.0 EHP 7, Suite on HANA, on premise on a SLES VM. We also replicate to a backup DB for DR purposes. During off hours we back up the DB to disk, then copy the backup to another storage location for final backup. During the copy (cp command), CPU and memory usage frequently spike. Nearly every day at the same time there is also a momentary disconnect from the database, and any jobs running at that moment fail as a result.

The indexserver trace file often reports a timeout and a broken connection, but not always. SM21 shows job failures, but not the same job daily. This started just over a month ago (April 3), and we cannot relate it to any change in our system environment.

We have checked VM, network, system, backup, HANA, and storage statistics and have run traces. We have opened an incident with SAP, but have no answers after all of this. Is anyone experiencing the same or something similar?

2 Answers

  • May 10, 2018 at 03:03 AM

    You have done all the tracing, but what have you learned from it?

    Were there any error messages that pointed to a cause for the disconnects?

    What about the memory "escalations" during the copy process? Why is that happening? Are you using DirectIO when you copy the files? If not, you should consider this, as there is no benefit in using the file buffer memory when doing a one-off copy of the backup files.
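
    For readers unsure what avoiding the file buffer looks like in practice, here is a minimal Python sketch. It does not use O_DIRECT itself; it swaps in a related technique, posix_fadvise with POSIX_FADV_DONTNEED, which asks the Linux kernel to drop pages it has already cached for the two files. The helper name copy_dropping_cache and the 8 MiB chunk size are hypothetical choices for illustration, and behavior on network mounts may differ.

```python
import os

CHUNK = 8 * 1024 * 1024  # 8 MiB per read/write; an arbitrary illustrative size


def copy_dropping_cache(src, dst, chunk=CHUNK):
    """Copy src to dst while advising the kernel to evict cached pages.

    Hypothetical helper for illustration only. It uses posix_fadvise
    (Linux), not O_DIRECT. Returns the number of bytes copied.
    """
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        in_fd, out_fd = fin.fileno(), fout.fileno()
        while True:
            block = fin.read(chunk)
            if not block:
                break
            fout.write(block)
            fout.flush()
            # Dirty pages cannot be dropped, so force them to storage first.
            os.fsync(out_fd)
            # offset=0, len=0 means "the whole file" for posix_fadvise.
            os.posix_fadvise(out_fd, 0, 0, os.POSIX_FADV_DONTNEED)
            os.posix_fadvise(in_fd, 0, 0, os.POSIX_FADV_DONTNEED)
            copied += len(block)
    return copied
```

    A call would look like copy_dropping_cache("/path/to/backup.file", "/path/to/target.file"), where both paths are placeholders. On the command line, GNU dd offers iflag=direct and oflag=direct to request O_DIRECT on the input and output, which is closer to what is meant by DirectIO here.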


  • Former Member
    May 11, 2018 at 12:07 AM

    Hi Lars. This is Gary Conn, the DBA working with Mark.

    In a nutshell, if we knew the answers to your questions, we would not be here. That said...

    Since we upgraded HANA from 85.03 to 122.12, we have been getting small memory "spikes" on a semi-regular basis that we did not see before the upgrade. We are waiting on SAP to help, but so far, nothing. We have sent them the trace and RTE files as requested, but they have not found any "smoking gun".

    Also, when we run our database backups, a spike is generated in memory and CPU; when we moved the backup to a different time, the spike followed. We do a local disk backup using the HANA native backup, then use a very basic Linux cp command to copy the files (400+ GB in total) to a Windows server we use for storing backups (then off to tape from there). We have been doing this for the last 2.5 years with no issues, until now. The memory and CPU spikes happen after the HANA backup, about 40 minutes into the cp command, and stop about 10-15 minutes later (memory goes back to normal after increasing by about 20%); the copy to the Windows server takes about 1.5 hours.

    We also have synchronous HANA system replication running to a local site over a 10GB pipe, first using the new logreplay mode, then switched to delta_datashipping. I am trying to see whether the different HSSR modes are causing the spikes (still working on it).
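
    One way to tell whether a memory increase of roughly 20% is Linux page cache filled by cp, rather than memory allocated by HANA, is to sample /proc/meminfo while the copy runs. The sketch below is an illustration and not something taken from this thread: the function names are hypothetical, and it assumes the standard MemTotal, Cached, and Buffers fields are present.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts. Lines without a numeric value are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def cache_share(meminfo):
    """Fraction of total memory held by page cache plus buffers."""
    return (meminfo["Cached"] + meminfo["Buffers"]) / meminfo["MemTotal"]


def read_cache_share(path="/proc/meminfo"):
    """Read the live value on a Linux host (hypothetical convenience wrapper)."""
    with open(path) as f:
        return cache_share(parse_meminfo(f.read()))
```

    Calling read_cache_share() every minute or so during the backup and copy window, and logging the result with a timestamp, would show whether the cache share rises and falls together with the observed spike.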

    Have you or anyone else you know experienced this before?



    Gary Conn


    • Former Member

      We do not use a Samba share. Through further testing we have determined that the copy is not causing this issue. We agree with your statement. We have documented that a particular job, which ran before the upgrade and has not been changed since, is associated with the pronounced regular spiking.

      I've included an image (hana-spiking-2.png) that shows the dramatic before/after difference in the HANA DB's operation. Again, the point of seeking assistance is to identify why HANA would register a connection loss (at any time) under higher demand. We have changed HANA parameters such as tcp_backlog and indexserver maxchannels, but have not seen any definitive result.

      We'll keep looking for answers. Thanks.

      hana-spiking-2.png (410.5 kB)