Former Member

Snapshot Backups on HP EVA SAN

Hi everyone,

We are implementing a new HP EVA SAN for our SAP MaxDB Wintel environment. As part of the SAN setup we will be utilising the EVA's snapshot technology to perform a nightly backup.

Currently HP Data Protector does not support MaxDB for its "Zero Downtime Backup" concept (ZDB), so we need to perform LUN snapshots using the EVA's native commands. ZDB would have been nice as it integrates into SAP and lets the DB/SAP know when a snapshot backup has occurred. However, as I mentioned, this feature is not available on MaxDB (only for SAP on Oracle).

We are aware that SAP supports snapshots on external storage devices as stated in OSS notes 371247 and 616814.

To perform the snapshot we would do something similar to (if not exactly) what note 616814 describes, as below:

To create the split mirror or snapshot, proceed as follows:

dbmcli -d <database_name> -u <dbm_user>,<password>

util_connect <dbm_user>,<password>

util_execute suspend logwriter

==> Create the snapshot on the EVA

util_execute resume logwriter

util_release

exit
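The sequence above could be scripted so that the log writer is always resumed, even when the snapshot step fails. The following is a minimal Python sketch, not a tested implementation: `send` stands for a hypothetical function that passes one command line to an already open dbmcli session and returns its reply, and `take_snapshot` stands for whatever triggers the snapshot on the EVA. It also assumes that successful replies start with `OK`.

```python
from typing import Callable


class DbmError(RuntimeError):
    """Raised when the DBM session answers with anything other than OK."""


def run_checked(send: Callable[[str], str], command: str) -> str:
    """Send one command and fail loudly unless the reply starts with OK."""
    reply = send(command)
    if not reply.lstrip().startswith("OK"):
        # Report only the command name so the password never ends up in a log.
        raise DbmError(f"{command.split()[0]} failed: {reply.strip()}")
    return reply


def snapshot_with_suspended_logwriter(
    send: Callable[[str], str],        # hypothetical: talks to an open dbmcli session
    take_snapshot: Callable[[], None],  # hypothetical: creates the EVA snapshot
    dbm_user: str,
    password: str,
) -> None:
    """Run the note's command order around an arbitrary snapshot step."""
    run_checked(send, f"util_connect {dbm_user},{password}")
    try:
        run_checked(send, "util_execute suspend logwriter")
        try:
            take_snapshot()
        finally:
            # Always resume, otherwise log writing stays suspended.
            run_checked(send, "util_execute resume logwriter")
    finally:
        run_checked(send, "util_release")
```

Passing the session and the snapshot step in as functions keeps the ordering logic separate from the site-specific parts, so the sequence can be exercised with stand-ins before it touches a real database.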

Obviously MaxDB and SAP are unaware that a "backup" has been performed. This poses a couple of issues that I would like to see if anyone has a solution to.

a. To enable automatic log backup MaxDB must know that it has first completed a "full" backup. Is it possible to have MaxDB be aware that a snapshot backup has been taken of the database, thus allowing us to enable automatic log backup?

b. SAP also likes to know it's been backed up. EarlyWatch Alert reports start to get a little upset when you don't perform a backup on the system for a while.

Also DB12 will mention that the system isn't in a recoverable state, when in fact it is. Any workarounds available here?

Cheers

Shaun



2 Answers

  • Best Answer
    Jun 08, 2008 at 07:46 PM

    Hi Shaun,

    ad a)

    You have to perform an initial complete data backup here - no way to avoid it. If you don't want to keep it - throw it away afterwards.

    ad b)

    Also this is a point where you (currently) have to decide to either use the SAP approach (only use the supported and predefined processes as these are the only ones that are captured in CCMS) OR you rely on your own backup processes and the monitoring of it.

    Since the EWA report is just a collection of warnings and hints that need to be checked and interpreted in any case, you can always say the EWA report is OK as long as the missing backup warning is the only warning.

    Anyhow, what makes me really wonder is what the rest of your backup strategy looks like.

    How do you automate the checking of the snapshot-backup?

    How do you automate the consistency check of the backed up database?

    The snapshot backup approach you're using obviously offers the big opportunity to create a second copy of the database (from the backup taken), open the database and perform a consistency check without putting the load on the production machine.

    Is this part of your process?

    KR Lars


    • Hi Shaun,

      interesting thread so far...

      > It would be nice to see HP and SAP (MaxDB) take the snapshot technology one or two steps further, to provide a guaranteed consistent backup that can be verified at block level. I think HP's ZDB (zero downtime backup, e.g. snapshots) technology for SAP on Oracle using Data Protector does this now?!??!

      Hmm... I guess the keyword here is 'market'. If there is enough market potential visible, I tend to believe that both SAP and HP would happily try to deliver such tight integration.

      I don't know how this ZDB stuff works with Oracle, but how could the HP software possibly know what an Oracle block should look like?

      No, there are just these options to actually check for block consistency in Oracle: use RMAN, use DBV or use SQL to actually read your data (via EXP, EXPDP, ANALYZE, custom SQL).

      Even worse, you might come across block corruptions that are not really covered by these checks.

      > Data corruption can mean so many things. If you're talking structure corruption or block corruption, then you do hope that your consistency checks and database backup block checks will bring this to the attention of the DBA. Hopefully recovery of the DB from tape and rolling forward would resolve this.

      Yes, I was talking about data block corruption. Why? Because there is no reliable way to actually perform a semantic check of your data. None.

      We (SAP) simply rely on the assumption that whatever the Updater writes to the database is consistent from the application's point of view.

      Having handled far too many remote consulting messages concerning data rescue due to block corruptions, I can say: getting all readable data from the corrupt database objects is really the easy part of it.

      The problems begin to get big, once the application developers need to think of reports to check and repair consistency from application level.

      > However if you're talking data corruption as in "crap data" has been loaded into the database, or a rogue ABAP has corrupted several million rows of data, then this becomes a little more tricky. If the issue is identified immediately, restoring from backup is a feasible option for us.

      > If the issue happened over 48hrs ago, then restoring from a backup is not an option. We are a 24x7x365 manufacturing operation, shipping goods all around the world. We produce and ship too much product in a 24hr window that cannot be rekeyed (or so the business says) if the data is lost.

      Well in that case you're doomed. Plain and simple. Don't put any effort into getting "tricky", just never ever run any piece of code that has not passed the whole test factory. That's really the only chance.

      > We would have to get tricky and do things such as restore a copy of the production database to another server, and extract the original "good" documents from the copy back into the original, or hopefully the rogue ABAP can correct whatever mistake they originally made to the data.

      That's not a recovery plan - that is praying for mercy.

      I know quite a few customer systems that went to this "solution" and had inconsistencies in their system for a long long time afterwards.

      > Look...there are hundreds of corruption scenarios we could talk about, but each issue will have to be evaluated, and the decision to restore or not would be decided based on the issue at hand.

      I totally agree.

      The only thing that must not happen is: open a conference call and talk about what a corruption is in the first place, why it happened, how it could happen at all ... I have spent hours of precious lifetime in such nonsense conference calls, only to see that there is no plan for this on the customer side.

      > I would love to think that this is something we could do daily to a sandpit system, but with a 1.7TB production database, our backups take 6hrs, a restore would take about 10hrs, and the consistency check ... well a while.

      We have customers saving multi-TB databases in far less time - it is possible.

      > And what a luxury to be able to do this ... do you actually know of ANY sites that do this?

      Quick Backups? Yes, quite a few. Complete Backup, Restore, Consistency Check cycle? None.

      So why is that? I believe it's because there is no single button for it.

      It's not integrated into the CCMS and/or the database management software.

      It might also be (hopefully) that I never hear of these customers. See, as a DB Support Consultant I don't get in touch with "success stories". I see failures and bugs all day.

      To me the correct behaviour would be to actually stop the database once the last verified (!) backup is too old. Just like everybody is used to when hitting a LOGFULL / ARCHIVER STUCK situation.
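The "too old" rule could be expressed as a small check that a scheduler or monitoring job evaluates. This is only a sketch of the idea in Python: the function name, the seven-day default and the source of the timestamp are assumptions for illustration, not something MaxDB or CCMS provides.

```python
from datetime import datetime, timedelta
from typing import Optional


def verified_backup_is_stale(
    last_verified: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=7),  # assumed policy: verify at least weekly
) -> bool:
    """Return True when no verified backup exists or the latest one is older than max_age."""
    if last_verified is None:
        # Never verified counts as stale.
        return True
    return now - last_verified > max_age
```

What happens when the check returns True (an alert, a ticket, or something stricter) is a policy decision the sketch deliberately leaves open.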

      Until then - I guess I will have a lot more data rescue to do...

      > Had a read ... being from New Zealand I could easily relate to the sheep =)

      😊

      > That's not what I meant. Like I said we are a 24x7x365 system. We get a maximum of 2hrs downtime for maintenance a month. Not that we need it these days as the systems practically run themselves. What I meant was that between 7am and 7pm are our busiest peak hours, but we have dispatch personnel, warehouse operations, shift supervisors ..etc.. as well as a huge amount of batch running through the "night" (and day). We try to maintain a good dialog response during the core hours, and then try to perform all the "other" stuff around these hours, including backups, opt stats, and business batch, large BI extractions ..etc..

      > Are we busy all day and night ... yes ... very.

      Ah ok - got it!

      Especially in such situations I would not try to implement consistency checks on your prod. database.

      Basically, running a CHECK DATA there does not mean anything. Right after a table has been checked it can get corrupted while the check is still running on other tables. So you never really have a guaranteed consistent state in a running database.

      On the other hand, what you really want to know is not: "Are there any corruptions in the database?" but "If there were any corruptions in the database, could I get my data back?".

      This latter question can only be answered by checking the backups.

      > Noted and agreed. Will do daily backups via MaxDB kernel, and a full verification each week.

      One more customer on the bright side 😊

      > One last question. If we "restored" from an EVA snapshot, and had the DB logs up to the current point-in-time, can you tell MaxDB just to roll forward using these logs even though a restore wasn't initiated via MaxDB?

      I don't see a reason why not - if you restore the data and log area and bring the db to admin mode, then it uses the last successful savepoint for startup.

      If you then use recover_start to supply more logs, that should work.

      But as always this is something that needs to be checked on your system.

      That has been a really nice discussion - hope you don't take my comments as offensive, they really aren't meant that way.

      KR Lars

  • Jun 07, 2008 at 07:10 PM

    What you can do is write a script to create the snapshot from OS level (using SSSU) and trigger the backup from MaxDB instead of "manually".

    The backup medium definition has an "OS command" field.

    Markus


    • Former Member

      Hi Markus,

      From what I can tell, the only way to use the OS Command option is to define the medium type as tape. It also requires me to specify an actual tape device.

      Is there a way around this so that I can execute the OS Command without actually trying to back up to tape at the same time?

      Cheers

      Shaun