on 10-11-2010 7:44 AM
Hi All,
Our backups have been failing with the message "0003 Error during initialization".
The BRBACKUP log shows:
BR0051I BRBACKUP 7.20 (3)
BR0055I Start of database backup: beehthla.anf 2010-10-11 03.00.02
BR0484I BRBACKUP log file: /oracle/DB2/sapbackup/beehthla.anf
BR0071E BRBACKUP currently running or was killed
BR0072I Please delete file /oracle/DB2/sapbackup/.lock.brb if BRBACKUP was killed
BR0073E Setting of BRBACKUP lock failed
BR0056I End of database backup: beehthla.anf 2010-10-11 03.00.02
BR0280I BRBACKUP time stamp: 2010-10-11 03.00.03
BR0054I BRBACKUP terminated with errors
When I open the lock file, it contains:
oradb2 2> more /oracle/DB2/sapbackup/.lock.brb
FULL PID=14801
When I check for that PID at the OS level, I get the output below. Could anyone help with this? Thanks.
<hostname>:oradb2 3> ps -ef | grep -i 14801
oradb2 14801 14678 0 Oct09 ? 00:00:01 brbackup -u / -c -t online_cons -a -c -cds
oradb2 14802 14801 0 Oct09 ? 00:00:00 [oracle] <defunct>
oradb2 14805 1 0 Oct09 ? 00:00:46 /usr/sap/DB2/SYS/exe/run/brconnect -O 14801
oradb2 14806 14801 0 Oct09 ? 00:00:00 sh -c ( /usr/sap/DB2/SYS/exe/run/backint -u DB2 -f backup -i /oracle/DB2/sapbackup/.beehjluw.lst -t file_online -p /oracle/DB2/102_64/dbs/initDB2.utl -c ) 2>&1
oradb2 30910 30775 0 17:40 pts/2 00:00:00 grep -i 14801
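Note that `grep 14801` also matches the grep command itself and any command line that merely mentions the number. As a minimal sketch (the function name `proc_family` is hypothetical and not part of BR*Tools), the listing can be narrowed to the process and its direct children by comparing the PID and PPID columns instead of matching text:

```shell
#!/bin/sh
# proc_family PID
# Reads "pid ppid args..." lines on stdin and prints only the lines whose
# PID or PPID equals the given PID (the process itself and its direct children).
proc_family() {
    awk -v p="$1" '$1 == p || $2 == p'
}

# Typical call (assumes a ps that supports the POSIX -o option):
#   ps -e -o pid= -o ppid= -o args= | proc_family 14801
```

A process that was started by brbackup but has since been re-parented (PPID 1, like the brconnect line above) would not show up in this filter and still has to be checked separately.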
Hello,
so the message did tell you
>BR0071E BRBACKUP currently running or was killed
You did check
> When i open the file it has oradb2 2> more /oracle/DB2/sapbackup/.lock.brb
> FULL PID=14801
>
> <hostname>:oradb2 3> ps -ef | grep -i 14801
> oradb2 14801 14678 0 Oct09 ? 00:00:01 brbackup -u / -c -t online_cons -a -c -cds
and you obviously found a backup still running...
Now you decided to fiddle with this semaphore file! WHY?
Your backup was still active for whatever reason.
You're doing a backup to an external backup utility, and maybe your session was still waiting for a mount confirmation or something similar.
Now the correct solution would have been to check your external backup tool.
If anything had gone wrong there, your next step should have been to kill the running "brbackup",
and not by giving it a "kill -9" headshot as so many people do, but with just a calm, friendly standard "kill",
which sends a SIGTERM and allows brbackup to clean everything up properly,
including the deletion of this semaphore file.
You should only delete these .lock files if you are very certain that the action these locks are
protecting has really completed or really aborted.
This is even worse with RMAN backups and third-party tools, because you can still have active sessions
in ST04 while brbackup is already dead. In this case you need to terminate these first, even if brbackup
is already able to start again!
So how many brbackups do you have active, now that you deleted the lock file and, I guess,
tried a restart (or two?) ...
Volker
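The check described above (is the process named in the lock file really gone?) can be sketched in a few lines of shell. This is only an illustration: `lock_pid` and `lock_state` are hypothetical helper names, the "FULL PID=14801" layout is taken from the lock file shown in this thread, and `kill -0` only gives a reliable answer when run as the owner of the process (here oradb2) or as root.

```shell
#!/bin/sh
# Extract the PID recorded in a lock file such as "FULL PID=14801".
lock_pid() {
    sed -n 's/.*PID=\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}

# Print one of: "missing", "unreadable", "active <pid>", "stale <pid>".
# "active" only means a process with that PID exists; it does not prove
# that the process is brbackup, so confirm with ps before acting.
lock_state() {
    lock_file=$1
    if [ ! -f "$lock_file" ]; then
        echo "missing"
        return 0
    fi
    pid=$(lock_pid "$lock_file")
    if [ -z "$pid" ]; then
        echo "unreadable"
        return 0
    fi
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if kill -0 "$pid" 2>/dev/null; then
        echo "active $pid"
    else
        echo "stale $pid"
    fi
}
```

For example, `lock_state /oracle/DB2/sapbackup/.lock.brb` would report the lock as active while PID 14801 exists; only a "stale" result, plus a check that no related backup sessions remain, would make deleting the file reasonable.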
Volker,
That was simply amazing. Your analysis is too good.
First of all, to answer your questions:
We did not fiddle with the backups. It is all automated using a cron job that runs in the middle of the night. I was just checking whether the backups had run. When we checked, the daily backups were not finishing and we wanted to know why, so I went through the log files and pasted them here.
We use NetBackup as our backup solution. So my understanding is that whenever something like this happens and the .lock.brb file is left behind, we shouldn't simply delete it; rather, we should kill the associated process, i.e. the PID recorded in the file. So the order would be to kill brbackup first and then NetBackup. Please correct me if my understanding is wrong.
Volker,
The backup has failed again today. So should I kill brbackup with a plain kill, and then the remaining processes with kill as well? Thanks.
hostname:oradb2 1> more /oracle/DB2/sapbackup/.lock.brb
FULL PID=14801
hostname:oradb2 2> ps -ef | grep 14801
oradb2 4851 4720 0 16:25 pts/2 00:00:00 grep 14801
oradb2 14801 14678 0 Oct09 ? 00:00:01 brbackup -u / -c -t online_cons -a -c -cds
oradb2 14802 14801 0 Oct09 ? 00:00:00 [oracle] <defunct>
oradb2 14805 1 0 Oct09 ? 00:01:03 /usr/sap/DB2/SYS/exe/run/brconnect -O 14801
oradb2 14806 14801 0 Oct09 ? 00:00:00 sh -c ( /usr/sap/DB2/SYS/exe/run/backint -u DB2 -f backup -i /oracle/DB2/sapbackup/.beehjluw.lst -t file_online -p /oracle/DB2/102_64/dbs/initDB2.utl -c ) 2>&1
No,
you should check on the NetBackup server why the session hangs.
NetBackup should "see" the session and have a status for you.
Maybe it is waiting for free media in your pool, or your backup policy does not permit
an interactive start (although you should get an aborted backup then).
If you do not get any information there, I'd start with
kill backint
If this does not terminate brbackup:
kill brbackup
After this, make sure no backup process is still running.
Then check for the lock file. If it is still there, you need to delete it as well,
but normally it gets cleaned up when brbackup is killed with a normal kill.
There is a section in the NetBackup client guide on how to enable a client trace.
You need to create some additional directories in your NetBackup "logs" subdirectory,
one of them named "backint", which should get 777 permissions.
Then restart the backup and check the NetBackup trace file for anything strange.
I seem to remember similar behavior on a NetBackup server
when the SAP integration license had expired, but it can be anything else.
Good hunting
Volker
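The escalation described above (a plain kill first, then a check that the process is really gone before doing anything harsher) can be sketched as a small shell helper. `term_and_wait` is a hypothetical name, the 30-second default is an arbitrary assumption, and the helper assumes it is run as the owner of the process (here oradb2) or as root.

```shell
#!/bin/sh
# term_and_wait PID [TIMEOUT_SECONDS]
# Sends SIGTERM (the default signal of a plain "kill") and waits for the
# process to disappear. Returns 0 if it is gone, 1 if it is still present
# after the timeout. It never sends SIGKILL; that decision stays with the caller.
term_and_wait() {
    pid=$1
    timeout=${2:-30}
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    # kill -0 only tests for existence. A defunct (zombie) entry still
    # counts as present until its parent has reaped it.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

With the PIDs from this thread, one would try `term_and_wait 14806` for the backint shell first and `term_and_wait 14801` for brbackup only if it is still running afterwards. A return value of 1 means the process ignored SIGTERM and needs a closer look.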
Volker,
I spoke to the NetBackup team. They asked us to kill the process and delete the file.
When I tried kill 14801,
the process did not terminate. How else can we kill this process, which has been running since Oct 9? Thanks.
sapkxdap09:oradb2 7> ps -ef | grep br
oradb2 14801 14678 0 Oct09 ? 00:00:01 brbackup -u / -c -t online_cons -a -c -cds
When you have lost hope, a reboot will always clean this up.
Whoa Eric, that looks to me like using a very big hammer to squash ants.
OK, if there is nothing else left, one can still try "kill -9".
I never said, never use it.
I just recommend not to use it as a standard procedure in the first try to clean things up.
V.
Hi,
As mentioned in the log, delete the file .lock.brb from /oracle/DB2/sapbackup and then re-run the backup (make sure that no other backup is running).
The .lock.brb file sets a lock so that the database cannot be backed up again until the current backup completes. Once the backup has completed, the file is removed from that location, allowing the DB to be backed up again.
If the backup is interrupted abruptly, .lock.brb remains and prevents further backups.
Hope this clarifies.
Regards,
Varadharajan M
If BRARCHIVE or BRBACKUP terminates, it is no longer necessary to delete the lock files manually. The process number of the running program is stored in these files. If, the next time you call these programs, the system finds that a process with this number is no longer active (so BRARCHIVE/BRBACKUP can no longer be running), the following warning is output:
BR049W Last BRARCHIVE/BRBACKUP run was probably killed and the processing is continued.
Please update your BR*Tools to the latest patch level.
Hi,
Pasted below the backup version,
brbackup -V
BR0051I BRBACKUP 7.00 (46)
Patch Date Info
9 2005-11-09 BRBACKUP ignores file copy errors in disk backups (note 896160)
10 2006-01-05 BR*Tools fail due to SAP license problems (note 912969)
11 2006-01-11 Small functional enhancements in BR*Tools (note 914174)
13 2006-03-29 BR*Tools support for MDM databases (note 936665)
15 2006-07-28 Substantial extensions in backups with BR*Tools (note 968507)
16 2006-08-11 BR*Tools start error: library libnnz10 not found (note 972136)
19 2006-10-30 Backup to disk fails on Windows with BR0278E (note 994136)
20 2006-11-24 Extended support for system copy in BR*Tools (note 1003028)
22 2007-01-10 Verification of database and archivelog files with RMAN (note 1016173)
24 2007-03-01 BR*Tools support for Oracle 10g RAC (note 1033126)
25 2007-04-26 BR*Tools failing with ORA-01455 for database > 16 TB (note 1050329)
26 2007-05-31 New BR*Tools command options (note 1060696)
30 2007-10-10 Support for RMAN save sets with disk backup (note 1101530)
31 2008-01-03 Aborting BRARCHIVE, BRBACKUP and BRRESTORE runs (note 1129197)
32 2008-02-05 Corrections for BR*Tools 7.00 patch 31 (note 1138968)
36 2008-07-29 Small functional enhancements in BR*Tools (2) (note 1235952)
37 2008-10-08 BRBACKUP fails with BR0146E for 'saveset_members' parameter (note 1259765)
41 2009-04-01 BR*Tools fail while calling function OpenService() (note 1325242)
45 2010-01-14 Restore of archivelog files fails with BR0100E (note 1426635)
46 2010-01-26 BR*Tools support for Oracle 11g (note 1430669)
release note 849483
kernel release 700
patch date 2010-01-26
patch level 46
make platform linuxx86_64
make mode OCI_102
make date Jan 29 2010
>What's with the defunct process?
I suspect some SQL query is running in the background.
If sqlplus is running as a background process and becomes idle (no active SQL), it will not finish until it is killed or brought to the foreground using fg (UNIX command).
Be aware of this if you want to shut down the database and have some sqlplus sessions in the background.