We had to shut down a production database with db2_kill because it could not be stopped normally and had a problem with a full FAILARCHPATH. (After the TSM server had problems, archiving to TSM no longer succeeded, even after the TSM server was up again. We have had this problem before.)
The crash recovery is taking a very long time. Sometimes even db2 list utilities show detail seems to hang.
With db2pd -everything I can see the progress of the crash recovery:
Database Partition 0 -- Database PC1 -- Active -- Up 0 days 01:57:14 -- Date 05/07/2008 11:34:59
Recovery:
Recovery Status 0x00000C01
Current Log S0003363.LOG
Current LSN 061F2B330DBA
Job Type CRASH RECOVERY
Job ID 1
Job Start Time (1210145904) Wed May 7 09:38:24 2008
Job Description Crash Recovery
Invoker Type User
Total Phases 2
Current Phase 1
Progress:
Address            PhaseNum   Description   StartTime                  CompletedWork     TotalWork
0x000000020018E580 1          Forward       Wed May 7 09:38:24 2008    786766439 bytes   1998253346 bytes
0x000000020018E670 2          Backward      NotStarted                 0 bytes           1998253346 bytes
So after almost two hours the database has processed only about 39% of the bytes in the forward phase, and the backward phase still has to follow!
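To put numbers on this, here is a rough Python sketch that works out the progress, the average throughput, and the remaining time for the forward phase from the db2pd figures above. It assumes the recovery continues at the same average rate, which may well not hold, and it says nothing about the backward phase. The function name forward_phase_estimate is just something I made up for illustration.

```python
from datetime import datetime

# Figures copied from the db2pd output above.
job_start = datetime(2008, 5, 7, 9, 38, 24)    # "Job Start Time"
snapshot = datetime(2008, 5, 7, 11, 34, 59)    # "Date" in the db2pd header line
completed_bytes = 786_766_439                  # forward phase, CompletedWork
total_bytes = 1_998_253_346                    # forward phase, TotalWork


def forward_phase_estimate(start, now, completed, total):
    """Return (fraction done, bytes per second, seconds remaining).

    Assumption: the average rate so far stays constant, which is
    only a guess for a crash recovery.
    """
    elapsed_s = (now - start).total_seconds()
    fraction = completed / total
    rate = completed / elapsed_s
    remaining_s = (total - completed) / rate
    return fraction, rate, remaining_s


fraction, rate, remaining_s = forward_phase_estimate(
    job_start, snapshot, completed_bytes, total_bytes
)

print(f"forward phase done: {fraction:.1%}")
print(f"average rate:       {rate / 1024:.0f} KiB/s")
print(f"remaining (est.):   {remaining_s / 3600:.1f} h")
```

With these values the forward phase is running at roughly 110 KiB/s on average and would need about three more hours at that rate, before the backward phase even starts.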
In db2diag.log there have been no new entries since the crash recovery began at 09:38.
We moved one log file out of the FAILARCHPATH directory (which was 100% full) to a different directory, to make sure that the slow crash recovery has nothing to do with the full FAILARCHPATH.
The log_dir directory contains 20 log files (LOGPRIMARY + LOGSECOND); no more could be allocated there because log_dir is sized according to the log parameters.
Parameter UTIL_HEAP_SZ = 150000
Does anybody have an idea why the crash recovery is so slow?
Kind regards,
Uta