on 12-18-2017 11:37 AM
Hi Oracle experts,
Can somebody explain why there's a short dump Oracle error ORA-01555 snapshot too old in the following scenario ?
I'm surprised because I thought that the only possibility of ORA-01555 was that the same table is updated and read at the same time. In the current scenario, this table is not updated.
Scenario : 17 parallel jobs of the same program are running at the same time, each one reading the table ZTABLE1 and inserting into ZTABLE2 (the 2 tables have the same structure). There is no process updating ZTABLE1 at the same time.
This scenario is simplified here, but it really reads one table and inserts data into another table, no other table is accessed.
data: mycursor TYPE cursor,
itab TYPE TABLE OF ztable1.
Open cursor with hold mycursor on select * from ztable1
where package = <package dedicated to each job> " 1 to 17
order by primary key
Do.
Fetch next mycursor into table ITAB package size 10000.
if sy-subrc <> 0. EXIT. endif.
Insert ZTABLE2 from table ITAB.
call function 'DB_COMMIT'.
Enddo.
close cursor mycursor.
If I decrease the package size to 1000, there's no short dump any more.
I get the following short dump at the FETCH NEXT mycursor.
DBSQL_SQL_ERROR CX_SY_OPEN_SQL_DB SQL error "SQL code: 1555" occurred while accessing table "ZTABLE1".
Database error text: SQL message: ORA-01555: snapshot too old: rollback
segment number 39 with name "_SYSSMU39_3645373441$" too small
Return value of the database layer: "SQL dbsl rc: 99"
SAP system configuration :
ORACLE 11.2.0.3.0
Linux x86_64 PL 390
Kernel 742 Linux GNU SLES-11 x86_64 cc4.3.4 use-pr150925 PL 300
Oracle parameters :
SELECT RETENTION FROM DBA_TABLESPACES WHERE TABLESPACE_NAME = 'PSAPUNDO'
NOGUARANTEE
undo_management AUTO
undo_retention 43200
Thank you!
Sandra
I’ve seen this a few times on a large SAP system. The scenario was in principle similar to yours.
Oracle has a roll area where in progress transaction deltas are stored, the purpose being to provide read consistency and to store changes in a temporary place in order to roll back if needed. On commit the data is transferred to the log and then lastly saved to the main DB. If you throw too much at the DB then it can’t cope. My memory is a bit hazy here, but as I remember it one or both of these acts similar to a ring buffer, Oracle just keeps filling it in sequence, and when it reaches it’s tail then you start to see this sort of error.
So when you decrease your block size, you reduce the amount it needs to hold in it’s rollback buffer and flushes it to the log segment more often. The simple answer is to use a smaller block size or to increase the rollback buffer. The good answer is more involved and can only be done on your system, tuning large parallel jobs is an art form in itself.
Just in case you haven’t already found it, there’s also a nice technical blog about it over here.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Thank you very much, sorry for the late answer, I was quite busy.
This is all about the committed read. If I understand well, you describe the classic reason of ORA-01555. In my case, I don't understand the reason of the Oracle error on ZTABLE1 table, because it was not updated at the same time. The only reasons I could find in the web is about both an update and a read at the same time (and it's also what I ever thought).
The general algorithm of the program is:
I cannot say for 100% certain as my Oracle days were many moons ago, but I'm pretty sure the same principle applies:
Read consistency means that Oracle will always provide the same data within a transaction, even if it is updated in the meantime. In order to do so it may clone read-only data into the transaction buffer as well, if it is being modified elsewhere.
It is possible that if your update of ZTABLE1 is incomplete at the point ZPROG2 starts, that Oracle decides that ZTABLE1 is dirty and thus gets copied. Or something else could update ZTABLE1 once ZPROG2 has started, which would also force Oracle to buffer it.
So in other words:
Process 1 starts reading table 1
Process 2 modifies table 1 and does a commit.
Process 1 will still continue to receive the results of the query as it was at the start of the query.
In order to do this, even readonly data must be copied into the buffer.
More gory details at https://docs.oracle.com/en/database/oracle/oracle-database/12.2/cncpt/data-concurrency-and-consisten...
I am sure that the update was finished and committed. There was no SELECT during the UPDATE.
I have seen that there were a few bugs in Oracle, but I didn't check whether the corrections were applied. But I'm surprised that there still could be some bugs on so simplistic scenario (maybe not so simplistic at Oracle side).
By the way, I didn't tell that the update was about 300 millions of rows inserted in ZTABLE1. Maybe there was a kind of delaying/buffering at Oracle side.
Anyway, it's difficult now to investigate more, as there was a workaround (smallest SELECT cursor package size), and my client is happy with the solution.
Thank you. I mark your answer as the right one, because it explains the classic ORA-01555 problem.
Thanks. I think the detail falls under what I meant by “can only be done on your system”.
I am speculating here, but if you’re running Oracle RAC or a shadow database, it’s possible that Oracle will hold on to a transaction delta in the buffer until synchronisation is complete, even though the ‘local database’ transaction is consistent and committed. But I’m just making a semi-educated guess so don’t take my word for it.
Shadow DB = mirrored / hot standby DB. Ideally geographically separate.
User | Count |
---|---|
93 | |
10 | |
10 | |
9 | |
9 | |
7 | |
6 | |
5 | |
5 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.