Tuesday, November 25, 2008

"When All Else Fails..."

This is on a shirt that I received as a present at work today.

Monday, November 10, 2008

Our "Friend" ORA-01555

Have you ever had your Oracle program ever get the dreaded ORA-01555: snapshot too old: rollback segment number 10 with name "_SYSSMU10$" too small error, you rerun the program, and it runs to completion with no errors? No changes to the code. No changes to the program parameters. No changes to the database. You just rerun the program and it works. As an application developer I would just say that the DBA needs to increase the amount of snapshot space and leave it up to the DBA to handle it. But since it is not a re-creatable event, the DBA could not do anything because we could not tell how much to increase the rollback space...

Well I ran into this sporadically at work recently and was able to overcome the problem. I overcame the problem with a programming change and not a database change. Basically the application when it was complete was deleting data out of a number of tables. We had some deletes inside of a cursor loop at the program's end.

FOR info IN cursor_x LOOP
DELETE FROM table1 WHERE ...
DELETE FROM table2 WHERE ...
...
END LOOP.

I finally determined that the problem was one iteration through the loop was deleting so much data that the database was cycling through the rollback space and when the cursor got to the top of the loop and attempted to get the next row of "info" the error would appear. So it was not the delete, but the "get" of the the row. The snap shot was too old. The database could not locate the next row of data in the cursor loop.

What was causing the sporadic nature of this problem was that we would have up to three of these programs running simultaneously. Some runs deleted small amounts of data and others deleted large amounts of data. If the programs were single threaded there was no problems. So the rollback space was large enough for one program to run, but multiple runs with at least one deleting a large amount of data would not work. In my option single threading the programs was not a fix. It was just a work around.

So since the data used for the cursor loop was not affected by the DELETEs inside the loop I did a BULK COLLECT and iterated through the array like:

OPEN cursor_x;
FETCH cursor_x INTO info;
CLOSE cursor_x;

FOR i IN 1 .. info.COUNT LOOP

DELETE FROM table1 WHERE ...
DELETE FROM table2 WHERE ...
...
END LOOP.

I figured I would share this in case you run into it. Usually I see the issue discussed from a DBA's perspective and not the programmer's perspective.

We have had no problems since. I had better not say that too loudly, the Oracle kernel may hear me;)