July 17, 2007, 4:37 p.m.
posted by oxy
Item 36: Use savepoints to keep partial work in the face of rollbackUnfortunately, transaction atomicity isn't always a wonderful thing. In particular, one of the problems that can arise with an atomic transaction is that when something goes wrong within the transaction's workspace, everything goes wrong. Once a database statement has failed within the transaction, every bit of the work conducted as part of that transaction is also invalid now and therefore must be aborted. Normally, we regard this behavior as a positive thing. After all, the reason we want atomic transactions is so that we don't have to write the rollback or compensating transaction code (see Item 28) to undo what has been done in the event of a failure. But even though this is the behavior we want most of the time, sometimes it works against us. As we saw in the trip planning scenario (again, see Item 28), it's not always the case that we want to abort the entire scope of work executed so far. In simple scenarios, this all-or-nothing failure model works well, but certain actions build up more context along the way, not all of which is invalidated in the event of a failure—just because we couldn't book that final leg from Frankfurt to Munich doesn't mean the first two legs of the trip need to be thrown away. To put this into a more systemic, concrete light, imagine for a moment a banking system calculating interest on all of its savings accounts. One way to do this is to set up a message-driven bean inside an EJB container and set up the new EJB Timer Service to fire a message at a specific time. When this message is received, the InterestCalculatingBean executes a pretty straightforward SQL UPDATE statement to increase the balance of all savings accounts by whatever the current interest rate is. Naturally, we want to make sure this calculation is completed; to be more accurate, we need to make sure all of our account holders' accounts are updated, so we put this into a single transaction to make sure the action is atomic. Unfortunately, there are well over ten million accounts at our bank, and it takes a while to run through an UPDATE statement on ten million rows (particularly if the SQL statement is at all complex, such as only updating those savings accounts with a minimum balance plus those accounts held by VIPs and those accounts opened by members between the dates of February 15 and April 15—you get the idea). Even on heavy iron, this could easily take several hours, perhaps all night, to run successfully. On the 9,947,831st row, we hit a snag, the UPDATE fails, and the entire transaction must be aborted. Guess we'll just have to try again next month, right? After all, the interest probably doesn't amount to all that much money.... This is one case where the standard transaction semantics don't serve us well—it's not really necessary to abandon all the work done before the failure because all we really need to do is mark the failed row and continue onward. For this reason the JDBC 3.0 Specification introduced a new concept into the API, called the savepoint. The idea here is fairly simple: it effectively plants a stake in the ground, telling the database that through "this point in time" (the time at which the savepoint was created), everything looks good and could be committed safely. Note the very careful wording of that statement: "could be committed," not "go ahead and commit"—we can still roll back work marked with a savepoint, if necessary. The benefit of savepoints comes when a failure occurs. If part of the transaction suddenly fails, we can roll back the transaction to the most recent savepoint, thereby abandoning only the work done since the last savepoint was demarcated, and start over again; in essence, the savepoint mechanism allows our hapless travel agent to abort the third leg of the flight and keep the first two as part of the established transaction. Using a JDBC Savepoint is straightforward. While using a Connection, at any point during the transaction, call setSavepoint either with a String to name the Savepoint or with no arguments to let the system choose a name for you, and a Savepoint object will be returned:
Connection conn = ...; // Get this from someplace
Statement stmt = conn.createStatement();
stmt.executeUpdate(someSQLHere);
Savepoint svpt = conn.setSavepoint();
stmt.executeUpdate(someOtherSQL);
Savepoint svpt2 = conn.setSavepoint();
ResultSet rs = stmt.executeQuery(SQLToTestWorkSoFar);
if (rs.next())
{
// Whoops! There's not supposed to be anything in here yet!
// Time to abort part of the work
//
conn.rollback(svpt);
// Note that we can roll back to any Savepoint, not just
// the last one marked
stmt.executeUpdate(someCorrectiveSQL);
}
stmt.executeUpdate(someMoreSQL);
conn.commit();
Savepoints offer the opportunity to preserve the work that has been done already within a transaction, so that we can choose whether our transacted work should be atomic or not. That's the good news; the bad news is that the savepoint model is brand-new to the JDBC 3.0 Specification, and as of this writing few database drivers actually support it. If you know the JDBC driver you're working with, make sure to find out whether it supports savepoints (see Item 49 for details, and use DatabaseMetaData. supportsSavepoints to find out). |
- Comment