Item 27: Prefer transactional processing for implicitly nonatomic failure scenarios




As programmers, we are used to a model where operations carried out by the languages we use are implicitly atomic: that is, a called function will either return the correct value or send back some kind of error code or exception. More importantly, we assume that it will return either a completely finished result or nothing at all; there's no such thing as a "partial" answer from the Math.sqrt method, for example. Either it's a good square root value or it's an unmistakable signal that no answer exists (if we ask it for the square root of –1, for example, Java's Math.sqrt returns NaN).

Unfortunately, this view of atomic operation is an entirely fictitious one. Almost nothing at the high level at which we program is truly atomic. For example, when we call the Math.sqrt method, the thread of execution we're using in turn starts to execute the code within Math.sqrt's definition, which in turn breaks down into individual CPU instructions, which in turn are executed serially on a CPU that must interrupt periodically to ask the bus to fetch some additional data from RAM, and so on. A great deal of work is done "under the hood" to present the illusion of atomicity; in fact, that's a large part of the job of a language and/or compiler: to hide much of this complexity so that it doesn't "get in the way" of what we really want to do, which in this case is find the square root of something.

Fortunately, presenting this degree of atomicity is a fairly simple operation, as long as everything we do is read-only; if anything goes wrong, we can just throw an exception and forget about the entire idea. If, however, the executed method does something to change the internal state of the object (or class, in the case of statics) against which it is called, now things aren't so simple. Should we return the state of the object/class back to its original values in the face of an error? (Probably.) What about methods we call in turn? Will they do so, as well?

Consider Item 46 in Effective Java [Bloch]: "Strive for failure atomicity." What holds for simple Java programming holds doubly for Java enterprise programming. In fact, it's even more difficult to achieve atomicity in enterprise programming because now there's more than just Java code to consider—we have to consider resources outside the JVM (such as database tables, message queues, files on the filesystem, and so on) as well.
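To make the plain-Java half of that concrete, here is a minimal sketch of failure atomicity coded by hand. The GuestList class and its methods are hypothetical, invented purely for illustration. The method does all of its work against a private copy and publishes the result only after every step has succeeded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, invented for illustration only.
class GuestList {
    private List<String> guests = new ArrayList<>();

    // Failure-atomic: either every name is added or none is.
    void addAll(List<String> names) {
        // Work on a private copy so a failure leaves 'guests' untouched.
        List<String> copy = new ArrayList<>(guests);
        for (String name : names) {
            if (name == null || name.isEmpty()) {
                throw new IllegalArgumentException("guest needs a name");
            }
            copy.add(name);
        }
        // Publish the result only after every step has succeeded.
        guests = copy;
    }

    // Returns a defensive copy of the current guest list.
    List<String> guests() {
        return new ArrayList<>(guests);
    }
}
```

This works only because everything the method touches lives inside the JVM. Once a database row or a queued message is part of the same unit of work, a copy-and-swap trick like this one no longer covers it.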

Enter transactional processing.

The basic model of a transaction, from the programmer's perspective, is a simple one: you create a transaction against a resource manager, something that you want to do some work against. The traditional resource manager, of course, is the relational database, but other resource managers are certainly possible, including but not limited to the JMS message broker, a legacy mainframe or other Connector-accessed system, and possibly even the underlying filesystem, if the filesystem supports it. In some cases, ambitious and failure-conscious programmers can even write their objects to implement the resource manager portions of the Java Transaction API, thus making the objects you work with transactional in nature.

Once the transaction is open, you write what work you want done as part of that transaction. This is the resource-specific work, such as executing a SQL statement, retrieving a message, issuing a request to the legacy system, or whatever. When all work is complete, you can choose to commit the transaction, thereby "making permanent" the changes you've suggested, or roll back the transaction, effectively throwing away any and all changes against the resource since the transaction started. (It turns out that, if supported, you can choose to throw away only parts of the work done; see Item 36 for details.)

In other words, either all of your changes succeed or they all fail. You don't have to deal with "partial failure" or "partial success." No need to write code to explicitly undo the actions of earlier work in the method if the third or fourth step suddenly fails for some reason—either everything works or nothing does. And no additional work on your part is necessary to make this atomic failure scenario a reality; it's all part of the transaction-processing model.
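Against a relational database, the open/work/commit-or-roll-back cycle might look like the following sketch, which uses plain JDBC local transactions. The MarriageDao class, the person table, and its id and spouse_id columns are assumptions made up for this example.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// A sketch only: the 'person' table and its columns are assumptions.
class MarriageDao {
    // Records both halves of a marriage as one unit of work.
    static void recordMarriage(Connection conn, int id1, int id2)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);          // open the transaction
        try {
            setSpouse(conn, id1, id2);
            setSpouse(conn, id2, id1);
            conn.commit();                  // make both updates permanent
        } catch (SQLException | RuntimeException e) {
            conn.rollback();                // throw away both updates
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }

    // Points one person's row at his or her new spouse.
    private static void setSpouse(Connection conn, int personId, int spouseId)
            throws SQLException {
        String sql = "UPDATE person SET spouse_id = ? WHERE id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, spouseId);
            ps.setInt(2, personId);
            ps.executeUpdate();
        }
    }
}
```

Notice that the catch block contains no undo logic of its own. It simply tells the resource manager to forget everything since the transaction began.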

To put this into practical perspective, consider the following code:






public class Person
{ . . . }

public class Minister
{
  public void marryPeople(Person spouse1, Person spouse2)
  {
    System.out.println("Do you, " + spouse1.getName() +
                       " take " + spouse2.getName() +
                       " to be your spouse?");
    if (spouse1.isOKToMarry(spouse2))
    {
      System.out.println("How about you, " +
                         spouse2.getName());
      if (spouse2.isOKToMarry(spouse1))
      {
        System.out.println("If any here know why these two " +
                           "should not be married, " +
                           "let them speak now " +
                           "or forever hold their peace.");
        if (Crowd.isOKWithMarriage(spouse1, spouse2))
        {
          System.out.println("Kiss, " + spouse1.getName() +
                             " and " +
                             spouse2.getName() +
                             ", I now pronounce you " +
                             "married.");
          spouse1.setSpouse(spouse2);
          spouse2.setSpouse(spouse1);

          System.out.println("Let's party!");
        }
      }
    }
  }
}


On the surface, it's a pretty simplified version of the traditional American wedding ceremony for two Person objects in the system. We do the usual "ask if everybody's OK with this marriage" ritual, asking first the one person, then the other, if they take so-and-so to be their lawfully wedded whatever, then turning to the crowd as a whole. If everybody agrees, we pronounce the pair married, party the night away, and send the happy couple off to Hawaii, or Bermuda, or Disney World, wherever the honeymoon will be.

(In fact, the traditional wedding is a prime example of the distributed two-phase commit protocol used in distributed transaction management: Phase 1 is the "vote" phase where all the resources involved in the transaction—the couple-to-be, as well as the audience—are asked to commit to the transaction, and Phase 2, where the results of the vote are announced. If any resource in the group votes "no," the deal's off; otherwise the happy couple heads to Bermuda. More on this in Item 32.)
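The voting idea in that aside can be sketched in a few lines of Java. Everything here (Participant, Voter, Coordinator) is a hypothetical, in-memory simplification; a real transaction manager also has to deal with timeouts, crashes, and recovery logs, all of which this sketch ignores.

```java
import java.util.List;

// Hypothetical contract for illustration; real XA resources are richer.
interface Participant {
    boolean prepare();  // Phase 1: vote yes (true) or no (false)
    void commit();      // Phase 2 when every vote was yes
    void rollback();    // Phase 2 when any vote was no
}

// A spouse or wedding guest whose vote is fixed up front.
class Voter implements Participant {
    private final boolean vote;
    String outcome = "undecided";

    Voter(boolean vote) { this.vote = vote; }

    public boolean prepare() { return vote; }
    public void commit()     { outcome = "committed"; }
    public void rollback()   { outcome = "rolled back"; }
}

class Coordinator {
    // Returns true if the group committed, false if it rolled back.
    static boolean run(List<? extends Participant> participants) {
        // Phase 1: collect votes, stopping at the first "no".
        boolean allYes = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allYes = false;
                break;
            }
        }
        // Phase 2: announce the single outcome to everyone.
        for (Participant p : participants) {
            if (allYes) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allYes;
    }
}
```

With a bride and groom who vote yes and a crowd that votes no, Coordinator.run returns false and every participant ends up rolled back, which is exactly the "deal's off" outcome.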

How many possible failure scenarios are here? Think about it for a moment before continuing.

Ready? Offhand, I count several.

  • We have a concurrency concern between the interrogation of the two spouses and "flipping the married bit" later in the code. If, for example, the same spouse is being used in another ceremony, it's entirely possible that the questions could come right after one another, since threads tend to run side-by-side. This is a simple Java concurrency problem, but it's not easily solved—we can't just mark the method itself synchronized because that will synchronize only this Minister object and won't lock the Person objects being used directly. And synchronizing methods on the Person class won't solve the problem, either, since it's the interleaving of the calls that's the problem, not the same object having two methods called simultaneously.

  • What happens if, thanks to the concurrency problem above, an AlreadyMarriedException is thrown out of spouse2.setSpouse? We never actually reset spouse1 to be single or remove spouse2 as spouse1's spouse, leaving poor spouse1 married to somebody who doesn't consider him- or herself to be married in turn. Such situations might be great fodder for writing soap operas and human drama, but they're generally considered bad for computer systems.

  • A slight variation: What happens if an OutOfMemoryError (or other Error or RuntimeException) is thrown from the same place? Again, one or the other of the two spouses isn't reset back to its original state.

  • And, of course, we haven't even considered what should happen with respect to paying the DJ, the florist, the church, or the minister for their efforts even if the ceremony itself fails; after all, despite the fact that the groom or bride suddenly backs out of the wedding, these people still expect to get remunerated for their efforts. We need to handle the situations where one spouse, both spouses, or even the crowd prevents the wedding from taking place by returning false out of the isOK... methods. Although it's not an issue in this case, in theory we have a half-dozen failure scenario possibilities (think of all the permutations of spouse1, spouse2, and crowd decisions).

I'm sure there are more possibilities here—a little imagination takes you a long way.
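The half-married scenario from the second bullet is easy to reproduce. In this sketch, NaivePerson and NaiveMinister are hypothetical stand-ins for the classes elided above, and I'm assuming AlreadyMarriedException is an unchecked exception thrown by setSpouse.

```java
// Hypothetical stand-ins for the Person and Minister classes in the text.
class AlreadyMarriedException extends RuntimeException {
    AlreadyMarriedException(String message) { super(message); }
}

class NaivePerson {
    private final String name;
    private NaivePerson spouse;

    NaivePerson(String name) { this.name = name; }

    void setSpouse(NaivePerson other) {
        if (spouse != null) {
            throw new AlreadyMarriedException(name + " is already married");
        }
        spouse = other;
    }

    NaivePerson getSpouse() { return spouse; }
}

class NaiveMinister {
    // No transaction and no undo logic: two independent updates.
    static void marry(NaivePerson a, NaivePerson b) {
        a.setSpouse(b);
        b.setSpouse(a);   // if this throws, 'a' stays married to 'b'
    }
}
```

Marry Chris to Sam, then try to marry Pat to Chris: the second ceremony throws, yet Pat's spouse field still points at Chris, who remains married to Sam.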

Now consider how we might write this in some kind of transactional system. (I'm taking some liberties here because Java doesn't support transactional processing out of the box, but the basic idea turns out about the same.)






public class Person { . . . }

public class Minister
{
  public void marryPeople(Person spouse1, Person spouse2)
  {
    Transaction txn = TransactionManager.getTransaction();
    txn.begin();

    // Enlist all the players who get a say in the results
    //
    txn.enlistResource(spouse1);
    txn.enlistResource(spouse2);
    txn.enlistResource(Crowd.getCrowd());

    try
    {
      if (spouse1.isOKToMarry(spouse2) &&
          spouse2.isOKToMarry(spouse1) &&
          Crowd.isOKWithMarriage(spouse1, spouse2))
      {
        spouse1.setSpouse(spouse2);
        spouse2.setSpouse(spouse1);

        goOnHoneymoon(spouse1, spouse2);

        // Pay for the wedding services

        // Everybody agreed and every step succeeded,
        // so make the changes permanent
        txn.commit();
      }
      else
      {
        // Somebody said no, so there is nothing to make permanent
        txn.rollback();
      }
    }
    catch (Exception x)
    {
      // No hand-written undo logic here: rolling back returns
      // spouse1 and spouse2 to their original state automatically
      txn.rollback();
    }
  }
}


Note that no additional work is necessary to restore the system to its original state in the event of a failure—any kind of failure—because the objects, being transactionally aware, can reset themselves. (If you're not ready to accept the idea of objects being transactionally aware, then imagine this is a stored procedure instead of Java code—the end result is the same.)
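If you're curious how an object might "reset itself," one simple approach is an undo log: each change registers the action that reverses it, and rollback replays those actions in reverse order. The sketch below is a single-threaded toy under that assumption. SimpleTxn, TxPerson, and TxMinister are hypothetical names, and none of this is the Java Transaction API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, single-threaded sketch; this is not the JTA API.
class SimpleTxn {
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    // Each change registers the action that would reverse it.
    void onRollback(Runnable undo) { undoLog.push(undo); }

    // Keep the changes: the undo actions are no longer needed.
    void commit() { undoLog.clear(); }

    // Discard the changes: run the undo actions, most recent first.
    void rollback() {
        while (!undoLog.isEmpty()) {
            undoLog.pop().run();
        }
    }
}

class TxPerson {
    private final String name;
    private TxPerson spouse;

    TxPerson(String name) { this.name = name; }

    void setSpouse(TxPerson other, SimpleTxn txn) {
        if (spouse != null) {
            throw new IllegalStateException(name + " is already married");
        }
        final TxPerson previous = spouse;
        txn.onRollback(() -> spouse = previous);  // remember how to undo
        spouse = other;
    }

    TxPerson getSpouse() { return spouse; }
}

class TxMinister {
    // Returns true if the couple is married, false if nothing changed.
    static boolean marry(TxPerson a, TxPerson b) {
        SimpleTxn txn = new SimpleTxn();
        try {
            a.setSpouse(b, txn);
            b.setSpouse(a, txn);
            txn.commit();
            return true;
        } catch (RuntimeException e) {
            txn.rollback();   // 'a' is restored without custom undo code here
            return false;
        }
    }
}
```

If Chris is already married to Sam, a call to TxMinister.marry(pat, chris) returns false and leaves Pat single, rather than half-married.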

When viewed this way, it's hard not to get excited about writing transactional code, particularly for code that's more complicated than the scenario above. When you think about it, it's the Java equivalent of never having to say "I'm sorry": the Resource Manager somehow magically restores the state of the system to where it was when you started, for anything and everything touched as part of this processing.

Unfortunately, transactional processing gets something of a bum rap. On the one hand, it's considered "old-fashioned" because it's the core principle behind many of the older mainframe systems ("Transaction Processing" is the "TP" in the terms "TP Monitors" and "TP Systems"). On the other hand, it's considered to be either something that only database administrators deal with on a regular basis or a "head-hurting" world filled with arcane specifications and indecipherable implementations. Truth is, most of that complexity is intended to be buried behind the transactional veneer; for most programmers, transactional systems are far, far simpler to work with than the alternative (coding it by hand) would ever be.

Things do get awkward if you need to work against other resources as part of this unit of work; for example, you need to not only make a change to a user's account but also record that change in an audit log in a separate database. This is where distributed transactions come into play, and while they make it possible for operations against multiple resource managers (e.g., more than one database) to still succeed or fail atomically, they carry their own costs, as described in Item 32.

Before you run off and immediately start looking for ways to make your objects intrinsically transactional, as is assumed in the Java examples presented earlier, bear in mind that nothing ever comes for free, and transactional processing definitely carries its share of costs. First of all, transactional processing isn't just about atomic failure; it's also a concurrency model. In fact, a transactional system is generally said to provide ACID properties (atomicity, consistency, isolation, and durability), and isolation typically means that any time a transaction is started against a resource, a lock must be taken out against that resource to ensure that other players in the system can't make updates simultaneously against it. These locks, if not properly managed (see Items 29 and 30), will yield a system that could only be described as a scalability disaster.

In the long run, however, once the scalability concerns are addressed, transactional processing offers a powerful mechanism for writing code that automatically handles the failure scenarios in a simplified, consistent, and coherent way. And in the end, that's a powerful tool for keeping it simple (see Item 25).