Oct. 16, 2008, 9:23 p.m.
posted by oxy
Item 34: Consider using pessimistic concurrency for explicit concurrency controlConsider once again the problem of two users and one set of data, as introduced in Item 33. Each user reads the set of data, makes local changes, then seeks to update those changes back to the data store. Without some kind of control mechanism, we have a "last one wins" scenario—the last user to write his or her changes to the data store effectively overwrites any previous changes made by any other users. For most applications, losing one or more updates this way is generally considered by most users to be a Really Bad Idea. Optimistic locking (see Item 33) partially solves this problem by tagging each set of data with some kind of "marker" that will indicate when data has been updated by some other party, so that when the data is updated we get some awareness of the change. Unfortunately, optimistic locking assumes that receiving that notification at the time of update is acceptable—that is, we won't know about the change to the underlying data until we try to actually commit our transient state back. For some situations, like the ubiquitous e-commerce shopping cart, where abandonment of transient state is common, this is acceptable—a lot of users' shopping carts never actually turn into viable orders, so the update against the database never occurs. In other situations, however, the optimistic concurrency model's failure to notify until an update attempt creates a problem. Consider a Web application that requires a user to go through a complicated twelve-screen process in order to update a given order. If you come back with an error message along the lines of "We're sorry, somebody already updated this order" and force the user to start over from the first page, you're going to be looking for a new job pretty quickly. It gets more subtle than this, though—sometimes a system has to allow updates against the database even though another user has already "locked" the data. Some users may be "superusers," system administrators or supervisors who have to be able to indicate that certain customers are no longer able to purchase items from the e-commerce site (perhaps due to bad credit ratings or too many bounced checks). Or, perhaps, we need to run a global price change against all the items in the database (it's a 50% off sale starting December 26th), and that change needs to take effect regardless of how many users have open shopping carts. In such situations, both the optimistic concurrency model and the native locking model of the underlying resource manager fail us, and there's no other recourse but to handle it ourselves. This technique is called a pessimistic concurrency model (also known as the Pessimistic Offline Lock [Fowler, 426] or the Pessimistic Lock [Nock, 405] in some quarters), and while it has several drawbacks, it offers the overall benefit of giving you, the programmer, more control over what's going on. (It's called "pessimistic" because it assumes that other parties will attempt to update the same data you're currently using—we're "pessimistic" that naïve concurrency control will work.) In this model, the application programmer assumes full control over all concurrency semantics. Implementing a pessimistic concurrency scheme is pretty straightforward: before obtaining any data, your code checks a common "lock" table to see if that row is free: SELECT userid, lock_start, comment FROM lock_tbl WHERE lock_table=? AND lock_pk=? Here, the first parameter is the table you're wanting to fetch from, and the second parameter is the primary key value of the row you're wanting to fetch. If the result set returns empty, no lock exists for that row, and you first create a lock by running an INSERT command into the lock table before fetching the data you're interested in. Make sure to run DELETE on that row when you're done with the data, by the way, or the data will remain forever locked and thus inaccessible to the rest of the system. In some cases, you can embed the lock information in each row, appending a lock_user and/or lock_timestamp column to the table definition to contain the user ID and/or timestamp that indicates who locked this row and when. The first approach is less intrusive, the second one simpler. If the lock query returns a nonzero set of rows, the data is locked and you can either display a message to the user that the data is currently in use (e.g., "User 'FRED' is currently working with the data you've requested—please contact that user if you require access to this data") and abandon the request, or silently wait for some specified period of time and try again. At first glance, this seems like a lot of work. It is, no argument about it. What's worse, much of the work required isn't immediately obvious. For example, in order to avoid race conditions where two clients consult the lock table, find the lock isn't currently out, and simultaneously take out a lock against the same row (thus effectively overwriting one client's lock with the second client's lock), the lock-query/obtain-lock-update must be done under native transactional semantics. So the revised algorithm has to look something like this: BEGIN TRANSACTION SELECT userid, lock_start, comment FROM lock_tbl WHERE lock_table=? AND lock_pk=? (assuming no rows returned, continue with:) INSERT INTO lock_tbl (lock_table, lock_pk, userid, lock_start, comment) VALUES (?, ?, ?, ?, ?) COMMIT (having locked the data, now you can continue with:) SELECT * FROM table WHERE . . . You've also got to be careful that in the event of a failure, you release the lock; this implies some kind of exception handling in your processing code that will delete the lock from the lock table if a Java exception is thrown or some other failure occurs. This is a lot of work to do for each and every SQL statement you want to run, and it's best when tucked away behind a procedural-first persistence scheme (see Item 42). The payoff from a pessimistic model comes in the fact that you have much, much greater control over the concurrency that takes place; for example, to "bypass" a pessimistic lock, just don't bother with checking the lock table in the first place, relying instead on the underlying database transactional scheme to prevent any data corruption. Generally speaking, this would be done only under the most stringent circumstances, with the knowledge that clients could end up in some very odd situations if the circumstances handled correctly—for example, a user filling out a shopping cart could suddenly find that the total charged to his or her credit card was different from the total displayed on the previous screen because the price-change update came through right after that "Are you sure you want to place this order?" message was displayed. Alternatively, you might want situations where a superuser can "break" a lock on a given row or chunk of data. Perhaps the system has gotten itself into some kind of deadlock, or perhaps the superuser needs to reassign the data to a different user. ("Joe? Fred went on vacation, and he left a few items unfinished. Can you assign them to George so we can finally process those outstanding orders Fred started?") In fact, you can get as subtle and as tricky in your concurrency model as you wish. Perhaps it's possible for two users to simultaneously hold a lock against a given row, as long as they work on the same team, implicitly assuming that they can resolve any data concurrency problems between them in one of those infamous cubicle hallway conversations. Or perhaps certain users get read-only access to all data despite the presence of a lock but can't do updates. There are many possibilities. Diagnostics are also much simpler, as well as more informative, with a pessimistic locking scheme than in relying on native concurrency. When a user requests data already in use by another user, the diagnostic message can indicate who holds the lock and for how long, and with an always-online system, you can even go so far as to pop up some kind of message to the user holding the lock, asking if he or she can release it yet. You might also write some kind of daemon process that periodically scans the lock table, looking for locks that are older than x seconds/minutes/hours/days/months/years/whatever, releasing them automatically (with appropriate notifications to the users holding the locks) on the grounds that it's probably a deadlock or application failure. In some cases, pessimistic concurrency is the only form of concurrency available to you. For example, when working with files on the filesystem, because most filesystems don't yet provide any sort of X/Open transactional interface (see Item 32), in order to ensure that you're exclusively accessing a particular file, you'll often need to have some form of "external" concurrency model in place. This can again be the lock table discussed earlier, or many older systems use the concept of lock files, files that serve the same purpose as the lock table: create the lock file (usually by doing an "open" operating system API call of some form that fails if the file already exists), do the work, then delete the file when the work is complete. Note that you can use the lock file itself to hold the data normally stored in the lock table—user ID, comment, that sort of thing—and the file's creation timestamp can serve as the actual time of lock acquisition. This approach also holds the advantage that releasing the locks is particularly simple: just delete the lock files. (Skeptical that this whole approach works? If you're using CVS as your source control system, you've been using lock files the entire time—CVS uses them to ensure exclusivity of access when you're committing source files to the repository.) If you're particularly ambitious, you can even try to build a JTA-compliant resource manager around a pessimistic concurrency scheme, but this is usually a lot more work than most programmers are willing to undertake. It has the distinct advantage of providing pessimistic concurrency models while keeping the programmatic simplicity of the transaction processing model (see Item 27), which is particularly useful if you want pessimistic concurrency against one resource while working with multiple resource managers simultaneously. Pessimistic concurrency schemes provide you with a tremendous amount of power over the concurrency capabilities of your system; remember, however, as the comic book hero learned the hard way, "With great power comes great responsibility." Pessimistic concurrency gives you great flexibility, but it's up to you to make sure the concurrency decisions you make are solid ones; otherwise, corruption is the result. |
- Comment