Aug. 9, 2007, 4:26 a.m.
posted by oxy
Item 16: Consider your lookup carefully
Quiz time: What, precisely, does the term location transparency mean in its original context? In the early days of networks, it was rare for a program to go off-machine to make use of some resource, assuming the machine was even part of a network. In fact, if your organization had more than one computer, more often than not it was more cost-effective to just set up the traditional SneakerNet system rather than wade through all the techno-details of setting up a more formal network.
But as computers became more ubiquitous and networks became more prevalent, network and operating systems vendors began to realize that some kind of transparency was necessary: in other words, we as users want the system to hide the fact that certain processes and resources are physically distributed across multiple computers. In essence, transparency was a desire to add a layer of abstraction on top of the network; whereas before it was acceptable to force users to know that the file they wanted was on the machine named FILESERVER in the sharepoint titled PUBSRV03, now users just want to know that "it's on the M: drive," where the M: drive in a Windows-based network is "mapped" to the file server's sharepoint just mentioned. UNIX Network File System (NFS) goes even further, extending the single-rooted namespace across the network in such a way that most users can't even tell what machine the file /usr/home/~neward/book.pdf lives on, or at least they won't be able to tell until that machine stops working. (A popular alternative definition of a distributed system, attributed to Leslie Lamport, is "You know you have one when the crash of a computer you've never heard of stops you from getting any work done" [Tanenbaum, 7].)
There are several forms of transparency in a distributed system; Andrew S. Tanenbaum, for example, describes eight different forms of transparency: access (hiding differences in data representation and how a resource is accessed), location (hiding where a resource is located), migration (hiding that a resource may move to another location), relocation (hiding that a resource may move to another location while in use), replication (hiding that a resource is replicated), concurrency (hiding that a resource may be shared by several competitive users), failure (hiding the failure and recovery of a resource), and persistence (hiding whether a software resource is in memory or on disk) [Tanenbaum].
Within the J2EE space, all of these are hidden behind a single API intended to provide the layer of indirection required to pull off this degree of transparency: JNDI. When first approaching J2EE, it's not uncommon for a novice Java programmer to question the necessity of JNDI in the first place; it's the boilerplate that you "just always do" to get hold of something you want to work with. For most J2EE practitioners, that's where JNDI ends, too: once the boilerplate code is out of the way, you just toss it aside and move on to the good stuff, calling methods on the returned DataSource, EJBHome, or whatever.
Stop. JNDI exists in J2EE for a very real and highly underrated purpose, and simply ignoring it this way runs a very real risk of serious problems later. Consider, for a moment, what happens when we write the following lines of code:
URL u = new URL("http://www.neward.net/ted/weblog");
u.openConnection();
When we open the connection to the URL, the Java networking libraries do a Domain Name System (DNS) lookup to find the IP address of my home server, then open a standard TCP/IP connection across the Internet to that IP address on port 80. Question: Why not just embed the IP address directly in the URL? Why take the hit of doing this lookup (even if it is cached somewhere on the local machine or even the process) if we don't have to? Why not just hit http://168.150.253.23/ted/weblog (or whatever my IP address is by the time this book ships) directly? Arguably, you don't care what the IP address is, you care only about its human-readable representation. While this argument holds water when talking about your average "clueless user," for distributed systems where presumably clueless users aren't the ones writing the calls across machines, it's less effective. No, the main reason we accept the overhead of doing the DNS lookup on each and every TCP/IP connection like this is a combination of several transparency concepts.
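The trade-off can be made concrete with a small sketch. It resolves the host name at the moment of connecting, so the code is bound to a name rather than to a particular machine. The class and method names (LateBinding, connect) are invented for illustration, and the JVM and operating system may still cache resolved addresses for some time, so a DNS change is not necessarily visible immediately.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;

final class LateBinding {
    /**
     * Resolves hostName each time it is called and connects to whatever
     * address the name currently maps to. Passing an IP literal such as
     * "127.0.0.1" skips the name lookup entirely, which is exactly the
     * hard-coding the text warns about.
     */
    static Socket connect(String hostName, int port) throws IOException {
        // Resolution happens here, at run time, not when the code was written.
        InetAddress address = InetAddress.getByName(hostName);
        return new Socket(address, port);
    }
}
```

A caller would write something like LateBinding.connect("www.neward.net", 80); if the server later moves to a different address, only the DNS record changes and the calling code stays as it is.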
Think what would happen if you cached the IP address of my server and then I decided to upgrade (or downgrade, if Slashdot decided they don't like me anymore and the expensive clustered Web farm at the ISP is no longer cost-effective). Suddenly, your code starts failing left and right, and you have to make a code change to accommodate the move.
JNDI supplies an important piece of the middleware puzzle, that of lookup: in order to avoid being accidentally broken by changes in resource configuration and/or replication, we put a layer of abstraction between users and the resource itself. The classic scenario is that of the database. Prior to the JNDI days, developers had to establish JDBC Connection objects by knowing, a priori, the database URL they wished to connect to and passing it to DriverManager to construct a Connection object. Of course, this means that we have to pass the URL into the DriverManager.getConnection call, which leads to the uncomfortable and unacceptable hard-coded scenario that we all loathe: as soon as our code needs to connect to a different database (doing a deployment into the QA environment and needing to connect to the QA database instead of the development database, for example), we have to go back and change the code.
So, using JNDI, we create a programmer-friendly JNDI name ("jdbc/MyDataSource") to act as the indirection link and let system administrators change the JDBC URL as they wish or need through the application server's administrative interface. In fact, if for some reason the database administrator needs to take the production database down for a while and wants to redirect all database traffic to a different database instance, the administrator can just change the URL on the other side of the JNDI name, and our code will happily start sending requests to that backup database instead of the production database.
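A minimal sketch of that indirection follows, assuming the code runs inside a J2EE container in which an administrator has bound a DataSource under jdbc/MyDataSource. The class name OrderDao and the java:comp/env prefix (the conventional component environment namespace) are assumptions for illustration. The lookup is performed on every call, so the code always goes through the naming layer instead of holding on to an earlier result.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

final class OrderDao {
    // Logical name only. The JDBC URL behind it belongs to the administrator.
    // The "java:comp/env/" prefix is an assumption about the deployment.
    static final String DATA_SOURCE_NAME = "java:comp/env/jdbc/MyDataSource";

    /**
     * Looks the DataSource up by name and asks it for a connection.
     * Contrast with DriverManager.getConnection(someUrl), which would pin
     * this class to one specific database.
     */
    static Connection openConnection() throws NamingException, SQLException {
        Context ctx = new InitialContext();
        try {
            DataSource ds = (DataSource) ctx.lookup(DATA_SOURCE_NAME);
            return ds.getConnection();
        } finally {
            ctx.close();
        }
    }
}
```

Outside a container there is no naming provider configured, so the lookup fails with a NamingException; that is expected, since the name only has meaning where someone has bound a resource to it.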
Lo and behold, we have just taken an important step toward yielding better uptime statistics.
But this layer of abstraction works only when you play by the rules and continue to go through that layer of abstraction to find the resources in question. Numerous J2EE books have been written suggesting that you cache off your JNDI lookups in order to avoid a remote round-trip across the network to the resource's home machine (the EJB server in the case of looking up an EJB, for example). Understand something very important here: if you cache these lookup results, you can't "switch over" without at least restarting the server. Remember how failover was supposed to be a large part of why we adopted J2EE in the first place? Think very hard about whether saving a potential round-trip across the network is worth losing the ability to silently deal with a switchover like this, particularly since in many cases the application server can do some smart caching of its own and avoid the need for the round-trip after all.
By the way, bear in mind that the traditional client/server style of lookup enshrined in the JNDI API isn't the only form of lookup available to us: the first half of the 2000s was all abuzz about a new form of resource sharing named "peer-to-peer," or "P2P" for short. Napster, Kazaa, and Gnutella all proclaimed revolutionary ways of sharing (and stealing, if you want to get right down to it) resources like MP3s and videos. But if you crack open the code for these sorts of systems and take a look underneath, you discover something interesting: in almost all cases, the "peer-to-peer" nature of the system is purely in the lookup aspects. In other words, all Napster really did was tell you who near you was available, and from there your client engaged in traditional client/server interactions with the other party (acting as the server) to discover what songs they had, and you in turn acted as a server to them for the same purposes.
All Napster really did was tell the two of you that you were around and that you might want to talk to each other. In short, Napster allowed two otherwise unaware processes to discover one another: it's just lookup by a different name. (Interestingly enough, a JNDI provider for this kind of dynamic discovery isn't all that difficult to write, since the JNDI API provides for JavaBean event-style notifications; using that, a potential client could ask its local JNDI provider implementation for a callback when something else entered the naming context—presumably in response to a broadcast announcement of a new server—thus making JNDI a dynamic peer-to-peer system as well. But as of yet no vendor has taken JNDI to the point of supporting this.) There's a lot of room within enterprise applications for discovery as a lookup mechanism; consider the idea, for example, of a client being able to "register" with a discovery service for notifications when a server of interest to that client becomes available. Two existing technologies from Sun (among others) provide this kind of capability: Jini, which uses an RMI-friendly model, and JXTA, which builds an entirely new API for doing so based on XML data exchange (thus providing for a more access-transparent mode of interacting with non-Java resources). Unfortunately, not much has been made of either of these technologies, or the concept of discovery itself for that matter, at the J2EE level, so making use of discovery may require stepping outside the J2EE API for a bit. One of the easiest ways of doing discovery, for example, is to use UDP/IP, the connectionless peer to TCP/IP. A "client" can do a broadcast UDP/IP request to a LAN, and any machines on the LAN that are listening on the given UDP/IP port will respond to the packet, typically sending back enough information about themselves to give the client the ability to connect back over traditional TCP/IP. (This is how Jini's DiscoveryService works, by the way.) 
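The UDP exchange just described can be sketched with the standard java.net classes. Everything protocol-specific here is invented for illustration: the probe text "DISCOVER", the reply format (the responder's TCP port as decimal text), and the names UdpDiscovery, startResponder, and discover. It is not Jini's wire format. On a real LAN the target would be a broadcast address; any address that a responder listens on works the same way.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

final class UdpDiscovery {
    static final String PROBE = "DISCOVER"; // hypothetical probe payload

    /** Server side: answers each probe with the TCP port clients should use. */
    static Thread startResponder(DatagramSocket socket, int tcpPort) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[256];
            try {
                while (!socket.isClosed()) {
                    DatagramPacket request = new DatagramPacket(buf, buf.length);
                    socket.receive(request);
                    String text = new String(request.getData(), 0,
                            request.getLength(), StandardCharsets.US_ASCII);
                    if (!PROBE.equals(text)) {
                        continue; // ignore unrelated traffic
                    }
                    byte[] reply = Integer.toString(tcpPort)
                            .getBytes(StandardCharsets.US_ASCII);
                    socket.send(new DatagramPacket(reply, reply.length,
                            request.getSocketAddress()));
                }
            } catch (IOException e) {
                // The socket was closed or failed; stop answering.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Client side: sends one probe and returns the TCP endpoint of the first
     * responder, or null if nobody answers within timeoutMillis.
     */
    static InetSocketAddress discover(InetAddress target, int udpPort,
                                      int timeoutMillis) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setBroadcast(true); // permits a broadcast target address
            socket.setSoTimeout(timeoutMillis);
            byte[] probe = PROBE.getBytes(StandardCharsets.US_ASCII);
            socket.send(new DatagramPacket(probe, probe.length, target, udpPort));

            byte[] buf = new byte[256];
            DatagramPacket reply = new DatagramPacket(buf, buf.length);
            try {
                socket.receive(reply); // the first answer wins
            } catch (SocketTimeoutException e) {
                return null;
            }
            int tcpPort = Integer.parseInt(new String(reply.getData(), 0,
                    reply.getLength(), StandardCharsets.US_ASCII));
            return new InetSocketAddress(reply.getAddress(), tcpPort);
        }
    }
}
```

The client learns both pieces of information it needs from a single reply: the responder's address comes from the packet itself and the TCP port from its payload. A production version would also need to validate replies and decide how far to trust an unauthenticated responder.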
So one cheap way to get a certain amount of clustering is to set up the application servers on two or more machines, then set up UDP/IP listeners on each. When it comes time for the middleware layer to find a server to execute some processing, issue the UDP/IP broadcast and take the first server that responds. Taking the first one to respond also provides a certain amount of load balancing, since a server that's being hammered will most likely take longer to respond anyway.
One interesting aspect that emerges from this idea of discovery as a lookup mechanism is the idea of self-healing networks. For years, we've struggled with the problem that networks are not always reliable: routers go down, power goes out, network trunks get cut by street construction workers, and other disasters occur. Normally, when a client can't communicate with the server, it just gives up and terminates because most client applications can't perform any processing without the server. In fact, most client applications check whether the server is up just once, at the start of the application, and never bother to check again, assuming the server will always be there.
In a self-healing network environment, however, the client is written explicitly to deal with the idea that the server could "go away" for whatever reason. So, for example, in a rich-client application (see Item 51), when a server appears to go "offline," the client can display some kind of "disconnected" icon, informing the user that things aren't well. If the application is written to use a local database (see Item 44), the client can simply use discovery to determine when the server comes back up (either by polling periodically, as JXTA does, or else registering with a discovery service, as Jini does) and resume connected operations at that point. Voilà: the application "repaired" or "self-healed" a network outage that would kill other applications. The user saw nothing but successful operation (to a point, anyway).
Whether or not you choose to explore discovery as a lookup mechanism, don't treat the lookup portions of your middleware coding as something you just have to get out of the way to get to the good stuff. Lookup is fundamental to the success of any middleware system because it enables location transparency. By the way, note very carefully that location transparency only wants to hide where the resource is located, not the fact that the resource is located remotely. Location transparency is sometimes cited in situations (such as the common fallacy that "I don't care where objects are") where it's not the right kind of transparency to use; see Item 18 for details.
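The self-healing client described above might be sketched as follows, using the polling style. All names are invented for illustration (SelfHealingClient, submit, poll), the in-memory queue stands in for the local database, and the reachability check is supplied by the caller; it could be a discovery probe or any other test.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

final class SelfHealingClient {
    private final BooleanSupplier serverReachable; // for example, a discovery probe
    private final Consumer<String> sendToServer;   // the real remote call
    private final Deque<String> pending = new ArrayDeque<>(); // stand-in for a local store
    private boolean connected = false;

    SelfHealingClient(BooleanSupplier serverReachable, Consumer<String> sendToServer) {
        this.serverReachable = serverReachable;
        this.sendToServer = sendToServer;
    }

    /** Accepts work whether or not the server is up; sends it now if possible. */
    synchronized void submit(String update) {
        pending.addLast(update);
        if (connected) {
            flush();
        }
    }

    /**
     * One polling step, meant to be called periodically. Re-checks the server
     * and, if it is back, replays everything that accumulated while offline.
     * The return value can drive a "disconnected" indicator in the UI.
     */
    synchronized boolean poll() {
        connected = serverReachable.getAsBoolean();
        if (connected) {
            flush();
        }
        return connected;
    }

    private void flush() {
        // Remove an item only after it was sent, so a failure keeps it queued.
        while (!pending.isEmpty()) {
            sendToServer.accept(pending.peekFirst());
            pending.removeFirst();
        }
    }
}
```

The point of the design is that the client checks for the server repeatedly rather than once at startup, and that an outage changes where work is held, not whether the user can keep working.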