July 25, 2010, 3:11 p.m.
posted by oxy
Item 75: Don't be afraid of JNI code on the server

For years, Java developers have had something of a love/hate relationship with code outside the JVM. Quite frequently, we need to get at that code, but doing so from within the JVM is something of an art form at best, a quick way to a crashed process at worst. The problem, of course, is that accessing anything outside the purview of the Java environment requires using the JNI, which typically means we're back to pointers, unmanaged code, and C/C++. While most Java programmers really don't have anything personal against C/C++ compilers, there's a reason we like to write code in Java: automatic memory management, a virtual machine that pretty much eliminates wild pointers and process crashes if we accidentally dereference a null pointer, and so on.

It's a hard fact of the Java programmer's life, however, that the Java environment doesn't cover everything, despite Sun's efforts to the contrary. At times we will need to access something from Java that requires going through a C/C++-based API to do it, and that brings us directly back to JNI.

When that happens, take a deep breath, put your courage to the sticking place, and dive right in. The truth is, JNI really isn't as bad as it seems; in fact, assuming you're already comfortable with C/C++ in general, the hard part about JNI isn't writing (and debugging!) the code. It's trying to figure out why the JVM won't load your shared library from within your J2EE environment that typically drives the Java developer mad.

First things first, however. When faced with the task of writing JNI code, there are basically two ways to go about it: the hard, low-level way, and the easier, high-level approach. The hard, low-level way is to write JNI code as defined by the JNI Specification.
Write your Java class with methods defined using the native keyword, use the javah utility to generate C header stubs (which can then be cut and pasted as the starting point for the C/C++ implementation), and then write a shared library (.dll under Win32, .so or a similar construct under most flavors of UNIX) that contains the implementation of the native method, using the JNIEnv structure to get function pointers that allow for calling back into the JVM when necessary. Want to allocate a Java string as part of your native method? Call back into the JVM to take the C character string and turn it into a Java string. Want to turn a passed Java string into a C-style null-terminated character array? Call back into the JVM to do it. Want to allocate a Java object? You guessed it: call back through that JNIEnv structure into the JVM again. Oh, and don't forget to check for a Java exception at every step, just in case the JVM runs into a problem (like an OutOfMemoryError or a ClassNotFoundException). Doing this gets tedious, and anything that a programmer finds tedious very quickly turns error-prone.

There's a better way, however, and it comes in a variety of flavors: let somebody else write the low-level code for you. A number of toolkits scattered around the Internet, both commercial and open-source, can take many of the onerous parts of JNI code off your hands. Some of the open-source alternatives include Sheng Liang's "Shared Stubs" code from his book The Java Native Interface [Liang], Stu Halloway's Jawin (Java-Windows) library from his book Component Development for the Java Platform [Halloway], and JACE, an open-source project that provides C++ wrappers around Java objects in order to make calling those Java object methods easier from within native code.
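Before reaching for a toolkit, it may help to see how small the Java half of the hand-written route is. The following is only a minimal sketch: the class name NativeGreeter, the library name "greeter", and the pure-Java fallback are all assumptions made for illustration, not something the text prescribes. The native half (the C/C++ function generated from the header stub) is not shown.

```java
public class NativeGreeter {
    // Hypothetical library name; resolves to greeter.dll on Win32 or
    // libgreeter.so on most UNIX flavors.
    private static final boolean LOADED = tryLoad("greeter");

    // Implemented in C/C++. For this class in the default package, the
    // generated stub would be named Java_NativeGreeter_greetNative.
    private static native String greetNative(String name);

    // Attempts to load the library and reports whether it succeeded,
    // so a missing library does not take the whole class down.
    static boolean tryLoad(String libName) {
        try {
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    // Uses the native implementation when available, otherwise a
    // plain Java fallback (an illustrative choice, not a requirement).
    public static String greet(String name) {
        if (!LOADED) {
            return "Hello, " + name + " (pure Java fallback)";
        }
        return greetNative(name);
    }

    public static void main(String[] args) {
        System.out.println(greet("server"));
    }
}
```

Catching UnsatisfiedLinkError this way is a design choice for the sketch; in a real deployment you may prefer to let the failure surface so that a missing library is noticed immediately.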
Take full advantage of these toolkits when you can. Most have flexible licensing schemes that will accommodate even the stingiest managers and the most inflexible lawyers, and should the open-source community not serve your needs, a number of companies provide commercial alternatives.

Whichever way you get your JNI code written, now the hard part comes: precisely where do you put this shared library so that the JVM will pick it up, even from within a J2EE container? The answer comes in two parts. First, the JVM looks for shared libraries in the manner common to the operating system, meaning it uses the dynamic-loading policies of the underlying operating system; on a Win32 box, for example, consult the LoadLibrary Win32 API for details of exactly where the Win32 loader will look for a DLL (it includes the PATH, the WINNT directory, the WINNT/SYSTEM32 directory, and the current directory).

However, the JVM also augments this collection of locations with one other, "portable" location: within the JRE's lib directory, there is a CPU-specific directory into which shared libraries can go. For example, on a Win32 JVM, the lib directory contains an i386 directory. Normally, the only file there is a configuration file (see Item 68 for details), but if you drop a native-library DLL in here, it's automatically part of the path the JVM will search for native libraries when using the System.loadLibrary call, which is yet another reason to use separate JRE instances (see Item 69).

In fact, part of the path the JVM searches, which includes those directories specified by the underlying operating system, is controlled by a JVM system property, the java.library.path property. Changing this at the Java launcher command line via the standard -D option replaces the JVM's default (which is the bin directory of the JRE, the current directory, and the PATH value on a Win32 box, for example).
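When a library refuses to load, a quick way to see what the JVM was given is to print java.library.path and check each directory for the platform-specific file name. The sketch below does that using only standard calls (System.getProperty, System.mapLibraryName, File.pathSeparator); the class name LibraryPathProbe and the default library name "greeter" are hypothetical. It inspects only the directories listed in the property, so any additional locations the operating system loader consults are not covered.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class LibraryPathProbe {
    // Builds the list of files that would have to exist for libName
    // to be found in one of the directories on searchPath.
    static List<File> candidates(String libName, String searchPath) {
        // e.g. "greeter" becomes greeter.dll or libgreeter.so,
        // depending on the platform the JVM runs on.
        String fileName = System.mapLibraryName(libName);
        List<File> result = new ArrayList<>();
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries in this sketch
            }
            result.add(new File(dir, fileName));
        }
        return result;
    }

    public static void main(String[] args) {
        String libName = args.length > 0 ? args[0] : "greeter";
        String searchPath = System.getProperty("java.library.path", "");
        System.out.println("java.library.path = " + searchPath);
        for (File f : candidates(libName, searchPath)) {
            System.out.println((f.isFile() ? "found   " : "missing ") + f);
        }
    }
}
```

Running it with an explicit setting, for example java -Djava.library.path=/opt/myapp/native LibraryPathProbe greeter (the directory is a placeholder), shows how the -D option replaces the default rather than appending to it.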
However, it's important to point out that J2EE containers support the hot deployment of components, meaning that we can insert, remove, and upgrade components deployed into a J2EE container without taking the server down. If the container is already handling requests from clients, this implies that two versions of the component could exist simultaneously in the JVM, as long as they are loaded by separate ClassLoader instances. Unfortunately, this isolation provided by ClassLoaders doesn't extend to native libraries, and most operating systems will load a dynamic library only once. Requests to load the same library, typically differentiated only by the library's name, will essentially no-op.

To the JNI programmer, this means that once a native library is loaded, we're pretty much done for the day: getting a JVM to unload a native library is an exercise in utmost frustration. The JNI Specification states that a native library will be unloaded when the ClassLoader associated with the class whose native methods it implements is unloaded, but trying to force a ClassLoader to unload is virtually impossible within the JVM. As a result, if you're writing native libraries that will need to be hot deployed into a J2EE container (like a servlet container), make sure that each library has a differentiating name based on its version number; this will fool most operating systems into believing these are separate libraries and therefore will allow them to be loaded when the new version of the component is hot deployed into the container. The drawback, of course, is that if the component gets hot deployed multiple times, it's highly likely that multiple copies of the same native library will be loaded into the process, making the overall footprint of the JVM process that much larger.

Obviously, "going native" in Java is not the optimal case.
Your code loses some portability (if portability is important to you; see Item 11), it weakens the stability of the JVM because it opens the possibility of unmanaged code (i.e., non-Java code) accidentally trashing parts of the process, it weakens the secure environment Java code executes in because unmanaged code has no SecurityManager and/or Permissions model to protect it, and so on. But for those scenarios where you absolutely, positively must escape the JVM, JNI (and the assorted higher-level toolkits) give you the power to do so.
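Returning to the hot-deployment advice above, one way to give each build of a native library a differentiating name is to fold the component's version into the name passed to System.loadLibrary. This is a rough sketch under stated assumptions: the class VersionedNativeLoader, the base name "imaging", and the underscore naming scheme are invented for illustration, and the build would have to produce a shared library file with the matching name.

```java
public class VersionedNativeLoader {
    // Combines a base library name and a version into one name,
    // replacing anything that is not a letter or digit with '_'.
    static String versionedName(String baseName, String version) {
        return baseName + "_" + version.replaceAll("[^A-Za-z0-9]", "_");
    }

    // Loads the version-specific library. Throws UnsatisfiedLinkError
    // if no library with that name is on the JVM's search path.
    static void load(String baseName, String version) {
        System.loadLibrary(versionedName(baseName, version));
    }

    public static void main(String[] args) {
        // Hypothetical component: "imaging", redeployed as 1.2.0 and 1.3.0.
        for (String version : new String[] {"1.2.0", "1.3.0"}) {
            String name = versionedName("imaging", version);
            System.out.println(name + " -> " + System.mapLibraryName(name));
        }
    }
}
```

Because each redeployed version loads a distinctly named file, the operating system treats them as separate libraries, which is also why the process footprint grows with every hot deployment.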