I recently spent an afternoon at the London Science Museum watching steam engines, pumps, turbines, and other antique machines operate. Some were so simple that their behavior was obvious, but most had sections cut away and replaced with glass windows so their inner workings could be observed. As I walked through the exhibits, I couldn't help but wonder how the craftsmen and maintenance crews of those machines were able to do their jobs without the cut-away models and diagrams that accompanied the exhibits here. Software engineers have it easy!
To a seasoned Java developer with access to source code, any program can look like the transparent models at the museum. Tools such as thread dumps, method call tracing, breakpoints, and profiling statistics can provide insight into what a program is doing right now, has done in the past, and may do in the future. But in a production environment, things aren't always so obvious. These tools are not typically available, or at the very least, they are usable only by trained developers. Support teams and end users also need to be aware of what an application is doing at any given time.
To fill this void, we've invented simpler alternatives such as log files (typically for server processes) and status bars (for GUI applications). However, because these tools must capture and report only a very small subset of the information available to them and often must present this information in a way that is easy to understand, they tend to be programmed explicitly into the application. This code is intertwined with the application's business logic in such a way that developers must "work around it" when trying to debug or understand core functionality, yet remember to update this code whenever that functionality changes. What we really want to do is to centralize the status-reporting logic in a single place, and manage the individual status messages as metadata.
In this article, I will consider the case of a status-bar component embedded in a GUI application. I will explore a number of different ways to implement this status reporter, starting with the traditional hard-coded idiom. Along the way, I will introduce and discuss a number of new features in Java 1.5, including annotations and run-time bytecode instrumentation.
The StatusManager
My primary goal is to create a JStatusBar Swing component that can be embedded in our GUI application. Figure 1 shows what the finished status bar looks like in a simple JFrame.
Figure 1. Our dynamically generated status bar
Since I do not wish to reference any GUI components directly from our business logic, I'll also create a StatusManager to act as an entry point for status updates. The actual notification will be delegated out to a StatusState object so that later this can be extended to support multiple concurrent threads. Figure 2 shows this arrangement.
Figure 2. StatusManager and JStatusBar
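The arrangement in Figure 2 can be sketched roughly as follows. This is only an illustration, not the code from the sample archive: the StatusListener interface, the current() method, and the use of one StatusState per thread are assumptions made for the sketch, and it is written against a current JDK rather than Java 1.5.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical listener type; a JStatusBar could implement it to repaint itself.
interface StatusListener {
    void statusChanged(String message);
}

// Holds the stack of active messages (one instance per thread in this sketch).
class StatusState {
    private final Deque<String> messages = new ArrayDeque<>();

    void push(String message) { messages.push(message); }

    void pop() { messages.pop(); }

    // The innermost active message, or null when nothing is in progress.
    String current() { return messages.peek(); }
}

// Static entry point, so business logic never touches Swing classes directly.
class StatusManager {
    private static final ThreadLocal<StatusState> STATE =
        ThreadLocal.withInitial(StatusState::new);
    private static final List<StatusListener> LISTENERS = new ArrayList<>();

    static synchronized void addListener(StatusListener listener) {
        LISTENERS.add(listener);
    }

    static void push(String message) {
        STATE.get().push(message);
        fire();
    }

    static void pop() {
        STATE.get().pop();
        fire();
    }

    static String current() { return STATE.get().current(); }

    // Tell every listener what the calling thread is doing now.
    private static synchronized void fire() {
        String message = current();
        for (StatusListener listener : LISTENERS) {
            listener.statusChanged(message);
        }
    }
}
```

Because the messages form a stack, a nested operation temporarily replaces the outer message and restores it when it pops.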
Now I need to write code that calls the methods on StatusManager to report on the progress of the application. These method calls are typically scattered throughout the code in try-finally blocks, often one per method.
public void connectToDB (String url) {
    StatusManager.push("Connecting to database");
    try {
        ...
    } finally {
        StatusManager.pop();
    }
}
The code in the 01_original directory in the peekinginside-pt1.tar.gz sample code (see References below) exhibits this traditional approach.
This code is functional, but after duplicating it dozens or even hundreds of times throughout your codebase, it can begin to look a bit messy. In addition, what if I want to access these messages in some other way? Later in this article, I'll define a user-friendly exception handler that shares the same messages. The problem is that I've hidden the status message in the implementation of our method, rather than on the interface where it belongs.
Attribute-Oriented Programming
What I really want to do is to leave any references to StatusManager out of our code altogether and simply tag the method with our message. I can then use either code-generation or run-time introspection to do the real work. The XDoclet Project refers to this as Attribute-Oriented Programming, and provides a framework which can convert custom Javadoc-like tags into source code.
However, with the inclusion of JSR-175, Java 1.5 has provided a more structured format for including these attributes inside of real code. The attributes are called "annotations" and they can be used to provide metadata for class, method, field, or variable definitions. They must be declared explicitly, and provide a sequence of name-value pairs that can contain any constant value (including primitives, strings, enumerations, and classes).
Annotations
To hold our status message, I'll want to define a new annotation that contains a string value. Annotation definitions look a lot like interface definitions, but the keyword @interface is used instead of interface and only methods are supported (though they act more like fields).
public @interface Status {
    String value();
}
As with an interface, I will put the @interface in a file named Status.java, and import it into any other files that need to reference it.
value may seem like a strange name for our field here. Something like message might be more appropriate; however, value has special meaning to Java. When value is the only element being set, its name may be omitted, which allows us to declare annotations as @Status("...") rather than @Status(value="..."), as a shortcut.
I can now define my method as follows:
@Status("Connecting to database")
public void connectToDB (String url) {
    ...
}
Note that to compile this code I needed to use the -source 1.5 option. If you're using Ant (as our code samples do) rather than building a javac command line directly, you'll also need Ant 1.6.1.
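To make the name-value pairs and the value shortcut more concrete, here is a small sketch. The @Operation annotation and the Severity enum are hypothetical and exist only for this illustration; they are not part of the sample code. The annotation is given run-time retention (a setting discussed below) purely so that the demo can read the values back through reflection.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hypothetical enum, used only to show an enumeration-typed element.
enum Severity { INFO, WARNING }

// Hypothetical annotation; it is not the article's @Status.
// RUNTIME retention is used only so describe() can read the values reflectively.
@Retention(RetentionPolicy.RUNTIME)
@interface Operation {
    String value();                              // may be set with the shortcut form
    int steps() default 1;                       // primitive element
    Severity severity() default Severity.INFO;   // enumeration element
    Class<?> owner() default Object.class;       // class-literal element
}

class OperationDemo {
    @Operation("Loading file")                   // shortcut: only value is given
    void load() { }

    // Once a second element is set, value must be named explicitly.
    @Operation(value = "Saving file", steps = 3, severity = Severity.WARNING)
    void save() { }

    // Reads the annotation on the named method and formats its elements.
    static String describe(String methodName) {
        try {
            Method m = OperationDemo.class.getDeclaredMethod(methodName);
            Operation op = m.getAnnotation(Operation.class);
            return op.value() + "/" + op.steps() + "/" + op.severity();
        } catch (NoSuchMethodException ex) {
            throw new IllegalArgumentException(methodName, ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("load"));    // prints "Loading file/1/INFO"
        System.out.println(describe("save"));    // prints "Saving file/3/WARNING"
    }
}
```

Elements that declare a default can be left out at the point of use, which is why the shortcut form still compiles here.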
In addition to classes, methods, fields, and variables, annotations can also be used to provide metadata for other annotations. In particular, Java comes with a few annotations that can be used to customize how your annotation works. Let's redefine our annotation as follows:
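import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.SOURCE)
public @interface Status {
    String value();
}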
The @Target annotation defines what the @Status annotation can reference. Ideally, I would like to mark up arbitrary blocks of code, but the only options are methods, constructors, fields, classes, local variables, parameters, packages, and other annotations. I'm only interested in code, so I choose METHOD.
The @Retention annotation allows us to specify when Java is free to discard this information. It can be either SOURCE (discard when compiling), CLASS (discard at class load time), or RUNTIME (do not discard). Let's start with SOURCE, but I will need to upgrade it later.
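The practical effect of the three retention policies can be seen with a small reflection experiment. The three marker annotations below are hypothetical names invented for this sketch; only the RUNTIME one should be visible to reflection, since the other two are gone (or ignored) by the time the class is loaded.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Three hypothetical marker annotations that differ only in retention.
@Retention(RetentionPolicy.SOURCE)
@interface SourceOnly { }

@Retention(RetentionPolicy.CLASS)
@interface ClassOnly { }

@Retention(RetentionPolicy.RUNTIME)
@interface KeptAtRuntime { }

@SourceOnly
@ClassOnly
@KeptAtRuntime
class Tagged { }

class RetentionDemo {
    // Number of annotations on Tagged that reflection can see.
    static int visibleAnnotationCount() {
        return Tagged.class.getAnnotations().length;
    }

    // Whether reflection can see the given annotation type on Tagged.
    static boolean isVisible(Class<? extends Annotation> type) {
        return Tagged.class.isAnnotationPresent(type);
    }

    public static void main(String[] args) {
        System.out.println(visibleAnnotationCount());          // prints 1
        System.out.println(isVisible(SourceOnly.class));       // prints false
        System.out.println(isVisible(ClassOnly.class));        // prints false
        System.out.println(isVisible(KeptAtRuntime.class));    // prints true
    }
}
```

A CLASS-retained annotation is still present in the .class file, which is what makes it usable by a tool that reads bytecode directly rather than going through reflection.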
Instrumenting Source Code
Now that my messages are encoded in metadata, I'll need some code to notify the status listeners. Let's assume for the moment that I continue to store my connectToDB method in source code control without any references to StatusManager. However, before compiling the class, I want to add in the necessary calls. That is, I want to insert the try-finally statements and the push/pop calls automatically.
The XDoclet framework is a source code generation engine for Java that uses annotations similar to those described above, but stored in Java source code comments. XDoclet works very well for generating entire Java classes, configuration files, or other build artifacts, but does not support the modification of existing Java classes. This limits its usefulness for instrumentation. Instead, I could use a parser tool like JavaCC or ANTLR (which comes with a grammar for parsing Java source code), but that requires a fair amount of effort.
There seems to be no good tool available for doing source code instrumentation of Java code. There may be a market for such a tool, but as you will see in the rest of this article, bytecode instrumentation can be a much more powerful technique.
Instrumenting Bytecode
Rather than instrumenting source code and then compiling it, I could compile the original source code and then instrument the bytecode that is produced. Depending on the exact transformation required, this can be either much easier or much more difficult than source code instrumentation. The main advantage of bytecode instrumentation is that the code can be modified at run time, without having a compiler available.
Although Java's bytecode format is relatively simple, I will certainly want to use a Java library to do the parsing and generation of the bytecode (e.g., to insulate us from future changes in the Java class file format). I chose to use Jakarta's Byte Code Engineering Library (BCEL), but I could just as easily have picked CGLIB, ASM, or SERP.
Since I will be instrumenting bytecode in a number of different ways, I'll begin by declaring a generic interface for instrumentation. This will act as a simple framework for doing annotation-based instrumentation. This framework will support the transformation of classes and methods, based on annotations, so the interface will look something like this:
public interface Instrumentor
{
    public void instrumentClass (ClassGen classGen,
                                 Annotation a);
    public void instrumentMethod (ClassGen classGen,
                                  MethodGen methodGen,
                                  Annotation a);
}
ClassGen and MethodGen are BCEL classes that implement the Builder pattern. That is, they provide methods for mutating an otherwise immutable object, and for transforming between the mutable and non-mutable representations.
Now I will need to write an implementation for this interface that replaces @Status annotations with the appropriate calls to StatusManager. As described previously, I want to wrap these calls in a try-finally block. Note that for this to work, the annotations that are used must be tagged with @Retention(RetentionPolicy.CLASS), which instructs the Java compiler not to discard the annotations while compiling. Since I declared @Status as @Retention(RetentionPolicy.SOURCE) earlier, I need to upgrade it.
As it turns out, in this case, instrumenting bytecode is significantly more difficult than instrumenting source code. The reason is that try-finally is a concept that exists in source code only! The Java compiler transforms try-finally blocks into a series of try-catch blocks and inserts calls to the finally block before every return. Thus, I will need to do something similar in order to add a try-finally block to existing bytecode.
This is the bytecode that represents an ordinary method call, flanked by StatusManager updates.
0: ldc #2; //String message
2: invokestatic #3; //Method StatusManager.push:(LString;)V
5: invokestatic #4; //Method doSomething:()V
8: invokestatic #5; //Method StatusManager.pop:()V
11: return
This is the same method call, but in a try-finally block, so that StatusManager.pop() is called even if an exception is thrown.
0: ldc #2; //String message
2: invokestatic #3; //Method StatusManager.push:(LString;)V
5: invokestatic #4; //Method doSomething:()V
8: invokestatic #5; //Method StatusManager.pop:()V
11: goto 20
14: astore_0
15: invokestatic #5; //Method StatusManager.pop:()V
18: aload_0
19: athrow
20: return
Exception table:
from to target type
5 8 14 any
14 15 14 any
As you can see, I need to duplicate some instructions and add several jumps and exception table records just to implement a single try-finally. Luckily, BCEL's InstructionList class makes this fairly easy.
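For readers who prefer source to bytecode, the second listing corresponds roughly to the hand-expanded form below. This is a sketch of the idea rather than exact compiler output: the push and pop methods here are local stand-ins for StatusManager so the example is self-contained, and the rethrow in desugared() relies on the precise-rethrow rule of Java 7 and later.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDesugared {
    // Stand-ins for StatusManager.push/pop that just record what happened.
    static final List<String> LOG = new ArrayList<>();
    static void push(String message) { LOG.add("push " + message); }
    static void pop() { LOG.add("pop"); }

    // What the programmer writes.
    static void withFinally(Runnable body) {
        push("message");
        try {
            body.run();
        } finally {
            pop();
        }
    }

    // Roughly what the bytecode does: the finally body appears twice,
    // once on the normal path and once in a catch-all handler that rethrows.
    static void desugared(Runnable body) {
        push("message");
        try {
            body.run();
        } catch (Throwable t) {   // the "any" row of the exception table
            pop();                // duplicated finally body
            throw t;              // astore_0 ... aload_0, athrow
        }
        pop();                    // finally body on the normal path
    }
}
```

Both methods leave the same trail of push and pop calls whether or not the body throws, which is the property the instrumented bytecode has to preserve.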
Instrumenting Bytecode At Compile Time
Now that I have an interface for modifying classes based on annotations and a concrete implementation of this interface, the last step is to write the actual framework that will call it. I'm actually going to write a few of these frameworks, starting with one that instruments all classes at compile time. Since this is going to happen as part of my build process, I've decided to define an Ant task for it. The declaration of the instrumentation target in my build.xml file should look something like this:
To provide an implementation of this task, I need to define a class that extends org.apache.tools.ant.Task. Attributes and sub-elements of our task are passed in through set and add method calls. The execute method is called to implement the work of the task -- in this case, to instrument the class files given in the specified fileset.
public class InstrumentTask extends Task {
    ...
    public void setClass (String className) { ... }
    public void addFileSet (FileSet fileSet) { ... }
    public void execute () throws BuildException {
        Instrumentor inst = getInstrumentor();
        try {
            DirectoryScanner ds =
                fileSet.getDirectoryScanner(project);
            // Java 1.5 "for" syntax
            for (String file : ds.getIncludedFiles()) {
                instrumentFile(inst, file);
            }
        } catch (Exception ex) {
            throw new BuildException(ex);
        }
    }
    ...
}
The one problem with using BCEL for this purpose is that as of version 5.1, it does not support parsing annotations. I could load the classes that we're instrumenting and use reflection to view the annotations. However, then I would have had to use RetentionPolicy.RUNTIME instead of RetentionPolicy.CLASS. I'd also be executing any static initializers in those classes, which may load native libraries or introduce other dependencies. Luckily, BCEL provides a plugin mechanism that allows clients to parse bytecode attributes that it does not natively support. I've written my own implementation of the AttributeReader interface that knows how to parse the RuntimeVisibleAnnotations and RuntimeInvisibleAnnotations attributes that are inserted into bytecode when annotations are present. Future versions of BCEL should include this functionality without the need for a plugin.
This compile time bytecode instrumentation approach is shown in the directory code/02_compiletime of the sample code.
There are a number of disadvantages to this approach, however. For one thing, I had to add an additional step to my build process. I also cannot turn the instrumentation on or off based on command-line settings or other information that is not available at compile time. If both instrumented and non-instrumented code needs to be run in a production setting, two separate .jars will need to be built and the decision of which to use must be all or nothing.
Instrumenting Bytecode at Class Loading Time
A better solution would be to delay the instrumentation of our bytecode until after it is loaded from the disk. This way, the instrumented bytecode does not need to be stored. The start-up performance of our application may suffer, but the trade-off is that you can control what happens based on system properties, or other runtime configuration data.
Prior to Java 1.5, it was possible to do this type of class-file manipulation with a custom class loader. However, the new java.lang.instrument package added in Java 1.5 provides a few additional tools. In particular, it defines the concept of a ClassFileTransformer, which can be used to instrument a class during the standard loading process.
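As a minimal sketch of the ClassFileTransformer contract, the class below leaves every class untouched and simply records the names it is offered, and only when a system property is set. The property name status.instrument and the class name are assumptions made for this example; they do not appear in the sample code.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// A transformer that changes nothing; it only records which classes it was offered.
class RecordingTransformer implements ClassFileTransformer {
    // Hypothetical property name, chosen for this sketch only.
    static final String ENABLED_PROPERTY = "status.instrument";

    // Classes may be loaded from several threads, so the list is synchronized.
    final List<String> seen = Collections.synchronizedList(new ArrayList<String>());

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (!Boolean.getBoolean(ENABLED_PROPERTY)) {
            return null;          // null tells the JVM to keep the original bytes
        }
        seen.add(className);      // internal form, e.g. "java/util/List"
        return null;              // a real transformer would return new bytes here
    }
}
```

Returning null is the documented way to say "no change", so switching instrumentation off at run time costs little more than the property lookup.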
To register our ClassFileTransformer at the appropriate time (before any of our classes are loaded), I'll need to define a premain method. Java will call this before the main class is even loaded, and it is passed a reference to an Instrumentation object. I will also need to add a -javaagent option to the command line to tell Java about our premain method. As used here, this argument takes the full name of your agent class (which contains the premain method) and an arbitrary string argument; note that released JDKs instead expect the path to an agent .jar whose manifest names that class in a Premain-Class attribute. In this case, we'll pass the full name of our Instrumentor class as the argument (this should all be on one line):
-javaagent:boxpeeking.instrument.InstrumentorAdaptor=
boxpeeking.status.instrument.StatusInstrumentor
Now that I've arranged for a callback that occurs before any annotated classes are loaded, and I have a reference to the Instrumentation object, I can register our ClassFileTransformer.
public static void premain (String className,
                            Instrumentation i)
    throws ClassNotFoundException,
           InstantiationException,
           IllegalAccessException
{
    Class instClass = Class.forName(className);
    Instrumentor inst = (Instrumentor)instClass.newInstance();
    i.addTransformer(new InstrumentorAdaptor(inst));
}
The adaptor that is registered will need to act as a bridge between the Instrumentor interface given above and Java's ClassFileTransformer interface.
public class InstrumentorAdaptor
    implements ClassFileTransformer
{
    public byte[] transform (ClassLoader cl,
                             String className,
                             Class classBeingRedefined,
                             ProtectionDomain protectionDomain,
                             byte[] classfileBuffer)
    {
        try {
            ClassParser cp =
                new ClassParser(
                    new ByteArrayInputStream(classfileBuffer),
                    className + ".java");
            JavaClass jc = cp.parse();
            ClassGen cg = new ClassGen(jc);
            for (Annotation an :
                     getAnnotations(jc.getAttributes())) {
                instrumentor.instrumentClass(cg, an);
            }
            for (org.apache.bcel.classfile.Method m :
                     cg.getMethods()) {
                for (Annotation an :
                         getAnnotations(m.getAttributes())) {
                    ConstantPoolGen cpg =
                        cg.getConstantPool();
                    MethodGen mg =
                        new MethodGen(m, className, cpg);
                    instrumentor.instrumentMethod(cg, mg, an);
                    mg.setMaxStack();
                    mg.setMaxLocals();
                    cg.replaceMethod(m, mg.getMethod());
                }
            }
            JavaClass jcNew = cg.getJavaClass();
            return jcNew.getBytes();
        } catch (Exception ex) {
            throw new RuntimeException("instrumenting " +
                                       className, ex);
        }
    }
    ...
}
This approach of instrumenting bytecode at startup is shown in the example code's /code/03_startup directory.
Conclusion
In this article, I have replaced a hard-coded solution with one that uses metaprogramming based on annotations and instrumentation. Although I've eliminated the need for any extra steps in our build process, my solution still has a number of limitations. In the next installment, I will explore a completely different implementation that uses thread sampling, and then combine these two techniques to create a solution that gives the best features of each. I will also discuss a number of additional requirements, including a progress bar and dynamic status messages.
References
Example source code for this article
"JSR-175 Public Draft Specification: A Program Annotation Facility for the Java™ Programming Language"
BCEL web site: http://jakarta.apache.org/bcel
XDoclet web site: http://xdoclet.sourceforge.net
"J2SE 1.5 in a Nutshell"
"Declarative Programming in Java"