Friday, February 29, 2008

Open Source Java Reporting with JasperReports and iReport

JasperReports, a powerful, flexible open-source reporting engine, is easy to integrate into Java enterprise applications, but it lacks an integrated visual report editor. So, if you want to use JasperReports directly, you need to manipulate its XML report structure, a relatively technical activity with a steep learning curve, to say the least.

In fact, writing a full JasperReport from scratch using only the XML format is a long, painful, and unrewarding task.
Luckily, much easier alternatives are available. First and foremost, you can use a visual editor to design, compile, and test your reports.

One of the most useful visual editors you can use is iReport. This article demonstrates how to use iReport to leverage the full power of JasperReports without getting entangled in complexities of the JasperReports native XML format.

Getting Started

The first thing to do is download and install iReport. It is a Java application, so you will need a JDK on your machine (JDK 1.4 or higher). This tutorial uses JDK 1.5.0:

Download iReport from ireport.sourceforge.net.
Decompress the iReport archive.
Run the startup script (bin\startup.bat or ./bin/startup.sh).

The iReport download comes with its own JasperReports package (the latest version to date, 0.5.1, supports the recently released JasperReports 1.0.1).



Figure 1. The Tutorial's Employee Database Schema

Once you have iReport running, you can start designing your reports!


The Example Database
This tutorial uses a very simple database (see Figure 1) for demonstration. To follow along step-by-step, either download the scripts for setting up this database with MySQL and set it up on your machine, or use a similar database and translate the techniques to your situation.



Figure 2. Adding a New Database Connection


Adding a New Database Connection
First, add a new connection to your database. Use the "Datasource -> Connections/Datasources" menu to set up a new database connection (see Figure 2). If you choose a JDBC driver from the list (the example chooses MySQL), enter the server address and the database name, and then click the 'Wizard' button. iReport should provide you with a correct JDBC URL for your particular database.
Now that you have a datasource, it's time to do something with it.

Generating Huge reports in JasperReports

There are certain things to take care of when implementing JasperReports for a huge dataset, so that memory is handled efficiently and the application does not run out of memory.

They are:

1) Pagination of the data and use of JRDataSource,

2) Virtualization of the report.

When the dataset is huge, it is not a good idea to retrieve all the data at once. The application will hog memory and run out of it even before the Jasper report engine starts filling the report. To avoid that, the service layer or DB layer should return the data in pages. You gather the data in chunks and hand the records to the engine through the JRDataSource interface; when the records in the current chunk run out, you fetch the next chunk, until all the chunks are consumed.

By JRDataSource I do not mean the Collection data sources: implement the JRDataSource interface yourself and provide the data through next() and getFieldValue(). As an example, I took the "virtualizer" sample from the JasperReports samples and modified it slightly for this article. To see how to implement JRDataSource, have a look at the inner class "InnerDS" in the example.
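The chunk-by-chunk iteration can be sketched without any JasperReports classes. The following is only an illustration and not the code from the attached sample: PageFetcher and ChunkedCursor are hypothetical names, next() merely mirrors the contract of JRDataSource.next(), and current() stands in for getFieldValue(). It assumes Java 8 or later and a fetcher that returns an empty list once the data is exhausted.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical service-layer contract: at most pageSize rows starting at offset. */
interface PageFetcher<T> {
    List<T> fetch(int offset, int pageSize);
}

/** Walks paged data one record at a time, holding a single chunk in memory. */
class ChunkedCursor<T> {
    private final PageFetcher<T> fetcher;
    private final int pageSize;
    private List<T> chunk = Collections.emptyList();
    private int posInChunk = -1;      // index of the current record inside chunk
    private int nextOffset = 0;       // offset of the next chunk to request
    private boolean exhausted = false;

    ChunkedCursor(PageFetcher<T> fetcher, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    /** Advances to the next record; returns false once all chunks are consumed. */
    boolean next() {
        if (exhausted) {
            return false;
        }
        posInChunk++;
        if (posInChunk >= chunk.size()) {
            // Current chunk is used up: drop it and ask for the next one
            chunk = fetcher.fetch(nextOffset, pageSize);
            nextOffset += chunk.size();
            posInChunk = 0;
            if (chunk.isEmpty()) {
                exhausted = true;
                return false;
            }
        }
        return true;
    }

    /** Returns the record selected by the last successful next() call. */
    T current() {
        if (exhausted || posInChunk < 0) {
            throw new IllegalStateException("no current record");
        }
        return chunk.get(posInChunk);
    }
}
```

In a real data source, the body of next() would sit inside a JRDataSource implementation and the fetcher would call the paged service or DAO method. Only one chunk is referenced at any time, so earlier chunks become eligible for garbage collection.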



Even when the data is returned in chunks, the final report has to be a single file, and the Jasper engine builds a JasperPrint object for it. To keep memory from piling up at this stage, JasperReports provides a really cool feature called the Virtualizer. A virtualizer basically serializes pages and writes them to the file system to avoid the out-of-memory condition. There are three types of virtualizer as of now: JRFileVirtualizer, JRSwapFileVirtualizer, and JRGzipVirtualizer.

JRFileVirtualizer is a really simple virtualizer: you specify the number of pages to keep in memory and the directory into which the Jasper engine can swap the excess pages as files. Its disadvantage is file-handling overhead. It creates many files during virtualization and finally produces the required report file from them. If the dataset is not that large, you can go for JRFileVirtualizer.

The second virtualizer is JRSwapFileVirtualizer, which overcomes the disadvantage of JRFileVirtualizer. It creates only one swap file, which grows as needed. For the JRSwapFile you have to specify the swap directory, the block size, and the minimum number of blocks by which the file grows when it is full. Then, when creating the JRSwapFileVirtualizer, pass the JRSwapFile as a parameter along with the number of pages to keep in memory. This virtualizer is the best fit for huge datasets.

The third virtualizer is a special one that does not write data to files. Instead, it compresses the JasperPrint object using the Gzip algorithm to reduce heap consumption. The Ultimate Guide of JasperReports says that JRGzipVirtualizer can reduce memory consumption by 1/10th. If you are sure your dataset is not that big and you want to avoid file I/O, you can go for JRGzipVirtualizer.
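To make the principle concrete, here is a toy page store that keeps a fixed number of pages on the heap and serializes the least recently used ones to files in a directory. It is only a sketch of the general idea behind file-based virtualization, not how JasperReports implements its virtualizers; SpillingPageStore and its file naming are invented for this illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy page store: at most maxInMemory pages stay on the heap, the rest go to disk. */
class SpillingPageStore<P extends Serializable> {
    private final int maxInMemory;
    private final Path dir;
    // Access-ordered map: iteration starts at the least recently used page
    private final LinkedHashMap<Integer, P> inMemory = new LinkedHashMap<>(16, 0.75f, true);

    SpillingPageStore(int maxInMemory, Path dir) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be at least 1");
        }
        this.maxInMemory = maxInMemory;
        this.dir = dir;
    }

    void put(int pageNo, P page) throws IOException {
        inMemory.put(pageNo, page);
        evictIfNeeded();
    }

    /** Returns the page from memory or disk, or null if it was never stored. */
    @SuppressWarnings("unchecked")
    P get(int pageNo) throws IOException, ClassNotFoundException {
        P page = inMemory.get(pageNo);
        if (page != null) {
            return page;
        }
        Path file = fileFor(pageNo);
        if (!Files.exists(file)) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            page = (P) in.readObject();
        }
        Files.delete(file);
        put(pageNo, page); // bring it back; may push another page out to disk
        return page;
    }

    int pagesInMemory() {
        return inMemory.size();
    }

    private void evictIfNeeded() throws IOException {
        Iterator<Map.Entry<Integer, P>> it = inMemory.entrySet().iterator();
        while (inMemory.size() > maxInMemory && it.hasNext()) {
            Map.Entry<Integer, P> eldest = it.next();
            Path file = fileFor(eldest.getKey());
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(eldest.getValue());
            }
            it.remove();
        }
    }

    private Path fileFor(int pageNo) {
        return dir.resolve("page-" + pageNo + ".ser");
    }
}
```

One file per page is what makes this variant simple but I/O-heavy, which is the overhead attributed to JRFileVirtualizer above. A swap-file design would instead write all evicted pages into blocks of a single growing file.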

Check the sample to learn more about the coding part. To keep it simple, I reused the "virtualizer" sample and added the JRDataSource implementation with paging. I ran the attached sample for four scenarios. To tighten the limits and see the real effects, I ran the application with 10 MB as the max heap size (-Xmx10M).

1a) No Virtualizer, which ended up in out of memory with 10MB max heap size limit.

export:
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1



1b) No Virtualizer with default heap size limit (64M)

export2:
[java] null
[java] Filling time : 44547
[java] PDF creation time : 22109
[java] XML creation time : 10157
[java] HTML creation time : 12281
[java] CSV creation time : 2078




2) With JRFileVirtualizer
exportFV:
[java] Filling time : 161170
[java] PDF creation time : 38355
[java] XML creation time : 14483
[java] HTML creation time : 17935
[java] CSV creation time : 5812




3) With JRSwapFileVirtualizer
exportSFV:
[java] Filling time : 51879
[java] PDF creation time : 32501
[java] XML creation time : 14405
[java] HTML creation time : 16579
[java] CSV creation time : 5365



4a) With GZipVirtualizer with lots of GC
exportGZV:
[java] Filling time : 84062
[java] Exception in thread "RMI TCP Connection(22)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(24)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(25)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(27)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1



4b) With GZipVirtualizer (max: 13MB)
exportGZV2:
[java] Filling time : 59297
[java] PDF creation time : 35594
[java] XML creation time : 16969
[java] HTML creation time : 19468
[java] CSV creation time : 10313




I have shared the updated virtualizer sample files at Updated Virtualizer Sample files

Using Jasper Reports with Visual Web Pack

This tutorial illustrates the use of Jasper Reports with a Visual Web Pack application.
Register Jasper Reports library
Use the NetBeans Library Manager to create a library for the Jasper Reports class libraries. You need at least the following files from the distribution:
• dist/jasperreports-.jar
• lib/commons-beanutils-1.7.jar
• lib/commons-collections-2.1.jar
• lib/commons-digester-1.7.jar
• lib/commons-logging-1.0.2.jar
• lib/itext-1.3.1.jar
Register Jasper Reports image servlet
The image servlet is needed if you want HTML-rendered reports (even ones without any graphical elements, because report placeholders use images from this servlet). So you must register it in the web.xml configuration file. You can use the NetBeans web.xml editor to do so.
Servlet name : ImageServlet
Servlet class : net.sf.jasperreports.j2ee.servlets.ImageServlet
URL : /image
Insert methods for report output to application bean
The following methods in the application bean can be used to output a precompiled report as html or pdf. In this sample a collection of java objects is used as data source. For other data sources see the Jasper Reports documentation.
/**
 * Output Jasper Report
 *
 * @param filename Precompiled report filename
 * @param type Content type of report ("application/pdf" or "text/html")
 * @param data Collection of value objects
 */
public void jasperReport( String filename, String type, Collection data ) {
    jasperReport( filename, type, data, new HashMap() );
}

/**
 * Output Jasper Report
 *
 * @param filename Precompiled report filename
 * @param type Type of report ("application/pdf" or "text/html")
 * @param data Collection of value objects
 * @param params Map with parameters
 */
public void jasperReport( String filename, String type, Collection data, Map params ) {
    final String[] VALID_TYPES = { "text/html", "application/pdf" };
    // First check if type is supported
    boolean found = false;
    for ( int i = 0; i < VALID_TYPES.length; i++ ) {
        if ( VALID_TYPES[i].equals( type ) ) {
            found = true;
            break;
        }
    }

    if ( !found ) {
        throw new IllegalArgumentException( "Report type '" + type + "' not supported." );
    }

    // InputStream for compiled report
    ExternalContext econtext = getExternalContext();
    InputStream stream = econtext.getResourceAsStream( filename );

    if ( stream == null ) {
        throw new IllegalArgumentException( "Report '" + filename + "' could not be opened." );
    }

    // Use collection as data source
    JRBeanCollectionDataSource ds = new JRBeanCollectionDataSource( data );
    JasperPrint jasperPrint = null;

    try {
        jasperPrint = JasperFillManager.fillReport( stream, params, ds );
    } catch ( RuntimeException e ) {
        throw e;
    } catch ( Exception e ) {
        throw new FacesException( e );
    } finally {
        try {
            stream.close();
        } catch ( IOException e ) {
        }
    }

    // Configure exporter and set parameters
    JRExporter exporter = null;
    HttpServletResponse response = (HttpServletResponse) econtext.getResponse();
    FacesContext fcontext = FacesContext.getCurrentInstance();

    try {
        response.setContentType( type );

        if ( "application/pdf".equals( type ) ) {
            exporter = new JRPdfExporter();
            exporter.setParameter( JRExporterParameter.JASPER_PRINT, jasperPrint );
            exporter.setParameter( JRExporterParameter.OUTPUT_STREAM,
                    response.getOutputStream() );
        } else if ( "text/html".equals( type ) ) {
            exporter = new JRHtmlExporter();
            exporter.setParameter( JRExporterParameter.JASPER_PRINT, jasperPrint );
            exporter.setParameter( JRExporterParameter.OUTPUT_WRITER, response.getWriter() );
            HttpServletRequest request = (HttpServletRequest)
                    fcontext.getExternalContext().getRequest();
            request.getSession().setAttribute(
                    ImageServlet.DEFAULT_JASPER_PRINT_SESSION_ATTRIBUTE, jasperPrint );
            exporter.setParameter( JRHtmlExporterParameter.IMAGES_MAP, new HashMap() );
            exporter.setParameter(
                    JRHtmlExporterParameter.IMAGES_URI,
                    request.getContextPath() + "/image?image=" );
        }
    } catch ( RuntimeException e ) {
        throw e;
    } catch ( Exception e ) {
        throw new FacesException( e );
    }

    // Export report
    try {
        exporter.exportReport();
    } catch ( RuntimeException e ) {
        throw e;
    } catch ( Exception e ) {
        throw new FacesException( e );
    }

    // Tell JavaServer Faces that no more processing is necessary
    fcontext.responseComplete();
}
Start report output from page bean
The output of a report can be initiated from an ActionEvent with the following code:
try {
    getApplicationBean().jasperReport(
            "/reports/report.jasper",
            "application/pdf",
            getSessionBean().getSuchergebnisDataProvider().getList() );
} catch ( Exception e ) {
    Logger.getLogger(getClass().getName()).severe( e.getMessage() );
}

Using Hibernate queries with JasperReports

Introduction
In this article, we examine a performance-optimised approach for using Hibernate queries to generate reports with JasperReports.

JasperReports is a powerful and flexible Open Source reporting tool. Combined with the graphical design tool iReport, for example, you get a complete Java Open Source reporting solution. In this article, we will investigate how you can integrate JasperReports reporting with Hibernate data sources in an optimal manner, without sacrificing ease-of-use or performance.

Basic Hibernate/JasperReports integration
To integrate Hibernate and JasperReports, you have to define a JasperReports data source. One simple and intuitive approach is to use the JRBeanCollectionDataSource data source (This approach is presented here) :

List results = session.find("from com.acme.Sale");

Map parameters = new HashMap();
parameters.put("Title", "Sales Report");

InputStream reportStream
    = getClass().getResourceAsStream("/sales-report.xml");
JasperDesign jasperDesign = JasperManager.loadXmlDesign(reportStream);
JasperReport jasperReport = JasperManager.compileReport(jasperDesign);

JRBeanCollectionDataSource ds = new JRBeanCollectionDataSource(results);
JasperPrint jasperPrint = JasperManager.fillReport(jasperReport,
                                                   parameters,
                                                   ds);

JasperExportManager.exportReportToPdfFile(jasperPrint, "sales-report.pdf");


This approach will work well for small lists. However, for reports involving tens or hundreds of thousands of lines, it is inefficient, memory-consuming, and slow. Indeed, experience shows that, when running on a standard Tomcat configuration, a list returning as few as 10000 business objects can cause OutOfMemory exceptions. It also wastes time building a bulky list of objects before processing them, and pollutes the Hibernate session (and possibly second-level caches) with temporary objects.

Optimised Hibernate/JasperReports integration
We need a way to efficiently read and process Hibernate queries, without creating too many unnecessary temporary objects in memory. One possible way to do this is the following :

Define an optimised layer for executing Hibernate queries efficiently
Define an abstraction layer for these classes which is compatible with JasperReports
Wrap this data access layer in a JasperReports class that can be directly plugged into JasperReports
The Hibernate Data Access Layer : The QueryProvider interface and its implementations
We start with the optimised Hibernate data access. (You may note that this layer is not actually Hibernate-specific, so other implementations could provide other types of data access without impacting the design.)

This layer contains two principal classes :

The CriteriaSet class
The QueryProvider interface
A CriteriaSet is simply a JavaBean which contains parameters which may be passed to the Hibernate query. It is simply a generic way of encapsulating a set of parameters. A QueryProvider provides a generic way of returning an arbitrary subset of the query results set. The essential point is that query results are read in small chunks, not all at once. This allows more efficient memory handling and better performance.

/**
 * A QueryProvider provides a generic way of fetching a set of objects.
 */
public interface QueryProvider {

    /**
     * Return a set of objects based on a given criteria set.
     * @param firstResult the first result to be returned
     * @param maxResults the maximum number of results to be returned
     * @return a list of objects
     */
    List getObjects(CriteriaSet criteria,
                    int firstResult,
                    int maxResults) throws HibernateException;
}

A typical implementation of this class simply builds a Hibernate query using the specified criteria set and returns the requested subset of results. For example :

public class ProductQueryProvider implements QueryProvider {

    public List getObjects(CriteriaSet criteria,
                           int firstResult,
                           int maxResults)
            throws HibernateException {
        //
        // Build query criteria
        //
        Session sess = SessionManager.currentSession();
        ProductCriteriaSet productCriteria
            = (ProductCriteriaSet) criteria;
        // createQuery() returns a Query that can be paged;
        // find() would load the whole result list at once
        Query query = sess.createQuery("from com.acme.Product p "
                + "where p.categoryCode = :categoryCode ");

        query.setParameter("categoryCode",
                           productCriteria.getCategoryCode());
        return query.setCacheable(true)
                    .setFirstResult(firstResult)
                    .setMaxResults(maxResults)
                    .setFetchSize(100)
                    .list();
    }
}


A more sophisticated implementation is helpful for dynamic queries. We define an abstract BaseQueryProvider class which can be used for dynamic query generation. This is typically useful when the report has to be generated using several parameters, some of which are optional. Each derived class overrides the buildCriteria() method. This method builds a Hibernate Criteria object from the specified criteria set as appropriate :

public abstract class BaseQueryProvider implements QueryProvider {

    public List getObjects(CriteriaSet criteria, int firstResult, int maxResults)
            throws HibernateException {

        Session sess = SessionManager.currentSession();
        Criteria queryCriteria = buildCriteria(criteria, sess);
        return queryCriteria.setCacheable(true)
                            .setFirstResult(firstResult)
                            .setMaxResults(maxResults)
                            .setFetchSize(100)
                            .list();

    }

    protected abstract Criteria buildCriteria(CriteriaSet criteria, Session sess);
}

A typical implementation is shown here :

public class SalesQueryProvider extends BaseQueryProvider {

    protected Criteria buildCriteria(CriteriaSet criteria,
                                     Session sess) {
        //
        // Build query criteria
        //
        SalesCriteriaSet salesCriteria
            = (SalesCriteriaSet) criteria;

        Criteria queryCriteria
            = sess.createCriteria(Sale.class);

        if (salesCriteria.getStartDate() != null) {
            // Criteria restrictions take a property name, not a getter name
            // (assumes Sale exposes a startDate property)
            queryCriteria.add(
                Expression.eq("startDate",
                              salesCriteria.getStartDate()));
        }
        // etc...

        return queryCriteria;
    }
}



Note that a QueryProvider does not need to return Hibernate-persisted objects. Large-volume queries can sometimes be more efficiently implemented by returning custom-made JavaBeans containing just the required columns. HQL allows you to do this quite easily :

public class CityQueryProvider implements QueryProvider {

    public List getObjects(CriteriaSet criteria,
                           int firstResult,
                           int maxResults)
            throws HibernateException {
        //
        // Build query criteria
        //
        Session sess = SessionManager.currentSession();
        // createQuery() returns a Query that can be paged
        Query query
            = sess.createQuery(
                "select new CityItem(city.id, "
                + "                  city.name, "
                + "                  city.electrityCompany.name) "
                + " from City city "
                + " left join city.electrityCompany");

        return query.setCacheable(true)
                    .setFirstResult(firstResult)
                    .setMaxResults(maxResults)
                    .setFetchSize(100)
                    .list();
    }
}


Hibernate data access abstraction : the ReportDataSource interface
Next, we define a level of abstraction between the Hibernate querying and the JasperReport classes. The ReportDataSource does this :

public interface ReportDataSource extends Serializable {
    Object getObject(int index);
}

The standard implementation of this interface reads Hibernate objects using a given QueryProvider and returns them to JasperReports one by one. Here is the source code of this class (getters, setters, logging code and error-handling code have been removed for clarity) :

public class ReportDataSourceImpl implements ReportDataSource {

    private CriteriaSet criteriaSet;
    private QueryProvider queryProvider;
    private List resultPage;
    private int pageStart = Integer.MAX_VALUE;
    private int pageEnd = Integer.MIN_VALUE;
    private static final int PAGE_SIZE = 50;

    //
    // Getters and setters for criteriaSet and queryProvider
    //
    ...

    public List getObjects(int firstResult,
                           int maxResults) {

        List queryResults = getQueryProvider()
                                .getObjects(getCriteriaSet(),
                                            firstResult,
                                            maxResults);
        if (resultPage == null) {
            resultPage = new ArrayList(queryResults.size());
        }
        resultPage.clear();
        for(int i = 0; i < queryResults.size(); i++) {
            resultPage.add(queryResults.get(i));
        }
        pageStart = firstResult;
        pageEnd = firstResult + queryResults.size() - 1;
        return resultPage;
    }

    public final Object getObject(int index) {
        if ((resultPage == null)
            || (index < pageStart)
            || (index > pageEnd)) {
            resultPage = getObjects(index, PAGE_SIZE);
        }
        Object result = null;
        int pos = index - pageStart;
        if ((resultPage != null)
            && (resultPage.size() > pos)) {
            result = resultPage.get(pos);
        }
        return result;
    }
}


Finally, we have to be able to call the Hibernate data source from JasperReports. To do so, we start by looking at the JasperManager fillReport() method, which takes a JRDataSource object as its third parameter and uses it to generate the report :

JasperPrint jasperPrint = JasperManager.fillReport(jasperReport, parameters, ds);

To implement our own optimised JRDataSource, we extended the JRAbstractBeanDataSource class. This class is presented here (logging and error-handling code has been removed for clarity).

public class ReportSource extends JRAbstractBeanDataSource {

    private ReportDataSource dataSource;
    protected int index = 0;
    protected Object bean;
    private static Map fieldNameMap = new HashMap();

    public ReportSource(ReportDataSource dataSource) {
        super(true);
        this.dataSource = dataSource;
        index = 0;
    }

    public boolean next() throws JRException {
        bean = dataSource.getObject(index++);
        return (bean != null);
    }

    public void moveFirst() throws JRException {
        index = 0;
        bean = dataSource.getObject(index);
    }

    public Object getFieldValue(JRField field) throws JRException {
        String nameField = getFieldName(field.getName());
        return PropertyUtils.getProperty(bean, nameField);
    }

    /**
     * Replace the character "_" by a ".".
     *
     * @param fieldName the name of the field
     * @return the value in the cache or make
     * the replacement and return this value
     */
    private String getFieldName(String fieldName) {
        String filteredFieldName
            = (String) fieldNameMap.get(fieldName);
        if (filteredFieldName == null) {
            filteredFieldName = fieldName.replace('_','.');
            fieldNameMap.put(fieldName,filteredFieldName);
        }
        return filteredFieldName;
    }
}


This class is basically just a proxy between JasperReports and the Hibernate data source object. The only tricky bit is field name handling. For some reason, JasperReports does not accept field names containing dots (ex. "product.code"). However, when you retrieve a set of Hibernate-persisted business objects, you often need to access object attributes. To get around this, we replace the "." by a "_" in the JasperReport template (ex. "product_code" instead of "product.code"), and convert back to a conventional JavaBean format in the getFieldName() method.
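The name conversion and the nested lookup it enables can be tried out in isolation. The sketch below uses plain reflection in place of PropertyUtils.getProperty(), so it covers only simple getter chains; FieldNameResolver, Product and SaleLine are hypothetical names invented for the illustration, and it assumes Java 8 or later.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical business objects with a nested association. */
class Product {
    private final String code;
    Product(String code) { this.code = code; }
    public String getCode() { return code; }
}

class SaleLine {
    private final Product product;
    SaleLine(Product product) { this.product = product; }
    public Product getProduct() { return product; }
}

class FieldNameResolver {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    /** Converts a report field name such as "product_code" to "product.code", cached per name. */
    static String toPropertyPath(String fieldName) {
        return CACHE.computeIfAbsent(fieldName, name -> name.replace('_', '.'));
    }

    /** Follows a dotted path by calling the public getter for each segment. */
    static Object readPath(Object bean, String path) throws ReflectiveOperationException {
        Object current = bean;
        for (String segment : path.split("\\.")) {
            if (current == null) {
                return null; // a null association ends the walk early
            }
            String getter = "get" + Character.toUpperCase(segment.charAt(0)) + segment.substring(1);
            Method method = current.getClass().getMethod(getter);
            current = method.invoke(current);
        }
        return current;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        SaleLine line = new SaleLine(new Product("A-42"));
        System.out.println(readPath(line, toPropertyPath("product_code")));
    }
}
```

A consequence of this convention is that a bean property whose own name contains an underscore cannot be addressed, because every "_" is turned into a ".".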

Putting it all together
So, when you put it all together, you get something like this :

// salesCriteria is a populated SalesCriteriaSet, built elsewhere

Map parameters = new HashMap();
parameters.put("Title", "Sales Report");

InputStream reportStream
    = getClass().getResourceAsStream("/sales-report.xml");
JasperDesign jasperDesign
    = JasperManager.loadXmlDesign(reportStream);
JasperReport jasperReport
    = JasperManager.compileReport(jasperDesign);

// Declared as the implementation type because the setters
// are not part of the ReportDataSource interface
ReportDataSourceImpl hibernateDataSource
    = new ReportDataSourceImpl();
hibernateDataSource.setQueryProvider(new SalesQueryProvider());
hibernateDataSource.setCriteriaSet(salesCriteria);
ReportSource rs = new ReportSource(hibernateDataSource);

JasperPrint jasperPrint
    = JasperManager.fillReport(jasperReport,
                               parameters,
                               rs);
JasperExportManager.exportReportToPdfFile(jasperPrint,
                                          "sales-report.pdf");


Further JasperReports optimisations

Compiled Report caching
In the previous code, the JasperReport is loaded and compiled before it is run. In a serverside application, for optimal performance, the reports should be loaded and compiled once and then cached. This will save several seconds on each query generation.
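One possible shape for such a cache is sketched below. It is kept generic so that it runs without JasperReports on the classpath: CompiledReportCache is a hypothetical name, the type parameter R stands for the compiled report type, and the compiler function stands for whatever loads and compiles a template. It assumes Java 8 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Compiles each template at most once and serves later requests from memory. */
class CompiledReportCache<R> {
    private final Map<String, R> cache = new ConcurrentHashMap<>();
    private final Function<String, R> compiler;

    CompiledReportCache(Function<String, R> compiler) {
        this.compiler = compiler;
    }

    /** Returns the cached compiled report, compiling it on first use. */
    R get(String templatePath) {
        return cache.computeIfAbsent(templatePath, compiler);
    }

    /** Drops a cached entry, for example after the template file has changed. */
    void invalidate(String templatePath) {
        cache.remove(templatePath);
    }

    public static void main(String[] args) {
        // Stand-in "compiler" that just tags the path; a real one would compile the template
        CompiledReportCache<String> cache =
                new CompiledReportCache<>(path -> "compiled:" + path);
        System.out.println(cache.get("/sales-report.xml"));
        System.out.println(cache.get("/sales-report.xml")); // served from the cache
    }
}
```

In an application, the function would wrap the load and compile calls shown earlier and convert their checked exceptions into unchecked ones, since a java.util.function.Function cannot throw checked exceptions.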

Optimising Hibernate Queries
The first cause of slow report generation is sub-optimal querying. Hibernate is a high-performance persistence library which gives excellent results when correctly used. So you should treat any report which takes an excessive time to generate as suspicious. Common Hibernate optimisation strategies include :

Correct use of joins
Correct use of lazy associations
Reading a subset of columns rather than whole Hibernate-persisted objects
Some Hibernate query optimisation techniques are discussed here.

Reporting Made Easy with JasperReports and Hibernate

JasperReports and Hibernate in Web applications

JasperReports is a valuable and viable reporting solution for Java Web applications. It simplifies report generation through the use of XML report templates that are then compiled using the JasperReports engine for use in reporting modules. These compiled report templates can be filled by data received from a variety of sources including relational databases. JasperReports can be integrated into Web applications and create reports in several file formats including PDF and XLS.

Reporting in Java Applications
Often reporting modules increase in complexity and size during the course of application development. Clients tend to demand more information from report modules when they become aware of the benefits reports offer. The reporting module developed as something of an afterthought in such environments suddenly becomes a much more integral part of the application. Reporting modules often seem to be tacked on to developed applications, rather than being considered and implemented during initial application development.

Recently while working on some applications that made extensive use of report extraction to XLS files using the Apache POI library, it became apparent that these report modules tied up lots of valuable development resources for extended periods of time. When the client requested PDF extraction, initial iText API research led me to discover JasperReports. JasperReports was to change our team approach to report development dramatically.

Prior to implementing JasperReports each report creation required the development of a custom report class using the Apache POI library. This approach expended valuable development time creating aspects of the report such as cell specific formats, styles, and population methods. JasperReports offered our team the ability to get back this valuable development time, while producing the same report because of its embedded use of the Apache POI library.

One of the benefits offered by the introduction of JasperReports is that a single report template implementation can produce reports in a number of formats. This means that templates created for XLS format extraction can also be used to produce PDF files and even CSV, HTML or XML.

How Can JasperReports Help Developers?
JasperReports gives developers the ability to create reports quickly and easily that can be extracted to numerous formats. Developers can also use the JasperReports engine to compile report templates at design or runtime - allowing dynamic report formats. Developers can also inject data into these reports from a number of data sources. Developer time no longer has to be spent creating custom report classes using the Apache POI or iText libraries for formatting and stylizing reports, allowing the code writers to focus on the data retrieval aspect of the report. As a result developers gain valuable flexibility and time savings using JasperReports in application development.

The XML report templates used by JasperReports provide the layout and presentation information required to format the resulting report as well as field, variable, and parameter references. Non-development staff can create these templates using a third-party GUI such as iReport with minimal developer collaboration, so developers don't have to involve themselves in the layout and presentation aspect of report generation.

JasperReports enables developers to concentrate their efforts on the parts of the reporting module where they are required, while relieving them of having to write custom report generation code. A developer's role in the report module can be reduced to template compilation, data source implementation, and actual report creation.

Creating and Compiling an XML Report Template
JasperReports requires a report design defining the layout, presentation, and data fields. This design can be built using the net.sf.jasperreports.engine.design.JasperDesign object, so developers can create report designs dynamically, or by creating a net.sf.jasperreports.engine.design.JasperDesign instance from an XML report template. Unless an application specifically requires a dynamic layout a compiled XML report template is the recommended method. This XML template is usually saved with a .jrxml file extension and compiled using the net.sf.jasperreports.engine.JasperCompileManager.

The JasperReports XML template includes elements for title, pageHeader, columnHeader, pageFooter, columnFooter, and the main data element. Each of these elements has a variety of sub-elements as can be seen in sampleReport.jrxml (see Listing 1).

You can download the code samples used in this article at jdj.sys-con.com. As can be seen in sampleReport.jrxml, some elements contain layout information, while others, such as font, contain presentation information. The XML templates also contain parameter, field, and variable elements used to include data in the report.

The parameter elements allow non-data-source information to be passed into a report, such as a dynamic title; field elements are the only way to map report fields to the data source fields, while variables are values generated at runtime for use in the report. The complete Document Type Definition (DTD) for the JasperReports XML report template can be found in the JasperReports Ultimate Guide.

Compilation of the XML template can be done either at runtime or build time as part of an Ant build using the JasperReports Ant task.

Compiling the report at runtime entails loading the report into a JasperDesign object and using the created instance as the parameter to the JasperCompileManager.compileReport(JasperDesign design) method, which returns a JasperReport instance. Alternatively, the XML template can be passed to the JasperCompileManager.compileReportToFile(String sourceFileName) method, which creates a compiled report file (.jasper) available throughout the application.

Compiling the report at build time using the JasperReports Ant task requires the addition of the task definition to the build.xml file and a target making use of this task, as seen in Listing 2, which is an extract from the source code build.xml. Using the Ant task results in the creation of a compiled (.jasper) file in the directory named by the task's destdir attribute, and offers the opportunity to keep the generated Java source file by setting the task's keepjava attribute to true. A more thorough example of how to use the Ant task is included in the sample applications provided in the JasperReports download bundle.

Using Data Sources to Fill JasperReports
Most reports use a database as the data source, but JasperReports can use any available data source. These data sources are passed to a net.sf.jasperreports.engine.JasperFillManager fillReportXXX() method. Two types of data source are provided for by these methods - net.sf.jasperreports.engine.JRDataSource and java.sql.Connection. The source code for this article contains examples of both a static data source that extends the JRDataSource and a JDBC connection data source implementation.

The StaticDataSource class provided implements the net.sf.jasperreports.engine.JRDataSource interface, enabling it to fill the report data through a call to the JasperFillManager.fillReport(JasperReport report, Map parameters, JRDataSource dataSource) method. The two required methods of the JRDataSource interface, getFieldValue(JRField jrField) and next(), handle the passing of data from the data source into the JasperReport. The data source used by StaticDataSource is a simple static two-dimensional array of bowlers containing their names and scores over three games (see Listing 3). When the fillReport() method containing this data source is processed and a detail section is encountered in the report, a call is made to the next() method. The implementation of this method in StaticDataSource (see Listing 4) returns true if there is another element in the data array, or false if there is no more data. If this method returns true, then field elements encountered in the detail section result in a call to the getFieldValue(JRField jrField) method in StaticDataSource. The implementation of this method in StaticDataSource (see Listing 5) returns the value of the mapped data field name for the current index of the data array. When the end of the detail section is encountered, the next() method is called again and the process repeats until next() returns false.

The JDBCDataSourceExample (see Listing 6) implements a fillReport() method that accepts a java.sql.Connection parameter. Through the addition of a <queryString> element to the XML report template (jdbcSampleReport.jrxml), this fillReport() method enables data to be extracted from a relational database. The query in the <queryString> element returns the data fields for use in the report data mapping. In this case the query simply returns all records in the sample_data table. A java.sql.ResultSet can be used instead of a <queryString> element in the report template, which allows the query to be built dynamically.

Using Hibernate with JasperReports
Hibernate is one of the most popular ORM tools in use at the moment. Using Hibernate as a data source for JasperReports can be very simple when a collection of objects is returned from a Hibernate query, but when a tuple of objects is returned then a custom JRDataSource implementation is required.

When a Hibernate query returns a collection of objects, a net.sf.jasperreports.engine.data.JRBeanCollectionDataSource can be used to map the Hibernate POJO instance fields to the report fields. All that's required for this simple solution is to use the JRBeanCollectionDataSource(java.util.Collection beanCollection) constructor, passing it the Hibernate query result list, as implemented in SimpleHibernateExample (see Listing 7). In this example the simple Hibernate query used (session.createQuery("from SampleData").list()) is equivalent to that found in the JDBCDataSourceExample. JRBeanCollectionDataSource implements JRDataSource like StaticDataSource, but its getFieldValue(JRField jrField) method implementation maps the report template field names to the query result bean properties.
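
The idea of mapping a report field name to a bean property can be sketched with plain reflection. This is not how JRBeanCollectionDataSource is implemented internally; it only illustrates the general technique. The SampleBean class and the BeanFieldReader helper are hypothetical names used for this sketch.

```java
import java.lang.reflect.Method;

// Invented sample bean; in the article this role is played by a Hibernate-mapped POJO.
class SampleBean {
    private final String name;
    private final int score;

    SampleBean(String name, int score) {
        this.name = name;
        this.score = score;
    }

    public String getName() { return name; }
    public int getScore() { return score; }
}

class BeanFieldReader {
    // Resolve a report field name such as "name" to the getter getName() and invoke it.
    static Object read(Object bean, String fieldName) {
        String getter = "get" + Character.toUpperCase(fieldName.charAt(0)) + fieldName.substring(1);
        try {
            Method method = bean.getClass().getMethod(getter);
            return method.invoke(bean);
        } catch (Exception e) {
            throw new IllegalArgumentException("No readable property: " + fieldName, e);
        }
    }
}
```

With this approach the report field names must match the bean property names exactly, which is why the report template and the POJO have to be kept in step.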

When a Hibernate query returns a tuple of objects it's necessary to write a custom implementation of the JRDataSource similar to HibernateDataSource (see Listing 8). The implementation of the required next() method in this class returns true if there is another list item in the Hibernate query result set, while putting the current list item in a currentValue holder for use in the getFieldValue(JRField jrField) method. The getFieldValue() method implementation gets the field index in the currentValue object via a call to the getFieldIndex(String field) method. This method iterates through the mapped field names passed to the HibernateDataSource constructor until it finds the field name it was passed and then returns the index of this field in the currentValue information. The getFieldValue() method then returns the value at this index in the currentValue result object.
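
The name-to-index lookup described above can be sketched as follows. This is a simplified illustration, not Listing 8: the class name TupleDataSource is hypothetical, and its methods take a field name directly, whereas a real implementation would implement JRDataSource and receive a JRField.

```java
import java.util.Iterator;
import java.util.List;

// Sketch of a tuple-backed data source: each query result row is an Object[].
class TupleDataSource {
    private final String[] fields;         // field names, in tuple order
    private final Iterator<Object[]> rows; // the query result, one Object[] per row
    private Object[] currentValue;         // holder for the row being processed

    TupleDataSource(List<Object[]> result, String[] fields) {
        this.rows = result.iterator();
        this.fields = fields;
    }

    // Move to the next tuple, remembering it for later getFieldValue() calls.
    boolean next() {
        if (!rows.hasNext()) {
            return false;
        }
        currentValue = rows.next();
        return true;
    }

    // Return the element of the current tuple that corresponds to the field name.
    Object getFieldValue(String fieldName) {
        int i = getFieldIndex(fieldName);
        return i < 0 ? null : currentValue[i];
    }

    // Find the position of a field name in the names passed to the constructor.
    private int getFieldIndex(String field) {
        for (int i = 0; i < fields.length; i++) {
            if (fields[i].equals(field)) {
                return i;
            }
        }
        return -1;
    }
}
```

The field names passed to the constructor must be listed in the same order as the columns selected by the query, since the lookup relies purely on position.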

More extensive solutions to using Hibernate with JasperReports, including the use of reflection instead of the name mapping method used in HibernateDataSource, can be found on the Hibernate Web site www.hibernate.org/79.html. Also of interest in this area is the report optimization implementation advocated by John Ferguson Smart in his article "Hibernate Querying 103: Using Hibernate Queries with JasperReports" (see Resources).

Exporting Reports to PDF and XLS Formats in Web Applications
After compiling and filling a JasperReport, exporting it is a fairly simple and straightforward process using the net.sf.jasperreports.engine.JRExporter interface implementations provided. JasperReports can export data to PDF, XLS, CSV, RTF, HTML, and XML from the same report design using the appropriate implementation of the JRExporter interface. The PDF and XLS formats are two of the most common export formats, and examples of exporting to these formats from within a Web application can be found in the source code for this article. PrintServlet exports to PDF, while DataExtractServlet exports the same data to an XLS format file.

PrintServlet (see Listing 9) is an example servlet implementation class using JasperReports to export a report to PDF format. JasperReports makes use of the Open Source iText PDF creation library (see Resources) to generate PDF format files. Once the report is compiled in PrintServlet, the PDF is created and streamed to the Web browser ready for printing using the runReportToPdfStream(InputStream inputStream, OutputStream outputStream, Map parameters, Connection connection) method implemented by the JasperRunManager facade class.

DataExtractServlet (see Listing 10) is an example servlet implementation class using JasperReports to export a report to the XLS format. JasperReports makes use of the Apache POI library (see Resources) to generate XLS format files. Once the report is compiled in DataExtractServlet, the XLS file is created in memory and a save dialog is displayed to the user. The servlet uses net.sf.jasperreports.engine.export.JRXlsExporter, one of the concrete implementations of the JRExporter interface provided by JasperReports, to export the report. The export parameters are initialized using JRXlsExporterParameter variables to set the filled report (JRXlsExporterParameter.JASPER_PRINT) and the output stream (JRXlsExporterParameter.OUTPUT_STREAM). The output stream comes from the response object, whose content type and header have been set so that, when exportReport() is called, the file is offered to the user for saving rather than displayed as in the PrintServlet example.

Useful Hint: By default JasperReports puts page headings at the top of every 'page' of data. When exporting to XLS, this breaks up the continuous data in a worksheet that contains more than a single 'page' of data. Data continuity can be maintained by passing the output format as a parameter to the report template and placing a <printWhenExpression> element based on that parameter in the <pageHeader> element. The <printWhenExpression> below causes the page headings to be output only on the first page when the output format isn't PDF, and on every 'page' for PDF output.


|| $P{REPORT_TYPE}.equals("PDF")
? Boolean.TRUE : Boolean.FALSE]]>
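
The rule this hint describes can be restated in plain Java. This is only an illustration of the logic, not report template syntax; the class and method names are hypothetical, and the "PDF" value follows the REPORT_TYPE parameter used in the fragment above.

```java
class HeaderRule {
    // True when the page header should be printed:
    // always on the first page, and on every page when the output format is PDF.
    static boolean printHeader(int pageNumber, String reportType) {
        return pageNumber == 1 || "PDF".equals(reportType);
    }
}
```

For an XLS export this yields one heading row followed by uninterrupted data, while a PDF export keeps its heading on every page.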


Creating Reports Is Easy and Fun with JasperReports
Hopefully this article has whetted your appetite for exploring the world of report generation using JasperReports, or if you've already discovered JasperReports, that it's provided some ideas on how to delve into creating custom data sources or using new export formats. Understanding and mastering the implementation of the required JRDataSource methods next() and getFieldValue(JRField jrField) opens up any data source for use in generating reports with JasperReports.

Creating reports with JasperReports is made even simpler by some useful tools. iReport (see Resources), an excellent JasperReports template creation tool that allows visual report design in a GUI application, can be used by non-developers to create JasperReports designs. It also offers substantial developer-focused functionality, such as data source connectivity for creating report previews outside of an application. JasperAssistant (see Resources), while not Open Source, has the advantage of being an Eclipse plug-in for developing JasperReports templates in a similar GUI manner, albeit more developer-oriented. Both offer the benefit of being able to prepare a report design, which can then be provided to a developer for filling, relieving him of the tedious presentation aspect of report generation.

This article has barely scratched the surface of JasperReports' extensive use and functionality, but hopefully it has introduced some developers to an extremely useful tool in any Java developer's arsenal. JasperReports can even produce charts and graphs, and can include images in reports, which increases the richness and presentation of an application's reporting system. JasperReports is a powerful API that can take a reporting system to the next level.

HowTo: JasperReports framework. Deployment to Tomcat

The JasperReports framework does not ship a report engine WAR file; you have to build it yourself.

Building JasperReports report engine WAR file

1. Download jasperreports-1.2.6-project.zip, unzip it to directory jasper.

2. Create Eclipse project: New->Project->General->Project. Location should point to jasper directory.

3. In the Eclipse Navigator, select jasperreports-1.2.6\demo\samples\webapp\build.xml, open it, select the war target, right-click->RunAs->Ant Build. The jasper-webapp.war file is built in the jasperreports-1.2.6\demo\samples\webapp\ directory.

Deploying JasperReports report engine WAR file to Tomcat

1. Copy jasper-webapp.war file to tomcat\webapps directory. Start Tomcat, it will unzip the WAR file.

Deploying JasperReports report to Tomcat

For each report, you should have one file with .jasper extension (compiled report definition).
1. Copy the report file to jasper-webapp\reports directory.

2. Copy JDBC driver, required by your report, to jasper-webapp\WEB-INF\lib directory.

3. Modify jasper-webapp\jsp\html.jsp JSP file to load .jasper file, open database connection and render the report as HTML:


<%@ page errorPage="error.jsp" %>
<%@ page import="datasource.*" %>
<%@ page import="net.sf.jasperreports.engine.*" %>
<%@ page import="net.sf.jasperreports.engine.util.*" %>
<%@ page import="net.sf.jasperreports.engine.export.*" %>
<%@ page import="net.sf.jasperreports.j2ee.servlets.*" %>
<%@ page import="java.util.*" %>
<%@ page import="java.io.*" %>
<%@ page import="java.sql.*" %>

<%
    // Locate the compiled report inside the deployed web application
    File reportFile = new File(application.getRealPath("/reports/MyReport.jasper"));
    if (!reportFile.exists())
        throw new JRRuntimeException("File MyReport.jasper not found. The report design must be compiled first.");

    JasperReport jasperReport = (JasperReport)JRLoader.loadObject(reportFile.getPath());

    // Parameters passed to the report at fill time
    Map parameters = new HashMap();
    parameters.put("ReportTitle", "Address Report");
    parameters.put("BaseDir", reportFile.getParentFile());

    // Open the database connection used by the report query
    Class.forName("com.mysql.jdbc.Driver");
    Connection connection = DriverManager.getConnection("jdbc:mysql://XX.XX.XX.XX/world", "userId", "password");

    JasperPrint jasperPrint =
        JasperFillManager.fillReport(
            jasperReport,
            parameters,
            connection
        );

    connection.close();

    // Render the filled report as HTML into the JSP output
    JRHtmlExporter exporter = new JRHtmlExporter();

    exporter.setParameter(JRExporterParameter.JASPER_PRINT, jasperPrint);
    exporter.setParameter(JRExporterParameter.OUTPUT_WRITER, out);
    exporter.setParameter(JRHtmlExporterParameter.IMAGES_URI, "../servlets/image?image=");

    exporter.exportReport();
%>
4. Restart Tomcat.

5. Invoke your report, using URL like: "http://localhost:8080/jasper-webapp/jsp/html.jsp".

How to call Stored Procedures from JasperReports

Jasper Reports is unable to call Oracle stored procedures directly, because procedures do not return standard result sets. As a solution, in Oracle, you can use a stored function to retrieve the results of a stored procedure. There are a few more steps to do this than if you were able to use a stored procedure, but it currently is the only option, if the query you need to do can’t be done with a standard SQL query.

In order to use stored functions to retrieve the result set of a stored procedure, you will need to use a temp table to hold the results, and then return the results using types and tables of types.

Note: In this example, I have kept the function very limited. This particular query would not need to be done with a stored procedure and function, as a standard select query would be best; it is only used to demonstrate how to do it, should the need arise.

I have provided all the sql used in this demo, in this file.

Setup
For this example, the table Presidents will be used. Sample data will also need to be loaded into the table. You can use the file I provided above, with all the example’s sql to create the table and load it with sample data.

Now that there is a base table and data to work with, creation of the objects needed for the stored function can begin.

Step 1: Create a Temp Table
First, create a temp table to temporarily hold the results from the stored procedure, so the Jasper Report can query it via the stored function, with a standard select query. To create the temp table, use this sql:

CREATE GLOBAL TEMPORARY TABLE "TEMP_PRESIDENTS" (
    ID NUMBER(10) not null,
    NAME VARCHAR(32) not null,
    BIRTHDATE DATE not null,
    PARTY char(1) not null
) ON COMMIT PRESERVE ROWS

Step 2: Create the Stored Procedure
Next, create the stored procedure which will perform the needed data gathering. In this simple example, the query will select all rows based on the party passed (R for republican, D for democrat).

CREATE PROCEDURE "LOAD_TEMP_PRESIDENTS" (
    partyParam CHAR )
as
begin
    EXECUTE IMMEDIATE 'TRUNCATE TABLE TEMP_PRESIDENTS';
    COMMIT;

    INSERT INTO TEMP_PRESIDENTS
    SELECT ID, NAME, BIRTHDATE, PARTY FROM PRESIDENTS WHERE PARTY = partyParam;
    COMMIT;
end;

Step 3: Test the Stored Procedure
Before proceeding any further, call the stored procedure and then check the temp table to be sure it behaves properly (sql below). You should get a result of all the Democrat Presidents, four of them. If not, you will need to retrace your steps.

call LOAD_TEMP_PRESIDENTS('D');
select * from TEMP_PRESIDENTS;

Step 4: Create the Return Type
This step creates the type that will be used to return the results from the temp table. This type should describe the result set you are expecting in the Jasper Report.

CREATE OR REPLACE TYPE "PRESIDENT_TYPE" AS OBJECT (
    ID NUMBER(10),
    NAME VARCHAR2(32),
    BIRTHDATE DATE,
    PARTY CHAR(1)
)

Step 5: Create a Table of the Type
In this step we create a table of the type we created in the previous step. This "table" is what we will be selecting from in the Jasper Report. It is not a real table, but instead a type or object representing the structure of the table that we will funnel the stored procedure's results through.

CREATE OR REPLACE TYPE "PRESIDENT_TYPE_TABLE" AS TABLE OF "PRESIDENT_TYPE"

Step 6: Create the Stored Function
The next step is to create the stored function with the following code, to retrieve all the presidents for the party you select (R or D).

CREATE OR REPLACE FUNCTION "PRESIDENTS_FUNC" (
    partyParam CHAR
)
return PRESIDENT_TYPE_TABLE pipelined
is
    PRAGMA AUTONOMOUS_TRANSACTION;

    TYPE ref0 is REF CURSOR;
    myCursor ref0;
    out_rec PRESIDENT_TYPE := PRESIDENT_TYPE(0, null, null, null);

BEGIN
    LOAD_TEMP_PRESIDENTS(partyParam);

    open myCursor for
        select id,
               name,
               birthdate,
               party
        from TEMP_PRESIDENTS;

    LOOP FETCH myCursor into
        out_rec.ID,
        out_rec.NAME,
        out_rec.BIRTHDATE,
        out_rec.PARTY;

        EXIT WHEN myCursor%NOTFOUND;
        PIPE ROW(out_rec);
    END LOOP;
    CLOSE myCursor;

    RETURN;
END;

Step 7: Testing and Using the Stored Function
In order to use the stored function you execute the code below:

select * from table(PRESIDENTS_FUNC('D'))

This code can now be used within a Jasper Report, as you have turned a stored procedure into a stored function accessible with a standard select. To the Jasper Report you are merely issuing a standard query.

Flexible reporting with JasperReports and iBATIS

Integrate JasperReports with your existing iBATIS implementation

The core task of many Java applications is to retrieve data and display it, sometimes in sophisticated print- or Web-based reports. Luckily for Java developers, two popular open source solutions work especially well together to help you accomplish this task. The iBATIS Data Mapper framework provides a simple XML-based mechanism for linking Java objects to a data repository. JasperReports is a full featured Java reporting library that you can embed in your applications. Put the two together and you have a winning combination for producing scalable, easy-to-maintain reports.

JasperReports is an open source Java reporting library that is quickly gaining popularity as a viable alternative to costly proprietary reporting solutions. With any reporting solution, getting the data to the reporting engine is the most basic implementation concern. Unfortunately, Jasper poses a small problem in this area.

Most Java applications use some type of data-fetching framework for data mapping and dynamic SQL generation, such as the iBATIS Data Mapper Framework. Jasper's default mechanism for retrieving and managing data isn't flexible enough to leverage existing data mapping frameworks, however. Instead, you pass the Jasper engine a connection to your database, and it uses SQL queries embedded in an XML-based report template to populate the report.

Although simple to implement, this mechanism ties you to the Jasper template's embedded SQL. Besides, who wants to add yet another moving piece to an already complex application? You would be better off leveraging the existing data framework and just letting Jasper handle report generation.

In this article you'll learn how to integrate JasperReports and the iBATIS Data Mapper Framework for just such a solution. I'll walk through two simple scenarios where the goal is to integrate Jasper and iBATIS for report generation. The first scenario applies to iBATIS implementations that use iBATIS's data capabilities to return a list of Java beans. This scenario doesn't require you to write any custom code. The Jasper framework contains supporting classes that allow the data returned from iBATIS to fill a Jasper report.

For the second scenario -- a more basic use of iBATIS that returns a list of java.util.Map objects -- you'll create a custom Jasper data source to feed a Jasper report. In addition to working with the Jasper framework classes, for both exercises you'll use the iReport report designer, which eases and accelerates the process of creating template files in Jasper.

Running the examples
This article's example code generates a simple monthly sales report for each type of implementation I cover. The data for the reports is retrieved from an embedded Apache Derby database via the iBATIS Data Mapper framework. The examples are built into a JSF/Spring-based Web application that runs in the same JVM as Derby. I've provided an Ant script for building that WAR file -- just execute the buildWar task to compile content and build it. You'll need Tomcat 5.5.x to deploy and run the examples. You'll also need the Adobe Acrobat Reader Web browser plug-in to view the report output.

Getting the iBATIS data into Jasper
Using iBATIS to return a list of a specific type of Java beans (I'll call this a return list) is much tidier than using the framework to return a list of java.util.Map objects. Most developers using iBATIS take this approach to data mapping, and it happens to make integration with Jasper a snap.

The Jasper framework provides a JRDataSource implementation that your application can use to fill a report template with data from an iBATIS return list. The JRBeanCollectionDataSource class is constructed from a collection of Java beans and knows how to loop through the collection and access the beans' properties. Listing 1 shows how you can pass an instance of a JRBeanCollectionDataSource when calling on the Jasper engine to populate a report.

Listing 1. Populating a report with JRBeanCollectionDataSource
/* Helper method to create a fully populated JasperPrint object from a list of Java beans */
private JasperPrint fillReport (List dataList) throws JRException {

    // this map could be filled with parameters defined in the report
    Map parameters = new HashMap();

    // make sure the .jasper file (a compiled version of the .jrxml template file) exists
    String localPath = this.servlet.getServletContext().getRealPath("/");

    File reportFile = new File(localPath + "WEB-INF" + File.separator + "monthySales.jasper");

    if (!reportFile.exists()) {
        throw new JRRuntimeException("monthySales.jasper file not found.");
    }

    // load up the report
    JasperReport jasperReport = (JasperReport)JRLoader.loadObject(reportFile);

    // pass JRBeanCollectionDataSource (which is populated with iBATIS list) to fillReport method
    return JasperFillManager.fillReport (jasperReport, parameters,
        new JRBeanCollectionDataSource (dataList));
}


In Listing 1, you first define the parameters map, which is the mechanism for passing parameter values to the report at runtime. For example, you could define a parameter named REPORT_TITLE in the report template and pass the value for this parameter to the report by simply adding the key/value pair to the map (e.g., Key=REPORT_TITLE, Value=Sale Report). The parameters map is passed to the fillReport method. The next portion of code loads a compiled Jasper template (.jasper) file. Finally, the static fillReport method is called. It does the actual work of building the report and returns a JasperPrint object, which is passed to a specific type of Jasper exporter to write out the report. The example code for this article uses a JRPdfExporter to write the report to PDF format (see the PdfServlet.java class).

Although this mechanism lets the Jasper framework link with iBATIS, you might need to modify the Java beans that iBATIS populates, depending on your report's requirements. Jasper's field objects know how to work with the common JDBC mapping types. For example, Jasper stores an Oracle numeric field type as a java.math.BigDecimal object. Any of the iBATIS bean properties that you plan to use in a report must map to one of Jasper's defined field types. You should select your report field types carefully, because the formatting and expression capabilities are better in some types than in others. For example, a BigDecimal type is more convenient to work with than a String when you're trying to apply a currency format.
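
The point about field types can be illustrated outside of Jasper with the standard java.text formatting classes. This is a small sketch under stated assumptions: the CurrencyDemo class and the US locale are choices made for the example, not part of the article's code.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Locale;

class CurrencyDemo {
    // A numeric type can be handed straight to a currency formatter.
    static String format(BigDecimal amount) {
        return NumberFormat.getCurrencyInstance(Locale.US).format(amount);
    }

    // A String has to be parsed into a number first, which can fail at runtime
    // (NumberFormatException) if the text is not a valid number.
    static String format(String amount) {
        return format(new BigDecimal(amount));
    }
}
```

Keeping the bean property numeric avoids that extra parsing step and lets the report apply a pattern to the field directly.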

Export Swing components to PDF

Use JFreeChart and iText to draw charts

Suppose you've written an application with a GUI using Swing components such as JTable or JTextPane. All these components are derived from the abstract class javax.swing.JComponent, which includes the print(Graphics g) method: You can use this method to let the Swing component print itself to iText's PdfGraphics2D object.

(Note: This article excerpts Chapter 12, "Drawing to Java Graphics2D," from iText in Action, Bruno Lowagie (Manning Publications, December 2006; ISBN: 1932394796): http://www.manning.com/lowagie.)

Figure 1 shows a simple Java application with a JFrame. It contains a JTable found in Sun's Java tutorial on Swing components. If you click the first button, the contents of the table are added to a PDF using createGraphicsShapes() (the upper PDF in the screenshot). If you click the second button, the table is added using createGraphics() (the lower PDF, using the standard Type 1 font Helvetica). Notice the subtle differences between the fonts used for both variants.



Figure 1. A Swing application with a JTable that is printed to PDF two different ways. Click on thumbnail to view full-sized image.

If you run this example, try changing the content of the JTable; the changes are reflected in the PDF. If you select a row, the background of the row is shown in a different color in the Java application as well as in the PDF.

The code to achieve this is amazingly simple:

/* chapter12/MyJTable.java */
public void createPdf(boolean shapes) {
    Document document = new Document();
    try {
        PdfWriter writer;
        if (shapes)
            writer = PdfWriter.getInstance(document,
                new FileOutputStream("my_jtable_shapes.pdf"));
        else
            writer = PdfWriter.getInstance(document,
                new FileOutputStream("my_jtable_fonts.pdf"));
        document.open();
        PdfContentByte cb = writer.getDirectContent();
        PdfTemplate tp = cb.createTemplate(500, 500);
        Graphics2D g2;
        if (shapes)
            g2 = tp.createGraphicsShapes(500, 500);
        else
            g2 = tp.createGraphics(500, 500);
        table.print(g2);
        g2.dispose();
        cb.addTemplate(tp, 30, 300);
    } catch (Exception e) {
        System.err.println(e.getMessage());
    }
    document.close();
}


The next example was posted to the iText mailing list by Bill Ensley (bearprinting.com), one of the more experienced iText users on the mailing list. It's a simple text editor that allows you to write text in a JTextPane and print it to PDF.

Figure 2 shows this application in action.



Figure 2. A simple editor with a JTextPane that is drawn onto a PDF file. Click on thumbnail to view full-sized image.

The code is a bit more complex than the JTable example. This example performs an affine transformation before the content of the JTextPane is painted:

/* chapter12/JTextPaneToPdf.java */
Graphics2D g2 = cb.createGraphics(612, 792, mapper, true, .95f);
AffineTransform at = new AffineTransform();
at.translate(convertToPixels(20), convertToPixels(20));
at.scale(pixelToPoint, pixelToPoint);
g2.transform(at);
g2.setColor(Color.WHITE);
g2.fill(ta.getBounds());
Rectangle alloc = getVisibleEditorRect(ta);
ta.getUI().getRootView(ta).paint(g2, alloc);
g2.setColor(Color.BLACK);
g2.draw(ta.getBounds());
g2.dispose();
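
The effect of combining translate() and scale() on one AffineTransform can be checked with the standard java.awt.geom classes alone. The following is a minimal sketch: the TransformDemo class is hypothetical, and the margin and scale values in the usage note are example numbers, not the ones computed by convertToPixels() or pixelToPoint in the listing.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class TransformDemo {
    // Builds the same translate-then-scale sequence as the snippet above
    // and maps a single point through it.
    static Point2D map(double x, double y, double margin, double scale) {
        AffineTransform at = new AffineTransform();
        at.translate(margin, margin); // shift the origin by the margin
        at.scale(scale, scale);       // then scale the coordinates drawn afterwards
        return at.transform(new Point2D.Double(x, y), null);
    }
}
```

With a margin of 20 and a scale of 0.75, the point (100, 100) maps to (95, 95): because the scale is concatenated after the translation, the point's coordinates are scaled first and the margin is added unscaled.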


Numerous applications use iText this way. Let me pick two examples; one free/open source software (FOSS) product and one proprietary product:

JasperReports, a free Java reporting tool from JasperSoft, allows you to deliver content onto the screen; to the printer; or into PDF, HTML, XLS, CSV, and XML files. If you choose to generate PDF, iText's PdfGraphics2D object is used behind the scenes.
ICEbrowser is a product from ICEsoft. ICEbrowser parses and lays out advanced Web content (XML/HTML/CSS/JS); PDF is generated by rendering the parsed documents to the PdfGraphics2D object.
It's not my intention to make a complete list of products that use iText. The main purpose of these two examples is to answer the following question: Can I build iText into my commercial product? Lots of people think open source is the opposite of commercial, but that's a misunderstanding. It's not because iText is FOSS that it can only be used in other free products. It's not because iText is free that it isn't a "commercial" product. As long as you respect the license, you can use iText in your closed-source or proprietary software.

Another useful aspect of iText's Graphics2D functionality is that it opens the door to using iText in combination with other libraries with graphical output—for instance, Apache Batik, a library that is able to parse SVG; or JFreeChart, a library that will be introduced in the next section.

Drawing charts with JFreeChart
Suppose you need to make charts showing demographic information. You take the student population of the Technological University of Foobar and graph the number of students per continent.

To make these charts, you'll combine iText with JFreeChart, an interesting library developed by David Gilbert and Thomas Morgner. The Website jfree.org explains that JFreeChart is "a free Java class library for generating charts, including pie charts (2D and 3D), bar charts (regular and stacked, with an optional 3D effect), line and area charts, scatter plots and bubble charts, time series, high/low/open/close charts and candle stick charts, combination charts, Pareto charts, Gantt charts, wind plots, meter charts and symbol charts, and wafer map charts."

These charts can be rendered on an AWT (Abstract Window Toolkit) or Swing component, they can be exported to JPEG or PNG, and you can combine JFreeChart with Apache Batik to produce SVG or with iText to produce PDF.

Figure 3 shows PDFs with a pie chart and a bar chart created using JFreeChart and iText. In JFreeChart, you construct a JFreeChart object using the ChartFactory. One of the parameters passed to one of the methods to create the chart is a dataset object.



Figure 3. Foobar statistics represented in a pie chart and a bar chart. Click on thumbnail to view full-sized image.

The code to create the charts shown in Figure 3 is simple:

/* chapter12/FoobarCharts.java */
public static JFreeChart getBarChart() {
    DefaultCategoryDataset dataset = new DefaultCategoryDataset();
    dataset.setValue(57, "students", "Asia");
    dataset.setValue(36, "students", "Africa");
    dataset.setValue(29, "students", "S-America");
    dataset.setValue(17, "students", "N-America");
    dataset.setValue(12, "students", "Australia");
    return ChartFactory.createBarChart("T.U.F. Students",
        "continent", "number of students", dataset,
        PlotOrientation.VERTICAL, false, true, false);
}
public static JFreeChart getPieChart() {
    DefaultPieDataset dataset = new DefaultPieDataset();
    dataset.setValue("Europe", 302);
    dataset.setValue("Asia", 57);
    dataset.setValue("Africa", 17);
    dataset.setValue("S-America", 29);
    dataset.setValue("N-America", 17);
    dataset.setValue("Australia", 12);
    return ChartFactory.createPieChart("Students per continent",
        dataset, true, true, false);
}


The previous code snippet creates two JFreeChart objects. The following code snippet shows how to create a PDF file per chart:

/* chapter12/FoobarCharts.java */
public static void convertToPdf(JFreeChart chart,
    int width, int height, String filename) {
    Document document = new Document(new Rectangle(width, height));
    try {
        PdfWriter writer;
        writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
        document.open();
        PdfContentByte cb = writer.getDirectContent();
        PdfTemplate tp = cb.createTemplate(width, height);
        Graphics2D g2d = tp.createGraphics(width, height, new DefaultFontMapper());
        Rectangle2D r2d = new Rectangle2D.Double(0, 0, width, height);
        chart.draw(g2d, r2d);
        g2d.dispose();
        cb.addTemplate(tp, 0, 0);
    }
    catch(Exception e) {
        e.printStackTrace();
    }
    document.close();
}


The chart is drawn on a PdfTemplate. This object can easily be wrapped in an iText Image object if you want to add it to the PDF with document.add().

Dynamic PDF generation with JasperReports, Struts and a database

A requirement appeared recently as part of a Purchase Ordering application to allow a user to dynamically generate a PDF copy of the final Purchase Order to send to the supplier. Taking a look around I stumbled rather fortunately upon an API called JasperReports (JR). JasperReports is a powerful open source Java reporting tool that has the ability to deliver rich content onto the screen, to the printer or into PDF, HTML, XLS, CSV and XML files. This tutorial is aimed at the beginner JR user who is happy with J2EE web application development. It will show you how JR was used to deliver the requirement described and should convince you that it is a truly fantastic piece of kit.

What we will be doing in 1 sentence

You will define a report template using JR’s XML syntax and then bind data from a database into it and get a PDF sent back to a web browser.

What will I need

OK, you need a few bits of kit. Firstly, I hate to break it to you but I kind of cheated with defining the report template. See, thing is, you can write this manually, but I didn’t have time to learn the ins and outs of JR’s XML syntax, so I got hold of JasperAssistant, a brilliant Eclipse IDE plugin that allows a developer to visually draw their report for JR. If like me you use Eclipse, or indeed you just want to use this method for creating your report template, grab Eclipse and JasperAssistant. There is also another tool called iReport that does a similar thing without Eclipse but you’ll need to look at that yourself. So, you will need

JasperReports - head to the Download section
Either JasperAssistant, iReport OR a willingness to learn the JR XML syntax, for which there are many examples with the JR distribution. Whichever method you choose, I leave it to you to configure the environment - full instructions are available on each site.
A knowledge of J2EE web application development. In this tutorial I shall be using Struts but only in the slightest way to illustrate how to send the PDF back to the web user. You can do the same stuff with a plain old Servlet.
Creating the report template

First of all, you need to think about what data your report needs to show. In my scenario, we are talking about a Purchase Ordering application. In this application the master object is called PurchaseOrder. A PurchaseOrder has one or more LineItem objects stored in a list collection. Each of these 2 objects has other attributes that reveal information, e.g. the PurchaseOrder has a createdDate and orderId whereas a LineItem has a description and unitCost. These objects are persisted to a database. It is not really important how; it may be via a series of SQL statements or it may be via some Object Relational Mapping API such as Hibernate (which, for the record, is how I have done it), but what matters is that you have code in place to save and fetch your particular application objects/data. Now, my report layout requires that the header contain master detail such as the created date and order id, and that all the line items then be listed in a table below. Finally, some more master detail such as the delivery address is required at the foot. JR divides up a report into a series of stacked bands from top to bottom, e.g. title, header, detail and footer are names of some bands. In my case, I chose to use the header, detail and footer bands for the areas I have just mentioned.

Parameters, fields and static text

Using JasperAssistant, I was able to draw my report layout using guides and properties boxes. You may do the same or do it manually, but the main elements that I had to use were parameters, fields and static text. JR has a mechanism for binding a Map of data to a report. This is referred to as a parameter map. The idea is that each map entry’s key names a parameter defined in the report, and the entry’s value is bound to that parameter. For example, if I have an empty report with a parameter declaration orderId as follows:

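<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE jasperReport PUBLIC "-//JasperReports//DTD Report Design//EN" "http://jasperreports.sourceforge.net/dtds/jasperreport.dtd">
<jasperReport name="...">
<parameter name="orderId" class="java.lang.String"/>
</jasperReport>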


then I would need a corresponding Map

Map map = new HashMap();
map.put("orderId", "12345");

I will show how you can bind this to the report later. In addition to the parameter map mechanism, you can also use something called a DataSource. You mustn’t think a DataSource is necessarily a database, as it is with an application server. A DataSource is an object that provides methods the report can call to obtain rows of data. For my purposes, if you remember, I have a collection of LineItem elements inside my PurchaseOrder, and I need to loop through them, outputting to a table in my report. The way I achieved this was by implementing JR’s DataSource interface, JRDataSource. This interface requires an implementation to provide two methods:

public boolean next() throws JRException;
public Object getFieldValue(JRField field) throws JRException;

In your report, you must define fields in the detail band. When the report is run together with the custom implementation, JR will keep calling next() and, for each row, attempt to bind each field in the detail band through a call to getFieldValue(JRField field).

Since my implementation of JRDataSource operates on a collection of LineItem objects, I have named my DataSource LineItemDataSource. It has 2 instance variables: private List data; and private int index;. The first is an internal List (which I will populate with LineItem objects later), and the index tells us which position we have reached while iterating over the List. That’s why you need to use a List: it is indexed and has methods for getting elements at given positions. Note that, for the next() implementation below to land on the first element, index has to start at -1.

I also have an add(LineItem lineItem) for adding LineItem objects. Now, the implementation of next is quite simple:

public boolean next() throws JRException {
    index++;
    return (index < data.size());
}

I increase the index by 1, and then return a boolean saying whether the index is still within the List’s bounds. JR uses this to determine whether any more binding to fields in the detail band is required. That leaves the implementation of getFieldValue. First, though, let me show you how to define iterating fields in your report template. You need to define fields in your detail band like this:

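<textField>
<textFieldExpression><![CDATA[$F{getItemName}]]></textFieldExpression>
</textField>
<textField>
<textFieldExpression><![CDATA[$F{getItemCost}]]></textFieldExpression>
</textField>
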
The detail band is iterated over using the custom DataSource implementation which I will show you in a moment. What is important is that you declare your textField elements along with their child textFieldExpression elements. The textFieldExpression tells the JR binding process what fields (by name) to look for in the DataSource. You can call these whatever you like, but as you can see in my case, I have decided to call them getXXX like a traditional bean accessor. Why have I done this? Well, because my LineItem object has matching accessor methods. So now let’s return to the custom DataSource implementation of getFieldValue. Here is the full listing:

// needs: import java.lang.reflect.Method;
public Object getFieldValue(JRField field) throws JRException {
    LineItem lineItem = (LineItem) data.get(index);
    Object value = null;
    try {
        Object[] args = {};
        Class[] paramTypes = {};
        Class lineItemClass = lineItem.getClass();
        // the field name doubles as the accessor name, e.g. getItemName
        Method getMethod = lineItemClass.getDeclaredMethod(field.getName(), paramTypes);
        // concatenating with "" turns whatever the accessor returns into a String
        value = "" + getMethod.invoke(lineItem, args);
    } catch (Exception e) {
        throw new JRException(e.getMessage());
    }
    return value;
}

Clever huh? You don’t have to do it like this, but I have decided to use Java Reflection in order to dynamically call the appropriate LineItem method for the JRField parameter. That is why I named my textFieldExpression elements with getXXX. So now, if I were ever to add a new attribute to LineItem that I wanted in my report, I would only need to add it to LineItem with its accessors, and then to the report. I can leave my custom DataSource alone. One last note: I have defined all my fields as String even though my LineItem has attributes of type float, int and Calendar. I am not really bothered whether the report uses the correct data types, but you can do that if you want; just set it up with your fields.
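
To see the pieces working together without a JasperReports install, here is a minimal sketch of the same idea. It is not the article's actual class: RowSource is a hypothetical stand-in for the two-method contract of JRDataSource (the real interface takes a JRField and throws JRException), the LineItem bean and its two accessors are assumed from the field names used above, and render() only imitates the way the report filler drains a data source.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean; only the two accessors named in the report fields are assumed.
class LineItem {
    private final String itemName;
    private final float itemCost;

    LineItem(String itemName, float itemCost) {
        this.itemName = itemName;
        this.itemCost = itemCost;
    }

    public String getItemName() { return itemName; }
    public float getItemCost() { return itemCost; }
}

// Stand-in for JRDataSource: same two operations, but keyed by field name
// and without the checked JRException.
interface RowSource {
    boolean next();
    Object getFieldValue(String fieldName);
}

class LineItemRowSource implements RowSource {
    private final List<LineItem> data = new ArrayList<>();
    // Starts before the first element so the first next() lands on index 0.
    private int index = -1;

    void add(LineItem lineItem) { data.add(lineItem); }

    @Override
    public boolean next() {
        index++;
        return index < data.size();
    }

    @Override
    public Object getFieldValue(String fieldName) {
        LineItem lineItem = data.get(index);
        try {
            // The field name doubles as the accessor name, e.g. "getItemName".
            Method getter = lineItem.getClass().getMethod(fieldName);
            return String.valueOf(getter.invoke(lineItem));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable accessor " + fieldName, e);
        }
    }
}

class LineItemRowSourceDemo {
    // Imitates the filler: call next(), then look up each field once per row.
    static List<String> render(RowSource source, String... fieldNames) {
        List<String> rows = new ArrayList<>();
        while (source.next()) {
            StringBuilder row = new StringBuilder();
            for (String name : fieldNames) {
                if (row.length() > 0) row.append(" | ");
                row.append(source.getFieldValue(name));
            }
            rows.add(row.toString());
        }
        return rows;
    }

    public static void main(String[] args) {
        LineItemRowSource source = new LineItemRowSource();
        source.add(new LineItem("Stapler", 4.5f));
        source.add(new LineItem("Paper", 12.0f));
        render(source, "getItemName", "getItemCost").forEach(System.out::println);
    }
}
```

The detail worth noticing is the starting value of index: because next() increments before anything is read, starting at 0 would silently skip the first LineItem.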

Putting it all together

So, you have hopefully got an idea of how JR works, particularly for my Purchase Order scenario. You should understand that a report template is defined by you, either manually or using an editor like JasperAssistant. You will also appreciate the 2 ways in which you can bind data to this report: through parameters and through fields. Furthermore, you have seen a clever way to use both methods in binding a master object with an internal collection of elements to a report template.

So now you probably want to see how to get the PDF back to the user. Well, remember that I am using a web application here, but you don’t necessarily need to. First of all, I need to load my PurchaseOrder with its collection. You can do this however you like. In my case, I use Hibernate to load the object out of the database:

PurchaseOrder po = poDAO.load(id);

Now, I need to set up a parameter map for the master details:

Map parameterMap = new HashMap();
parameterMap.put("orderId", po.getOrderId());
parameterMap.put("createdDate", convertToDateString(po.getCreated()));
parameterMap.put("deliveryAddress", po.getDeliveryAddress());

There are a lot more! But this will do. Finally, I need to add my LineItem collection to my custom DataSource, LineItemDataSource:

LineItemDataSource lineItemDataSource = new LineItemDataSource(po.getLineItems());

And last of all, let’s set up the response to the browser, and bind the parameter map and custom DataSource.

response.setContentType("application/pdf");
response.addHeader(
    "Content-Disposition",
    "attachment; filename=PO - " + po.getReference() + ".pdf");
try {
    // load the compiled report from the classpath and stream the PDF to the response
    JasperRunManager.runReportToPdfStream(
        getClass().getClassLoader().getResourceAsStream("com/mycomp/po/pof.jasper"),
        response.getOutputStream(),
        parameterMap,
        lineItemDataSource);
} catch (Exception e) {
    e.printStackTrace(System.out);
    logger.error(e);
}

Right, so I have used just one of the many ways in which you can bind to the report. You will of course need to find out how to compile your report template. When you author your template it is in .jrxml format, and this needs to be compiled into a .jasper file, which you can do either automatically with JasperAssistant or manually with the tools bundled with JR. In my example here, the compiled report is located in the class structure and I dynamically load it as an InputStream, as required by the runReportToPdfStream method.

You should examine the JR API for all the other alternatives, including writing PDFs to a file and even producing HTML output rather than PDF. In a standalone application (rather than a web application) you would need to use slightly different calls, which can also be found in the JR API. Some of you have asked how to send the result direct to the browser. Well, that’s easy - the code above forces a Save to Disk for the PDF by using the Content-Disposition header, so just comment out the response.addHeader call:

/* comment the save to disk feature out so that the pdf goes straight to the browser
response.addHeader(
    "Content-Disposition",
    "attachment; filename=PO - " + po.getReference() + ".pdf");
*/
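
If you would rather keep a suggested filename in both cases, one alternative (not what the code above does) is to switch the disposition type between attachment and inline instead of dropping the header. The helper below is a minimal sketch under that assumption; the class and method names are made up for illustration. It also quotes the filename, since an unquoted name containing spaces, like the one above, is not reliably handled by every browser.

```java
class ContentDisposition {
    // "attachment" prompts a save dialog; "inline" asks the browser to display the PDF itself.
    static String header(boolean saveToDisk, String filename) {
        String type = saveToDisk ? "attachment" : "inline";
        // Quote the filename so spaces survive, and escape the two characters
        // that would otherwise end or distort the quoted string.
        String safe = filename.replace("\\", "\\\\").replace("\"", "\\\"");
        return type + "; filename=\"" + safe + "\"";
    }

    public static void main(String[] args) {
        System.out.println(header(true, "PO - 12345.pdf"));
        System.out.println(header(false, "PO - 12345.pdf"));
    }
}
```

In the servlet code this would be used as response.addHeader("Content-Disposition", ContentDisposition.header(false, "PO - " + po.getReference() + ".pdf")).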

Conclusion

This tutorial has covered some specific aspects of the fantastic JasperReports API that may or may not be suitable for your own projects. I hope if nothing else, it provides an insight into one way of using the API or grounds you in the basics. There is so much more to JR that I have not used myself so take time once you get the idea to look at the bundled examples and API to make sure you are making the right choices.

JasperAssistant was an invaluable piece of kit for this job. It is quite tough getting to grips with the report template XML syntax, especially when your report needs pixel-perfect alignment and so forth. I did not go into a great deal of depth with layout elements like boxes and lines, but I have used them to draw the table boundaries around my detail LineItem band. Good luck, and whether or not this article was helpful, leave a comment.

Comparing FOP and JasperReports

Anybody looking for OSS reporting solutions in Java usually has to make a choice between Apache FOP and Jasper Reports*. While having somewhat different feature sets and addressing distinct reporting solutions, the two APIs boil down to the same basic thing : generate a report from an XML file (or stream/string/whatever). FOP has a clear advantage of standardization (based on XSL-Formatting Objects) while Jasper plays more in the pragmatic field of obtaining those 80% results with a minimum of effort and uses a proprietary XML format.

But FOP is not a standalone reporting solution : it's just a way of transforming XSL-FO files into a report. In order to fill the report with the necessary data, the obvious choice is a templating engine such as Jakarta Velocity. Thus a FOP report creation is a two-step operation :

create the XML report via Velocity
feed the XML stream to FOP
Jasper alleviates this problem by including its own binding engine, the only restriction being that input data should support some constraints (such as putting your 'rows' inside a JRDataSource).

Both Jasper and FOP allow inclusion of graphic files, and the usual formats (GIF, JPEG) are supported; however, FOP has the nice bonus of rendering SVG inside reports. Unfortunately, this comes at the price of using the Batik SVG Toolkit, which is a bulky (close to 2MB) and rather slow API. While processing your dynamic charts as XML files (Velocity again) is a seductive idea, the abysmal performance of SVG rendering will make you give up in no time. Unfortunately, I speak from experience.

At first sight, FOP has a lot more output format options than Jasper Reports. Of course there's PDF and direct printing via AWT, but also Postscript, PCL, MIF as well as SVG. These choices are quite intriguing, since Postscript and PCL are printing formats (easily obtained by redirecting the specific printer queue into a file), MIF is a rather obscure Adobe format (for Framemaker) and SVG … well, an SVG report is too darn slow to be usable (yes, I was foolish enough to try this, too). Jasper again makes a pragmatic choice by offering really useful output formats such as HTML, CSV and XLS (never underestimate the power of Excel), and of course direct printing via AWT and PDF.

While FOP's latest version (0.20.5) was released almost a year ago (summer 2003), Jasper Reports is bubbling with activity - Teodor releases a minor version every one or two months (the latest being 0.5.3 on 18.05.2004).

I've decided to use as a 'lab rat' one of the apps developed during my 'startup days': the client GUI is written in Swing and features a few mildly complex reports generated using Velocity+FOP. The FOP version is 0.20.4 (the current version back in Q1-2003, when we had to quit dreaming about the 'next financing round' and development halted) but, as I already told you, FOP has evolved little since then. So it's perfectly reasonable to use this implementation as a baseline for comparison with Jasper (by contrast, Jasper has evolved a great deal since Q1-2003).

Back then, the report development cycle was quite simplistic. The XSL-FO templates were written by hand in a text editor, and the application code was run (via a JUnit test case and some necessary configuration and business data mocking) in order to generate a PDF report. In the case of errors, we got feedback by examining the error traces. Visual feedback was given by the PDF output. While simple to perform, this cycle became extremely tiresome after a while, as there was significant overhead: start a new JVM, initialize FOP, fire up Acrobat Reader (plus we were using some crappy - even by the standards of 2003 - 1GHz machines with 256/512MB RAM).

A WYSIWYG editor would have been nice, so one of my coworkers did some research, and the only solution he found was XMLSpy (Stylevision was not available back then) - but at 800USD/seat this was 'a bit' pricey** for us (only the Enterprise flavor covers FO WYSIWYG editing!?). Another interesting idea was to use one of the conversion tools (from RTF to FO) such as Jfor, XMLMind or rtf2fo (of these products, only Jfor is free, but it is feature-poor). What stopped us was that the generated FO was overly complex: we needed comprehensible cut_the_crap files because we were going to integrate them into Velocity templates. And when you have tens of tags and blocks inside blocks and not the slightest idea which one is a row, which one is a column and which one is a transparent dumbass artefact, it's a gruesome trial-and-error task to integrate even simple VTL loops. And you'd have to do this each time you change something in the report: yikes!

Conclusion: the report development cycle was primitive for FOP and there was no way we could change it.

Things are quite different for Jasper Report : there are a lot of available report designers, and some of them are free. While the complete list is on Jasper Report site, I'd like to note at least three of them :

iReport is a Swing editor, and a very interesting one because it covers not only the basic Jasper functionality but also supplementary features such as barcode support (which is admittedly as easy as embedding a barcode font in Jasper with two lines of XML, but still easier with a mouse click). iReport is free, which is excellent, but it is a standalone app without IDE integration and, like any complex Swing app, is quite slow and a memory hog.
If you are a developer using Eclipse, you'll appreciate two graphical editors based on Eclipse GEF, available as Eclipse plugins: JasperAssistant and SunshineReports. Neither of them is free and, at least on paper, the functionality seems identical, but SunshineReports has only the older 1.1 version downloadable, which is free but does NOT work with recent builds of Eclipse 3. How the heck am I supposed to test it? By contrast, Assistant has a much more relaxed attitude, allowing the download of a free trial of the latest version of their product. Maybe too relaxed, though, because - even if (theoretically) limited in number of usages - you can use the trial as much as you want to***. But if you are serious about doing Jasper in Eclipse you should probably buy Assistant, available for a rather decent 59USD price tag. I am currently using it and it's a good tool.

So much for the tools, let's get the job done. The bad part: if you're experienced with FO templates, don't expect to be immediately proficient with Jasper, even with a GUI editor. The structure of an FO document has strong analogies with HTML: you have tables, rows, cells, stuff like that, inside special constructs called blocks. It's relatively easy to use a language such as VTL to create nested tables, alternating colors and other data layout tricks. You can even render tree-organized data via a recursive VTL macro, and everything is smooth and easy to understand.

Jasper is completely different, and at first sight you'll be shocked by its apparent lack of functionality: only rectangles, lines, ellipses, images, boilerplate text and fields (variable text). Each one of these elements has an extensive set of properties covering when the element should be displayed, its stretch type, the associated expression for its value and so on. Basically, you have to write Java code instead of Velocity macros and call this code from the corresponding properties of various report elements. It feels a little awkward at the beginning, but after a while it comes quite naturally. As for nesting and other advanced layouts, there is the powerful concept of the 'subreport'. And yes, I've managed to render a tree using a recursive subreport, but given the poor performance the final choice was to flatten the data into a vector and then feed it into a simple Jasper report. So pay attention to the depth of 'subreporting'.
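
Flattening a tree for a single-level report can be sketched as a pre-order walk that records each node's depth, so the report can indent labels without nesting subreports. The TreeNode and FlatRow types below are hypothetical, since the original data model is not shown here; only the flattening idea comes from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node standing in for the application's hierarchical data.
class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label) { this.label = label; }

    // Returns this node so that children can be attached fluently.
    TreeNode add(TreeNode child) {
        children.add(child);
        return this;
    }
}

// One flat row per node; depth lets the report indent the label.
class FlatRow {
    final int depth;
    final String label;

    FlatRow(int depth, String label) {
        this.depth = depth;
        this.label = label;
    }

    @Override
    public String toString() { return depth + ":" + label; }
}

class TreeFlattener {
    // Pre-order walk: parent first, then its children in their original order.
    static List<FlatRow> flatten(TreeNode root) {
        List<FlatRow> rows = new ArrayList<>();
        walk(root, 0, rows);
        return rows;
    }

    private static void walk(TreeNode node, int depth, List<FlatRow> rows) {
        rows.add(new FlatRow(depth, node.label));
        for (TreeNode child : node.children) {
            walk(child, depth + 1, rows);
        }
    }
}
```

The resulting list can then back a plain data source, with the depth value driving indentation in the detail band.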

Once the reports were completely migrated, I've benchmarked a simple one (without SVG, charts, barcodes or other 'exotic' characteristics). The test machine is a 2.4GHz P4 w 512MB Toshiba Satellite laptop. In the case of FOP, the compiled velocity template and the FOP Driver are cached between successive runs. In the case of Jasper, the report is precompiled and loaded only on first run, then refilled with new data before each generation. The lazy loading and caching of reporting engines is the cause of important time differences between the generation of the first report and the subsequent reports. Delta memory is measured after garbage collection. The values presented are median for 10 runs of the 'benchmark report'.

                  First run    Subsequent runs    Delta memory
Velocity + FOP    10365 ms     381 ms             850 KB
Jasper Reports    1322 ms      82 ms              1012 KB
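
A measurement of this shape can be sketched in plain Java as follows. This is not the harness used for the numbers above, which is not shown; it assumes the first run is timed on its own, the median is taken over the later runs, and memory is read after requesting a garbage collection. The Runnable stands in for whatever generates one report.

```java
import java.util.Arrays;

class ReportBenchmark {
    // Keeps the cold first run apart from the median of the warm runs.
    static final class Result {
        final long firstRunNanos;
        final long medianWarmNanos;

        Result(long firstRunNanos, long medianWarmNanos) {
            this.firstRunNanos = firstRunNanos;
            this.medianWarmNanos = medianWarmNanos;
        }
    }

    // Median of a non-empty array; averages the two middle values for even lengths.
    static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Runs the task once cold, then warmRuns more times.
    static Result measure(Runnable generateReport, int warmRuns) {
        if (warmRuns < 1) {
            throw new IllegalArgumentException("warmRuns must be at least 1");
        }
        long first = timeNanos(generateReport);
        long[] warm = new long[warmRuns];
        for (int i = 0; i < warmRuns; i++) {
            warm[i] = timeNanos(generateReport);
        }
        return new Result(first, median(warm));
    }

    // Used heap after asking for a GC; gc() is only a request, so treat the value as approximate.
    static long usedHeapAfterGc() {
        Runtime rt = Runtime.getRuntime();
        rt.gc();
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

Delta memory would be the difference between two usedHeapAfterGc() readings taken before and after the runs.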

While I am totally pro-Jasper after this short experiment, it is important to note that commercial and well-maintained FO rendering engines such as RenderX XEP claim better performance than FOP. Depending on your requirements, environment and legacy reporting apps, an FO-based solution might be better, especially when report generation happens only on the server side.

Good advice for creating XML

The use of XML has become widespread, but much of it is not well formed. When it is well formed, it's often of poor design, which makes processing and maintenance very difficult. And much of the infrastructure for serving XML can compound these problems. In response, there has been some public discussion of XML best practices, such as Henri Sivonen's document, "HOWTO Avoid Being Called a Bozo When Producing XML." Uche Ogbuji frequently discusses XML best practices on IBM developerWorks, and in this column, he gives you his opinion about the main points discussed in such articles.

I have been discussing XML best practices in this column and in other series for years. Others, such as fellow columnist Elliotte Rusty Harold, have covered it as well. The more XML experts that join the discussion of XML design principles, the better, so the community can converge on solid advice for developers at all levels of XML adoption. In this article, using a recent document and a classic one, you learn more details about XML best practices.

Enter the no bozo zone

Henri Sivonen wrote a useful article, "HOWTO Avoid Being Called a Bozo When Producing XML" (see Resources). Adopting the perspective of XML-based Web feed formats, such as RSS and Atom, he goes over his Dos and Don'ts for producing well-formed XML with namespaces. As he says in his introduction:

There seem to be developers who think that well-formedness is awfully hard -- if not impossible -- to get right when producing XML programmatically and developers who can get it right and wonder why the others are so incompetent. I assume no one wants to appear incompetent or to be called names. Therefore, I hope the following list of Dos and Don'ts helps developers to move from the first group to the latter.

The first bit of advice Henri gives is, "Don't think of XML as a text format." I think this is dangerous advice. Certainly his main point is valid -- you cannot be as careless in producing or editing XML as you would a simple text document, but this applies to all text formats with any structure. However, saying that XML is not text is denying one of the most important characteristics of XML, one that is enshrined in the very definition of XML in the specification. ("A textual object is a well-formed XML document [if it conforms to this specification.]") Henri's statement is also confusing because there is a technical definition of text in XML that is essentially the sequence of characters interpreted as XML. Text is not merely what goes within leaf elements or within attributes -- technically called character data. Text is the fundamental fabric of all XML entities, so to say that XML is not text is a contradiction. I think it's more useful to highlight the specific ways in which XML differs from text formats with which developers might already be familiar.

This comment is an example of how Henri's advice is colored by his interest in the problem of generating well-formed Web feeds. He is right to warn people that carelessly slapping strings together and hoping they are well formed is a dangerous course. I too have written articles advising people to use mature XML toolkits rather than simple text tools when generating XML (see Resources). My concern is that the way in which Henri couches this advice is a bit confusing and could be misconstrued in the broader context of XML processing. He reiterates his advice in the sections, "Don't use text-based templates" and "Don't print". I think this should be summarized as: "Do not use mechanisms that you're not sure will result in well-formed XML." That's very important advice indeed. One approach to safe XML generation is sending SAX events, as Henri suggests in, "Use a tree or a stack (or an XML parser)." If you do so, however, do not assume you are home free. The SAX tools you use might not do all the necessary well-formedness checking. For example, some Unicode characters are not allowed in XML. You may need an additional level of checking to account for such issues.
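
The extra level of checking mentioned above can be as small as a scan against the Char production of XML 1.0, which allows #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF. The sketch below is one way to write it; the class and method names are illustrative, and it says nothing about which toolkits already perform this check.

```java
class XmlChars {
    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegal(int codePoint) {
        return codePoint == 0x9 || codePoint == 0xA || codePoint == 0xD
            || (codePoint >= 0x20 && codePoint <= 0xD7FF)
            || (codePoint >= 0xE000 && codePoint <= 0xFFFD)
            || (codePoint >= 0x10000 && codePoint <= 0x10FFFF);
    }

    // Returns the char index of the first code point XML 1.0 forbids, or -1 if the text is clean.
    static int firstIllegalIndex(String text) {
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (!isLegal(codePoint)) {
                return i;
            }
            // Step over both halves of a surrogate pair when there is one.
            i += Character.charCount(codePoint);
        }
        return -1;
    }
}
```

Running such a check on character data before handing it to a SAX or other serializer catches control characters and unpaired surrogates that escaping alone does not fix.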

Henri rightly suggests that users not try to manage namespaces by hand. As I've discussed on developerWorks, XML namespaces require a great deal of care. His suggestion that developers only think in terms of universal name [namespace Uniform Resource Identifier (URI) plus local name] is generally sound, but sometimes a developer cannot avoid dealing with prefixes or XML declarations. In specifications, such as XSLT, a QName (prefix/local name combination) can be used within attribute values, and the prefix is supposed to be interpreted according to in-scope namespace declarations. This kind of pattern is called a QName in context. In this case, the developer must have control over the declared prefix or the resulting XML processing will fail. When developers do manage their own namespace declarations, the result is often messy because of the complexities of XML namespaces.
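
In Java, the universal-name view is what javax.xml.namespace.QName compares: equality is defined by namespace URI and local part, and the prefix is ignored. The small sketch below spells that out; the namespace URI and element name are made-up examples.

```java
import javax.xml.namespace.QName;

class UniversalNames {
    // True when two names agree on namespace URI and local part, whatever their prefixes.
    static boolean sameUniversalName(QName a, QName b) {
        return a.getNamespaceURI().equals(b.getNamespaceURI())
            && a.getLocalPart().equals(b.getLocalPart());
    }

    public static void main(String[] args) {
        // Same universal name written with two different prefixes.
        QName first = new QName("http://example.org/ns", "item", "x");
        QName second = new QName("http://example.org/ns", "item", "y");
        System.out.println(sameUniversalName(first, second));
        System.out.println(first.equals(second));
    }
}
```

A QName in context is the case this comparison cannot cover: there the prefix is plain text inside an attribute value, and only the in-scope namespace declarations give it meaning.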

One way to clean up namespace syntax that might become messy while passing through a pipeline of XML processing is to insert a canonicalization step at the end of the pipeline. XML canonicalization eliminates the syntactic variations permitted by XML 1.0 and XML namespaces, including different namespace declaration patterns. Canonicalization will not eliminate all the issues that make namespace declarations treacherous to developers. It does not help with QName-in-context problems, since it does not change the prefixes used in a document, but it does reduce the mess of namespace declarations to the point where you can easily spot problems or even write code to fix them automatically. The GenX library, which is one of the XML generation options Henri suggests, automatically generates canonical XML, and many other toolkits provide canonicalization as an option.

Henri's advice about Unicode and character handling is almost completely sound. However, in "Avoid adding pretty-printing white space in character data," I think the case is a bit overstated. Pretty-printing XML is safe in most cases between elements, rather than within elements with character data. As Henri says, if you have the XML in Listing 1, it is usually not safe to render it as in Listing 2.

Listing 1. XML sample

bar



Listing 2. XML sample with white space added to character data


bar



But it is usually safe to pretty-print the XML in Listing 3, so that the output is as in Listing 4.

Listing 3. Another XML sample

bar



Listing 4. XML sample in Listing 3 with white space added to character data


bar



Many XML serializer tools understand this distinction between relatively safe and relatively unsafe pretty-printing. It is important to understand that the form of pretty-printing shown in Listings 3 and 4 can cause distortion if white space is added to mixed content. Such problems can be avoided if the serialization is guided by a schema. In practice, though, most vocabularies that use mixed content are not so sensitive to white space normalization, so don't worry too much about pretty-printing. You should be aware of the issues, and be sure there is an option to turn pretty-printing off (preferably, the default should be not to pretty-print). Henri recommends a pretty-printing practice as in Listing 5, but I disagree because I think it makes for ugly markup that's not friendly to manipulation by people.

Listing 5. Pretty-printing convention suggested by Henri Sivonen but not recommended by this author

>bar>
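
As an illustration of keeping pretty-printing optional and off by default, here is a minimal sketch using the JDK's built-in javax.xml.transform serializer, where OutputKeys.INDENT is the switch. The element names are hypothetical, and the exact white space produced when indenting is on varies between JDK versions, so no indented output is shown.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlSerializer {
    // Builds <order><item>bar</item></order>; the element names are made up for the example.
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element order = doc.createElement("order");
            Element item = doc.createElement("item");
            item.setTextContent("bar");
            order.appendChild(item);
            doc.appendChild(order);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes the document; pretty-printing happens only when explicitly requested.
    static String serialize(Document doc, boolean prettyPrint) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, prettyPrint ? "yes" : "no");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With prettyPrint set to false, the serializer adds no white space of its own; with it set to true, white space appears between the element tags, which is the relatively safe case described above.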


From the monastery

Switching to a very different speed, the second resource I shall explore in this article is Simon St. Laurent's "Monastic XML" (see Resources). This is a collection of brief essays with advice on how to process and even think about XML for maximum effect. Simon uses the metaphors of monasticism and asceticism to suggest that it is dangerous to load XML too heavily with baggage that does not suit its simple, textual roots. In "Marking-up at the foundation," he discusses the fundamental roles of character data and markup (elements and attributes). In "Naming things and reading names," he explains why the generic identifier (also called the element type name) is an important concept and how it should be the sole primary key to the structure of the marked-up information. Realistically, if you're using XML namespaces, the primary key is the universal name (namespace URI plus local name), and this complication is one of the reasons Simon urges caution in "Namespaces as opportunity." "Accepting the discipline of trees" calls out one of XML's dirty secrets: Even though it seems that XML's hierarchical structure could be easily extended to graph structure, in practice, the modeling of graphs in XML has proven a bit difficult. But by far the most important lesson on the "Monastic XML" site is found in "Optimizing markup for processing is always premature." XML is a declarative technology, and therein lie its strengths, as well as its frustrations, for many developers. Developers who try to pull XML design too close to the details of processing generally end up making that processing more difficult in the long term. The key to success with XML is to focus on the nature of the information that needs to be represented in the abstract, separately from the technical design of the systems that need to process that information.