Tuesday, June 12, 2012

OSGi case study: a modular vert.x

OSGi enables Java code to be divided cleanly into modules known as bundles with access to code and resources controlled by a class loader for each bundle. OSGi services provide an additional separation mechanism: the users of an interface need have no dependency on implementation classes, factories, and so forth.

The following case study aims to make the above advantages of OSGi bundles and services concrete. It takes an interesting Java project, vert.x, and shows how it can be embedded in OSGi and take advantage of OSGi's facilities.

Disclaimer: I am not proposing to replace the vert.x container or its module system. This is primarily a case study in the use of OSGi although some of the findings should motivate improvements to vert.x, especially when it is embedded in applications with custom class loaders.

vert.x

The vert.x open source project provides a JVM alternative to node.js: an asynchronous, event-driven programming model for writing web applications in a number of languages including Java, Groovy, JavaScript, and Ruby.

vert.x supports HTTP as well as modern protocols such as WebSockets and sockjs (which works in more browsers than WebSockets and can traverse firewalls more easily).

vert.x has a distributed event bus which allows JSON messages to be propagated between vert.x applications known as verticles and shared code libraries known as busmods. A busmod is a special kind of verticle which handles events from the event bus. vert.x ships some busmods, such as a MongoDB 'persistor', and users can write their own.

vert.x components

vert.x's threading model is interesting as each verticle (or busmod) is bound to a particular thread for its lifetime and so the code of a verticle needn't be concerned about thread safety. A pool of threads is used for dispatching work on verticles and each verticle must avoid blocking or long-running operations so as not to impact server throughput (vert.x provides separate mechanisms for implementing long-running operations efficiently). This is similar to the quasi-reentrant threading model in the CICS transaction processor.1

Of particular interest here is the vert.x module system which has a class loader per verticle and code libraries, known as modules, which are loaded into the class loader of each verticle which uses them. So there is no way to share code between verticles except via the event bus.

vert.x has excellent documentation including a main manual, a java manual (as well as manuals for other language), tutorials, and runnable code examples.

OSGi

If you're not already familiar with OSGi, read my OSGi introduction post, but don't bother following the links in that post right now - you can always go back and do that later.

Embedding vert.x in OSGi

I did this in several small steps which are presented in turn below: converting vert.x JARs to OSGi bundles and then modularising verticles, busmods, and event bus clients.

Converting vert.x JARs to OSGi Bundles

The vert.x manual encourages users to embed vert.x in their own applications by using the vert.x core JAR, so the first step in embedding vert.x in OSGi was to convert the vert.x core JAR into an OSGi bundle so it could be loaded into an OSGi runtime.

I used the bundlor tool, although other tools such as bnd would work equally well. Bundlor takes a template and then analyses the bytecode of the JAR to produce a new JAR with appropriate OSGi manifest headers. Please refer to the SpringSource Bundlor documentation for further information about bundlor for now as the Eclipse Virgo Bundlor documentation is not published at the time of writing even though the bundlor project has transferred to Eclipse.org.

The template for the vert.x core JAR is as follows:

Bundle-ManifestVersion: 2
Bundle-SymbolicName: org.vertx.core
Bundle-Version: 1.0.0.final
Bundle-Name: vert.x Core
Import-Template:
 org.jboss.netty.*;version="[3.4.2.Final,4.0)",
 org.codehaus.jackson.*;version="[1.9.4,2.0)",
 com.hazelcast.*;version="[2.0.2,3.0)";resolution:=optional,
 groovy.*;resolution:=optional;version=0,
 org.codehaus.groovy.*;resolution:=optional;version=0,
 javax.net.ssl;resolution:=optional;version=0,
 org.apache.log4j;resolution:=optional;version=0,
 org.slf4j;resolution:=optional;version=0
Export-Template: *;version="1.0.0.final"

(The template and all the other parts of this case study are available on github.)

What this does is define the valid range of versions for packages that the JAR depends on (the range "0" represents the version range of 0 or greater), whether those packages are optional or mandatory, and what version the JARs own packages should be exported at. It also gives the bundle a symbolic name (used to identify the bundle), a version, and a (descriptive) name. Armed with this information, OSGi then wires together the dependencies of bundles by delegating class loads and resource lookups between bundle class loaders.

Thankfully the netty networking JAR and jackson JSON JARs which the vert.x core JAR depends on ship with valid OSGi manifests.

As a sniff test that the manifest was valid, I tried deploying the vert.x core bundle in the Virgo kernel. This was simply a matter of placing the vert.x core bundle in the pickup directory and its dependencies in the repository/usr directory and then starting the kernel. The following console messages showed the vert.x core bundle was installed and resolved successfully:
<hd0001i> Hot deployer processing 'INITIAL' event for file 'vert.x-core-1.0.0.final.jar'.
<de0000i> Installing bundle 'org.vertx.core' version '1.0.0.final'.
<de0001i> Installed bundle 'org.vertx.core' version '1.0.0.final'.
<de0004i> Starting bundle 'org.vertx.core' version '1.0.0.final'.
<de0005i> Started bundle 'org.vertx.core' version '1.0.0.final'.
Using the Virgo shell, I then checked the wiring of the bundles:

osgi> ss
"Framework is launched."

id      State       Bundle
0       ACTIVE      org.eclipse.osgi_3.7.1.R37x_v20110808-1106
...
89      ACTIVE      org.vertx.core_1.0.0.final
90      ACTIVE      jackson-core-asl_1.9.4
91      ACTIVE      jackson-mapper-asl_1.9.4
92      ACTIVE      org.jboss.netty_3.4.2.Final

osgi> bundle 89
org.vertx.core_1.0.0.final [89]
  ...
  Exported packages
    ...
    org.vertx.java.core; version="1.0.0.final"[exported]
    org.vertx.java.core.buffer; version="1.0.0.final"[exported]
    ...
  Imported packages
    org.jboss.netty.util; version="3.4.2.Final"<org.jboss.netty_3.4.2.final [92]>
    ...
    org.codehaus.jackson.map; version="1.9.4"<jackson-mapper-asl_1.9.4 [91]>
    ...
I also converted the vert.x platform JAR to an OSGi bundle in similar fashion as it was needed later.

Modularising Verticles

A typical verticle looks like this:
public class ServerExample extends Verticle {

  public void start() {
    vertx.createHttpServer().requestHandler(new Handler<httpserverrequest>() {
      public void handle(HttpServerRequest req) {
        ...
      }
    }).listen(8080);
  }
}
When the start method is called it creates a HTTP server, registers a handler with the server, and sets the server listening on a port. Apart from the body of the handler, the remainder of this code is boilerplate. So I decided to factor out the boilerplate into a common OSGi bundle (org.vertx.osgi) and replace the verticle with a modular verticle bundle containing the handler and some declarative metadata equivalent to the boilerplate. The common OSGi bundle uses the whiteboard pattern to listen for specific kinds of services in the OSGi service registry, create boilerplate based on the metadata, and register the handler with the resultant HTTP server.

Let's look at the modular verticle bundle. Its code consists of a single HttpServerRequestHandler class:2
public final class HttpServerRequestHandler implements Handler<httpserverrequest> {

    public void handle(HttpServerRequest req) {
        ...
    }

}
It also has declarative metadata in the form of service properties which are registered along with the handler in the OSGi service registry. I used the OSGi Blueprint service to do this, although I could have used OSGi Declarative Services or even registered the service programmatically using the OSGi API. The blueprint metadata is a file blueprint.xml in the bundle that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
        
    <service interface="org.vertx.java.core.Handler" ref="handler">
        <service-properties>
            <entry key="type" value="HttpServerRequestHandler">
            <entry key="port" value="8090">
        </service-properties>
    </service>
        
    <bean class="org.vertx.osgi.sample.basic.HttpServerRequestHandler"
          id="handler"/>

</blueprint>
This metadata declares that a HTTP server should be created (via the type service property), the handler registered with it, and the server set listening on port 8090 (via the port service property). This all happens courtesy of the whiteboard pattern when the org.vertx.osgi bundle is running as we'll see below.

Notice that the modular verticle depends only on the Handler and HttpServerRequest classes whereas the original verticle also depends on the Vertx, HttpServer, and Verticle classes. This also makes things quite a bit simpler for those of us who like unit testing (in addition to in-container testing) as fewer mocks or stubs are required.

So what do we now have? Two bundles to add to the bundles we installed earlier: an org.vertx.osgi bundle which encapsulates the boilerplate code and an application bundle representing a modular verticle. We also need a Blueprint service implementation -- as of Virgo 3.5, a Blueprint implementation is built in to the Virgo kernel. The following interaction diagram shows one possible sequence of events:


In OSGi, each bundle has its own lifecycle and in general bundles are designed so that they will function correctly regardless of the order in which they is started relative to other bundles. In the above example the assumed start order is: blueprint service, org.vertx.osgi bundle, modular verticle bundle. However, the org.vertx.osgi bundle could start after the modular verticle bundle and the end result will be the same: a server will be created and the modular verticle bundle's handler registered with the server and the server set listening. If the blueprint service is started after the org.vertx.osgi and modular verticle bundles, then the org.vertx.osgi bundle won't detect the modular verticle bundle's handler service appear in the service registry until the blueprint service has started, but then the end result will again be the same.

The github project contains the source for some sample modular verticles: a basic HTTP vertical (which runs on port 8090) and a sockjs verticle (which runs on port 8091). The org.vertx.osgi bundle needed more code to support sockjs and the modular sockjs verticle needed to provide a sockjs handler in addition to a HTTP handler.

Modularising BusMods

The MongoDB persistor is a typical example of a busmod which processes messages from the event bus:
public class MongoPersistor extends BusModBase implements Handler<message<jsonobject>> {

  private String address;
  private String host;
  private int port;
  private String dbName;

  private Mongo mongo;
  private DB db;

  public void start() {
    super.start();

    address = getOptionalStringConfig("address", "vertx.mongopersistor");
    host = getOptionalStringConfig("host", "localhost");
    port = getOptionalIntConfig("port", 27017);
    dbName = getOptionalStringConfig("db_name", "default_db");

    try {
      mongo = new Mongo(host, port);
      db = mongo.getDB(dbName);
      eb.registerHandler(address, this);
    } catch (UnknownHostException e) {
      logger.error("Failed to connect to mongo server", e);
    }
  }

  public void stop() {
    mongo.close();
  }

  public void handle(Message<jsonobject> message) {
    ...
  }

}
Again there is a mixture of boilerplate code (to register the event bus handler), start/stop logic, configuration handling, and the event bus handler itself. I applied a similar approach to the other verticles and separated out the boilerplate code into the org.vertx.osgi bundle leaving the handler and metadata (including configuration) in a modular busmod. The persistor's dependency on the MongoDB client JAR (mongo.jar) is convenient because this JAR ships with a valid OSGi manifest.

Here's the blueprint.xml:
<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">
        
    <service ref="handler" interface="org.vertx.java.core.Handler">
        <service-properties>
            <entry key="type" value="EventBusHandler"/>
            <entry key="address" value="vertx.mongopersistor"/>
        </service-properties>
    </service>
        
    <bean id="handler" class="org.vertx.osgi.mod.mongo.MongoPersistor"
          destroy-method="stop">
        <argument type="java.lang.String"><value>localhost</value></argument>
        <argument type="int"><value>27017</value></argument>
        <argument type="java.lang.String"><value>default_db</value></argument>
    </bean>
      
</blueprint>
Notice that the boilerplate configuration consists of the handler type and event bus address. The other configuration (host, port, and database name) is specific to the MongoDB persistor.

Here's the modular MongoDB busmod code:
public class MongoPersistor extends BusModBase
                            implements Handler<Message<JsonObject>> {

    private final String host;

    private final int port;

    private final String dbName;

    private final Mongo mongo;

    private final DB db;

    public MongoPersistor(String host, int port, String dbName)
           throws UnknownHostException, MongoException {
        this.host = host;
        this.port = port;
        this.dbName = dbName;

        this.mongo = new Mongo(host, port);
        this.db = this.mongo.getDB(dbName);
    }

    public void stop() {
        mongo.close();
    }

    public void handle(Message<JsonObject> message) {
        ...
    }

}
The code still extends BusModBase simply because BusModBase provides several convenient helper methods. Again the resultant code is simpler and easier to unit test than the non-modular equivalent.

Modularising Event Bus Clients

Finally, I needed a modular verticle to test the modular MongoDB persistor. All this verticle needs to do is to post an appropriate message to the event bus. Normal vert.x verticles obtain the event bus using the Vertx class, but I used the Blueprint service again, this time to look up the event bus service in the service registry and inject it into the modular verticle. I also extended the org.vertx.osgi bundle to publish the event bus service in the service registry.

The blueprint.xml for the modular event bus client is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0">

    <reference id="eventBus" interface="org.vertx.java.core.eventbus.EventBus"/>

    <bean class="org.vertx.osgi.sample.mongo.MongoClient">
        <argument ref="eventBus"/>
        <argument type="java.lang.String">
            <value>vertx.mongopersistor</value>
        </argument>
    </bean>
      
</blueprint>
Then the modular event bus client code is straightforward:

public final class MongoClient {

    public MongoClient(EventBus eventBus, String address) {
        JsonObject msg = ...
        eventBus.send(address, msg,
                      new Handler<Message<JsonObject>>(){...});
    }

}

Taking it for a Spin

1. I've made all the necessary OSGi bundles available in the bundles directory in git. You can grab them either by cloning the git repository:

git clone git://github.com/glyn/vert.x.osgi.git

or by downloading a zip of the git repo.

2. vert.x requires Java 7, so set up a terminal shell to use Java 7. Ensure the JAVA_HOME environment variable is set correctly. (If you can't get Java 7 right now, you'll see some errors when the bundles are deployed to OSGi and you won't be able to run the samples in steps 8 and 9.)

3. If you are an OSGi user, simply install and start the bundles in your favourite OSGi framework or container and skip to step 8. If not, then use the copy of the Virgo kernel in the git repository as follows.

4. Change directory to the virgo-kernel-... directory in your local copy of the git repo.

5. On UNIX, issue:

bin/startup.sh -clean

or on Windows, issue:

bin\startup.bat -clean

6. The Virgo kernel should start and deploy the various bundles in its pickup directory:
  • org.vertx.osgi bundle (org.vertx.osgi-0.0.1.jar)
  • HTTP sample modular verticle (org.vertx.osgi.sample.basic-1.0.0.jar)
  • SockJS sample modular verticle (org.vertx.osgi.sample.sockjs-1.0.0.jar)
  • MongoDB persistor sample modular busmod (org.vertx.osgi.mods.mongo-1.0.0.jar)
7. If you want to see which bundles are now running, start the Virgo shell from another terminal:

telnet localhost 2501

and use the ss or lb commands to summarise the installed bundles. The help command will list the other  commands available and disconnect will get you out of the Virgo shell. Here's typical output of the ss command:
...
89      ACTIVE      org.vertx.osgi_0.0.1
90      ACTIVE      jackson-core-asl_1.9.4
91      ACTIVE      jackson-mapper-asl_1.9.4
92      ACTIVE      org.jboss.netty_3.4.2.Final
93      ACTIVE      org.vertx.core_1.0.0.final
94      ACTIVE      org.vertx.osgi.mods.mongo_1.0.0
95      ACTIVE      com.mongodb_2.7.2
96      ACTIVE      org.vertx.platform_1.0.0.final
97      ACTIVE      org.vertx.osgi.sample.basic_1.0.0
98      ACTIVE      org.vertx.osgi.sample.sockjs_1.0.0
and of the lb command (which includes the more descriptive Bundle-Name headers):
   ...
   89|Active     |    4|vert.x OSGi Integration (0.0.1)
   90|Active     |    4|Jackson JSON processor (1.9.4)
   91|Active     |    4|Data mapper for Jackson JSON processor (1.9.4)
   92|Active     |    4|The Netty Project (3.4.2.Final)
   93|Active     |    4|vert.x Core (1.0.0.final)
   94|Active     |    4|MongoDB BusMod (1.0.0)
   95|Active     |    4|MongoDB (2.7.2)
   96|Active     |    4|vert.x Platform (1.0.0.final)
   97|Active     |    4|Sample Basic HTTP Verticle (1.0.0)
   98|Active     |    4|Sample SockJS Verticle (1.0.0)

8. You can now use a web browser to try out the basic HTTP sample at localhost:8090 which should respond "hello" or the SockJS sample at http://localhost:8091 which should display a box into which you can type some text and a button which, when clicked, produces a pop-up:


9. If you want to try the (headless) MongoDB event bus client, download MondoDB and start it locally on its default port and then copy org.vertx.osgi.sample.mongo-1.0.0.jar from the bundles directory to Virgo's pickup directory. As soon as this bundle starts, it will send a message to the event bus and drive the MongoDB persistor to update the database. If  you don't want to use MongoDB to check that an update was made, take a look in Virgo's logs (in serviceability/logs/log.log) to see some System.out lines like the following that confirmed something happened:
System.out Sending message: {action=save, document={x=y}, collection=vertx.osgi} 
... 
System.out Message sent 
...
System.out Message response {_id=95..., status=ok}

OSGi and vert.x Modularity

In this case study the various sample OSGi bundles all depend on, and share, the vert.x core bundle. Each bundle is loaded in its own class loader and OSGi controls the delegation of class loading and resource lookups according to how the OSGi bundles are wired together. In the same way, verticles written as OSGi bundles are free to depend on, and share, other OSGi bundles.

This is quite different from the vert.x module system in which any module (other than a busmod) which a verticle depends on is loaded into the same class loader as the verticle.

The advantages of the OSGi module system are that a single copy of each module is installed in the system and is visible to and may be managed by tools such as the Virgo shell. It also minimises footprint.

The advantages of the vert.x module system are that there is no sharing of modules between verticles so a badly-written module could not inadvertently or deliberately leak information between independent verticles. Also, there is a separate copy of each (non-busmod) module for each verticle that uses it and so the module can be written without worrying about thread safety as each copy will only be executed on its verticle's thread. OSGi users may, however, be happy to require reusable modules to be thread-safe and manage any mutable static data carefully to avoid leakage between threads.

Replacing the Container?

When I raised the topic of embedding vert.x in OSGi, the leader of vert.x, Tim Fox, asked me whether I was writing a replacement for the current container, to which I replied "not really". I said this because I liked vert.x's event driven programming model and its threading model, which seem to be part of "the container". But I was trying to replace a couple of aspects of the vert.x container: the module system and the way verticles register handlers.

Later it struck me that perhaps the notion of "the container" as a monolithic entity is a little odd in a modular system and it might be better to think of multiple, separate notions of containment which could then be combined in different ways to suit different users. However, the subtle interaction between the class loading and threading models seen above shows that the different notions of containment can depend on each other. I wonder what others think about the notion of "the container"?

Conclusions

vert.x's claim that it can be embedded in other applications is essentially validated since the OSGi framework is a fairly exacting application.

The vert.x module system, although not providing isolation between modules, does neatly provide isolation between applications (comprising verticles and their modules) and it enables modules to be written without paying attention to thread safety.

One vert.x issue was raised2 which should make vert.x easier to embed in other environments with custom class loaders.

vert.x could follow the example of netty, jackson, and MongoDB JARs and include OSGi manifests in its core and platform JARs to avoid OSGi users having to convert these JARs to OSGi bundles. I will leave this to someone else to propose as I cannot gauge the demand for using vert.x inside OSGi.

Running vert.x in OSGi addresses some outstanding vert.x requirements such as how to automate in-container tests (OSGi has a number of solutions including Pax Exam while Virgo has a integration test framework) and how to develop verticles and deploy them to vert.x under control of the IDE (see the Virgo IDE tooling guide). Virgo also provides numerous ancillary benefits including the admin shell for inspecting and managing bundles and verticles, sophisticated diagnostics, and much more (see the Virgo white paper for details).

The exercise also had some nice spin-offs for Virgo. Bug 370253 was fixed which was the only known issue in running Virgo under Java 7. Virgo 3.5 depends on Gemini Blueprint which broke in this environment and so bug 379384 was raised and fixed. I used the new Eclipse-based Virgo tooling to develop the various bundles and run them in Virgo. As a consequence, I found a few small issues in the tooling which will be addressed in due course.

Finally, running vert.x on the Virgo kernel is a further validation that the kernel is suitable for building custom server runtimes since now we have vert.x in addition to Tomcat, Jetty, and one or two custom servers running on the kernel.

Footnotes:
  1. I worked in the CICS development team in my IBM days. A colleague at SpringSource gave me a "CICS Does That!" T-shirt soon after we'd started working together. Old habits die hard.
  2. The modular vertical currently needs to intercept vert.x's resource lookup logic so that files in the bundle can easily be served. It would be much better for this common code to move to the org.vertx.osgi bundle, but this requires vert.x issue 161 to be implemented first.

Sunday, June 10, 2012

Temporary simplification

New software projects or products are often introduced which take an existing, successful piece of software and offer similar function, but in a much simpler way.

Sometimes there is a genuine improvement by capturing the gist of the requirements in a much more general way. Take git for example. If you've used certain older version control systems, git seems like a dream come true. This is because git's underlying design cracks the version control problem really neatly. Branching and merging is no longer a sweat. It just works.

But other times a new piece of software only appears simpler than an established alternative because it hasn't yet had time to meet all the requirements. I've noticed this mostly in applications. Newer versions of Apple mail have (useful) features deleted. Google appear to be systematically ruining their applications in the search for simplicity. This kind of temporary simplification where, as important features are added back over time, the result ends up being no simpler than the original, seems like a bit of a con to me.

I'd be interested in other people's views on this phenomenon. Is temporary simplification a good way of appealing to a new group of users or is it a failure to make a genuine design improvement?

Friday, June 01, 2012

Gemini Blueprint 1.0.1.M01

Gemini Blueprint 1.0.1.M01 is available for download. The main highlight is the support for Spring 3.1.x. A full list of bugs in the milestone is here.

Kudos to Aaron Whiteside for providing the vast majority of the changes for this milestone. He is in the process of becoming a committer for Gemini Blueprint.

Projects

OSGi (130) Virgo (59) Eclipse (10) Equinox (9) dm Server (8) Felix (4) WebSphere (3) Aries (2) GlassFish (2) JBoss (1) Newton (1) WebLogic (1)