Thursday, December 1, 2011

Assorted facts about JBoss. Fact 4: per-application logging is broken

You came here to read about setting up the per-application logging in JBoss 6? Congratulations, this is the right place to be. I have posted some info on this subject before. But if you have not read those posts no need to hurry. And if you have read them and followed the advice then forget it and undo the changes.

The per-application logging in JBoss 6 is broken beyond any possibility of repair. Even more, activating per-application logging makes the situation much worse.

Why the hell?! Need to scratch some itch? Look at the bug database, there is so much to do. "No, it is not for us, let's invent something new." And release this piece of ... software, half baked, and forget about any documentation.

So many words unspoken! Or rather unwritten; spoken they were, and a lot.

Back to the topic: why does it not work? To understand it you need first to understand how it really works.

The per-application logging configuration is bound to a classloader that loads jboss-logging.xml file with <define-context> element. When some component (your application code, some jar that come with JBoss, does not matter) invokes getLogger() the execution ends up in JBoss LogManager class. If the requested logger does not exist it has to be created. If the per-application logging is activated some helper class (ClassLoaderLogContextSelector) traverses the invocation stack starting from the most recent entry. For each entry (class) in the invocation stack the check is done whether the entry's classloader has a bound logging configuration. If such a configuration exists it is used and the traversal is stopped. If no classloader-bound logging configuration is found after the traversal the default one (coming from (deploy/jboss-logging.xml) is used.

For example, if your application with a custom jboss-logging.xml calls getLogger() then your class is very close to the top of the invocation stack:
  Some JBoss logging code
Some more JBoss logging code
<your class>.getLogger()
Some other code from the application
Some JBoss EJB/servlet/you name it invocation code, a lot of it actually
Thread.run()
The traversal algorithm starts with Some JBoss logging codeand goes down until it finds <your class>.getLogger(). The application has a per-application logging which is bound to the application's classloader, the very same classloader that has loaded <your class>. Bingo, the logger the application code gets back from JBoss will use the application's custom configuration.

It works even when getLogger() is used by code in some jar packaged with JBoss when your application uses that code. The invocation stack is longer, the traversal takes more time, but this is the only difference:
  Some JBoss logging code
Some more JBoss logging code
<library class>.getLogger()
Some other library code, can be a lot
Some code from the application calling the library
Some other code from the application
Some JBoss EJB/servlet/you name it invocation code, a lot of it actually
Thread.run()
So what is wrong with this?

A lot actually:
  1. What a performance hog this is! Worst of all the code that does not use the per-application logging suffers the most because all the entries in the invocation stack must be checked before JBoss decides to use the default logging configuration. And stack traces under JBoss are known to be looooong. Like hundreds of entries long. To be fair JBoss warns about this problem in deployers/jboss-logging.deployer/META-INF/logmanager-jboss-beans.xml:
    <!--
    ~ These two beans define the per-classloader log context selector which allows per-deployment logging. Since
    ~ enabling this feature may have a performance impact in certain cases, it's started up lazily (on demand)
    ~ when a separate log context is defined in a user deployment.
    -->
    Of course being JBoss code it does not actually works this way. Bug JBAS-9407 strikes back: the proposed workaround activates this stack traversal algorithm even if no component is using the per-application logging. My patch attached to this bug fixes this and makes sure the stack scanning is activated only if a component with the per-application logging configuration is deployed. But being activated the traversal code is always executed when a logger is created, regardless what component/module asks for a logger.

  2. One might say this is not a big problem. After all the traversal is performed only if a new logger is created and not every time it is used to output something. And because loggers are cached it does not matter how many times getLogger() is invoked, creation and hence the traversal happens only once.

    But exactly this fact (traversal at creation) is one of the reasons why the per-application logging is broken. Let's look at the example above, the one with the application using some library. Here is the stack trace again:
      Some JBoss logging code
    Some more JBoss logging code
    <library class>.getLogger()
    Some other library code, can be a lot
    Some code from the application calling the library
    Some other code from the application
    Some JBoss EJB/servlet/you name it invocation code, a lot of it actually
    Thread.run()
    Your application calls the library, for example, JPA. JPA provider classes, Hibernate under JBoss, call getLogger(). JBoss finds the application classes in the invocation stack, so the application logging configuration is used also for Hibernate classes. Logging output from Hibernate goes to your logging file. Sun is shining, life is good. You go ahead and deploy a second application that uses JPA and per-application logging. The log file for the second application is created. Logging from the application classes goes there. But not the logging from Hibernate: it goes to the log file of the first application. Surprise!

    The reason is clear: when the Hibernate logger was created the classes from the first application were on the stack, so the logger became "bound" to the configuration loaded from the first application.

    In reality it gets much more interesting: some Hibernate logging output goes to the default JBoss log (Hibernate code is executed during the application deployment/startup), other might go to this application's log, yet another might go to some other application's log. It all depends on the invocation stack at the moment of creation of the logger. Кто первый встал, того и тапки, which roughly translates to "The first man gets the oyster, the second man gets the shell".

    And what happens if the first application is undeployed? Where Hibernate log goes then? I did not bother to check this: nothing good anyway.

  3. As it were enough. You see, when I am talking about application-specific logging I really want the logging from the shared classes still go to the application log when the shared classes execute on behalf of the application. Even if there is yet no class from the application on the stack. For example, JBoss code that runs before invoking a servlet or an EJB business method from the application. Or a web service: there is a lot going on before the application code is invoked: XML parsing, unmarshaling, you name it. Where the logging, including error stack traces, goes? Most likely to the default logging: there is after all no classes from the application on the stack yet. Or to some other log file: the first man gets the oyster, remember?

  4. It can get even stranger. What I have seen in my tests is that the app-specific logging is activated pretty late in the deployed process. If any of the application code calls getLogger() before the logging configuration is hooked up the logging from the application classes goes to the wrong place: to the default logging. Most often there is a static Logger field in a class. This results in getLogger() being invoked during the loading of the class. For example, JPA provider loads classes looking for annotations. Or CXF code looks for JAX-WS annotations. All this loading is happening before the logging configuration is bound to the application classloader.

    This problem can be somewhat compensated by not using a static Logger field, but it is applicable only to the code you control and not to 3rd party code. There can also be situations where static initializer does something worth logging.

Nicely implemented feature, isn't it?

You might be tempted to package more 3rd party jars in your application and play some classloading tricks to make sure that the classes are loaded from the application-packaged jars and not from JBoss-provided jars. Don't!
  1. This won't buy you anything except maybe the server upgrade. The memory consumption will skyrocket, especially if there are multiple applications doing this.

  2. This might help you with problems 2 and 3 above. Problems 1 and 4 remain.

  3. You need to understand Java classloading pretty good. Then you have to forget it and learn the mess that JBoss made of it. And even then most likely it will not work. JBoss will not even deploy your application no matter how hard you try.

    For example just packaging Hibernate jars in a simplest EJB application that has otherwise no Hibernate or any other persistency dependencies is enough. If the application has JBoss classloading config that instruct JBoss to look for classes first in jars packaged with the application and only then in the parent (JBoss) clasloader the application will not deploy. While I have quite good understanding of JBoss classloading I would not even try to make this application deployable because I also understand the reason why it fails.

    Feel lucky? Give it a try. At best you can make this scheme work with some 3rd party jars but not with the most important and interesting, like Hibernate or CXF.

The question remains: is it actually possible to have the application-specific logging in JBoss 6?

Of course! Scrum solved that problem long time ago: a team decides what they want to have done and as they progress they can redefine what "done" means. Pretty neat and useful. You can do the same here: you can define what "application-specific" means in your case and then implement it.

Jokes aside there is not much you can actually do. The simplest and most sane solution is to forget about all that JBoss woodoo with the application-specific logging. Rollback the workaround for bug JBAS-9407. Use the default logging configuration in deploy/jboss-logging.xml. You can direct the output from your classes to a separate file. All the logging from the shared classes still goes to server.log by default. Directing the output from the shared classes to your application's log file is a bad thing to do: imagine if every application tries to do this. Including thread names in the logging output helps matching the records from the default JBoss log (server.log) back to your application if necessary.

JBoss, you never stops to amuse me!

Tuesday, November 15, 2011

Standalone HornetQ and JBoss, part 2: making it work without embedded server

Rant mode off.

Actually it is not that bad. It is just so happened that HornetQ standalone ended up being last on the list of things I have tried. And it became the last straw.

In reality the biggest single WTF was not HornetQ but ActiveMQ acknowledging received messages despite timed out transaction. The biggest collection of WTFs in a single place was of course (and is, and probably will be forever) OracleAQ.

But back to configuring JBoss AS 6 to work with the standalone HornetQ.
  1. The most important thing is to use JBoss AS 6.1.0.Final and not 6.0. Details are here. This is something I already knew and it is not specific to the standalone configuration of HornetQ, but it is worth mentioning here anyway.

  2. JBoss will be configured to talk to the external standalone HornetQ. The embedded HornetQ will be removed. This is what I want to achieve. Yes, it is possible to go for the standalone HornetQ environment and still run the embedded HornetQ just in case. But I find it pointless. Note also that no new resource adapter is added to JBoss. Instead the configuration of the existing one is changed.

  3. Move JBOSS_HOME/server/<servername>/deploy/hornetq/jms-ds.xml to JBOSS_HOME/server/<servername>/deploy. This is where outbound connection factories are defined. The file will have to be changed, see below.

  4. Delete JBOSS_HOME/server/<servername>/deploy/hornetq directory. This is where the configuration of the embedded HornetQ lives.

  5. Delete JBOSS_HOME/server/<servername>/deployers/hornetq-deployers-jboss-beans.xml. This file specifies how HornetQ specific configuration files should be read to initialize embedded HornetQ server and its destination objects. No embedded HornetQ means no need to read these configuration files.

  6. Change the resource adapter configuration file: JBOSS_HOME/server/<servername>/deploy/jms-ra.rar/META-INF/ra.xml. This is actually documented here: 32.4.4.1.1. Configuring the Incoming Adaptor or here: HORNETQ_HOME/examples/javaee/jca-remote/readme.html, section "configuring the incoming Adapter". Basically you need to change two RA properties:
    <config-property>
    <description>
    The transport type. Multiple connectors can be configured by using a comma separated list,
    i.e. org.hornetq.core.remoting.impl.invm.InVMConnectorFactory,org.hornetq.core.remoting.impl.invm.InVMConnectorFactory.
    </description>
    <config-property-name>ConnectorClassName</config-property-name>
    <config-property-type>java.lang.String</config-property-type>
    <!-- <config-property-value>org.hornetq.core.remoting.impl.invm.InVMConnectorFactory</config-property-value> -->
    <config-property-value>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</config-property-value>
    </config-property>
    <config-property>
    <description>The transport configuration. These values must be in the form of key=val;key=val;,
    if multiple connectors are used then each set must be separated by a comma i.e. host=host1;port=5445,host=host2;port=5446.
    Each set of params maps to the connector classname specified.
    </description>
    <config-property-name>ConnectionParameters</config-property-name>
    <config-property-type>java.lang.String</config-property-type>
    <!-- <config-property-value>server-id=0</config-property-value> -->
    <config-property-value>host=127.0.0.1;port=5445</config-property-value>
    </config-property>
    The incoming messages will be received from the standalone HornetQ server available at localhost:5445.

  7. Change the connection factory configuration file (jms-ds.xml copied to deploy directory at step 3 above). It is also documented 32.4.4.1.2. Configuring the outgoing adaptor or here: HORNETQ_HOME/examples/javaee/jca-remote/readme.html, section "configuring the outgoing Adapter". This is what the resulting file looks like:
    <connection-factories>
    <tx-connection-factory>
    <jndi-name>jms/HornetQ/XACF</jndi-name>
    <xa-transaction/>
    <rar-name>jms-ra.rar</rar-name>
    <connection-definition>org.hornetq.ra.HornetQRAConnectionFactory</connection-definition>
    <config-property name="SessionDefaultType" type="java.lang.String">javax.jms.Topic</config-property>
    <config-property name="ConnectorClassName" type="java.lang.String">org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</config-property>
    <config-property name="ConnectionParameters" type="java.lang.String">host=127.0.0.1;port=5445</config-property>
    <max-pool-size>20</max-pool-size>
    </tx-connection-factory>
    </connection-factories>

The situation now is still not complete because no destination objects are registered in JBoss JNDI. JBoss can be started but deploying an application that uses @Resource will still fail. The good thing: the embedded HornetQ is out of picture.

The remaining piece of configuration depends actually on how the application uses JMS. If the application contains some MDBs to receive messages from destinations, then it is possible to skip JNDI registration completely. Whether it is a good idea is another question. Just add something like this to the MDB code or the equivalent configuration to its deployment descriptor:
@MessageDriven(activationConfig = {
@ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
@ActivationConfigProperty(propertyName = "destination", propertyValue = "TestQueue"),
@ActivationConfigProperty(propertyName = "UseJNDI", propertyValue = "false"),
})

The key here is the property named "UseJNDI". The default value is true and the destination has to be registered in the JNDI. Setting the property to false skips JNDI lookup. After all, JNDI registration is just a bit more than a mapping from say "jms/hq/AQueue" JNDI name to the queue name ("TestQueue") known to the server. The HornetQ resource adapter will use "TestQueue" as the queue name in its communication with the server.

If the application needs that kind of indirection or if it sends messages and relies on @Resource injection line that:
@Resource(mappedName = "jms/hq/AQueue")
private Queue queue;
then some more configuration is needed. This configuration can be done at the server level so the destinations are available to all applications as soon as the server is started or at the application level. Note: I did not actually test the application level configuration. There might be initialization order problems: JNDI name must be available for injection but it is not yet there.

This is what is missing in HornetQ documentation. This and also the way to make it easier and less verbose:

Create a file named jboss-beans.xml in META-INF (or WEB-INF) directory of your application if you want to deploy the configuration with your application. Or create a file named <something meaningful>- jboss-beans.xml in JBOSS_HOME/server/<servername>/deploy. Copy the following to the file:
<?xml version="1.0" encoding="UTF-8"?>
<deployment xmlns="urn:jboss:bean-deployer:2.0">
<bean name="HQ_JNDIBindingService" class="org.jboss.naming.JNDIBindingService">
<property name="bindings">
<bean class="org.jboss.naming.JNDIBindings">
<property name="bindings">
<array elementClass="org.jboss.naming.JNDIBinding">
<bean class="org.jboss.naming.JNDIBinding">
<property name="name">jms/hq/AQueue</property>
<property name="value">
<!-- Use org.hornetq.jms.client.HornetQTopic for topic -->
<bean class="org.hornetq.jms.client.HornetQQueue">
<constructor>
<parameter>TestQueue</parameter>
</constructor>
</bean>
</property>
</bean>

<!-- You can add here more <bean class="org.jboss.naming.JNDIBinding"> definitions for other destinations -->

</array>
</property>
</bean>
</property>
</bean>
</deployment>

This configuration creates a JNDI entry under name "jms/hq/AQueue" and makes sure that is it the correct class (org.hornetq.jms.client.HornetQQueue which implements javax.jms.Queue). This entry represents a queue named "TestQueue" on a server. Note: "on a server". As I have mentioned in the previous post this object is just a mapping to the destination name and it comes alive only with the correct ConnectionFactory/ConnectorFactory.

That is it. Hey, HornetQ, was it difficult to document?

Standalone HornetQ and JBoss

For the last couple of weeks I have been busy playing with JBoss and various JMS implementations, like Apache ActiveMQ, HornetQ and even OracleAQ. As usual, with different degree of success. This post is about trying to make HornetQ standalone work with JBoss.

HornetQ comes bundled with JBoss AS 6 as its default JMS provider. As such HornetQ runs embedded in JBoss JVM process. HornetQ home page mentions a lot of nice feature HornetQ has, for example "Solid high availability" or "Flexible clustering". These feature are available only when HornetQ runs as standalone server.

Shall we try it? No problem, HornetQ is downloaded and started. The minor inconvenience is the lack of any admin user interface. All the configuration must be done via config files or via JMX. Fortunately I need just a single queue for testing purposes. TestQueue is created, HornetQ server is restarted. So far so good.

Now it is time to configure JBoss side. I must say this is ... interesting. Very interesting.

Rant mode on.

It is unbelievable: HornetQ comes from the same company as JBoss AS so it must be easy, right? It is even documented, there are samples, and yet...

First of all there is no documentation on how to configure JBoss AS 6 to completely switch from the embedded HornetQ configuration to standalone. The available documentation describes only how to connect to the standalone server. As if it were enough. The documentation says nothing about the rest of HornetQ configuration files in JBoss.

Guess what, it is not enough. I tried to deploy my tests application and immediately got complaints from JBoss saying that my nice TestQueue is not there.

Nowhere could I find any description on how to make a queue configured in the standalone HornetQ server be visible in the JBoss JNDI tree so that @Resource injection worked.

Next thing I tried was adding my queue in JBOSS_HOME/server/<servername>/deploy/hornetq/hornetq-jms.xml. This looked like a very strange thing to do, but this actually worked. Wow, HornetQ is smart! The corresponding JNDI entry is created, the application is deployed and I can send messages and receive them with an MDB.

There are even couple of JMX mbeans corresponding to my test queue available in JBoss JMX console. Except the attributes of those JMX mbeans do not really match the expected values and operations do not do the expected things. It looks like no messages ever went through the queue. What gives?

Nothing new actually: documentation that is incomplete; adding new functionality is much more important than making it work in the useful configurations, you name it. Works in JBoss as embedded? Good enough, let's move on. Need to do something different? Good luck with that.

So why it works and how?

All the configuration files under deploy/hornetq are related to the embedded HornetQ. Adding a queue/topic definition to hornetq-jms.xml adds the destination to the embedded server. The JMX mbeans are created to manage this local destination. And do not get me started on the story of two JMX beans per queue/topic and the difference between "core" HornetQ and "jms" HornetQ. Layered architecture in practice.

As part of a queue/topic creation HornetQ registers it in JBoss JNDI. This registered object is more or less the destination's name on the server. Yes, the object is an implementation of javax.jms.Queue/Topic interface but this does not really matter. It is useless by itself. The "magic" happens only in combination with the configured ConnectionFactory (for the outbound communication) or ConnectorFactory (for the inbound communication). Is it local "InVM" connection? Then the embedded server is doing the work. Is it "Netty" connection pointing to the standalone HornetQ? Then messages go to/from that server.

The net result: since there is no documented way to register such "destination names" in JBoss JNDI directly one must keep the embedded HornetQ running. And then for each remote destination one must create the corresponding local destination to get the correct JNDI objects in place as a side effect. Really a way to go, HornetQ! Clear and understandable architecture. And you get (for free!) a pair of useless JMX mbeans per destination so that you can waste your time trying to understand what the hell is going on with the messages.

It is even more interesting for people who want to use HornetQ from other application servers. There is no <adminobject> in ra.xml (<adminobject>s are those "destination names" in JNDI). Naturally, since embedded HornetQ creates them programmatically. It works in JBoss as embedded? Good enough then.

Friday, November 4, 2011

javax.namespace.QName and serialization

I have already mentioned a particular quirk of javax.namespace.QName under JBoss AS 6 and how to avoid it. Unfortunately it is not the end of the story.

You see long time ago somebody made a mistake. Different JDK versions were shipped with serialization-incompatible versions of javax.namespace.QName. Ooops.

I did not try to find out when it was made and what exactly happened, like just forgotten serialVersionUID or something else. Can happen, was fixed. You can see the fix in the JDK source code. Of course the fix is not really complete if we take into account some versions of stax-api jar floating around.

Anyway, stax-api.jar in JBoss AS6 is taken care of. But our application started to fail after it was deployed in the test environment. The application is just migrated to JBoss and is running against an existing test database. And here is the problem: the application stores some JAXB unmarshalled objects in the database. JAXB really likes to use javax.xml.bind.JAXBElement class which has a field of type QName. And the previous version of the application was running under a different version of JDK. You've got the picture.

The "fix" provided by JDK does not help here. It just allows me to select the current serialVersionUID from a limited set of supported values. Deserializing an instance of javax.namespace.QName serialized with a different serialVersionUID fails.

It is possible to go to the database, identify the records that have a wrong serialVersionUID and try to change them. Hmm, understanding the format of serialized instances of java classes and Oracle BLOB functions is not what I really wanted to do. On top of that there are other situations where one might need to be able to support multiple values of serialVersionUID like for example remote invocations.

Fortunately these other situations were not applicable in my case. And I was also lucky because the framework that the application is using allows configuring the way the data is read from and written into a database. Sort of. Several configuration files and several classes after I actually could concentrate on the solution itself:

public class CustomObjectInputStream extends ObjectInputStream {

...
//
protected ObjectStreamClass readClassDescriptor() throws IOException, ClassNotFoundException {
// This is the fix: if the description of QName class is read, check what value of
// serialVersionUID it has. And if it is not the "current good" value, but matches
// one of the "known good" values, just return the "current good" descriptor.
ObjectStreamClass desc = super.readClassDescriptor();

if (desc != null && platformQNameDesc != null && platformQNameDesc.getName().equals(desc.getName())) {
// OK, this is QName descriptor
long readVerUID = desc.getSerialVersionUID();

if (readVerUID != platformQNameDesc.getSerialVersionUID()) {
// SerialVersionUID of the serialised class DOES NOT match
// the value used by this JDK process.
if (readVerUID == JDK_defaultSerialVersionUID ||
readVerUID == JDK_compatibleSerialVersionUID ||
readVerUID == STAXAPI_SerialVersionUID)
{
// OK, one of the "known good values". Pretend we read the correct one.
//
// This is not a very nice way, but the thing is: ObjectStreamClass
// is a class, not interface, and has a lot of (package) private fields
// that are directly read by other JDK classes. So it is either
// reflection and changing the private field or this one:
desc = platformQNameDesc;
}
}
}

return desc;
}

//
private ObjectStreamClass platformQNameDesc = ObjectStreamClass.lookup(QName.class);

//
private static final long JDK_defaultSerialVersionUID = -9120448754896609940L;
private static final long JDK_compatibleSerialVersionUID = 4418622981026545151L;
private static final long STAXAPI_SerialVersionUID = -711357515002332258L;
}

I really think that something like this must be implemented by JDK, and not only for QName. Allowing to override serialVersionUID is not enough. And maybe class java.io.ObjectInputStream should be changed as well to be able to serialize objects with non-default values of serialVersionUID.

Friday, October 7, 2011

Dynamically changing app-specific logging under JBoss AS: parallel deployment

01.12.2011. Update: Go read first this post. Then come back here if you want.

Last time I described one way of changing the app-specific logging configuration at runtime. The proposed solution is limited to changing log level only. This post describes what can be done to be able to change the complete logging configuration without redeploying the application.

There are several ways to do it but they share the same basic idea: instead of deploying one artifact (the application with the logging configuration) deploy the application and the logging configuration separately. Later, if some logging detail must be changed, only the separately deployed logging configuration has to be changed. JBoss detects the change and redeploys the configuration. The application keeps running. Important note: if the application has to be redeployed then both deployments must be redeployed.

First things first: in order for this scheme to work the application and the logging configuration must be specially prepared. The steps are:
  1. Configure your application to use the app-specific logging and make sure it really works.

  2. Split your logging configuration in two files. One file remains packaged in the application as jboss-logging.xml. It should look like this:
    <logging xmlns="urn:jboss:logging:6.0" context="MyWonderfulLogContext"> 
    <define-context name="MyWonderfulLogContext"/>
    </logging>
    Only context attribute and define-context element stay in the file. While it is possible to include some other logging configuration in the file I would not recommend it because these definitions can't be changed without redeploying the application itself.

  3. Deploy your application. You will notice that it generates no logging. It is logical: there is after all no logging definitions in jboss-logging.xml of the application.

  4. Prepare the second deployment artifact. The main part of it is the remaining logging configuration. It is the original jboss-logging.xml without define-context element.:
    <logging xmlns="urn:jboss:logging:6.0" context="MyWonderfulLogContext"> 
    <!-- Here come definitions of handlers, loggers, etc -->
    </logging>

  5. Deploy it. There are 3 possibilities:

    • Package it into something JBoss recognizes as a valid deployment (a jar or an ear). The packaging does not have to match the packaging of your application. For example create a file named MyWonderfulLogContext.ear with a single file META-INF/jboss-logging.xml. Deploy this file next to your application. As of this moment you should see some logging being generated. If you need to change the configuration you have to change create jboss-logging.xml, repackage MyWonderfulLogContext.ear and redeploy it.

    • You can also deploy the prepared package exploded. For example, create directory MyWonderfulLogContext.ear in the deploy directory of JBoss, create META-INF subdirectory and copy jboss-logging.xml into it. Later you can just edit this file if you need to change the configuration.

    • Rename the prepared jboss-logging.xml to something like MyWonderfulLogContext-jboss-logging.xml (the name must end with jboss-logging.xml) and copy it into the deploy directory of JBoss. Later you can just edit this file if you need to change the configuration.

      This does not work with out of the box JBoss: it recognizes only the file named 'jboss-logging.xml'. Fortunately it is not hardcoded in some java class but specified as part of the configuration. To enable JBoss to recognize more than just jboss-logging.xml you need to change file JBOSS_HOME/server/<servername>/deployers/jboss-logging.deployer/META-INF/logmanager-jboss-beans.xml. Locate bean named "JBossLoggingMetaDataSchemaResolverDeployer" and change
      <property name="name">jboss-logging.xml</property>
      into
      <property name="suffix">jboss-logging.xml</property>

What makes the parallel deployment scheme work is define-context element in jboss-logging.xml of the main application. As soon as the application is deployed JBoss creates the context object. The loggers used by the application are retrieved from this context object. As long as the application is deployed the context stays around. (Re-)Deploying the second deployment artifact just adds or removes the logging configuration to or from the context.

You can combine this scheme with the JMX bean solution. The JMX bean belongs to the context so it is also available as soon as the context is created (at deployment/startup) and remains valid until the log context is valid.

Friday, September 23, 2011

Dynamically changing app-specific logging under JBoss AS: JMX

01.12.2011. Update: Go read first this post. Then come back here if you want.

I have already blogged on JBoss 6 application specific logging, see here.

The setup works as expected. But I still miss a very important piece of functionality. The logging is not the main (you'd be surprised what some creative people can do!) reason why an application is developed. The logging helps identifying and finding problems in the application. One of the requirements to the logging framework is the ability to change at least some setting without a new deployment of the application:
  1. Changing log levels is a must. If I have a problem with a running application I want to be able to change log levels of some loggers and see the changes being applied as soon as possible.

  2. Changing the output formatting might come handy in some cases. I have been in a situation when I really wanted to change the formatting only once: the output did not include thread names.

  3. Changing where the log is written is not that important. Well, of course, it the log is currently written to nowhere then it is important to be able to change it. But normally the log is written somewhere and it is good enough to investigate a problem.
None of this is possible by default with the application-specific logging in JBoss AS 6. It looks like a crude joke: the standard logging configuration in JBoss is hot-deployable. Any change to <jboss>/server/<server_name>/deploy/jboss-logging.xml is applied almost immediately. There is a JMX MBean (java.util.logging:type=Logging) which can be used to query and change log levels. And if I want to change some configuration in the application-specific logging configuration? Only by redeploying the application.

I must say I find it more and more annoying. It looks suspiciously close to early versions of red hat linux :-) The things seem to work but if I need just a bit more functionality, which I know is there, I need a hammer. Or a debugger this time.

Anyway after some quality swearing, mainly at jboss, but maven and eclipse got their share of attention, and countless attempts to deploy an application and change some configuration files I have found several ways to (partially) change the per-application logging without redeploying the application.

Most of the solutions are just variations of the same approach. They allow changing every detail of the application logging with various degrees of cumbersomeness.

One solution stands really apart. It is quite easy to setup and use, with a single limitation: only log levels can be changed, no other change is possible. But this is exactly what is needed most of the time. The solution offers the possibility to perform changes locally and remotely. Another advantage: if you have already an application with the app-specific logging configured and are just dying to change the log level "right now" you can do it without redeploying the application. The solution is described below:

Changing log levels via JMX.


Changing logging details looks like a perfect candidate for JMX. When I was experimenting with the logging I was a bit surprised by the lack of any logging JMX. JBoss is full with JMX beans, yet there is no JMX related to the logging except the one which comes from JDK: java.util.logging:type=Logging. (JBoss replaces it with its own at startup, but this is beyond the point.)

After inspecting JBoss logging source code I came across some mentioning of JMX. The most promising were classes org.jboss.logmanager.LogContext with its getLoggingMXBean() method and org.jboss.logmanager.LoggingMXBeanImpl implementing java.util.logging.LoggingMXBean interface. So I concentrated on them.

I must say JBoss guys have really interesting sence of humor. It walks like a duck, it swims like a duck, it quacks like a duck, yet it is not a duck. Now I have an example of such a beast. You see, the class is named LoggingMXBeanImpl, it implements LoggingMXBean, the method is getLoggingMXBean, and yet an instance of class org.jboss.logmanager.LoggingMXBeanImpl does not pass the validation rules of the JBoss JMX server. You can't do better than that even on purpose.

Anyway, here is it:
  1. Configure your application to use the app-specific logging. You end up creating a file named jboss-logging.xml that looks like this:
    <logging xmlns="urn:jboss:logging:6.0" context="MyWonderfulLogContext">
    <define-context name="MyWonderfulLogContext"/>

    <!-- Here come definitions of handlers, loggers, etc -->
    ...
    </logging>

  2. Deploy your application; make sure the app-specific logging actually works and the logging goes where it supposed to go.

  3. Create somewhere a file named for example MyWonderfulLog-jboss-beans.xml with the following content:
    <?xml version="1.0" encoding="UTF-8"?>
    <deployment xmlns="urn:jboss:bean-deployer:2.0">
    <bean name="MyWonderfulLogContextMBean" class="java.util.logging.LoggingMXBean">
    <constructor factoryMethod="getLoggingMXBean">
    <factory bean="Logging:CONTEXT:MyWonderfulLogContext"/>
    </constructor>
    <annotation>@org.jboss.aop.microcontainer.aspects.jmx.JMX(
    exposedInterface=java.util.logging.LoggingMXBean.class,
    name="boom.tralala:context=MyWonderfulLogContext",
    registerDirectly=false)</annotation>
    </bean>
    </deployment>
    Attention: the annotation element must be on a single line!

  4. Deploy this file by copying it to <jboss>/server/<server_name>/deploy. If everything went OK and there are no errors in console or in <jboss>/server/<server_name>/log/server.log, then navigate to http://<jboss-host>:<jboss-port>/jmx-console. You should see something like this:




  5. Click on context=MyWonderfulLogContext link and you get to the page that allows you to query some information and, more importantly, change log levels with operation setLoggerLevel. The first parameter is the logger name like 'org.jboss' or 'com.mycompany.app.package'. The second parameter is the logging level like 'INFO' or 'DEBUG'. Go ahead and play with it.

  6. If everything is working you can repackage your application so that there is no need to deploy 2 files, your application and MyWonderfulLog-jboss-beans.xml. Rename the file to jboss-beans.xml and place it next to your jboss-logging.xml, for example, in EAR/META-INF. If you already have jboss-beans.xml just add MyWonderfulLogContextMBean bean definition to your file. Next time you deploy your application you get the logging JMX bean automatically.

That is it.

Oh, yeah. Some technical details:
  1. When JBoss deploys your logging configuration it creates some objects (beans) and registers them internally. These objects are not JMX beans. One of the objects is LogContext and it is registered under the name Logging:CONTEXT:<name attribute of define-context element from jboss-logging.xml>. Thus the bean name <factory bean="Logging:CONTEXT:MyWonderfulLogContext"/> in *-jboss-beans.xml must match the name JBoss uses internally.

  2. Ignoring 'annotation' element the definition can be read as "look up JBoss bean named Logging:CONTEXT:MyWonderfulLogContext, invoke its method getLoggingMXBean and place the result of invocation in the JBoss bean registry under the name MyWonderfulLogContextMBean". There is yet no JMX in sight.

  3. Now comes the annotation element. JBoss allows you to annotate bean instances dynamically. The result looks as if java class of the instance had this annotation. In this case: as if class org.jboss.logmanager.LoggingMXBeanImpl had annotation @org.jboss.aop.microcontainer.aspects.jmx.JMX. The only difference is that the annotation is not there if you have any instance of this class itself but it is there if you have MyWonderfulLogContextMBean bean. Important: the annotation element must be on a single line, and must not start (and probably end) with whitespaces. JBoss annotation parser is very picky.

  4. The annotation itself says that JBoss bean named MyWonderfulLogContextMBean has to be registered as a JMX bean. Attribute 'name' specifies the ObjectName under which the bean has to be registered. It is a good idea to include the name of your LogContext there. And try not to use 'jboss' anywhere in the name to avoid future naming conflicts.

    I have created a JBoss issue and if its fixed JBoss would have to come up with a naming scheme. I doubt 'boom.tralala' would be their first choice but I would not be surprised if they stick 'jboss' somewhere in the name.

    If attribute 'registerDirectly' is true JBoss expects that the bean is already a JMX MBean. Remember I said above that the instance of org.jboss.logmanager.LoggingMXBeanImpl does not pass the validation rules of the JBoss JMX server? This is the reason 'registerDirectly' is set to false. This makes JBoss wrap our instance in an instance of class javax.management.StandardMBean, which is recognized as a valid JMX MBean by the JBoss JMX server.

    And StandardMBean needs to know what interface it is going to represent. This explains attribute 'exposedInterface'.

Next time I explain how to change every aspect of the logging configuration at runtime.

Thursday, September 8, 2011

Assorted facts about JBoss. Fact 3: how to make per-application logging work

01.12.2011. Update: Go read first this post. Then come back here if you want.

I am now busy with some JBoss AS6 related work. One of the things I wanted to be able to do is to have a separate application specific log file. Very easy, very understandable requirement.

Except not under JBoss. It looks like there is a completely new logging subsystem in JBoss AS 6. No advice on how to do it in previous versions would help. And there is not much documentation on how to do it except of some examples here and there and complains that this functionality is broken. There is also a corresponding JBoss issue.

The issue page describes the problem and offers a workaround. The issue is marked as resolved in JBoss AS 6.1.0 Final. The "fix" is actually a proposed workaround; the fact that I immediately did not like.

I have created jboss-logging.xml so that all the logging from my application would go not to ${jboss.server.log.dir}/server.log but to ${jboss.server.log.dir}/my-app.log. After deploying my application I immediately noticed that the fix worked: there was no error. I also got my-app.log next to server.log. All nice and dice. Brand new empty my-app.log no matter what I did in my application. All the logging still went to server.log. So much for the fix.

And it looks like I am not alone. This post mentions the same problem: "My new log file is empty."

But the new wonderful logging subsystem does more for me! When I undeploy my application I get the following exception:
ERROR [org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataDeployer]
Error during undeploy: Logging:REGISTRATION:MyAppLogContext:Anonymous-5: java.lang.NullPointerException
at org.jboss.logging.metadata.GetClassLoaderBeanMetaData.setClassLoader(GetClassLoaderBeanMetaData.java:52)
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataDeployer.undeploy(BeanMetaDataDeployer.java:237)
at org.jboss.deployers.vfs.deployer.kernel.BeanMetaDataDeployer.undeploy(BeanMetaDataDeployer.java:58)
Bummer!

After some poking around jboss configuration and the source code I have found a solution. Actually a workaround and a fix.

Workaround:
  1. If you are still working on JBoss AS 6.0.0.Final then you have to apply the workaround described in this issue: change file JBOSS_HOME/server/<servername>/deployers/jboss-logging.deployer/META-INF/logmanager-jboss-beans.xml. Now you have the same configuration as JBoss AS 6.1.0.Final.
  2. Change the logging deployer configuration (file JBOSS_HOME/server/<servername&g;/deployers/jboss-logging.deployer/META-INF/logmanager-jboss-beans.xml) by removing that mode="On Demand" from
    OnDemandJBossLogManagerContextSelector bean:
    <bean name="OnDemandJBossLogManagerContextSelector"
    class="org.jboss.logmanager.LogContextSelectorService"
    mode="On Demand">
    <property name="selector">
    <inject bean="JBossLogManagerContextSelectorService"/>
    </property>
    </bean>
    You can also remove that comment above about lazy loading. It is not true anyway.

Congratulations, you have now your application specific log file and your logging really goes to this file. Unfortunately you will get NPE (see above) when your application is undeployed. This can't be worked around.

The fix is actually a code patch. I have submitted it to JBoss (see here). Unfortunately it means we have to wait until it is accepted and incorporated into some release. For those who are not afraid of javac: you can download the patch from the issue page and build it yourself. You will have to update the logging deployer configuration (file JBOSS_HOME/server/<servername&g;/deployers/jboss-logging.deployer/META-INF/logmanager-jboss-beans.xml). It is also included in the patch.

Friday, August 19, 2011

Assorted facts about JBoss. Fact 2: classloading. Broken by default.

The classloading situation in JBoss is a mess. You know, evolutionary kind of mess. They began with something messy, then started to add more things, configuration parameters, bells and whistles ... Some old classloading problems went away, just to be replaced by new problems.

I had my share of ClassNotFoundExceptions, NoClassDefFoundErrors, and LinkageErrors before. Not all of them were caused by JBoss, but those that were caused by JBoss were the toughest to resolve.

Recently JBoss (JBoss Application Server 6) appeared again on my professional horizon. And one of the first problems I ran into was a LinkageError. Even that problem with StAX API jar happened later.

The deployment of the application fails. Sometimes. And sometimes it is OK. If it fails, the error is
java.lang.LinkageError:

loader constraint violation:
loader (instance of org/jboss/classloader/spi/base/BaseClassLoader)
previously initiated loading for a different type
with name "org/xml/sax/Attributes"
The class name is not always org.xml.sax.Attributes, but it is always a class from org.xml.sax. And if the deployment is OK, the same error happens later, at runtime.

Nothing new. This error screams "Duplicate class". So I looked around and found jtidy-4aug2000r7-dev.jar packaged in EAR/lib (and only there) which has its own copy of org.xml.sax and org.w3c.dom classes. JMX classloader bean for the application confirmed that the classes are coming from the application's EAR.

The reason why the classes are packaged into jtidy-4aug2000r7-dev.jar is not really important here. But I was really surprised (well, initially) that JBoss uses these classes instead of JDK classes. I deployed the application on the stock version of JBoss AS 6 without any modifications of classloading configuration. There was even no jboss-app.xml (I was planning to add it, just in case) let alone other classloading specific files. So I expect JBoss to have JEE compatible classloading behavior.

Initially I did not have time to investigate the problem. After notifying the project owner of the problem the decision is made: remove JTidy jar from EAR/lib and continue. JTidy is used in some really obscure piece of code that is called not that often. We can deal with this later. There is even a chance that this functionality will be rewritten to get rid of JTidy. But now we need a version of the application running in JBoss.

I did just that and went ahead. But I kept wondering. I could not understand why the presence of jtidy-4aug2000r7-dev.jar in EAR/lib causes such an error. It goes against all my knowledge, understanding of and experience with java classloading. Except that I am dealing with JBoss. But even JBoss would not do that, would it?

More importantly that problem might have a much broader effect on the project but I could not even imagine what kind of effect. More likely negative, that I was sure of.

Finally I have got some spare time. I have created a very simple application with a single EJB module with one SessionBean and jtidy-4aug2000r7-dev.jar as a dependency, packaged as an EAR application. After deploying it into JBoss I went into JMX classloader bean and verified that org.xml.sax classes are coming from the test application. The SessionBean has a business method that does Class.forName() and then returns getProtectionDomain().getCodeSource() of the loaded class. I created a simple EJB client application that calls the business method with different class names. All the org.xml.sax and org.w3c.dom classes present in jtidy-4aug2000r7-dev.jar were indeed coming from this jar file.

Next I have created another jar file manually packaging some classes from various places and packages like javax.xml.bind, org.dom4j, org.hibernate, java.text, javax.management, etc. I replaced jtidy jar with this new jar file and repeated the test. Only with java.* classes the business method was returning null which means that only those classes were coming from the primordial class loader. All other classes present in the jar were loaded from it. Well, well, well, JBoss at it again.

Trying to find anything specific about the problem on the net did not help. Classloading problems in JBoss is a really hot topic after all!

If nothing else helps ... Use the Source Luke!

Actually even before going deep I noticed one interesting thing in JBoss JMX Console. It appears that my demo application has a classloader domain JMX even if I did not have JBoss specific deployment descriptors. I clicked on it and again one thing stood out just screaming "Look at me":
ParentPolicyName, MBean Attribute, AFTER_BUT_JAVA_BEFORE.

"After but java before". Sounds familiar. pseudoTransactionEnlistment anyone? Or maybe BOZOSLIVEHERE?

The rest was easy. The biggest problem was to get the right source files quickly. You can't nowadays just download a fat zip or gz file with all the sources in it. After that was done, a bit of grepping and the like reveals the truth:

Class org.jboss.classloader.spi.ParentPolicy with some predefined instances like BEFORE, AFTER, BEFORE_BUT_JAVA_ONLY, etc. The comments around these predefined instances explained the meaning. In my case it was AFTER_BUT_JAVA_BEFORE. Except that the source file claims that AFTER_BUT_JAVA_BEFORE means "Java and Javax classes before, everything else after" and my tests show that javax.* classes also come from the jar file in EAR/lib. A bit more looking around led me to this piece of code in class org.jboss.classloading.spi.dependency.Module:
public ParentPolicy getDeterminedParentPolicy()

{
if (isJ2seClassLoadingCompliance())
return ParentPolicy.BEFORE;
else
return ParentPolicy.AFTER_BUT_ONLY_JAVA_BEFORE;
}

Since I do not have any JBoss specific deployment descriptors isJ2seClassLoadingCompliance() returns false resulting in ParentPolicy.AFTER_BUT_ONLY_JAVA_BEFORE being used. Not AFTER_BUT_JAVA_BEFORE. The comments next to AFTER_BUT_ONLY_JAVA_BEFORE in ParentPolicy.java clearly match the observed behavior: "Java classes before, everything else after". What am I missing?

Turns out there is one more small thing: a copy-paste error in the definition of AFTER_BUT_ONLY_JAVA_BEFORE:
/** Java and Javax classes before, everything else after */

public static final ParentPolicy AFTER_BUT_JAVA_BEFORE =
new ParentPolicy(ClassFilterUtils.JAVA_ONLY,
ClassFilterUtils.EVERYTHING,
"AFTER_BUT_JAVA_BEFORE");

/** Java classes before, everything else after */
public static final ParentPolicy AFTER_BUT_ONLY_JAVA_BEFORE =
new ParentPolicy(ClassFilterUtils.NOTHING_BUT_JAVA,
ClassFilterUtils.EVERYTHING,
"AFTER_BUT_JAVA_BEFORE");
Mystery solved. I have added jboss-app.xml to the EAR with <loader-repository-config>java2ParentDelegation=true</loader-repository-config> (might just as well have added jboss-classloading.xml), redeployed the application and sure enough I have got BEFORE as ParentPolicyName in JMX Console, but more importantly I have now the expected classloading behavior. For each class present the test jar in EAR/lib both JMX Console and my SessionBean load the class not from the test jar but form some other place like JDK or jars from <jboss>/common/lib.

I have mentioned above that I was planning to add jboss-app.xml to the application anyway because I do not trust JBoss. Boy I was right. The end result for me would have been the same but I would have missed all that fun.

But ... Who in their right mind comes up with such interesting classloading logic?? What they were trying to achieve? Why the hell I have to explicitly "opt in" to get the most sensible classloading configuration? *


* Note: ideally. The current state of classloading affairs in JEE containers makes it much harder than necessary. Internal container classes and classes from various third-party jars that container is using leak into an application. This is a big deal even if the application does not have conflicts with those third-party jars. In case of conflicts all bets are off. Granted containers provide some mechanisms to fine tune classloading, but these mechanisms do not always work. Yes, JBoss, it is about you. But I do think that this "delegate to the parent first except when in WAR" is the most sensible classloading configuration and definitely the one to start with and to try to stick to as much as possible.

Monday, August 15, 2011

Assorted facts about JBoss. Fact 1: StAX (Streaming API for XML) and the meaning of -711357515002332258.

Every time I have to do some serious work with JBoss I come across a situation that requires patching JBoss. HelloWorld kinds of applications tend to work, but as soon as things get complicated there is always something...

This time it is JBoss6 and StAX API. You see, there is <jboss>/lib/endorsed directory with some files in it. Normally if you start jboss with <jboss>/bin/run[.bat] the JVM is started with -Djava.endorsed.dirs=<jboss>/lib/endorsed

No problem, it is the desired and documented behavior if one wants to have newer versions of some APIs available in JDK. But <jboss>/lib/endorsed/stax-api.jar is a bit different. It is there for the sake of JDK 1.5. As of JDK 1.6 StAX is part of JDK itself. And it is not like JBoss packages a better or newer version of StAX. So if you run JBoss on JDK 1.6, do yourself a favor: delete <jboss>/lib/endorsed/stax-api.jar right now.

The solution for JDK 1.5 is not so simple because JDK 1.5 does not provide StAX. But first, what is the problem? This is it:
java.io.InvalidClassException: javax.xml.namespace.QName;
local class incompatible: stream classdesc serialVersionUID = -9120448754896609940,
local class serialVersionUID = -711357515002332258
<jboss>/lib/endorsed/stax-api.jar contains more than just StAX classes. It contains some old versions of classes that long ago present in JDK. And because the classes are in an endorsed jar, they override standard JDK classes.

If you look into JDK source code you will see that class javax.xml.namespace.QName goes to some lengths to initialize private static final long serialVersionUID with some known good value. The version packaged in <jboss>/lib/endorsed/stax-api.jar does not define field serialVersionUID leaving you at mercy of the JVM algorithm to calculate serial version UID. Which produces -711357515002332258 in this particular case.

Bad luck if you have a serialized instance of a class which has a non-transient field of type javax.xml.namespace.QName. Or if you have QName as a parameter in one of your remote interfaces.

So I fixed the problem by removing stax-api.jar since I am running under JDK 1.6 and went ahead.

The simplest solution for JDK 1.5 is probably deletion of everything that is not under javax.xml.stream from <jboss>/lib/endorsed/stax-api.jar. There are also stax-api jars around that include only javax.xml.stream.* classes.

But still ... One thing bothered me. This is quite an easy mistake to make especially if this file was added to the endorsed dir some time ago. I can see that it is present in JBoss5; I did not check earlier versions. But come on is it that difficult to review these things for every major release?

I can't be the first one to hit this problem. A bit of googling brought me here (JBPAPP-4223). OK, a bug is reported but I guess nobody is going to do a thing about it. After all it was reported on the 5th of May 2010, JBoss 6.0.0.Final was released half a year later, still with the problem.

And then I found this little gem. The beauty here is the recommendation of the JBoss EJB3 Lead Developer. Just read it. He seriously proposes to add the broken stax-api.jar to the client endorsed jar set. WTF?! JBoss EJB3 Lead Developer? No kidding?

Am I really surprised? Not at all.

Thursday, August 11, 2011

Mule, HTTP and transaction management

Mule has support for transactions, see here. So if the inbound and outbound endpoints are transactional, like JDBC or JMS, it is easy to make sure the messages are handled transactional.

It is not so easy if an endpoint is not transactional. For example we have a configuration with a jms:inbound-endpoint and an http:outbound-endpoint. A message is retrieved from the queue and sent via HTTP to some receiver. Of course the message must be removed from the queue only if it is successfully received (or handled) by the receiver.

The inbound-endpoint configuration is easy:
    <inbound>

<jms:inbound-endpoint queue="${queue_name}" connector-ref="jmsConnector">
<jms:transaction action="ALWAYS_BEGIN"/>
</jms:inbound-endpoint>
</inbound>

ALWAYS_BEGIN ensures that a new transaction is started and a message is received in this transaction.

This leaves the outbound-endpoint. Just saying
        <http:outbound-endpoint address="${http_address}"/>

is not enough because it automatically means action="NONE". Mule throws an exception complaining that the outbound endpoint cannot join the active transaction because it is configured with action="NONE". Fair enough, let's change this into
        <http:outbound-endpoint address="${http_address}">

<http:transaction action="JOIN_IF_POSSIBLE"/>
</http:outbound-endpoint>

But this does not work because "http:transaction" is not recognized as a valid element. Ooops. This is logical, HTTP is not transactional per definition. But we really need JMS to be transactional.

The solution?
        <http:outbound-endpoint address="${http_address}">

<jms:transaction action="JOIN_IF_POSSIBLE"/>
</http:outbound-endpoint>

Mule is happy with this and it does the right thing. If there is a problem connecting to the target or sending the message to it HTTP endpoint makes sure that the message gets "exception payload" set. This triggers Mule transaction support to rollback the active transaction.

This is not a generic solution, but it suits us: no messages are lost; the message is back in the queue and is redelivered later. The only problem with this approach is that the transaction can be rolled back after the message was successfully received by the HTTP receiver. This results in redelivery of the same message to the HTTP receiver which must be able to handle this.

But this is not a big deal in our case. It is so happens that most of the time the message ends up in a dispatcher that looks in its registry for subscribers. The first successful delivery of the message caused subscribers to unsubscribe so the dispatcher just silently drops the message.

In some other cases the message is just notification of some kind so nobody really cares if the same notification appears twice.

The only case when this might cause some trouble in our system is when such a message results in a creation of a BPEL process instance. Most of the time the newly created instance fails with "conflicting receive" error because the instance created after the first message delivery is still running and the process has some <onMessage> with a correlation set. But these cases are easily recognized by the administrators. And I must say if one is using BPEL then "conflicting receive" is the least of one's worry.

Tuesday, July 5, 2011

Fairy tale of developer's heaven

Ah maven! Promises of repeatable and portable builds... Fairy tales of developer's heaven...

Shall I tell one as well?

From time to time I do some things on a JEE project being developed by some other people. The resulting EAR file is quite dependent on some JEE container. Recently a decision was made to migrate the project to another JEE container. It is not a one-off migration effort: we need to be able to build the project for the original container and for the new one. I was asked to look into that.

One possibility to handle this is to have a separate branch for each container. Just thinking about all coming cross merges makes me cry.

Another possibility is to have a single branch and just package additional container specific deployment descriptors here and there. Nice and simple, perfectly JEE compliant ...

Except for one seemingly small detail. The project is using say version 1 of a particular 3rd party dependency. Unfortunately the project with this version can't be deployed into the newly targeted container. Fortunately there is already version 2 of the dependency, and the project built against this version can be deployed in the new container. Unfortunately it is precisely the other way around with version 2: the old container can't handle it.

And it is not that we can build the project against one of the versions and then just package differently: the versions are not binary compatible. There are also some minor source-code compatibility issues, but they can be relatively simply solved. Anyway, bottom line: if we build our project against one of the versions of the library it will not work with the other. We really need to compile our software against the correct version of the library and then package accordingly.

So far so good. The situation is probably quite common, and not very difficult to handle. Normally.

But the project is using maven, more specifically, maven 2.2.1. So far it worked pretty well for the project except for some of maven's WAR/EAR packaging "features". But for this new deployment target I hit a wall.

The end result I wanted to achieve: one checks out the source code, sets up the container specific environment, runs 'mvn install' and gets a set of properly versioned container specific artifacts, say, application-X.Y-container1.ear or application-X.Y-container2.ear. The "properly versioned" part is very important. This way we can refer to the correct versions of artifacts in our poms, we can properly release container specific versions of the project, etc.

The very first question was: how do we achieve that versioning scheme in maven? The 3rd party library is used in all ejb and almost all jar modules making them depend on the library. Web modules also depend on it (indirectly, via ejb/jar modules). The same is true for the ear module.

Why, it is easy I thought. I define a property, say, 'target.container', then put it in <artifactid> or <version> tag so the project's poms have <artifactid>usermanager-ejb-${target.container}</artifactid> or <version>0.1-${target.container}-SNAPSHOT</version> in their maven coordinates. Then I start maven with -Dtarget.container=container1 (or container2). This results in <artifactid>usermanager-ejb-container1</artifactid> or <artifactid>usermanager-ejb-container2</artifactid> (or version <version>0.1-container1-SNAPSHOT</version> or <version>0.1-container2-SNAPSHOT</version>) at build time. Problem solved.

Funny thing: it worked. Damn, I should have been more suspicious. I ended up trying both variants, and both worked when maven was executed from command line. That was actually the last thing that worked. Following hours made me really unhappy.

First I noticed that Eclipse (m2eclipse plugin) does not really like my new poms. It kept complaining that it could not find the project's poms and their dependencies. Executing 'maven install' from Eclipse produced a lot of warnings like "'artifactId' contains an expression but should be a constant." That prompted me to move ${target.container} from <artifactid> to <version>. And again, running maven from command line worked. Eclipse kept complaining.

Googling the message I came across a lot of posts related to the same issue. The message from these posts was clear: maven does not support it.

Strange, because maven documentation does not clearly spells it. For example, POM Reference does not say that artifactId or version must be constant. In fact, the very same POM Reference says here:

Maven properties are value placeholder, like properties in Ant. Their values are accessible anywhere within a POM by using the notation ${X}, where X is the property.


Yet people claim that it should not work, and if it works then it is a bug. For example, here or here.

And it looks like maven finally began enforcing this stupidity in version 3. Which is a pity. The funniest thing here is the reason why. Take for example MNG-4297:
Maven currently allows properties in the groupId, artifactId and version of a pom. This causes artifacts to be produced that require full inheritance and interpolation before they can be uniquely identified. It also poses potential problems if the properties are defined in settings, env or profiles where the consumer can't exactly identify the artifact after the fact.


I am just speechless. Between things like downloaded newer versions of plugins (wow, they fixed this one actually), disappeared artifacts from public repositories, rearranged and/or moved public repositories they finally nailed the real problem preventing maven users to have reproducible builds: property-based project coordinates. Strange, last time I looked maven still supports profiles and properties in general...

Anyway, it looks like the way to do what I want is classifiers. They are mentioned here:

The Maven provided solution for your situation is 'classifiers'.


and are really awesome described here (5.5.3. Platform Classifiers). These guys work for Sonatype, so they should know a thing or two about maven you would think, right? Just replace <classifier>win</classifier> with <classifier>container1</classifier> and <classifier>linux</classifier> with <classifier>container2</classifier>, and we are back in business.

Damn you, people who like to misinform others.

I went ahead and modified poms. Build the project from the command line produced some strange results, for example maven built modules 1-6 successfully and then failed building module 7 because of my mistake in the module's pom. I fixed the problem, ran 'mvn install' for module 7, it completed without failure. Then I ran 'mvn clean install' for the parent project, and it failed building module 5 because some classes from one of the dependencies were not found. Huh?

Finally after some more pom changes the build managed to produce an EAR which was successfully deployed in the old container. Then I cleaned up some poms, added some things to <dependencyManagement> here and there, and executed 'mvn clean install' again. I was not able to deploy the resulting EAR in the old container because of some missing classes. It turned out about 1/3 of jars were missing from EAR/lib this time. WTF?!

Running mvn dependency:tree and analyzing its result explained why: there were no dependencies under the dependencies with classifiers! Time to ask google again.

Apparently maven cannot handle transitive dependencies of a dependency with a classifier, see for example MNG-2759. The story is a bit more complicated because this works sometimes for some people. This worked at least once for me. But most of the time it does not work. And maven is not planning to fix it: MNG-2759 has status "Won't Fix".

Yeah, use classifiers if you need some quality headache. Thanks for advice, guys!

The only solution is to spell all dependencies explicitly everywhere I use a module with a classifier. Thanks, but no, thanks. I already have to do it too many times. <dependencyManagement>, <dependency>, WAR packaging exclusion, EAR packaging to make sure that what is excluded from WAR is packaged in EAR/lib.....

What can I say? Indeed, maven is really a "project ... comprehension tool".

And what I am going to do with all this mess? Nothing, really. I have dropped the requirement of having properly versioned artifacts. I have just removed all the classifiers. The profiles stay. So any time I build a project I get a version 0.1-SNAPSHOT which happens to be for one of the containers. Which one? It depends on the chosen profile. This all means more work during release, but frankly I do not care at the moment. Do you want to refer to the released version of one of the submodules in your pom file? You'd better be absolutely sure you know which version you use. Do you want to refer to a SNAPSHOT version?

Reproducible builds and maven? Do not make me laugh.

Friday, May 27, 2011

Mule startup quirks

Mule saga continues. Next problem I wanted to tackle was "we had to restart the mule server and the problems went away … or not ... and we had to restart it again ...".

A small detour: Mule can run as a standalone application where it handles everything itself. It can also be embedded into some application or packaged as a JEE web application and then deployed in a web container. Originally the project's Mule was running as a standalone server. This was changed at the client's request some time before going into production and now Mule is an EAR file with a single web application that runs in a separate OC4J instance. Mule provides the necessary code to make it work: a servlet context listener ( MuleXmlBuilderContextListener) to initialize Mule and a servlet ( MuleReceiverServlet) to receive HTTP requests. As a side effect of this embedded mode Mule can't really handle its lifecycle. Everything is managed through OC4J and OPMN. This was also given as the reason to use embedded mode: "look, this way we get Mule into our management infrastructure!"

Back to the topic: needless to say stopping/starting the mule server multiple times locally or in a development or test environment did not cause any problems. Time to look at the production log files. No surprises, it is full with exception stack traces, which makes it much harder to find something if one does not know what to look for. It took me some time to find this lonely line:

ERROR [OC4J Launcher] org.mule.retry.notifiers.ConnectNotifier: Failed to connect/reconnect: jms://QUEUE_NAME. Root Exception was: Io exception: Connection reset. Type: class java.sql.SQLException


After looking at the log entries before and after the found one I realized what has happened and why it was necessary to restart Mule multiple times. It goes like this:

  1. The Oracle database that provides AQ JMS support experiences some problems. As a result Mule also has problems, connections get dropped, etc.

  2. Administrators restart Mule server without (fully) solving database problems.

  3. Mule JMS support code fails to initialize JMS connection at Mule startup. This stops Mule startup sequence but not OC4J instance or OC4J Mule application. Mule can't stop itself because it does not control its lifecycle when it runs embedded.

  4. As a result Mule is started in a half broken state with at least some services uninitialized but is seen by OPMN as successfully started. So much for the management infrastructure.

  5. Subsequent requests to Mule fail because even though Mule servlet is up and running most of Mule services are not.


Okay, the problem is clear. Admins are notified to watch for that "Failed to connect/reconnect: jms://QUEUE_NAME" message in the logs. Except that it does not help much because OC4J (actually OPMN) can choose to bring OC4J instances down and up any time OPMN finds necessary. Given the software runs in OC4J cluster this can lead to situations when Mule runs on one node in the cluster and fails on another. Oops.

Now the question is: can anything be done to improve the situation?
I looked at the Mule initialization logic (class MuleXmlBuilderContextListener) and found that it does something ... interesting.

Method MuleXmlBuilderContextListener.initialize(ServletContext context) is responsible for initialization and here it is (from Mule 2.2.1), with some minor details omitted:

public void initialize(ServletContext context)
{
...
try
{
muleContext = createMuleContext(config, context);
muleContext.start();
}
catch (MuleException ex)
{
context.log(ex.getMessage(), ex);
// Logging is not configured OOTB for Tomcat, so we'd better make a
// start-up failure plain to see.
ex.printStackTrace();
}
catch (Error error)
{
...
throw error;
}
}

Basically almost any failure during initialization, be it configuration file errors, connection problems, whatever, are ignored. Sure they are logged, at best, but that is it. RuntimeExceptions are not caught, but most of Mule failures during startup result in MuleException, which is caught. For example, errors in the configuration file lead to an exception thrown somewhere down the chain from createMuleContext(). Some connection failures happen somewhere in muleContext.start() and do not throw any exception but do stop the initialization. What were they thinking?!

Is it possible to do better than that? I ended up with this, but a copy-paste version of MuleXmlBuilderContextListener with some modification would work as well:

public class CustomMuleXmlBuilderContextListener extends MuleXmlBuilderContextListener {

public void contextInitialized(ServletContextEvent event) {
initialize(event.getServletContext());
if (muleContextEx != null) {
String errMsg = "Mule context is not initialized";
logger.error(errMsg, muleContextEx);
throw new IllegalStateException(errMsg, muleContextEx);
}
if (muleContext != null && !muleContext.isStarted()) {
String errMsg = "Mule context is initialized but some services failed to start";
IllegalStateException ex = new IllegalStateException(errMsg);
logger.error(errMsg, ex);
throw ex;
}
}

protected MuleContext createMuleContext(String configResource, ServletContext context)
throws ConfigurationException, InitialisationException
{
try {
return muleContext = super.createMuleContext(configResource, context);
} catch (ConfigurationException ex) {
muleContextEx = ex;
throw ex;
} catch (InitialisationException ex) {
muleContextEx = ex;
throw ex;
}
}

//
private MuleContext muleContext;
private MuleException muleContextEx;
}

I had to override createMuleContext to get the created context instance because muleContext.isStarted() is the only way to find out if Mule has actually started successfully. This new listener makes sure that:

  1. If Mule is not started but reports no exception the listener makes sure there is one, and it is logged. (Yeah, even more noise in the log.) But this makes a failure like the one caused this investigation "Failed to connect/reconnect: jms://QUEUE_NAME" be more visible in the log.

  2. A failure during Mule initialization and startup is reported back to the web container. This hits another grey area of servlet specification which does not say what MUST or SHOULD happen in this case. There is a note, for example, in v. 2.4 of Java Servlet Specification:
    SRV.10.6 Listener Exceptions

    Some exceptions do not occur under the call stack of another component in the application. An example of this is ... a ServletContextListener that throws an unhandled exception during a notification of servlet context initialization ... In this case, the Developer has no opportunity to handle the exception. The container may respond to all subsequent requests to the Web application with an HTTP status code 500 to indicate an application error....

    OC4J has here fortunately a sensible behavior: the web application is marked internally as not started. Any HTTP request to it triggers an attempt to start it. So the listener gets another chance to start Mule and if the problem went away Mule starts and ready to serve requests. This does not work of course for configuration file failures but does its job nicely for connection failures.

Note: I know about RetryPolicy and its friends. It is a subject of one of my upcoming posts.

Tuesday, May 17, 2011

What should I think of mule?

"Every SOA project must use an ESB". I do not remember who said that to me but some people do follow this advice. The project mentioned in the previous post is not an exception. Being a nice example of the vendor driven architecture the project had used Oracle ESB for some time. Apparently that thing was so ridden with bugs the team decided to use something else instead, namely, mule community edition. By the time I have joined the project the switch was completed so I was not really involved with ESB. Until recently.

The production environment has being behaving quite flaky for quite some time. Some of the problems were traced back to the Oracle BPEL software. But there were also problems related to the mule server. The problem descriptions were not really helpful. It went like that: "oh, we have messages remaining in the queue", or "the problems begin after one of other services goes down; we suspect that mule holds HTTP connections open ...", or "we had to restart the mule server and the problems went away … or not ... and we had to restart it again ...".

Time to investigate.

The mule configuration used by the project is not that complex. There are some proxy services that were added under "URL transparency" motto. Couple of those proxy services perform some minor request/response transformations. And then there are some services that logically look like one way WS endpoints but internally use Oracle AQ messaging and a simple content-based routing to decouple sending and receiving components and to add message recovery.

The first thing I wanted to check was that "we suspect that mule holds HTTP connections open ..." and see if it is possible to force mule to close an HTTP connection after each request. After all the task sounded easy.

As a seasoned developer who knows next to nothing about a piece of software I google for "mule http connection keepalive". The first link HTTP Transport - Mule 2.x User Guide - mulesoft.org looks promising.

Wait a minute... to read documentation I have to be registered? Rrrrright. Given I need to investigate several possible problems it might be better to register, but this can wait. Google cache first.

I should say the documentation looks nice. Except of course you need to know all the mule story and architecture before you can get sense of the documentation. And it still does not explain why they need to have both 'keepAlive' and 'keep-alive' properties. Anyway after some reading I thought my problem is solved. After all this is what the documentation says:

Attributes of <outbound-endpoint...>
...







keep-alive boolean  Controls if the socket connection is kept alive. If set to true, a keep-alive header with the connection timeout specified in the connector will be returned. If set to false, a "Connection: close" header will be returned.
So let's try it out. First I verified that the current configuration does not close the HTTP connection after each request. It does not: it uses HTTP 1.1 and sends no "Connection: close" header. Hitting one of the proxy services multiple times in a row results in a single HTTP connection open.

Next I have added keep-alive="false", redeployed mule and executed the same test. There is nice "keepAlive: false" HTTP header and no "Connection" header. Who said it is going to work the first time?

I hate this. I mean I do everything as documented and it does not work. Given I just started poking around mule I would not immediately suspect the documentation is wrong. So I check and double check, and run more tests. And check again and make sure I looked at the correct documentation since the project is using mule 2.2.1 and the latest version is 3.something. Nope, everything looks correct, it just does not work.

Reading the doc page further I come across section Including Custom Header Properties. The same way of specifying Connection: close is also discussed for example here: Disable HTTP Connection keep-alive in CXF proxy.

Guess what? It does not work either. Just for kicks I have added another custom property "ILikeTheMuleDocumentation: false". This one works. Wow!

Next I tried something else that was mentioned in the same thread Disable HTTP Connection keep-alive in CXF proxy, namely, adding "Connection: close" with a deprecated "http.custom.headers" property. This one finally worked. And indeed produced a deprecation warning in logs.

By this time I became quite curious. I mean one must be quite … hmm ... talented developer to achieve such an interesting result of working and non-working features. So I looked in the sources and, after some digging, I found this:

package org.mule.transport.http.transformers;

...

public class ObjectToHttpClientMethodRequest
...

protected void setHeaders(HttpMethod httpMethod, MuleMessage msg)
{
// Standard requestHeaders
String headerValue;
String headerName;

for (Iterator iterator = msg.getPropertyNames().iterator(); iterator.hasNext();)
{
headerName = (String) iterator.next();

if (headerName.equalsIgnoreCase(HttpConnector.HTTP_CUSTOM_HEADERS_MAP_PROPERTY))
{
if (logger.isInfoEnabled())
{
logger.warn("Deprecation warning: There is not need to set custom headers using: " + HttpConnector.HTTP_CUSTOM_HEADERS_MAP_PROPERTY
+ " you can now add the properties directly to the outbound endpoint or use the OUTBOUND property scope on the message.");
}

Map customHeaders = (Map) msg.getProperty(HttpConnector.HTTP_CUSTOM_HEADERS_MAP_PROPERTY);
if (customHeaders != null)
{
for (Iterator headerItr = customHeaders.entrySet().iterator(); headerItr.hasNext();)
{
Map.Entry entry = (Map.Entry) headerItr.next();
if (entry.getValue() != null)
{
httpMethod.addRequestHeader(entry.getKey().toString(), entry.getValue().toString());
}
}
}
}
else if (HttpConstants.REQUEST_HEADER_NAMES.get(headerName) == null
&& !HttpConnector.HTTP_INBOUND_PROPERTIES.contains(headerName))

{
headerValue = msg.getStringProperty(headerName, null);
if (headerName.startsWith(MuleProperties.PROPERTY_PREFIX))
{
headerName = new StringBuffer(30).append("X-").append(headerName).toString();
}

httpMethod.addRequestHeader(headerName, headerValue);
}
}

Set attNams = msg.getAttachmentNames();
if (msg.getPayload() instanceof InputStream
&& attNams != null && attNams.size() > 0)
{
// must set this for receiver to properly parse attachments
httpMethod.addRequestHeader(HttpConstants.HEADER_CONTENT_TYPE, "multipart/related");
}
}
...
Brilliant, is it not? And, by the way, I did not find any place in the outbound path where 'keep-alive' property is checked. So much for a nice documentation. So I ended up using "http.custom.headers" to set "Connection: close" header.

Well, by that time I was well prepared to investigate the remaining issues with our mule configuration. I was sure I would come across even more pearls.