Archive for ‘Java’

August 4th, 2011

It’s Official: I Dislike Maven

by Tim Cull

Apache Maven is a build tool and dependency management system for Java that aims to solve a lot of sticky problems. I’ve been trying to give it a fair chance because people I respect are fans of it, but after giving it many tries I’ve decided it’s official: I dislike it and will not recommend it to my clients.

Here are the reasons I dislike it. For me, any one of them is enough not to use it, but all of them together make the decision a no-brainer:

1) You can’t check your project out of source control and build it based solely on what’s in source control. This means that if you have a project in production, lose all your developers, and then have to build it again, you’re in for a world of hurt. This is not theoretical: I’ve worked on so many projects like this that it’s becoming a specialty of mine. I’ve worked on projects where nobody even knows the login to source control, much less anything about a Maven repo.

2) It’s too hard to tell what’s going on, especially for newbies and even with a simple build. A newbie can only understand a basic Maven build if he first reads a ton of documentation and understands most of the Maven lifecycle.

3) Dependency resolution isn’t consistent. For example, “mvn dependency:tree” doesn’t always give you the same results as “mvn eclipse:eclipse” or “mvn assembly:single” when using the “jar-with-dependencies” descriptor of the Maven Assembly plugin. On one project I work on, I end up with JAXB jars when I run assembly:single and eclipse:eclipse, but they don’t show up anywhere when I run dependency:tree. This means I have to spend many, many hours figuring out where these JAXB jars are coming from (commons-collections and log4j are other common culprits).

4) You are dependent on too much other infrastructure. In particular, you are dependent on (at best) a corporate or departmental Maven repo that needs care and feeding or (at worst) a public Maven repo. These kinds of things move, go down, get lost, and so on. This is not theoretical: I worked on one project where an infrastructure team took down and decommissioned a corporate Maven repo without realizing it, and even big boys like JBoss move their Maven repos unexpectedly.

5) Other people can mess with your dependent libraries on you. This means a build that works one day might not work the next day. Even though it isn’t best practice, I have seen with my own eyes teams manually change jars in a Maven repo without publishing them as new revisions. If you had the jars checked into source control and tagged/labeled, this would be impossible. This doesn’t only happen in private repos; it happens in public ones, too.
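To make point 5 concrete: a jar that has been swapped in place keeps its name and version, so the only way to notice is to compare its bytes against something you recorded earlier. Below is a minimal sketch of that check in plain Java. The class name `JarFingerprint`, its methods, and the command-line usage are all made up for illustration; this is not a Maven or Ant feature, just one way to detect a silently changed jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: compares a jar's SHA-256 with a value recorded earlier. */
class JarFingerprint {

    /** Returns the SHA-256 digest of the given bytes as lowercase hex. */
    static String sha256Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** True if the file's current bytes still hash to the recorded value. */
    static boolean matches(Path jar, String expectedHex) {
        try {
            return sha256Hex(Files.readAllBytes(jar)).equalsIgnoreCase(expectedHex);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed usage: java JarFingerprint <path-to-jar> <expected-sha256-hex>
        if (args.length != 2) {
            System.err.println("usage: JarFingerprint <jar> <expected-sha256-hex>");
            return;
        }
        Path jar = Paths.get(args[0]);
        System.out.println(matches(jar, args[1]) ? "unchanged" : "CHANGED: " + jar);
    }
}
```

The recorded hashes would themselves have to live in source control next to the build file, which is really just a lighter version of checking in the jars.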

So, anyway, it’s Ant and SVN/Git for me. Sorry Maven.

July 31st, 2010

New InfoQ Article: Eight Quick Ways to Refresh Legacy Java Systems

by Tim Cull

I’m happy to announce a new article of mine on InfoQ: Eight Quick Ways To Improve Java Legacy Systems. In this article I explore different, easy ways to improve your legacy Java system.

Also, stay tuned for more news about me and InfoQ…

June 24th, 2010

How to avoid huge transactions with CMP Entity Beans on JBoss

by Tim Cull

By default, CMP Entity Beans on JBoss are set to require a transaction. Also by default, any time you touch any session or entity bean, your request thread takes out a lock on that entire object, even if you are only reading it and not updating it. Lastly, also by default, JBoss will make sure that for any given entity, there is only one instance of that entity in memory at a time.

All of these defaults have serious implications. For one, they imply that anything other than a toy application will likely become a de facto single-threaded application. Imagine, for example, that you have an earthquake tracking application. Your application might have an Entity Bean called Earthquake. After getting under way with the application, you realize there are different kinds of earthquakes: tectonic, volcanic, and man-made. These don’t merit having a full-on Earthquake subclass of their own, but maybe you want to model the types as a new Entity called EarthquakeType so that the application can be data-driven and new types can be added later without changing code. The vast majority (~90%) of earthquakes are tectonic, so most of what you ever display to a user will be “tectonic”.

So, you might have a web page that displays the last 40 earthquakes in descending chronological order in a table and also a count of how many there are of each type. This could lead to innocent code like, say:

for (Earthquake earthquake : earthquakes) {
    typeSum[earthquake.getType().getId()]++;
}

The moment you call earthquake.getType() for the first earthquake in the list, you will lock the “tectonic” instance of the EarthquakeType Entity bean. This means that every other thread executing in the same JVM (if configured the default JBoss way) will most likely block (who doesn’t need to know what the earthquake type is, after all?) until this thread is done displaying its page. Even worse, if this thread is holding a lock that some other thread needs, and that other thread is holding a lock that this thread needs, then you have a deadlock. All of this in spite of the fact that actually updating an EarthquakeType is extremely rare because they are read-mostly.
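To see why an exclusive lock on a read-mostly object hurts so much, here is a plain-Java analogy. It is not JBoss code and uses no JBoss API: `EntityLockSketch` and `readersCanOverlap` are made-up names, a `ReentrantLock` stands in for the default exclusive entity lock, and the read half of a `ReentrantReadWriteLock` stands in for what would be possible if the container knew the bean was only being read.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Plain-Java analogy only; this is not how JBoss implements entity locking. */
class EntityLockSketch {

    /**
     * Starts `readers` threads that each take `lock` and then, while still
     * holding it, wait for all the other readers to arrive at a barrier.
     * Returns true only if every reader held the lock at the same time.
     */
    static boolean readersCanOverlap(Lock lock, int readers, long timeoutMillis) {
        CyclicBarrier allInside = new CyclicBarrier(readers);
        AtomicInteger overlapped = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        for (int i = 0; i < readers; i++) {
            pool.execute(() -> {
                lock.lock(); // think: earthquake.getType() on the "tectonic" bean
                try {
                    allInside.await(timeoutMillis, TimeUnit.MILLISECONDS);
                    overlapped.incrementAndGet();
                } catch (InterruptedException | BrokenBarrierException | TimeoutException e) {
                    // The other readers never got in while this one held the lock.
                } finally {
                    lock.unlock();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(2 * timeoutMillis + 1000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return overlapped.get() == readers;
    }

    public static void main(String[] args) {
        System.out.println("exclusive lock, readers overlap: "
                + readersCanOverlap(new ReentrantLock(), 4, 500));
        System.out.println("shared read lock, readers overlap: "
                + readersCanOverlap(new ReentrantReadWriteLock().readLock(), 4, 500));
    }
}
```

With the exclusive lock only one reader can be inside at a time, so the barrier never fills and the method returns false once the timeout expires. With the read lock all four readers meet at the barrier and it returns true. That difference is the throughput you give up when every read of “tectonic” takes an exclusive lock.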

A telltale sign that you are having this problem is seeing stack traces like this one:

org.jboss.util.deadlock.ApplicationDeadlockException: Application deadlock detected, resource=org.jboss.ejb.plugins.lock.QueuedPessimisticEJBLock@290df5c3, bean=

…snip…

at org.jboss.util.deadlock.DeadlockDetector.deadlockDetection(DeadlockDetector.java:69)
at org.jboss.ejb.plugins.lock.QueuedPessimisticEJBLock.waitForTx(QueuedPessimisticEJBLock.java:292)
at org.jboss.ejb.plugins.lock.QueuedPessimisticEJBLock.doSchedule(QueuedPessimisticEJBLock.java:230)

…snip.

At first, it’s tempting to fume at JBoss for having such conservative default settings. I know I did this morning as I was learning more about the details. But the fact is that they really have no choice. The application container has no idea that EarthquakeType is read-mostly. It doesn’t know if you will read it at the beginning of the request and then modify it 300 milliseconds later at the end of the request. So, it is forced to loop absolutely everything you touch into a giant transaction unless you tell it otherwise.

Now, the “telling it otherwise” is where things start to get tricky. Here, I really do think that JBoss hasn’t done us any favors. Maximizing your throughput and minimizing deadlocks is a multi-step process. If you do some steps but not others, then nothing will change and you won’t know why.

So, here are the steps…

April 21st, 2010

Java Stack Trace RegEx

by Tim Cull

This is just a quick post because it’s been a while and I wanted to save others from the pain I experienced yesterday.

If you want to parse a Java stack trace with a regular expression and pull out the class name, method name, and line number, you can use the code below:

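import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1: fully qualified class name, group 2: method name, group 3: line number.
Pattern pattern = Pattern.compile("([a-zA-Z0-9_\\.]*)\\.([a-zA-Z0-9_\\.]*)\\([a-zA-Z0-9_\\.]*:(\\d+)\\)");
Matcher matcher = pattern.matcher(traceString);
while (matcher.find()) {
    String className = matcher.group(1);
    String methodName = matcher.group(2);
    // \\d+ requires at least one digit, so parseInt never sees an empty string.
    int lineNumber = Integer.parseInt(matcher.group(3));
}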

Note that because you are writing the regular expression as a Java string literal, you have to double-escape the backslashes. For example, the usual regular expression for “any digit” is “\d”, but in a Java string literal you have to write “\\d” instead.
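Here is a small self-contained sketch you can compile and run to try it out. It uses the pattern above with `\\d+` for the line number; the class name `StackFrameParser`, the `parse` method, and the `class#method:line` output format are made up for this example, and the sample input is two of the frames from the JBoss deadlock trace in the post above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical wrapper around the stack trace pattern, for experimenting. */
class StackFrameParser {

    // Group 1: class name, group 2: method name, group 3: line number.
    private static final Pattern FRAME = Pattern.compile(
            "([a-zA-Z0-9_\\.]*)\\.([a-zA-Z0-9_\\.]*)\\([a-zA-Z0-9_\\.]*:(\\d+)\\)");

    /** Returns one "class#method:line" string per frame found in the trace. */
    static List<String> parse(String trace) {
        List<String> frames = new ArrayList<>();
        Matcher matcher = FRAME.matcher(trace);
        while (matcher.find()) {
            String className = matcher.group(1);
            String methodName = matcher.group(2);
            int lineNumber = Integer.parseInt(matcher.group(3));
            frames.add(className + "#" + methodName + ":" + lineNumber);
        }
        return frames;
    }

    public static void main(String[] args) {
        String trace =
                "at org.jboss.util.deadlock.DeadlockDetector.deadlockDetection(DeadlockDetector.java:69)\n"
              + "at org.jboss.ejb.plugins.lock.QueuedPessimisticEJBLock.waitForTx(QueuedPessimisticEJBLock.java:292)";
        for (String frame : parse(trace)) {
            System.out.println(frame);
        }
    }
}
```

Keep in mind what the character classes leave out: frames without a line number, such as “(Native Method)”, are skipped entirely, and inner-class names containing “$” are not captured correctly.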

I’d like to give some props to David Matuszek whose nifty online Regular Expression Test Applet made debugging this hairy thing much easier.