Today I’m approaching the Spring Surf RC2 release, and one of the major things that has been bothering me (and definitely the other devs) is the slowness and instability of the Maven build.

Recurring issues like:

  • Huge overhead in repository snapshot artifact lookups
  • Useless plugin invocations across the multimodule build
  • Build randomly failing on the build server
  • Unnecessarily complex release process

As Maven has always been my boy, today I decided, prior to the RC2 release, to try to improve the performance and stability of the build before proceeding with the release. As I often suggest, a Maven project is somehow like a very sweet but complicated woman, who really needs some Tender Loving Care before it can actually release all its potential.

I believe that, in the infinite love & hate between software writers and software users/configurators, Maven's design has too many times been blamed for circumstances that are actually fully under the control of the configurator (i.e. the dev who writes the Maven POM). So, while I go through this Maven refactoring, I decided to share my experience and achievements with you, to try and somehow give Caesar what belongs to Caesar.

The Before

Before we get into the Maven refactoring best practices, here is a glimpse of what the project looks like right now.

  • My machine is a MacBook Pro 2.66 GHz Intel Core 2 Duo, with 4GB DDR3 RAM and Mac OS X 10.5.8
  • Maven POM on Spring Surf Trunk at revision 603
  • Maven 2.0.11 (old version but needed for some non updated plugins) on Java 1.6.0_20
    zion:spring-surf-parent mindthegab$ mvn --version
    Apache Maven 2.0.11 (r909250; 2010-02-12 06:55:50+0100)
    Java version: 1.6.0_20
    Java home: /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Home
    Default locale: en_US, platform encoding: MacRoman
  • Using Cloc to count lines in pom.xml files across the whole project, I got 7139 LOC
    Command: zion:tools mindthegab$ perl cloc-1.52.pl ~/Dev/alfresco/workspace/spring-surf-parent/ --match-f=pom.xml

  • With a fairly full repository (already built this at least once) this is the current build performance:
    When     | Command           | Performance
    Original | mvn clean install | 18 minutes 49 seconds

Let’s make it fly, now :)

Best Practice 1 – Lower the number of repositories

Keeping the number of Repositories declared in the POM as low as possible has a drastic impact on the build experience and also lowers concerns on the build being dependent on multiple internet sources to be reproducible. This has to be achieved by:

  • Remove development leftovers, i.e. validating which repositories are actually needed by the build
  • Use a corporate repository to proxy external repositories, allowing Maven to focus on building and leaving the hassle of remote repository dependency resolution to the proxy repository. In our example we’ll use an Enterprise Maven repository manager called Sonatype Nexus, which is hosted by Alfresco at http://maven.alfresco.com

In our case the original Surf POM contained 10 repositories and 3 plugin repositories, which really seems to be an exaggeration, especially since most repositories are from the Spring community anyway (and one of them is already the http://maven.alfresco.com Nexus instance).

So first of all I removed the following repositories:

  • http://private.repository.springsource.com/maven/bundles/external – This is a private repo, not suitable for an open source project

Then I removed the following repositories from the POM, creating a corresponding proxy repository for each of them in http://maven.alfresco.com:

Repository | Proxied URL (under http://maven.alfresco.com/nexus/content/repositories) | Type
http://repository.springsource.com/maven/bundles/release | /com.springsource.repository.bundles.release | Release
http://repository.springsource.com/maven/bundles/external | /com.springsource.repository.bundles.external | Release
http://repository.springsource.com/maven/bundles/milestone | /com.springsource.repository.bundles.milestone | Release
http://repository.springsource.com/maven/bundles/snapshot | /com.springsource.repository.bundles.snapshot/ | Snapshot (was wrongly configured as release in the POM)
http://maven.springframework.org/milestone | /maven.springframework.org.milestone | Release
http://maven.springframework.org/snapshot | /maven.springframework.org.snapshot/ | Snapshot (was wrongly configured as release in the POM)
http://extensions.springframework.org/snapshot | /snapshot.extensions.springframework.org | Snapshot
http://extensions.springframework.org/milestone | /snapshot.extensions.springframework.org/ | Release (was wrongly configured as snapshot in the POM)
http://mc-repo.googlecode.com/svn/maven2/releases | /mc-repo/ | Release
https://nexus.codehaus.org/content/repositories/releases | /codehaus-repository-nexus/ | Release
https://nexus.codehaus.org/content/repositories/snapshots/ | /codehaus-snapshots-nexus/ | Snapshot (was wrongly configured as release in the POM)

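Now the whole point of this repository proxying is that Nexus allows us to group repositories under common URLs, so that the POM only needs a single reference for releases and a single reference for snapshots. So what I did was group the release proxies and the snapshot proxies under one group URL each (the groups/public and groups/public-snapshots URLs used in the POM).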

This way the refactored Surf POM mentions only 2 repositories (+ 2 pluginRepositories), which already looks much better than when we started, see below:
<repositories>
  <!-- Alfresco Community Maven Repositories -->
  <repository>
    <id>alfresco-public-snapshots</id>
    <url>http://maven.alfresco.com/nexus/content/groups/public-snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>
  <repository>
    <id>alfresco-public-releases</id>
    <url>http://maven.alfresco.com/nexus/content/groups/public</url>
  </repository>
</repositories>
<pluginRepositories>
  <pluginRepository>
    <id>alfresco-public-snapshots</id>
    <url>http://maven.alfresco.com/nexus/content/groups/public-snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </pluginRepository>
  <pluginRepository>
    <id>alfresco-public-releases</id>
    <url>http://maven.alfresco.com/nexus/content/groups/public</url>
  </pluginRepository>
</pluginRepositories>

After this refactoring, we ran the build again and got a drastic (and, I have to admit, encouraging) 48% performance improvement:

When | Command | Performance | Perf delta
Original | mvn clean install | 18 minutes 49 seconds | Not applicable
Post Repository Consolidation | mvn clean install | 9 minutes 42 seconds | 48.4 %

Best Practice 2 – Fine tune SNAPSHOT repositories policies

Maven 2 checks remote repositories for newer versions of SNAPSHOT dependencies and plugins according to each repository's update policy, which can be controlled by POM/settings configuration. It’s always a best practice to set this policy explicitly and limit SNAPSHOT checks to once a day, rather than bothering every single build with HTTP requests that are useless (and often failing) unless you’re in a heavy development phase.

In our case, after applying Best Practice 1, we only have 2 snapshot repositories. With respect to the repositories configuration above, we only need to add the <updatePolicy>daily</updatePolicy> snippet to the alfresco-public-snapshots repository and pluginRepository.
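As a minimal sketch of that change, this is how the snapshot repository from the POM above could look with the policy added. The only new line is the updatePolicy element, which Maven expects inside the snapshots block; the same addition would go in the matching pluginRepository.

```xml
<repository>
  <id>alfresco-public-snapshots</id>
  <url>http://maven.alfresco.com/nexus/content/groups/public-snapshots</url>
  <releases>
    <enabled>false</enabled>
  </releases>
  <snapshots>
    <enabled>true</enabled>
    <!-- check the remote repository for newer SNAPSHOTs at most once a day -->
    <updatePolicy>daily</updatePolicy>
  </snapshots>
</repository>
```

When a fresh SNAPSHOT is needed before the daily check kicks in, running the build with mvn -U forces the update for that single run.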

Since we have few repositories after the refactoring, we don't expect this to improve performance that much (and anyway the SNAPSHOTs were already updated today by earlier builds), so we skip the performance measurement for this best practice.

Best Practice 3 – Limit unnecessary plugin executions in multimodule builds

When used in their default configuration, certain plugins are executed unnecessarily (or at least add overhead) for every submodule in the reactor. In our context, with Spring Surf being composed of 30 submodules + 1 parent, you can already see the issue.

In particular, looking around in the POM I found and optimized the usage of:

Plugin | Usage | Optimization
buildnumber-maven-plugin | Used to incrementally label builds based on the SVN revision. | Using getRevisionOnlyOnce, we limit the execution of this plugin (and the remote SVN requests) to the parent project, which saves 30 plugin executions.
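A rough sketch of what this optimization could look like in the parent POM follows. The getRevisionOnlyOnce parameter is the one named above; the binding of the create goal to the validate phase and the ${buildnumber.plugin.version} property are assumptions made for illustration, not taken from the Surf POM.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>buildnumber-maven-plugin</artifactId>
      <!-- hypothetical property: define it in <properties> with the version you actually use -->
      <version>${buildnumber.plugin.version}</version>
      <executions>
        <execution>
          <!-- assumed binding: resolve the build number early in the lifecycle -->
          <phase>validate</phase>
          <goals>
            <goal>create</goal>
          </goals>
        </execution>
      </executions>
      <configuration>
        <!-- ask SVN for the revision once per reactor build instead of once per module -->
        <getRevisionOnlyOnce>true</getRevisionOnlyOnce>
      </configuration>
    </plugin>
  </plugins>
</build>
```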

With this configuration turned on, we happily improve the build performance by another incremental 40%:

When | Command | Performance | Perf delta
Original | mvn clean install | 18 minutes 49 seconds | Not applicable
Post Repository Consolidation | mvn clean install | 9 minutes 42 seconds | 48.4 %
Post Plugins configuration | mvn clean install | 5 minutes 51 seconds | 39.6 %

Best Practice 4 – Fix plugin versions

One of the main causes of non-reproducible or unstable builds is the Maven “feature” which allows you not to specify a plugin version, in which case Maven will try to retrieve the latest plugin version from any of the available repositories. This becomes especially dangerous when using proxy repositories, which basically mirror the whole public Maven artifact arena: whenever someone uploads a new (maybe broken) plugin version, your build will pick it up right away, potentially failing.

In our context no plugins were using SNAPSHOT versions, but I leave this best practice here as a reference.
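For reference, one common way to fix plugin versions is a pluginManagement section in the parent POM, so that every submodule inherits the same pinned versions. The sketch below is illustrative only: the two plugins and their version numbers are assumptions, not the ones used in the Surf build.

```xml
<build>
  <pluginManagement>
    <plugins>
      <!-- versions below are illustrative assumptions; pin whatever your build has been tested with -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.3.2</version>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.6</version>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```

With this in place a submodule can declare a plugin by groupId and artifactId only and still get the pinned version, so a newly published release can't sneak into the build unnoticed.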

Best Practice 5 – Check for weird plugin executions in your lifecycle

In certain contexts a complex POM configuration or a heavily multi-module build might cause certain plugins (in certain configurations) to be run multiple times, wasting build time unnecessarily. It’s always a good practice to follow your build output (at least once) in order to understand if there are weird loops.

In our context I was able to find a bug in the maven javadoc plugin 2.6+ which was causing the plugin to be invoked N*N times, where N is the number of modules in my project (so N times for each of the N modules). The workaround was to downgrade to version 2.5 of the javadoc plugin.
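The downgrade itself boils down to pinning the plugin version. A minimal sketch, assuming the version is managed in the parent POM's pluginManagement section (where exactly it is declared in the Surf POM is not shown here):

```xml
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-javadoc-plugin</artifactId>
        <!-- pinned to 2.5 as a workaround for the repeated executions seen with 2.6+ -->
        <version>2.5</version>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```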

Wrap up and considerations
Looking at the numbers, we can be quite happy with having reduced the build time from 18 minutes and 49 seconds to 5 minutes and 51 seconds (a good 68% performance improvement), while the POM LOC went down to 6990 (~2% reduction). I'm not saying this is the recipe for every build improvement, just trying to convey the message that, when used properly, Maven can actually be a great automation tool instead of an additional hassle for the devs.
Last but not least, the ASF has just announced Maven 3, which promises to speed up our builds and provide a much safer framework to do Application Lifecycle Management. So stay posted, because I might try to Maven3ize the Spring Surf build sooner or later :)
