11 September 2018

Tags: gradle maven lifecycle

The build lifecycle

There’s a very common misconception I read from Maven users about Gradle: that there’s no default lifecycle. Not only is this wrong, but the Gradle lifecycle is actually significantly richer.

First, let’s explain in a few words what the lifecycle is in Maven. Detailed explanations can be found here, but in a nutshell, the idea is that any build always consists of a sequence of phases, and that each phase builds on top of the previous one. This has the advantage of being simple to explain: to deploy an application, you first need to compile it, run the tests, package it, perform validation (checkstyle, …) and deploy. Let me ignore install, which is an artifact of how Maven works.

Maven plugins attach themselves to those lifecycle phases, and define goals on different phases. For example, a code generator would attach itself to the generate-sources phase, and define a goal that runs at this phase. Fun begins when you have dependencies between the goals, and ordering matters… Anyway, the general idea is that if you want to get the outcome of, say, packaging, you have to execute all previous phases and consequently all goals that are defined on those phases. The other consequence is that the Maven lifecycle is biased towards the Java model, and more specifically, building Java libraries. It’s even clearer when you think about the term deploy, which doesn’t mean "deploy this application to production", but "push this jar to an external Maven repository". Similarly, "install" doesn’t mean "install this application on my laptop", but "copy this library into my local .m2 repository". This, I would argue, is rather counter-intuitive…

Gradle, on the other hand, is a generic build tool. It’s aimed at the Java ecosystem, but also the native one, the Android ecosystem, Python, Go, … It doesn’t matter. All of those ecosystems have an underlying model, and the way you build applications or libraries in each of those ecosystems is different. Gradle offers the APIs to model the builds of each ecosystem. This flexibility is often what troubles Maven users, and makes them think there’s no lifecycle, but it’s not true.

Goals vs tasks

To understand why, we need to explain that the Gradle model is not phase-based, but task-based. It’s a bit like what Ant did, but the similarity stops there. While Ant didn’t define any convention or lifecycle, and everything was different from one build to the other, this is not true with Gradle. By default, if you apply the "Java library" plugin, you’ll get all the conventions you find with Maven:

plugins {
   id 'java-library'
}

This is the minimal build file you need to build, test and package a Java library with Gradle, with the same conventions as Maven (src/main/java, …). By applying this plugin, Gradle internally applies a sequence of plugins which, in turn, define new tasks and, more specifically for the topic of this blog post, lifecycle tasks.

In Gradle, a task is responsible for executing a unit of work. For example, "compile this source set". A task has inputs (the source files) and outputs (the class files). But a task also has dependencies, in particular dependencies on other tasks. Gradle makes sure that the task graph is a DAG (directed acyclic graph). This means that if you run compile on the command line, what Gradle does is:

  1. compute the task dependencies of compile

  2. execute the tasks in order
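To make the idea of inputs, outputs and task dependencies concrete, here is a minimal sketch of a build script with two tasks. The task names (generateGreeting, printGreeting) and the file paths are hypothetical, and the sketch assumes a Gradle version recent enough to support tasks.register and layout.buildDirectory.

```groovy
// build.gradle (illustrative only; task names and paths are hypothetical)

// Input and output locations, declared once so both tasks can refer to them.
def source = layout.projectDirectory.file('src/greeting.txt').asFile
def target = layout.buildDirectory.file('greeting/greeting.txt')

def generateGreeting = tasks.register('generateGreeting') {
    // Declaring inputs and outputs lets Gradle decide whether the task is up to date.
    inputs.file(source)
    outputs.file(target)
    doLast {
        def out = target.get().asFile
        out.parentFile.mkdirs()
        out.text = source.text.toUpperCase()
    }
}

tasks.register('printGreeting') {
    // Explicit dependency: generateGreeting is scheduled before this task.
    dependsOn(generateGreeting)
    doLast {
        println target.get().asFile.text
    }
}
```

Running gradle printGreeting would first compute that printGreeting depends on generateGreeting, then execute both in that order.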

Task dependencies can be explicit (say, compile.dependsOn(compileJava)) or implicit (if you add a source set as an input and that source set is generated by another task, Gradle knows it needs to run the code generation first). This model is nice because it’s significantly more fine-grained than the phase-based one. When you execute a task, Gradle will always perform the minimal amount of work required to get the output of this task. Let’s illustrate with an example: say you want to run the unit tests of your library. You would run the test task with Gradle. Gradle would then determine that:

  • it needs to compile the sources of the library (compileJava)

  • but the sources include a generated source set, so it needs to run the task that generates it (generateSources)

  • it would also find that the "resources" are an input of the test classpath, so it needs to execute the processResources task

  • etc…

But, in the end, it would not generate the jar file. Because Gradle knows that to run the tests, there’s no need to get the jar: we can build a classpath that consists of the generated classes and the resources directories. It’s actually very easy to figure out what the task dependencies are by running with --dry-run, or using a build scan.
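The scenario above can be sketched as follows. This is only an illustration under assumptions: generateSources is a hypothetical task written for this example (it is not provided by the Java library plugin), the generated class is a placeholder, and wiring a task directly into srcDir assumes a reasonably recent Gradle version.

```groovy
// build.gradle (illustrative only; generateSources is a hypothetical task)
plugins {
    id 'java-library'
}

def generatedDir = layout.buildDirectory.dir('generated/sources/example')

def generateSources = tasks.register('generateSources') {
    // The output directory is what links this task to its consumers.
    outputs.dir(generatedDir)
    doLast {
        def file = new File(generatedDir.get().asFile, 'example/Generated.java')
        file.parentFile.mkdirs()
        file.text = 'package example; public class Generated {}'
    }
}

// Implicit dependency: the main source set consumes the task's output,
// so compileJava needs generateSources without any dependsOn call.
sourceSets.main.java.srcDir(generateSources)

// To inspect the resulting task graph without executing anything:
//   gradle test --dry-run
```

With a setup like this, the dry run of test should list generateSources, compileJava and processResources among the scheduled tasks, while jar should not appear.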

So, will you tell me what the lifecycle of Gradle is then?

This is the trick. With Gradle, everything boils down to tasks, which are a bit like functions, with inputs and outputs. But there are special kinds of tasks, which we call "lifecycle tasks", that bind other tasks together. They effectively produce no output of their own. Their only role is to have dependencies on other tasks, so that we have nice shortcuts to produce our outputs. For example, the check task is a lifecycle task which has dependencies on the test task, but also the checkstyle task, etc… Plugins are free to add dependencies to the check task, and enrich the check lifecycle this way. But even better, by defining dependencies between tasks like that, and clearly defining the inputs and outputs of each task, we make it possible to get correct incremental builds, as well as caching (and no, this has nothing to do with ~/.m2).
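As a rough sketch of how the check lifecycle can be enriched, the build script below attaches an extra verification task to check. The verifyReadme task is hypothetical and made up for this example; a real plugin would typically contribute something more useful, such as a static analysis task.

```groovy
// build.gradle (illustrative only; verifyReadme is a hypothetical task)
plugins {
    id 'java-library'
}

def verifyReadme = tasks.register('verifyReadme') {
    def readme = file('README.md')
    doLast {
        // Fail the build if the project has no README.
        if (!readme.exists()) {
            throw new GradleException('README.md is missing')
        }
    }
}

// check produces nothing itself; it only aggregates verification tasks.
tasks.named('check') {
    dependsOn(verifyReadme)
}
```

After this, gradle check would run verifyReadme in addition to test, while gradle test alone would not.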

The good news is that because lifecycle tasks are just regular tasks, they can also depend on each other, and you can build your own lifecycle tasks. It becomes very easy to model your build production pipeline. So here is a simple correspondence matrix for Maven users, for the Java library plugin:

Table 1. Lifecycle correspondence matrix

| Maven | Gradle | Description |
|---|---|---|
| clean | clean | Removes the outputs of tasks |
| compile | classes | Generates the classes from source files |
| test | test | Executes unit tests |
| package | assemble | Creates a jar |
| verify | check | Runs all tests, integration tests, quality checks, … |
| install | publishToMavenLocal | Gradle doesn’t need a local repository, but should you need Maven interoperability, you can apply the maven-publish plugin to add this task |
| deploy | publishToMavenRepository | This task is not available by default, as it depends on which type of repository you deploy to. In general you just apply the maven-publish plugin to add this task |
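For the install and deploy equivalents, a minimal sketch of a maven-publish setup could look like this. The coordinates, the publication name (mavenJava) and the repository name and location are assumptions chosen for illustration; a real build would point at an actual repository URL.

```groovy
// build.gradle (illustrative only; coordinates and repository are hypothetical)
plugins {
    id 'java-library'
    id 'maven-publish'
}

group = 'com.example'
version = '1.0.0'

publishing {
    publications {
        // Publishes the jar produced by the Java library plugin.
        mavenJava(MavenPublication) {
            from components.java
        }
    }
    repositories {
        maven {
            name = 'example'
            // A directory inside the build folder stands in for a real repository.
            url = uri(layout.buildDirectory.dir('repo'))
        }
    }
}

// gradle publishToMavenLocal   copies the library into the local Maven repository
// gradle publish               publishes every publication to every declared repository
```

The plugin also derives per-repository task names from the publication and repository names, so the exact task to run depends on how they are declared in the build.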

But remember: in Gradle, tasks depend on each other. So it means that if you run a lifecycle task, only the tasks required for that specific target are going to be executed. Nothing more.
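Since lifecycle tasks are regular tasks, defining your own is a matter of registering a task that only declares dependencies. The sketch below uses a hypothetical preCommit task; the name and the choice of aggregated tasks are assumptions for illustration.

```groovy
// build.gradle (illustrative only; preCommit is a hypothetical lifecycle task)
plugins {
    id 'java-library'
}

tasks.register('preCommit') {
    group = 'verification'
    description = 'Runs the unit tests and generates the Javadoc.'
    // No actions of its own: the task only binds other tasks together.
    dependsOn(tasks.named('test'), tasks.named('javadoc'))
}

// To see which tasks would run:
//   gradle preCommit --dry-run
```

Because nothing in preCommit requires the jar, running it should not trigger the jar task.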

See us at Devoxx Belgium!

If you want to discover more of the differences between Gradle and Maven, come see my colleague Louis Jacomet and me at Devoxx Belgium: we’re giving a deep dive into Gradle where we’re going to cover what is explained here, and much more!
