Minutes to seconds, maximizing incrementality

Cédric Champeau (@CedricChampeau), Gradle

Who am I

speaker {
    name 'Cédric Champeau'
    company 'Gradle Inc'
    oss 'Apache Groovy committer'
    successes 'Static type checker',
              'Static compilation',
              'Traits',
              'Markup template engine',
              'DSLs'
    failures Stream.of(bugs)
    twitter '@CedricChampeau'
    github 'melix'
    extraDescription '''Groovy in Action 2 co-author
Misc OSS contribs (Gradle plugins, deck2pdf, jlangdetect, ...)'''
}

Agenda

  • Incremental builds

  • Compile avoidance

  • Incremental compilation

  • Variant-aware dependency management

Incremental builds

Why does it matter?

  • Gradle is meant for incremental builds

  • clean is a waste of time

  • Time is $$$

Gradle team

  • ~30 developers

  • ~20000 builds per week

  • 1 min saved per build means 333 hours/week (20 000 builds × 1 min ≈ 333 hours)!

The incrementality test

  • Run a build

  • Run again with no change

  • If a task was re-executed, you got it wrong (see the sketch below)
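
A minimal check, using a hypothetical copyDocs task (any task with properly declared inputs and outputs behaves the same way):

// a hypothetical task with declared inputs (src/docs) and outputs (build/docs)
task copyDocs(type: Copy) {
    from 'src/docs'
    into "$buildDir/docs"
}
// First run: copyDocs executes. Second run with no change: Gradle reports it
// as UP-TO-DATE and skips it. If it runs again, its inputs or outputs are not
// declared correctly.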

Properly writing tasks

Example: building a shaded jar

task shadedJar(type: ShadedJar) {
   jarFile = file("$buildDir/libs/shaded.jar")
   classpath = configurations.runtime
   mapping = ['org.apache': 'shaded.org.apache']
}
  • What are the task inputs?

  • What are the task outputs?

  • What if one of them changes?

Declaring inputs

public class ShadedJar extends DefaultTask {
   ...
   @InputFiles
   FileCollection getClasspath() { ... }

   @Input
   Map<String, String> getMapping() { ... }
}

Declaring outputs

public class ShadedJar extends DefaultTask {
   ...

   @OutputFile
   File getJarFile() { ... }
}
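
Putting the pieces together, a minimal Groovy sketch of the task type (illustrative only; the shading logic itself is elided, and the Gradle API types used here are imported by default in a build script):

class ShadedJar extends DefaultTask {
   @InputFiles
   FileCollection classpath

   @Input
   Map<String, String> mapping

   @OutputFile
   File jarFile

   @TaskAction
   void shade() {
      // relocate packages according to 'mapping' and write 'jarFile';
      // with inputs and outputs declared, Gradle can skip this action
      // when nothing relevant has changed
   }
}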

Know why your task is out-of-date

[Image: task out of date]

Incremental task inputs

  • Know precisely which files have changed

  • Task action can perform the minimal amount of work

Incremental task inputs

@TaskAction
public void execute(IncrementalTaskInputs inputs) {
   if (!inputs.isIncremental()) {
      // not incremental: process everything (clean build, for example)
      // ...
   } else {
      // only added or modified files are reported here
      inputs.outOfDate(change -> {
         if (change.isAdded()) {
            // ...
         } else {
            // modified file
            // ...
         }
      });
      // removed files are reported separately
      inputs.removed(change -> {
         // ...
      });
   }
}

Compile avoidance

Compile classpath leakage

A typical dependency graph

Cascading recompilation

But also with side effects:

  • compile dependencies leak to the downstream consumers

  • hard to upgrade dependencies without breaking clients

Separating API and implementation

Example

import com.acme.model.Person;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Iterables;

...

public Set<String> getNames(Set<Person> persons) {
   return ImmutableSet.copyOf(Iterables.transform(persons, TO_NAME));
}

Before Gradle 3.4

apply plugin: 'java'

dependencies {
   compile project(':model')
   compile 'com.google.guava:guava:18.0'
}

But…

// exported dependency
import com.acme.model.Person;
// internal dependencies
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Iterables;

...

public Set<String> getNames(Set<Person> persons) {
   return ImmutableSet.copyOf(
            Iterables.transform(persons, TO_NAME));
}

Starting from Gradle 3.4

// This component has an API and an implementation
apply plugin: 'java-library'

dependencies {
   api project(':model')
   implementation 'com.google.guava:guava:18.0'
}

API vs impl graph

Change to impl dependency

Change to API dependency

Consumers are not equal

Compile classpath

What does a compiler care about?

  • Input: jars or class directories

  • Jar: class files

  • Class file: both API and implementation

Compile classpath

What we provide to the compiler

public class Foo {
    private int x = 123;

    public int getX() { return x; }
    public int getSquaredX() { return x * x; }
}

Compile classpath

What the compiler cares about:

public class Foo {
    public int getX()
    public int getSquaredX()
}

Compile classpath

But it could also be

public class Foo {
    public int getSquaredX()
    public int getX()
}

Only public signatures matter

Compile classpath snapshotting

  • Compute a hash of the signature of each class: aedb00fd

  • Combine the hashes of all classes: e45bdc17

  • Combine the hashes of all inputs on the classpath: 4500fc1

  • Result: a hash of the compile classpath

  • It only contains what is relevant to the javac compiler (see the sketch below)
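
As an illustration only (this is not Gradle's actual implementation, which works on class files rather than reflection), a Groovy sketch that hashes nothing but the public method signatures of a class, so reordering methods or changing private details leaves the hash unchanged:

import java.lang.reflect.Modifier
import java.security.MessageDigest

def abiHash(Class clazz) {
    def signatures = clazz.declaredMethods
        .findAll { Modifier.isPublic(it.modifiers) }
        .collect { "${it.returnType.name} ${it.name}(${it.parameterTypes*.name.join(',')})".toString() }
        .sort()                              // declaration order doesn't matter
    MessageDigest.getInstance('MD5')
        .digest(signatures.join('\n').getBytes('UTF-8'))
        .encodeHex()
        .toString()
}

class Foo {                                  // same public API as the slides above
    private int x = 123
    int getX() { x }
    int getSquaredX() { x * x }
}

println abiHash(Foo)                         // same hash whichever order the getters
                                             // are declared in, and whatever x is

Two classes whose method bodies or private members differ but whose public signatures match produce the same hash, which is why such changes never trigger recompilation of consumers.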

Runtime classpath

What does the runtime care about?

Runtime classpath

What does the runtime care about:

public class Foo {
    private int x = 123;

    public int getX() { return x; }
    public int getSquaredX() { return x * x; }
}

At runtime, everything matters, from classes to resources.

Compile vs runtime classpath

In practice:

@InputFiles
@CompileClasspath
FileCollection getCompileClasspath() { ... }

@InputFiles
@Classpath
FileCollection getRuntimeClasspath() { ... }

Compile avoidance

  • Compile and runtime classpaths have different semantics

  • Gradle knows the difference

  • It ignores irrelevant (non-ABI) changes to the compile classpath

Effect on recompilations

Icing on the cake

  • Upgrade a dependency from 1.0.1 to 1.0.2

  • If ABI hasn’t changed, Gradle will not recompile

  • Even if the name of the jar is different (mydep-1.0.1.jar vs mydep-1.0.2.jar)

  • Because only contents matter

Incremental compilation

Basics

  • Given a set of source files

  • Only compile the files which have changed…

  • and their dependencies

  • Language specific

Gradle has support for incremental compilation of Java

compileJava {
    //enable incremental compilation
    options.incremental = true
}
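
A common variant applies the same option to every Java compilation task in the build:

// enable incremental compilation for all JavaCompile tasks
tasks.withType(JavaCompile) {
    options.incremental = true
}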
The Kotlin plugin implements its own incremental compilation

In practice

import org.apache.commons.math3.complex.Complex;

public class Library {
    public Complex someLibraryMethod() {
        return Complex.I;
    }
}
  • Complex is a dependency of Library

  • if Complex is changed, we need to recompile Library

  • if ComplexUtils is changed, no need to recompile

Gotcha

import org.apache.commons.math3.dfp.Dfp;

public class LibraryUtils {
   public static int getMaxExp() {
      return Dfp.MAX_EXP;
   }
}
  • Dfp is a dependency of LibraryUtils

  • so if MAX_EXP changes, we should recompile LibraryUtils, right?

Wait a minute…

javap -v build/classes/java/main/LibraryUtils.class

...
  public static int getMaxExp();
    descriptor: ()I
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=1, locals=0, args_size=0
         0: ldc           #3                  // int 32768
         2: ireturn
  • reference to Dfp is gone!

  • compiler inlines some constants

  • the JLS doesn't require the compiler to add the class owning the constant to the constant pool

What Gradle does

  • Analyzes the bytecode of all classes

  • Records which constants are used in which file

  • Whenever a producer changes, checks whether a constant changed

  • If one did, recompiles everything

Annotation processors

  • Annotation processors currently disable incremental compilation (we're working on it!)

  • The implementation of an annotation processor matters at compile time

  • Don't add annotation processors to the compile classpath

  • Otherwise Gradle cannot use smart classpath snapshotting

Annotation processors

Use annotationProcessorPath:

configurations {
    apt
}
dependencies {
    // The dagger compiler and its transitive dependencies will only be found on annotation processing classpath
    apt 'com.google.dagger:dagger-compiler:2.8'

    // And we still need the Dagger annotations on the compile classpath itself
    compileOnly 'com.google.dagger:dagger:2.8'
}

compileJava {
    options.annotationProcessorPath = configurations.apt
}

Variant-aware dependency management

Producer vs consumer

  • A consumer depends on a producer

  • There are multiple requirements

    • What is required to compile against a producer?

    • What is required at runtime for a specific configuration?

    • What artifacts does the producer offer?

    • Is the producer a sub-project or an external component?

What do you need to compile against a component?

  • Class files

  • Can be found in different forms:

    • class directories

    • jars

    • AARs, …

Question: do we need to build a jar of the producer if all we want is to compile against it?

Discriminate thanks to usage

Give me something that I can use to compile

— Consumer

Discriminate thanks to usage

Sure, here’s a jar

— Producer

Discriminate thanks to usage

But we can be finer:

Sure, here’s a class directory

— Producer

Discriminate thanks to usage

Or smarter:

mmm, all I have is an AAR, but don’t worry, I know how to transform it to something you can use for compile

— Producer

The Java Library Plugin

  • will provide consumers with a class directory for compile

  • will provide consumers with a jar for runtime

As a consequence:

  • only the classes task will be triggered when compiling against the library (see the sketch below)

  • jar (and therefore processResources) will only be triggered when needed at runtime
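
A minimal two-project sketch (project names are illustrative):

// settings.gradle
include 'library', 'consumer'

// library/build.gradle
apply plugin: 'java-library'

// consumer/build.gradle
apply plugin: 'java-library'
dependencies {
   implementation project(':library')
}

// Compiling the consumer only needs the library's compiled classes;
// the library's jar (and processResources) only run when a runtime
// classpath is actually needed (running, testing, ...).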

Conclusion

Use the Java Library Plugin!

Thank you!