These are study notes on Mill, a new Scala/Java build tool. Mill aims to make builds easy by combining functional programming concepts with good ideas from Bazel.

1 Getting Started

1.1 Installation

brew install mill
echo 'export PATH="/usr/local/opt/openjdk/bin:$PATH"' >> ~/.zshrc

To build a Scala project, create a build.sc file:

// needed for Bloop to work properly in the IDE
import $ivy.`com.lihaoyi::mill-contrib-bloop:$MILL_VERSION`

import mill.scalalib.ScalaModule

object hi extends ScalaModule {
  def scalaVersion = "2.13.1"
}

The first line, import $ivy.`com.lihaoyi::mill-contrib-bloop:$MILL_VERSION`, imports the version of the Bloop plugin that matches the project's Mill version. This setting is important for Bloop to work properly in Visual Studio Code.

To add the Mill bootstrap script to a codebase, download the script into the project folder, name it mill and make it executable. On a Linux or macOS machine: curl -L https://github.com/lihaoyi/mill/releases/download/0.6.1/0.6.1 > mill && chmod +x mill.

Then create a hi/src/Hi.scala file with the following content:

object Hi {
  def main(args: Array[String]): Unit = {
    println("Hi")
  }
}

Run the code using mill hi. Because a non-trivial project depends on Ivy packages, its build.sc needs to import more types. The following is the build file of a sample Cask application:

import $ivy.`com.lihaoyi::mill-contrib-bloop:$MILL_VERSION`

import mill.Agg
import mill.scalalib.{DepSyntax, ScalaModule}

object hi extends ScalaModule {
  def scalaVersion = "2.13.1"
  def ivyDeps = Agg(
    ivy"com.lihaoyi::cask:0.5.7",
    ivy"com.lihaoyi::upickle:0.5.1",
    ivy"com.lihaoyi::scalatags:0.8.6"
  )

  object test extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1")
    def testFrameworks = Seq("utest.runner.Framework")
  }
}

1.2 Mill Version

Once Mill is installed, the launcher can run any Mill version for a given project. To pin a version, create a .mill-version file in the project root folder. It takes precedence over the version of the installed mill launcher.

To override the Mill version manually, pass a MILL_VERSION environment variable on the command line. For example: MILL_VERSION=0.6.1 mill __.compile.

All three methods (the mill launcher, the .mill-version file and the MILL_VERSION variable) also accept a development version, which has the form #.#.#-n-hash.

1.3 Scala Project Structure

Mill expects the following project structure:

build.sc
foo/
    src/
        FileA.scala
        FileB.scala
    resources/
        ...
out/
    foo/
        compile/
        run/
        jar/
        assembly/
    mill-worker-somerandomcharacters
    mill-profile.json

The code object foo extends ScalaModule {...} defines a module, and the object name foo is also the name of the module's root folder. The foo folder sits next to the build.sc file. The build output goes to out/foo/.

The Scala source files are in foo/src/.

The out/ folder has a subfolder for each module. In each module’s folder, there is a subfolder for each task. Each task’s folder has the following contents:

  • dest/: a path used as scratch space or for generated files (binary blobs), as referenced by a PathRef.
  • log: the output of stdout and stderr. It is the same as the console output during execution.
  • meta.json: contains the metadata/output returned by that task. It includes the cache key for a target or the JSON-serialized return value of a command.

Additionally, the out/ folder has some top-level build-related files and folders.

  • mill-profile.json logs the tasks, the time used and the cached flags for the last executed Mill command.
  • mill-worker-somerandomcharacters/ contains some runtime data.
  • mill/ contains internal tools. For example, Mill uses sbt Zinc as the incremental compiler for Scala.
  • resolve, show, plan etc: the results of Mill command line tools.

2 Basic Usage

2.1 Common Tasks

Some common commands are:

mill foo # It is the same as "mill foo.run", run the main method, if any.
mill foo.run  # run the main method, if any
mill foo.compile  # compile sources into classfiles
mill foo.runBackground  # run the main method in the background
mill foo.launcher  # create a launcher command at "./out/foo/launcher/dest/run"
mill foo.jar  # bundle the classfiles into a jar
mill foo.assembly  # bundle classfiles and all dependencies into a jar

The first time you run a command, several warning messages appear, such as WARNING: An illegal reflective access operation has occurred. They are generated by an old version of com.google.protobuf.UnsafeUtil and can be safely ignored.

The command after the module name is a task. Tasks can be cached targets such as foo.compile or un-cached commands such as foo.run.

2.2 Command-line Tools

Mill has the following command-line tools: all, clean, inspect, path, plan, resolve, show, shutdown, version, visualize, and visualizePlan.

On the command line, _ is a wildcard pattern that matches all modules or tasks in the specified scope. At the top level, it represents all top-level modules. __ represents all modules, including top-level modules and nested modules.

mill resolve _ # list all Mill commands and top-level modules. Don't run the task.
mill resolve __  # list all available tasks, some are general tasks, such as "console" and "run" etc,  for every module.
mill resolve foo._  # list tasks within foo module, including the general tasks.

mill inspect foo.compile #  inspect a task's doc, source location and its input tasks.

mill show foo.compile  # run a task and show its output. By default, Mill doesn't print out the metadata from evaluating a task.

mill all foo.{compile,run} # run multiple tasks (the shell expands the braces, so no space after the comma).
mill all _.compile  # run the task for all top modules.
mill all __.compile # run the task for all modules.

mill plan foo # prints out what tasks would be evaluated, in what order, if you ran mill foo, but without actually running them.

mill path foo.assembly foo.compile # prints out a dependency chain from the first to the second

mill clean # delete all the cached outputs
mill clean foo.compile # only delete the outputs of one task

mill visualize foo._ # generates relationship files in different format (.png, .txt, .json, .svg) for a set of tasks.

mill visualizePlan foo._ # generate relationship files for the entire build plan, include indirect tasks.

# additional command
mill mill.scalalib.Dependency/updates # search for updated versions of the project's dependencies

# options
mill -i # open the Mill build REPL and run tasks interactively
mill -i foo.console  # start a Scala REPL in interactive mode with the module and its dependencies loaded.
mill -i foo.repl  # start an Ammonite REPL in interactive mode with the module and its dependencies loaded.

mill -w foo.compile # same as the "--watch" option: watch the task and re-evaluate it if there is a change.
mill -j 4 __.compile # same as the "--jobs" option: execute tasks in parallel with the specified number of threads.

mill path prints out a dependency chain between the first task and the second.

mill show visualize takes a subset of the Mill build graph and draws out their relationships in .svg and .png form for you to inspect. It also generates .txt, .dot and .json for easy processing by downstream tools. Another use case is to view the relationships between modules.

mill show visualizePlan is similar to mill show visualize except that it shows a graph of the entire build plan, including tasks not directly resolved by the query. Tasks directly resolved are shown with a solid border, and dependencies are shown with a dotted border.

mill clean deletes all the cached outputs of previously executed tasks. It can apply to the entire project, entire modules, or specific tasks. mill clean foo.runBackground stops the background process and cleans its output.

mill mill.scalalib.Dependency/updates searches for dependency updates.

3 Configure Mill

3.1 Compilation & Execution Flags

  • scalaVersion: scala version, String
  • scalacOptions: scalac flags, Seq[String]
  • forkArgs: JVM flags for run and test tasks in a sub process, Seq[String]
  • forkEnv: environment variables for run, Map[String, String]

Use runLocal or testLocal to run the application in-process, without forkArgs and forkEnv.
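As a minimal sketch, the four settings above could be combined in one module as follows. The module name foo follows the earlier examples; the flag values and the APP_ENV variable are illustrative assumptions, not recommendations.

```scala
// build.sc (sketch; all values below are placeholders)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"

  // Flags passed to scalac when compiling this module
  def scalacOptions = Seq("-deprecation", "-feature")

  // JVM flags for the forked process used by run and test
  def forkArgs = Seq("-Xmx2g")

  // Environment variables for the forked process (APP_ENV is hypothetical)
  def forkEnv = Map("APP_ENV" -> "dev")
}
```

With this in place, mill foo.run starts a sub-process with these settings, while mill foo.runLocal stays in the Mill JVM and ignores forkArgs and forkEnv.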

3.2 Ivy Dependencies

import mill.Agg
import mill.scalalib.{DepSyntax, ScalaModule}

def ivyDeps = Agg(
  ivy"com.lihaoyi::upickle:0.5.1",
  ivy"com.lihaoyi::pprint:0.5.2",
  ivy"com.lihaoyi::fansi:0.2.4",
  ivy"${scalaOrganization()}:scala-reflect:${scalaVersion()}"
)

ivyDeps has the type Agg[Dep]; it is built from the Agg object and a set of ivy"..." strings. An Agg[Dep] is a set of Dep values, where Dep is the type of a single dependency artifact.

Use :: for Scala dependencies, so that Mill appends the Scala binary version (such as 2.13) to the artifact name. Use ::: for dependencies that are cross-published against the full Scala version, such as 2.13.1 instead of 2.13.

Use : for Java dependencies. If Mill cannot generate the correct artifact name and version, look up the exact artifact name and version in the repository and write the full name separated by :.
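The three separators can be compared side by side in a short sketch. The artifacts below are real libraries, but the version numbers are only illustrative and may need adjusting to versions published for your Scala version.

```scala
// build.sc (sketch; version numbers are placeholders)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"

  def ivyDeps = Agg(
    // ":"   Java artifact, the name is used as written
    ivy"com.google.guava:guava:28.2-jre",
    // "::"  Scala artifact, resolved with the binary version suffix (_2.13)
    ivy"com.lihaoyi::scalatags:0.8.6",
    // ":::" Scala artifact published per full Scala version (_2.13.1)
    ivy"com.lihaoyi:::ammonite:2.0.4"
  )
}
```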

To add repositories:

import coursier.maven.MavenRepository

def repositories = super.repositories ++ Seq(
  MavenRepository("https://oss.sonatype.org/content/repositories/releases")
)

Create a custom ZincWorkerModule to add a custom resolver.

3.3 Test Suite

Add a nested module extending Tests:

object test extends Tests {
  def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1")
  def testFrameworks = Seq("utest.runner.Framework")
}

To run a specific test case using uTest, pass it as an argument, for example: mill foo.test foo.MyTestSuite.myCase.

You can define more than one test module, each with its own folder.
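A sketch of a module with two test modules is shown below. The second module name, integration, is a hypothetical choice; any object name works and determines the folder name.

```scala
// build.sc (sketch; "integration" is an illustrative name)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"

  // Unit tests: sources live in foo/test/src, run with "mill foo.test"
  object test extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1")
    def testFrameworks = Seq("utest.runner.Framework")
  }

  // Second suite: sources live in foo/integration/src,
  // run with "mill foo.integration"
  object integration extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1")
    def testFrameworks = Seq("utest.runner.Framework")
  }
}
```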

3.4 Scala Compiler Plugin

def compileIvyDeps = Agg(ivy"com.lihaoyi::acyclic:0.1.7")
def scalacOptions = Seq("-P:acyclic:force")
def scalacPluginIvyDeps = Agg(ivy"com.lihaoyi::acyclic:0.1.7")

scalacPluginIvyDeps depends on compileIvyDeps, so both are set.

3.5 Code Format

Extend mill.scalalib.scalafmt.ScalafmtModule. Run mill foo.reformat to format the code of module foo and mill foo.checkFormat to check its format.

Reformat a project’s code globally with the mill mill.scalalib.scalafmt.ScalafmtModule/reformatAll __.sources command, or only check the code’s format with mill mill.scalalib.scalafmt.ScalafmtModule/checkFormatAll __.sources.

Create a .scalafmt.conf file at the project root to configure formatting.
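A minimal sketch of a module that mixes in ScalafmtModule, reusing the module name foo and the Scala version from the earlier examples:

```scala
// build.sc (sketch)
import mill._, scalalib._, scalafmt._

// Mixing in ScalafmtModule adds the reformat and checkFormat tasks to foo
object foo extends ScalaModule with ScalafmtModule {
  def scalaVersion = "2.13.1"
}
```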

3.6 Common and Global Configuration

You can define a trait that extends ScalaModule and contains the common configuration. Other modules can then extend the trait. Typical shared settings include scalaVersion, scalacOptions, Tests and ScalafmtModule.
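One possible shape for such a trait is sketched here. The names CommonModule, foo and bar are illustrative assumptions.

```scala
// build.sc (sketch; trait and module names are placeholders)
import mill._, scalalib._

// Shared settings live in one trait
trait CommonModule extends ScalaModule {
  def scalaVersion = "2.13.1"
  def scalacOptions = Seq("-deprecation")

  // Every module extending the trait gets its own test module
  object test extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1")
    def testFrameworks = Seq("utest.runner.Framework")
  }
}

object foo extends CommonModule

object bar extends CommonModule {
  // Module-specific settings are added on top of the shared ones
  def moduleDeps = Seq(foo)
}
```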

There are two ways to define global configuration. For interactive mode, use ~/.mill/ammonite/predef.sc; for the command line, use ~/.mill/ammonite/predefScript.sc. You can symlink one to the other if they are the same.

3.7 Custom

Define new cached tasks, called targets, using the T {...} syntax. Targets cannot take parameters. The return type of a target has to be JSON-serializable (using uPickle). Mill provides the T.ctx.dest folder as scratch space or to store files.

Define un-cached tasks, called commands, using the T.command {...} syntax.

def lineCount = T {
  foo
    .sources()
    .flatMap(ref => os.walk(ref.path))
    .filter(os.isFile(_))
    .flatMap(os.read.lines(_))
    .length
}

def printLineCount() = T.command {
  println(lineCount())
}

Use object myGroup extends Module {...} to create a custom module that groups other modules. Use trait mySpecial extends Module {...} to define a parent for other modules.

Redefine tasks to override them and use super to call the original task. In Mill build files, the override keyword is optional.
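A short sketch of overriding with super, assuming the module foo from the earlier examples; the extra compiler flag and the printed message are illustrative:

```scala
// build.sc (sketch)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"

  // Keep the inherited flags and append one more
  def scalacOptions = T { super.scalacOptions() ++ Seq("-Xlint") }

  // Wrap the original compile task with extra behaviour
  def compile = T {
    println("Compiling foo...")
    super.compile()
  }
}
```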

Override unmanagedClasspath to define unmanaged Jars. It can use mill.modules.Util.download to download non-Maven Jars.

def unmanagedClasspath = T {
  if (!ammonite.ops.exists(millSourcePath / "lib")) Agg()
  else Agg.from(ammonite.ops.ls(millSourcePath / "lib").map(PathRef(_)))
}

Redefine main class by def mainClass = Some("foo.bar.Baz").

To merge or exclude files from the assembly, redefine assemblyRules with Rule.Append, Rule.AppendPattern and Rule.ExcludePattern.
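The main class and assembly rules might be set together as in the sketch below. The file names and patterns are illustrative assumptions; the Rule types come from mill.modules.Assembly.

```scala
// build.sc (sketch; paths and patterns are placeholders)
import mill._, scalalib._
import mill.modules.Assembly._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"

  // Entry point used by run and recorded in the assembly
  def mainClass = Some("foo.bar.Baz")

  def assemblyRules = Seq(
    // Concatenate every application.conf found on the classpath
    Rule.Append("application.conf"),
    // Concatenate all files whose path matches the pattern
    Rule.AppendPattern(".*\\.conf"),
    // Leave matching files out of the final jar
    Rule.ExcludePattern(".*\\.temp")
  )
}
```

Redefining assemblyRules this way replaces the defaults rather than extending them, so any default rule you still need has to be listed again.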

4 Task Graph

A task is the core abstraction that Mill uses to define, order and cache the work it needs to do. Each task has a list of inputs and a single output of type T. There are three primary kinds of tasks:

  • Target T {...}
  • Sources T.sources {...}
  • Command T.command {...}

4.1 Target

The code def allSources = T { os.walk(sourceRoot().path).map(PathRef(_)) } defines a target named allSources that depends on the sourceRoot task. The return value of targets has to be JSON-serializable via uPickle. Use the following code to make a case class JSON-serializable:

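case class MyCaseClass(i: Int, s: String) // illustrative fields
object MyCaseClass {
  // Derive a uPickle ReadWriter so the value can be serialized to JSON
  implicit def rw: upickle.default.ReadWriter[MyCaseClass] = upickle.default.macroRW
}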

PathRef wraps an os.Path, and the result is JSON-serializable. PathRef has a .hashCode() method that hashes all the input files, which lets Mill track input file changes. Mill creates a path on disk for each target, as scratch space and to store its output files.

  • The out/targetName/meta.json file has the returned metadata.
  • The out/targetName/dest/ has the output files.
  • The out/targetName/log has the logging output.

The graph of inter-dependent targets is evaluated in topological order. When an upstream dependency fails, evaluation stops and the downstream targets are not evaluated.

4.2 Sources

The following code defines one Sources task:

def sourceRootPath = os.pwd / 'src
def sourceRoots = T.sources { sourceRootPath }

Sources is defined using T.sources {...}, taking one-or-more os.Path arguments. The Sources is a subclass of Target[Seq[PathRef]]. Its hash code depends on both the path and MD5 hash of the filesystem tree under that path.

T.sources also takes a Seq[PathRef], which makes it possible to build one Sources task from several others. The following is an example:

def additionalSources = T.sources { os.pwd / 'additionalSources }
def sourceRoots = T.sources { super.sourceRoots() ++ additionalSources() }

4.3 Command

Use T.command {...} to define a command that can run arbitrary code. Commands can only be defined directly within a Module body. Following is an example:

def run(mainClsName: String) = T.command {
  os.proc('java, "-cp", classFiles().path, mainClsName).call()
}

4.4 Task Context API

Task context APIs are available within the body of a T {...} or T.command {...}. They include T.ctx.dest (out/classFiles/dest or out/run/dest, one per task), T.ctx.log (out/classFiles/log or out/run/log, one per task) and T.ctx.env for environment variables. Mill starts a long-lived JVM server to avoid repeated classloading, so System.getenv may not yield up-to-date environment variables. Use def envVar = T.input { T.ctx.env.get("ENV_VAR") } to read an environment variable.
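The context APIs can be seen together in a small sketch. The task name greeting, the file name and the fallback value are hypothetical; ENV_VAR is the variable name used above.

```scala
// build.sc (sketch; task and file names are placeholders)
import mill._

// Re-evaluated on every run, so a changed ENV_VAR is picked up
def envVar = T.input { T.ctx().env.get("ENV_VAR") }

def greeting = T {
  // Scratch/output folder of this task, i.e. out/greeting/dest
  val out = T.ctx().dest / "greeting.txt"
  os.write(out, s"Hello, ${envVar().getOrElse("world")}", createFolders = true)

  // Goes to the console and to out/greeting/log
  T.ctx().log.info(s"wrote $out")

  // Returning a PathRef makes the result JSON-serializable
  PathRef(out)
}
```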

4.5 Other Tasks

  • Use T.task {...} to define anonymous tasks that are shared by other tasks. They are not runnable from the command line. They can take arguments, their output does not need to be JSON-serializable, and the output is not cached.
  • Use T.persistent {...} to define a target whose dest folder is not cleared between runs, mostly for incremental compilers such as Zinc or WebPack. Otherwise, it is identical to a Target.
  • Use T.input {...} to define a task that is re-evaluated on every run and can contain arbitrary code. It is a generalization of Sources, for cases where re-evaluation must be enforced on each run.
  • Use T.worker {...} to keep the return value in memory between evaluations.
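The task kinds in this list can be sketched in one build file. All task names and bodies below are hypothetical and only meant to show the shape of each definition.

```scala
// build.sc (sketch; names and bodies are placeholders)
import mill._

// Anonymous task: takes an argument, cannot be run from the command line
def repeat(times: Int) = T.task { "ab" * times }

// A target that uses the anonymous task
def banner = T { repeat(3)() }

// Persistent target: files in dest survive between evaluations,
// so this counts how often the task body has actually run
def evalCount = T.persistent {
  val file = T.ctx().dest / "count.txt"
  val n = if (os.exists(file)) os.read(file).trim.toInt + 1 else 1
  os.write.over(file, n.toString, createFolders = true)
  n
}

// Worker: the returned object stays in memory between evaluations
def sharedCache = T.worker {
  new java.util.concurrent.ConcurrentHashMap[String, String]()
}
```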

4.6 Summary of Tasks

                                 Target  Command  Source/Input  Anonymous Task  Persistent Target  Worker
Cached on Disk                   X       X        -             -               X                  -
Must be JSON Writable            X       X        -             -               X                  -
Must be JSON Readable            X       -        -             -               X                  -
Runnable from the Command Line   X       X        -             -               X                  -
Can Take Arguments               -       X        -             X               -                  -
Cached between Evaluations       -       -        -             -               -                  X

5 Modules

A Mill module is an object extending mill.Module. It is used to group related tasks. Mill comes with several pre-defined modules. Each module has a folder with the same name as the module. Modules can be nested, and the folder of a nested module is nested in the folder of its parent. The nested folders form the path that is used to run module tasks.

Use object foo extends ScalaModule {...} to define a Scala module named foo. Multiple modules can be defined. Inside a module, use def moduleDeps = Seq(foo, bar) to declare modules foo and bar as its dependencies.
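A minimal sketch of three modules, where a hypothetical app module depends on foo and bar:

```scala
// build.sc (sketch; "app" is an illustrative module name)
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"
}

object bar extends ScalaModule {
  def scalaVersion = "2.13.1"
}

object app extends ScalaModule {
  def scalaVersion = "2.13.1"
  // Code in app/src can use classes from foo/src and bar/src
  def moduleDeps = Seq(foo, bar)
}
```

Running mill app.compile compiles foo and bar first, because they are upstream of app in the task graph.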

To publish to Maven Central, you can extend PublishModule, as in object foo extends ScalaModule with PublishModule {...}. Then you can publish using the mill foo.publish task.
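A publishable module also needs a version and POM metadata. The sketch below assumes a hypothetical organization, repository and developer; replace every value with your own.

```scala
// build.sc (sketch; all metadata values are placeholders)
import mill._, scalalib._, publish._

object foo extends ScalaModule with PublishModule {
  def scalaVersion = "2.13.1"

  // Version of the published artifact
  def publishVersion = "0.0.1"

  // Metadata written into the generated pom.xml
  def pomSettings = PomSettings(
    description = "Example library",
    organization = "com.example",
    url = "https://github.com/example/foo",
    licenses = Seq(License.MIT),
    versionControl = VersionControl.github("example", "foo"),
    developers = Seq(
      Developer("example", "Example Developer", "https://github.com/example")
    )
  )
}
```

Publishing to Maven Central additionally requires Sonatype credentials and signing keys, which are outside the scope of this sketch.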

Use trait to group common collections of tasks. The built-in mill.scalalib package uses traits to define ScalaModule, TestScalaModule and SbtModule, all of which contain a set of operations such as compile, jar or assembly.

Each module has a millSourcePath field that is the path of its input source files. You can override it, for example def millSourcePath = super.millSourcePath / "mysource". The value is propagated to the child modules.

ExternalModule is used to provide a library for use with Mill that is shared by the entire build.

Mill can load other projects from external or sub folders using Ammonite’s $file import.

Mill handles cross-building via the Cross[T] module.
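As a sketch, cross-building one module against two Scala versions could look like this in the Mill version used in these notes. The class name FooModule and the version list are illustrative.

```scala
// build.sc (sketch; class name and versions are placeholders)
import mill._, scalalib._

// One module instance is created per listed Scala version
object foo extends Cross[FooModule]("2.12.10", "2.13.1")

class FooModule(val crossScalaVersion: String) extends CrossScalaModule
```

A single variant is then addressed with square brackets, for example mill foo[2.13.1].compile, and mill foo[_].compile runs the task for every variant.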

6 Extending Mill

  • custom targets and commands.
  • custom workers
  • custom module
  • import $file
  • import $ivy to pull in artifacts from public repositories

7 Mill Internals

Mill’s most important abstraction is the dependency graph of tasks.

The module hierarchy is the graph of objects, starting from the root of the build.sc file, that extend mill.Module. The leaves of the hierarchy are the Targets you can run. A task's position in the module hierarchy determines its cache output folder, output folder, source folder, and its path on the command line and in code.

Tasks are Applicative, not Monadic. That means tasks support map and zip, but not flatMap. This allows Mill to build a static dependency graph: it can work out the structure of the entire graph without executing any task. Mill uses the T macro so that you do not have to write map and zip calls by hand. The T {...} macro creates the call graph that determines the dependencies, the run order, which tasks can run in parallel, and which files to watch.
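The difference can be illustrated with a toy model in plain Scala. This is not Mill's real Task type, only a sketch under the assumption that a task exposes map and zip: because there is no flatMap, every node knows its inputs before anything runs, so the graph can be inspected statically.

```scala
object TaskGraphDemo {
  // A toy task type (NOT Mill's implementation): every node lists its inputs.
  sealed trait Task[A] {
    def inputs: Seq[Task[_]]
    def run(): A
    def map[B](f: A => B): Task[B] = Mapped(this, f)
    def zip[B](that: Task[B]): Task[(A, B)] = Zipped(this, that)
  }

  final case class Leaf[A](name: String, body: () => A) extends Task[A] {
    def inputs: Seq[Task[_]] = Nil
    def run(): A = body()
  }

  final case class Mapped[A, B](src: Task[A], f: A => B) extends Task[B] {
    def inputs: Seq[Task[_]] = Seq(src)
    def run(): B = f(src.run())
  }

  final case class Zipped[A, B](left: Task[A], right: Task[B]) extends Task[(A, B)] {
    def inputs: Seq[Task[_]] = Seq(left, right)
    def run(): (A, B) = (left.run(), right.run())
  }

  // Walk the graph and collect the leaf names without running any task body.
  def leafNames(task: Task[_]): Seq[String] = task match {
    case Leaf(name, _) => Seq(name)
    case other         => other.inputs.flatMap(leafNames)
  }

  // Counts how many leaf bodies have been executed so far.
  var executed = 0
  val b: Task[Int] = Leaf("b", () => { executed += 1; 1 })
  val c: Task[Int] = Leaf("c", () => { executed += 1; 2 })

  // a depends on b and c; the dependency is visible before anything runs.
  val a: Task[Int] = b.zip(c).map { case (x, y) => x + y }
}
```

With a flatMap, the task to run next could depend on the value produced by the previous one, so the graph would only be known after execution had started.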

Mill operates on three concepts: the object hierarchy, the call graph, and instantiating traits and classes.

7.1 Object Hierarchy

The module hierarchy is the graph of objects that extend mill.Module, starting from the root of the build.sc file. The leaves are runnable targets or commands. The hierarchy is used consistently by task definitions, the command line and the input/output/cache structures.

7.2 The Call Graph

The call graph is reified via the T {...} macro. The call graph determines the task dependencies, source file watching, running order and parallelism. It simplifies the declaration of task dependencies.

Instead of the regular Scala code:

val b = ...
val c = ...
val d = ...
val a = f(b, c, d)

Use the macro to make the call graph easy to build:

def b = T { ... }
def c = T { ... }
def d = T { ... }
def a = T { f(b(), c(), d()) }

7.3 Instantiating Traits and Classes

Mill prefers traits for re-using the common parts of a build.