This is a study note on Mill, a new Scala/Java build tool. Its motivation is to combine functional programming concepts with good ideas from Bazel to make builds easy.
1 Getting Started
1.1 Installation
To build a Scala project, create a build.sc file:
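A minimal sketch of such a build file, assuming Mill 0.6.x and a single module named hi (the module name and the Scala version are illustrative):

```scala
// build.sc
// Load the Bloop contrib plugin that matches the running Mill version.
import $ivy.`com.lihaoyi::mill-contrib-bloop:$MILL_VERSION`
import mill._, scalalib._

// The object name doubles as the module's folder name (hi/).
object hi extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version
}
```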
The first line, import $ivy.`com.lihaoyi::mill-contrib-bloop:$MILL_VERSION`, imports the version of the Bloop plugin that matches the project's Mill version. This setting is important for Bloop to work properly in Visual Studio Code.
To add the Mill bootstrap script to a codebase, download it into the project folder, name it mill, and make it executable. On a Linux or macOS machine: curl -L https://github.com/lihaoyi/mill/releases/download/0.6.1/0.6.1 > mill && chmod +x mill
Then create a hi/src/Hi.scala file with the following content:
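Any small program with a main entry point will do here. A minimal sketch (the printed message is illustrative):

```scala
// hi/src/Hi.scala
object Hi extends App {
  println("Hi from Mill")
}
```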
Run the code using mill hi.run. A non-trivial project depends on Ivy packages, so its build.sc also declares them as dependencies. The following is a build file for a sample Cask application:
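A sketch of what such a build file could look like. The module name app and the Cask version are assumptions; substitute the release you actually use:

```scala
// build.sc
import mill._, scalalib._

object app extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // "::" appends the Scala binary version to the artifact name.
  def ivyDeps = Agg(
    ivy"com.lihaoyi::cask:0.5.2" // assumed version
  )
}
```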
1.2 Mill Version
Once Mill is installed, it can run a project with any Mill version. To pin a version, create a .mill-version file in the project root folder. It takes precedence over the version of the mill script itself.
To override the Mill version manually, pass a MILL_VERSION environment variable on the command line. For example: MILL_VERSION=0.6.1 mill __.compile
All three methods (the mill script, the .mill-version file and the MILL_VERSION variable) also accept a development version of the form #.#.#-n-hash.
1.3 Scala Project Structure
Mill expects the following project structure:
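A minimal sketch of a one-module build and the layout it implies. The source file name is hypothetical:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version
}

// Layout on disk, relative to the project root:
//
// build.sc
// foo/
//   src/
//     Example.scala   (hypothetical source file)
//   resources/
// out/
//   foo/
//     ...             (one subfolder per task)
```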
The code object foo extends ScalaModule {...} defines a module, and the object name foo is also the name of the module's root folder. The foo folder sits in the same folder as the build.sc file. The build output goes to out/foo/.
The Scala source files are in foo/src/.
The out/ folder has a subfolder for each module. In each module's folder, there is a subfolder for each task. Each task's folder has the following contents:
- dest/: a path used as scratch space or for generated files (binary blobs), which are referenced through PathRef.
- log: the output of stdout and stderr. It is the same as the console output during execution.
- meta.json: the metadata/output returned by the task. It includes the cache key and the JSON-serialized return value of the target or command.
Additionally, the out/ folder has some top-level build-related files and folders:
- mill-profile.json logs the tasks, the time used and the cached flags for the last executed Mill command.
- mill-worker-somerandomcharacters/ contains some runtime data.
- mill/ contains internal tools. For example, Mill uses sbt Zinc as the incremental compiler for Scala.
- resolve, show, plan etc. hold the results of the Mill command-line tools.
2 Basic Usage
2.1 Common Tasks
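Some common commands are mill foo.compile (compile the sources), mill foo.run (run the main method), mill foo.test (run the tests), mill foo.jar (package the class files into a jar) and mill foo.assembly (build an executable jar that bundles the dependencies).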
The first time you run a command, several warning messages appear, such as WARNING: An illegal reflective access operation has occurred. They are generated by an old version of com.google.protobuf.UnsafeUtil and can be safely ignored.
The name after the module name is a task. Tasks can be cached targets such as foo.compile or uncached commands such as foo.run.
2.2 Command-line Tools
Mill has the following command-line tools: all, clean, inspect, path, plan, resolve, show, shutdown, version, visualize, and visualizePlan.
On the command line, _ is a wildcard that matches any single module or task name in the specified scope. At the top level, it represents all top-level modules. __ matches all modules, both top-level and nested. For example, mill resolve _ lists what is available at the top level, and mill __.compile compiles every module.
mill path prints out a dependency chain between the first task and the second.
mill show visualize takes a subset of the Mill build graph and draws out the relationships in .svg and .png form for you to inspect. It also generates .txt, .dot and .json files for easy processing by downstream tools. Another use case is to view the relationships between modules.
mill show visualizePlan is similar to mill show visualize, except that it shows a graph of the entire build plan, including tasks not directly resolved by the query. Tasks directly resolved are shown with a solid border, and dependencies are shown with a dotted border.
mill clean deletes all the cached outputs of previously executed tasks. It can apply to the entire project, to entire modules, or to specific tasks. mill clean foo.runBackground stops the background process and cleans its output.
mill mill.scalalib.Dependency/updates searches for dependency updates.
3 Configure Mill
3.1 Compilation & Execution Flags
- scalaVersion: the Scala version, String
- scalacOptions: scalac flags, Seq[String]
- forkArgs: JVM flags for the run and test tasks, which execute in a subprocess, Seq[String]
- forkEnv: environment variables for run, Map[String, String]
Use runLocal or testLocal to run the application in-process, without using forkArgs and forkEnv.
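The settings above could be combined as follows. This is a sketch; the flag values and the environment variable name are illustrative:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1"                          // assumed version
  def scalacOptions = Seq("-deprecation", "-feature")  // passed to scalac
  def forkArgs = Seq("-Xmx2g")                         // JVM flags for the forked process
  def forkEnv = Map("HELLO_MY_ENV_VAR" -> "WORLD")     // environment for the forked process
}
```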
3.2 Ivy Dependencies
ivyDeps has a value of type Agg[Dep], created from the Agg object and a set of ivy strings. Agg[Dep] is a set of Dep values, where Dep is the type of each dependent artifact.
Use :: for Scala dependencies, so that the Scala binary version is appended to the artifact name. Use ::: for dependencies cross-published against the full Scala version, such as 2.13.1 instead of 2.13.
Use : for Java dependencies. If Mill cannot generate the correct artifact name and version, look up the exact artifact name and version in the repository and write the full name separated by :.
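The three separators described above can be sketched in one ivyDeps definition. The artifacts and versions are examples only and may need adjusting to releases that exist for your Scala version:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  def ivyDeps = Agg(
    ivy"com.lihaoyi::upickle:0.9.5",         // "::"  Scala dependency, binary version (2.13) appended
    ivy"com.lihaoyi:::ammonite:2.0.4",       // ":::" cross-published against the full version (2.13.1)
    ivy"com.google.guava:guava:28.2-jre"     // ":"   plain Java dependency
  )
}
```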
To add repositories:
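A sketch that appends one extra Maven repository to the defaults. The repository URL is an example:

```scala
// build.sc
import mill._, scalalib._
import coursier.maven.MavenRepository

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // Keep the default repositories and add one more.
  def repositories = super.repositories ++ Seq(
    MavenRepository("https://oss.sonatype.org/content/repositories/releases")
  )
}
```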
Create a custom ZincWorkerModule to add a custom resolver.
3.3 Test Suite
Add a nested module extending Tests:
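A minimal sketch using uTest as the test framework. The uTest version is an assumption:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // Test sources live in foo/test/src/.
  object test extends Tests {
    def ivyDeps = Agg(ivy"com.lihaoyi::utest:0.7.1") // assumed version
    def testFrameworks = Seq("utest.runner.Framework")
  }
}
```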
To run a specific test case using uTest, pass it as an argument, e.g. mill foo.test foo.MyTestSuite.myCase. You can define more than one test module, each with its own folder.
3.4 Scala Compiler Plugin
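scalacPluginIvyDeps loads the plugin into the compiler. When the plugin's artifact is also needed on the compilation classpath, it must be listed in compileIvyDeps as well, therefore both are set.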
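A sketch that sets both, using the acyclic plugin as an example. The plugin choice and the versions are assumptions:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.12.4" // assumed version that the plugin is published for

  // On the compile classpath only (not needed at runtime).
  def compileIvyDeps = Agg(ivy"com.lihaoyi::acyclic:0.1.7")
  // Loaded into scalac as a compiler plugin.
  def scalacPluginIvyDeps = Agg(ivy"com.lihaoyi::acyclic:0.1.7")
  def scalacOptions = Seq("-P:acyclic:force")
}
```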
3.5 Code Format
Extend mill.scalalib.scalafmt.ScalafmtModule, then run mill foo.reformat to format the code and mill foo.checkFormat to check its format.
Reformat a project's code globally with the mill mill.scalalib.scalafmt.ScalafmtModule/reformatAll __.sources command, or only check the code's format with mill mill.scalalib.scalafmt.ScalafmtModule/checkFormatAll __.sources.
Create a .scalafmt.conf file at the project root to configure formatting.
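A minimal sketch of a module that mixes in ScalafmtModule:

```scala
// build.sc
import mill._, scalalib._, scalafmt._

// Mixing in ScalafmtModule adds the reformat and checkFormat tasks to foo.
object foo extends ScalaModule with ScalafmtModule {
  def scalaVersion = "2.13.1" // assumed version
}
```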
3.6 Common and Global Configuration
You can define a trait that extends ScalaModule and contains the common configuration. Other modules can then extend the trait. Typical shared items include scalaVersion, scalacOptions, Tests and ScalafmtModule.
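A sketch of a shared trait reused by two modules. The names CommonModule, foo and bar are hypothetical:

```scala
// build.sc
import mill._, scalalib._

// Settings shared by every module in the build.
trait CommonModule extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version
  def scalacOptions = Seq("-deprecation", "-feature")
}

object foo extends CommonModule

object bar extends CommonModule {
  def moduleDeps = Seq(foo) // bar compiles against foo
}
```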
There are two ways to define global configuration. For interactive mode, use ~/.mill/ammonite/predef.sc; for command-line mode, use ~/.mill/ammonite/predefScript.sc. You can create a symlink if the two are the same.
3.7 Custom
Define new cached tasks, called targets, using the T {...} syntax. Targets cannot take parameters. The return type of a target has to be JSON-serializable (using uPickle). Mill provides the T.ctx.dest folder as scratch space or to store files.
Define uncached tasks, called commands, using the T.command {...} syntax.
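A sketch of one custom target and one custom command. The names lineCount and printLineCount are hypothetical:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // Target: cached, re-evaluated only when the source files change.
  // The Int result is JSON-serializable.
  def lineCount = T {
    allSourceFiles().map(ref => os.read.lines(ref.path).size).sum
  }

  // Command: never cached, runs every time it is invoked.
  def printLineCount() = T.command {
    println(s"Lines of code: ${lineCount()}")
  }
}
```

With this in place, mill show foo.lineCount prints the cached value and mill foo.printLineCount runs the command.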
Use object myGroup extends Module {...} to create a custom module that groups other modules. Use trait mySpecial extends Module {...} as a parent of other modules.
Redefine tasks to override them, and use super to call the original task. In Mill build files, the override keyword is optional.
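A sketch of overriding a built-in task while still delegating to the original implementation:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // Overrides ScalaModule.compile; the override keyword is omitted.
  def compile = T {
    println("Compiling...")
    super.compile()
  }
}
```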
Override unmanagedClasspath to define unmanaged jars. It can use mill.modules.Util.download to download non-Maven jars.
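A sketch that picks up every jar in a lib folder next to the module sources. The folder name lib is an assumption:

```scala
// build.sc
import mill._, scalalib._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version

  // Put everything found in foo/lib/ on the classpath.
  def unmanagedClasspath = T {
    if (!os.exists(millSourcePath / "lib")) Agg()
    else Agg.from(os.list(millSourcePath / "lib").map(PathRef(_)))
  }
}
```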
Redefine the main class with def mainClass = Some("foo.bar.Baz").
To merge or exclude files from the assembly, redefine assemblyRules with Rule.Append, Rule.AppendPattern and Rule.ExcludePattern.
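A sketch that sets both options, assuming the rules live in mill.modules.Assembly as in Mill 0.6.x. The file names and patterns are illustrative:

```scala
// build.sc
import mill._, scalalib._
import mill.modules.Assembly._

object foo extends ScalaModule {
  def scalaVersion = "2.13.1" // assumed version
  def mainClass = Some("foo.bar.Baz")

  // Note: this replaces the default rules rather than extending them.
  def assemblyRules = Seq(
    Rule.Append("application.conf"),   // concatenate all files with this exact name
    Rule.AppendPattern(".*\\.conf"),   // concatenate all files matching the regex
    Rule.ExcludePattern(".*\\.temp")   // drop all files matching the regex
  )
}
```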
4 Task Graph
The task is the core abstraction that Mill uses to define, order and cache the work it needs to do. Each task has a list of inputs and a single output of type T. There are three primary kinds of tasks:
- Target: T {...}
- Sources: T.sources {...}
- Command: T.command {...}
4.1 Target
The code def allSources = T { os.walk(sourceRoot().path).map(PathRef(_)) } defines a target named allSources that depends on the sourceRoot task. The return value of a target has to be JSON-serializable via uPickle. Use the following code to make a case class JSON-serializable:
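A sketch using uPickle's macroRW to derive the serializer. The case class ClassFileData and its fields are hypothetical:

```scala
import upickle.default.{ReadWriter => RW, macroRW}

case class ClassFileData(totalFileSize: Long, largestFile: String)

object ClassFileData {
  // Derives both the reader and the writer; Mill finds it through the companion object.
  implicit val rw: RW[ClassFileData] = macroRW
}
```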
PathRef wraps an os.Path, and the result is JSON-serializable. PathRef has a .hashCode() method that hashes all the input files, so that Mill can track input file changes. Mill creates a path on disk as scratch space and to store the output files of each target.
- The out/targetName/meta.json file has the returned metadata.
- The out/targetName/dest/ folder has the output files.
- The out/targetName/log file has the logging output.
The graph of inter-dependent targets is evaluated in topological order. When an upstream dependency fails, its downstream targets are not evaluated.
4.2 Sources
The following code defines one Sources task:
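A minimal sketch at the top level of a build.sc. The task name and the folder are illustrative:

```scala
// build.sc
import mill._

// Watches the src/ folder under the project root.
def sourceRoots = T.sources { millSourcePath / "src" }
```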
Sources is defined using T.sources {...}, taking one or more os.Path arguments. Sources is a subclass of Target[Seq[PathRef]]. Its hash code depends on both the path and the MD5 hash of the filesystem tree under that path.
T.sources also takes a Seq[PathRef] to create a Sources task from multiple Sources tasks. The following is an example:
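A sketch that merges two Sources tasks into a third. All task and folder names are hypothetical:

```scala
// build.sc
import mill._

def mainSources = T.sources { millSourcePath / "src" }
def additionalSources = T.sources { millSourcePath / "additionalSources" }

// Built from the Seq[PathRef] values of the two tasks above.
def allSourceRoots = T.sources { mainSources() ++ additionalSources() }
```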
4.3 Command
Use T.command {...} to define a command that can run arbitrary code. Commands can only be defined directly within a Module body. The following is an example:
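A sketch of a command that takes an argument and depends on a target. The names greeting and greet are hypothetical:

```scala
// build.sc
import mill._

def greeting = T { "Hello" }

// Unlike a target, a command can take parameters from the command line.
def greet(name: String) = T.command {
  println(s"${greeting()}, $name")
}
```

It could then be invoked as mill greet World.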
4.4 Task Context API
Task context APIs are available within the body of a T {...} or T.command {...}. They include T.ctx.dest (out/classFiles/dest or out/run/dest, one folder per task), T.ctx.log (out/classFiles/log or out/run/log, one file per task) and T.ctx.env for environment variables. Mill starts a long-lived JVM server to avoid repeated class loading. Therefore System.getenv may not yield up-to-date environment variables. Use def envVar = T.input { T.ctx.env.get("ENV_VAR") } to read an environment variable.
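A sketch that touches all three context APIs. It uses the T.ctx() spelling, which is an assumption about the Mill version in use; the task and file names are hypothetical:

```scala
// build.sc
import mill._

// Re-read on every run, so changes to the variable are picked up.
def envVar = T.input { T.ctx().env.get("ENV_VAR") }

def report = T {
  val out = T.ctx().dest / "report.txt"   // scratch folder of this task
  os.write.over(out, s"ENV_VAR=${envVar().getOrElse("<unset>")}")
  T.ctx().log.info(s"wrote $out")         // goes to the task's log
  PathRef(out)
}
```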
4.5 Other Tasks
- Use T.task {...} to define anonymous tasks that are shared by other tasks. They are not runnable from the command line. They can take arguments, their output doesn't need to be JSON-serializable, and the output is not cached.
- Use T.persistent {...} to define a task whose dest folder is not cleared between runs, mostly for incremental compilers such as Zinc or WebPack. Otherwise, it is identical to a Target.
- Use T.input {...} to define a task that is re-evaluated every time. It is a generalization of Sources and can contain arbitrary code. It is used in cases where re-evaluation must be enforced on each run.
- Use T.worker {...} to keep the return value in memory between evaluations.
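The task kinds above can be sketched together as follows. All names are hypothetical, and T.ctx() is an assumption about the Mill version in use:

```scala
// build.sc
import mill._

// Anonymous task: takes arguments, is not cached, not runnable from the CLI.
def repeat(s: String, n: Int) = T.task { s * n }

// A target that uses the anonymous task.
def banner = T { repeat("=", 20)() }

// Persistent target: its dest folder survives between runs.
def runCount = T.persistent {
  val marker = T.ctx().dest / "runs.txt"
  val runs = if (os.exists(marker)) os.read(marker).trim.toInt + 1 else 1
  os.write.over(marker, runs.toString)
  runs
}

// Worker: the returned object stays in memory between evaluations.
def counter = T.worker { new java.util.concurrent.atomic.AtomicInteger(0) }
```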
4.6 Summary of Tasks
| | Target | Command | Source/Input | Anonymous Task | Persistent Target | Worker |
|---|---|---|---|---|---|---|
| Cached on Disk | X | X | | | X | |
| Must be JSON Writable | X | X | | | X | |
| Must be JSON Readable | X | | | | X | |
| Runnable from the Command Line | X | X | | | X | |
| Can Take Arguments | | X | | X | | |
| Cached between Evaluations | | | | | | X |
5 Modules
A Mill module is an object extending mill.Module. It is used to group related tasks. Mill comes with several pre-defined modules. Each module has a folder with the same name as the module. Modules can be nested, and the folder of a nested module is nested in the folder of its parent. The nested folders form a path that is used to run module tasks.
Use object foo extends ScalaModule {...} to define a Scala module named foo. Multiple modules can be defined. Inside a module, use def moduleDeps = Seq(foo, bar) to declare modules foo and bar as its dependencies.
To publish to Maven Central, you can extend PublishModule, as in object foo extends ScalaModule with PublishModule {...}. Then you can publish using the mill foo.publish task.
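A sketch of the settings a PublishModule needs. Every value in pomSettings is a placeholder:

```scala
// build.sc
import mill._, scalalib._, publish._

object foo extends ScalaModule with PublishModule {
  def scalaVersion = "2.13.1" // assumed version
  def publishVersion = "0.0.1"

  // Placeholder metadata; replace with your own project details.
  def pomSettings = PomSettings(
    description = "Hello",
    organization = "com.example",
    url = "https://github.com/example/foo",
    licenses = Seq(License.MIT),
    versionControl = VersionControl.github("example", "foo"),
    developers = Seq(
      Developer("example", "Example Author", "https://github.com/example")
    )
  )
}
```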
Use traits to group common collections of tasks. The built-in mill.scalalib package uses traits to define ScalaModule, TestScalaModule and SbtModule, all of which contain a set of operations such as compile, jar or assembly.
Each module has a millSourcePath field that is the path of its input source files. You can override it, for example def millSourcePath = super.millSourcePath / "mysource". The value is propagated to the child modules.
ExternalModule is used to provide a library for use with Mill that is shared by the entire build.
Mill can load other projects from external folders or subfolders using Ammonite's $file import.
Mill handles cross-building via the Cross[T] module.
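A sketch of cross-building one module against two Scala versions. The module names and the versions are assumptions:

```scala
// build.sc
import mill._, scalalib._

// One FooModule instance is created per listed version.
object foo extends Cross[FooModule]("2.12.10", "2.13.1")

class FooModule(val crossScalaVersion: String) extends CrossScalaModule
```

A single variant can then be addressed as mill foo[2.13.1].compile.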
6 Extending Mill
- custom targets and commands
- custom workers
- custom modules
- import $file
- import $ivy to pull in artifacts from public repositories
7 Mill Internals
Mill’s most important abstraction is the dependency graph of tasks.
The module hierarchy is the graph of objects, starting from the root of the build.sc file, that extend mill.Module. The leaves of the hierarchy are the Targets you can run. A module's position in the hierarchy determines its cache and output folders, its source folder, and the path used to run it from the command line and from code.
Tasks are Applicative, not Monadic. That means tasks support map and zip, but not flatMap. This allows Mill to build a static dependency graph, discovering the structure of the entire graph without executing any task. Mill uses the T macro so that you do not have to write the map and zip calls yourself. The T {...} macro creates a call graph that determines the dependencies, the run order, the possible parallelism and the files to watch.
Mill operates on three concepts: the object hierarchy, the call graph, and instantiating traits and classes.
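The applicative restriction described above can be illustrated with a toy model in plain Scala. This is not Mill's real Task type; every name here is hypothetical. Because tasks only combine through map and zip, the inputs of each task are known before anything runs:

```scala
// Toy model for illustration only; this is not Mill's real Task type.
sealed trait Task[A] {
  def inputs: Seq[Task[_]] // upstream tasks, known without running anything
  def run(): A
  def map[B](f: A => B): Task[B] = Mapped(this, f)
  def zip[B](that: Task[B]): Task[(A, B)] = Zipped(this, that)
}
final case class Leaf[A](name: String, value: () => A) extends Task[A] {
  def inputs: Seq[Task[_]] = Nil
  def run(): A = value()
}
final case class Mapped[A, B](src: Task[A], f: A => B) extends Task[B] {
  def inputs: Seq[Task[_]] = Seq(src)
  def run(): B = f(src.run())
}
final case class Zipped[A, B](left: Task[A], right: Task[B]) extends Task[(A, B)] {
  def inputs: Seq[Task[_]] = Seq(left, right)
  def run(): (A, B) = (left.run(), right.run())
}

object ApplicativeDemo extends App {
  val sources   = Leaf("sources", () => Seq("A.scala", "B.scala"))
  val classpath = Leaf("classpath", () => Seq("lib.jar"))
  val compile = sources.zip(classpath).map { case (srcs, cp) =>
    s"compiled ${srcs.size} files against ${cp.size} jars"
  }

  // Walks the graph statically: no task body is executed here.
  def leafNames(t: Task[_]): Seq[String] = t match {
    case Leaf(name, _) => Seq(name)
    case other         => other.inputs.flatMap(leafNames)
  }

  println(leafNames(compile)) // the leaf tasks that compile depends on
  println(compile.run())      // only now are the task bodies executed
}
```

A flatMap would let a task choose its upstream tasks based on a computed value, which is exactly what would prevent this kind of static inspection.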
7.1 Object Hierarchy
The module hierarchy is the graph of objects that starts from the root of the build.sc file and extends mill.Module. The leaves are runnable targets or commands. The hierarchy is used consistently by task definitions, by the command line and by the input/output/cache folder structure.
7.2 The Call Graph
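The call graph is reified via the T {...} macro. The call graph determines the task dependencies, source file watching, running order and parallelism. It simplifies the declaration of task dependencies.
In regular Scala code, each task would have to combine its upstream tasks explicitly with zip and map. With the macro, a dependency is declared simply by calling the upstream task inside the T {...} body, which makes the call graph easy to build.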
7.3 Instantiating Traits and Classes
Mill prefers to use traits for reusing common parts of a build.