About

A very short guide to GNU Make.

This guide was written to give a quickstart and pointers for using make together with Kerblam!, but can be read as a standalone document.

If you are looking for the full GNU make documentation, it can be found here. It's pretty long.

Contributing

If you find a typo or you want to change something, feel free to open an issue.

Shorthands and conventions

  • Make (or make) is the GNU Make program, while make is the command make.
  • Important notes or parenthetical elements are contained in boxes,

    Like this.

  • Keywords or definitions are bold italics.

Notes

The \t character in code blocks is forcibly rendered as four spaces. I cannot easily change this, so read the example makefiles with a pinch of salt.

Introduction

Make creates files based on recipes, known in Make jargon as rules. By default, make looks for and reads a file in the current working directory named makefile (GNU Make also accepts GNUmakefile and Makefile).

You can tell make to read another file with make -f path/to/makefile. Note that all paths in a makefile are relative to the working directory in which make was called, not where the makefile is.

A make rule looks like this:

file/to/create:
	echo "Commands to create the file" > file/to/create

Let's break it down. The first line is the rule's signature. file/to/create: tells make that this rule creates a file named create in the folder file/to/. This file is the rule's target. It's up to you to actually write commands that create this file: make does not check that the file is really created after a rule is run. This can be useful in some cases, as we will see later.

Next, we have the body of the rule (i.e. the echo ... line). It MUST be indented with a <Tab> (\t) directly underneath the rule's signature. Each line that starts with a <Tab> thereafter is part of the same rule:

target:
	command # Body starts here
	        # |
	command # |
	        # Last line of the body
# Here the rule has ended - there is no Tab

Each line in the body of a rule is a shell command that will be run when the rule is executed.

You can specify requirement files in a rule's signature, after the colon:

file/to/create: first_requirement second_requirement
	wc -l first_requirement >> file/to/create
	wc -l second_requirement >> file/to/create

This lets Make know that to create file/to/create you first need the first_requirement and second_requirement files. It's here that make becomes useful: it can "string together" different rules to create the files that we want:

output_file: intermediate_file
	wc intermediate_file > output_file

intermediate_file:
	echo "This is some words in the file" > intermediate_file

You need not write rules for all requirement files. If make finds a requirement with no rule, and the file is not already present in the filesystem, it simply fails with an error.

Make sees that output_file is the target of the first rule. It implicitly sets it to be the default target and tries to create it. It sees that it first needs to create the intermediate_file. It has a rule for it, so it executes that rule first, followed by the rule for the output_file. Done!

Internally, make creates a dependency graph of the rules, starting from the default target(s). For example, consider this makefile (rule bodies are omitted for clarity):

Target: req1 req2
	# ...

req2: sub_req_1 sub_req_2
	# ...

sub_req_1:
	# ...

It will be parsed to a tree structure like this:

           ┌─────┐               
         ┌►│req 1│               
┌──────┐ │ └─────┘    ┌─────────┐
│Target├─┤         ┌─►│sub req 1│
└──────┘ │ ┌─────┐ │  └─────────┘
         └►│req 2├─┤             
           └─────┘ │  ┌─────────┐
                   └─►│sub req 2│
                      └─────────┘

From this, make can then run the rules in the correct order: from the "leaves" of the tree up to the "root". You can also see that there is no relationship between the req 2 and req 1 branches. If you tell make to run in parallel (with the -j or --jobs flag), make will run these independent branches in parallel for you, speeding up execution by a lot. See the parallel execution section of the manual to learn more.
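As a rough sketch of what this looks like in practice, the makefile below has two independent branches. The file names and the sleep commands are made up for illustration. Running make -j 2 lets make build a.txt and b.txt at the same time, while plain make builds them one after the other:

```make
# Hypothetical example: a.txt and b.txt do not depend on each other,
# so `make -j 2` can build them at the same time.
both.txt: a.txt b.txt
	cat a.txt b.txt > both.txt

a.txt:
	sleep 2
	echo "first branch" > a.txt

b.txt:
	sleep 2
	echo "second branch" > b.txt
```

The both.txt rule still waits for both branches to finish, since it requires both files.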

Another useful feature of make is that it skips creating files that are already there and up to date: a target is rebuilt only if it is missing or older than one of its requirements. In the example above, say that sub req 2 changed, but everything else did not. Make is smart enough to only run the rule for sub req 2 (if any), then just req 2 and then Target, since that branch is out of date:

           ┌─────┐               
         ┌►│req 1│               
┌ ─ ─ ─┐ │ └─────┘    ┌─────────┐
 Target ─┤         ┌─►│sub req 1│
└─ ─ ─ ┘ │ ┌ ─ ─ ┐ │  └─────────┘
         └► req 2 ─┤             
           └ ─ ─ ┘ │  ┌ ─ ─ ─ ─ ┐
                   └─► sub req 2 
                      └ ─ ─ ─ ─ ┘
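A minimal way to watch this behaviour, assuming a data.txt file already exists next to the makefile (both file names are hypothetical):

```make
# Hypothetical file names, for illustration only.
report.txt: data.txt
	wc -l data.txt > report.txt

# Expected behaviour, assuming data.txt already exists:
#   make            -> runs wc, since report.txt is missing
#   make            -> does nothing, report.txt is newer than data.txt
#   touch data.txt
#   make            -> runs wc again, since data.txt is now newer
```

Make only compares modification times here. It does not look at the content of the files.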

You can now hopefully see why writing makefiles instead of shell scripts for complex file transformations is useful.

Variables

Make supports variable assignment. All variables in make are strings:

my_variable = this is the content

You can then reference the text in the variable by using $(VAR):

my_variable = this is the content

default:
	echo "$(my_variable)"

Variables just one character long may omit the () wrapping the variable name. E.g. $(a) is the same as $a.

To use the variable, make removes $(VAR) and puts the string stored in the variable in its place. This is important to remember: make literally substitutes the string in place of the variable, without any magic.

The above file is exactly the same as this one:

default:
	echo "this is the content"

Make first reads the whole makefile, then substitutes the variables, and then executes the rules.

You will have noticed that the $(...) syntax is also used by the shell (for command substitution) and by make itself for function calls. If you want to use shell variables or command substitutions in your rules, you will need to escape the $ by doubling it; e.g. use $$SHELL_VAR (or $${SHELL_VAR}) instead of $SHELL_VAR, and $$(command) instead of $(command). While expanding variables, make will convert $$ to $ and pass it to the shell, which then sees the canonical single $.
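A small sketch of the doubling rule (the target name is hypothetical):

```make
# Hypothetical target name, for illustration only.
# In the recipe below, make turns every "$$" into a single "$" before
# handing the line to the shell:
#   - the first echo uses the shell variable HOME,
#   - the second echo uses a shell command substitution,
#   - the loop uses the shell variable f.
show_escapes:
	echo "Home is $$HOME"
	echo "Today is $$(date)"
	for f in one two; do echo "item: $$f"; done
```

With a single $, make would try to expand these itself before the shell ever sees them, usually leaving an empty string behind.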

However, you can use environment variables directly with $(VAR): continue reading!

Make copies all shell environment variables upon starting as if they were written in the makefile. For example, this works:

path_content.txt:
	# Notice that the variable here is a make variable, since it has just
	# one $ not two.
	echo $(PATH) > path_content.txt

You can give a variable a default value with the ?= assignment, so that an environment variable with the same name can override it. This assigns the value only if the variable is not already defined:

some_var ?= my_text

default:
	echo $(some_var)

If you run make, the output will be "my_text". If you run some_var="alternative text" make, the output will be "alternative text", since the assignment in the makefile will not be made.

Try to keep your makefile variables snake_case.

What to do with variables

You can do a lot with variables. They are most commonly used to write the requirements for a rule:

files = one.txt two.txt three.txt

output: $(files)
	# ...

You can also use them to shorten long calls:

e_dir = long/path/to/executable/directory
flags = --some --default --flags --that-are --always-used

output:
	$(e_dir)/create_file $(flags) > output

This is very useful to work with stuff like Rscript:

r = Rscript --vanilla
output:
	$(r) my_r_script.R > output

You can use conditional assignment for variables that the user can override:

option ?= default_value

default:
	execute --var $(option)

The user can use the default value or alternatively export an option variable and override it. This can be useful when debugging makefiles: the default option is what you want to use for a "regular" run, but you can override it during development to get, e.g. debug information.

Automatic variables

When a recipe is run, make sets some automatic variables, so that you can write your recipes in a less verbose way.

You can read the full list of all automatic variables in the make manual.

Here are some of the most commonly used ones:

  • $@ is the target of the rule:
    path/to/target.txt: requirement.txt
    	cat requirement.txt > $@
    	# cat requirement.txt > path/to/target.txt
    
  • $< is the first requirement. Careful when using this when you have more than one requirement, as it's order specific:
    path/to/target.txt: requirement.txt
    	cat $< > $@
    	# cat requirement.txt > path/to/target.txt
    
  • $(@D) and $(@F) are the directory of the target file and the name of the target file, respectively. This is very useful to create containing folders for output files:
    path/to/target.txt: requirement.txt
    	mkdir -p $(@D)
    	# mkdir -p path/to
    	cat $< > $@
    	# cat requirement.txt > path/to/target.txt
    
    You can do the same with $(<D) and $(<F) for the first requirement file.
  • $^ is the full list of requirements, separated by spaces:
    target.txt: one.txt two.txt three.txt
    	concat_files $^
    	# concat_files one.txt two.txt three.txt
    

You may wonder how to select the N-th requirement. There is no automatic variable for each requirement, but you can use $^ to get it with the word function: $(word n, $^) where n is the 1-indexed position of the requirement you want. For example, if one two three are the requirements, $(word 2, $^) will result in the string two.
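As a short sketch (the file names are hypothetical and the three requirement files are assumed to exist):

```make
# Hypothetical file names, for illustration only.
target.txt: one.txt two.txt three.txt
	echo "first:  $<" > $@
	echo "second: $(word 2, $^)" >> $@
	echo "all:    $^" >> $@
```

Here $(word 2, $^) is replaced with two.txt before the line is passed to the shell.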

Read more about functions here or in the manual.

Special Variables

There are many special variables
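Two of them that GNU Make provides are .DEFAULT_GOAL, which selects the default target, and MAKEFILE_LIST, which holds the names of the makefiles read so far. A minimal sketch, with hypothetical target and file names:

```make
# .DEFAULT_GOAL and MAKEFILE_LIST are special variables of GNU Make.
# Target and file names are hypothetical.

# Make report.txt the default target even though it is not the first rule
.DEFAULT_GOAL := report.txt

data.txt:
	echo "some data" > $@

report.txt: data.txt
	echo "Read by make: $(MAKEFILE_LIST)" > $@
	wc -l $< >> $@
```

The full list is in the special variables section of the manual.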

More about variables

// Include the different types of variable expansion

Pattern matching

To convert a .tsv to a .csv file, you can use the xsv tool, with the command:

xsv fmt -d '\t' file.tsv > file.csv

We can write the same in our makefile:

file.csv: file.tsv
	xsv fmt -d '\t' file.tsv > file.csv

We can make it shorter with automatic variables:

file.csv: file.tsv
	xsv fmt -d '\t' $< > $@

However, since we can do this for any .tsv file, we can write a generic rule:

%.csv: %.tsv
	xsv fmt -d '\t' $< > $@

The % is a wildcard, working similarly to a shell *: it matches any non-empty string of characters. The part matched by % in the target is called the stem, and the same stem is substituted for the % in each requirement. If the resulting file names match, the rule applies to that case.

A generic rule must have exactly one % in the target. Each requirement may contain at most one %; a requirement without a % is the same for every target the rule matches.

In this case, any file ending in .csv can be created from a file (in the same folder) ending in .tsv:

  • path/to/file.csv from path/to/file.tsv -> OK!
  • file.csv from path/to/file.tsv -> NO! The files do not share the same stem.
  • file.csv from file.tsv.gz -> NO! The files do not share the same suffix.

Here, using automatic variables is not optional. Since we do not know what % will be replaced with at runtime, we cannot write static filenames. Using % in the body of the rule is not supported.
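If you do need the text matched by % inside the body, GNU Make stores the stem in the $* automatic variable. A minimal sketch, reusing the rule above:

```make
# For the target file.csv, $* is "file".
%.csv: %.tsv
	echo "Converting stem $*"
	xsv fmt -d '\t' $< > $@
```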

Make will use the generic rule whenever it has to, but not more than that. This means that you cannot have make create all possible .csv files from all .tsv files by just writing the generic rule above. You need to specifically ask for .csv files as requirements to have make create them. Assume that you have the one.tsv and two.tsv files and you want to convert them to .csv. You could write this makefile:

default: one.csv two.csv
# The above rule has no body, but it's ok! Make will just do nothing when
# the rule is executed.

%.csv: %.tsv
	xsv fmt -d '\t' $< > $@

When you call make, it will use the generic rule twice to create the prerequisites of default, as if you had written:

default: one.csv two.csv

one.csv: one.tsv
	xsv fmt -d '\t' $< > $@

two.csv: two.tsv
	xsv fmt -d '\t' $< > $@
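If you would rather not list every .csv file by hand, one possible approach is to build the list with the wildcard and patsubst functions, which are built into GNU Make. This is only a sketch: the variable names are made up, and it assumes the .tsv files sit in the directory make is run from.

```make
# Assumes the .tsv files are in the directory make is run from.
# tsv_files and csv_files are hypothetical variable names.
tsv_files = $(wildcard *.tsv)
csv_files = $(patsubst %.tsv,%.csv,$(tsv_files))

default: $(csv_files)

%.csv: %.tsv
	xsv fmt -d '\t' $< > $@
```

The requirement list of default is computed when make reads the makefile, so only .tsv files that exist at that moment are picked up.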

Functions

Make has a ton of built-in functions for manipulating text and paths.

Confusingly, functions are called just as variables are referenced, with the $(fun arg1,arg2,...,argn) syntax: the name of the function, a space, and a comma-separated list of arguments.
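As a small sketch of the call syntax, using the built-in subst and words functions (the variable names are hypothetical):

```make
# Hypothetical variable names, for illustration only.
sources = one.tsv two.tsv
# subst replaces every occurrence of ".tsv" with ".csv"
outputs = $(subst .tsv,.csv,$(sources))
# words counts the whitespace-separated words in its argument
count = $(words $(sources))

default:
	echo "$(outputs)"
	echo "$(count)"
```

Note that there are no spaces after the commas: whitespace after a comma becomes part of the argument.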

You cannot escape characters in functions. Read more

Functions can do a lot of things. They are where the real power of make is. The full list of functions can be found in the manual.

Here are some examples.

Finding files

The wildcard function searches for files:

files = $(wildcard inputs/*.txt)

default:
	echo "Input files are: $(files)"

Please read the more about variables section before using the wildcard function!

This is very useful for finding requirements. Say that the requirements for an all.txt file are all the files in the input folder, but you don't know in advance what the input folder will contain. You can use the wildcard function for that:

files = $(wildcard input/*)

all.txt: $(files)
	merge_files $^

Weirdness of Makefiles

This section covers a bunch of edge cases that you will encounter as you write more complex makefiles.

One shell

One of the odd things about make is that it runs each line in the body in a different shell.
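A minimal sketch of the consequence, using cd as the example (the target names are hypothetical):

```make
# Hypothetical target names, for illustration only.

# Each line gets a new shell, so the cd is forgotten by the second
# line and pwd prints the directory make was started in.
separate_shells:
	cd /tmp
	pwd

# Chaining the commands with && keeps them in one shell, so pwd
# prints /tmp (or wherever /tmp resolves to on your system).
same_shell:
	cd /tmp && pwd
```

GNU Make also has a .ONESHELL special target that makes all the lines of each recipe run in a single shell. See the manual before relying on it, since it changes how errors in the middle of a recipe are handled.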

Unknown prerequisites

Assume you have a create_random_files command that does just that: it creates some number N of files, with unknown file names, in some output directory.

You want to use all of these files as input to create the all.txt file, just like the above example. You might write this makefile (that will not work):

files = $(wildcard input/*.txt)

all.txt: $(files)
	merge_files $^

create_files:
	create_random_files input/

.DEFAULT_GOAL := default
default: all.txt create_files

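We cannot rely on the order in which make will run our rules. Worse, the $(files) in the requirements of all.txt is expanded when make reads the makefile, before any rule has run. Assuming input/ is originally empty, the all.txt requirements will be evaluated to be... nothing, since create_files has not been run yet.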

Even if the above issue did not occur, we would have other problems. In this makefile, the create_files rule will always be run, since it's a (phony) requirement of the default target. This causes all.txt to also be always re-run, since its requirements are always newer than the target. If create_files is near the start of our make graph, it triggers the remake of many (if not all) rules, which defeats the purpose of using make.

There is no elegant way to fix this issue. However, we can work around it by using a flag file. The content of these files is never read: they are just there to serve as timestamps of the creation of other - unknown - files:

files = $(wildcard input/*.txt)

all.txt: flags/create_files.flag
	merge_files $(files)

flags/create_files.flag:
	create_random_files input/
	touch $@

The flags/create_files.flag file is empty, but has a timestamp record of when its rule is run (by the touch command). We can then use it in place of the $(files) variable when we define the requirements for a rule.

In this way, the rules will be executed in the correct order, so $(files) will always be correctly populated, plus we do not lose the ability of make to only recreate out-of-date files.

It's always a good idea to keep flag files to a minimum. Rules that create or act upon an unknown number of files should be rare. Using flag files can quickly become a crutch that allows your scripts to take a folder as input instead of a list of files to be processed, but this will bite you in the ass in the long run, or when you need to process one specific file instead of a bunch of them.