How to join repositories in CMake¶
Sometimes a project needs to use another repository (local or external) directly. This means we want to incorporate some or all of the imported repository's sources into our build system. Usually we also want to track which version is in use at any given time.
We can solve this problem in a few ways, e.g. by using:
- git submodules,
- git subtree,
- CMake `FetchContent`,
- CMake `ExternalProject`.
The first two are handled by the version control system and the last two by the build system. Each has its own strengths and weaknesses, depending on the needs of the project.
Today I want to briefly present how CMake allows joining repositories with its `FetchContent` and `ExternalProject` modules.
FetchContent¶
Info
`FetchContent` is available since CMake 3.11. The `FetchContent_MakeAvailable` command used in this article was added in CMake 3.14.
The role of `FetchContent` is to obtain resources from a specified location and make them available to the rest of the build system. Note that I used the word "resources", which can mean not only source code but also toolchains, artifacts from other builds, scripts, application icons, etc.
Below you can see a typical usage:
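A minimal sketch of this usage, assuming a placeholder repository URL and a hypothetical project name (the real address of the `platform` repository is not given here):

```cmake
cmake_minimum_required(VERSION 3.14)  # FetchContent_MakeAvailable needs 3.14+
project(my_app LANGUAGES C CXX)       # hypothetical project name

include(FetchContent)

# 1. Declare what should be downloaded and where it lives.
FetchContent_Declare(
    platform
    GIT_REPOSITORY https://example.com/my/platform.git  # placeholder URL
    GIT_TAG        main                                 # branch to check out
)

# 2. Launch the download and make the content available (populate).
FetchContent_MakeAvailable(platform)
```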
In this example, we download my `platform` repository and use its `main` branch. The usage consists of two parts:
- define (declare) what should be downloaded and where it is,
- launch download and make it available (populate) to the rest of the build system.
Note that `platform` (which I will refer to later as `<resource_name>`) is used as a prefix in the related variables and as an argument to the related functions, so make it meaningful and unique.
Everything happens at the "configure" (generation) stage, once CMake reaches the `FetchContent_MakeAvailable(platform)` command. This is important to remember and understand, because the generation step is usually done once and `FetchContent` can influence it. `FetchContent_MakeAvailable` populates each resource only once during a CMake configuration run; later calls for the same resource do nothing. Consider these two functions the `FetchContent` idiom.
Info
Using `FetchContent` requires including the proper module: `include(FetchContent)`.
By default, everything is downloaded into your build directory in a well-defined structure:
${CMAKE_BINARY_DIR}/
- _deps/
- <resource_name>-build
- <resource_name>-src
- <resource_name>-subbuild
Note
I'm not exactly sure what the purpose of the `subbuild` directory is. After examining its contents, it looks like CMake generates `CMakeLists.txt` files there to implement `FetchContent` in terms of the `ExternalProject` command. Maybe this was the easiest way of adding `FetchContent`, since `ExternalProject` had already been available for a long time and could be reused at the generation stage with a simple trick.
Once CMake successfully downloads our external content, it sets two variables that can be used in `CMakeLists.txt` to locate the new data:
- `<resource_name>_SOURCE_DIR`: specifies the location of the downloaded sources,
- `<resource_name>_BINARY_DIR`: specifies the default build directory for the downloaded sources.
However, using them directly is not needed thanks to the `FetchContent_MakeAvailable(<resource_name>)` command. It both downloads the sources and adds them to our build system as if they were already part of our codebase (it calls `add_subdirectory()` on them when they contain a top-level `CMakeLists.txt`).
From now on, the downloaded directory is subject to all CMake settings of the main project and, from the build system's point of view, is treated like local sources.
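If you do want to look at the two variables, a minimal sketch could print them right after the content is made available. The repository URL is a placeholder, and the paths in the comments assume the default directory layout:

```cmake
include(FetchContent)

FetchContent_Declare(
    platform
    GIT_REPOSITORY https://example.com/my/platform.git  # placeholder URL
    GIT_TAG        main
)
FetchContent_MakeAvailable(platform)

# With default settings these point into ${CMAKE_BINARY_DIR}/_deps/
# (platform-src and platform-build respectively).
message(STATUS "platform sources:   ${platform_SOURCE_DIR}")
message(STATUS "platform build dir: ${platform_BINARY_DIR}")
```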
ExternalProject¶
Info
`ExternalProject` has been available since well before CMake 3.0 (it dates back to the CMake 2.8 series).
`ExternalProject` (through its `ExternalProject_Add` command) is almost identical to `FetchContent` in terms of purpose and available options, except for one extremely important difference: it is launched at the build stage. In my opinion this is a crucial drawback compared to `FetchContent`, because the external content becomes available only after our build system has been generated and all build decisions have been made. There is no way (or at least no trivial way) to add sources downloaded with this method to our compilation stage.
To compensate for this problem, `ExternalProject` allows launching another CMake instance on the downloaded sources and passing a custom command to be used to compile them. Let's see this in an example:
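A minimal sketch of such a declaration. The repository URL, the `some_target` executable and its `main.c` are placeholders, and the directory options are chosen here only to mirror the `FetchContent` layout:

```cmake
include(ExternalProject)

ExternalProject_Add(
    platform
    GIT_REPOSITORY  https://example.com/my/platform.git  # placeholder URL
    GIT_TAG         main
    # Directory layout chosen to mirror the FetchContent defaults.
    SOURCE_DIR      ${CMAKE_BINARY_DIR}/_deps/platform-src
    BINARY_DIR      ${CMAKE_BINARY_DIR}/_deps/platform-build
    # Arguments passed to the nested CMake run on the downloaded sources.
    CMAKE_ARGS      -DPLATFORM=freertos-arm
    # Assumption: the external project defines no install rules.
    INSTALL_COMMAND ""
)

# Hypothetical target of the main project that needs the external content.
add_executable(some_target main.c)
add_dependencies(some_target platform)
```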
Info
Using `ExternalProject` requires including the proper module: `include(ExternalProject)`.
The snippet above will perform the following actions:
- It will download the specified repository into the `platform-src` path (like in `FetchContent`).
- It will automatically call CMake in `platform-build`, passing `-DPLATFORM=freertos-arm`.
Note
`FetchContent` offers the same possibility.
Creating a dependency from some other target with the `add_dependencies(<some_target> platform)` command is required in order to tell CMake when all those actions should actually be done. With `FetchContent` we didn't have that problem: generation is done sequentially, in the order in which `CMakeLists.txt` files are parsed. At the build stage, CMake composes a dependency graph of all defined targets to determine the order in which they should be built. That graph is influenced by two commands:
- `target_link_libraries()`, where we implicitly say that in order to build one target we need to link with another one, so it would be nice if it was already built,
- `add_dependencies()`, where we explicitly say that one target should be built before another one (even if they do not use each other).
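Without an explicit dependency between the `ExternalProject` target and `<some_target>`, CMake gives no guarantee that this step runs before `<some_target>` is built, and it skips the step entirely when only `<some_target>` is requested.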
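The two ways of shaping the dependency graph can be sketched with ordinary targets. All target and file names below are hypothetical:

```cmake
add_library(utils STATIC utils.c)
add_executable(app main.c)

# Implicit ordering: app links against utils, so utils is built first.
target_link_libraries(app PRIVATE utils)

# A target that app does not link against at all.
add_custom_target(gen_version
    COMMAND ${CMAKE_COMMAND} -E echo "generating version info"
)

# Explicit ordering: gen_version runs before app is built.
add_dependencies(app gen_version)
```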
To be honest, once `FetchContent` became available in CMake, I no longer saw any reason to use `ExternalProject`.
Customization¶
Both `FetchContent` and `ExternalProject` are highly customizable. Moreover, most of the options are available in both of them. Let's see some examples along with short comments on usage.
Download methods¶
Both commands support the following download types:
- Git,
- SVN,
- CVS,
- Mercurial,
- HTTP.
Each method has a typical set of settings you would normally expect, like branch name, commit/revision number, URL etc. I have personally used only Git in this context.
While doing so, for a long time I specified the `main` branch as the content source. The problem with such a declaration is that every time something changes in the content's upstream, it immediately propagates to the places that use it. In my case, I had a few "utility" repositories that were reused with `FetchContent` in highly active product repositories. Granted, using a branch name relieves you from remembering to update the `FetchContent` declaration in every repository once a change is introduced. On the other hand, if you make a breaking change, it can create a lot of chaos, especially if you have an active team. Trust me, you don't want to hear all those complaints. So instead of a branch name, specify a tag or a commit hash. The only drawback of this approach is that you need to remember to update that hash whenever necessary. It is also not obvious at first sight whether the changed hash is newer or older in history without checking the Git log.
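A minimal sketch of a pinned declaration, assuming a placeholder URL and a hypothetical tag name:

```cmake
include(FetchContent)

FetchContent_Declare(
    platform
    GIT_REPOSITORY https://example.com/my/platform.git  # placeholder URL
    # Pin to a tag (hypothetical name) instead of a moving branch...
    GIT_TAG        v1.2.0
    # ...or to a full commit hash:
    # GIT_TAG      <full-commit-hash>
)
FetchContent_MakeAvailable(platform)
```

Updating the dependency then becomes an explicit, reviewable change of one line in each repository that uses it.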
Fetching without network¶
Most fetching methods naturally assume that you have an Internet/intranet connection to your remote content. However, there are situations where you are totally offline (business trips, network problems, etc.). This could literally block your work.
Fortunately there are two options you can use to work without connection to the upstream:
- using a local copy of the content on your hard drive (e.g. a clone of the repo made while the connection was still available),
- forcing CMake not to try updating (checking for changes in) the content already downloaded by a previous `FetchContent`/`ExternalProject` run.
In the first case, you can simply replace the repository URL with a path to the local repository copy. It works very well. The only observable difference is that the content will not be physically copied into your build directory: CMake will use the existing files that you are pointing to. This should hardly ever be an issue.
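One documented way to get this in-place behavior is the `FETCHCONTENT_SOURCE_DIR_<uppercaseName>` cache variable, which may not be exactly the mechanism described above. A minimal sketch, assuming a hypothetical local path and a placeholder URL:

```cmake
include(FetchContent)

# Hypothetical path to a clone made while the network was still available.
# When this variable is set, CMake uses that directory as-is and downloads nothing.
set(FETCHCONTENT_SOURCE_DIR_PLATFORM "$ENV{HOME}/work/platform"
    CACHE PATH "Use an existing local copy of platform")

FetchContent_Declare(
    platform
    GIT_REPOSITORY https://example.com/my/platform.git  # placeholder URL
    GIT_TAG        main
)
FetchContent_MakeAvailable(platform)
```

The same variable can be passed on the command line with `-D`, which avoids touching `CMakeLists.txt` at all.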
In the second case, you rely on the fact that you have successfully run `FetchContent`/`ExternalProject` at least once. Every time CMake processes it, it checks whether the content is up to date. In the offline scenario that check will result in an error. You can explicitly ask CMake to skip that part by adding the `UPDATE_DISCONNECTED` parameter with the value `ON`.
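A minimal sketch of that parameter in a declaration, assuming a placeholder repository URL:

```cmake
include(FetchContent)

FetchContent_Declare(
    platform
    GIT_REPOSITORY      https://example.com/my/platform.git  # placeholder URL
    GIT_TAG             main
    UPDATE_DISCONNECTED ON  # skip the update check once content is populated
)
FetchContent_MakeAvailable(platform)
```

`FetchContent` also has a `FETCHCONTENT_UPDATES_DISCONNECTED` cache variable that applies the same behavior to all declared content, which may be handier for a temporary offline session.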
Note that this will not skip downloading the content for the first time. It will also not update your content even when you have a healthy connection.
Hint
A local directory turned out to be the perfect solution when you need to make changes in a fetched repository and check whether they break dependent projects. It would be really annoying to do this through Git (it could generate a lot of trash commits).
Choosing the right place to "run" FetchContent¶
`FetchContent` runs at the generation stage, so the place where it is declared determines which parts of the build system can interact with the fetched content.
For example, `FetchContent` can be run before the `project()` function (both the declare and the make-available statements). This allows you to set up a custom toolchain (via `CMAKE_TOOLCHAIN_FILE`) that is downloaded from the company's server.
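A minimal sketch of this idea. The `toolchains` repository, its URL, its tag and the toolchain file name are all hypothetical, and the repository is assumed to have no top-level `CMakeLists.txt`:

```cmake
cmake_minimum_required(VERSION 3.14)

include(FetchContent)

# Assumption: this repository has no top-level CMakeLists.txt,
# so FetchContent_MakeAvailable() only downloads it.
FetchContent_Declare(
    toolchains
    GIT_REPOSITORY https://example.com/company/toolchains.git  # placeholder URL
    GIT_TAG        v1.0.0                                      # hypothetical tag
)
FetchContent_MakeAvailable(toolchains)

# Set before project(), which is where CMake reads the toolchain file.
if(NOT DEFINED CMAKE_TOOLCHAIN_FILE)
    set(CMAKE_TOOLCHAIN_FILE "${toolchains_SOURCE_DIR}/arm-none-eabi.cmake"
        CACHE FILEPATH "Toolchain file fetched from the toolchains repository")
endif()

project(my_firmware LANGUAGES C CXX)  # hypothetical project name
```

The `if(NOT DEFINED ...)` guard keeps a toolchain file given on the command line in control.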
Note
`FetchContent_Declare()` only declares what should be downloaded and from where. `FetchContent_MakeAvailable()` actually does the job.
Specifying subdirectory of referenced content¶
By default `FetchContent` blindly adds the root of the downloaded content to the build system. If you need to bypass this and get to a specific directory, you have to do it by hand. For example, my HAL project requires that clients use only the `lib` directory, while the root path is used to launch tests, generate documentation, etc.
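A minimal sketch of the by-hand approach, assuming a placeholder URL for the `hal` repository. It follows the populate pattern used before `FetchContent_MakeAvailable` existed:

```cmake
include(FetchContent)

FetchContent_Declare(
    hal
    GIT_REPOSITORY https://example.com/my/hal.git  # placeholder URL
    GIT_TAG        main
)

# Populate by hand instead of calling FetchContent_MakeAvailable(hal),
# so that the repository root is not added to the build.
FetchContent_GetProperties(hal)
if(NOT hal_POPULATED)
    FetchContent_Populate(hal)
    # Add only the lib subdirectory; the second argument is its build directory.
    add_subdirectory(${hal_SOURCE_DIR}/lib ${hal_BINARY_DIR})
endif()
```

Newer CMake releases discourage calling `FetchContent_Populate()` this way and, since 3.18, accept a `SOURCE_SUBDIR` option in `FetchContent_Declare()`, so it is worth checking the docs for the version you use.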
Organizing FetchContent declarations¶
I also like to put `FetchContent` commands in separate files called `<resource_name>.cmake` and include them in other `CMakeLists.txt` files where needed.
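A minimal sketch of this layout, assuming the file lives in a hypothetical `cmake/` directory next to the top-level `CMakeLists.txt` and using a placeholder URL:

```cmake
# --- cmake/platform.cmake (hypothetical location) ---
include(FetchContent)

FetchContent_Declare(
    platform
    GIT_REPOSITORY https://example.com/my/platform.git  # placeholder URL
    GIT_TAG        main
)
FetchContent_MakeAvailable(platform)

# --- CMakeLists.txt ---
# Pull in the declaration wherever the content is needed.
include(${CMAKE_CURRENT_LIST_DIR}/cmake/platform.cmake)
```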
Summary¶
There are at least a few ways to join repositories in your source code. CMake offers `FetchContent` at the generation stage and `ExternalProject` at the build stage. Each comes with a rich set of options, which you can check in the official docs. Choose the right method for your specific case. In most cases `FetchContent` will suit you better.