Sometimes there is a need in a project to use directly some other repository (local or external). This means, that we want to be able to incorporate parts (or all) of sources of the imported repository into our build system. Usually, in such a case we would also like to track which version is used at a given time.
We can solve this problem in a few ways, e.g. by using:
- git submodules,
- git subtree,
- CMake
FetchContent
, - CMake
ExternalProject
.
First two are handled by the version control system and the last two are handled by the build system. Each of them has its own strengths and weaknesses, depending on the current project needs.
Today I want to briefly present how CMake allows joining repositories with its FetchContent
and ExternalProject
modules.
FetchContent
Note
FetchContent
is available since CMake 3.11.
The role of FetchContent
is to obtain resources from a specified location and make it available to the rest of the build system. Note, that I used the word “resources” which can mean not only source codes, but also toolchains, artifacts from other builds, scripts, application icons etc.
Below you can see a typical usage:
include(FetchContent)
# 1.
FetchContent_Declare(
platform
GIT_REPOSITORY https://github.com/kubasejdak/platform.git
GIT_TAG origin/master
)
# 2.
FetchContent_GetProperties(platform)
if (NOT platform_POPULATED)
FetchContent_Populate(platform)
endif ()
In this example, we are downloading my platform
repository and using the master
branch. Here we can see, that it consists of two parts:
- define (declare) what should be downloaded and where it is,
- launch download and make it available (populate) to the rest of the build system.
Note, that platform
(which I will reference later as <resource_name>
) is later used as prefix in the related variables or argument in related functions, so make it meaningful and unique.
Everything happens at the “configure” (generation) stage, once CMake reaches FetchContent_Populate(platform)
command. This is important to remember and understand, because generation step is usually done once and FetchContent
can influence this process. Populate command can be used only once for every resource during cmake configuration, which explains the conditional above. Consider it the FetchContent
idiom.
Note
Using FetchContent
requires including proper module: include(FetchContent)
.
By default, everything is downloaded into your build directory in a well-defined structure:
${CMAKE_CURRENT_BINARY_DIR}/
- _deps/
- <resource_name>-build
- <resource_name>-src
- <resource_name>-subbuild
Note
I’m not exactly sure what is the purpose of the subbuild
directory. After examining its contents it looks like CMake is generating CMakeLists.txt files there to implement FetchContent
in terms of ExternalProject
command. Maybe it was the easiest way of adding FetchContent
, since ExternalProject
was already available for a long time and can be reused at generation stage with a simple trick.
Once CMake successfully downloads our external content, it sets two variables that can be used in CMakeLists.txt to locate the new data:
<resource_name>_SOURCE_DIR
– specifies the location of the downloaded sources,<resource_name>_BINARY_DIR
– specifies where is the default build directory for the downloaded sources.
Typically, we add that repository to our build system as if it was already part of our codebase with the following command:
add_subdirectory(${<resource_name>_SOURCE_DIR} ${<resource_name>_BINARY_DIR})
The second argument passed to add_subdirectory()
is optional and tells CMake, that we don’t want to build that directory in a place basing on our repository structure (${CMAKE_CURRENT_BINARY_DIR}
) but in a place specified by FetchContent
. This nicely separates out compilation products from the external ones.
From now on, included directory is subject to all CMake settings of the main project and from the build system point of view is treated as the local sources.
ExternalProject
Note
ExternalProject is available since CMake 3.0.
ExternalProject
command is almost identical to FetchContent
in terms of the purpose and available options with, except for one extremely important difference: it is launched at build stage. In my opinion it is a crucial drawback comparing to FetchContent
because external content is made available to us after our build system is already generated and all build decisions are already made. There is no way to add sources downloaded with this method into our compilation stage (or at least not a trivial way).
To compensate this problem, ExternalProject
allows launching another CMake instance on the downloaded sources and pass custom command to be used to compile it. Let’s see this in an example:
include(ExternalProject)
ExternalProject_Add(
platform
GIT_REPOSITORY https://github.com/kubasejdak/platform.git
GIT_TAG origin/master
CMAKE_ARGS -DPLATFORM=freertos-arm
)
add_dependencies(<some_target> platform)
Note
Using ExternalProject
requires including proper module: include(ExternalProject)
.
The snippet above will perform the following actions:
- It will download specified repository into
platform-src
path (like inFetchContent
). - It will automatically call CMake in platform-build passing
-DPLATFORM=freertos-arm
:
cd platform-build
cmake -DPLATFORM=freertos-arm ../platform-src
Note
FetchContent
offers the same possibility, but it also allows achieving the same thing in a more natural way via call to add_subdirectory()
.
Making a dependency to some other target with add_dependencies(<some_target> platform)
command is required in order to tell CMake when all that actions should be actually done. With FetchContent
we didn’t have that problem: generation is done sequentially in order of parsing CMakeLists.txt files. At build stage, CMake is composing a dependency graph of all defined targets in order to determine in which order all targets should be built. That graph is influenced by two commands:
target_link libraries()
, where we implicitly say that in order to build one target we need to link with another one so it would be nice if it was already built,add_dependecies()
, where we explicitly say, that one target should be built before another one (even if they are not using each other).
Without making an explicit dependency between ExternalProject
target and <some_target>
CMake would simply ignore that step.
To be honest, once FetchContent
became available in CMake I completely lost the purpose for using ExternalProject
.
Customization
Both FetchContent
and ExternalProject
are highly customizable. Moreover, most of the options are available in both of them. Let’s see some examples along with short comments on usage.
Download methods
Both commands support the following download types:
- Git,
- SVN,
- CVS,
- Mercurial,
- HTTP.
Each method has a typical set of settings you would normally expect, like branch name, commit/revision number, URL etc. I have personally used only Git in this context.
While doing so, for a long time I have specified master
branch as the content source. The problem with such declaration is that every time something changes in the content’s upstream, it immediately gets populated to the places that use it. In my case, I had a few “utility” repositories, that were reused with FetchContent
in highly active product repositories. For sure, using branch name relieves you from remembering to update FetchContent
declaration in every repository once a change is introduced. But on the other side, if you are doing some breaking change, then it can create a lot of chaos. Especially if you have an active team. Trust me, you don’t want to hear all those complaints. So instead of branch name, specify tag or commit hash. The only drawback of this approach is that you need to remember to update that hash whenever necessary. It is also not obvious at first sight if the changed hash is newer or older in history without checking the Git log.
include(FetchContent)
FetchContent_Declare(
platform
GIT_REPOSITORY https://github.com/kubasejdak/platform.git
GIT_TAG ee3ce49227e809aa3ef0f6270b4c996b4808cddf
)
FetchContent_GetProperties(platform)
if (NOT platform_POPULATED)
FetchContent_Populate(platform)
endif ()
Fetching without network
Most fetching methods naturally assume, that you have an Internet/Intranet connection to your remote content. However there are situations, where you are totally offline (business trips, network problems, etc). This could literally block your work.
Fortunately there are two options you can use to work without connection to the upstream:
- using local content copy on you hard drive (e.g. clone of the repo done while connection was still available),
- forcing CMake to not try updating (checking for changes) of the content already downloaded by the previous
FetchContent
/ExternalProject
run.
In the first case, you can simply replace repository URL with a path to the local repository copy. It works very well. The only observable difference is the fact that content will not be physically copied into you build directory – CMake will use the existing files that you are pointing to. This should hardly ever be an issue.
In the second case, you rely on the fact that you have successfully run FetchContent
/ExternalProjecy
at least once. Every time CMake is processing it, it checks if the content is up-to-date. In the offline scenario that check will result in an error. You can explicitly ask CMake to skip that part by adding additional parameter – UPDATE_DISCONNECTED ON
:
include(FetchContent)
FetchContent_Declare(
platform
GIT_REPOSITORY https://github.com/kubasejdak/platform.git
GIT_TAG origin/master
UPDATE_DISCONNECTED ON
)
FetchContent_GetProperties(platform)
if (NOT platform_POPULATED)
FetchContent_Populate(platform)
endif ()
Note, that this will not skip downloading content for the first time. It will also not update your content if you have a healthy connection.
Hint
Local directory turned out to be the perfect solution if you need to make changes in fetched repository and check if it breaks dependent projects. It would be really annoying to do this using Git (it could generate a lot of trash commits).
Choosing the right place to run FetchContent
FetchContent
runs at the generation stage, so choosing the place where it is declared determines which parts of the build system could interact with the fetched content.
For example, FetchContent
can be run before project()
function (both declare and populate statements). This could allow you to setup custom toolchain (via CMAKE_TOOLCHAIN_FILE
) which is downloaded from the company’s server. On the other hand, utility codebases (extracted to the separate repositories just to avoid duplication) can be fetch just before the place, where you would like to call add_subdirectory()
on it.
The second scenario is so popular, that CMake provides a special command FetchContent_MakeAvailable()
which populates and calls add_subdirectory()
for you. This assumes of course, that you want to enter to the root directory of the downloaded repository:
include(FetchContent)
FetchContent_Declare(
tools
GIT_REPOSITORY https://github.com/kubasejdak/tools.git
GIT_TAG origin/master
)
FetchContent_MakeAvailable(tools) # Populate and add_subdirectory() in one command.
If you need to get to the specific directory, then you have to do it by hand, e.g. my HAL project requires, that clients should use only lib
directory, while root path is used to launch tests, generate documentation etc.:
include(FetchContent)
FetchContent_Declare(
hal
GIT_REPOSITORY https://github.com/kubasejdak/hal.git
GIT_TAG v1.5b
)
FetchContent_GetProperties(hal)
if (NOT hal_POPULATED)
FetchContent_Populate(hal)
endif ()
add_subdirectory(${hal_SOURCE_DIR}/lib ${hal_BINARY_DIR}) # <--- Note "lib" in the first argument.
I also like to put FetchContent
commands in the separate files called <resource_name>.cmake
and include them in other CMakeLists.txt where needed:
# cmake/platform.cmake
include(FetchContent)
FetchContent_Declare(
platform
GIT_REPOSITORY https://github.com/kubasejdak/platform.git
GIT_TAG origin/master
)
FetchContent_GetProperties(platform)
if (NOT platform_POPULATED)
FetchContent_Populate(platform)
endif ()
# cmake/osal.cmake
include(FetchContent)
FetchContent_Declare(
osal
GIT_REPOSITORY https://github.com/kubasejdak/osal.git
GIT_TAG origin/master
)
FetchContent_GetProperties(osal)
if (NOT osal_POPULATED)
FetchContent_Populate(osal)
endif ()
# cmake/hal.cmake
include(FetchContent)
FetchContent_Declare(
hal
GIT_REPOSITORY https://github.com/kubasejdak/hal.git
GIT_TAG origin/master
)
FetchContent_GetProperties(hal)
if (NOT hal_POPULATED)
FetchContent_Populate(hal)
endif ()
# Root CMakeLists.txt
# ...
include(cmake/platform.cmake)
include(cmake/osal.cmake)
include(cmake/hal.cmake)
# ...
add_subdirectory(${platform_SOURCE_DIR}/app ${platform_BINARY_DIR})
add_subdirectory(${hal_SOURCE_DIR}/lib ${hal_BINARY_DIR})
# ...
Summary
There are at least a few ways how you can join repositories in your source code. CMake offers FetchContent
at generation stage and ExternalProject
at build stage. Each comes with a rich set of options, which you can check in the official docs. Choose the right method for your specific case. In most cases FetchContent
would suit you better.
If you liked this article, then please share it on the social media you use. It will greatly increase the chance that someone else will benefit from it. It will also motivate me to write more and better articles in the future.
Subscribe to get notified about new content
Thanks for taking your time to read it. If you like it or have other opinions, please share it in the comments. You can also subscribe to my newsletter to get latest news about my content. Happy coding and let’s stay connected!