A library formed only by class templates and function templates contains only header files. One example is Eigen, but many others are available.
Using a header-only library is very simple: you just have to store the header files in a directory that the preprocessor searches. So either you store them in a system include directory, like /usr/include or /usr/local/include (you must have administrator privileges), or in a directory of your choice that you then indicate using the -I option of the compiler (actually, of the preprocessor):
g++ -I/path/to/library/include/ ...
# Download Eigen 3.4.0.
wget https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz
# Extract the archive to your Desktop.
tar xzvf eigen-3.4.0.tar.gz -C ${HOME}/Desktop
# Compile and run 'example/eigen.cpp'.
g++ -I${HOME}/Desktop/eigen-3.4.0 eigen.cpp -o main_eigen && ./main_eigen
As simple as that.
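To make the idea concrete, here is a minimal sketch of what a header-only library might look like. The file name mymath.hpp, the namespace mymath and the function template dot are hypothetical and only serve as an illustration; they are not part of Eigen.

```cpp
// mymath.hpp: a hypothetical header-only library.
// Everything is a template, so there is no .cpp file to compile
// and no library file to link.
#ifndef MYMATH_HPP
#define MYMATH_HPP

#include <cstddef>
#include <vector>

namespace mymath {

// Dot product of two vectors; if the sizes differ, only the common
// leading part is used.
template <typename T>
T dot(const std::vector<T> &a, const std::vector<T> &b) {
  T sum{};
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    sum += a[i] * b[i];
  }
  return sum;
}

}  // namespace mymath

#endif  // MYMATH_HPP
```

A user's main.cpp would just contain #include "mymath.hpp" and a call such as mymath::dot(a, b), and would be compiled with g++ -I/path/to/mymath main.cpp. The template is instantiated inside the user's own translation unit.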
From now on, however, we will deal with libraries that contain machine code, not header-only libraries.
Static library: A static library, often denoted by a .lib (on Windows) or .a (on Unix-like systems) file extension, contains compiled code that is copied directly into an executable at link time. When you build a program using a static library, the code it needs from the library is included in the final executable. This means that the resulting executable is independent of the original library file; it contains all the necessary code to run without relying on that library file at run time.
Shared library (Dynamic Link Library .dll on Windows, Shared Object .so on Unix-like systems, Dynamic Library .dylib on macOS): A shared library contains code that is loaded at run-time when the program starts or during execution. Instead of being included in the executable, the program references the shared library, and the operating system loads the library into memory when needed. Multiple programs can use the same shared library, which can result in more efficient use of system resources.
main.cpp (developed by me):
#include "mylib.hpp"
...
myfun();
...
mylib (developed by somebody else):
// mylib.hpp
void myfun();
// mylib.cpp
#include "mylib.hpp"
void myfun() {}
The preprocessing and compilation steps
g++ -Imylib/ -c main.cpp
produce the object file main.o. What does it contain?
$ nm -C main.o
0000000000000000 T main
U myfun()
The T in the second column indicates that the function main() is actually defined (resolved) in this object file, whereas the U marks myfun() as referenced but undefined. So, to produce a working executable, you have to pass the linker another library or object file where myfun() is defined.
$ g++ main.o -o main
/usr/bin/ld: main.o: in function `main':
main.cpp:(.text+0x9): undefined reference to `myfun()'
collect2: error: ld returned 1 exit status
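The definition of myfun() is in mylib.cpp, so compile it and pass the resulting object file to the linker:
g++ -c mylib/mylib.cpp -o mylib/mylib.o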
g++ main.o mylib/mylib.o -o main
Now both main and myfun are resolved:
$ nm -C main
00000000000011a9 T main
00000000000011bd T myfun()
...
Real-case scenarios are typically much more complex because:
- Every time mylib is modified or updated, one has to recompile mylib and relink all their applications using mylib.
- The developers of mylib may not be so nice: they want to hide the actual implementation. They are ok with providing users with mylib.hpp and the corresponding machine code (which is not human-readable), but not mylib.cpp.
This is why, typically, developers of a library provide users with header files and a library file.
To link against a static library, either pass its full path to the linker:
g++ main.o /path/to/mylib/libmylib.a -o main
or use the -L<dir> -l<libname> options:
g++ main.o -L/path/to/mylib -lmylib -o main
-L<dir> is not needed if the library is stored in a standard directory (typically /usr/lib or /usr/local/lib).
Note that libxx.a becomes -lxx.
If the linker finds a shared library with the same name in the system and/or in the specified directories, the shared library takes precedence. If you want to override this behavior, use the -static flag.
When linking, the order matters. Libraries that depend on symbols from other libraries should come first in the list, followed by the libraries that provide those symbols.
So, for example, if myprogram depends on mylibrary1, which in turn depends on mylibrary2, then mylibrary1 should come first and mylibrary2 last:
g++ myprogram.o -lmylibrary1 -lmylibrary2 -o myprogram
Other permutations are wrong:
g++ myprogram.o -lmylibrary2 -lmylibrary1 -o myprogram
g++ -lmylibrary1 -lmylibrary2 myprogram.o -o myprogram
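A linker that scans its inputs from left to right, such as GNU ld, looks in a static library only for symbols that are still undefined when it reaches that library. In the first command, mylibrary2 is scanned before mylibrary1 has requested any of its symbols. In the second, the undefined symbols in myprogram.o are not searched for in the given libraries, because both libraries have already been processed.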
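The effect of the ordering can be reproduced with a deliberately simplified model of a single left-to-right pass. This is only a sketch: the names Input and unresolved_after_link are hypothetical, each static library is treated as one indivisible unit, and a real linker works per archive member and handles many more cases.

```cpp
#include <set>
#include <string>
#include <vector>

// One input on the linker command line.
struct Input {
  std::string name;
  bool is_library;                 // false: object file, true: static library
  std::set<std::string> defines;   // symbols the input provides
  std::set<std::string> needs;     // symbols the input references
};

// Returns the symbols still undefined after one left-to-right pass.
// Object files are always linked in; a library is used only if it
// defines a symbol that is undefined when the pass reaches it.
std::set<std::string> unresolved_after_link(const std::vector<Input> &inputs) {
  std::set<std::string> defined;
  std::set<std::string> undefined;
  for (const Input &input : inputs) {
    bool used = !input.is_library;
    for (const std::string &symbol : input.defines) {
      if (undefined.count(symbol) > 0) used = true;
    }
    if (!used) continue;  // Library skipped: nothing needed from it yet.
    for (const std::string &symbol : input.defines) {
      defined.insert(symbol);
      undefined.erase(symbol);
    }
    for (const std::string &symbol : input.needs) {
      if (defined.count(symbol) == 0) undefined.insert(symbol);
    }
  }
  return undefined;
}
```

Feeding the model the three orderings above, with myprogram.o needing a symbol from mylibrary1 and mylibrary1 needing a symbol from mylibrary2, leaves no undefined symbol only for the first ordering.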
The command nm works not only with object files and executables, but also with libraries:
$ nm -C libmylib.a
...
0000000000000000 T myfun()
...
Besides T and U, the command may use other letters. The most important ones are:
- D or G: The symbol refers to initialized data.
- V or W: The symbol is a weak symbol. It basically means that the One Definition Rule (ODR) will not be enforced by the linker on those symbols.
A note: If a function declared inline has actually been inlined, the corresponding symbol is not present, since inline in this case really means inline. The same happens for a constexpr function. If the compiler instead decides to treat them as normal functions, the symbol is marked W.
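As an illustration, the following sketch forces the compiler to emit out-of-line code for an inline function and for a function template instantiation by taking their addresses. All names (twice, square, twice_ptr, square_ptr) are hypothetical. With GCC or Clang on Linux, compiling this file with g++ -c and running nm -C on the object file would typically list both functions with the letter W; the exact output depends on the compiler and platform.

```cpp
// An inline function: every translation unit that includes this
// definition may emit its own copy.
inline int twice(int x) { return 2 * x; }

// A function template: instantiations are emitted wherever they are used.
template <typename T>
T square(T x) {
  return x * x;
}

// Taking the addresses requires real, out-of-line definitions,
// so the functions cannot simply disappear through inlining.
int (*twice_ptr)(int) = &twice;
double (*square_ptr)(double) = &square<double>;
```

Because several object files may contain a copy of these functions, weak symbols let the linker keep one copy and discard the others instead of reporting a multiple-definition error.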
Static libraries are the oldest and most basic way of integrating third-party code. They are basically a collection of object files stored in a single archive.
At the linking stage of the compilation process, the symbols (which identify objects used in the code) that are still unresolved (i.e., they have not been defined in that translation unit) are searched for in the other object files and in the libraries passed to the linker, and the corresponding code is then inserted into the executable.
In practice, libraries are themselves the result of preprocessing and compiling their source code. In our example:
g++ -c mylib.cpp
ar rs libmylib.a mylib.o
More generally, a static library is just an archive collecting object files:
g++ -c a.cpp b.cpp c.cpp d.cpp # Create object files.
ar rs libxx.a a.o b.o c.o
ar rs libxx.a d.o # You can add one more.
Option r adds/replaces an object in the library. Option s adds an index to the archive, making it a searchable library.
The command ar -t libxx.a lists all object files contained in the archive.
With shared libraries, the mechanism by which code from the library is integrated into your own is very different from the static case.
The version is an identifier typically represented by a sequence of numbers, indicating instances of a library with a common public interface and functionality. I recommend sticking with the Semantic Versioning convention.
A shared library is known under three names:
- Link name: Used at the linking stage with the -lmylib option, of the form libmylib.so.
- soname (Shared Object Name): Looked for by the loader, typically formed by the link name followed by the major version number, e.g., libmylib.so.3.
- Real name: The name of the actual file containing the library, typically carrying the full version number, e.g., libmylib.so.3.2.4.
The ldd command lists all shared libraries used by an executable (or another shared library):
ldd /usr/bin/octave-cli | grep fftw3.so
libfftw3.so.3 => /lib/x86_64-linux-gnu/libfftw3.so.3 (...)
The loader searches for the library in special directories and finds /lib/x86_64-linux-gnu/libfftw3.so.3. This library is used when launching Octave.
If there's a new release, placing the corresponding file in the /lib/x86_64-linux-gnu directory and resetting the symbolic links will make Octave use the new release without recompiling (this is what happens when, for example, you upgrade a package via apt or similar).
$ ls -l /lib/x86_64-linux-gnu/libfftw3.so.3
... /lib/x86_64-linux-gnu/libfftw3.so.3 -> libfftw3.so.3.5.8
This means that libfftw3.so.3 is a symbolic link to libfftw3.so.3.5.8. Hence, we are actually using version 3.5.8 of libfftw3.
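The relation between the three names of a shared library (link name, soname, real name) can be sketched in a few lines. The struct LibraryNames and the function names_from_real_name are hypothetical helpers written for this illustration, and they assume the common convention in which the soname is the link name followed by the major version number.

```cpp
#include <stdexcept>
#include <string>

struct LibraryNames {
  std::string link_name;  // e.g. libmylib.so
  std::string soname;     // e.g. libmylib.so.3
  std::string real_name;  // e.g. libmylib.so.3.2.4
};

// Derives the link name and the conventional soname from a real name
// of the form lib<name>.so.<major>[.<minor>[.<patch>]].
LibraryNames names_from_real_name(const std::string &real_name) {
  const std::string marker = ".so.";
  const std::string::size_type marker_pos = real_name.find(marker);
  if (marker_pos == std::string::npos) {
    throw std::invalid_argument("expected lib<name>.so.<version>");
  }
  // Keep ".so" but not the dot that introduces the version.
  const std::string::size_type link_end = marker_pos + 3;
  // The major version ends at the next dot, or at the end of the string.
  const std::string::size_type major_end = real_name.find('.', link_end + 1);
  return {real_name.substr(0, link_end), real_name.substr(0, major_end),
          real_name};
}
```

For instance, names_from_real_name("libfftw3.so.3.5.8") yields the link name libfftw3.so and the soname libfftw3.so.3, which are exactly the names of the symbolic links found next to the real file.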
Another nice thing about shared libraries is that they may depend on another shared library. This information can be encoded when creating the library. For instance:
ldd /usr/x86_64-linux-gnu/libumfpack.so
...
libblas.so.3 => /usr/lib/libblas.so.3
The UMFPACK library is linked against version 3 of the BLAS library. This helps to avoid using an incorrect version of dependent libraries.
You then proceed as usual:
g++ -I/path/to/mylib -c main.cpp
g++ main.o -L/path/to/mylib -lmylib -o main
The linker looks for libmylib.so in the system and/or in the specified directories, checks the symbols it provides, and verifies whether the library contains a soname. If it doesn't, the link name libmylib.so is assumed to be also the soname.
For example, libumfpack.so provides a soname (of course, this has been taken care of by the library developers). If you wish, you can check it:
$ objdump -p /lib/x86_64-linux-gnu/libumfpack.so | grep SONAME
SONAME libumfpack.so.5
Since libmylib.so is a shared library, the linker does not integrate the code of the resolved symbols into the executable. Instead, it just checks that the library provides the symbols and inserts the information about the soname of the library in the executable:
ldd main
libmylib.so.2 => /path/to/libmylib.so.2 (...)
In conclusion, linking a shared library is not more complicated than linking a static one. However, knowing what happens "under the hood" may be useful to tackle unexpected situations.
The loader has a different search strategy with respect to the linker. It looks in /lib, /usr/lib, and in all the directories listed in /etc/ld.so.conf or in files with the extension conf contained in the /etc/ld.so.conf.d/ directory.
If you want to permanently add a directory to the search path of the loader, you need to add it to /etc/ld.so.conf or add a conf file containing the name of the directory to the /etc/ld.so.conf.d/ directory, and then launch ldconfig. This command rebuilds the database of the shared libraries and should be called every time one adds a new library (for example, apt does it for you, and moreover, ldconfig is launched at every boot of the computer).
Launching the command sudo ldconfig -n directory has the same effect, but in this case the modifications remain valid only until the next restart of the computer.
All these operations require you to act as an administrator, for instance using the sudo command. Safer alternatives are in the next slide.
Setting the environment variable LD_LIBRARY_PATH: It contains a colon-separated list of directory names where the loader will first look for libraries.
# Permanently, for the current terminal session:
export LD_LIBRARY_PATH+=:dir1:dir2
./main
# Or, temporarily valid for a single command:
LD_LIBRARY_PATH+=:dir1 ./main
With the special linker option -Wl,-rpath,directory: During the compilation (linking stage) of the executable, for instance
g++ main.cpp -Wl,-rpath,/path/to/mylib -L/path/to/mylib -lmylib
The loader will look in /path/to/mylib before the standard directories. You can also use relative paths.
Compile the source files:
g++ -fPIC -c mylib.cpp
PIC stands for Position-Independent Code.
Create the library:
g++ -shared mylib.o -Wl,-soname,libmylib.so.1 -o libmylib.so.1.0
Note: The library's real name is libmylib.so.1.0.
Create symbolic links for version control:
ln -s libmylib.so.1.0 libmylib.so.1
ln -s libmylib.so.1 libmylib.so
Compile the executable, linking the library:
g++ -I/path/to/mylib -c main.cpp
g++ main.o -L/path/to/mylib -lmylib -o main
However, running the executable may result in an error:
./main: error while loading shared libraries: libmylib.so.1: cannot open shared object file: No such file or directory
To fix this, direct the loader as explained in the previous section, for instance by modifying LD_LIBRARY_PATH or changing the rpath:
g++ main.o -Wl,-rpath,/path/to/mylib -L/path/to/mylib -lmylib -o main
Now, the executable works as expected!
Assuming a new release (e.g., version 1.1), compile and link the new library without recompiling the executable:
g++ -c -fPIC mylib.cpp # mylib.cpp has some new features!
g++ -shared mylib.o -Wl,-soname,libmylib.so.1 -o libmylib.so.1.1
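ln -sf libmylib.so.1.1 libmylib.so.1 # Repoint the existing soname link to the new release.
# The link libmylib.so -> libmylib.so.1 already exists and needs no change.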
Now, running the executable uses the updated library without recompilation or relinking.
For smaller projects without versioning, you can use the same name for link name, soname, and real name (e.g., libmylib.so). In this case, the -Wl,-soname option can be omitted and the symbolic links are not needed.
In summary:
- Compile the source files of a shared library with the -fPIC option.
- The soname is used by the loader and is specified during library creation.
- Use -Wl,-rpath during linking or set LD_LIBRARY_PATH for directory search during development.
Shared libraries offer two intriguing features:
These features form the foundation for implementing plugins (and are also employed in Python modules).
Dynamic loading is a fundamental aspect of a plugin architecture, allowing an application to load parts of its implementation dynamically based on user requests.
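A minimal sketch of dynamic loading, assuming a POSIX system that provides dlopen, dlsym, dlerror and dlclose in <dlfcn.h>. The function call_unary_from_library and the alias unary_function are hypothetical names chosen for this example. On older versions of glibc the program must also be linked with -ldl.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Signature expected for the loaded symbol: double f(double).
using unary_function = double (*)(double);

// Loads `library` at run time, looks up `symbol` in it and calls it on x.
double call_unary_from_library(const std::string &library,
                               const std::string &symbol, double x) {
  void *handle = dlopen(library.c_str(), RTLD_LAZY);
  if (handle == nullptr) {
    throw std::runtime_error(dlerror());
  }
  dlerror();  // Clear any stale error before calling dlsym.
  void *address = dlsym(handle, symbol.c_str());
  if (const char *error = dlerror()) {
    const std::string message = error;
    dlclose(handle);
    throw std::runtime_error(message);
  }
  // POSIX allows converting the pointer returned by dlsym to a function pointer.
  const auto function = reinterpret_cast<unary_function>(address);
  const double result = function(x);
  dlclose(handle);
  return result;
}
```

On a typical Linux system with glibc, call_unary_from_library("libm.so.6", "cos", 0.0) would load the math library and call its cos function; the library name is an assumption about the system. A plugin system applies the same idea to libraries chosen by the user at run time. Note that a symbol looked up this way must have an unmangled name, which in C++ libraries is usually obtained by declaring the function extern "C".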
This is a very advanced topic. For more information, have a look at this interesting post (source code here).
Looks easy, doesn't it?
Actually, the crux lies in step 3: