foddex.net - the home of the foddex :-)

C++ Lesson 4: Using External Libraries

Contents

Why Code Libraries Exist
Differences
Using Dynamic Libraries
Linking Against Dynamic Libraries In Unix/Linux
Linking Against Dynamic Libraries In Windows
Running Applications Depending On Dynamic Libraries
Static Libraries
Deciding What Form Of Library To Use
Versioning And Dll Hell
Example - Using Libcurl In Linux
Example - Using Libcurl In Windows

Why Code Libraries Exist

In a lot of projects, not all of the code that is part of the application is written by the project's own coders. It may be decided, or even required, to use code from an external source. Those external sources are often proprietary libraries written by companies. But without the source available, how can you incorporate such libraries into your own project? If you have read lesson 3, you might already have a clue. Remember (if not, read that lesson now) how source is first compiled into object files? And remember that header files basically describe what's inside the actual source files? Wouldn't it be awesome to combine those object files into a non-executable form (instead of an executable form, an application), so that other applications can use them at run-time? Indeed it would be, and that's why it IS possible, and it's a very common concept to use.

There are plenty of reasons for putting code into a separate library file, but the most obvious one is code reuse. For example, take the GUI subsystem in Microsoft Windows. If it weren't available in library form, building an application for it would be nearly impossible: Microsoft would have to make the sources for it downloadable, and every application would have to link that code into itself. Suppose all that code together is 1MB of binary code: that would mean that EVERY application running on Microsoft Windows would be at least 1MB in size! While with today's hard drives a 1MB minimum file size is ludicrously low, many other reasons are still valid. For example, what if Microsoft fixes a bug in their code? With their code available through libraries, all they have to do is patch the library file, and all applications using it will no longer suffer from the bug. But without libraries, every application depending on that code would need to be rebuilt! Getting the drift?

These libraries of code are called "Dynamic Link Libraries" (.dll files) in Windows, and "Shared Objects" (.so files) in a Unix-like environment. Both names indicate very nicely what the concept does: code is "shared" between applications (explaining the Unix name), and functions from a library are linked dynamically at run time (explaining the Windows name).

Differences

While the general concept behind DLLs and SOs is very much the same, there are a number of very important differences. For example, in Unix/Linux all shared objects normally share the same memory manager with the application. In Windows, this is not guaranteed: each DLL can be linked against its own copy of the C runtime, and then it has its own heap. This is very, very important, as it means that in Linux, one could have a function from a shared object return a pointer to allocated memory, and free that same memory in application code. Or in any other shared object. But in Windows, memory allocated in one library may not safely be freed from another, or from the main application, unless you know for sure that both use the same runtime. This is why many libraries have their own allocate and deallocate functions!
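The paired allocate/deallocate convention can be sketched as follows. The names lib_buffer_create and lib_buffer_destroy are hypothetical stand-ins for whatever a real library exports; the only point is that memory is always released by the same module that allocated it, here enforced with a std::unique_ptr and a custom deleter.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// --- Imagine these two functions live inside a DLL (hypothetical names) ---

// Allocates a zero-filled buffer using the library's own allocator.
extern "C" char* lib_buffer_create(std::size_t size) {
	return static_cast<char*>(std::calloc(size, 1));
}

// Releases a buffer that was returned by lib_buffer_create().
extern "C" void lib_buffer_destroy(char* buffer) {
	std::free(buffer);
}

// --- Application side ---

// Deleter that hands the memory back to the library instead of calling
// free() or delete from application code.
struct LibBufferDeleter {
	void operator()(char* buffer) const { lib_buffer_destroy(buffer); }
};

using LibBuffer = std::unique_ptr<char, LibBufferDeleter>;

// Wraps the raw pointer so the matching destroy call cannot be forgotten.
inline LibBuffer make_lib_buffer(std::size_t size) {
	return LibBuffer(lib_buffer_create(size));
}
```

An application would call make_lib_buffer(16) and simply let the returned object go out of scope; the library's own destroy function then runs automatically.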

There is another important difference. In Linux/Unix, all symbols live in one and the same global "scope". So if one shared object with function foo is linked into the application, and another shared object containing the exact same symbol is loaded as well, the symbol foo becomes ambiguous. The dynamic linker does not refuse to load the second library; by default it silently binds every reference to the first foo it finds, which is rarely what you want. But with dynamic link libraries in Windows, this is not the case. Every library is its own self-contained "symbol container", and therefore duplicate symbol names across libraries are not a problem.
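One common way to sidestep clashes over a name like foo is to keep helpers out of the exported symbol table and to put everything public inside a namespace. The sketch below uses two made-up namespaces, mylib and otherlib, purely for illustration.

```cpp
#include <string>

namespace mylib {

namespace {
// Internal linkage: this helper is not exported from the shared object,
// so it cannot collide with a helper of the same name elsewhere.
std::string decorate(const std::string& text) {
	return "[mylib] " + text;
}
} // anonymous namespace

// Exported as mylib::foo; the namespace becomes part of the mangled symbol
// name, so it is distinct from a foo in any other namespace.
std::string foo(const std::string& text) {
	return decorate(text);
}

} // namespace mylib

namespace otherlib {

// Same unqualified name, different symbol.
std::string foo(const std::string& text) {
	return "[otherlib] " + text;
}

} // namespace otherlib
```

C libraries have no namespaces and usually achieve the same effect with a prefix, which is why every libcurl function used later in this lesson starts with curl_.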

Using Dynamic Libraries

From here on "dynamic library" simply refers to a code library loaded at runtime, whether it be DLLs in Windows, or SOs in Unix/Linux. Note that the concept of "static libraries" also exists, which is detailed later in this lesson.

As you already know from lesson 3, object files can be linked into executables. But as mentioned earlier in this lesson, it's also possible to link them into dynamic libraries. The accompanying header files will tell you which functions and variables are available in the library. So at a language/compiler level, using dynamic libraries is exactly the same as using functions from your own code inside the application: you include one or more header files to let the compiler know which functions are available. In what form a function will be available (be it from the application, or from a dynamic library) is in many cases not important (note that there are situations in which it is important, but that's way out of scope here). The only place where it becomes somewhat different is when the linking process starts. There's also - again - a difference between Unix/Linux and Windows here. Since it's the simplest in Unix/Linux, we'll start there.

Linking Against Dynamic Libraries In Unix/Linux

In Unix/Linux, all you need to use a dynamic library is the .so file and its header files. As explained above, header files are used by the compiler. The .so file is used at link time, and at runtime (when the application using the dynamic library runs). The linker can extract symbol information from the .so itself, as e.g. gcc stores this information in there. So when the linker needs to resolve a reference to a function symbol (see lesson 3), it first checks the local object files. If the symbol cannot be found in there, the linker checks the .so files it was instructed to use. If the symbol can be found in one of the .so files, the linker records a reference to that symbol and to the .so file that provides it. This means that your application will depend on the .so file to run. When the application is started, the .so files are looked up and linked into your application to satisfy its dependencies, and only then is the application actually started. If a .so file is not available when the application is started, it will fail to start, and an error message like "error while loading shared libraries: somefile.so: cannot open shared object file: No such file or directory" will be echoed before the application bails.
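To make this concrete, here is a minimal sketch of a library with a single function, together with the usual g++ commands in comments. The file names (greeter.h, greeter.cpp, libgreeter.so, main.cpp) are assumptions made up for this illustration.

```cpp
// greeter.h would contain just the declaration:
//     std::string greet(const std::string& name);

#include <string>

// greeter.cpp: the code that ends up inside the shared object.
std::string greet(const std::string& name) {
	return "Hello, " + name + "!";
}

// Typical g++ commands (file names are assumptions for this sketch):
//
//   g++ -fPIC -shared greeter.cpp -o libgreeter.so   # build the library
//   g++ main.cpp -L. -lgreeter -o app                # link an app against it
//   LD_LIBRARY_PATH=. ./app                          # run it from this directory
```

The -L. switch adds the current directory to the linker's search path, and -lgreeter makes it look for a file named libgreeter.so there.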

Linking Against Dynamic Libraries In Windows

In Windows, the technical concept is much the same as with Unix/Linux. But there's an important difference. Where in Unix/Linux the symbol information used by the linker is stored in the .so file itself, Microsoft's Visual Studio toolchain outputs a separate small library, a so-called import library, which contains the actual references to the .dll file. It has the same .lib extension as a static library and is linked in the same way. More on static libraries later, for now just assume you need one.

When you compile and link a dynamic library in Windows, you end up with two files: the .dll file, and a .lib file with the same name. To make your dynamic library reusable for others, you need to distribute not only the .dll file and the appropriate header files, but ALSO the .lib file! To tell the linker it should use code from a .dll file, you tell it to link against the accompanying .lib file instead. This will make your application depend on the .dll file automatically.
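With the Microsoft compiler, functions only end up in the .dll (and its .lib) when they are marked for export. A common pattern is a small macro that switches between __declspec(dllexport) and __declspec(dllimport). The sketch below is one way to write it; the names GREETER_API, GREETER_BUILDING_DLL and greeter_add are hypothetical.

```cpp
// This sketch plays the role of the DLL itself, so the switch is turned on.
// A real project would define it in the DLL project's compiler settings only.
#define GREETER_BUILDING_DLL

// greeter_api.h (hypothetical): decides whether symbols are exported or imported.
#if defined(_WIN32)
#  if defined(GREETER_BUILDING_DLL)      // set only while compiling the DLL
#    define GREETER_API __declspec(dllexport)
#  else                                  // what users of the DLL get
#    define GREETER_API __declspec(dllimport)
#  endif
#else
#  define GREETER_API                    // nothing special needed elsewhere
#endif

// Declaration as it would appear in the public header.
extern "C" GREETER_API int greeter_add(int a, int b);

// greeter.cpp: the definition that is compiled into the DLL.
extern "C" int greeter_add(int a, int b) {
	return a + b;
}
```

On other platforms the macro expands to nothing, so the same header can be shared between a Windows and a Linux build.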

Running Applications Depending On Dynamic Libraries

Each operating system has its own way of finding dynamic libraries when an application depends on them. Unix/Linux usually stores .so files in the /usr/lib directory, or /usr/lib64 if you're running a 64-bit distribution. It does not check the directory in which the application resides or the startup directory by default. The dynamic linker searches, roughly in this order: any search path embedded in the executable at link time (the rpath), the directories in the LD_LIBRARY_PATH environment variable, the cache built by ldconfig from /etc/ld.so.conf, and finally the default system directories (see the ld.so man page for the exact rules). If LD_LIBRARY_PATH is set, its value is interpreted as a colon (:) separated list of directories to search for .so files. So anytime you have a problem with an application not starting because of missing shared objects, check the above directories and environment variables!
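When debugging a missing shared object it can help to print the directories in LD_LIBRARY_PATH one per line. On Linux the entries are separated by colons. The helper below is a small sketch of that splitting step; split_search_path is a made-up name, and skipping empty entries is a simplification chosen for this example.

```cpp
#include <string>
#include <vector>

// Splits a search-path string such as "/opt/lib:/usr/local/lib" into its
// individual directories. Empty entries are skipped in this sketch.
std::vector<std::string> split_search_path(const std::string& value, char separator = ':') {
	std::vector<std::string> directories;
	std::string::size_type start = 0;
	while (start <= value.size()) {
		std::string::size_type end = value.find(separator, start);
		if (end == std::string::npos) {
			end = value.size();
		}
		if (end > start) {
			directories.push_back(value.substr(start, end - start));
		}
		start = end + 1;
	}
	return directories;
}
```

To use it on the real variable, read it with std::getenv("LD_LIBRARY_PATH") from <cstdlib> and check for a null pointer first, since the variable is often not set at all.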

In Windows, the idea is the same, but the locations differ. Instead of the /usr/lib directory, Windows uses the C:\WINDOWS\SYSTEM32 directory (or WINNT instead of WINDOWS). It also checks the directory in which the .exe file resides, the startup directory, and the directories in the PATH environment variable for the .dll files it requires. So if you are writing an installer for your application, either dump the file in the SYSTEM32 directory, or place it alongside your .exe file. That will work fine in most cases.

Static Libraries

So what then are static libraries? Simply put, all a static library is, is a collection of object files. But wait a minute, weren't dynamic libraries the exact same thing? Yes, and no. Static libraries are object file collections in their purest form: they are .o files bundled into a single archive file, raw and unchanged. They are not yet in an executable form. Dynamic libraries are also object files combined into a single file, but they have been linked! This means that references between those object files have been resolved: it's a collection of functions and variables ready to be loaded. Static libraries are very much not so: they have to be included by a linker into an application or dynamic library and, also very important, their code will end up inside whatever they are being linked into! So static libraries are used at build time (by the linker), while dynamic libraries are used at runtime.

Static libraries are .a files in Unix/Linux, and .lib files in Windows.

Furthermore there may also be a difference in the way the code is compiled between static and dynamic libraries. It may be compiled in "PIC" format, which stands for Position Independent Code. This however falls far outside of the scope of this lesson.
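For comparison, this is roughly what building and using a static library looks like with the GNU toolchain. The function and all file names (mathutil.cpp, libmathutil.a, main.cpp) are assumptions made up for this sketch.

```cpp
// mathutil.cpp: a translation unit destined for a static library.
// (All file names in the commands below are assumptions for this sketch.)
int add_numbers(int a, int b) {
	return a + b;
}

// Typical commands with the GNU toolchain:
//
//   g++ -c mathutil.cpp -o mathutil.o      # compile to an object file
//   ar rcs libmathutil.a mathutil.o        # bundle object files into an archive
//   g++ main.cpp libmathutil.a -o app      # the code is copied into app
//
// After the last step app runs without libmathutil.a being present.
```

Note that no linking happens in the ar step: the archive is just a container, which is exactly why its contents still have to pass through the linker later.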

Deciding What Form Of Library To Use

Many open source libraries come in two flavors: static and dynamic. The ones that don't come in both flavors usually only come in the dynamic flavor. Why is this? In many cases it has to do with licensing. The LGPL, for example, requires that users of your application can replace the library with a newer or modified version. With a dynamic library that is trivial, but because a static library becomes an actual, integral part of the executable or dynamic library, it cannot later be removed or replaced unless you also hand out your object files or source code (which can be a bad thing if you're not planning on writing open source). The GPL is stricter still, and generally requires you to open up your own source code no matter how you link.

There are however alternative licenses like the MIT or BSD license (along with many others) that do not have similar requirements about static linking. Many libraries that ship with those licenses come in static form as well as dynamic form.

So which one to choose? The choice is yours. Sometimes you don't want users to know you're using a specific library. In that case using the static form is best, as there will be no trace of that library in the file structure of your application. If you use the dynamic form, then you need to ship the dynamic library file as well, and install it with the application, or make it an automatic dependency when you're working with a package system like apt-get or yum (in Linux).

Versioning And Dll Hell

Before we go on to an actual example of using an external library, I want to discuss versioning. In the Unix/Linux world, proper versioning of your libraries is absolutely mandatory. No distribution will ship your library if it's versionless. So how to version? This is completely up to you. If you want to go version 1.0 for the first release, and then go version 2.0 for the next, it's fine. If you want to go 0.1, 0.2, that's fine too. As long as it's a higher number for a newer version, it's okay. But why is this important?

As you have learned already, the header files tell you what functions and variables are available in a library. This is called the interface of the library. But what if this interface changes? You might decide that you want to add a few functions. Or remove some, or rename them. Or change the parameter types of a function, or add a parameter. All of this is very important to the application using the function. It has to use the exact right version of your library, or it might crash, or not even start at all. This is where versioning comes into play. If you're working with e.g. the libcurl library to handle HTTP in your application and you link against version 7.14.0, then it's very handy that the .so filename your application depends on will be libcurl-7.14.0.so! It has the version information built into it. (In practice the version number usually comes after the extension, as in libfoo.so.1, but the idea is the same.) Because when you then decide to update libcurl to version 7.16 and uninstall the old version, your application will no longer work! This is a much better way to handle such a situation than your application trying to link with a version 7.16 library thinking it's 7.14!
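A version check of this kind can also be written in code. The sketch below packs a version into a single number using the 0xXXYYZZ layout that libcurl uses for its LIBCURL_VERSION_NUM macro; the function names and the compatibility policy itself are hypothetical and only meant to illustrate the idea.

```cpp
#include <cstdint>

// Packs major.minor.patch into one number, 8 bits per component
// (the same 0xXXYYZZ layout libcurl uses for LIBCURL_VERSION_NUM).
constexpr std::uint32_t pack_version(unsigned major, unsigned minor, unsigned patch) {
	return (major << 16) | (minor << 8) | patch;
}

// Hypothetical policy: the loaded library must have the same major version
// and be at least as new as the one the application was built against.
// This assumes the library only ever adds to its interface within a major version.
constexpr bool is_compatible(std::uint32_t built_against, std::uint32_t loaded) {
	return (built_against >> 16) == (loaded >> 16) && loaded >= built_against;
}
```

With libcurl itself, the value to compare against at run time is available as curl_version_info(CURLVERSION_NOW)->version_num. Whether a newer minor version really is safe depends on the promises the library makes, so treat the policy above as an assumption rather than a rule.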

But this is the Unix/Linux world. Unfortunately this is not how these things work in Windows. Microsoft decided to include version information in the library file itself, not in its filename. This gives more room for all kinds of additional information - e.g. author, copyrights, language, etc. all embedded in the .dll file itself - but it also has a big downside: you can no longer tell from the filename alone whether it's the right version for your application. You have to link to it before you find out! Plus, one application might depend on version 7.14 of libcurl, while another depends on 7.16, but both applications require "libcurl.dll"! How to solve this if you have a single directory to put all the system libraries in? As you might realize, this has resulted in a lot of frustration among developers. This is also why most applications simply dump all the right dll files in their application directory instead of the SYSTEM32 directory, somewhat nullifying the idea behind code reuse (ok, it might still be code reuse, but not file reuse). All these issues are generally referred to as "DLL hell". Nice huh? ;-)

Example - Using Libcurl In Linux

We'll be building a small application in this lesson to download the main page from google.com. Since we do not wish to handle opening a network socket and sending and retrieving information using the HTTP protocol ourselves, we decide to use an open source library called libcurl ("lib" for library, "c" for client, and "url" for handling internet protocols like FTP, HTTP and HTTPS; this naming format is by no means standard or de facto, although many libraries do start with "lib").

In Linux, most libraries are packaged in two parts. One that is required to run an application that depends on the library (the normal version), and one that can be used to compile applications. The former contains the .so file and any other files the library might require, the latter contains the header files. So the latter supplements (not replaces) the first.

To install the development package for libcurl in a Fedora environment, execute the following:

yum install libcurl-devel

That will automatically fetch the latest version from the package repository, and install it. Since libcurl-devel depends on the libcurl package (remember, no use to have header files without the accompanying .so file), it will be installed automatically. But after it's installed, then what? Where is the .so file? Where are the header files? The compiler needs to know this and there's no magic for this built into the compiler.

To solve this, a very neat tool was developed called pkg-config (Package Config). The internals of how it works are not relevant; all you need to know is what information it can give you! For any package it supports, it can tell you

what the compiler flags are to use functions from the library (i.e. to use the header files)
what the linker flags are to link against the library

More and more packages support pkg-config - although definitely not all of them, unfortunately - making the life of a developer easier! When writing a build system, you do not have to worry anymore about determining the current distribution, finding its header file locations, etc. etc. pkg-config does this for you. But first, let's go on to the code.

#include <curl/curl.h>
#include <stdio.h>

int main( int argc, char** argv ) {

	// initialize curl session
	CURL* curl = curl_easy_init();
	if (!curl) {
		fprintf( stderr, "Failed to initialize CURL\n" );
		return 1;
	}

	// set it up
	curl_easy_setopt( curl, CURLOPT_URL, "http://www.google.com/" );

	// execute the command to fetch the URL contents
	if (curl_easy_perform( curl ) != CURLE_OK) {
		fprintf( stderr, "Failed to fetch URL contents\n" );
		return 1;
	}

	return 0;
}

Place the above code into a file named main.cpp, and then execute the following:

g++ main.cpp `pkg-config --cflags --libs libcurl` -o fetchgoogle

The pkg-config call in between the back-quotes (or back ticks) is where the magic happens. Your Linux shell will execute that command and insert whatever it outputs into the command line options. The --cflags parameter means, "give me the compiler flags for a library". The --libs parameter means, "give me the linker flags for a library". Then a list of library names follows, but in this case there's only one: libcurl.

Hint: to find all packages supported by pkg-config, execute:

pkg-config --list-all

To install pkg-config itself, execute:

yum install pkgconfig

The pkg-config command shown above will echo something like:

-I/usr/include/curl -lcurl

Which are flags to tell the compiler and linker (resp.):

where additional header files might be found (the -I switch)
what .so files to depend on (the -l switch)

This g++ command will work on any Unix based platform or distribution that has pkg-config installed, as well as the development package for libcurl. Checking for the latter is very simple to do with autoconf/automake (not discussed here). As said, the big advantage is that you don't need to figure out yourself where the header files and lib files are located.

Finally, to test your application, execute it:

./fetchgoogle

It will fetch the contents for Google's main website as appropriate for your location. (For me in The Netherlands, it outputs a directive to forward me to the Dutch version of Google, google.nl. YMMV.)
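By default libcurl simply prints what it receives to standard output, which is why the program shows the page. If you would rather collect the response in a std::string, libcurl lets you register a write callback. The function below is a minimal sketch of such a callback; the name append_to_string is made up, and error handling is left out.

```cpp
#include <cstddef>
#include <string>

// Matches the callback shape libcurl expects for CURLOPT_WRITEFUNCTION:
// it is called with a chunk of size * nmemb bytes and must return the
// number of bytes it handled.
std::size_t append_to_string(char* data, std::size_t size, std::size_t nmemb, void* userdata) {
	std::string* body = static_cast<std::string*>(userdata);
	const std::size_t total = size * nmemb;
	body->append(data, total);
	return total;
}

// Registering it on the curl handle from the example above would look like:
//
//   std::string body;
//   curl_easy_setopt( curl, CURLOPT_WRITEFUNCTION, append_to_string );
//   curl_easy_setopt( curl, CURLOPT_WRITEDATA, &body );
```

The callback may be invoked many times for a single download, so it appends rather than assigns.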

Example - Using Libcurl In Windows

In Windows, the C++ part is identical to the Linux part; the code remains the same and is not repeated here. But what is interesting to learn is how to set up your project in Visual Studio so that it can make use of the libcurl library. But before we do that, we first have to install it.

Installing the library is not done in the same way as with Linux: there is no centralized repository with Win32 libraries. You have to go to the libcurl website and download it yourself. Then it's up to you where you place the files: there's no centralized place for this in Windows either. One m.o. is to place such files in your project's directory, and then ship the header, .lib and .dll files along with your project's source code (so that other Windows coders don't need to download the files themselves). This is what I do most of the time, as it's by far the easiest setup (I do get a lot of heat for it from more Linux minded friends though, but I don't care ;-)). So let's do that now, and download the latest version for Windows here (hint: download the latest Win32 - MSVC build, at the time of writing 7.19.3. If it's no longer there, you can download it from my site here).

In the same directory as the main.cpp file with the code for this lesson, create a directory named libraries, and then create two subdirectories inside it: include and libs. Your directory structure should look like this:

fetchgoogle/
	main.cpp
	libraries/
		include/
		libs/

Now open up the zip file you downloaded, and copy/paste the include/curl directory into the include directory, and copy/paste the lib/Debug and lib/Release directories into the libs directory. You are now good to go to create your first application that uses a dynamic library in Windows! (A little note on the include and libs directories: the idea is to have a single place where external header files and library files are stored. This is not only easy to maintain, but also makes it easier to set up your code. In Linux, the #include statement to use libcurl is #include <curl/curl.h>. If you want your code to be cross-platform compilable, then the same statement should work in Windows as well. So you need to set up the compiler in Visual Studio in such a way that including curl/curl.h works! Also, linking the .lib files for your .dll files becomes a lot easier if they're all in the same directory.)

Now create a new application as demonstrated in Lesson 1. Name the project fetchgoogle. Go for the same options as dictated there: empty project, console application, and so on. Place the .sln and .vcproj files in the same directory as the main.cpp file. It's imperative you do this right, as the compiler flags in this lesson depend on it. To make sure you've done it right, I again will show you the full file/directory structure you should have at this point. Verify it before you continue.

fetchgoogle/
	main.cpp
	fetchgoogle.sln
	fetchgoogle.vcproj
	libraries/
		include/
			curl/
				curl.h
				curlbuild.h
				... etc ...
		libs/
			Debug/
				curllib.dll
				curllib.lib
				curllib_static.lib
			Release/
				curllib.dll
				curllib.lib
				curllib_static.lib

Try hitting Compile | Build, and see that the following happens:

1>------ Build started: Project: fetchgoogle, Configuration: Debug Win32 ------
1>Compiling...
1>main.cpp
1>d:\projects\c++\fetchgoogle\main.cpp(1) : fatal error C1083: Cannot open include file: 'curl/curl.h': No such file or directory
1>Build log was saved at "file://d:\Projects\C++\fetchgoogle\Debug\BuildLog.htm"
1>fetchgoogle - 1 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

It fails, because the compiler cannot find a curl directory in any of the paths it's allowed to look in for header files. So basically it's missing a header file, which is always a reason to stop the compilation process for the compilation unit (main.cpp in our case). To instruct the compiler to look in our custom include directory as well (where a curl directory is present, containing a curl.h header file), take the following steps:

select the fetchgoogle project node in the project tree
right click it, and select Properties at the bottom
in the top left of the window that pops up, select the "All configurations" configuration: what you're about to do is for both a Debug and a Release setup. If you don't do this, you will change the settings for the default selected Debug configuration only. You will have to do all of this again for the Release later. To avoid this redundancy, select the All Configurations configuration
select the Configuration Properties node inside the tree on the left
select the C/C++ node
select the General node
on the right, there should now be an "Additional Include Directories" option
select it to edit it
enter "libraries/include" there: any non-absolute path you enter here is interpreted as relative to the directory where the .vcproj file is located. This is both important and convenient to know

You are now done setting up the compiler. Just to show you what will happen if you only set up the compiler, and not the linker, try building the application at this stage. If you've followed all instructions, the code should compile, but not link. The linker will complain about missing references to the curl_easy_init, curl_easy_setopt and curl_easy_perform functions. That makes sense: you told the compiler the functions would be there at link time, but the linker can't find them in its current set of allowed library files.

1>------ Build started: Project: fetchgoogle, Configuration: Debug Win32 ------
1>Compiling...
1>main.cpp
1>Linking...
1>main.obj : error LNK2019: unresolved external symbol __imp__curl_easy_perform referenced in function _main
1>main.obj : error LNK2019: unresolved external symbol __imp__curl_easy_setopt referenced in function _main
1>main.obj : error LNK2019: unresolved external symbol __imp__curl_easy_init referenced in function _main
1>D:\Projects\C++\fetchgoogle\Debug\fetchgoogle.exe : fatal error LNK1120: 3 unresolved externals
1>Build log was saved at "file://d:\Projects\C++\fetchgoogle\Debug\BuildLog.htm"
1>fetchgoogle - 4 error(s), 0 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

To fix this we set up the linker too:

select the fetchgoogle project node in the project tree
right click it, and select Properties at the bottom
in the top left of the window that pops up, select the "All configurations" configuration
select the Configuration Properties node inside the tree on the left
select the Linker node
select the General node
on the right, there should now be an "Additional Library Directories" option
select it to edit it
enter "libraries/libs/$(ConfigurationName)" there: the same applies here as it did with the header file directories: it's all relative to the .vcproj file's location. The $(ConfigurationName) parameter will be substituted with Debug or Release, depending on the current build target
select the Input node
on the right, there should now be an "Additional Dependencies" option
select it to edit it
enter "curllib.lib" there

Try building again! It should now work, and you're almost done. Remember how Windows finds the DLL files it needs to run your application. Although the executable for your application is placed inside a Debug or Release directory, it gets executed from the directory where the .vcproj file is located. So you either have to copy the DLL from the libraries/libs/Debug (or Release, depending on what you just built) directory into the same directory as the .vcproj file, or copy it to the Debug directory containing fetchgoogle.exe. However, when you now try to start the application in Microsoft Visual Studio by hitting Ctrl-F5, it will still not work, complaining about a missing libeay32.dll, openldap.dll or ssleay32.dll file. This is because libcurl itself depends on these libraries, which handle things like SSL! They are in the libcurl zip you downloaded as well: copy/paste them into the same directory as the curllib.dll file, and you're done!

5 comment(s)


On Mon 29-03-2010 23:06 Burn wrote: Small error: The include paths are "libraries/include" and "libraries/libs/$(ConfigurationName)" according to the tutorial setup, but you forgot the "libraries/" folder.

Also the built executable complains about a libsasl.dll missing from the system (Running win7 RTM build here) and that is not shipped with the libcurl package.

On Tue 30-03-2010 13:12 Foddex wrote: Nice spot, will fix!

On Tue 08-02-2011 19:55 Jorge wrote: Problem stated by Burn (first poster) still persist. Now, it's possible to find that dll but how to ensure it's the right dll version?

Very nice article!

On Tue 01-03-2011 14:57 Foddex wrote: Long overdue, but I fixed the libraries/ directory :)

On Fri 08-04-2011 11:53 Foddex wrote: @Jorge: that's really dependent on the library you use. Some libraries have a runtime function you can call to verify you're using the right version. But ofcourse, when you're using the wrong version, chances are that your application will have crashed due to missing symbols before it could even reach the call to that function. As said in the article, it's why most developers just ship all DLLs they need with their executables in Windows...
