Overview
Dlib is a general purpose cross-platform open source software library written in the C++ programming
language. Its design is heavily influenced by ideas from design by contract and component-based
software engineering. This means it is, first and foremost, a collection of independent
software components, each accompanied by extensive documentation and thorough debugging modes.
Davis King has been the primary
author of dlib since development began in 2002. In that time
dlib has grown to include a wide variety of tools. In particular,
it now contains software components for dealing with networking,
threads, graphical interfaces, complex data structures, linear
algebra, statistical machine learning, image processing, data
mining, XML and text parsing, numerical optimization, Bayesian
networks, and numerous other tasks. In
recent years, much of the development has been focused on creating
a broad set of statistical machine learning tools. However, dlib
remains a general purpose library and welcomes contributions of high
quality software components useful in any domain.
Core to the development philosophy of dlib is a dedication to
portability and ease of use. Therefore, all code in dlib is designed
to be as portable as possible and to work without requiring the user to
configure or install anything. To help achieve this, all platform
specific code is confined inside the API wrappers. Everything else is
either layered on top of those wrappers or is written in pure ISO
standard C++. Currently the library is known to work on OS X, MS
Windows, Linux, Solaris, the BSDs, and HP-UX. It should work on any
POSIX platform but I haven't had the opportunity to test it on any
others (if you have access to other platforms and would like to help
increase this list then let me know).
The rest of this page explains everything you need to know to get started using the library. It
explains where to find the documentation for each object/function and how to interpret
what you find there. For help compiling with dlib check out the how to compile
page. Or if you are having trouble finding where a particular object's documentation is located you may
be able to find it by consulting the index.
The library is also covered by the very liberal Boost Software License
so feel free to use it any way you like. However, if you use dlib in
your research then please cite its Journal of Machine Learning Research paper when
publishing.
Finally, I must give some credit to the Reusable
Software Research Group at Ohio State since they taught me many
of the software engineering techniques used in the creation of this library.
Notation
For the most part I try to document my code in a way that any C++ programmer would understand,
but for the sake of brevity I use some of the following uncommon notation.
CONVENTION
This is a section of the formal comment which appears at the top of classes which are
actual implementations (as opposed to specifications). This section of the comment contains
a list of invariants that tell you what the member variables are used for. It also relates
the state of the member variables to the class interface.
For example, you might see a line in this section that says "my_size == size()". This just means
that the member variable my_size always contains the value returned by the size() function.
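As a rough illustration of what such a comment can look like, here is a minimal sketch. The int_stack class is hypothetical and not part of dlib, and the exact comment layout is an approximation of the style described above rather than a copy of any real dlib file.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, not part of dlib; it only illustrates the comment style.
class int_stack
{
    /*!
        CONVENTION
            - my_size == size()
            - data.size() >= my_size
            - if (my_size > 0) then
                - data[my_size-1] == the element returned by top()
    !*/
public:
    int_stack() : my_size(0) {}

    std::size_t size() const { return my_size; }

    void push(int item)
    {
        // Reuse a previously allocated slot if one is available.
        if (my_size == data.size())
            data.push_back(item);
        else
            data[my_size] = item;
        ++my_size;
        assert(data.size() >= my_size);  // second invariant still holds
    }

    int top() const
    {
        assert(my_size > 0);
        return data[my_size-1];
    }

    void pop()
    {
        assert(my_size > 0);
        --my_size;  // the slot stays allocated, so data.size() >= my_size holds
    }

private:
    std::vector<int> data;
    std::size_t my_size;
};
```

Reading the CONVENTION block first tells you that my_size, not data.size(), is the logical size of the stack, which is what makes pop() correct without touching the vector.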
"initial value for its type"
I frequently say that after a function executes, some variable or argument will have an
initial value for its type. This is well defined for objects with a user-defined constructor,
but not for anything else. Therefore the initial value of a type with no user-defined
constructor is undefined.
Organization
The library can be thought of as a collection of components. Each component always consists of
at least two separate files, a specification file and an implementation file. The specification
files are the ones that end with _abstract.h. These specification files don't actually
contain any code and they even have preprocessor directives that prevent any of their contents from
being included. Their purpose is purely to document a component's interface in a file that isn't
cluttered with implementation details the user shouldn't need to know about.
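The following sketch shows one way a specification file can be kept out of the compiled program. The widget class, the macro name WIDGET_ABSTRACT_H_ and the file names in the comments are all hypothetical; both "files" are shown in a single listing only so the example is self-contained.

```cpp
#include <cassert>

// ---- sketch of a hypothetical specification file, widget_abstract.h ----
// The macro is undefined and then tested, so the preprocessor discards
// everything up to the matching #endif.
#undef WIDGET_ABSTRACT_H_
#ifdef WIDGET_ABSTRACT_H_

class widget
{
public:
    int size() const;
    /*!
        ensures
            - returns the number of items in this widget
    !*/
};

#endif // WIDGET_ABSTRACT_H_

// ---- sketch of a hypothetical implementation file, widget_1.h ----
// Because the block above was discarded, this is the only definition of
// widget that the compiler ever sees.
class widget
{
public:
    widget() : my_size(0) {}
    int size() const { return my_size; }
    void add() { ++my_size; }
    void remove() { assert(my_size > 0); --my_size; }
private:
    int my_size;
};
```

If the specification were actually compiled, the two definitions of widget would collide; the guard is what lets the documentation-only declaration sit next to the real one.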
The next important concept in dlib organization is multi-implementation components. That is,
some components provide more than one implementation of what is defined in their specification.
When you use these components you have to identify them with names like dlib::component::kernel_1a.
Often these components will have just a debugging and non-debugging implementation. However, many components
provide a large number of alternate implementations. For example, the entropy_encoder_model
has 32 different implementations you can choose from.
- File organization for multi-implementation components
Each component gets its own folder and one file in the root of the directory tree.
I will use the queue object as a typical example and
explain what each of its files contains.
Below is the directory structure and all the files related to the queue component.
- file tree
  - dlib/
    - queue.h
    - queue/
      - queue_kernel_abstract.h
      - queue_kernel_1.h
      - queue_kernel_2.h
      - queue_kernel_c.h
      - queue_sort_abstract.h
      - queue_sort_1.h
- queue.h
This file does not contain any executable code. All it does is define the typedefs such as
kernel_1a, kernel_1a_c, etc. for the queue object. See the Creating Objects
section to learn what these typedefs are for.
- queue_kernel_abstract.h
This file does not contain any code. It even has preprocessor directives that prevent
any of its contents from being included.
The purpose of this file is to define exactly what a queue object does and what its
interface is.
- queue_sort_abstract.h
This file also doesn't contain any code. Its only purpose is to define the sort
extension to queue objects.
- queue_kernel_1.h
This file contains an implementation of the queue kernel specification found
in queue_kernel_abstract.h
- queue_kernel_2.h
This file contains another implementation of the queue kernel specification found
in queue_kernel_abstract.h
- queue_sort_1.h
This file contains an implementation of the queue sort extension specification found
in queue_sort_abstract.h
- queue_kernel_c.h
This file contains a templated class which wraps any implementation of the queue kernel
specification. It is used during debugging to check that the requires clauses are never
violated.
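To make the role of the checking wrapper more concrete, here is a minimal sketch of the idea. The names sketch_queue_kernel_1 and sketch_queue_kernel_c are hypothetical, the kernel is built on std::deque purely for brevity, and reporting a violation by throwing std::logic_error is an assumption of this example, not a description of what dlib does.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>

// Hypothetical kernel: it trusts the caller to satisfy the requires clauses
// and never checks them itself.
template <typename T>
class sketch_queue_kernel_1
{
public:
    typedef T type;
    std::size_t size() const { return items.size(); }
    void enqueue(const T& item) { items.push_back(item); }
    // requires: size() != 0
    void dequeue(T& item) { item = items.front(); items.pop_front(); }
private:
    std::deque<T> items;
};

// Hypothetical checking wrapper: it inherits from any kernel and validates
// the requires clause before forwarding the call to that kernel.
template <typename queue_base>
class sketch_queue_kernel_c : public queue_base
{
    typedef typename queue_base::type T;
public:
    void dequeue(T& item)
    {
        if (this->size() == 0)
            throw std::logic_error("dequeue() requires size() != 0");
        queue_base::dequeue(item);
    }
};
```

Because the wrapper is a template over the kernel, the same checks can be layered on top of any implementation of the specification without duplicating them in each one.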
Creating Objects
To create many of the objects in this library you need to choose which kernel implementation you would like and whether you
want the checking version or any extensions.
To make this easy there are header files which define typedefs of all this stuff. For
example, to create a queue of ints using queue kernel implementation 1 you would type
dlib::queue<int>::kernel_1a my_queue;. Or to get the debugging/checking version you
would type dlib::queue<int>::kernel_1a_c my_queue;.
There can be a lot of different typedefs for each component. You can find a list of them
in the section for the component in question. For the queue component they are listed in
the queue section of the documentation.
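The naming scheme can be pictured with a small self-contained sketch. Everything here (demo_queue, demo_kernel_1, demo_kernel_c) is a hypothetical stand-in; the sketch only shows how a header can expose implementations through nested typedefs such as kernel_1a and kernel_1a_c, and it makes no claim about how the real queue.h is written.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for one kernel implementation.
template <typename T>
class demo_kernel_1
{
public:
    typedef T type;
    std::size_t size() const { return items.size(); }
    void enqueue(const T& item) { items.push_back(item); }
private:
    std::deque<T> items;
};

// Hypothetical stand-in for the checking wrapper (the checks themselves
// are omitted in this sketch).
template <typename kernel_base>
class demo_kernel_c : public kernel_base {};

// Hypothetical selector: it holds no data and exists only to give each
// implementation a short, uniform name.
template <typename T>
struct demo_queue
{
    typedef demo_kernel_1<T>         kernel_1a;
    typedef demo_kernel_c<kernel_1a> kernel_1a_c;
};
```

With this in place, demo_queue<int>::kernel_1a my_queue; selects the plain implementation and demo_queue<int>::kernel_1a_c my_queue; selects the checked one, so switching between them is a one-word change at the point of declaration.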
None of the above applies to the single-implementation components, that is, anything that doesn't have an "implementations"
section in its documentation. These tools are designed to have only one implementation and thus do not follow the
above naming convention. For example, to create a
logger object you would simply type dlib::logger mylog("name");.
For the purposes of object creation the API components also appear to be single-implementation. That is, there is no
need to specify which implementation you want since it is automatically determined by which platform you compile under.
Note also that there are no explicit checking versions of these components. However, there are
DLIB_ASSERT statements that perform checking and you can
enable them by #defining DEBUG or ENABLE_ASSERTS.
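The switch described above can be imitated with a few lines of preprocessor code. SKETCH_ASSERT and checked_divide are hypothetical and are not dlib's DLIB_ASSERT; throwing std::logic_error on failure is an assumption made only so the example is easy to observe.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Normally this would come from the build, e.g. a -DENABLE_ASSERTS flag.
#define ENABLE_ASSERTS

// Hypothetical macro, NOT dlib's DLIB_ASSERT; it only mimics the on/off switch.
#if defined(DEBUG) || defined(ENABLE_ASSERTS)
    #define SKETCH_ASSERT(expr, msg) \
        do { if (!(expr)) throw std::logic_error(std::string("assert failed: ") + (msg)); } while (0)
#else
    #define SKETCH_ASSERT(expr, msg) do {} while (0)
#endif

// requires: divisor != 0
inline int checked_divide(int value, int divisor)
{
    SKETCH_ASSERT(divisor != 0, "divisor must be nonzero");
    return value / divisor;
}
```

When neither macro is defined the check compiles away to nothing, which is why the requires clauses still need to be respected by the caller in release builds.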
Assumptions
There are some restrictions on the behavior of certain objects or functions.
Rather than replicating these restrictions all over the place in my documentation they
are listed here.
- global swap()
It is assumed that this function does not throw. Undefined behavior results if it does.
Note that std::swap() for all intrinsics and std::string does not throw.
- operator<()
It is assumed that this operator (or std::less or any similar functor supplied by you to the library)
does not throw. Undefined behavior results if it does.
- dlib::general_hash
It is assumed that general_hash does not throw. Undefined behavior results if it does.
This is actually noted in the general hash spec file but I'm listing it here also for good measure.
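For a user-defined type, meeting the swap() and operator<() assumptions might look like the sketch below. The record type is hypothetical, and the noexcept keyword assumes a C++11 or later compiler; on older compilers the same guarantee has to be kept by convention.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical user type meant to be stored in a container.
struct record
{
    int key;
    std::string label;
};

// Global swap: swapping an int and a std::string does not throw,
// so the whole function can be declared noexcept.
inline void swap(record& a, record& b) noexcept
{
    std::swap(a.key, b.key);
    a.label.swap(b.label);
}

// Ordering: comparing two ints cannot throw.
inline bool operator<(const record& a, const record& b) noexcept
{
    return a.key < b.key;
}

// Compile-time confirmation that both operations are declared non-throwing.
static_assert(noexcept(swap(std::declval<record&>(), std::declval<record&>())),
              "swap must not throw");
static_assert(noexcept(std::declval<const record&>() < std::declval<const record&>()),
              "operator< must not throw");
```

The static_assert lines only verify the declarations; it is still up to the author of the type to make sure the function bodies really cannot throw.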
Thread Safety
In the library there are three kinds of objects with regard to threading:
- Objects which are completely thread safe. This means that any pattern of access from
multiple threads is safe.
- Objects which are safe to use as long as no two threads touch the same instance, but require access
to a particular instance to be serialized via a mutex if it is shared among threads.
- Objects which share some kind of global resource or are reference counted. This kind of object is
extremely thread unfriendly and can only be used in a threaded program with great care.
How do you know which components/objects are thread safe and which aren't? The rule is that if
the specification for the component doesn't mention threading or thread safety then
it is ok to use as long as you serialize access to shared instances. If the component might have
some global resources or be reference counted then the specifications will tell you this.
Lastly, if the component is completely thread safe then the specification will tell you this.
Also note that global functions in dlib are always thread safe.
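For the second kind of object, serializing access to a shared instance can be sketched as follows. The example uses std::vector as a stand-in for such an object and the standard std::mutex and std::thread (C++11) rather than dlib's own threading tools; shared_log and fill_from_threads are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper around one shared instance. std::vector plays the role
// of an object that is only safe when access to it is serialized.
class shared_log
{
public:
    void append(int value)
    {
        std::lock_guard<std::mutex> lock(m);  // one thread at a time
        entries.push_back(value);
    }
    std::size_t size() const
    {
        std::lock_guard<std::mutex> lock(m);
        return entries.size();
    }
private:
    mutable std::mutex m;
    std::vector<int> entries;
};

// Starts num_threads workers that each append per_thread values to the same
// shared instance, waits for them, and returns the final element count.
inline std::size_t fill_from_threads(shared_log& log, int num_threads, int per_thread)
{
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t)
        workers.emplace_back([&log, per_thread] {
            for (int i = 0; i < per_thread; ++i)
                log.append(i);
        });
    for (auto& w : workers)
        w.join();
    return log.size();
}
```

Without the lock, concurrent push_back calls on the same vector would be a data race; with it, every append is fully finished before the next one begins, so no element is lost.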