Introduction

Overview

Dlib is a general purpose cross-platform open source software library written in the C++ programming language. Its design is heavily influenced by ideas from design by contract and component-based software engineering. This means it is, first and foremost, a collection of independent software components, each accompanied by extensive documentation and thorough debugging modes.

Davis King has been the primary author of dlib since development began in 2002. In that time dlib has grown to include a wide variety of tools. In particular, it now contains software components for dealing with networking, threads, graphical interfaces, complex data structures, linear algebra, statistical machine learning, image processing, data mining, XML and text parsing, numerical optimization, Bayesian networks, and numerous other tasks. In recent years, much of the development has been focused on creating a broad set of statistical machine learning tools. However, dlib remains a general purpose library and welcomes contributions of high quality software components useful in any domain.

Core to the development philosophy of dlib is a dedication to portability and ease of use. Therefore, all code in dlib is designed to be as portable as possible and to work without requiring the user to configure or install anything. To help achieve this, all platform specific code is confined inside the API wrappers. Everything else is either layered on top of those wrappers or is written in pure ISO standard C++. Currently the library is known to work on OS X, MS Windows, Linux, Solaris, the BSDs, and HP-UX. It should work on any POSIX platform, but I haven't had the opportunity to test it on any others (if you have access to other platforms and would like to help increase this list then let me know).

The rest of this page explains everything you need to know to get started using the library. It explains where to find the documentation for each object/function and how to interpret what you find there. For help compiling with dlib, check out the how to compile page. If you are having trouble finding where a particular object's documentation is located, you may be able to find it by consulting the index.

The library is also covered by the very liberal Boost Software License so feel free to use it any way you like. However, if you use dlib in your research then please cite its Journal of Machine Learning Research paper when publishing.

Finally, I must give some credit to the Reusable Software Research Group at Ohio State since they taught me many of the software engineering techniques used in the creation of this library.

Notation

For the most part I try to document my code in a way that any C++ programmer would understand, but for the sake of brevity I use some of the following uncommon notation.

  • kernel, extension, and abstract
      Each component of the library has a specification which defines its core behavior and interface. This specification defines what is called the component's kernel. Additionally, each component may have any number of extensions. An extension is essentially a specification for something that layers functionality on top of the kernel of a component.

      In the naming of files I use the word abstract to indicate that a file contains a specification of a kernel component or extension rather than an actual implementation.

  • /*! comments like this !*/
      This is just for "formal comments." Generally these either appear after a function prototype and contain the requires/ensures clauses, or appear at the top of a class and tell you general things about the class.

  • requires/ensures/throws
      These words appear in the formal comment following function prototypes and have the following meanings.

      requires: This defines a list of requirements for calling the function. These requirements MUST be met or a call to the function has undefined results. (note that when the checking/debugging modes are enabled on an object then it will throw the dlib::fatal_error exception with fatal_error::type == EBROKEN_ASSERT when the requires clause is broken rather than causing "undefined results")

      ensures: This defines what the function does. It is a list of conditions that will be true after the function finishes executing. Note that if an exception is thrown then nothing in the ensures clause is guaranteed to be true.

      throws: This defines what exceptions may be thrown by this function. It generally tells you why the exception might be thrown. It also tells you what the function does in this event: Does it have no effect at all? Does it corrupt any objects? etc.

      Sometimes these blocks do not appear in the formal comment. The meanings in these cases are as follows:
      missing requires: There are no requirements. You may put anything in the function arguments.
      missing ensures: This means that the effects of the function are unspecified. This is often used for callbacks where the client programmer implements the actual function.
      missing throws: This doesn't mean anything. A function without a throws block might throw exceptions or it might not.

      So in summary, the requires clause must always be satisfied, the ensures clause tells you what the function does when it does not throw or return an error, and the throws clause tells you what happens when the function does throw.

  • meaning of # symbol
      I use this as a prefix on identifiers to make reference to the value of the identifier "after" some event has occurred.

      The most common place I use this notation is inside the formal comment following a function prototype. If the # symbol appears in a requires/ensures/throws block then it means the value of the identifier after the function has finished, otherwise all references to an identifier refer to its value before the function was called.

      An example will make it clear.
      int funct(
          int& something
      );
      /*!
          requires
              - something > 4     
          ensures
              - #some_other_function() == 9
              - #something == something + 1
              - returns something
      !*/
      
      This says that funct() requires that "something" be greater than 4, that funct() will increment "something" by 1, and funct() returns the original value of something. It also says that after the call to funct() ends a call to some_other_function() will return the value 9.

  • CONVENTION
      This is a section of the formal comment which appears at the top of classes which are actual implementations (as opposed to specifications). This section of the comment contains a list of invariants that tell you what the member variables are used for. It also relates the state of the member variables to the class interface.

      For example, you might see a line in this section that says "my_size == size()". This just means that the member variable my_size always contains the value returned by the size() function.

  • "initial value for its type"
      I frequently say that after a function executes some variable or argument will have an initial value for its type. This makes sense for objects with a user defined constructor, but for anything else not so much. Therefore the initial value of a type with no user defined constructor is undefined.
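
The notation above can be put together in a small sketch. The block below is a hypothetical illustration, not code taken from dlib: funct() is given one possible body that satisfies a reduced version of the contract shown earlier (the some_other_function() clause is left out), and tally is an invented class whose only purpose is to show a CONVENTION block. The assert merely stands in for whatever checking a debugging build might perform.

```cpp
#include <cassert>

// Declaration followed by its formal comment, as in a specification.
int funct(
    int& something
);
/*!
    requires
        - something > 4
    ensures
        - #something == something + 1
        - returns something
!*/

// One possible implementation of the contract above (an assumption).
int funct(
    int& something
)
{
    assert(something > 4);          // the requires clause
    const int before = something;   // "something": value before the call
    something = before + 1;         // "#something": value after the call
    return before;                  // returns the original value
}

// Hypothetical implementation class with a CONVENTION block.
class tally
{
    /*!
        CONVENTION
            - my_size == size()
    !*/
public:
    tally() : my_size(0) {}

    unsigned long size() const { return my_size; }

    void add_one() { ++my_size; }
    /*!
        ensures
            - #size() == size() + 1
    !*/

private:
    unsigned long my_size;
};
```

Calling funct() on an int holding 5 would, under this sketch, return 5 and leave the variable holding 6.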

Organization

The library can be thought of as a collection of components. Each component always consists of at least two separate files, a specification file and an implementation file. The specification files are the ones that end with _abstract.h. None of these specification files actually contains any code, and they even have preprocessor directives that prevent any of their contents from being included. Their purpose is purely to document a component's interface in a file that isn't cluttered with implementation details the user shouldn't need to know about.

The next important concept in dlib organization is multi-implementation components. That is, some components provide more than one implementation of what is defined in their specification. When you use these components you have to identify them with names like dlib::component::kernel_1a. Often these components will have just a debugging and non-debugging implementation. However, many components provide a large number of alternate implementations. For example, the entropy_encoder_model has 32 different implementations you can choose from.

  • File organization for multi-implementation components
      Each component gets its own folder and one file in the root of the directory tree.

      I will use the queue object as a typical example and explain what each of its files contains. Below is the directory structure and all the files related to the queue component.

      • file tree
        • dlib/
          • queue.h
          • queue/
            • queue_kernel_abstract.h
            • queue_kernel_1.h
            • queue_kernel_2.h
            • queue_kernel_c.h
            • queue_sort_abstract.h
            • queue_sort_1.h

      • queue.h
          This file does not contain any executable code. All it does is define the typedefs such as kernel_1a, kernel_1a_c, etc. for the queue object. See the Creating Objects section to learn what these typedefs are for.
      • queue_kernel_abstract.h
          This file does not contain any code. It even has preprocessor directives that prevent any of its contents from being included.

          The purpose of this file is to define exactly what a queue object does and what its interface is.
      • queue_sort_abstract.h
          This file also doesn't contain any code. Its only purpose is to define the sort extension to queue objects.
      • queue_kernel_1.h
          This file contains an implementation of the queue kernel specification found in queue_kernel_abstract.h
      • queue_kernel_2.h
          This file contains another implementation of the queue kernel specification found in queue_kernel_abstract.h
      • queue_sort_1.h
          This file contains an implementation of the queue sort extension specification found in queue_sort_abstract.h
      • queue_kernel_c.h
          This file contains a templated class which wraps any implementation of the queue kernel specification. It is used during debugging to check that the requires clauses are never violated.
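
The relationship between these files can be sketched in a few lines of standalone C++. Everything prefixed with toy_ below is a hypothetical stand-in written for this illustration, not the actual contents of the dlib headers. The interface is reduced to three functions, std::deque is used for storage, and std::logic_error is thrown where the text describes dlib::fatal_error.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>

// Plays the role of queue_kernel_1.h: one implementation of the kernel.
template <typename T>
class toy_queue_kernel_1
{
public:
    typedef T type;

    std::size_t size() const { return items.size(); }

    void enqueue(const T& item) { items.push_back(item); }

    void dequeue(T& item)
    /*!
        requires
            - size() != 0
        ensures
            - #size() == size() - 1
            - #item == the element that was at the front of the queue
    !*/
    {
        item = items.front();
        items.pop_front();
    }

private:
    std::deque<T> items;
};

// Plays the role of queue_kernel_c.h: wraps any kernel implementation and
// checks the requires clause before forwarding the call.
template <typename queue_base>
class toy_queue_kernel_c : public queue_base
{
public:
    void dequeue(typename queue_base::type& item)
    {
        if (this->size() == 0)
            throw std::logic_error("requires clause broken: size() != 0");
        queue_base::dequeue(item);
    }
};

// Plays the role of queue.h: only typedefs, no executable code.
template <typename T>
struct toy_queue
{
    typedef toy_queue_kernel_1<T>         kernel_1a;
    typedef toy_queue_kernel_c<kernel_1a> kernel_1a_c;
};
```

With these definitions, toy_queue<int>::kernel_1a_c names the checking version and toy_queue<int>::kernel_1a the unchecked one, so switching between them is a one-word change at the point of declaration.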

Creating Objects

To create many of the objects in this library you need to choose which kernel implementation you would like and if you want the checking version or any extensions.

To make this easy there are header files which define typedefs of all this stuff. For example, to create a queue of ints using queue kernel implementation 1 you would type dlib::queue<int>::kernel_1a my_queue;. Or to get the debugging/checking version you would type dlib::queue<int>::kernel_1a_c my_queue;.

There can be a lot of different typedefs for each component. You can find a list of them in the section for the component in question. For the queue component they can be found in the queue section of the documentation.

None of the above applies to the single-implementation components, that is, anything that doesn't have an "implementations" section in its documentation. These tools are designed to have only one implementation and thus do not follow the above naming convention. For example, to create a logger object you would simply type dlib::logger mylog("name");. For the purposes of object creation the API components also appear to be single-implementation. That is, there is no need to specify which implementation you want since it is automatically determined by which platform you compile under. Note also that there are no explicit checking versions of these components. However, there are DLIB_ASSERT statements that perform checking and you can enable them by #defining DEBUG or ENABLE_ASSERTS.
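
The idea of checks that are compiled in only on request can be sketched as follows. TOY_ASSERT, toy_checks_enabled, and checked_divide are hypothetical names invented for this illustration. This is not how DLIB_ASSERT is implemented, and the sketch throws std::logic_error as a stand-in rather than dlib::fatal_error.

```cpp
#include <cassert>
#include <stdexcept>

// The checks are active only when DEBUG or ENABLE_ASSERTS is defined,
// for example by passing -DENABLE_ASSERTS to the compiler.
#if defined(DEBUG) || defined(ENABLE_ASSERTS)
    const bool toy_checks_enabled = true;
    #define TOY_ASSERT(expr, msg) \
        do { if (!(expr)) throw std::logic_error(msg); } while (0)
#else
    const bool toy_checks_enabled = false;
    #define TOY_ASSERT(expr, msg) ((void)0)
#endif

int checked_divide(int a, int b)
/*!
    requires
        - b != 0
    ensures
        - returns a / b
!*/
{
    TOY_ASSERT(b != 0, "requires clause broken: b != 0");
    return a / b;
}
```

When neither macro is defined, the check disappears entirely and a call that breaks the requires clause has undefined results, which matches the meaning of requires given in the Notation section.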

Assumptions

There are some restrictions on the behavior of certain objects or functions. Rather than replicating these restrictions all over the place in my documentation they are listed here.
  • global swap()
      It is assumed that this function does not throw. Undefined behavior results if it does. Note that std::swap() for all built-in types and std::string does not throw.

  • operator<()
      It is assumed that this operator (or std::less or any similar functor supplied by you to the library) does not throw. Undefined behavior results if it does.

  • dlib::general_hash
      It is assumed that general_hash does not throw. Undefined behavior results if it does. This is actually noted in the general hash spec file but I'm listing it here also for good measure.
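
For a user defined type, meeting the first two assumptions might look like the sketch below. The record type is hypothetical and exists only for this example. Marking the functions noexcept is my own choice for making the guarantee visible to the compiler; the text above only requires that they do not throw.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical user type.
struct record
{
    int id;
    std::string name;
};

// Global swap for record. Swapping an int and a std::string does not
// throw, so the function as a whole does not throw either.
inline void swap(record& a, record& b) noexcept
{
    using std::swap;
    swap(a.id, b.id);
    swap(a.name, b.name);
}

// Ordering for record. Comparing two ints does not throw.
inline bool operator<(const record& a, const record& b) noexcept
{
    return a.id < b.id;
}
```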

Thread Safety

In the library there are three kinds of objects with regard to threading:

  • Objects which are completely thread safe. This means that any pattern of access from multiple threads is safe.
  • Objects which are safe to use if no threads touch the same instance, but require access to a particular instance to be serialized via a mutex if it is shared among threads.
  • Objects which share some kind of global resource or are reference counted. This kind of object is extremely thread unfriendly and can only be used in a threaded program with great care.

How do you know which components/objects are thread safe and which aren't? The rule is that if the specification for the component doesn't mention threading or thread safety then it is OK to use as long as you serialize access to shared instances. If the component might have some global resources or be reference counted then the specification will tell you this. Lastly, if the component is completely thread safe then the specification will tell you this.

Also note that global functions in dlib are always thread safe.
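
Serializing access to a shared instance, as the second kind of object requires, can be sketched as follows. The example uses std::mutex, std::thread, and std::vector from the standard library rather than dlib's own threading tools, and fill_shared_vector is a hypothetical helper written for this illustration. A std::vector plays the part of an object whose specification says nothing about thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Several threads append to one shared vector. This is safe only because
// every access to the shared instance happens while holding the same mutex.
inline std::size_t fill_shared_vector(unsigned num_threads, unsigned items_per_thread)
{
    std::vector<int> shared;  // the instance shared among threads
    std::mutex m;             // serializes all access to shared

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t)
    {
        workers.emplace_back([&shared, &m, items_per_thread]() {
            for (unsigned i = 0; i < items_per_thread; ++i)
            {
                std::lock_guard<std::mutex> lock(m);
                shared.push_back(static_cast<int>(i));
            }
        });
    }

    // Wait for every worker before reading the result without the lock.
    for (std::thread& w : workers)
        w.join();

    assert(shared.size() == static_cast<std::size_t>(num_threads) * items_per_thread);
    return shared.size();
}
```

Removing the lock_guard line would turn this into a data race, since the threads would then modify the same instance concurrently.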