Dlib C++ Library

Dlib is principally a C++ library; however, you can use a number of its tools from Python applications. This page documents the Python API for working with these dlib tools. If you haven’t done so already, you should probably look at the Python example programs before consulting this reference. These example programs are mini-tutorials for using dlib from Python. They are listed on the left of the main dlib web page.

Detailed API Listing

dlib.apply_cca_transform((matrix)m, (sparse_vector)v) → vector :
requires
  • max_index_plus_one(v) <= m.nr()
ensures
  • returns trans(m)*v (i.e. multiply m by the vector v and return the result)
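To make the returned quantity concrete, here is a minimal pure-Python sketch of trans(m)*v. It is not dlib's implementation: the helper name apply_transform_sketch is hypothetical, m is modeled as a plain list of rows, and v as a list of (index, value) pairs.

```python
def apply_transform_sketch(m, v):
    """Compute trans(m) * v for a dense matrix m and a sparse vector v.

    m is a list of rows (hypothetical stand-in for dlib.matrix) and v is a
    list of (index, value) pairs (hypothetical stand-in for dlib.sparse_vector).
    """
    num_rows = len(m)
    num_cols = len(m[0]) if m else 0
    # Mirror the documented precondition: max_index_plus_one(v) <= m.nr().
    if v and max(index for index, _ in v) + 1 > num_rows:
        raise ValueError("sparse vector index exceeds the number of rows in m")
    result = [0.0] * num_cols
    for index, value in v:
        # Row `index` of m is column `index` of trans(m).
        for col in range(num_cols):
            result[col] += m[index][col] * value
    return result
```

The result has one entry per column of m, which is why the transformed vector has as many dimensions as the transformation matrix has columns.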
class dlib.array

This object represents a 1D array of floating point numbers. Moreover, it binds directly to the C++ type std::vector<double>.

append((array)arg1, (object)arg2) → None
clear((array)arg1) → None
extend((array)arg1, (object)arg2) → None
resize((array)arg1, (int)arg2) → None
dlib.assignment_cost((matrix)cost, (list)assignment) → float :
requires
  • cost.nr() == cost.nc() (i.e. the input must be a square matrix)
  • for all valid i:
    • 0 <= assignment[i] < cost.nr()
ensures
  • Interprets cost as a cost assignment matrix. That is, cost[i][j] represents the cost of assigning i to j.

  • Interprets assignment as a particular set of assignments. That is, i is assigned to assignment[i].

  • returns the cost of the given assignment. That is, returns a number which is:

    sum over i: cost[i][assignment[i]]
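The sum above can be sketched in pure Python as follows. The function name assignment_cost_sketch is hypothetical and cost is modeled as a list of lists rather than a dlib.matrix; the checks mirror the documented preconditions.

```python
def assignment_cost_sketch(cost, assignment):
    """Return sum over i of cost[i][assignment[i]] for a square cost matrix."""
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost must be a square matrix")
    if any(not 0 <= j < n for j in assignment):
        raise ValueError("every assignment[i] must satisfy 0 <= assignment[i] < cost.nr()")
    # i is assigned to assignment[i], so pick that column in row i.
    return sum(cost[i][j] for i, j in enumerate(assignment))
```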

dlib.cca((sparse_vectors)L, (sparse_vectors)R, (int)num_correlations[, (int)extra_rank=5[, (int)q=2[, (float)regularization=0]]]) → cca_outputs :
requires
  • num_correlations > 0
  • len(L) > 0
  • len(R) > 0
  • len(L) == len(R)
  • regularization >= 0
  • L and R must be properly sorted sparse vectors. This means they must list their elements in ascending index order and not contain duplicate index values. You can use make_sparse_vector() to ensure this is true.
ensures
  • This function performs a canonical correlation analysis between the vectors in L and R. That is, it finds two transformation matrices, Ltrans and Rtrans, such that row vectors in the transformed matrices L*Ltrans and R*Rtrans are as correlated as possible (note that in this notation we interpret L as a matrix with the input vectors in its rows). Note also that this function tries to find transformations which produce num_correlations dimensional output vectors.

  • Note that you can easily apply the transformation to a vector using apply_cca_transform(), for example:

    • apply_cca_transform(Ltrans, some_sparse_vector)
  • returns a structure containing the Ltrans and Rtrans transformation matrices as well as the estimated correlations between elements of the transformed vectors.

  • This function assumes the data vectors in L and R have already been centered (i.e. we assume the vectors have zero means). However, in many cases it is fine to use uncentered data with cca(). But if it is important for your problem then you should center your data before passing it to cca().

  • This function works with reduced rank approximations of the L and R matrices. This makes it fast when working with large matrices. In particular, we use the dlib::svd_fast() routine to find reduced rank representations of the input matrices by calling it as follows: svd_fast(L, U,D,V, num_correlations+extra_rank, q) and similarly for R. This means that you can use the extra_rank and q arguments to cca() to influence the accuracy of the reduced rank approximation. However, the default values should work fine for most problems.

  • The dimensions of the output vectors produced by L*#Ltrans or R*#Rtrans are ordered such that the dimensions with the highest correlations come first. That is, after applying the transforms produced by cca() to a set of vectors you will find that dimension 0 has the highest correlation, then dimension 1 has the next highest, and so on. This also means that the list of estimated correlations returned from cca() will always be listed in decreasing order.

  • This function performs the ridge regression version of Canonical Correlation Analysis when regularization is set to a value > 0. In particular, larger values indicate the solution should be more heavily regularized. This can be useful when the dimensionality of the data is larger than the number of samples.

  • A good discussion of CCA can be found in the paper “Canonical Correlation Analysis” by David Weenink. In particular, this function is implemented using equations 29 and 30 from his paper. We also use the idea of doing CCA on a reduced rank approximation of L and R as suggested by Paramveer S. Dhillon in his paper “Two Step CCA: A new spectral method for estimating vector models of words”.
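Since cca() assumes centered inputs, the following is a minimal sketch of centering a set of vectors before analysis. It works on dense lists of floats; the helper name center_rows is hypothetical, and converting the centered rows into sparse_vectors for cca() is left out. Note that centering generally makes sparse data dense.

```python
def center_rows(rows):
    """Subtract the per-dimension mean so every dimension has zero mean."""
    if not rows:
        return []
    count = len(rows)
    dims = len(rows[0])
    # Mean of each dimension across all input vectors.
    means = [sum(row[d] for row in rows) / count for d in range(dims)]
    return [[row[d] - means[d] for d in range(dims)] for row in rows]
```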

class dlib.cca_outputs
Ltrans
Rtrans
correlations
dlib.chinese_whispers_clustering((list)descriptors, (float)threshold) → list :

Takes a list of descriptors and returns a list that contains a label for each descriptor. Clustering is done using dlib::chinese_whispers.
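Because the result is one label per input descriptor, a common follow-up step is to group descriptor indices by label. The sketch below assumes only that labels is the list returned by the clustering call; the helper name group_by_label is hypothetical.

```python
from collections import defaultdict


def group_by_label(labels):
    """Map each cluster label to the indices of the descriptors carrying it."""
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)
    return dict(clusters)
```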

class dlib.cnn_face_detection_model_v1

This object detects human faces in an image. The constructor loads the face detection model from a file. You can download a pre-trained model from http://dlib.net/files/mmod_human_face_detector.dat.bz2.

class dlib.correlation_tracker

This is a tool for tracking moving objects in a video stream. You give it the bounding box of an object in the first frame and it attempts to track the object in the box from frame to frame. This tool is an implementation of the method described in the following paper:

Danelljan, Martin, et al. ‘Accurate scale estimation for robust visual tracking.’ Proceedings of the British Machine Vision Conference BMVC. 2014.
get_position((correlation_tracker)arg1) → drectangle :

returns the predicted position of the object under track.

start_track((correlation_tracker)arg1, (object)image, (drectangle)bounding_box) → None :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • bounding_box.is_empty() == false
ensures
  • This object will start tracking the thing inside the bounding box in the given image. That is, if you call update() with subsequent video frames then it will try to keep track of the position of the object inside bounding_box.
  • #get_position() == bounding_box
start_track( (correlation_tracker)arg1, (object)image, (rectangle)bounding_box) -> None :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • bounding_box.is_empty() == false
ensures
  • This object will start tracking the thing inside the bounding box in the given image. That is, if you call update() with subsequent video frames then it will try to keep track of the position of the object inside bounding_box.
  • #get_position() == bounding_box
update((correlation_tracker)arg1, (object)image) → float :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
ensures
  • performs: return update(img, get_position())
update( (correlation_tracker)arg1, (object)image, (drectangle)guess) -> float :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
ensures
  • When searching for the object in img, we search in the area around the provided guess.
  • #get_position() == the new predicted location of the object in img. This location will be a copy of guess that has been translated and scaled appropriately based on the content of img so that it, hopefully, bounds the object in img.
  • Returns the peak to side-lobe ratio. This is a number that measures how confident the tracker is that the object is inside #get_position(). Larger values indicate higher confidence.
update( (correlation_tracker)arg1, (object)image, (rectangle)guess) -> float :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
ensures
  • When searching for the object in img, we search in the area around the provided guess.
  • #get_position() == the new predicted location of the object in img. This location will be a copy of guess that has been translated and scaled appropriately based on the content of img so that it, hopefully, bounds the object in img.
  • Returns the peak to side-lobe ratio. This is a number that measures how confident the tracker is that the object is inside #get_position(). Larger values indicate higher confidence.
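The start_track()/update()/get_position() contract above can be sketched as a simple loop. This is an illustration only: the function name track_object is hypothetical, tracker is assumed to be a dlib.correlation_tracker instance (any object with the same three methods works), and frames is assumed to be an iterable of images in the format the tracker requires.

```python
def track_object(tracker, frames, bounding_box):
    """Track the object inside bounding_box across frames.

    Returns one (position, confidence) tuple per frame after the first.
    """
    frames = iter(frames)
    first_frame = next(frames, None)
    if first_frame is None:
        return []  # no frames, nothing to track
    # The first frame initializes the tracker with the known location.
    tracker.start_track(first_frame, bounding_box)
    results = []
    for frame in frames:
        # update() returns the peak to side-lobe ratio (higher is more confident).
        confidence = tracker.update(frame)
        results.append((tracker.get_position(), confidence))
    return results
```

Keeping the confidence next to each position lets the caller decide when the track has been lost and needs to be restarted.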
dlib.cross_validate_ranking_trainer((svm_rank_trainer)trainer, (ranking_pairs)samples, (int)folds) → _ranking_test

cross_validate_ranking_trainer( (svm_rank_trainer_sparse)trainer, (sparse_ranking_pairs)samples, (int)folds) -> _ranking_test

dlib.cross_validate_sequence_segmenter((vectorss)samples, (rangess)segments, (int)folds[, (segmenter_params)params=<BIO, highFeats, signed, win=5, threads=4, eps=0.1, cache=40, non-verbose, C=100>]) → segmenter_test

cross_validate_sequence_segmenter( (sparse_vectorss)samples, (rangess)segments, (int)folds [, (segmenter_params)params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>]) -> segmenter_test

dlib.cross_validate_trainer((svm_c_trainer_radial_basis)trainer, (vectors)x, (array)y, (int)folds) → _binary_test

cross_validate_trainer( (svm_c_trainer_sparse_radial_basis)trainer, (sparse_vectors)x, (array)y, (int)folds) -> _binary_test

cross_validate_trainer( (svm_c_trainer_histogram_intersection)trainer, (vectors)x, (array)y, (int)folds) -> _binary_test

cross_validate_trainer( (svm_c_trainer_sparse_histogram_intersection)trainer, (sparse_vectors)x, (array)y, (int)folds) -> _binary_test

cross_validate_trainer( (svm_c_trainer_linear)trainer, (vectors)x, (array)y, (int)folds) -> _binary_test

cross_validate_trainer( (svm_c_trainer_sparse_linear)trainer, (sparse_vectors)x, (array)y, (int)folds) -> _binary_test

dlib.cross_validate_trainer_threaded((svm_c_trainer_radial_basis)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test

cross_validate_trainer_threaded( (svm_c_trainer_sparse_radial_basis)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) -> _binary_test

cross_validate_trainer_threaded( (svm_c_trainer_histogram_intersection)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) -> _binary_test

cross_validate_trainer_threaded( (svm_c_trainer_sparse_histogram_intersection)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) -> _binary_test

cross_validate_trainer_threaded( (svm_c_trainer_linear)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) -> _binary_test

cross_validate_trainer_threaded( (svm_c_trainer_sparse_linear)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) -> _binary_test

dlib.dot((vector)arg1, (vector)arg2) → float :

Compute the dot product between two dense column vectors.

class dlib.drectangle

This object represents a rectangular area of an image with floating point coordinates.

area((drectangle)arg1) → float
bottom((drectangle)arg1) → float
center((drectangle)arg1) → point
contains((drectangle)arg1, (point)point) → bool

contains( (drectangle)arg1, (int)x, (int)y) -> bool

contains( (drectangle)arg1, (drectangle)rectangle) -> bool

dcenter((drectangle)arg1) → point
height((drectangle)arg1) → float
intersect((drectangle)arg1, (drectangle)rectangle) → drectangle
is_empty((drectangle)arg1) → bool
left((drectangle)arg1) → float
right((drectangle)arg1) → float
top((drectangle)arg1) → float
width((drectangle)arg1) → float
class dlib.face_recognition_model_v1

This object maps human faces into 128D vectors where pictures of the same person are mapped near to each other and pictures of different people are mapped far apart. The constructor loads the face recognition model from a file. The model file is available here: http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2

compute_face_descriptor((face_recognition_model_v1)arg1, (object)img, (full_object_detection)face[, (int)num_jitters=0]) → vector :

Takes an image and a full_object_detection that references a face in that image and converts it into a 128D face descriptor. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.

compute_face_descriptor( (face_recognition_model_v1)arg1, (object)img, (full_object_detections)faces [, (int)num_jitters=0]) -> vectors :
Takes an image and an array of full_object_detections that reference faces in that image and converts them into 128D face descriptors. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.
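Since descriptors of the same person are mapped near each other, two faces are typically compared by the distance between their descriptors. The sketch below uses Euclidean distance on plain sequences of floats. Both helper names are hypothetical, and the default threshold of 0.6 is an assumption that does not come from this reference; tune it on your own data.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two face descriptors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same dimensionality")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def same_person(a, b, threshold=0.6):
    """Guess whether two descriptors belong to the same person.

    The threshold is an assumed value, not part of the documented API.
    """
    return descriptor_distance(a, b) < threshold
```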
class dlib.fhog_object_detector

This object represents a sliding window histogram-of-oriented-gradients based object detector.

run((fhog_object_detector)arg1, (object)image[, (int)upsample_num_times=0[, (float)adjust_threshold=0.0]]) → tuple :
requires
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • upsample_num_times >= 0
ensures
  • This function runs the object detector on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
  • Upsamples the image upsample_num_times before running the basic detector.
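As a minimal sketch of consuming the returned tuple, the helper below keeps only detections whose score reaches a caller-chosen minimum. The name detections_above and the min_score parameter are hypothetical; detector is assumed to be an fhog_object_detector (any object with a compatible run() method works).

```python
def detections_above(detector, image, min_score, upsample_num_times=0):
    """Run the detector and return (detection, score) pairs scoring >= min_score."""
    # run() returns three parallel lists: detections, scores, weight_indices.
    detections, scores, weight_indices = detector.run(image, upsample_num_times)
    return [(det, score) for det, score in zip(detections, scores) if score >= min_score]
```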
static run_multiple((list)detectors, (object)image[, (int)upsample_num_times=0[, (float)adjust_threshold=0.0]]) → tuple :
requires
  • detectors is a list of detectors.
  • image is a numpy ndarray containing either an 8bit grayscale or RGB image.
  • upsample_num_times >= 0
ensures
  • This function runs the list of object detectors at once on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
  • Upsamples the image upsample_num_times before running the basic detector.
save((fhog_object_detector)arg1, (str)detector_output_filename) → None :

Save a simple_object_detector to the provided path.

dlib.find_candidate_object_locations((object)image, (list)rects[, (tuple)kvals=(50, 200, 3)[, (int)min_size=20[, (int)max_merging_iterations=50]]]) → None :

Returns found candidate objects
requires
  • image == an image object which is a numpy ndarray
  • len(kvals) == 3
  • kvals should be a tuple that specifies the range of k values to use. In particular, it should take the form (start, end, num) where num > 0.
ensures
  • This function takes an input image and generates a set of candidate rectangles which are expected to bound any objects in the image. It does this by running a version of the segment_image() routine on the image and then reports rectangles containing each of the segments as well as rectangles containing unions of adjacent segments. The basic idea is described in the paper:

    Segmentation as Selective Search for Object Recognition by Koen E. A. van de Sande, et al.

    Note that this function deviates from what is described in the paper slightly. See the code for details.

  • The basic segmentation is performed kvals[2] times, each time with the k parameter (see segment_image() and the Felzenszwalb paper for details on k) set to a different value from the range of numbers linearly spaced between kvals[0] to kvals[1].

  • When doing the basic segmentations prior to any box merging, we discard all rectangles that have an area < min_size. Therefore, all outputs and subsequent merged rectangles are built out of rectangles that contain at least min_size pixels. Note that setting min_size to a smaller value than you might otherwise be interested in using can be useful since it allows a larger number of possible merged boxes to be created.

  • There are max_merging_iterations rounds of neighboring blob merging. Therefore, this parameter has some effect on the number of output rectangles you get, with larger values of the parameter giving more output rectangles.

  • This function appends the output rectangles into #rects. This means that any rectangles in rects before this function was called will still be in there after it terminates. Note further that #rects will not contain any duplicate rectangles. That is, for all valid i and j where i != j it will be true that:

    • #rects[i] != #rects[j]
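To illustrate how kvals is interpreted, the sketch below lists the k values implied by a (start, end, num) tuple. It assumes both endpoints are included and that num >= 2; the helper name k_values is hypothetical and this is not dlib's code.

```python
def k_values(kvals):
    """Return num values linearly spaced from start to end, endpoints included."""
    start, end, num = kvals
    if num < 2:
        raise ValueError("this sketch assumes num >= 2")
    step = (end - start) / (num - 1)
    return [start + i * step for i in range(num)]
```

Under these assumptions the default kvals of (50, 200, 3) corresponds to three segmentation passes with k set to 50, 125, and 200.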
class dlib.full_object_detection

This object represents the location of an object in an image along with the positions of each of its constituent parts.

num_parts

The number of parts of the object.

part((full_object_detection)arg1, (int)idx) → point :

A single part of the object as a dlib point.

parts((full_object_detection)arg1) → points :

A vector of dlib points representing all of the parts.

rect

Bounding box from the underlying detector. Parts can be outside box if appropriate.

class dlib.full_object_detections

An array of full_object_detection objects.

append((full_object_detections)arg1, (object)arg2) → None
clear((full_object_detections)arg1) → None
extend((full_object_detections)arg1, (object)arg2) → None
resize((full_object_detections)arg1, (int)arg2) → None
dlib.get_frontal_face_detector() → fhog_object_detector :

Returns the default face detector

dlib.hit_enter_to_continue() → None :

Asks the user to hit enter to continue and pauses until they do so.

class dlib.image_window

This is a GUI window capable of showing images on the screen.

add_overlay((image_window)arg1, (rectangles)rectangles[, (rgb_pixel)color=rgb_pixel(255, 0, 0)]) → None :

Add a list of rectangles to the image_window. They will be displayed as red boxes by default, but the color can be passed.

add_overlay( (image_window)arg1, (rectangle)rectangle [, (rgb_pixel)color=rgb_pixel(255,0,0)]) -> None :
Add a rectangle to the image_window. It will be displayed as a red box by default, but the color can be passed.
add_overlay( (image_window)arg1, (drectangle)rectangle [, (rgb_pixel)color=rgb_pixel(255,0,0)]) -> None :
Add a rectangle to the image_window. It will be displayed as a red box by default, but the color can be passed.
add_overlay( (image_window)arg1, (full_object_detection)detection [, (rgb_pixel)color=rgb_pixel(0,0,255)]) -> None :
Add full_object_detection parts to the image window. They will be displayed as blue lines by default, but the color can be passed.
clear_overlay((image_window)arg1) → None :

Remove all overlays from the image_window.

set_image((image_window)arg1, (object)image) → None :

Make the image_window display the given image.

set_image( (image_window)arg1, (fhog_object_detector)detector) -> None :
Make the image_window display the given HOG detector’s filters.
set_image( (image_window)arg1, (simple_object_detector)detector) -> None :
Make the image_window display the given HOG detector’s filters.
set_title((image_window)arg1, (str)title) → None :

Set the title of the window to the given value.

wait_until_closed((image_window)arg1) → None :

This function blocks until the window is closed.

dlib.load_libsvm_formatted_data((str)file_name) → tuple :

ensures
  • Attempts to read a file of the given name that should contain libsvm formatted data. The data is returned as a tuple where the first tuple element is an array of sparse vectors and the second element is an array of labels.
dlib.make_sparse_vector((sparse_vector)arg1) → None :
This function modifies its argument so that it is a properly sorted sparse vector.
This means that the elements of the sparse vector will be ordered so that pairs with smaller indices come first. Additionally, there won’t be any pairs with identical indices. If such pairs were present in the input sparse vector then their values will be added together and only one pair with their index will be present in the output.
make_sparse_vector( (sparse_vectors)arg1) -> None :
This function modifies a sparse_vectors object so that all elements it contains are properly sorted sparse vectors.
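The sorting and merging behavior described above can be sketched in pure Python. The sketch models a sparse vector as a list of (index, value) tuples and modifies it in place; the name make_sparse_vector_sketch is hypothetical and this is not dlib's implementation.

```python
def make_sparse_vector_sketch(pairs):
    """Sort pairs by index and merge duplicate indices by adding their values."""
    totals = {}
    for index, value in pairs:
        # Values that share an index are added together.
        totals[index] = totals.get(index, 0.0) + value
    # Replace the contents in place so the caller's list is updated.
    pairs[:] = sorted(totals.items())
```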
class dlib.matrix

This object represents a dense 2D matrix of floating point numbers. Moreover, it binds directly to the C++ type dlib::matrix<double>.

nc((matrix)arg1) → int :

Return the number of columns in the matrix.

nr((matrix)arg1) → int :

Return the number of rows in the matrix.

set_size((matrix)arg1, (int)rows, (int)cols) → None :

Set the size of the matrix to the given number of rows and columns.

shape
dlib.max_cost_assignment((matrix)cost) → list :
requires
  • cost.nr() == cost.nc() (i.e. the input must be a square matrix)
ensures
  • Finds and returns the solution to the following optimization problem:

    Maximize: f(A) == assignment_cost(cost, A)

    Subject to the following constraints:

    • The elements of A are unique. That is, there aren’t any elements of A which are equal.
    • len(A) == cost.nr()
  • Note that this function converts the input cost matrix into a 64bit fixed point representation. Therefore, you should make sure that the values in your cost matrix can be accurately represented by 64bit fixed point values. If this is not the case then the solution may become inaccurate due to rounding error. In general, this function will work properly when the ratio of the largest to the smallest value in cost is no more than about 1e16.
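To make the optimization problem concrete, the sketch below solves it by brute force: it tries every assignment with unique elements and keeps the one with the largest total cost. This only illustrates the problem statement and is far slower than a real solver on anything but tiny matrices; the function name is hypothetical and cost is modeled as a list of lists.

```python
from itertools import permutations


def max_cost_assignment_bruteforce(cost):
    """Return the assignment A maximizing sum over i of cost[i][A[i]]."""
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost must be a square matrix")
    # Every permutation of range(n) has unique elements and length n,
    # which is exactly the constraint set of the problem.
    best = max(
        permutations(range(n)),
        key=lambda a: sum(cost[i][a[i]] for i in range(n)),
    )
    return list(best)
```

For example, with cost = [[1, 2, 6], [5, 3, 6], [4, 5, 0]] the best assignment is [2, 0, 1], with a total cost of 6 + 5 + 5 = 16.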

dlib.max_index_plus_one((sparse_vector)v) → int :

ensures
  • returns the dimensionality of the given sparse vector. That is, returns a number one larger than the maximum index value in the vector. If the vector is empty then returns 0.
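A pure-Python sketch of this rule, with the sparse vector modeled as a list of (index, value) tuples (the function name is hypothetical):

```python
def max_index_plus_one_sketch(pairs):
    """Return one more than the largest index, or 0 for an empty vector."""
    return max((index for index, _ in pairs), default=-1) + 1
```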
class dlib.mmod_rectangle

Wrapper around a rectangle object and a detection confidence score.

confidence
rect
class dlib.mmod_rectangles

An array of mmod rectangle objects.

append((mmod_rectangles)arg1, (object)arg2) → None
extend((mmod_rectangles)arg1, (object)arg2) → None
class dlib.mmod_rectangless

A 2D array of mmod rectangle objects.

append((mmod_rectangless)arg1, (object)arg2) → None
extend((mmod_rectangless)arg1, (object)arg2) → None
class dlib.pair

This object is used to represent the elements of a sparse_vector.

first

This field represents the index/dimension number.

second

This field contains the value in a vector at the dimension specified by the first field.

class dlib.point

This object represents a single point of integer coordinates that maps directly to a dlib::point.

x

The x-coordinate of the point.

y

The y-coordinate of the point.

class dlib.points

An array of point objects.

append((points)arg1, (object)arg2) → None
clear((points)arg1) → None
extend((points)arg1, (object)arg2) → None
resize((points)arg1, (int)arg2) → None
class dlib.range

This object is used to represent a range of elements in an array.

begin

The index of the first element in the range. This is represented using an unsigned integer.

end

One past the index of the last element in the range. This is represented using an unsigned integer.
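Because end is one past the last element, a range behaves like a Python slice. The sketch below models ranges as (begin, end) tuples and pulls the covered elements out of a list; the helper name extract_segments is hypothetical.

```python
def extract_segments(items, segments):
    """Return the sub-list covered by each half-open (begin, end) range."""
    # end is exclusive, so items[begin:end] yields end - begin elements.
    return [items[begin:end] for begin, end in segments]
```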

class dlib.ranges

This object is an array of range objects.

append((ranges)arg1, (object)arg2) → None
clear((ranges)arg1) → None
extend((ranges)arg1, (object)arg2) → None
resize((ranges)arg1, (int)arg2) → None
class dlib.rangess

This object is an array of arrays of range objects.

append((rangess)arg1, (object)arg2) → None
clear((rangess)arg1) → None
extend((rangess)arg1, (object)arg2) → None
resize((rangess)arg1, (int)arg2) → None
class dlib.ranking_pair
nonrelevant
relevant
class dlib.ranking_pairs
append((ranking_pairs)arg1, (object)arg2) → None
clear((ranking_pairs)arg1) → None
extend((ranking_pairs)arg1, (object)arg2) → None
resize((ranking_pairs)arg1, (int)arg2) → None
class dlib.rectangle

This object represents a rectangular area of an image.

area((rectangle)arg1) → int
bottom((rectangle)arg1) → int
center((rectangle)arg1) → point
contains((rectangle)arg1, (point)point) → bool

contains( (rectangle)arg1, (int)x, (int)y) -> bool

contains( (rectangle)arg1, (rectangle)rectangle) -> bool

dcenter((rectangle)arg1) → point
height((rectangle)arg1) → int
intersect((rectangle)arg1, (rectangle)rectangle) → rectangle
is_empty((rectangle)arg1) → bool
left((rectangle)arg1) → int
right((rectangle)arg1) → int
top((rectangle)arg1) → int
width((rectangle)arg1) → int
class dlib.rectangles

An array of rectangle objects.

append((rectangles)arg1, (object)arg2) → None
clear((rectangles)arg1) → None
extend((rectangles)arg1, (object)arg2) → None
resize((rectangles)arg1, (int)arg2) → None
class dlib.rgb_pixel
blue
green
red
dlib.save_face_chip((object)img, (full_object_detection)face, (str)chip_filename) → None :

Takes an image and a full_object_detection that references a face in that image and saves the face with the specified file name prefix. The face will be rotated upright and scaled to 150x150 pixels.

dlib.save_face_chips((object)img, (full_object_detections)faces, (str)chip_filename) → None :

Takes an image and a full_object_detections object that reference faces in that image and saves the faces with the specified file name prefix. The faces will be rotated upright and scaled to 150x150 pixels.

dlib.save_libsvm_formatted_data((str)file_name, (sparse_vectors)samples, (array)labels) → None :
requires
  • len(samples) == len(labels)
ensures
  • saves the data to the given file in libsvm format
class dlib.segmenter_params

This class is used to define all the optional parameters to the train_sequence_segmenter() and cross_validate_sequence_segmenter() routines.

C

SVM C parameter

allow_negative_weights
be_verbose
epsilon
max_cache_size
num_threads
use_BIO_model
use_high_order_features
window_size
class dlib.segmenter_test

This object is the output of the dlib.test_sequence_segmenter() and dlib.cross_validate_sequence_segmenter() routines.

f1
precision
recall
class dlib.segmenter_type

This object represents a sequence segmenter and is the type of object returned by the dlib.train_sequence_segmenter() routine.

weights
class dlib.shape_predictor

This object is a tool that takes in an image region containing some object and outputs a set of point locations that define the pose of the object. The classic example of this is human face pose prediction, where you take an image of a human face as input and are expected to identify the locations of important facial landmarks such as the corners of the mouth and eyes, tip of the nose, and so forth.

save((shape_predictor)arg1, (str)predictor_output_filename) → None :

Save a shape_predictor to the provided path.

class dlib.shape_predictor_training_options

This object is a container for the options to the train_shape_predictor() routine.

be_verbose

If true, train_shape_predictor() will print out a lot of information to stdout while training.

cascade_depth

The number of cascades created to train the model with.

feature_pool_region_padding

Size of region within which to sample features for the feature pool, e.g. a padding of 0.5 would cause the algorithm to sample pixels from a box that was 2x2 pixels.

feature_pool_size

Number of pixels used to generate features for the random trees.

lambda_param

Controls how tight the feature sampling should be. Lower values enforce closer features.

nu

The regularization parameter. Larger values of this parameter will cause the algorithm to fit the training data better but may also cause overfitting. The value must be in the range (0, 1].

num_test_splits

Number of split features at each node to sample. The one that gives the best split is chosen.

num_trees_per_cascade_level

The number of trees created for each cascade.

oversampling_amount

The number of randomly selected initial starting points sampled for each training example.

random_seed

The random seed used by the internal random number generator.

tree_depth

The depth of the trees used in each cascade. There are pow(2, get_tree_depth()) leaves in each tree.
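As a rough way to reason about model size, the sketch below computes the leaf count per tree from tree_depth and multiplies it out over the cascade. The total assumes every cascade level holds num_trees_per_cascade_level trees of full depth; both function names are hypothetical.

```python
def leaves_per_tree(tree_depth):
    """Number of leaves in one tree, i.e. pow(2, tree_depth)."""
    return 2 ** tree_depth


def total_leaves(cascade_depth, num_trees_per_cascade_level, tree_depth):
    """Leaves across the whole cascade under the assumption stated above."""
    return cascade_depth * num_trees_per_cascade_level * leaves_per_tree(tree_depth)
```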

class dlib.simple_object_detector

This object represents a sliding window histogram-of-oriented-gradients based object detector.

save((simple_object_detector)arg1, (str)detector_output_filename) → None :

Save a simple_object_detector to the provided path.

class dlib.simple_object_detector_training_options

This object is a container for the options to the train_simple_object_detector() routine.

C

C is the usual SVM C regularization parameter. So it is passed to structural_object_detection_trainer::set_c(). Larger values of C will encourage the trainer to fit the data better but might lead to overfitting. Therefore, you must determine the proper setting of this parameter experimentally.

add_left_right_image_flips

If true, train_simple_object_detector() will assume the objects are left/right symmetric and add in left right flips of the training images. This doubles the size of the training dataset.

be_verbose

If true, train_simple_object_detector() will print out a lot of information to the screen while training.

detection_window_size

The sliding window used will have about this many pixels inside it.

epsilon

epsilon is the stopping epsilon. Smaller values make the trainer’s solver more accurate but might take longer to train.

num_threads

train_simple_object_detector() will use this many threads of execution. Set this to the number of CPU cores on your machine to obtain the fastest training speed.

upsample_limit

train_simple_object_detector() will upsample images if needed, but no more than upsample_limit times. A value of 0 forbids the trainer from upsampling any images. If the trainer is unable to fit all boxes within the required upsample_limit, an exception will be thrown. Higher values of upsample_limit exponentially increase memory requirements. Values higher than 2 (the default) are not recommended.

class dlib.simple_test_results
average_precision
precision
recall
dlib.solve_structural_svm_problem((object)problem) → vector :

This function solves a structural SVM problem and returns the weight vector that defines the solution. See the example program python_examples/svm_struct.py for documentation about how to create a proper problem object.

class dlib.sparse_ranking_pair
nonrelevant
relevant
class dlib.sparse_ranking_pairs
append((sparse_ranking_pairs)arg1, (object)arg2) → None
clear((sparse_ranking_pairs)arg1) → None
extend((sparse_ranking_pairs)arg1, (object)arg2) → None
resize((sparse_ranking_pairs)arg1, (int)arg2) → None
class dlib.sparse_vector

This object represents the mathematical idea of a sparse column vector. It is simply an array of dlib.pair objects, each representing an index/value pair in the vector. Any elements of the vector which are missing are implicitly set to zero.

Unless otherwise noted, any routines taking a sparse_vector assume the sparse vector is sorted and has unique elements. That is, the index values of the pairs in a sparse_vector should be listed in increasing order and there should not be duplicates. However, some functions work with “unsorted” sparse vectors. These are dlib.sparse_vector objects that have either duplicate entries or non-sorted index values. Note further that you can convert an “unsorted” sparse_vector into a properly sorted sparse vector by calling dlib.make_sparse_vector() on it.
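The "missing elements are implicitly zero" convention can be illustrated by expanding a sparse vector into a dense list. The sketch models the vector as a list of (index, value) tuples and assumes it is already sorted with unique indices; the helper name to_dense is hypothetical.

```python
def to_dense(pairs, size):
    """Expand (index, value) pairs into a dense list of the given size."""
    dense = [0.0] * size  # absent indices stay at zero
    for index, value in pairs:
        if not 0 <= index < size:
            raise ValueError("index does not fit in a vector of the given size")
        dense[index] = value
    return dense
```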

append((sparse_vector)arg1, (object)arg2) → None
clear((sparse_vector)arg1) → None
extend((sparse_vector)arg1, (object)arg2) → None
resize((sparse_vector)arg1, (int)arg2) → None
class dlib.sparse_vectors

This object is an array of sparse_vector objects.

append((sparse_vectors)arg1, (object)arg2) → None
clear((sparse_vectors)arg1) → None
extend((sparse_vectors)arg1, (object)arg2) → None
resize((sparse_vectors)arg1, (int)arg2) → None
class dlib.sparse_vectorss

This object is an array of arrays of sparse_vector objects.

append((sparse_vectorss)arg1, (object)arg2) → None
clear((sparse_vectorss)arg1) → None
extend((sparse_vectorss)arg1, (object)arg2) → None
resize((sparse_vectorss)arg1, (int)arg2) → None
class dlib.svm_c_trainer_histogram_intersection
c_class1
c_class2
cache_size
epsilon
set_c((svm_c_trainer_histogram_intersection)arg1, (float)arg2) → None
train((svm_c_trainer_histogram_intersection)arg1, (vectors)arg2, (array)arg3) → _decision_function_histogram_intersection
class dlib.svm_c_trainer_linear
be_quiet((svm_c_trainer_linear)arg1) → None
be_verbose((svm_c_trainer_linear)arg1) → None
c_class1
c_class2
epsilon
force_last_weight_to_1
has_prior
learns_nonnegative_weights
max_iterations
set_c((svm_c_trainer_linear)arg1, (float)arg2) → None
set_prior((svm_c_trainer_linear)arg1, (_decision_function_linear)arg2) → None
train((svm_c_trainer_linear)arg1, (vectors)arg2, (array)arg3) → _decision_function_linear
class dlib.svm_c_trainer_radial_basis
c_class1
c_class2
cache_size
epsilon
gamma
set_c((svm_c_trainer_radial_basis)arg1, (float)arg2) → None
train((svm_c_trainer_radial_basis)arg1, (vectors)arg2, (array)arg3) → _decision_function_radial_basis
class dlib.svm_c_trainer_sparse_histogram_intersection
c_class1
c_class2
cache_size
epsilon
set_c((svm_c_trainer_sparse_histogram_intersection)arg1, (float)arg2) → None
train((svm_c_trainer_sparse_histogram_intersection)arg1, (sparse_vectors)arg2, (array)arg3) → _decision_function_sparse_histogram_intersection
class dlib.svm_c_trainer_sparse_linear
be_quiet((svm_c_trainer_sparse_linear)arg1) → None
be_verbose((svm_c_trainer_sparse_linear)arg1) → None
c_class1
c_class2
epsilon
force_last_weight_to_1
has_prior
learns_nonnegative_weights
max_iterations
set_c((svm_c_trainer_sparse_linear)arg1, (float)arg2) → None
set_prior((svm_c_trainer_sparse_linear)arg1, (_decision_function_sparse_linear)arg2) → None
train((svm_c_trainer_sparse_linear)arg1, (sparse_vectors)arg2, (array)arg3) → _decision_function_sparse_linear
class dlib.svm_c_trainer_sparse_radial_basis
c_class1
c_class2
cache_size
epsilon
gamma
set_c((svm_c_trainer_sparse_radial_basis)arg1, (float)arg2) → None
train((svm_c_trainer_sparse_radial_basis)arg1, (sparse_vectors)arg2, (array)arg3) → _decision_function_sparse_radial_basis
class dlib.svm_rank_trainer
be_quiet((svm_rank_trainer)arg1) → None
be_verbose((svm_rank_trainer)arg1) → None
c
epsilon
force_last_weight_to_1
has_prior
learns_nonnegative_weights
max_iterations
set_prior((svm_rank_trainer)arg1, (_decision_function_linear)arg2) → None
train((svm_rank_trainer)arg1, (ranking_pair)arg2) → _decision_function_linear

train( (svm_rank_trainer)arg1, (ranking_pairs)arg2) -> _decision_function_linear

class dlib.svm_rank_trainer_sparse
be_quiet((svm_rank_trainer_sparse)arg1) → None
be_verbose((svm_rank_trainer_sparse)arg1) → None
c
epsilon
force_last_weight_to_1
has_prior
learns_nonnegative_weights
max_iterations
set_prior((svm_rank_trainer_sparse)arg1, (_decision_function_sparse_linear)arg2) → None
train((svm_rank_trainer_sparse)arg1, (sparse_ranking_pair)arg2) → _decision_function_sparse_linear

train( (svm_rank_trainer_sparse)arg1, (sparse_ranking_pairs)arg2) -> _decision_function_sparse_linear

dlib.test_binary_decision_function((_decision_function_linear)function, (vectors)samples, (array)labels) → _binary_test

test_binary_decision_function( (_decision_function_sparse_linear)function, (sparse_vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_radial_basis)function, (vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_sparse_radial_basis)function, (sparse_vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_polynomial)function, (vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_sparse_polynomial)function, (sparse_vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_histogram_intersection)function, (vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_sparse_histogram_intersection)function, (sparse_vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_sigmoid)function, (vectors)samples, (array)labels) -> _binary_test

test_binary_decision_function( (_decision_function_sparse_sigmoid)function, (sparse_vectors)samples, (array)labels) -> _binary_test

dlib.test_ranking_function((_decision_function_linear)function, (ranking_pairs)samples) → _ranking_test

test_ranking_function( (_decision_function_sparse_linear)function, (sparse_ranking_pairs)samples) -> _ranking_test

test_ranking_function( (_decision_function_linear)function, (ranking_pair)sample) -> _ranking_test

test_ranking_function( (_decision_function_sparse_linear)function, (sparse_ranking_pair)sample) -> _ranking_test

dlib.test_regression_function((_decision_function_linear)function, (vectors)samples, (array)targets) → _regression_test

test_regression_function( (_decision_function_sparse_linear)function, (sparse_vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_radial_basis)function, (vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_sparse_radial_basis)function, (sparse_vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_histogram_intersection)function, (vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_sparse_histogram_intersection)function, (sparse_vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_sigmoid)function, (vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_sparse_sigmoid)function, (sparse_vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_polynomial)function, (vectors)samples, (array)targets) -> _regression_test

test_regression_function( (_decision_function_sparse_polynomial)function, (sparse_vectors)samples, (array)targets) -> _regression_test

dlib.test_sequence_segmenter((segmenter_type)arg1, (vectorss)arg2, (rangess)arg3) → segmenter_test

test_sequence_segmenter( (segmenter_type)arg1, (sparse_vectorss)arg2, (rangess)arg3) -> segmenter_test

dlib.test_shape_predictor((str)dataset_filename, (str)predictor_filename) → float :
ensures
  • Loads an image dataset from dataset_filename. We assume dataset_filename is a file using the XML format written by save_image_dataset_metadata().
  • Loads a shape_predictor from the file predictor_filename. This means predictor_filename should be a file produced by the train_shape_predictor() routine.
  • This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
test_shape_predictor( (list)images, (list)detections, (shape_predictor)shape_predictor) -> float :
requires
  • len(images) == len(object_detections)
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
ensures
  • shape_predictor should be a shape_predictor object, such as one returned by the train_shape_predictor() routine.
  • This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
test_shape_predictor( (list)images, (list)detections, (list)scales, (shape_predictor)shape_predictor) -> float :
requires
  • len(images) == len(object_detections)
  • len(object_detections) == len(scales)
  • for every sublist in object_detections: len(object_detections[i]) == len(scales[i])
  • scales is a list of floating point scales that each predicted part location should be divided by. Useful for normalization.
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
ensures
  • shape_predictor should be a shape_predictor object, such as one returned by the train_shape_predictor() routine.
  • This function tests the predictor against the dataset and returns the mean average error of the predictor. In fact, the return value of this function is identical to that of dlib’s shape_predictor_trainer() routine. Therefore, see the documentation for shape_predictor_trainer() for a detailed definition of the mean average error.
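
To make the role of scales concrete, here is a minimal pure-Python sketch of a scale-normalized mean part error. It assumes the error is the Euclidean distance between each predicted part and its ground-truth location, divided by the scale for that object, and averaged over all parts. That definition, the flattened one-scale-per-object layout, and the name mean_part_error are assumptions made for illustration; the documentation for shape_predictor_trainer() remains the authoritative definition.

```python
import math

def mean_part_error(predicted, truth, scales=None):
    """Average point-to-point distance over all parts, optionally divided by a per-object scale.

    predicted, truth: one list of (x, y) points per object, in the same order.
    scales: optional list with one float per object.
    """
    total, count = 0.0, 0
    for i, (pred_parts, true_parts) in enumerate(zip(predicted, truth)):
        scale = 1.0 if scales is None else scales[i]
        for (px, py), (tx, ty) in zip(pred_parts, true_parts):
            total += math.hypot(px - tx, py - ty) / scale
            count += 1
    return total / count if count else 0.0

# One object with two parts: the first part is off by a 3-4-5 triangle, the second is exact.
predicted = [[(3, 4), (0, 0)]]
truth = [[(0, 0), (0, 0)]]
unscaled = mean_part_error(predicted, truth)
scaled = mean_part_error(predicted, truth, scales=[5.0])
```

Dividing by a scale such as an object's size lets errors from large and small objects be compared on the same footing.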
dlib.test_simple_object_detector((str)dataset_filename, (str)detector_filename[, (int)upsampling_amount=-1]) → simple_test_results :
requires
  • Optionally, take the number of times to upsample the testing images (upsampling_amount >= 0).
ensures
  • Loads an image dataset from dataset_filename. We assume dataset_filename is a file using the XML format written by save_image_dataset_metadata().
  • Loads a simple_object_detector from the file detector_filename. This means detector_filename should be a file produced by the train_simple_object_detector() routine.
  • This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
test_simple_object_detector( (list)images, (list)boxes, (fhog_object_detector)detector [, (int)upsampling_amount=0]) -> simple_test_results :
requires
  • len(images) == len(boxes)
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • boxes should be a list of lists of dlib.rectangle objects.
  • Optionally, take the number of times to upsample the testing images (upsampling_amount >= 0).
ensures
  • Uses the fhog_object_detector passed in as detector rather than loading one from a file.
  • This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
test_simple_object_detector( (list)images, (list)boxes, (simple_object_detector)detector [, (int)upsampling_amount=-1]) -> simple_test_results :
requires
  • len(images) == len(boxes)
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • boxes should be a list of lists of dlib.rectangle objects.
ensures
  • Uses the simple_object_detector passed in as detector, such as one returned by the train_simple_object_detector() routine, rather than loading one from a file.
  • This function tests the detector against the dataset and returns the precision, recall, and average precision of the detector. In fact, the return value of this function is identical to that of dlib’s test_object_detection_function() routine. Therefore, see the documentation for test_object_detection_function() for a detailed definition of these metrics.
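
As a rough illustration of the precision and recall figures mentioned above, the sketch below scores a list of detections against ground-truth boxes in plain Python. It is not dlib's implementation. Boxes are (left, top, right, bottom) tuples, and the greedy matching, the intersection-over-union threshold of 0.5, and the names iou and precision_recall are all assumptions made for this example. See test_object_detection_function() for the actual matching rule and for average precision, which is not computed here.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0
    inter = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(detections, truths, min_iou=0.5):
    """Greedily match each detection to at most one unmatched truth box."""
    matched = set()
    true_positives = 0
    for det in detections:
        for k, truth in enumerate(truths):
            if k not in matched and iou(det, truth) > min_iou:
                matched.add(k)
                true_positives += 1
                break
    precision = true_positives / len(detections) if detections else 1.0
    recall = true_positives / len(truths) if truths else 1.0
    return precision, recall

truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]  # one good hit, one false alarm
precision, recall = precision_recall(detections, truths)
```

Here one of two detections is correct and one of two truth boxes is found, so both precision and recall come out to one half.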
dlib.train_sequence_segmenter((vectorss)samples, (rangess)segments[, (segmenter_params)params=<BIO, highFeats, signed, win=5, threads=4, eps=0.1, cache=40, non-verbose, C=100>]) → segmenter_type

train_sequence_segmenter( (sparse_vectorss)samples, (rangess)segments [, (segmenter_params)params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>]) -> segmenter_type

dlib.train_shape_predictor((list)images, (list)object_detections, (shape_predictor_training_options)options) → shape_predictor :
requires
  • options.lambda_param > 0
  • 0 < options.nu <= 1
  • options.feature_pool_region_padding >= 0
  • len(images) == len(object_detections)
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • object_detections should be a list of lists of dlib.full_object_detection objects. Each dlib.full_object_detection contains the bounding box and the lists of points that make up the object parts.
ensures
  • Uses dlib’s shape_predictor_trainer object to train a shape_predictor based on the provided labeled images, full_object_detections, and options.
  • The trained shape_predictor is returned.
train_shape_predictor( (str)dataset_filename, (str)predictor_output_filename, (shape_predictor_training_options)options) -> None :
requires
  • options.lambda_param > 0
  • 0 < options.nu <= 1
  • options.feature_pool_region_padding >= 0
ensures
  • Uses dlib’s shape_predictor_trainer to train a shape_predictor based on the labeled images in the XML file dataset_filename and the provided options. This function assumes the file dataset_filename is in the XML format produced by dlib’s save_image_dataset_metadata() routine.
  • The trained shape predictor is serialized to the file predictor_output_filename.
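
The preconditions on options listed above can be checked before starting a long training run. The helper below is a hypothetical sketch, not part of dlib: it only reads the three attributes named in the requires list, so it should work with any object exposing them. The SimpleNamespace objects stand in for a shape_predictor_training_options instance, and their values are illustrative rather than dlib's defaults.

```python
from types import SimpleNamespace

def shape_predictor_option_problems(options):
    """Return a list of violated preconditions (an empty list means none)."""
    problems = []
    if not options.lambda_param > 0:
        problems.append("lambda_param must be > 0")
    if not 0 < options.nu <= 1:
        problems.append("nu must satisfy 0 < nu <= 1")
    if not options.feature_pool_region_padding >= 0:
        problems.append("feature_pool_region_padding must be >= 0")
    return problems

# Stand-in option objects with illustrative values.
good = SimpleNamespace(lambda_param=0.1, nu=0.1, feature_pool_region_padding=0)
bad = SimpleNamespace(lambda_param=0.0, nu=1.5, feature_pool_region_padding=-1)

print(shape_predictor_option_problems(good))
print(shape_predictor_option_problems(bad))
```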
dlib.train_simple_object_detector((str)dataset_filename, (str)detector_output_filename, (simple_object_detector_training_options)options) → None :
requires
  • options.C > 0
ensures
  • Uses the structural_object_detection_trainer to train a simple_object_detector based on the labeled images in the XML file dataset_filename. This function assumes the file dataset_filename is in the XML format produced by dlib’s save_image_dataset_metadata() routine.
  • This function will apply a reasonable set of default parameters and preprocessing techniques to the training procedure for simple_object_detector objects. So the point of this function is to provide you with a very easy way to train a basic object detector.
  • The trained object detector is serialized to the file detector_output_filename.
train_simple_object_detector( (list)images, (list)boxes, (simple_object_detector_training_options)options) -> simple_object_detector :
requires
  • options.C > 0
  • len(images) == len(boxes)
  • images should be a list of numpy matrices that represent images, either RGB or grayscale.
  • boxes should be a list of lists of dlib.rectangle objects.
ensures
  • Uses the structural_object_detection_trainer to train a simple_object_detector based on the labeled images and bounding boxes.
  • This function will apply a reasonable set of default parameters and preprocessing techniques to the training procedure for simple_object_detector objects. So the point of this function is to provide you with a very easy way to train a basic object detector.
  • The trained object detector is returned.
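
A typical workflow combines the file-based overloads of train_simple_object_detector() and test_simple_object_detector(). The function below is a minimal sketch of that workflow. The name train_and_evaluate, the default C of 5.0, and the file paths in the usage comment are placeholders chosen for illustration. dlib is imported inside the function so that the definition itself loads even where dlib is not installed.

```python
def train_and_evaluate(train_xml, test_xml, detector_path, C=5.0):
    """Train a detector from an XML dataset, save it, then test it on another dataset."""
    # Enforce the documented precondition options.C > 0 before doing any work.
    if C <= 0:
        raise ValueError("options.C must be > 0")

    import dlib  # imported lazily; requires dlib to be installed

    options = dlib.simple_object_detector_training_options()
    options.C = C

    # Serializes the trained detector to detector_path and returns None.
    dlib.train_simple_object_detector(train_xml, detector_path, options)

    # Returns a simple_test_results object for the saved detector.
    return dlib.test_simple_object_detector(test_xml, detector_path)

# Example call (placeholder paths, left commented out because it needs real data):
# results = train_and_evaluate("training.xml", "testing.xml", "detector.svm")
```

Testing on a dataset that was not used for training gives a more honest picture of the detector than reusing the training XML file.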
class dlib.vector

This object represents the mathematical idea of a column vector.

resize((vector)arg1, (int)arg2) → None
set_size((vector)arg1, (int)arg2) → None
shape
class dlib.vectors

This object is an array of vector objects.

append((vectors)arg1, (object)arg2) → None
clear((vectors)arg1) → None
extend((vectors)arg1, (object)arg2) → None
resize((vectors)arg1, (int)arg2) → None
class dlib.vectorss

This object is an array of arrays of vector objects.

append((vectorss)arg1, (object)arg2) → None
clear((vectorss)arg1) → None
extend((vectorss)arg1, (object)arg2) → None
resize((vectorss)arg1, (int)arg2) → None