Dlib is principally a C++ library; however, you can use a number of its tools from Python applications. This page documents the Python API for working with these dlib tools. If you haven’t done so already, you should probably look at the Python example programs first before consulting this reference. These example programs are little mini-tutorials for using dlib from Python. They are listed on the left of the main dlib web page.
Classes
dlib.array
dlib.cca_outputs
dlib.cnn_face_detection_model_v1
dlib.correlation_tracker
dlib.drectangle
dlib.face_recognition_model_v1
dlib.fhog_object_detector
dlib.full_object_detection
dlib.full_object_detections
dlib.image_window
dlib.matrix
dlib.mmod_rectangle
dlib.mmod_rectangles
dlib.mmod_rectangless
dlib.pair
dlib.point
dlib.points
dlib.range
dlib.ranges
dlib.rangess
dlib.ranking_pair
dlib.ranking_pairs
dlib.rectangle
dlib.rectangles
dlib.rgb_pixel
dlib.segmenter_params
dlib.segmenter_test
dlib.segmenter_type
dlib.shape_predictor
dlib.shape_predictor_training_options
dlib.simple_object_detector
dlib.simple_object_detector_training_options
dlib.simple_test_results
dlib.sparse_ranking_pair
dlib.sparse_ranking_pairs
dlib.sparse_vector
dlib.sparse_vectors
dlib.sparse_vectorss
dlib.svm_c_trainer_histogram_intersection
dlib.svm_c_trainer_linear
dlib.svm_c_trainer_radial_basis
dlib.svm_c_trainer_sparse_histogram_intersection
dlib.svm_c_trainer_sparse_linear
dlib.svm_c_trainer_sparse_radial_basis
dlib.svm_rank_trainer
dlib.svm_rank_trainer_sparse
dlib.vector
dlib.vectors
dlib.vectorss
Functions
dlib.apply_cca_transform()
dlib.assignment_cost()
dlib.cca()
dlib.chinese_whispers_clustering()
dlib.cross_validate_ranking_trainer()
dlib.cross_validate_sequence_segmenter()
dlib.cross_validate_trainer()
dlib.cross_validate_trainer_threaded()
dlib.dot()
dlib.find_candidate_object_locations()
dlib.get_frontal_face_detector()
dlib.hit_enter_to_continue()
dlib.load_libsvm_formatted_data()
dlib.make_sparse_vector()
dlib.max_cost_assignment()
dlib.max_index_plus_one()
dlib.save_face_chip()
dlib.save_face_chips()
dlib.save_libsvm_formatted_data()
dlib.solve_structural_svm_problem()
dlib.test_binary_decision_function()
dlib.test_ranking_function()
dlib.test_regression_function()
dlib.test_sequence_segmenter()
dlib.test_shape_predictor()
dlib.test_simple_object_detector()
dlib.train_sequence_segmenter()
dlib.train_shape_predictor()
dlib.train_simple_object_detector()
Detailed API Listing
dlib.apply_cca_transform((matrix)m, (sparse_vector)v) → vector
requires
    - max_index_plus_one(v) <= m.nr()
ensures
    - returns trans(m)*v (i.e. multiply m by the vector v and return the result)
class dlib.array
This object represents a 1D array of floating point numbers. Moreover, it binds directly to the C++ type std::vector<double>.

    append((array)arg1, (object)arg2) → None
    clear((array)arg1) → None
    extend((array)arg1, (object)arg2) → None
    resize((array)arg1, (int)arg2) → None
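A minimal usage sketch (the values are arbitrary; the type also supports standard Python sequence operations such as len() and indexing):

    import dlib

    a = dlib.array()        # wraps std::vector<double>
    a.append(0.5)
    a.extend([1.5, 2.5])    # extend accepts a Python iterable of numbers
    a.resize(5)             # new elements are zero-initialized
    print("{} {}".format(len(a), a[0]))   # -> 5 0.5
    a.clear()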
dlib.assignment_cost((matrix)cost, (list)assignment) → float
requires
    - cost.nr() == cost.nc() (i.e. the input must be a square matrix)
    - for all valid i: 0 <= assignment[i] < cost.nr()
ensures
    - Interprets cost as a cost assignment matrix. That is, cost[i][j] represents the cost of assigning i to j.
    - Interprets assignment as a particular set of assignments. That is, i is assigned to assignment[i].
    - returns the cost of the given assignment. That is, returns:
          sum over i: cost[i][assignment[i]]
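This function is typically paired with dlib.max_cost_assignment() (listed above). A minimal sketch, following dlib's max_cost_assignment.py example:

    import dlib

    # cost[i][j] is the value of assigning person i to job j.
    cost = dlib.matrix([[1, 2, 6],
                        [5, 3, 6],
                        [4, 5, 0]])

    # Find the assignment that maximizes the total cost.
    assignment = dlib.max_cost_assignment(cost)

    print("Optimal assignments: {}".format(assignment))   # [2, 0, 1]
    print("Optimal cost: {}".format(dlib.assignment_cost(cost, assignment)))  # 16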
dlib.cca((sparse_vectors)L, (sparse_vectors)R, (int)num_correlations[, (int)extra_rank=5[, (int)q=2[, (float)regularization=0]]]) → cca_outputs
requires
    - num_correlations > 0
    - len(L) > 0
    - len(R) > 0
    - len(L) == len(R)
    - regularization >= 0
    - L and R must be properly sorted sparse vectors. This means they must list their elements in ascending index order and not contain duplicate index values. You can use make_sparse_vector() to ensure this is true.
ensures
    - This function performs a canonical correlation analysis between the vectors in L and R. That is, it finds two transformation matrices, Ltrans and Rtrans, such that row vectors in the transformed matrices L*Ltrans and R*Rtrans are as correlated as possible (note that in this notation we interpret L as a matrix with the input vectors in its rows). Note also that this function tries to find transformations which produce num_correlations dimensional output vectors.
    - Note that you can easily apply the transformation to a vector using apply_cca_transform(). For example: apply_cca_transform(Ltrans, some_sparse_vector)
    - returns a structure containing the Ltrans and Rtrans transformation matrices as well as the estimated correlations between elements of the transformed vectors.
    - This function assumes the data vectors in L and R have already been centered (i.e. we assume the vectors have zero means). However, in many cases it is fine to use uncentered data with cca(). But if centering is important for your problem then you should center your data before passing it to cca().
    - This function works with reduced rank approximations of the L and R matrices. This makes it fast when working with large matrices. In particular, we use the dlib::svd_fast() routine to find reduced rank representations of the input matrices by calling it as follows: svd_fast(L, U, D, V, num_correlations+extra_rank, q) and similarly for R. This means that you can use the extra_rank and q arguments to cca() to influence the accuracy of the reduced rank approximation. However, the default values should work fine for most problems.
    - The dimensions of the output vectors produced by L*#Ltrans or R*#Rtrans are ordered such that the dimensions with the highest correlations come first. That is, after applying the transforms produced by cca() to a set of vectors you will find that dimension 0 has the highest correlation, then dimension 1 has the next highest, and so on. This also means that the list of estimated correlations returned from cca() will always be listed in decreasing order.
    - This function performs the ridge regression version of Canonical Correlation Analysis when regularization is set to a value > 0. In particular, larger values indicate the solution should be more heavily regularized. This can be useful when the dimensionality of the data is larger than the number of samples.
    - A good discussion of CCA can be found in the paper "Canonical Correlation Analysis" by David Weenink. In particular, this function is implemented using equations 29 and 30 from his paper. We also use the idea of doing CCA on a reduced rank approximation of L and R as suggested by Paramveer S. Dhillon in his paper "Two Step CCA: A new spectral method for estimating vector models of words".
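A minimal sketch with toy, zero-mean, one-dimensional data. It assumes the returned cca_outputs object exposes Ltrans, Rtrans, and correlations attributes mirroring the C++ cca_outputs struct:

    import dlib

    # Two paired sets of sparse vectors (toy data; real data should be centered).
    L = dlib.sparse_vectors()
    R = dlib.sparse_vectors()
    for i in range(-25, 25):
        lv = dlib.sparse_vector()
        rv = dlib.sparse_vector()
        lv.append(dlib.pair(0, float(i)))        # index 0, value i
        rv.append(dlib.pair(0, 2.0 * float(i)))  # perfectly correlated with L
        L.append(lv)
        R.append(rv)

    res = dlib.cca(L, R, 1)              # ask for 1 correlated dimension
    print(list(res.correlations))        # near 1 for this toy data

    # Project a vector into the learned correlated subspace:
    x = dlib.apply_cca_transform(res.Ltrans, L[0])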
dlib.chinese_whispers_clustering((list)descriptors, (float)threshold) → list
Takes a list of descriptors and returns a list that contains a label for each descriptor. Clustering is done using dlib::chinese_whispers.
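For example, on toy 3D descriptors (in practice the descriptors are usually 128D face descriptors from face_recognition_model_v1, and dlib's face clustering example uses a threshold of 0.5):

    import dlib

    # Two tight groups of vectors; descriptors closer than the threshold
    # (in Euclidean distance) tend to end up with the same label.
    descriptors = [dlib.vector([0.0, 0.0, 0.0]),
                   dlib.vector([0.1, 0.0, 0.0]),
                   dlib.vector([5.0, 5.0, 5.0]),
                   dlib.vector([5.1, 5.0, 5.0])]

    labels = dlib.chinese_whispers_clustering(descriptors, 0.5)
    print(labels)                                 # e.g. [0, 0, 1, 1]
    print("clusters: {}".format(len(set(labels))))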
class dlib.cnn_face_detection_model_v1
This object detects human faces in an image. The constructor loads the face detection model from a file. You can download a pre-trained model from http://dlib.net/files/mmod_human_face_detector.dat.bz2.
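A minimal sketch, assuming the model file above has been downloaded and unpacked (the image path is a placeholder, and dlib.load_rgb_image is assumed available in this dlib build):

    import dlib

    cnn_face_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
    img = dlib.load_rgb_image("some_image.jpg")

    # The second argument upsamples the image once, which helps find smaller faces.
    dets = cnn_face_detector(img, 1)
    for d in dets:
        print("Face at {} with confidence {}".format(d.rect, d.confidence))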
class dlib.correlation_tracker
This is a tool for tracking moving objects in a video stream. You give it the bounding box of an object in the first frame and it attempts to track the object in the box from frame to frame. This tool is an implementation of the method described in the following paper:
    Danelljan, Martin, et al. "Accurate scale estimation for robust visual tracking." Proceedings of the British Machine Vision Conference (BMVC), 2014.

    get_position((correlation_tracker)arg1) → drectangle
    returns the predicted position of the object under track.
    start_track((correlation_tracker)arg1, (object)image, (drectangle)bounding_box) → None
    start_track((correlation_tracker)arg1, (object)image, (rectangle)bounding_box) → None
    requires
        - image is a numpy ndarray containing either an 8bit grayscale or RGB image.
        - bounding_box.is_empty() == false
    ensures
        - This object will start tracking the thing inside the bounding box in the given image. That is, if you call update() with subsequent video frames then it will try to keep track of the position of the object inside bounding_box.
        - #get_position() == bounding_box
    update((correlation_tracker)arg1, (object)image) → float
    requires
        - image is a numpy ndarray containing either an 8bit grayscale or RGB image.
        - get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
    ensures
        - performs: return update(image, get_position())

    update((correlation_tracker)arg1, (object)image, (drectangle)guess) → float
    update((correlation_tracker)arg1, (object)image, (rectangle)guess) → float
    requires
        - image is a numpy ndarray containing either an 8bit grayscale or RGB image.
        - get_position().is_empty() == false (i.e. you must have started tracking by calling start_track())
    ensures
        - When searching for the object in the image, we search in the area around the provided guess.
        - #get_position() == the new predicted location of the object in the image. This location will be a copy of guess that has been translated and scaled appropriately based on the content of the image so that it, hopefully, bounds the object.
        - returns the peak to side-lobe ratio. This is a number that measures how confident the tracker is that the object is inside #get_position(). Larger values indicate higher confidence.
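A minimal tracking loop, modeled on dlib's correlation_tracker.py example (the frame paths and the initial bounding box are placeholders):

    import glob
    import dlib

    tracker = dlib.correlation_tracker()

    for k, f in enumerate(sorted(glob.glob("frames/*.jpg"))):
        img = dlib.load_rgb_image(f)
        if k == 0:
            # Tell the tracker where the object is in the first frame.
            tracker.start_track(img, dlib.rectangle(74, 67, 112, 153))
        else:
            confidence = tracker.update(img)   # peak-to-side-lobe ratio
            print("{} {}".format(tracker.get_position(), confidence))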
dlib.cross_validate_ranking_trainer((svm_rank_trainer)trainer, (ranking_pairs)samples, (int)folds) → _ranking_test
dlib.cross_validate_ranking_trainer((svm_rank_trainer_sparse)trainer, (sparse_ranking_pairs)samples, (int)folds) → _ranking_test
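For example, a minimal cross-validation run on toy ranking data, modeled on dlib's svm_rank.py example:

    import dlib

    # In each ranking_pair, relevant vectors should outscore nonrelevant ones.
    queries = dlib.ranking_pairs()
    for i in range(4):
        q = dlib.ranking_pair()
        q.relevant.append(dlib.vector([1, 0]))
        q.nonrelevant.append(dlib.vector([0, 1]))
        queries.append(q)

    trainer = dlib.svm_rank_trainer()
    trainer.c = 10
    print(dlib.cross_validate_ranking_trainer(trainer, queries, 4))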
dlib.cross_validate_sequence_segmenter((vectorss)samples, (rangess)segments, (int)folds[, (segmenter_params)params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>]) → segmenter_test
dlib.cross_validate_sequence_segmenter((sparse_vectorss)samples, (rangess)segments, (int)folds[, (segmenter_params)params=<BIO,highFeats,signed,win=5,threads=4,eps=0.1,cache=40,non-verbose,C=100>]) → segmenter_test
dlib.cross_validate_trainer((svm_c_trainer_radial_basis)trainer, (vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer((svm_c_trainer_sparse_radial_basis)trainer, (sparse_vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer((svm_c_trainer_histogram_intersection)trainer, (vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer((svm_c_trainer_sparse_histogram_intersection)trainer, (sparse_vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer((svm_c_trainer_linear)trainer, (vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer((svm_c_trainer_sparse_linear)trainer, (sparse_vectors)x, (array)y, (int)folds) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_radial_basis)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_sparse_radial_basis)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_histogram_intersection)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_sparse_histogram_intersection)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_linear)trainer, (vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
dlib.cross_validate_trainer_threaded((svm_c_trainer_sparse_linear)trainer, (sparse_vectors)x, (array)y, (int)folds, (int)num_threads) → _binary_test
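For example, a minimal sketch that cross-validates an RBF trainer on a toy two-class problem; the threaded variant at the end spreads the same computation over 4 threads:

    import dlib

    x = dlib.vectors()
    y = dlib.array()
    # Toy problem: label +1 near (1,1), label -1 near (-1,-1).
    for i in range(20):
        x.append(dlib.vector([1 + 0.05 * i, 1]))
        y.append(+1)
        x.append(dlib.vector([-1 - 0.05 * i, -1]))
        y.append(-1)

    trainer = dlib.svm_c_trainer_radial_basis()
    trainer.set_c(10)
    print(dlib.cross_validate_trainer(trainer, x, y, 4))
    print(dlib.cross_validate_trainer_threaded(trainer, x, y, 4, 4))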
dlib.dot((vector)arg1, (vector)arg2) → float
Compute the dot product between two dense column vectors.
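For example:

    import dlib

    a = dlib.vector([1, 2, 3])
    b = dlib.vector([4, 5, 6])
    print(dlib.dot(a, b))   # 32.0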
class dlib.drectangle
This object represents a rectangular area of an image with floating point coordinates.

    area((drectangle)arg1) → float
    bottom((drectangle)arg1) → float
    center((drectangle)arg1) → point
    contains((drectangle)arg1, (point)point) → bool
    contains((drectangle)arg1, (int)x, (int)y) → bool
    contains((drectangle)arg1, (drectangle)rectangle) → bool
    dcenter((drectangle)arg1) → point
    height((drectangle)arg1) → float
    intersect((drectangle)arg1, (drectangle)rectangle) → drectangle
    is_empty((drectangle)arg1) → bool
    left((drectangle)arg1) → float
    right((drectangle)arg1) → float
    top((drectangle)arg1) → float
    width((drectangle)arg1) → float
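For example (this assumes the (left, top, right, bottom) constructor exposed by the Python bindings; as with dlib.rectangle, width and height count pixels inclusively):

    import dlib

    r = dlib.drectangle(10.5, 20.5, 110.5, 220.5)
    print("{} {} {}".format(r.width(), r.height(), r.area()))  # 101.0 201.0 20301.0
    print(r.contains(dlib.point(50, 100)))                     # True
    print(r.intersect(dlib.drectangle(0, 0, 50, 50)).is_empty())  # False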
class dlib.face_recognition_model_v1
This object maps human faces into 128D vectors where pictures of the same person are mapped near to each other and pictures of different people are mapped far apart. The constructor loads the face recognition model from a file. The model file is available here: http://dlib.net/files/dlib_face_recognition_resnet_model_v1.dat.bz2

    compute_face_descriptor((face_recognition_model_v1)arg1, (object)img, (full_object_detection)face[, (int)num_jitters=0]) → vector
    Takes an image and a full_object_detection that references a face in that image and converts it into a 128D face descriptor. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.

    compute_face_descriptor((face_recognition_model_v1)arg1, (object)img, (full_object_detections)faces[, (int)num_jitters=0]) → vectors
    Takes an image and an array of full_object_detections that reference faces in that image and converts them into 128D face descriptors. If num_jitters>1 then each face will be randomly jittered slightly num_jitters times, each run through the 128D projection, and the average used as the face descriptor.
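A minimal sketch, modeled on dlib's face_recognition.py example. It assumes the recognition model above and a landmark model (here the 68-point file, a placeholder choice) have been downloaded, and that the image path is a placeholder:

    import dlib

    detector = dlib.get_frontal_face_detector()
    sp = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
    facerec = dlib.face_recognition_model_v1("dlib_face_recognition_resnet_model_v1.dat")

    img = dlib.load_rgb_image("face.jpg")
    for det in detector(img, 1):
        shape = sp(img, det)                       # a full_object_detection
        descriptor = facerec.compute_face_descriptor(img, shape)
        print(list(descriptor)[:4])                # first few of the 128 components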
class dlib.fhog_object_detector
This object represents a sliding window histogram-of-oriented-gradients based object detector.

    run((fhog_object_detector)arg1, (object)image[, (int)upsample_num_times=0[, (float)adjust_threshold=0.0]]) → tuple
    requires
        - image is a numpy ndarray containing either an 8bit grayscale or RGB image.
        - upsample_num_times >= 0
    ensures
        - This function runs the object detector on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
        - Upsamples the image upsample_num_times before running the basic detector.

    static run_multiple((list)detectors, (object)image[, (int)upsample_num_times=0[, (float)adjust_threshold=0.0]]) → tuple
    requires
        - detectors is a list of detectors.
        - image is a numpy ndarray containing either an 8bit grayscale or RGB image.
        - upsample_num_times >= 0
    ensures
        - This function runs the list of object detectors at once on the input image and returns a tuple of (list of detections, list of scores, list of weight_indices).
        - Upsamples the image upsample_num_times before running the basic detectors.

    save((fhog_object_detector)arg1, (str)detector_output_filename) → None
    Saves this object detector to the provided path.
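For example, using the frontal face detector (itself an fhog_object_detector) and asking run() for scores; a negative adjust_threshold also reports weaker detections, and the image path is a placeholder:

    import dlib

    detector = dlib.get_frontal_face_detector()
    img = dlib.load_rgb_image("some_image.jpg")

    # Upsample once, lower the detection threshold by 0.5.
    dets, scores, idx = detector.run(img, 1, -0.5)
    for d, s, i in zip(dets, scores, idx):
        print("Detection {}, score: {}, sub-detector index: {}".format(d, s, i))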
dlib.find_candidate_object_locations((object)image, (list)rects[, (tuple)kvals=(50, 200, 3)[, (int)min_size=20[, (int)max_merging_iterations=50]]]) → None
requires
    - image is an image object which is a numpy ndarray
    - len(kvals) == 3
    - kvals should be a tuple that specifies the range of k values to use. In particular, it should take the form (start, end, num) where num > 0.
ensures
    - This function takes an input image and generates a set of candidate rectangles which are expected to bound any objects in the image. It does this by running a version of the segment_image() routine on the image and then reports rectangles containing each of the segments as well as rectangles containing unions of adjacent segments. The basic idea is described in the paper:
          Segmentation as Selective Search for Object Recognition by Koen E. A. van de Sande, et al.
      Note that this function deviates from what is described in the paper slightly. See the code for details.
    - The basic segmentation is performed kvals[2] times, each time with the k parameter (see segment_image() and the Felzenszwalb paper for details on k) set to a different value from the range of numbers linearly spaced between kvals[0] and kvals[1].
    - When doing the basic segmentations prior to any box merging, we discard all rectangles that have an area < min_size. Therefore, all outputs and subsequent merged rectangles are built out of rectangles that contain at least min_size pixels. Note that setting min_size to a smaller value than you might otherwise be interested in using can be useful since it allows a larger number of possible merged boxes to be created.
    - There are max_merging_iterations rounds of neighboring blob merging. Therefore, this parameter has some effect on the number of output rectangles you get, with larger values of the parameter giving more output rectangles.
    - This function appends the output rectangles into #rects. This means that any rectangles in rects before this function was called will still be in there after it terminates. Note further that #rects will not contain any duplicate rectangles. That is, for all valid i and j where i != j it will be true that #rects[i] != rects[j].
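A minimal sketch, following dlib's find_candidate_object_locations.py example (the image path is a placeholder):

    import dlib

    img = dlib.load_rgb_image("some_image.jpg")
    rects = []
    dlib.find_candidate_object_locations(img, rects, min_size=500)

    print("number of rectangles found: {}".format(len(rects)))
    for d in rects[:3]:
        print("left,top,right,bottom: {},{},{},{}".format(
            d.left(), d.top(), d.right(), d.bottom()))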
class dlib.full_object_detection
This object represents the location of an object in an image along with the positions of each of its constituent parts.

    num_parts
    The number of parts of the object.

    part((full_object_detection)arg1, (int)idx) → point
    A single part of the object as a dlib point.

    parts((full_object_detection)arg1) → points
    A vector of dlib points representing all of the parts.

    rect
    The bounding box from the underlying detector. Parts can be outside the box if appropriate.
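full_object_detection objects are usually produced by a shape_predictor. A minimal sketch, assuming the standard 68-point landmark model has been downloaded (the file and image paths are placeholders):

    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    img = dlib.load_rgb_image("face.jpg")
    for det in detector(img, 1):
        shape = predictor(img, det)    # a full_object_detection
        print("box: {}, parts: {}".format(shape.rect, shape.num_parts))
        print("first two landmarks: {} {}".format(shape.part(0), shape.part(1)))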
class dlib.full_object_detections
An array of full_object_detection objects.

    append((full_object_detections)arg1, (object)arg2) → None
    clear((full_object_detections)arg1) → None
    extend((full_object_detections)arg1, (object)arg2) → None
    resize((full_object_detections)arg1, (int)arg2) → None
dlib.get_frontal_face_detector() → fhog_object_detector
Returns the default face detector.
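The most common usage is to call the returned detector directly (the image path is a placeholder):

    import dlib

    detector = dlib.get_frontal_face_detector()
    img = dlib.load_rgb_image("some_image.jpg")
    dets = detector(img, 1)    # upsample once, then detect
    print("Number of faces detected: {}".format(len(dets)))
    for i, d in enumerate(dets):
        print("Detection {}: {}".format(i, d))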
dlib.hit_enter_to_continue() → None
Asks the user to hit enter to continue and pauses until they do so.
class dlib.image_window
This is a GUI window capable of showing images on the screen.

    add_overlay((image_window)arg1, (rectangles)rectangles[, (rgb_pixel)color=rgb_pixel(255,0,0)]) → None
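A typical usage sketch, modeled on dlib's example programs (set_image is another image_window method; the image path is a placeholder):

    import dlib

    detector = dlib.get_frontal_face_detector()
    img = dlib.load_rgb_image("some_image.jpg")

    win = dlib.image_window()
    win.set_image(img)
    win.add_overlay(detector(img, 1))   # draw detections; default color is red
    dlib.hit_enter_to_continue()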