# Introduction¶

seqlearn extends the scikit-learn machine learning library to deal with sequence classification: sequences of observations that must be individually labeled, but where the order in which they appear matters.

seqlearn mimicks the basic scikit-learn `fit`

/`predict`

API
and tries to stay compatible with scikit-learn’s data formats,
but adds an argument to the scikit-learn methods that encodes the structure
of the input. This argument is called `lengths`

and should be an array of integers denoting the respective lengths
of sequences in `(X, y)`

.

For example, if `X`

and `y`

both have length (`shape[0]`

) of 10, then
`lengths=[6, 4]`

encodes the information that `(X[:6], y[:6])`

and
`(X[6:10], y[6:10])`

are both coherent sequences.
This encoding of sequence information allows for a fast implementation
using NumPy’s vectorized operations.