API

Top level user functions:

all(a[, axis, keepdims, split_every, out])

Test whether all array elements along a given axis evaluate to True.

allclose(arr1, arr2[, rtol, atol, equal_nan])

Returns True if two arrays are element-wise equal within a tolerance.

angle(x[, deg])

Return the angle of the complex argument.

any(a[, axis, keepdims, split_every, out])

Test whether any array element along a given axis evaluates to True.

apply_along_axis(func1d, axis, arr, *args)

Apply a function to 1-D slices along the given axis.

apply_over_axes(func, a, axes)

Apply a function repeatedly over multiple axes.

arange(*args, **kwargs)

Return evenly spaced values from start to stop with step size step.

arccos(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arccos.

arccosh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arccosh.

arcsin(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arcsin.

arcsinh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arcsinh.

arctan(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arctan.

arctan2(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.arctan2.

arctanh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.arctanh.

argmax(x[, axis, split_every, out])

Return the indices of the maximum values along an axis.

argmin(x[, axis, split_every, out])

Return the indices of the minimum values along an axis.

argtopk(a, k[, axis, split_every])

Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest.

argwhere(a)

Find the indices of array elements that are non-zero, grouped by element.

around(x[, decimals])

Evenly round to the given number of decimals.

array(object[, dtype, copy, order, subok, ndmin])

This docstring was copied from numpy.array.

asanyarray(a)

Convert the input to a dask array.

asarray(a, **kwargs)

Convert the input to a dask array.

atleast_1d(*arys)

Convert inputs to arrays with at least one dimension.

atleast_2d(*arys)

View inputs as arrays with at least two dimensions.

atleast_3d(*arys)

View inputs as arrays with at least three dimensions.

average(a[, axis, weights, returned])

Compute the weighted average along the specified axis.

bincount(x[, weights, minlength])

This docstring was copied from numpy.bincount.

bitwise_and(x1, x2, /[, out, where, …])

This docstring was copied from numpy.bitwise_and.

bitwise_not(x, /[, out, where, casting, …])

This docstring was copied from numpy.invert.

bitwise_or(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.bitwise_or.

bitwise_xor(x1, x2, /[, out, where, …])

This docstring was copied from numpy.bitwise_xor.

block(arrays[, allow_unknown_chunksizes])

Assemble an nd-array from nested lists of blocks.

blockwise(func, out_ind, *args[, name, …])

Tensor operation: Generalized inner and outer products

broadcast_arrays(*args, **kwargs)

Broadcast any number of arrays against each other.

broadcast_to(x, shape[, chunks])

Broadcast an array to a new shape.

coarsen(reduction, x, axes[, trim_excess])

Coarsen array by applying reduction to fixed size neighborhoods

ceil(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.ceil.

choose(a, choices)

Construct an array from an index array and a set of arrays to choose from.

clip(*args, **kwargs)

Clip (limit) the values in an array.

compress(condition, a[, axis])

Return selected slices of an array along given axis.

concatenate(seq[, axis, …])

Concatenate arrays along an existing axis

conj(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.conjugate.

copysign(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.copysign.

corrcoef(x[, y, rowvar])

Return Pearson product-moment correlation coefficients.

cos(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.cos.

cosh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.cosh.

count_nonzero(a[, axis])

Counts the number of non-zero values in the array a.

cov(m[, y, rowvar, bias, ddof])

Estimate a covariance matrix, given data and weights.

cumprod(x[, axis, dtype, out])

Return the cumulative product of elements along a given axis.

cumsum(x[, axis, dtype, out])

Return the cumulative sum of the elements along a given axis.

deg2rad(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.deg2rad.

degrees(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.degrees.

diag(v)

Extract a diagonal or construct a diagonal array.

diagonal(a[, offset, axis1, axis2])

Return specified diagonals.

diff(a[, n, axis])

Calculate the n-th discrete difference along the given axis.

divmod(x1, x2[, out1, out2], / [[, out, …])

This docstring was copied from numpy.divmod.

digitize(a, bins[, right])

Return the indices of the bins to which each value in input array belongs.

dot(a, b[, out])

This docstring was copied from numpy.dot.

dstack(tup[, allow_unknown_chunksizes])

Stack arrays in sequence depth wise (along third axis).

ediff1d(ary[, to_end, to_begin])

The differences between consecutive elements of an array.

einsum(subscripts, *operands[, out, dtype, …])

This docstring was copied from numpy.einsum.

empty(*args, **kwargs)

Blocked variant of empty

empty_like(a[, dtype, order, chunks, name, …])

Return a new array with the same shape and type as a given array.

exp(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.exp.

expm1(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.expm1.

eye(N[, chunks, M, k, dtype])

Return a 2-D Array with ones on the diagonal and zeros elsewhere.

fabs(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.fabs.

fix(*args, **kwargs)

Round to nearest integer towards zero.

flatnonzero(a)

Return indices that are non-zero in the flattened version of a.

flip(m, axis)

Reverse element order along axis.

flipud(m)

Flip array in the up/down direction.

fliplr(m)

Flip array in the left/right direction.

floor(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.floor.

fmax(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.fmax.

fmin(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.fmin.

fmod(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.fmod.

frexp(x[, out1, out2], / [[, out, where, …])

This docstring was copied from numpy.frexp.

fromfunction(func[, chunks, shape, dtype])

Construct an array by executing a function over each coordinate.

frompyfunc(func, nin, nout)

This docstring was copied from numpy.frompyfunc.

full(shape, fill_value, *args, **kwargs)

Blocked variant of full

full_like(a, fill_value[, order, dtype, …])

Return a full array with the same shape and type as a given array.

gradient(f, *varargs, **kwargs)

Return the gradient of an N-dimensional array.

histogram(a[, bins, range, normed, weights, …])

Blocked variant of numpy.histogram().

hstack(tup[, allow_unknown_chunksizes])

Stack arrays in sequence horizontally (column wise).

hypot(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.hypot.

imag(*args, **kwargs)

Return the imaginary part of the complex argument.

indices(dimensions[, dtype, chunks])

Implements NumPy’s indices for Dask Arrays.

insert(arr, obj, values, axis)

Insert values along the given axis before the given indices.

invert(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.invert.

isclose(arr1, arr2[, rtol, atol, equal_nan])

Returns a boolean array where two arrays are element-wise equal within a tolerance.

iscomplex(*args, **kwargs)

Returns a bool array, where True if input element is complex.

isfinite(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.isfinite.

isin(element, test_elements[, …])

Calculates element in test_elements, broadcasting over element only.

isinf(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.isinf.

isneginf(*args, **kwargs)

Test element-wise for negative infinity, returning the result as a bool array.

isnan(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.isnan.

isnull(values)

pandas.isnull for dask arrays

isposinf(*args, **kwargs)

Test element-wise for positive infinity, returning the result as a bool array.

isreal(*args, **kwargs)

Returns a bool array, where True if input element is real.

ldexp(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.ldexp.

linspace(start, stop[, num, endpoint, …])

Return num evenly spaced values over the closed interval [start, stop].

log(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.log.

log10(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.log10.

log1p(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.log1p.

log2(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.log2.

logaddexp(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.logaddexp.

logaddexp2(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.logaddexp2.

logical_and(x1, x2, /[, out, where, …])

This docstring was copied from numpy.logical_and.

logical_not(x, /[, out, where, casting, …])

This docstring was copied from numpy.logical_not.

logical_or(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.logical_or.

logical_xor(x1, x2, /[, out, where, …])

This docstring was copied from numpy.logical_xor.

map_overlap(func, *args[, depth, boundary, …])

Map a function over blocks of arrays with some overlap

map_blocks(func, *args[, name, token, …])

Map a function across all blocks of a dask array.

matmul(x1, x2, /[, out, casting, order, …])

This docstring was copied from numpy.matmul.

max(a[, axis, keepdims, split_every, out])

Return the maximum of an array or maximum along an axis.

maximum(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.maximum.

mean(a[, axis, dtype, keepdims, …])

Compute the arithmetic mean along the specified axis.

median(a[, axis, keepdims, out])

Compute the median along the specified axis.

meshgrid(*xi, **kwargs)

Return coordinate matrices from coordinate vectors.

min(a[, axis, keepdims, split_every, out])

Return the minimum of an array or minimum along an axis.

minimum(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.minimum.

modf(x[, out1, out2], / [[, out, where, …])

This docstring was copied from numpy.modf.

moment(a, order[, axis, dtype, keepdims, …])

Calculate the nth centralized moment.

moveaxis(a, source, destination)

Move axes of an array to new positions.

nanargmax(x[, axis, split_every, out])

Return the indices of the maximum values along an axis, ignoring any NaNs.

nanargmin(x[, axis, split_every, out])

Return the indices of the minimum values along an axis, ignoring any NaNs.

nancumprod(x, axis[, dtype, out])

Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one.

nancumsum(x, axis[, dtype, out])

Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.

nanmax(a[, axis, keepdims, split_every, out])

Return the maximum of an array or maximum along an axis, ignoring any NaNs.

nanmean(a[, axis, dtype, keepdims, …])

Compute the arithmetic mean along the specified axis, ignoring NaNs.

nanmedian(a[, axis, keepdims, out])

Compute the median along the specified axis, while ignoring NaNs.

nanmin(a[, axis, keepdims, split_every, out])

Return minimum of an array or minimum along an axis, ignoring any NaNs.

nanprod(a[, axis, dtype, keepdims, …])

Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones.

nanstd(a[, axis, dtype, keepdims, ddof, …])

Compute the standard deviation along the specified axis, while ignoring NaNs.

nansum(a[, axis, dtype, keepdims, …])

Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.

nanvar(a[, axis, dtype, keepdims, ddof, …])

Compute the variance along the specified axis, while ignoring NaNs.

nan_to_num(*args, **kwargs)

Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords.

nextafter(x1, x2, /[, out, where, casting, …])

This docstring was copied from numpy.nextafter.

nonzero(a)

Return the indices of the elements that are non-zero.

notnull(values)

pandas.notnull for dask arrays

ones(*args, **kwargs)

Blocked variant of ones

ones_like(a[, dtype, order, chunks, name, shape])

Return an array of ones with the same shape and type as a given array.

outer(a, b)

Compute the outer product of two vectors.

pad(array, pad_width[, mode])

Pad an array.

percentile(a, q[, interpolation, method])

Approximate percentile of 1-D array

PerformanceWarning

A warning given when bad chunking may cause poor performance

piecewise(x, condlist, funclist, *args, **kw)

Evaluate a piecewise-defined function.

prod(a[, axis, dtype, keepdims, …])

Return the product of array elements over a given axis.

ptp(a[, axis])

Range of values (maximum - minimum) along an axis.

rad2deg(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.rad2deg.

radians(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.radians.

ravel(array)

Return a contiguous flattened array.

real(*args, **kwargs)

Return the real part of the complex argument.

rechunk(x[, chunks, threshold, block_size_limit])

Convert blocks in dask array x for new chunks.

reduction(x, chunk, aggregate[, axis, …])

General version of reductions

repeat(a, repeats[, axis])

Repeat elements of an array.

reshape(x, shape)

Reshape array to new shape

result_type(*arrays_and_dtypes)

This docstring was copied from numpy.result_type.

rint(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.rint.

roll(array, shift[, axis])

Roll array elements along a given axis.

rollaxis(a, axis[, start])

Roll the specified axis backwards, until it lies in a given position.

round(a[, decimals])

Round an array to the given number of decimals.

sign(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.sign.

signbit(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.signbit.

sin(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.sin.

sinh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.sinh.

sqrt(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.sqrt.

square(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.square.

squeeze(a[, axis])

Remove single-dimensional entries from the shape of an array.

stack(seq[, axis, allow_unknown_chunksizes])

Stack arrays along a new axis

std(a[, axis, dtype, keepdims, ddof, …])

Compute the standard deviation along the specified axis.

sum(a[, axis, dtype, keepdims, split_every, out])

Sum of array elements over a given axis.

take(a, indices[, axis])

Take elements from an array along an axis.

tan(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.tan.

tanh(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.tanh.

tensordot(lhs, rhs[, axes])

Compute tensor dot product along specified axes.

tile(A, reps)

Construct an array by repeating A the number of times given by reps.

topk(a, k[, axis, split_every])

Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest.

trace(a[, offset, axis1, axis2, dtype])

Return the sum along diagonals of the array.

transpose(a[, axes])

Permute the dimensions of an array.

tril(m[, k])

Lower triangle of an array with elements above the k-th diagonal zeroed.

triu(m[, k])

Upper triangle of an array with elements below the k-th diagonal zeroed.

trunc(x, /[, out, where, casting, order, …])

This docstring was copied from numpy.trunc.

unify_chunks(*args, **kwargs)

Unify chunks across a sequence of arrays

unique(ar[, return_index, return_inverse, …])

Find the unique elements of an array.

unravel_index(indices, shape[, order])

This docstring was copied from numpy.unravel_index.

var(a[, axis, dtype, keepdims, ddof, …])

Compute the variance along the specified axis.

vdot(a, b)

This docstring was copied from numpy.vdot.

vstack(tup[, allow_unknown_chunksizes])

Stack arrays in sequence vertically (row wise).

where(condition, [x, y])

This docstring was copied from numpy.where.

zeros(*args, **kwargs)

Blocked variant of zeros

zeros_like(a[, dtype, order, chunks, name, …])

Return an array of zeros with the same shape and type as a given array.
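
As a rough illustration of how a few of the functions listed above fit together, the sketch below builds a small chunked array, reduces it, maps a function over its blocks, and extracts the largest values. The array length and chunk size are arbitrary choices made for the example, not recommendations.

```python
import numpy as np
import dask.array as da

# A 1-D array holding 0..11, split into three chunks of four elements.
x = da.arange(12, chunks=4)

# Reductions are lazy; nothing runs until .compute() is called.
total = da.sum(x)

# map_blocks applies the function to every chunk independently.
doubled = da.map_blocks(lambda block: block * 2, x, dtype=x.dtype)

# topk returns the k largest values, argtopk their positions,
# both sorted from largest to smallest.
largest = da.topk(x, 3)
positions = da.argtopk(x, 3)

print(total.compute())
print(doubled.compute())
print(largest.compute(), positions.compute())
```

Each call returns a new lazy dask array, so the four results above could also be evaluated together with a single dask.compute call.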

Fast Fourier Transforms

fft.fft_wrap(fft_func[, kind, dtype])

Wrap 1D, 2D, and ND real and complex FFT functions

fft.fft(a[, n, axis])

Wrapping of numpy.fft.fft

fft.fft2(a[, s, axes])

Wrapping of numpy.fft.fft2

fft.fftn(a[, s, axes])

Wrapping of numpy.fft.fftn

fft.ifft(a[, n, axis])

Wrapping of numpy.fft.ifft

fft.ifft2(a[, s, axes])

Wrapping of numpy.fft.ifft2

fft.ifftn(a[, s, axes])

Wrapping of numpy.fft.ifftn

fft.rfft(a[, n, axis])

Wrapping of numpy.fft.rfft

fft.rfft2(a[, s, axes])

Wrapping of numpy.fft.rfft2

fft.rfftn(a[, s, axes])

Wrapping of numpy.fft.rfftn

fft.irfft(a[, n, axis])

Wrapping of numpy.fft.irfft

fft.irfft2(a[, s, axes])

Wrapping of numpy.fft.irfft2

fft.irfftn(a[, s, axes])

Wrapping of numpy.fft.irfftn

fft.hfft(a[, n, axis])

Wrapping of numpy.fft.hfft

fft.ihfft(a[, n, axis])

Wrapping of numpy.fft.ihfft

fft.fftfreq(n[, d, chunks])

Return the Discrete Fourier Transform sample frequencies.

fft.rfftfreq(n[, d, chunks])

Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft).

fft.fftshift(x[, axes])

Shift the zero-frequency component to the center of the spectrum.

fft.ifftshift(x[, axes])

The inverse of fftshift.
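
A minimal sketch of the FFT wrappers, assuming (as I understand the current behaviour) that the transformed axis must sit in a single chunk. The test signal, its length, and the chunking are assumptions chosen for illustration.

```python
import numpy as np
import dask.array as da

n = 64
# A sine wave completing exactly 4 cycles over the sampling window.
signal = np.sin(2 * np.pi * 4 * np.arange(n) / n)

# Keep the axis being transformed in one chunk.
x = da.from_array(signal, chunks=n)

spectrum = da.fft.fft(x)
freqs = da.fft.fftfreq(n, d=1.0 / n, chunks=n)

# Locate the dominant positive frequency.
magnitudes = abs(spectrum).compute()
peak = int(np.argmax(magnitudes[: n // 2]))
peak_freq = float(freqs.compute()[peak])
print(peak, peak_freq)
```

For multi-dimensional data, a common approach is to rechunk so that only the non-transformed axes are split before calling the transform.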

Linear Algebra

linalg.cholesky(a[, lower])

Returns the Cholesky decomposition, \(A = L L^*\) or \(A = U^* U\) of a Hermitian positive-definite matrix A.

linalg.inv(a)

Compute the inverse of a matrix with LU decomposition and forward / backward substitutions.

linalg.lstsq(a, b)

Return the least-squares solution to a linear matrix equation using QR decomposition.

linalg.lu(a)

Compute the lu decomposition of a matrix.

linalg.norm(x[, ord, axis, keepdims])

Matrix or vector norm.

linalg.qr(a)

Compute the qr factorization of a matrix.

linalg.solve(a, b[, sym_pos])

Solve the equation a x = b for x.

linalg.solve_triangular(a, b[, lower])

Solve the equation a x = b for x, assuming a is a triangular matrix.

linalg.svd(a)

Compute the singular value decomposition of a matrix.

linalg.svd_compressed(a, k[, n_power_iter, …])

Randomly compressed rank-k thin Singular Value Decomposition.

linalg.sfqr(data[, name])

Direct Short-and-Fat QR

linalg.tsqr(data[, compute_svd, …])

Direct Tall-and-Skinny QR algorithm
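
The following sketch shows the QR and SVD routines on a tall-and-skinny array, that is, one with many row chunks and a single column chunk. The matrix size, seed, and chunking are assumptions made for the example.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
a_np = rng.random((100, 4))

# Tall-and-skinny layout: four row chunks, one column chunk.
a = da.from_array(a_np, chunks=(25, 4))

q, r = da.linalg.qr(a)
u, s, v = da.linalg.svd(a)

# Check the factorizations against the original matrix and NumPy.
reconstructed = da.dot(q, r).compute()
singular_values = s.compute()

print(np.allclose(reconstructed, a_np))
print(np.allclose(singular_values, np.linalg.svd(a_np, compute_uv=False)))
```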

Masked Arrays

ma.average(a[, axis, weights, returned])

Return the weighted average of array over the given axis.

ma.filled(a[, fill_value])

Return input as an array with masked data replaced by a fill value.

ma.fix_invalid(a[, fill_value])

Return input with invalid data masked and replaced by a fill value.

ma.getdata(a)

Return the data of a masked array as an ndarray.

ma.getmaskarray(a)

Return the mask of a masked array, or full boolean array of False.

ma.masked_array(data[, mask, fill_value])

An array class with possibly masked values.

ma.masked_equal(a, value)

Mask an array where equal to a given value.

ma.masked_greater(x, value[, copy])

Mask an array where greater than a given value.

ma.masked_greater_equal(x, value[, copy])

Mask an array where greater than or equal to a given value.

ma.masked_inside(x, v1, v2)

Mask an array inside a given interval.

ma.masked_invalid(a)

Mask an array where invalid values occur (NaNs or infs).

ma.masked_less(x, value[, copy])

Mask an array where less than a given value.

ma.masked_less_equal(x, value[, copy])

Mask an array where less than or equal to a given value.

ma.masked_not_equal(x, value[, copy])

Mask an array where not equal to a given value.

ma.masked_outside(x, v1, v2)

Mask an array outside a given interval.

ma.masked_values(x, value[, rtol, atol, shrink])

Mask using floating point equality.

ma.masked_where(condition, a)

Mask an array where a condition is met.

ma.set_fill_value(a, fill_value)

Set the filling value of a, if a is a masked array.
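
As a small illustration of the masked-array helpers, the sketch below masks a sentinel value, inspects the mask, and fills the masked entries. The sentinel -999.0 and the sample data are hypothetical.

```python
import numpy as np
import dask.array as da

raw = np.array([1.0, -999.0, 3.0, -999.0, 5.0])
x = da.from_array(raw, chunks=2)

# Mask every element equal to the sentinel.
masked = da.ma.masked_equal(x, -999.0)

# Recover the boolean mask and a plain array with masked slots filled.
mask = da.ma.getmaskarray(masked)
filled = da.ma.filled(masked, 0.0)

print(mask.compute())
print(filled.compute())
```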

Random

random.beta(a, b[, size, chunks])

Draw samples from a Beta distribution.

random.binomial(n, p[, size, chunks])

Draw samples from a binomial distribution.

random.chisquare(df[, size, chunks])

Draw samples from a chi-square distribution.

random.choice(a[, size, replace, p, chunks])

Generates a random sample from a given 1-D array

random.exponential([scale, size, chunks])

Draw samples from an exponential distribution.

random.f(dfnum, dfden[, size, chunks])

Draw samples from an F distribution.

random.gamma(shape[, scale, size, chunks])

Draw samples from a Gamma distribution.

random.geometric(p[, size, chunks])

Draw samples from the geometric distribution.

random.gumbel([loc, scale, size, chunks])

Draw samples from a Gumbel distribution.

random.hypergeometric(ngood, nbad, nsample)

Draw samples from a Hypergeometric distribution.

random.laplace([loc, scale, size, chunks])

Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).

random.logistic([loc, scale, size, chunks])

Draw samples from a logistic distribution.

random.lognormal([mean, sigma, size, chunks])

Draw samples from a log-normal distribution.

random.logseries(p[, size, chunks])

Draw samples from a logarithmic series distribution.

random.negative_binomial(n, p[, size, chunks])

Draw samples from a negative binomial distribution.

random.noncentral_chisquare(df, nonc[, …])

Draw samples from a noncentral chi-square distribution.

random.noncentral_f(dfnum, dfden, nonc[, …])

Draw samples from the noncentral F distribution.

random.normal([loc, scale, size, chunks])

Draw random samples from a normal (Gaussian) distribution.

random.pareto(a[, size, chunks])

Draw samples from a Pareto II or Lomax distribution with specified shape.

random.permutation(x)

Randomly permute a sequence, or return a permuted range.

random.poisson([lam, size, chunks])

Draw samples from a Poisson distribution.

random.power(a[, size, chunks])

Draws samples in [0, 1] from a power distribution with positive exponent a - 1.

random.randint(low[, high, size, chunks, dtype])

Return random integers from low (inclusive) to high (exclusive).

random.random([size, chunks])

Return random floats in the half-open interval [0.0, 1.0).

random.random_sample([size, chunks])

Return random floats in the half-open interval [0.0, 1.0).

random.rayleigh([scale, size, chunks])

Draw samples from a Rayleigh distribution.

random.standard_cauchy([size, chunks])

Draw samples from a standard Cauchy distribution with mode = 0.

random.standard_exponential([size, chunks])

Draw samples from the standard exponential distribution.

random.standard_gamma(shape[, size, chunks])

Draw samples from a standard Gamma distribution.

random.standard_normal([size, chunks])

Draw samples from a standard Normal distribution (mean=0, stdev=1).

random.standard_t(df[, size, chunks])

Draw samples from a standard Student’s t distribution with df degrees of freedom.

random.triangular(left, mode, right[, size, …])

Draw samples from the triangular distribution over the interval [left, right].

random.uniform([low, high, size, chunks])

Draw samples from a uniform distribution.

random.vonmises(mu, kappa[, size, chunks])

Draw samples from a von Mises distribution.

random.wald(mean, scale[, size, chunks])

Draw samples from a Wald, or inverse Gaussian, distribution.

random.weibull(a[, size, chunks])

Draw samples from a Weibull distribution.

random.zipf(a[, size, chunks])

Draw samples from a Zipf distribution.
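
The random functions take the same distribution parameters as their NumPy counterparts plus a chunks argument that controls how the output is split. A brief sketch, with sizes and chunk shapes picked arbitrarily for the example:

```python
import dask.array as da

# 200 x 200 normal samples, generated in four 100 x 100 chunks.
samples = da.random.normal(10, 0.1, size=(200, 200), chunks=(100, 100))

# Fifty integers drawn from [0, 10), in two chunks of 25.
rolls = da.random.randint(0, 10, size=(50,), chunks=25)

sample_mean = float(samples.mean().compute())
roll_values = rolls.compute()

print(samples.chunks)
print(sample_mean)
print(roll_values)
```

Each chunk is generated independently when the array is computed, so large random arrays need not fit in memory at once.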

Stats

stats.ttest_ind(a, b[, axis, equal_var])

Calculate the T-test for the means of two independent samples of scores.

stats.ttest_1samp(a, popmean[, axis, nan_policy])

Calculate the T-test for the mean of ONE group of scores.

stats.ttest_rel(a, b[, axis, nan_policy])

Calculate the t-test on TWO RELATED samples of scores, a and b.

stats.chisquare(f_obs[, f_exp, ddof, axis])

Calculate a one-way chi-square test.

stats.power_divergence(f_obs[, f_exp, ddof, …])

Cressie-Read power divergence statistic and goodness of fit test.

stats.skew(a[, axis, bias, nan_policy])

Compute the sample skewness of a data set.

stats.skewtest(a[, axis, nan_policy])

Test whether the skew is different from the normal distribution.

stats.kurtosis(a[, axis, fisher, bias, …])

Compute the kurtosis (Fisher or Pearson) of a dataset.

stats.kurtosistest(a[, axis, nan_policy])

Test whether a dataset has normal kurtosis.

stats.normaltest(a[, axis, nan_policy])

Test whether a sample differs from a normal distribution.

stats.f_oneway(*args)

Perform one-way ANOVA.

stats.moment(a[, moment, axis, nan_policy])

Calculate the nth moment about the mean for a sample.

Image Support

image.imread(filename[, imread, preprocess])

Read a stack of images into a dask array

Slightly Overlapping Computations

overlap.overlap(x, depth, boundary)

Share boundaries between neighboring blocks

overlap.map_overlap(func, *args[, depth, …])

Map a function over blocks of arrays with some overlap

overlap.trim_internal(x, axes[, boundary])

Trim sides from each block

overlap.trim_overlap(x, depth[, boundary])

Trim sides from each block.
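
To show what sharing boundaries between neighboring blocks buys you, here is a sketch of a three-point moving average computed with map_overlap. The helper moving_average and the choice of a periodic boundary are assumptions for this example; with depth=1 each block receives one extra element from each neighbor, and those extra elements are trimmed from the result.

```python
import numpy as np
import dask.array as da

def moving_average(block):
    # Three-point average. np.roll wraps around inside the block, which
    # only corrupts the two outermost elements; with depth=1 those are
    # exactly the halo elements that map_overlap trims afterwards.
    return (np.roll(block, 1) + block + np.roll(block, -1)) / 3.0

data = np.arange(12, dtype=float)
x = da.from_array(data, chunks=4)

smoothed = da.map_overlap(
    moving_average, x, depth=1, boundary="periodic", dtype=float
)

result = smoothed.compute()
expected = moving_average(data)  # same filter on the whole array
print(np.allclose(result, expected))
```

Without the overlap, the values at chunk edges would be computed from the wrong neighbors, because each block would only see its own elements.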

Create and Store Arrays

from_array(x[, chunks, name, lock, asarray, …])

Create dask array from something that looks like an array

from_delayed(value, shape[, dtype, meta, name])

Create a dask array from a dask delayed value

from_npy_stack(dirname[, mmap_mode])

Load dask array from stack of npy files

from_zarr(url[, component, storage_options, …])

Load array from the zarr storage format

from_tiledb(uri[, attribute, chunks, …])

Load array from the TileDB storage format

store(sources, targets[, lock, regions, …])

Store dask arrays in array-like objects, overwrite data in target

to_hdf5(filename, *args, **kwargs)

Store arrays in HDF5 file

to_zarr(arr, url[, component, …])

Save array to the zarr storage format

to_npy_stack(dirname, x[, axis])

Write dask array to a stack of .npy files

to_tiledb(darray, uri[, compute, …])

Save array to the TileDB storage format

Generalized Ufuncs

apply_gufunc(func, signature, *args, **kwargs)

Apply a generalized ufunc or similar python function to arrays.

as_gufunc([signature])

Decorator for dask.array.gufunc.

gufunc(pyfunc, **kwargs)

Binds pyfunc into dask.array.apply_gufunc when called.
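
A minimal sketch of apply_gufunc with a signature that consumes one core dimension and returns two scalars per slice. The helper mean_and_std and the input data are hypothetical; the core dimension is kept in a single chunk, which is the simplest layout to reason about.

```python
import numpy as np
import dask.array as da

def mean_and_std(block):
    # Reduce over the last (core) dimension of whatever block arrives.
    return np.mean(block, axis=-1), np.std(block, axis=-1)

data = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
x = da.from_array(data, chunks=(1, 3, 4))  # core axis in one chunk

# "(i)->(),()" reads: one input with core dimension i, two scalar outputs.
mean, std = da.apply_gufunc(
    mean_and_std, "(i)->(),()", x, output_dtypes=(float, float)
)

print(mean.compute())
print(std.compute())
```

The loop dimensions (here the first two axes) are handled by dask, so the function itself only needs to work on the trailing core dimension.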

Internal functions

blockwise(func, out_ind, *args[, name, …])

Tensor operation: Generalized inner and outer products

normalize_chunks(chunks[, shape, limit, …])

Normalize chunks to tuple of tuples

Other functions

dask.array.from_array(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None)

Create dask array from something that looks like an array

Input must have a .shape, .ndim, .dtype and support numpy-style slicing.

Parameters
x: array_like
chunks: int, tuple

How to chunk the array. Must be one of the following forms:

  • A blocksize like 1000.

  • A blockshape like (1000, 1000).

  • Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).

  • A size in bytes, like “100 MiB” which will choose a uniform block-like shape

  • The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size

-1 or None as a blocksize indicates that the chunk should span the full size of the corresponding dimension.

name: str, optional

The key name to use for the array. Defaults to a hash of x. By default, the hash uses Python’s standard sha1. This behaviour can be changed by installing cityhash, xxhash or murmurhash; if one of them is installed, the tokenisation step can be substantially faster. Use name=False to generate a random name instead of hashing, which is fast.

Note

Because this name is used as the key in task graphs, you should ensure that it uniquely identifies the data contained within. If you’d like to provide a descriptive name that is still unique, combine the descriptive name with dask.base.tokenize() of the array_like. See Task Graphs for more.

lock: bool or Lock, optional

If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.

asarray: bool, optional

If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.

fancy: bool, optional

If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.

meta: Array-like, optional

The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.

Examples

>>> x = h5py.File('...')['/data/path']  
>>> a = da.from_array(x, chunks=(1000, 1000))  

If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.

>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)  

If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.

>>> a = da.from_array(x, chunks='auto')  
>>> a = da.from_array(x, chunks='100 MiB')  
>>> a = da.from_array(x)  
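
The accepted forms of chunks listed under Parameters can be compared side by side. The sketch below uses an in-memory NumPy array as a stand-in for an on-disk dataset; the shape and the "16 KiB" target are assumptions for illustration, and the chunks chosen for a byte-size string depend on the dtype and on dask's heuristics, so they are printed rather than asserted.

```python
import numpy as np
import dask.array as da

x = np.zeros((60, 100))

by_size = da.from_array(x, chunks=25)           # one blocksize for all axes
by_shape = da.from_array(x, chunks=(30, 50))    # blockshape
explicit = da.from_array(x, chunks=((10, 20, 30), (40, 60)))  # every block size
full_rows = da.from_array(x, chunks=(-1, 50))   # -1 spans the whole axis
by_bytes = da.from_array(x, chunks="16 KiB")    # target bytes per chunk

for arr in (by_size, by_shape, explicit, full_rows, by_bytes):
    print(arr.chunks)
```

Whatever form is passed in, the resulting .chunks attribute is always the explicit tuple-of-tuples form.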

If providing a name, ensure that it is unique

>>> import dask.base
>>> token = dask.base.tokenize(x)  
>>> a = da.from_array(x, name='myarray-' + token)  

dask.array.from_delayed(value, shape, dtype=None, meta=None, name=None)

Create a dask array from a dask delayed value

This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.

The dask array will consist of a single chunk.

Examples

>>> import numpy as np
>>> import dask
>>> import dask.array as da
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
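
Because each from_delayed call yields a single-chunk array, larger arrays are typically assembled from several of them. A sketch of that pattern, assuming a hypothetical load_block function that stands in for reading one file or record:

```python
import numpy as np
import dask
import dask.array as da

def load_block(i):
    # Hypothetical loader: returns four copies of the block index.
    return np.full(4, i, dtype=float)

# One lazy, single-chunk dask array per delayed call.
pieces = [
    da.from_delayed(dask.delayed(load_block)(i), shape=(4,), dtype=float)
    for i in range(3)
]

combined = da.concatenate(pieces)  # shape (12,), one chunk per piece
stacked = da.stack(pieces)         # shape (3, 4), new leading axis

print(combined.chunks)
print(stacked.shape)
print(combined.compute())
```

The shape and dtype must be supplied by the caller, since dask cannot inspect the delayed value without computing it.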

dask.array.store(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)

Store dask arrays in array-like objects, overwrite data in target

This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance, align the block size of the storage target with the block size of your array.

If your data fits in memory then you may prefer calling np.array(myarray) instead.

Parameters
sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds

These should support setitem syntax target[10:20] = ...

lock: boolean or threading.Lock, optional

Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.

regions: tuple of slices or list of tuples of slices

Each region tuple in regions should be such that target[region].shape == source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.

compute: boolean, optional

If True, compute immediately; otherwise return a dask.delayed.Delayed.

return_stored: boolean, optional

Optionally return the stored result (default False).

Examples

>>> x = ...  
>>> import h5py  
>>> f = h5py.File('myfile.hdf5', mode='a')  
>>> dset = f.create_dataset('/data', shape=x.shape,
...                                  chunks=x.chunks,
...                                  dtype='f8')  
>>> store(x, dset)  

Alternatively store many arrays at the same time

>>> store([x, y, z], [dset1, dset2, dset3])  
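
As a self-contained sketch that needs no HDF5 file, a plain NumPy array can serve as the target, since it supports setitem indexing. The array names and sizes below are illustrative assumptions; the second call shows the regions argument writing into a window of a larger target.

```python
import dask.array as da
import numpy as np

values = np.arange(24).reshape(4, 6)
source = da.from_array(values, chunks=(2, 3))

# Any object with NumPy-style __setitem__ can be a target; here an ndarray.
target = np.zeros((4, 6), dtype=values.dtype)
da.store(source, target)  # compute=True by default, so the write happens here

# regions: a tuple of slices selecting where in the target the source lands.
# target[region].shape must match source.shape, here (4, 6).
big = np.full((6, 8), -1, dtype=values.dtype)
da.store(source, big, regions=(slice(1, 5), slice(2, 8)))
```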
dask.array.coarsen(reduction, x, axes, trim_excess=False, **kwargs)

Coarsen array by applying reduction to fixed size neighborhoods

Parameters
reduction: function

Function like np.sum, np.mean, etc…

x: np.ndarray

Array to be coarsened

axes: dict

Mapping of axis to coarsening factor

Examples

>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])

Provide dictionary of scale per dimension

>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
       [12, 15]])

You must avoid excess elements explicitly

>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
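
The examples above use NumPy inputs. A minimal sketch of the same reduction on a dask array follows, assuming chunk sizes that are multiples of the coarsening factors. The reshape-based NumPy expression is only an illustration of what "fixed size neighborhoods" means, not a description of the internal implementation.

```python
import dask.array as da
import numpy as np

values = np.arange(24).reshape(4, 6)
d = da.from_array(values, chunks=(2, 3))  # chunk sizes divisible by the factors

# Reduce every 2x3 neighborhood to its minimum.
coarse = da.coarsen(np.min, d, {0: 2, 1: 3})
result = coarse.compute()

# Illustrative NumPy equivalent: split each axis into (blocks, factor)
# and reduce over the two "factor" axes.
expected = values.reshape(2, 2, 2, 3).min(axis=(1, 3))
```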
dask.array.stack(seq, axis=0, allow_unknown_chunksizes=False)

Stack arrays along a new axis

Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)

Parameters
seq: list of dask.arrays
axis: int

Dimension along which to align all of the arrays

allow_unknown_chunksizes: bool

Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.

See also

concatenate

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)

Result is a new dask Array

dask.array.concatenate(seq, axis=0, allow_unknown_chunksizes=False)

Concatenate arrays along an existing axis

Given a sequence of dask Arrays form a new dask Array by stacking them along an existing dimension (axis=0 by default)

Parameters
seq: list of dask.arrays
axis: int

Dimension along which to align all of the arrays

allow_unknown_chunksizes: bool

Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.

See also

stack

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)

Result is a new dask Array
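
The difference between stacking along a new axis and concatenating along an existing one also shows up in the chunk layout. The following sketch uses small illustrative inputs and compares both results with their NumPy counterparts.

```python
import dask.array as da
import numpy as np

blocks = [np.full((4, 4), i) for i in range(3)]
data = [da.from_array(b, chunks=(2, 2)) for b in blocks]

stacked = da.stack(data, axis=0)       # adds a new axis of length 3
joined = da.concatenate(data, axis=0)  # extends the existing axis 0 to 12

# The new axis gets one chunk per input array; the concatenated axis
# keeps the chunks of every input, placed one after another.
stacked_chunks = stacked.chunks
joined_chunks = joined.chunks
```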

dask.array.all(a, axis=None, keepdims=False, split_every=None, out=None)

Test whether all array elements along a given axis evaluate to True.

This docstring was copied from numpy.all.

Some inconsistencies with the Dask version may exist.

Parameters
a: array_like

Input array or object that can be converted to an array.

axis: None or int or tuple of ints, optional

Axis or axes along which a logical AND reduction is performed. The default (axis=None) is to perform a logical AND over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.

New in version 1.7.0.

If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.

out: ndarray, optional

Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if dtype(out) is float, the result will consist of 0.0’s and 1.0’s). See ufuncs-output-type for more details.

keepdims: bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the all method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

Returns
all: ndarray, bool

A new boolean or array is returned unless out is specified, in which case a reference to out is returned.

See also

ndarray.all

equivalent method

any

Test whether any element along a given axis evaluates to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.all([[True,False],[True,True]])  
False
>>> np.all([[True,False],[True,True]], axis=0)  
array([ True, False])
>>> np.all([-1, 4, 5])  
True
>>> np.all([1.0, np.nan])  
True
>>> o=np.array(False)  
>>> z=np.all([-1, 4, 5], out=o)  
>>> id(z), id(o), z  
(28293632, 28293632, array(True)) # may vary
dask.array.allclose(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)

Returns True if two arrays are element-wise equal within a tolerance.

This docstring was copied from numpy.allclose.

Some inconsistencies with the Dask version may exist.

The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.

NaNs are treated as equal if they are in the same place and if equal_nan=True. Infs are treated as equal if they are in the same place and of the same sign in both arrays.

Parameters
a, b: array_like

Input arrays to compare.

rtol: float

The relative tolerance parameter (see Notes).

atol: float

The absolute tolerance parameter (see Notes).

equal_nan: bool

Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.

New in version 1.10.0.

Returns
allclose: bool

Returns True if the two arrays are equal within the given tolerance; False otherwise.

See also

isclose, all, any, equal

Notes

If the following equation is element-wise True, then allclose returns True.

absolute(a - b) <= (atol + rtol * absolute(b))

The above equation is not symmetric in a and b, so that allclose(a, b) might be different from allclose(b, a) in some rare cases.

The comparison of a and b uses standard broadcasting, which means that a and b need not have the same shape in order for allclose(a, b) to evaluate to True. The same is true for equal but not array_equal.

Examples

>>> np.allclose([1e10,1e-7], [1.00001e10,1e-8])  
False
>>> np.allclose([1e10,1e-8], [1.00001e10,1e-9])  
True
>>> np.allclose([1e10,1e-8], [1.0001e10,1e-9])  
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan])  
False
>>> np.allclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)  
True
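
The asymmetry described in the Notes can be seen with a loose relative tolerance. This is a small sketch with illustrative values; the Dask function returns a lazy result, so it is computed and converted to a Python bool.

```python
import dask.array as da
import numpy as np

a = da.from_array(np.array([100.0]), chunks=1)
b = da.from_array(np.array([110.0]), chunks=1)

# |a - b| = 10 is compared against atol + rtol * |second argument|.
# With b second: 0.095 * 110 = 10.45, so the test passes.
forward = bool(da.allclose(a, b, rtol=0.095, atol=0.0).compute())
# With a second: 0.095 * 100 = 9.5, so the test fails.
backward = bool(da.allclose(b, a, rtol=0.095, atol=0.0).compute())
```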
dask.array.angle(x, deg=0)

Return the angle of the complex argument.

This docstring was copied from numpy.angle.

Some inconsistencies with the Dask version may exist.

Parameters
z: array_like (Not supported in Dask)

A complex number or sequence of complex numbers.

deg: bool, optional

Return angle in degrees if True, radians if False (default).

Returns
angle: ndarray or scalar

The counterclockwise angle from the positive real axis on the complex plane in the range (-pi, pi], with dtype as numpy.float64.

Changed in version 1.16.0: This function works on subclasses of ndarray like ma.array.

See also

arctan2
absolute

Examples

>>> np.angle([1.0, 1.0j, 1+1j])               # in radians  
array([ 0.        ,  1.57079633,  0.78539816]) # may vary
>>> np.angle(1+1j, deg=True)                  # in degrees  
45.0
dask.array.any(a, axis=None, keepdims=False, split_every=None, out=None)

Test whether any array element along a given axis evaluates to True.

This docstring was copied from numpy.any.

Some inconsistencies with the Dask version may exist.

Returns single boolean unless axis is not None

Parameters
a: array_like

Input array or object that can be converted to an array.

axis: None or int or tuple of ints, optional

Axis or axes along which a logical OR reduction is performed. The default (axis=None) is to perform a logical OR over all the dimensions of the input array. axis may be negative, in which case it counts from the last to the first axis.

New in version 1.7.0.

If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.

out: ndarray, optional

Alternate output array in which to place the result. It must have the same shape as the expected output and its type is preserved (e.g., if it is of type float, then it will remain so, returning 1.0 for True and 0.0 for False, regardless of the type of a). See ufuncs-output-type for more details.

keepdims: bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the any method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

Returns
any: bool or ndarray

A new boolean or ndarray is returned unless out is specified, in which case a reference to out is returned.

See also

ndarray.any

equivalent method

all

Test whether all elements along a given axis evaluate to True.

Notes

Not a Number (NaN), positive infinity and negative infinity evaluate to True because these are not equal to zero.

Examples

>>> np.any([[True, False], [True, True]])  
True
>>> np.any([[True, False], [False, False]], axis=0)  
array([ True, False])
>>> np.any([-1, 0, 5])  
True
>>> np.any(np.nan)  
True
>>> o=np.array(False)  
>>> z=np.any([-1, 4, 5], out=o)  
>>> z, o  
(array(True), array(True))
>>> # Check now that z is a reference to o
>>> z is o  
True
>>> id(z), id(o) # identity of z and o              
(191614240, 191614240)
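
The examples above are NumPy calls copied with the docstring. A minimal sketch of the Dask counterparts follows, with an illustrative boolean array; each call returns a lazy dask array that is evaluated with compute().

```python
import dask.array as da
import numpy as np

flags = da.from_array(np.array([[True, False], [True, True]]), chunks=(1, 2))

everything = flags.all()                        # AND over all elements, 0-d result
per_column = da.all(flags, axis=0)              # AND down each column
per_row = da.any(flags, axis=1, keepdims=True)  # OR across each row, shape (2, 1)

# NaN is not equal to zero, so it evaluates to True.
nan_truthy = da.any(da.from_array(np.array([np.nan]), chunks=1))

everything_value = bool(everything.compute())
per_column_value = per_column.compute()
per_row_value = per_row.compute()
nan_value = bool(nan_truthy.compute())
```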
dask.array.apply_along_axis(func1d, axis, arr, *args, dtype=None, shape=None, **kwargs)

Apply a function to 1-D slices along the given axis.

This docstring was copied from numpy.apply_along_axis.

Some inconsistencies with the Dask version may exist.

This is a blocked variant of numpy.apply_along_axis() implemented via dask.array.map_blocks().

Parameters
func1d: callable

Function to apply to 1-D slices of the array along the given axis

axis: int

Axis along which func1d will be applied

arr: dask array

Dask array to which func1d will be applied

args: any

Additional arguments to func1d.

dtype: str or dtype, optional

The dtype of the output of func1d.

shape: tuple, optional

The shape of the output of func1d.

kwargs: any

Additional keyword arguments for func1d.

Returns
out: ndarray (Ni…, Nj…, Nk…)

The output array. The shape of out is identical to the shape of arr, except along the axis dimension. This axis is removed, and replaced with new dimensions equal to the shape of the return value of func1d. So if func1d returns a scalar out will have one fewer dimensions than arr.

See also

apply_over_axes

Apply a function repeatedly over multiple axes.

Notes

If either dtype or shape is not provided, Dask attempts to determine it by calling func1d on a dummy array. This may produce incorrect values for dtype or shape, so we recommend providing them.

Execute func1d(a, *args) where func1d operates on 1-D arrays and a is a 1-D slice of arr along axis.

This is equivalent to (but faster than) the following use of ndindex and s_, which sets each of ii, jj, and kk to a tuple of indices:

Ni, Nk = arr.shape[:axis], arr.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        f = func1d(arr[ii + s_[:,] + kk])
        Nj = f.shape
        for jj in ndindex(Nj):
            out[ii + jj + kk] = f[jj]

Equivalently, eliminating the inner loop, this can be expressed as:

Ni, Nk = arr.shape[:axis], arr.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        out[ii + s_[...,] + kk] = func1d(arr[ii + s_[:,] + kk])

Examples

>>> def my_func(a):  
...     """Average first and last element of a 1-D array"""
...     return (a[0] + a[-1]) * 0.5
>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]])  
>>> np.apply_along_axis(my_func, 0, b)  
array([4., 5., 6.])
>>> np.apply_along_axis(my_func, 1, b)  
array([2.,  5.,  8.])

For a function that returns a 1D array, the number of dimensions in outarr is the same as arr.

>>> b = np.array([[8,1,7], [4,3,9], [5,2,6]])  
>>> np.apply_along_axis(sorted, 1, b)  
array([[1, 7, 8],
       [3, 4, 9],
       [2, 5, 6]])

For a function that returns a higher dimensional array, those dimensions are inserted in place of the axis dimension.

>>> b = np.array([[1,2,3], [4,5,6], [7,8,9]])  
>>> np.apply_along_axis(np.diag, -1, b)  
array([[[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]],
       [[4, 0, 0],
        [0, 5, 0],
        [0, 0, 6]],
       [[7, 0, 0],
        [0, 8, 0],
        [0, 0, 9]]])
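
Following the recommendation in the Notes to pass dtype and shape explicitly, a Dask call might look like the sketch below. edge_mean is an illustrative helper modeled on my_func above; since it returns a scalar, the per-slice output shape is the empty tuple.

```python
import dask.array as da
import numpy as np

values = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr = da.from_array(values, chunks=(2, 3))


def edge_mean(a):
    """Average the first and last element of a 1-D array."""
    return (a[0] + a[-1]) * 0.5


# func1d returns a float scalar, so dtype is float and shape is ().
# Stating both means Dask does not have to guess them from a dummy call.
rows = da.apply_along_axis(edge_mean, 1, arr, dtype=float, shape=())
result = rows.compute()
```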
dask.array.apply_over_axes(func, a, axes)

Apply a function repeatedly over multiple axes.

This docstring was copied from numpy.apply_over_axes.

Some inconsistencies with the Dask version may exist.

func is called as res = func(a, axis), where axis is the first element of axes. The result res of the function call must have either the same dimensions as a or one less dimension. If res has one less dimension than a, a dimension is inserted before axis. The call to func is then repeated for each axis in axes, with res as the first argument.

Parameters
func: function

This function must take two arguments, func(a, axis).

a: array_like

Input array.

axes: array_like

Axes over which func is applied; the elements must be integers.

Returns
apply_over_axis: ndarray

The output array. The number of dimensions is the same as a, but the shape can be different. This depends on whether func changes the shape of its output with respect to its input.

See also

apply_along_axis

Apply a function to 1-D slices of an array along the given axis.

Notes

This function is equivalent to tuple axis arguments to reorderable ufuncs with keepdims=True. Tuple axis arguments to ufuncs have been available since version 1.7.0.

Examples

>>> a = np.arange(24).reshape(2,3,4)  
>>> a  
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],
       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

Sum over axes 0 and 2. The result has same number of dimensions as the original array:

>>> np.apply_over_axes(np.sum, a, [0,2])  
array([[[ 60],
        [ 92],
        [124]]])

Tuple axis arguments to ufuncs are equivalent:

>>> np.sum(a, axis=(0,2), keepdims=True)  
array([[[ 60],
        [ 92],
        [124]]])
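
A sketch of the same reduction on a dask array, assuming a NumPy reduction such as np.sum is passed as func. The chunking is an arbitrary illustrative choice, and the result is compared with the NumPy function of the same name.

```python
import dask.array as da
import numpy as np

values = np.arange(24).reshape(2, 3, 4)
d = da.from_array(values, chunks=(1, 3, 2))

# Sum over axes 0 and 2; the reduced axes are kept with length one.
summed = da.apply_over_axes(np.sum, d, axes=[0, 2])
result = summed.compute()

expected = np.apply_over_axes(np.sum, values, [0, 2])
```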
dask.array.arange(*args, **kwargs)

Return evenly spaced values from start to stop with step size step.

The values lie in the half-open interval [start, stop), which includes start and excludes stop. This is essentially Python’s range function for dask arrays.

When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.

Parameters
start: int, optional

The starting value of the sequence. The default is 0.

stop: int

The end of the interval; this value is excluded from the interval.

step: int, optional

The spacing between the values. The default is 1 when not specified.

chunks: int

The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.

dtype: numpy.dtype

Output dtype. Omit to infer it from start, stop, step

Returns
samples: dask array
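
A short sketch of the chunking behavior and of the linspace alternative suggested for non-integer steps. The sizes are illustrative.

```python
import dask.array as da
import numpy as np

# Five values in blocks of three: the last block holds the remaining two.
ints = da.arange(0, 10, 2, chunks=3)
int_values = ints.compute()

# For fractional spacing, state the number of points instead of a step.
grid = da.linspace(0.0, 1.0, num=11, chunks=4)
grid_values = grid.compute()
```

With linspace both endpoints are included and the number of samples is fixed, so the length does not depend on floating point rounding of the step.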
dask.array.arccos(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arccos.

Some inconsistencies with the Dask version may exist.

Trigonometric inverse cosine, element-wise.

The inverse of cos so that, if y = cos(x), then x = arccos(y).

Parameters
x: array_like

x-coordinate on the unit circle. For real arguments, the domain is [-1, 1].

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
angle: ndarray

The angle of the ray intersecting the unit circle at the given x-coordinate in radians [0, pi]. This is a scalar if x is a scalar.

See also

cos, arctan, arcsin, emath.arccos

Notes

arccos is a multivalued function: for each x there are infinitely many numbers z such that cos(z) = x. The convention is to return the angle z whose real part lies in [0, pi].

For real-valued input data types, arccos always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arccos is a complex analytic function that has branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse cos is also known as acos or cos^-1.

References

M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/

Examples

We expect the arccos of 1 to be 0, and of -1 to be pi:

>>> np.arccos([1, -1])  
array([ 0.        ,  3.14159265])

Plot arccos:

>>> import matplotlib.pyplot as plt  
>>> x = np.linspace(-1, 1, num=100)  
>>> plt.plot(x, np.arccos(x))  
>>> plt.axis('tight')  
>>> plt.show()  
dask.array.arccosh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arccosh.

Some inconsistencies with the Dask version may exist.

Inverse hyperbolic cosine, element-wise.

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
arccosh: ndarray

Array of the same shape as x. This is a scalar if x is a scalar.

See also

cosh, arcsinh, sinh, arctanh, tanh

Notes

arccosh is a multivalued function: for each x there are infinitely many numbers z such that cosh(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi] and the real part in [0, inf].

For real-valued input data types, arccosh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arccosh is a complex analytical function that has a branch cut [-inf, 1] and is continuous from above on it.

References

[1] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/

[2] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arccosh

Examples

>>> np.arccosh([np.e, 10.0])  
array([ 1.65745445,  2.99322285])
>>> np.arccosh(1)  
0.0
dask.array.arcsin(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arcsin.

Some inconsistencies with the Dask version may exist.

Inverse sine, element-wise.

Parameters
x: array_like

y-coordinate on the unit circle.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
angle: ndarray

The inverse sine of each element in x, in radians and in the closed interval [-pi/2, pi/2]. This is a scalar if x is a scalar.

See also

sin, cos, arccos, tan, arctan, arctan2, emath.arcsin

Notes

arcsin is a multivalued function: for each x there are infinitely many numbers z such that sin(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].

For real-valued input data types, arcsin always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arcsin is a complex analytic function that has, by convention, the branch cuts [-inf, -1] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse sine is also known as asin or sin^-1.

References

Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79ff. http://www.math.sfu.ca/~cbm/aands/

Examples

>>> np.arcsin(1)     # pi/2  
1.5707963267948966
>>> np.arcsin(-1)    # -pi/2  
-1.5707963267948966
>>> np.arcsin(0)  
0.0
dask.array.arcsinh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arcsinh.

Some inconsistencies with the Dask version may exist.

Inverse hyperbolic sine element-wise.

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Array of the same shape as x. This is a scalar if x is a scalar.

Notes

arcsinh is a multivalued function: for each x there are infinitely many numbers z such that sinh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].

For real-valued input data types, arcsinh always returns real output. For each value that cannot be expressed as a real number or infinity, it returns nan and sets the invalid floating point error flag.

For complex-valued input, arcsinh is a complex analytical function that has branch cuts [1j, infj] and [-1j, -infj] and is continuous from the right on the former and from the left on the latter.

The inverse hyperbolic sine is also known as asinh or sinh^-1.

References

[1] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 86. http://www.math.sfu.ca/~cbm/aands/

[2] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arcsinh

Examples

>>> np.arcsinh(np.array([np.e, 10.0]))  
array([ 1.72538256,  2.99822295])
dask.array.arctan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arctan.

Some inconsistencies with the Dask version may exist.

Trigonometric inverse tangent, element-wise.

The inverse of tan, so that if y = tan(x) then x = arctan(y).

Parameters
x: array_like
out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Out has the same shape as x. Its real part is in [-pi/2, pi/2] (arctan(+/-inf) returns +/-pi/2). This is a scalar if x is a scalar.

See also

arctan2

The “four quadrant” arctan of the angle formed by (x, y) and the positive x-axis.

angle

Argument of complex values.

Notes

arctan is a multi-valued function: for each x there are infinitely many numbers z such that tan(z) = x. The convention is to return the angle z whose real part lies in [-pi/2, pi/2].

For real-valued input data types, arctan always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arctan is a complex analytic function that has [1j, infj] and [-1j, -infj] as branch cuts, and is continuous from the left on the former and from the right on the latter.

The inverse tangent is also known as atan or tan^-1.

References

Abramowitz, M. and Stegun, I. A., Handbook of Mathematical Functions, 10th printing, New York: Dover, 1964, pp. 79. http://www.math.sfu.ca/~cbm/aands/

Examples

We expect the arctan of 0 to be 0, and of 1 to be pi/4:

>>> np.arctan([0, 1])  
array([ 0.        ,  0.78539816])
>>> np.pi/4  
0.7853981633974483

Plot arctan:

>>> import matplotlib.pyplot as plt  
>>> x = np.linspace(-10, 10)  
>>> plt.plot(x, np.arctan(x))  
>>> plt.axis('tight')  
>>> plt.show()  
dask.array.arctan2(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arctan2.

Some inconsistencies with the Dask version may exist.

Element-wise arc tangent of x1/x2 choosing the quadrant correctly.

The quadrant (i.e., branch) is chosen so that arctan2(x1, x2) is the signed angle in radians between the ray ending at the origin and passing through the point (1,0), and the ray ending at the origin and passing through the point (x2, x1). (Note the role reversal: the “y-coordinate” is the first function parameter, the “x-coordinate” is the second.) By IEEE convention, this function is defined for x2 = +/-0 and for either or both of x1 and x2 = +/-inf (see Notes for specific values).

This function is not defined for complex-valued arguments; for the so-called argument of complex values, use angle.

Parameters
x1: array_like, real-valued

y-coordinates.

x2: array_like, real-valued

x-coordinates. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
angle: ndarray

Array of angles in radians, in the range [-pi, pi]. This is a scalar if both x1 and x2 are scalars.

See also

arctan, tan, angle

Notes

arctan2 is identical to the atan2 function of the underlying C library. The following special values are defined in the C standard: [1]

x1        x2        arctan2(x1,x2)
+/- 0     +0        +/- 0
+/- 0     -0        +/- pi
> 0       +/-inf    +0 / +pi
< 0       +/-inf    -0 / -pi
+/-inf    +inf      +/- (pi/4)
+/-inf    -inf      +/- (3*pi/4)

Note that +0 and -0 are distinct floating point numbers, as are +inf and -inf.

References

[1] ISO/IEC standard 9899:1999, “Programming language C.”

Examples

Consider four points in different quadrants:

>>> x = np.array([-1, +1, +1, -1])  
>>> y = np.array([-1, -1, +1, +1])  
>>> np.arctan2(y, x) * 180 / np.pi  
array([-135.,  -45.,   45.,  135.])

Note the order of the parameters. arctan2 is defined also when x2 = 0 and at several other special points, obtaining values in the range [-pi, pi]:

>>> np.arctan2([1., -1.], [0., 0.])  
array([ 1.57079633, -1.57079633])
>>> np.arctan2([0., 0., np.inf], [+0., -0., np.inf])  
array([ 0.        ,  3.14159265,  0.78539816])
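
To illustrate what "choosing the quadrant correctly" means, the sketch below compares arctan2(y, x) with arctan(y / x) on the same four points, using dask arrays. The single-argument form cannot tell opposite quadrants apart, because y / x has the same value in both.

```python
import dask.array as da
import numpy as np

x = da.from_array(np.array([-1.0, 1.0, 1.0, -1.0]), chunks=2)
y = da.from_array(np.array([-1.0, -1.0, 1.0, 1.0]), chunks=2)

# Two-argument form: uses the signs of both inputs to pick the quadrant.
four_quadrant = da.degrees(da.arctan2(y, x)).compute()

# One-argument form: only sees the ratio, so results stay within [-90, 90].
two_quadrant = da.degrees(da.arctan(y / x)).compute()
```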
dask.array.arctanh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.arctanh.

Some inconsistencies with the Dask version may exist.

Inverse hyperbolic tangent element-wise.

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Array of the same shape as x. This is a scalar if x is a scalar.

See also

emath.arctanh

Notes

arctanh is a multivalued function: for each x there are infinitely many numbers z such that tanh(z) = x. The convention is to return the z whose imaginary part lies in [-pi/2, pi/2].

For real-valued input data types, arctanh always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, arctanh is a complex analytical function that has branch cuts [-1, -inf] and [1, inf] and is continuous from above on the former and from below on the latter.

The inverse hyperbolic tangent is also known as atanh or tanh^-1.

References

[1] M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, p. 86. http://www.math.sfu.ca/~cbm/aands/

[2] Wikipedia, “Inverse hyperbolic function”, https://en.wikipedia.org/wiki/Arctanh

Examples

>>> np.arctanh([0, -0.5])  
array([ 0.        , -0.54930614])
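
The example above uses NumPy. As a hedged sketch of the Dask equivalent (assuming dask and numpy are installed; the input values and chunk size are illustrative), the call is lazy and only evaluates on compute(). Applying tanh to the result should recover the real inputs in (-1, 1).

```python
import numpy as np
import dask.array as da

# Wrap a small NumPy array in a Dask array with two chunks.
values = np.array([0.0, -0.5, 0.5, 0.9])
x = da.from_array(values, chunks=2)

# da.arctanh builds a task graph and returns a Dask array.
y = da.arctanh(x)

# compute() evaluates the graph and returns a NumPy array.
result = y.compute()

# tanh inverts arctanh for real inputs inside (-1, 1).
roundtrip = np.tanh(result)
```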
dask.array.argmax(x, axis=None, split_every=None, out=None)

Return the indices of the maximum values along an axis.

This docstring was copied from numpy.amax.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Input data.

axisNone or int or tuple of ints, optional

Axis or axes along which to operate. By default, flattened input is used.

New in version 1.7.0.

If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.

outndarray, optional

Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.

keepdimsbool, optional (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, keepdims is not passed through to the amax method of sub-classes of ndarray; any non-default value is. If the sub-class’s method does not implement keepdims, an exception is raised.

initialscalar, optional (Not supported in Dask)

The minimum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.

New in version 1.15.0.

wherearray_like of bool, optional (Not supported in Dask)

Elements to compare for the maximum. See ~numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
amaxndarray or scalar

Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

See also

amin

The minimum value of an array along a given axis, propagating any NaNs.

nanmax

The maximum value of an array along a given axis, ignoring any NaNs.

maximum

Element-wise maximum of two arrays, propagating any NaNs.

fmax

Element-wise maximum of two arrays, ignoring any NaNs.

argmax

Return the indices of the maximum values.

nanmin, minimum, fmin

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.

Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).

Examples

>>> a = np.arange(4).reshape((2,2))  
>>> a  
array([[0, 1],
       [2, 3]])
>>> np.amax(a)           # Maximum of the flattened array  
3
>>> np.amax(a, axis=0)   # Maxima along the first axis  
array([2, 3])
>>> np.amax(a, axis=1)   # Maxima along the second axis  
array([1, 3])
>>> np.amax(a, where=[False, True], initial=-1, axis=0)  
array([-1,  3])
>>> b = np.arange(5, dtype=float)  
>>> b[2] = np.nan  
>>> np.amax(b)  
nan
>>> np.amax(b, where=~np.isnan(b), initial=-1)  
4.0
>>> np.nanmax(b)  
4.0

You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:

>>> np.max([[-50], [10]], axis=-1, initial=0)  
array([ 0, 10])

Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.

>>> np.max([5], initial=6)  
6
>>> max([5], default=6)  
5
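
The parameter text above is copied from numpy.amax, which describes the maximum value itself, whereas dask.array.argmax returns the position of that value. A minimal sketch contrasting the two, assuming dask and numpy are installed (the array contents and chunking are illustrative):

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[0, 5, 2],
                            [7, 1, 3]]), chunks=(1, 3))

# Largest value in each row.
max_values = da.max(a, axis=1).compute()

# Column index at which each row attains that value.
max_indices = da.argmax(a, axis=1).compute()

# With axis=None the index refers to the flattened array.
flat_index = int(da.argmax(a).compute())
```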
dask.array.argmin(x, axis=None, split_every=None, out=None)

Return the indices of the minimum values along an axis.

This docstring was copied from numpy.amin.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Input data.

axisNone or int or tuple of ints, optional

Axis or axes along which to operate. By default, flattened input is used.

New in version 1.7.0.

If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.

outndarray, optional

Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.

keepdimsbool, optional (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, keepdims is not passed through to the amin method of sub-classes of ndarray; any non-default value is. If the sub-class’s method does not implement keepdims, an exception is raised.

initialscalar, optional (Not supported in Dask)

The maximum value of an output element. Must be present to allow computation on empty slice. See ~numpy.ufunc.reduce for details.

New in version 1.15.0.

wherearray_like of bool, optional (Not supported in Dask)

Elements to compare for the minimum. See ~numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
aminndarray or scalar

Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

See also

amax

The maximum value of an array along a given axis, propagating any NaNs.

nanmin

The minimum value of an array along a given axis, ignoring any NaNs.

minimum

Element-wise minimum of two arrays, propagating any NaNs.

fmin

Element-wise minimum of two arrays, ignoring any NaNs.

argmin

Return the indices of the minimum values.

nanmax, maximum, fmax

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.

Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).

Examples

>>> a = np.arange(4).reshape((2,2))  
>>> a  
array([[0, 1],
       [2, 3]])
>>> np.amin(a)           # Minimum of the flattened array  
0
>>> np.amin(a, axis=0)   # Minima along the first axis  
array([0, 1])
>>> np.amin(a, axis=1)   # Minima along the second axis  
array([0, 2])
>>> np.amin(a, where=[False, True], initial=10, axis=0)  
array([10,  1])
>>> b = np.arange(5, dtype=float)  
>>> b[2] = np.nan  
>>> np.amin(b)  
nan
>>> np.amin(b, where=~np.isnan(b), initial=10)  
0.0
>>> np.nanmin(b)  
0.0
>>> np.min([[-50], [10]], axis=-1, initial=0)  
array([-50,   0])

Notice that the initial value is used as one of the elements for which the minimum is determined, unlike the default argument of Python’s min function, which is only used for empty iterables.

>>> np.min([6], initial=5)  
5
>>> min([6], default=5)  
6
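
As with argmax, the text above comes from numpy.amin, while dask.array.argmin yields an index. One way to see the relationship is to use the computed index to look the minimum up again. This is a sketch under the assumption that dask and numpy are installed; the data values are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([4.0, 2.5, 9.0, 1.5, 7.0]), chunks=2)

# Position of the smallest element across all chunks.
idx = int(da.argmin(x).compute())

# Indexing with that position recovers the minimum value.
value_at_idx = float(x[idx].compute())
minimum = float(da.min(x).compute())
```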
dask.array.argtopk(a, k, axis=-1, split_every=None)

Extract the indices of the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the indices of the -k smallest elements instead, and return them sorted from smallest to largest.

This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.

Parameters
a: Array

Data being sorted

k: int
axis: int, optional
split_every: int >=2, optional

See topk(). The performance considerations for topk also apply here.

Returns
Selection of np.intp indices of a with size abs(k) along the given axis.

Examples

>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.argtopk(2).compute()
array([3, 0])
>>> d.argtopk(-2).compute()
array([1, 2])
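
The example above is one-dimensional. A hedged sketch of the two-dimensional case follows, assuming dask and numpy are installed; the data and chunk layout are illustrative. It gathers the values at the returned indices with NumPy's take_along_axis and compares them with topk.

```python
import numpy as np
import dask.array as da

x = np.array([[5, 1, 3, 6],
              [2, 8, 4, 0]])
d = da.from_array(x, chunks=(1, 2))

# Indices of the two largest entries in each row, largest first.
idx = da.argtopk(d, 2, axis=1).compute()

# Gather the values at those indices from the original NumPy array.
top_values = np.take_along_axis(x, idx, axis=1)

# topk returns the values themselves, in the same order.
topk_values = da.topk(d, 2, axis=1).compute()
```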
dask.array.argwhere(a)

Find the indices of array elements that are non-zero, grouped by element.

This docstring was copied from numpy.argwhere.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input data.

Returns
index_array(N, a.ndim) ndarray

Indices of elements that are non-zero. Indices are grouped by element. This array will have shape (N, a.ndim) where N is the number of non-zero items.

See also

where, nonzero

Notes

np.argwhere(a) is almost the same as np.transpose(np.nonzero(a)), but produces a result of the correct shape for a 0D array.

The output of argwhere is not suitable for indexing arrays. For this purpose use nonzero(a) instead.

Examples

>>> x = np.arange(6).reshape(2,3)  
>>> x  
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argwhere(x>1)  
array([[0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])
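
One difference from NumPy worth illustrating: the number of matching elements depends on the data, so the Dask result typically has an unknown (NaN) length along its first dimension until it is computed. The sketch below assumes dask and numpy are installed; the names are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(6).reshape(2, 3), chunks=(1, 3))

# Lazy result; the row count is not known before computation.
idx = da.argwhere(x > 1)
rows_unknown = bool(np.isnan(idx.shape[0]))

# After compute() the result is an ordinary NumPy array of indices.
result = idx.compute()
```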
dask.array.around(x, decimals=0)

Evenly round to the given number of decimals.

This docstring was copied from numpy.around.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Input data.

decimalsint, optional

Number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.

outndarray, optional (Not supported in Dask)

Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary. See ufuncs-output-type for more details.

Returns
rounded_arrayndarray

An array of the same type as a, containing the rounded values. Unless out was specified, a new array is created. A reference to the result is returned.

The real and imaginary parts of complex numbers are rounded separately. The result of rounding a float is a float.

See also

ndarray.round

equivalent method

ceil, fix, floor, rint, trunc

Notes

For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.

np.around uses a fast but sometimes inexact algorithm to round floating-point datatypes. For positive decimals it is equivalent to np.true_divide(np.rint(a * 10**decimals), 10**decimals), which has error due to the inexact representation of decimal fractions in the IEEE floating point standard [1] and errors introduced when scaling by powers of ten. For instance, note the extra “1” in the following:

>>> np.round(56294995342131.5, 3)  
56294995342131.51

If your goal is to print such values with a fixed number of decimals, it is preferable to use numpy’s float printing routines to limit the number of printed decimals:

>>> np.format_float_positional(56294995342131.5, precision=3)  
'56294995342131.5'

The float printing routines use an accurate but much more computationally demanding algorithm to compute the number of digits after the decimal point.

Alternatively, Python’s builtin round function uses a more accurate but slower algorithm for 64-bit floating point values:

>>> round(56294995342131.5, 3)  
56294995342131.5
>>> np.round(16.055, 2), round(16.055, 2)  # equals 16.0549999999999997  
(16.06, 16.05)

References

[1] “Lecture Notes on the Status of IEEE 754”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF

[2] “How Futile are Mindless Assessments of Roundoff in Floating-Point Computation?”, William Kahan, https://people.eecs.berkeley.edu/~wkahan/Mindless.pdf

Examples

>>> np.around([0.37, 1.64])  
array([0.,  2.])
>>> np.around([0.37, 1.64], decimals=1)  
array([0.4,  1.6])
>>> np.around([.5, 1.5, 2.5, 3.5, 4.5]) # rounds to nearest even value  
array([0.,  2.,  2.,  4.,  4.])
>>> np.around([1,2,3,11], decimals=1) # ndarray of ints is returned  
array([ 1,  2,  3, 11])
>>> np.around([1,2,3,11], decimals=-1)  
array([ 0,  0,  0, 10])
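
A minimal sketch of the round-half-to-even behaviour on a Dask array, assuming dask and numpy are installed (the input values and chunk size are illustrative):

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([0.5, 1.5, 2.5, 3.5, 0.37, 1.64]), chunks=3)

# Halfway cases go to the nearest even integer.
rounded = da.around(x).compute()

# decimals=1 keeps one digit after the decimal point.
one_decimal = da.around(x, decimals=1).compute()
```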
dask.array.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

This docstring was copied from numpy.array.

Some inconsistencies with the Dask version may exist.

Create an array.

Parameters
objectarray_like

An array, any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.

dtypedata-type, optional

The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.

copybool, optional

If true (default), then the object is copied. Otherwise, a copy will only be made if __array__ returns a copy, if obj is a nested sequence, or if a copy is needed to satisfy any of the other requirements (dtype, order, etc.).

order{‘K’, ‘A’, ‘C’, ‘F’}, optional

Specify the memory layout of the array. If object is not an array, the newly created array will be in C order (row major) unless ‘F’ is specified, in which case it will be in Fortran order (column major). If object is an array the following holds.

order    no copy      copy=True
‘K’      unchanged    F & C order preserved, otherwise most similar order
‘A’      unchanged    F order if input is F and not C, otherwise C order
‘C’      C order      C order
‘F’      F order      F order

When copy=False and a copy is made for other reasons, the result is the same as if copy=True, with some exceptions for A, see the Notes section. The default order is ‘K’.

subokbool, optional

If True, then sub-classes will be passed-through, otherwise the returned array will be forced to be a base-class array (default).

ndminint, optional

Specifies the minimum number of dimensions that the resulting array should have. Ones will be pre-pended to the shape as needed to meet this requirement.

Returns
outndarray

An array object satisfying the specified requirements.

See also

empty_like

Return an empty array with shape and type of input.

ones_like

Return an array of ones with shape and type of input.

zeros_like

Return an array of zeros with shape and type of input.

full_like

Return a new array with shape of input filled with value.

empty

Return a new uninitialized array.

ones

Return a new array setting values to one.

zeros

Return a new array setting values to zero.

full

Return a new array of given shape filled with value.

Notes

When order is ‘A’ and object is an array in neither ‘C’ nor ‘F’ order, and a copy is forced by a change in dtype, then the order of the result is not necessarily ‘C’ as expected. This is likely a bug.

Examples

>>> np.array([1, 2, 3])  
array([1, 2, 3])

Upcasting:

>>> np.array([1, 2, 3.0])  
array([ 1.,  2.,  3.])

More than one dimension:

>>> np.array([[1, 2], [3, 4]])  
array([[1, 2],
       [3, 4]])

Minimum dimensions 2:

>>> np.array([1, 2, 3], ndmin=2)  
array([[1, 2, 3]])

Type provided:

>>> np.array([1, 2, 3], dtype=complex)  
array([ 1.+0.j,  2.+0.j,  3.+0.j])

Data-type consisting of more than one element:

>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])  
>>> x['a']  
array([1, 3])

Creating an array from sub-classes:

>>> np.array(np.mat('1 2; 3 4'))  
array([[1, 2],
       [3, 4]])
>>> np.array(np.mat('1 2; 3 4'), subok=True)  
matrix([[1, 2],
        [3, 4]])
dask.array.asanyarray(a)

Convert the input to a dask array.

Subclasses of np.ndarray will be passed through as chunks unchanged.

Parameters
aarray-like

Input data, in any form that can be converted to a dask array.

Returns
outdask array

Dask array interpretation of a.

Examples

>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asanyarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asanyarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
dask.array.asarray(a, **kwargs)

Convert the input to a dask array.

Parameters
aarray-like

Input data, in any form that can be converted to a dask array.

Returns
outdask array

Dask array interpretation of a.

Examples

>>> import dask.array as da
>>> import numpy as np
>>> x = np.arange(3)
>>> da.asarray(x)
dask.array<array, shape=(3,), dtype=int64, chunksize=(3,), chunktype=numpy.ndarray>
>>> y = [[1, 2, 3], [4, 5, 6]]
>>> da.asarray(y)
dask.array<array, shape=(2, 3), dtype=int64, chunksize=(2, 3), chunktype=numpy.ndarray>
dask.array.atleast_1d(*arys)

Convert inputs to arrays with at least one dimension.

This docstring was copied from numpy.atleast_1d.

Some inconsistencies with the Dask version may exist.

Scalar inputs are converted to 1-dimensional arrays, whilst higher-dimensional inputs are preserved.

Parameters
arys1, arys2, …array_like

One or more input arrays.

Returns
retndarray

An array, or list of arrays, each with a.ndim >= 1. Copies are made only if necessary.

Examples

>>> np.atleast_1d(1.0)  
array([1.])
>>> x = np.arange(9.0).reshape(3,3)  
>>> np.atleast_1d(x)  
array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])
>>> np.atleast_1d(x) is x  
True
>>> np.atleast_1d(1, [3, 4])  
[array([1]), array([3, 4])]
dask.array.atleast_2d(*arys)

View inputs as arrays with at least two dimensions.

This docstring was copied from numpy.atleast_2d.

Some inconsistencies with the Dask version may exist.

Parameters
arys1, arys2, …array_like

One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have two or more dimensions are preserved.

Returns
res, res2, …ndarray

An array, or list of arrays, each with a.ndim >= 2. Copies are avoided where possible, and views with two or more dimensions are returned.

Examples

>>> np.atleast_2d(3.0)  
array([[3.]])
>>> x = np.arange(3.0)  
>>> np.atleast_2d(x)  
array([[0., 1., 2.]])
>>> np.atleast_2d(x).base is x  
True
>>> np.atleast_2d(1, [1, 2], [[1, 2]])  
[array([[1]]), array([[1, 2]]), array([[1, 2]])]
dask.array.atleast_3d(*arys)

View inputs as arrays with at least three dimensions.

This docstring was copied from numpy.atleast_3d.

Some inconsistencies with the Dask version may exist.

Parameters
arys1, arys2, …array_like

One or more array-like sequences. Non-array inputs are converted to arrays. Arrays that already have three or more dimensions are preserved.

Returns
res1, res2, …ndarray

An array, or list of arrays, each with a.ndim >= 3. Copies are avoided where possible, and views with three or more dimensions are returned. For example, a 1-D array of shape (N,) becomes a view of shape (1, N, 1), and a 2-D array of shape (M, N) becomes a view of shape (M, N, 1).

Examples

>>> np.atleast_3d(3.0)  
array([[[3.]]])
>>> x = np.arange(3.0)  
>>> np.atleast_3d(x).shape  
(1, 3, 1)
>>> x = np.arange(12.0).reshape(4,3)  
>>> np.atleast_3d(x).shape  
(4, 3, 1)
>>> np.atleast_3d(x).base is x.base  # x is a reshape, so not base itself  
True
>>> for arr in np.atleast_3d([1, 2], [[1, 2]], [[[1, 2]]]):  
...     print(arr, arr.shape) 
...
[[[1]
  [2]]] (1, 2, 1)
[[[1]
  [2]]] (1, 2, 1)
[[[1 2]]] (1, 1, 2)
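
The atleast_1d, atleast_2d and atleast_3d examples above use NumPy. As a sketch of how the same shape rules apply to a Dask array (assuming dask and numpy are installed; the input is illustrative), a 1-D array of length 3 is expanded as follows:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(3.0), chunks=2)

# Shapes are known without computing anything.
shape_1d = da.atleast_1d(x).shape   # already 1-D, unchanged
shape_2d = da.atleast_2d(x).shape   # new leading axis
shape_3d = da.atleast_3d(x).shape   # new leading and trailing axes

# The data itself is unchanged by the added axes.
values_3d = da.atleast_3d(x).compute().ravel()
```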
dask.array.average(a, axis=None, weights=None, returned=False)

Compute the weighted average along the specified axis.

This docstring was copied from numpy.average.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Array containing data to be averaged. If a is not an array, a conversion is attempted.

axisNone or int or tuple of ints, optional

Axis or axes along which to average a. The default, axis=None, will average over all of the elements of the input array. If axis is negative it counts from the last to the first axis.

New in version 1.7.0.

If axis is a tuple of ints, averaging is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.

weightsarray_like, optional

An array of weights associated with the values in a. Each value in a contributes to the average according to its associated weight. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one. The 1-D calculation is:

avg = sum(a * weights) / sum(weights)

The only constraint on weights is that sum(weights) must not be 0.

returnedbool, optional

Default is False. If True, the tuple (average, sum_of_weights) is returned, otherwise only the average is returned. If weights=None, sum_of_weights is equivalent to the number of elements over which the average is taken.

Returns
retval, [sum_of_weights]array_type or double

Return the average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. sum_of_weights is of the same type as retval. The result dtype follows a general pattern. If weights is None, the result dtype will be that of a, or float64 if a is integral. Otherwise, if weights is not None and a is non-integral, the result type will be the type of lowest precision capable of representing values of both a and weights. If a happens to be integral, the previous rules still apply but the result dtype will be at least float64.

Raises
ZeroDivisionError

When all weights along axis are zero. See numpy.ma.average for a version robust to this type of error.

TypeError

When the length of 1D weights is not the same as the shape of a along axis.

See also

mean
ma.average

average for masked arrays – useful if your data contains “missing” values

numpy.result_type

Returns the type that results from applying the numpy type promotion rules to the arguments.

Examples

>>> data = np.arange(1, 5)  
>>> data  
array([1, 2, 3, 4])
>>> np.average(data)  
2.5
>>> np.average(np.arange(1, 11), weights=np.arange(10, 0, -1))  
4.0
>>> data = np.arange(6).reshape((3,2))  
>>> data  
array([[0, 1],
       [2, 3],
       [4, 5]])
>>> np.average(data, axis=1, weights=[1./4, 3./4])  
array([0.75, 2.75, 4.75])
>>> np.average(data, weights=[1./4, 3./4])  
Traceback (most recent call last):
    ...
TypeError: Axis must be specified when shapes of a and weights differ.
>>> a = np.ones(5, dtype=np.float128)  
>>> w = np.ones(5, dtype=np.complex64)  
>>> avg = np.average(a, weights=w)  
>>> print(avg.dtype)  
complex256
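
The 1-D formula avg = sum(a * weights) / sum(weights) can be checked against dask.array.average directly. The following is a sketch assuming dask and numpy are installed; the data, weights, and chunking are illustrative.

```python
import numpy as np
import dask.array as da

data_np = np.arange(6).reshape(3, 2)
data = da.from_array(data_np, chunks=(2, 2))
weights = [0.25, 0.75]

# Weighted average of each row.
avg = da.average(data, axis=1, weights=weights).compute()

# The same quantity written out from the formula.
w = np.array(weights)
manual = (data_np * w).sum(axis=1) / w.sum()

# returned=True also yields the sum of the weights for each row.
avg_lazy, wsum_lazy = da.average(data, axis=1, weights=weights, returned=True)
wsum = wsum_lazy.compute()
```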
dask.array.bincount(x, weights=None, minlength=0)

This docstring was copied from numpy.bincount.

Some inconsistencies with the Dask version may exist.

Count number of occurrences of each value in array of non-negative ints.

The number of bins (of size 1) is one larger than the largest value in x. If minlength is specified, there will be at least this number of bins in the output array (though it will be longer if necessary, depending on the contents of x). Each bin gives the number of occurrences of its index value in x. If weights is specified the input array is weighted by it, i.e. if a value n is found at position i, out[n] += weight[i] instead of out[n] += 1.

Parameters
xarray_like, 1 dimension, nonnegative ints

Input array.

weightsarray_like, optional

Weights, array of the same shape as x.

minlengthint, optional

A minimum number of bins for the output array.

New in version 1.6.0.

Returns
outndarray of ints

The result of binning the input array. The length of out is equal to np.amax(x)+1.

Raises
ValueError

If the input is not 1-dimensional, or contains elements with negative values, or if minlength is negative.

TypeError

If the type of the input is float or complex.

Examples

>>> np.bincount(np.arange(5))  
array([1, 1, 1, 1, 1])
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))  
array([1, 3, 1, 1, 0, 0, 0, 1])
>>> x = np.array([0, 1, 1, 3, 2, 1, 7, 23])  
>>> np.bincount(x).size == np.amax(x)+1  
True

The input array needs to be of integer dtype, otherwise a TypeError is raised:

>>> np.bincount(np.arange(5, dtype=float))  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: array cannot be safely cast to required type

A possible use of bincount is to perform sums over variable-size chunks of an array, using the weights keyword.

>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights  
>>> x = np.array([0, 1, 1, 2, 2, 2])  
>>> np.bincount(x,  weights=w)  
array([ 0.3,  0.7,  1.1])
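
A hedged sketch of the Dask version follows, assuming dask and numpy are installed. Two choices here are precautions rather than documented requirements: minlength is passed explicitly so that the output length is known up front, and the weights are given as a Dask array with the same chunking as x.

```python
import numpy as np
import dask.array as da

# Plain counts: the largest value is 7, so 8 bins are needed.
x = da.from_array(np.array([0, 1, 1, 3, 2, 1, 7]), chunks=3)
counts = da.bincount(x, minlength=8).compute()

# Weighted sums per bin; x2 and w share the same chunk layout.
x2 = da.from_array(np.array([0, 1, 1, 2, 2, 2]), chunks=3)
w = da.from_array(np.array([0.3, 0.5, 0.2, 0.7, 1.0, -0.6]), chunks=3)
weighted = da.bincount(x2, weights=w, minlength=3).compute()
```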
dask.array.bitwise_and(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.bitwise_and.

Some inconsistencies with the Dask version may exist.

Compute the bit-wise AND of two arrays element-wise.

Computes the bit-wise AND of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator &.

Parameters
x1, x2array_like

Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Result. This is a scalar if both x1 and x2 are scalars.

See also

logical_and
bitwise_or
bitwise_xor
binary_repr

Return the binary representation of the input number as a string.

Examples

The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise AND of 13 and 17 is therefore 00000001, or 1:

>>> np.bitwise_and(13, 17)  
1
>>> np.bitwise_and(14, 13)  
12
>>> np.binary_repr(12)  
'1100'
>>> np.bitwise_and([14,3], 13)  
array([12,  1])
>>> np.bitwise_and([11,7], [4,25])  
array([0, 1])
>>> np.bitwise_and(np.array([2,5,255]), np.array([3,14,16]))  
array([ 2,  4, 16])
>>> np.bitwise_and([True, True], [False, True])  
array([False,  True])
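
As a sketch of the same operation on Dask arrays (assuming dask and numpy are installed; the operands are illustrative), the ufunc and the & operator should give the same result:

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([13, 14, 11, 255]), chunks=2)
b = da.from_array(np.array([17, 13, 4, 16]), chunks=2)

# Element-wise AND through the ufunc and through the operator.
via_ufunc = da.bitwise_and(a, b).compute()
via_operator = (a & b).compute()
```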
dask.array.bitwise_not(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.invert.

Some inconsistencies with the Dask version may exist.

Compute bit-wise inversion, or bit-wise NOT, element-wise.

Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.

For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [1]. An N-bit two’s-complement system can represent every integer in the range \(-2^{N-1}\) to \(+2^{N-1}-1\).

Parameters
xarray_like

Only integer and boolean types are handled.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Result. This is a scalar if x is a scalar.

See also

bitwise_and, bitwise_or, bitwise_xor
logical_not
binary_repr

Return the binary representation of the input number as a string.

Notes

bitwise_not is an alias for invert:

>>> np.bitwise_not is np.invert  
True

References

[1] Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two's_complement

Examples

We’ve seen that 13 is represented by 00001101. The invert or bit-wise NOT of 13 is then:

>>> x = np.invert(np.array(13, dtype=np.uint8))  
>>> x  
242
>>> np.binary_repr(x, width=8)  
'11110010'

The result depends on the bit-width:

>>> x = np.invert(np.array(13, dtype=np.uint16))  
>>> x  
65522
>>> np.binary_repr(x, width=16)  
'1111111111110010'

When using signed integer types the result is the two’s complement of the result for the unsigned type:

>>> np.invert(np.array([13], dtype=np.int8))  
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8)  
'11110010'

Booleans are accepted as well:

>>> np.invert(np.array([True, False]))  
array([False,  True])
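
The dependence on bit-width and signedness can also be checked on Dask arrays. This is a minimal sketch assuming dask and numpy are installed, with illustrative inputs: for uint8 the inversion equals 255 - x, and for signed integers it equals -x - 1.

```python
import numpy as np
import dask.array as da

# Unsigned 8-bit: every bit is flipped, so the result is 255 - x.
u = da.from_array(np.array([13, 0, 255], dtype=np.uint8), chunks=2)
inv_unsigned = da.bitwise_not(u).compute()

# Signed 8-bit: two's complement gives -x - 1.
s = da.from_array(np.array([13, -14, 0], dtype=np.int8), chunks=2)
inv_signed = da.bitwise_not(s).compute()
```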
dask.array.bitwise_or(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.bitwise_or.

Some inconsistencies with the Dask version may exist.

Compute the bit-wise OR of two arrays element-wise.

Computes the bit-wise OR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator |.

Parameters
x1, x2array_like

Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Result. This is a scalar if both x1 and x2 are scalars.

See also

logical_or
bitwise_and
bitwise_xor
binary_repr

Return the binary representation of the input number as a string.

Examples

The number 13 has the binary representation 00001101. Likewise, 16 is represented by 00010000. The bit-wise OR of 13 and 16 is then 00011101, or 29:

>>> np.bitwise_or(13, 16)  
29
>>> np.binary_repr(29)  
'11101'
>>> np.bitwise_or(32, 2)  
34
>>> np.bitwise_or([33, 4], 1)  
array([33,  5])
>>> np.bitwise_or([33, 4], [1, 2])  
array([33,  6])
>>> np.bitwise_or(np.array([2, 5, 255]), np.array([4, 4, 4]))  
array([  6,   5, 255])
>>> np.array([2, 5, 255]) | np.array([4, 4, 4])  
array([  6,   5, 255])
>>> np.bitwise_or(np.array([2, 5, 255, 2147483647], dtype=np.int32),  
...               np.array([4, 4, 4, 2147483647], dtype=np.int32))
array([         6,          5,        255, 2147483647])
>>> np.bitwise_or([True, True], [False, True])  
array([ True,  True])
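
A common use of bit-wise OR is setting a flag bit. The sketch below applies that to a Dask array with a scalar mask, assuming dask and numpy are installed; the flag values and the mask are hypothetical.

```python
import numpy as np
import dask.array as da

flags = da.from_array(np.array([0b0001, 0b0100, 0b0101]), chunks=2)

# Set bit 1 (value 2) in every element; the scalar is broadcast.
with_bit_set = da.bitwise_or(flags, 0b0010).compute()

# The | operator is equivalent.
via_operator = (flags | 0b0010).compute()
```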
dask.array.bitwise_xor(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.bitwise_xor.

Some inconsistencies with the Dask version may exist.

Compute the bit-wise XOR of two arrays element-wise.

Computes the bit-wise XOR of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ^.

Parameters
x1, x2: array_like

Only integer and boolean types are handled. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Result. This is a scalar if both x1 and x2 are scalars.

See also

logical_xor
bitwise_and
bitwise_or
binary_repr

Return the binary representation of the input number as a string.

Examples

The number 13 is represented by 00001101. Likewise, 17 is represented by 00010001. The bit-wise XOR of 13 and 17 is therefore 00011100, or 28:

>>> np.bitwise_xor(13, 17)  
28
>>> np.binary_repr(28)  
'11100'
>>> np.bitwise_xor(31, 5)  
26
>>> np.bitwise_xor([31,3], 5)  
array([26,  6])
>>> np.bitwise_xor([31,3], [5,6])  
array([26,  5])
>>> np.bitwise_xor([True, True], [False, True])  
array([ True, False])
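
The bit-level description above can be checked directly. The following is a minimal NumPy-only sketch (the variable names are illustrative, not part of any API) that prints the operands as 8-bit strings and rebuilds the XOR result bit by bit:

```python
import numpy as np

a, b = 13, 17
result = np.bitwise_xor(a, b)

# Render the operands and the result as fixed-width bit strings.
bits_a = np.binary_repr(a, width=8)
bits_b = np.binary_repr(b, width=8)
bits_r = np.binary_repr(int(result), width=8)
print(bits_a, bits_b, bits_r)

# XOR sets a bit exactly where the two operands differ.
manual = "".join("1" if p != q else "0" for p, q in zip(bits_a, bits_b))
print(manual == bits_r)
```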
dask.array.block(arrays, allow_unknown_chunksizes=False)

Assemble an nd-array from nested lists of blocks.

Blocks in the innermost lists are concatenated along the last dimension (-1), then these are concatenated along the second-last dimension (-2), and so on until the outermost list is reached.

Blocks can be of any dimension, but will not be broadcasted using the normal rules. Instead, leading axes of size 1 are inserted, to make block.ndim the same for all blocks. This is primarily useful for working with scalars, and means that code like block([v, 1]) is valid, where v.ndim == 1.

When the nested list is two levels deep, this allows block matrices to be constructed from their components.

Parameters
arrays: nested list of array_like or scalars (but not tuples)

If passed a single ndarray or scalar (a nested list of depth 0), this is returned unmodified (and not copied).

Element shapes must match along the appropriate axes (without broadcasting), but leading 1s will be prepended to the shape as necessary to make the dimensions match.

allow_unknown_chunksizes: bool

Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.

Returns
block_array: ndarray

The array assembled from the given blocks.

The dimensionality of the output is equal to the greatest of:

  • the dimensionality of all the inputs

  • the depth to which the input list is nested

Raises
ValueError
  • If list depths are mismatched - for instance, [[a, b], c] is illegal, and should be spelt [[a, b], [c]]

  • If lists are empty - for instance, [[a, b], []]

See also

concatenate

Join a sequence of arrays together.

stack

Stack arrays in sequence along a new dimension.

hstack

Stack arrays in sequence horizontally (column wise).

vstack

Stack arrays in sequence vertically (row wise).

dstack

Stack arrays in sequence depth wise (along third dimension).

vsplit

Split array into a list of multiple sub-arrays vertically.

Notes

When called with only scalars, block is equivalent to an ndarray call. So block([[1, 2], [3, 4]]) is equivalent to array([[1, 2], [3, 4]]).

This function does not enforce that the blocks lie on a fixed grid. block([[a, b], [c, d]]) is not restricted to arrays of the form:

AAAbb
AAAbb
cccDD

But is also allowed to produce, for some a, b, c, d:

AAAbb
AAAbb
cDDDD

Since concatenation happens along the last axis first, block is not capable of producing the following directly:

AAAbb
cccbb
cccDD

Matlab’s “square bracket stacking”, [A, B, ...; p, q, ...], is equivalent to block([[A, B, ...], [p, q, ...]]).
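
As a rough illustration of the two-level case described above, the sketch below assembles a block matrix from four small dask arrays and compares it with the NumPy equivalent. The array names and shapes are made up for the example:

```python
import numpy as np
import dask.array as da

# Four blocks laid out as [[A, b], [c, D]]; shapes must agree along shared edges.
A = da.ones((2, 3), chunks=(2, 3))
b = da.zeros((2, 2), chunks=(2, 2))
c = da.zeros((1, 3), chunks=(1, 3))
D = da.full((1, 2), 2.0, chunks=(1, 2))

M = da.block([[A, b], [c, D]])

# The same layout built eagerly with NumPy, for comparison.
expected = np.block([
    [np.ones((2, 3)), np.zeros((2, 2))],
    [np.zeros((1, 3)), np.full((1, 2), 2.0)],
])
print(M.shape)
print(np.array_equal(M.compute(), expected))
```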

dask.array.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)

Tensor operation: Generalized inner and outer products

A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.

Parameters
func: callable

Function to apply to individual tuples of blocks

out_ind: iterable

Block pattern of the output, something like ‘ijk’ or (1, 2, 3)

*args: sequence of Array, index pairs

Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)

**kwargs: dict

Extra keyword arguments to pass to function

dtype: np.dtype

Datatype of resulting array.

concatenate: bool, keyword only

If true concatenate arrays along dummy indices, else provide lists

adjust_chunks: dict

Dictionary mapping index to function to be applied to chunk sizes

new_axes: dict, keyword only

New indexes and their dimension lengths

Examples

2D embarrassingly parallel operation from two arrays, x, and y.

>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8')  # z = x + y  

Outer product multiplying x by y, two 1-d vectors

>>> z = blockwise(operator.mul, 'ij', x, 'i', y, 'j', dtype='f8')  

z = x.T

>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)  

The transpose case above is illustrative because it performs the same transposition both on each in-memory block, by calling np.transpose, and on the order of the blocks themselves, by switching the order of the index ij -> ji.

We can compose these same patterns with more variables and more complex in-memory functions

z = X + Y.T

>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')  

Any index, like i, that is missing from the output index is interpreted as a contraction (note that this differs from the Einstein convention: repeated indices do not imply contraction). In the case of a contraction, the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead, pass concatenate=True.

Inner product multiplying x by y, two 1-d vectors

>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8')  

Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.

>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype)  

New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).

>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype)  

If the applied function changes the size of each chunk, you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.

>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)  

Include literals by indexing with None

>>> y = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)
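
The snippets above assume that x and y already exist. A self-contained sketch of two of the patterns follows, using small made-up arrays. Note one deliberate difference: the outer product here passes np.outer as the per-block function, so that each pair of 1-d blocks yields a 2-d block.

```python
import numpy as np
import dask.array as da

# Outer product: output index 'ij' is built from x's 'i' and y's 'j'.
x = da.arange(4, chunks=2).astype("f8")
y = da.arange(6, chunks=3).astype("f8")
outer = da.blockwise(np.outer, "ij", x, "i", y, "j", dtype="f8")

# Transpose: each block is transposed and the block order is swapped (ij -> ji).
m_np = np.arange(12).reshape(3, 4)
m = da.from_array(m_np, chunks=(2, 2))
mt = da.blockwise(np.transpose, "ji", m, "ij", dtype=m.dtype)

print(outer.shape, mt.shape)
print(np.array_equal(mt.compute(), m_np.T))
```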
dask.array.broadcast_arrays(*args, **kwargs)

Broadcast any number of arrays against each other.

This docstring was copied from numpy.broadcast_arrays.

Some inconsistencies with the Dask version may exist.

Parameters
*args: array_likes

The arrays to broadcast.

subok: bool, optional

If True, then sub-classes will be passed-through, otherwise the returned arrays will be forced to be a base-class array (default).

Returns
broadcasted: list of arrays

These arrays are views on the original arrays. They are typically not contiguous. Furthermore, more than one element of a broadcasted array may refer to a single memory location. If you need to write to the arrays, make copies first. While you can set the writable flag True, writing to a single output value may end up changing more than one location in the output array.

Deprecated since version 1.17: The output is currently marked so that if written to, a deprecation warning will be emitted. A future version will set the writable flag False so writing to it will raise an error.

Examples

>>> x = np.array([[1,2,3]])  
>>> y = np.array([[4],[5]])  
>>> np.broadcast_arrays(x, y)  
[array([[1, 2, 3],
       [1, 2, 3]]), array([[4, 4, 4],
       [5, 5, 5]])]

Here is a useful idiom for getting contiguous copies instead of non-contiguous views.

>>> [np.array(a) for a in np.broadcast_arrays(x, y)]  
[array([[1, 2, 3],
       [1, 2, 3]]), array([[4, 4, 4],
       [5, 5, 5]])]
dask.array.broadcast_to(x, shape, chunks=None)

Broadcast an array to a new shape.

Parameters
x: array_like

The array to broadcast.

shape: tuple

The shape of the desired array.

chunks: tuple, optional

If provided, then the result will use these chunks instead of the same chunks as the source array. Setting chunks explicitly as part of broadcast_to is more efficient than rechunking afterwards. Chunks are only allowed to differ from the original shape along dimensions that are new on the result or have size 1 in the input array.

Returns
broadcast: dask array
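
A short sketch of how the chunks argument might be used, assuming a 1-d source array broadcast along a new leading axis (the names and sizes are illustrative):

```python
import numpy as np
import dask.array as da

x = da.arange(4, chunks=2)

# Default: the new leading dimension lands in a single chunk.
y = da.broadcast_to(x, (3, 4))

# Explicit chunks: split the new axis into chunks of size 1, while the
# existing axis keeps the chunking of the source array.
z = da.broadcast_to(x, (3, 4), chunks=(1, 2))

print(y.chunks)
print(z.chunks)
```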
dask.array.coarsen(reduction, x, axes, trim_excess=False, **kwargs)

Coarsen array by applying reduction to fixed size neighborhoods

Parameters
reduction: function

Function like np.sum, np.mean, etc…

x: np.ndarray

Array to be coarsened

axes: dict

Mapping of axis to coarsening factor

Examples

>>> x = np.array([1, 2, 3, 4, 5, 6])
>>> coarsen(np.sum, x, {0: 2})
array([ 3,  7, 11])
>>> coarsen(np.max, x, {0: 3})
array([3, 6])

Provide dictionary of scale per dimension

>>> x = np.arange(24).reshape((4, 6))
>>> x
array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
>>> coarsen(np.min, x, {0: 2, 1: 3})
array([[ 0,  3],
       [12, 15]])

You must avoid excess elements explicitly

>>> x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
>>> coarsen(np.min, x, {0: 3}, trim_excess=True)
array([1, 4])
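
To make "fixed size neighborhoods" concrete, here is a NumPy-only sketch of the 1-d case. coarsen_1d is a hypothetical helper written for this illustration, not the function documented above. It always drops trailing elements that do not fill a whole neighborhood, which corresponds to trim_excess=True.

```python
import numpy as np

def coarsen_1d(reduction, x, factor):
    """Reduce consecutive groups of `factor` elements (illustrative helper)."""
    n = (len(x) // factor) * factor      # drop elements that do not fill a group
    groups = x[:n].reshape(-1, factor)   # one row per neighborhood
    return reduction(groups, axis=1)

summed = coarsen_1d(np.sum, np.array([1, 2, 3, 4, 5, 6]), 2)
trimmed = coarsen_1d(np.min, np.array([1, 2, 3, 4, 5, 6, 7, 8]), 3)
print(summed, trimmed)
```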
dask.array.ceil(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.ceil.

Some inconsistencies with the Dask version may exist.

Return the ceiling of the input, element-wise.

The ceil of the scalar x is the smallest integer i, such that i >= x. It is often denoted as \(\lceil x \rceil\).

Parameters
x: array_like

Input data.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray or scalar

The ceiling of each element in x, with float dtype. This is a scalar if x is a scalar.

See also

floor, trunc, rint

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])  
>>> np.ceil(a)  
array([-1., -1., -0.,  1.,  2.,  2.,  2.])
dask.array.choose(a, choices)

Construct an array from an index array and a set of arrays to choose from.

This docstring was copied from numpy.choose.

Some inconsistencies with the Dask version may exist.

First of all, if confused or uncertain, definitely look at the Examples - in its full generality, this function is less simple than it might seem from the following code description (below ndi = numpy.lib.index_tricks):

np.choose(a,c) == np.array([c[a[I]][I] for I in ndi.ndindex(a.shape)]).

But this omits some subtleties. Here is a fully general summary:

Given an “index” array (a) of integers and a sequence of n arrays (choices), a and each choice array are first broadcast, as necessary, to arrays of a common shape; calling these Ba and Bchoices[i], i = 0,…,n-1 we have that, necessarily, Ba.shape == Bchoices[i].shape for each i. Then, a new array with shape Ba.shape is created as follows:

  • if mode=raise (the default), then, first of all, each element of a (and thus Ba) must be in the range [0, n-1]; now, suppose that i (in that range) is the value at the (j0, j1, …, jm) position in Ba - then the value at the same position in the new array is the value in Bchoices[i] at that same position;

  • if mode=wrap, values in a (and thus Ba) may be any (signed) integer; modular arithmetic is used to map integers outside the range [0, n-1] back into that range; and then the new array is constructed as above;

  • if mode=clip, values in a (and thus Ba) may be any (signed) integer; negative integers are mapped to 0; values greater than n-1 are mapped to n-1; and then the new array is constructed as above.
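
The index mapping performed by the wrap and clip modes can be reproduced by hand. The sketch below is NumPy-only (the parameter list further down marks mode as not supported in Dask) and applies np.mod or np.clip to the index array before calling choose with the default mode:

```python
import numpy as np

choices = [[0, 1, 2, 3], [10, 11, 12, 13],
           [20, 21, 22, 23], [30, 31, 32, 33]]
n = len(choices)
a = np.array([2, 4, 1, -1])   # 4 and -1 are outside [0, n-1]

# wrap: modular arithmetic maps 4 -> 0 and -1 -> 3
wrapped = np.choose(np.mod(a, n), choices)

# clip: negative values go to 0, values above n-1 go to n-1
clipped = np.choose(np.clip(a, 0, n - 1), choices)

print(wrapped, clipped)
```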

Parameters
a: int array

This array must contain integers in [0, n-1], where n is the number of choices, unless mode=wrap or mode=clip, in which cases any integers are permissible.

choices: sequence of arrays

Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.

out: array, optional (Not supported in Dask)

If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode=’raise’; use other modes for better performance.

mode: {‘raise’ (default), ‘wrap’, ‘clip’}, optional (Not supported in Dask)

Specifies how indices outside [0, n-1] will be treated:

  • ‘raise’ : an exception is raised

  • ‘wrap’ : value becomes value mod n

  • ‘clip’ : values < 0 are mapped to 0, values > n-1 are mapped to n-1

Returns
merged_array: array

The merged result.

Raises
ValueError: shape mismatch

If a and each choice array are not all broadcastable to the same shape.

See also

ndarray.choose

equivalent method

numpy.take_along_axis

Preferable if choices is an array

Notes

To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.

Examples

>>> choices = [[0, 1, 2, 3], [10, 11, 12, 13],  
...   [20, 21, 22, 23], [30, 31, 32, 33]]
>>> np.choose([2, 3, 1, 0], choices  
... # the first element of the result will be the first element of the
... # third (2+1) "array" in choices, namely, 20; the second element
... # will be the second element of the fourth (3+1) choice array, i.e.,
... # 31, etc.
... )
array([20, 31, 12,  3])
>>> np.choose([2, 4, 1, 0], choices, mode='clip') # 4 goes to 3 (4-1)  
array([20, 31, 12,  3])
>>> # because there are 4 choice arrays
>>> np.choose([2, 4, 1, 0], choices, mode='wrap') # 4 goes to (4 mod 4)  
array([20,  1, 12,  3])
>>> # i.e., 0

A couple examples illustrating how choose broadcasts:

>>> a = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]  
>>> choices = [-10, 10]  
>>> np.choose(a, choices)  
array([[ 10, -10,  10],
       [-10,  10, -10],
       [ 10, -10,  10]])
>>> # With thanks to Anne Archibald
>>> a = np.array([0, 1]).reshape((2,1,1))  
>>> c1 = np.array([1, 2, 3]).reshape((1,3,1))  
>>> c2 = np.array([-1, -2, -3, -4, -5]).reshape((1,1,5))  
>>> np.choose(a, (c1, c2)) # result is 2x3x5, res[0,:,:]=c1, res[1,:,:]=c2  
array([[[ 1,  1,  1,  1,  1],
        [ 2,  2,  2,  2,  2],
        [ 3,  3,  3,  3,  3]],
       [[-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5],
        [-1, -2, -3, -4, -5]]])
dask.array.clip(*args, **kwargs)

Clip (limit) the values in an array.

This docstring was copied from numpy.clip.

Some inconsistencies with the Dask version may exist.

Given an interval, values outside the interval are clipped to the interval edges. For example, if an interval of [0, 1] is specified, values smaller than 0 become 0, and values larger than 1 become 1.

Equivalent to but faster than np.maximum(a_min, np.minimum(a, a_max)). No check is performed to ensure a_min < a_max.
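
The stated equivalence is easy to confirm on a small array. This sketch uses NumPy directly, with arbitrary bounds chosen for illustration:

```python
import numpy as np

a = np.arange(10)
a_min, a_max = 1, 8

clipped = np.clip(a, a_min, a_max)
manual = np.maximum(a_min, np.minimum(a, a_max))

print(clipped)
print(np.array_equal(clipped, manual))
```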

Parameters
a: array_like (Not supported in Dask)

Array containing elements to clip.

a_min: scalar or array_like or None (Not supported in Dask)

Minimum value. If None, clipping is not performed on lower interval edge. Not more than one of a_min and a_max may be None.

a_max: scalar or array_like or None (Not supported in Dask)

Maximum value. If None, clipping is not performed on upper interval edge. Not more than one of a_min and a_max may be None. If a_min or a_max are array_like, then the three arrays will be broadcasted to match their shapes.

out: ndarray, optional (Not supported in Dask)

The results will be placed in this array. It may be the input array for in-place clipping. out must be of the right shape to hold the output. Its type is preserved.

**kwargs

For other keyword-only arguments, see the ufunc docs.

New in version 1.17.0.

Returns
clipped_array: ndarray

An array with the elements of a, but where values < a_min are replaced with a_min, and those > a_max with a_max.

See also

ufuncs-output-type

Examples

>>> a = np.arange(10)  
>>> np.clip(a, 1, 8)  
array([1, 1, 2, 3, 4, 5, 6, 7, 8, 8])
>>> a  
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, 3, 6, out=a)  
array([3, 3, 3, 3, 4, 5, 6, 6, 6, 6])
>>> a = np.arange(10)  
>>> a  
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.clip(a, [3, 4, 1, 1, 1, 4, 4, 4, 4, 4], 8)  
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])
dask.array.compress(condition, a, axis=None)

Return selected slices of an array along given axis.

This docstring was copied from numpy.compress.

Some inconsistencies with the Dask version may exist.

When working along a given axis, a slice along that axis is returned in output for each index where condition evaluates to True. When working on a 1-D array, compress is equivalent to extract.

Parameters
condition: 1-D array of bools

Array that selects which entries to return. If len(condition) is less than the size of a along the given axis, then output is truncated to the length of the condition array.

a: array_like

Array from which to extract a part.

axis: int, optional

Axis along which to take slices. If None (default), work on the flattened array.

out: ndarray, optional (Not supported in Dask)

Output array. Its type is preserved and it must be of the right shape to hold the output.

Returns
compressed_array: ndarray

A copy of a without the slices along axis for which condition is false.

See also

take, choose, diag, diagonal, select
ndarray.compress

Equivalent method in ndarray

np.extract

Equivalent method when working on 1-D arrays

ufuncs-output-type

Examples

>>> a = np.array([[1, 2], [3, 4], [5, 6]])  
>>> a  
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.compress([0, 1], a, axis=0)  
array([[3, 4]])
>>> np.compress([False, True, True], a, axis=0)  
array([[3, 4],
       [5, 6]])
>>> np.compress([False, True], a, axis=1)  
array([[2],
       [4],
       [6]])

Working on the flattened array does not return slices along an axis but selects elements.

>>> np.compress([False, True], a)  
array([2])
dask.array.concatenate(seq, axis=0, allow_unknown_chunksizes=False)

Concatenate arrays along an existing axis

Given a sequence of dask Arrays, form a new dask Array by stacking them along an existing dimension (axis=0 by default).

Parameters
seq: list of dask.arrays
axis: int

Dimension along which to align all of the arrays

allow_unknown_chunksizes: bool

Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.

See also

stack

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]
>>> x = da.concatenate(data, axis=0)
>>> x.shape
(12, 4)
>>> da.concatenate(data, axis=1).shape
(4, 12)

Result is a new dask Array

dask.array.conj(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.conjugate.

Some inconsistencies with the Dask version may exist.

Return the complex conjugate, element-wise.

The complex conjugate of a complex number is obtained by changing the sign of its imaginary part.

Parameters
x: array_like

Input value.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The complex conjugate of x, with same dtype as y. This is a scalar if x is a scalar.

Notes

conj is an alias for conjugate:

>>> np.conj is np.conjugate  
True

Examples

>>> np.conjugate(1+2j)  
(1-2j)
>>> x = np.eye(2) + 1j * np.eye(2)  
>>> np.conjugate(x)  
array([[ 1.-1.j,  0.-0.j],
       [ 0.-0.j,  1.-1.j]])
dask.array.copysign(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.copysign.

Some inconsistencies with the Dask version may exist.

Change the sign of x1 to that of x2, element-wise.

If x2 is a scalar, its sign will be copied to all elements of x1.

Parameters
x1: array_like

Values to change the sign of.

x2: array_like

The sign of x2 is copied to x1. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

The values of x1 with the sign of x2. This is a scalar if both x1 and x2 are scalars.

Examples

>>> np.copysign(1.3, -1)  
-1.3
>>> 1/np.copysign(0, 1)  
inf
>>> 1/np.copysign(0, -1)  
-inf
>>> np.copysign([-1, 0, 1], -1.1)  
array([-1., -0., -1.])
>>> np.copysign([-1, 0, 1], np.arange(3)-1)  
array([-1.,  0.,  1.])
dask.array.corrcoef(x, y=None, rowvar=1)

Return Pearson product-moment correlation coefficients.

This docstring was copied from numpy.corrcoef.

Some inconsistencies with the Dask version may exist.

Please refer to the documentation for cov for more detail. The relationship between the correlation coefficient matrix, R, and the covariance matrix, C, is

\[R_{ij} = \frac{ C_{ij} } { \sqrt{ C_{ii} * C_{jj} } }\]

The values of R are between -1 and 1, inclusive.
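
The relationship between R and C can be verified numerically. The following NumPy sketch (with made-up data) normalizes a covariance matrix by the square roots of its diagonal and compares the result with np.corrcoef:

```python
import numpy as np

# Three variables (rows), four observations each (columns).
x = np.array([[1.0, 2.0, 4.0, 7.0],
              [2.0, 1.0, 5.0, 3.0],
              [0.5, 0.1, -1.0, -2.0]])

C = np.cov(x)
d = np.sqrt(np.diag(C))     # sqrt(C_ii) for each variable
R = C / np.outer(d, d)      # R_ij = C_ij / sqrt(C_ii * C_jj)

print(np.allclose(R, np.corrcoef(x)))
```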

Parameters
x: array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of x represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y: array_like, optional

An additional set of variables and observations. y has the same shape as x.

rowvar: bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

bias: _NoValue, optional (Not supported in Dask)

Has no effect, do not use.

Deprecated since version 1.10.0.

ddof: _NoValue, optional (Not supported in Dask)

Has no effect, do not use.

Deprecated since version 1.10.0.

Returns
R: ndarray

The correlation coefficient matrix of the variables.

See also

cov

Covariance matrix

Notes

Due to floating point rounding, the resulting array may not be Hermitian, the diagonal elements may not be 1, and the elements may not satisfy the inequality abs(a) <= 1. The real and imaginary parts are clipped to the interval [-1, 1] in an attempt to improve on that situation, but this is not much help in the complex case.

This function accepts but discards arguments bias and ddof. This is for backwards compatibility with previous versions of this function. These arguments had no effect on the return values of the function and can be safely ignored in this and previous versions of numpy.

dask.array.cos(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.cos.

Some inconsistencies with the Dask version may exist.

Cosine element-wise.

Parameters
x: array_like

Input array in radians.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The corresponding cosine values. This is a scalar if x is a scalar.

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.

Examples

>>> np.cos(np.array([0, np.pi/2, np.pi]))  
array([  1.00000000e+00,   6.12303177e-17,  -1.00000000e+00])
>>>
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d')  
>>> out2 = np.cos([0.1], out1)  
>>> out2 is out1  
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.cos(np.zeros((3,3)),np.zeros((2,2)))  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
dask.array.cosh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.cosh.

Some inconsistencies with the Dask version may exist.

Hyperbolic cosine, element-wise.

Equivalent to 1/2 * (np.exp(x) + np.exp(-x)) and np.cos(1j*x).
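
Both equivalences can be checked numerically with NumPy. The sample points below are arbitrary:

```python
import numpy as np

x = np.linspace(-2, 2, 9)

via_exp = 0.5 * (np.exp(x) + np.exp(-x))
via_cos = np.cos(1j * x)    # complex result with a zero imaginary part

print(np.allclose(np.cosh(x), via_exp))
print(np.allclose(np.cosh(x), via_cos.real))
```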

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Output array of same shape as x. This is a scalar if x is a scalar.

Examples

>>> np.cosh(0)  
1.0

The hyperbolic cosine describes the shape of a hanging cable:

>>> import matplotlib.pyplot as plt  
>>> x = np.linspace(-4, 4, 1000)  
>>> plt.plot(x, np.cosh(x))  
>>> plt.show()  
dask.array.count_nonzero(a, axis=None)

Counts the number of non-zero values in the array a.

This docstring was copied from numpy.count_nonzero.

Some inconsistencies with the Dask version may exist.

The word “non-zero” is in reference to the Python 2.x built-in method __nonzero__() (renamed __bool__() in Python 3.x) of Python objects that tests an object’s “truthfulness”. For example, any number is considered truthful if it is nonzero, whereas any string is considered truthful if it is not the empty string. Thus, this function (recursively) counts how many elements in a (and in sub-arrays thereof) have their __nonzero__() or __bool__() method evaluated to True.

Parameters
a: array_like

The array for which to count non-zeros.

axis: int or tuple, optional

Axis or tuple of axes along which to count non-zeros. Default is None, meaning that non-zeros will be counted along a flattened version of a.

New in version 1.12.0.

Returns
count: int or array of int

Number of non-zero values in the array along a given axis. Otherwise, the total number of non-zero values in the array is returned.

See also

nonzero

Return the coordinates of all the non-zero values.

Examples

>>> np.count_nonzero(np.eye(4))  
4
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]])  
5
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=0)  
array([1, 1, 1, 1, 1])
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]], axis=1)  
array([2, 3])
dask.array.cov(m, y=None, rowvar=1, bias=0, ddof=None)

Estimate a covariance matrix, given data and weights.

This docstring was copied from numpy.cov.

Some inconsistencies with the Dask version may exist.

Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, \(X = [x_1, x_2, ... x_N]^T\), then the covariance matrix element \(C_{ij}\) is the covariance of \(x_i\) and \(x_j\). The element \(C_{ii}\) is the variance of \(x_i\).

See the notes for an outline of the algorithm.

Parameters
m: array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of m represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y: array_like, optional

An additional set of variables and observations. y has the same form as that of m.

rowvar: bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.

biasbool, optional

Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. These values can be overridden by using the keyword ddof in numpy versions >= 1.5.

ddofint, optional

If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.

New in version 1.5.

fweightsarray_like, int, optional (Not supported in Dask)

1-D array of integer frequency weights; the number of times each observation vector should be repeated.

New in version 1.10.

aweightsarray_like, optional (Not supported in Dask)

1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.

New in version 1.10.

Returns
outndarray

The covariance matrix of the variables.

See also

corrcoef

Normalized covariance matrix

Notes

Assume that the observations are in the columns of the observation array m and let f = fweights and a = aweights for brevity. The steps to compute the weighted covariance are as follows:

>>> m = np.arange(10, dtype=np.float64)  
>>> f = np.arange(10) * 2  
>>> a = np.arange(10) ** 2.  
>>> ddof = 1  
>>> w = f * a  
>>> v1 = np.sum(w)  
>>> v2 = np.sum(w * a)  
>>> m -= np.sum(m * w, axis=None, keepdims=True) / v1  
>>> cov = np.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)  

Note that when a == 1, the normalization factor v1 / (v1**2 - ddof * v2) reduces to 1 / (np.sum(f) - ddof), as it should.
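
As a cross-check of the steps in these notes, the sketch below recomputes the weighted covariance by hand and compares it with np.cov. It uses NumPy directly, because fweights and aweights are marked as not supported in Dask; the variable names centered, manual, reference and unit_factor are chosen here for illustration.

```python
import numpy as np

m = np.arange(10, dtype=np.float64)   # one variable, ten observations
f = np.arange(10) * 2                 # integer frequency weights
a = np.arange(10) ** 2.0              # observation weights
ddof = 1

# Weighted covariance, following the steps listed in the notes.
w = f * a
v1 = np.sum(w)
v2 = np.sum(w * a)
centered = m - np.sum(m * w) / v1
manual = np.dot(centered * w, centered) * v1 / (v1**2 - ddof * v2)

# NumPy's own result for the same weights.
reference = np.cov(m, fweights=f, aweights=a, ddof=ddof)

# With a == 1 the normalization factor becomes 1 / (sum(f) - ddof).
ones = np.ones_like(a)
w1 = f * ones
unit_factor = np.sum(w1) / (np.sum(w1) ** 2 - ddof * np.sum(w1 * ones))
```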

Examples

Consider two variables, \(x_0\) and \(x_1\), which correlate perfectly, but in opposite directions:

>>> x = np.array([[0, 2], [1, 1], [2, 0]]).T  
>>> x  
array([[0, 1, 2],
       [2, 1, 0]])

Note how \(x_0\) increases while \(x_1\) decreases. The covariance matrix shows this clearly:

>>> np.cov(x)  
array([[ 1., -1.],
       [-1.,  1.]])

Note that element \(C_{0,1}\), which shows the correlation between \(x_0\) and \(x_1\), is negative.

Further, note how x and y are combined:

>>> x = [-2.1, -1,  4.3]  
>>> y = [3,  1.1,  0.12]  
>>> X = np.stack((x, y), axis=0)  
>>> np.cov(X)  
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x, y)  
array([[11.71      , -4.286     ], # may vary
       [-4.286     ,  2.144133]])
>>> np.cov(x)  
array(11.71)
dask.array.cumprod(x, axis=None, dtype=None, out=None)

Return the cumulative product of elements along a given axis.

This docstring was copied from numpy.cumprod.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Input array.

axisint, optional

Axis along which the cumulative product is computed. By default the input is flattened.

dtypedtype, optional

Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.

Returns
cumprodndarray

A new array holding the result is returned unless out is specified, in which case a reference to out is returned.

See also

ufuncs-output-type

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.
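
A small sketch of the modular behavior, using NumPy and an 8-bit accumulator so that the wraparound is easy to see. The array values are arbitrary; a Dask array is assumed to behave the same way, which is not verified here.

```python
import numpy as np

small = np.array([100, 2, 3], dtype=np.int8)

# Keep the accumulator at 8 bits so the wraparound is visible; without
# dtype=np.int8 NumPy would accumulate in the default platform integer.
wrapped = np.cumprod(small, dtype=np.int8)

# The same product in 64 bits, for comparison.
exact = np.cumprod(small.astype(np.int64))
```

Here 100 * 2 = 200 does not fit in int8 and wraps to -56, and no error is raised.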

Examples

>>> a = np.array([1,2,3])  
>>> np.cumprod(a) # intermediate results 1, 1*2  
...               # total product 1*2*3 = 6
array([1, 2, 6])
>>> a = np.array([[1, 2, 3], [4, 5, 6]])  
>>> np.cumprod(a, dtype=float) # specify type of output  
array([   1.,    2.,    6.,   24.,  120.,  720.])

The cumulative product for each column (i.e., over the rows) of a:

>>> np.cumprod(a, axis=0)  
array([[ 1,  2,  3],
       [ 4, 10, 18]])

The cumulative product for each row (i.e. over the columns) of a:

>>> np.cumprod(a,axis=1)  
array([[  1,   2,   6],
       [  4,  20, 120]])
dask.array.cumsum(x, axis=None, dtype=None, out=None)

Return the cumulative sum of the elements along a given axis.

This docstring was copied from numpy.cumsum.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Input array.

axisint, optional

Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.

dtypedtype, optional

Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See ufuncs-output-type for more details.

Returns
cumsum_along_axisndarray.

A new array holding the result is returned unless out is specified, in which case a reference to out is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.
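
The shape rule above can be checked with a short NumPy sketch. The names flat, by_col and recovered are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat = np.cumsum(a)             # axis=None: the input is flattened first
by_col = np.cumsum(a, axis=0)   # axis given: the shape of a is kept

# Differencing a cumulative sum recovers the original values.
recovered = np.diff(flat, prepend=0)
```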

See also

sum

Sum array elements.

trapz

Integration of array values using the composite trapezoidal rule.

diff

Calculate the n-th discrete difference along given axis.

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.

Examples

>>> a = np.array([[1,2,3], [4,5,6]])  
>>> a  
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.cumsum(a)  
array([ 1,  3,  6, 10, 15, 21])
>>> np.cumsum(a, dtype=float)     # specifies type of output value(s)  
array([  1.,   3.,   6.,  10.,  15.,  21.])
>>> np.cumsum(a,axis=0)      # sum over rows for each of the 3 columns  
array([[1, 2, 3],
       [5, 7, 9]])
>>> np.cumsum(a,axis=1)      # sum over columns for each of the 2 rows  
array([[ 1,  3,  6],
       [ 4,  9, 15]])
dask.array.deg2rad(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.deg2rad.

Some inconsistencies with the Dask version may exist.

Convert angles from degrees to radians.

Parameters
xarray_like

Angles in degrees.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

The corresponding angle in radians. This is a scalar if x is a scalar.

See also

rad2deg

Convert angles from radians to degrees.

unwrap

Remove large jumps in angle by wrapping.

Notes

New in version 1.3.0.

deg2rad(x) is x * pi / 180.
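
A quick numerical check of this identity, written against NumPy; the sample angles are arbitrary.

```python
import numpy as np

degrees = np.array([0.0, 90.0, 180.0, 360.0])

radians = np.deg2rad(degrees)
by_formula = degrees * np.pi / 180

# rad2deg is the inverse conversion.
round_trip = np.rad2deg(radians)
```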

Examples

>>> np.deg2rad(180)  
3.1415926535897931
dask.array.degrees(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.degrees.

Some inconsistencies with the Dask version may exist.

Convert angles from radians to degrees.

Parameters
xarray_like

Input array in radians.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray of floats

The corresponding degree values; if out was supplied this is a reference to it. This is a scalar if x is a scalar.

See also

rad2deg

equivalent function

Examples

Convert a radian array to degrees

>>> rad = np.arange(12.)*np.pi/6  
>>> np.degrees(rad)  
array([   0.,   30.,   60.,   90.,  120.,  150.,  180.,  210.,  240.,
        270.,  300.,  330.])
>>> out = np.zeros((rad.shape))  
>>> r = np.degrees(rad, out)  
>>> np.all(r == out)  
True
dask.array.diag(v)

Extract a diagonal or construct a diagonal array.

This docstring was copied from numpy.diag.

Some inconsistencies with the Dask version may exist.

See the more detailed documentation for numpy.diagonal if you use this function to extract a diagonal and wish to write to the resulting array; whether it returns a copy or a view depends on what version of numpy you are using.

Parameters
varray_like

If v is a 2-D array, return a copy of its k-th diagonal. If v is a 1-D array, return a 2-D array with v on the k-th diagonal.

kint, optional (Not supported in Dask)

Diagonal in question. The default is 0. Use k>0 for diagonals above the main diagonal, and k<0 for diagonals below the main diagonal.

Returns
outndarray

The extracted diagonal or constructed diagonal array.

See also

diagonal

Return specified diagonals.

diagflat

Create a 2-D array with the flattened input as a diagonal.

trace

Sum along diagonals.

triu

Upper triangle of an array.

tril

Lower triangle of an array.

Examples

>>> x = np.arange(9).reshape((3,3))  
>>> x  
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.diag(x)  
array([0, 4, 8])
>>> np.diag(x, k=1)  
array([1, 5])
>>> np.diag(x, k=-1)  
array([3, 7])
>>> np.diag(np.diag(x))  
array([[0, 0, 0],
       [0, 4, 0],
       [0, 0, 8]])
dask.array.diagonal(a, offset=0, axis1=0, axis2=1)

Return specified diagonals.

This docstring was copied from numpy.diagonal.

Some inconsistencies with the Dask version may exist.

If a is 2-D, returns the diagonal of a with the given offset, i.e., the collection of elements of the form a[i, i+offset]. If a has more than two dimensions, then the axes specified by axis1 and axis2 are used to determine the 2-D sub-array whose diagonal is returned. The shape of the resulting array can be determined by removing axis1 and axis2 and appending an index to the right equal to the size of the resulting diagonals.
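
The shape rule for higher-dimensional input can be illustrated as follows. This is a NumPy sketch with an arbitrary 4-D array; the axis choices are examples only.

```python
import numpy as np

a = np.zeros((2, 3, 4, 5))

# axis1=1 (size 3) and axis2=3 (size 5) are removed; the diagonal of each
# 3x5 sub-array has min(3, 5) = 3 elements and becomes the last axis.
d = np.diagonal(a, axis1=1, axis2=3)
print(d.shape)
```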

In versions of NumPy prior to 1.7, this function always returned a new, independent array containing a copy of the values in the diagonal.

In NumPy 1.7 and 1.8, it continues to return a copy of the diagonal, but depending on this fact is deprecated. Writing to the resulting array continues to work as it used to, but a FutureWarning is issued.

Starting in NumPy 1.9 it returns a read-only view on the original array. Attempting to write to the resulting array will produce an error.

In some future release, it will return a read/write view and writing to the returned array will alter your original array. The returned array will have the same type as the input array.

If you don’t write to the array returned by this function, then you can just ignore all of the above.

If you depend on the current behavior, then we suggest copying the returned array explicitly, i.e., use np.diagonal(a).copy() instead of just np.diagonal(a). This will work with both past and future versions of NumPy.
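
A minimal sketch of the copy recommendation, assuming NumPy 1.9 or later, where the returned view is read-only. The names view, safe and write_failed are illustrative.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

view = np.diagonal(a)          # read-only view of the diagonal
safe = np.diagonal(a).copy()   # independent, writeable copy

write_failed = False
try:
    view[0] = 100              # rejected: the view is read-only
except ValueError:
    write_failed = True

safe[0] = 100                  # allowed, and a is left unchanged
```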

Parameters
aarray_like

Array from which the diagonals are taken.

offsetint, optional

Offset of the diagonal from the main diagonal. Can be positive or negative. Defaults to main diagonal (0).

axis1int, optional

Axis to be used as the first axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to first axis (0).

axis2int, optional

Axis to be used as the second axis of the 2-D sub-arrays from which the diagonals should be taken. Defaults to second axis (1).

Returns
array_of_diagonalsndarray

If a is 2-D, then a 1-D array containing the diagonal and of the same type as a is returned unless a is a matrix, in which case a 1-D array rather than a (2-D) matrix is returned in order to maintain backward compatibility.

If a.ndim > 2, then the dimensions specified by axis1 and axis2 are removed, and a new axis inserted at the end corresponding to the diagonal.

Raises
ValueError

If the dimension of a is less than 2.

See also

diag

MATLAB work-a-like for 1-D and 2-D arrays.

diagflat

Create diagonal arrays.

trace

Sum along diagonals.

Examples

>>> a = np.arange(4).reshape(2,2)  
>>> a  
array([[0, 1],
       [2, 3]])
>>> a.diagonal()  
array([0, 3])
>>> a.diagonal(1)  
array([1])

A 3-D example:

>>> a = np.arange(8).reshape(2,2,2); a  
array([[[0, 1],
        [2, 3]],
       [[4, 5],
        [6, 7]]])
>>> a.diagonal(0,  # Main diagonals of two arrays created by skipping  
...            0,  # across the outer(left)-most axis last and
...            1)  # the "middle" (row) axis first.
array([[0, 6],
       [1, 7]])

The sub-arrays whose main diagonals we just obtained; note that each corresponds to fixing the right-most (column) axis, and that the diagonals are “packed” in rows.

>>> a[:,:,0]  # main diagonal is [0 6]  
array([[0, 2],
       [4, 6]])
>>> a[:,:,1]  # main diagonal is [1 7]  
array([[1, 3],
       [5, 7]])

The anti-diagonal can be obtained by reversing the order of elements using either numpy.flipud or numpy.fliplr.

>>> a = np.arange(9).reshape(3, 3)  
>>> a  
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> np.fliplr(a).diagonal()  # Horizontal flip  
array([2, 4, 6])
>>> np.flipud(a).diagonal()  # Vertical flip  
array([6, 4, 2])

Note that the order in which the diagonal is retrieved varies depending on the flip function.

dask.array.diff(a, n=1, axis=-1)

Calculate the n-th discrete difference along the given axis.

This docstring was copied from numpy.diff.

Some inconsistencies with the Dask version may exist.

The first difference is given by out[i] = a[i+1] - a[i] along the given axis; higher differences are calculated by applying diff recursively.
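
The definition above can be written out directly. This NumPy sketch compares the slicing form and the repeated application with np.diff; the names first and second are illustrative.

```python
import numpy as np

x = np.array([1, 2, 4, 7, 0])

first = x[1:] - x[:-1]          # out[i] = a[i+1] - a[i]
second = np.diff(np.diff(x))    # diff applied twice

same_first = np.array_equal(first, np.diff(x))
same_second = np.array_equal(second, np.diff(x, n=2))
```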

Parameters
aarray_like

Input array

nint, optional

The number of times values are differenced. If zero, the input is returned as-is.

axisint, optional

The axis along which the difference is taken, default is the last axis.

prepend, appendarray_like, optional

Values to prepend or append to a along axis prior to performing the difference. Scalar values are expanded to arrays with length 1 in the direction of axis and the shape of the input array along all other axes. Otherwise the dimension and shape must match a except along axis.

New in version 1.16.0.

Returns
diffndarray

The n-th differences. The shape of the output is the same as a except along axis where the dimension is smaller by n. The type of the output is the same as the type of the difference between any two elements of a. This is the same as the type of a in most cases. A notable exception is datetime64, which results in a timedelta64 output array.

See also

gradient, ediff1d, cumsum

Notes

Type is preserved for boolean arrays, so the result will contain False when consecutive elements are the same and True when they differ.
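
For boolean input this amounts to a change detector, as the following NumPy sketch with made-up flags shows.

```python
import numpy as np

flags = np.array([True, True, False, False, True])

# True marks positions where consecutive elements differ.
changes = np.diff(flags)
```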

For unsigned integer arrays, the results will also be unsigned. This should not be surprising, as the result is consistent with calculating the difference directly:

>>> u8_arr = np.array([1, 0], dtype=np.uint8)  
>>> np.diff(u8_arr)  
array([255], dtype=uint8)
>>> u8_arr[1,...] - u8_arr[0,...]  
255

If this is not desirable, then the array should be cast to a larger integer type first:

>>> i16_arr = u8_arr.astype(np.int16)  
>>> np.diff(i16_arr)  
array([-1], dtype=int16)

Examples

>>> x = np.array([1, 2, 4, 7, 0])  
>>> np.diff(x)  
array([ 1,  2,  3, -7])
>>> np.diff(x, n=2)  
array([  1,   1, -10])
>>> x = np.array([[1, 3, 6, 10], [0, 5, 6, 8]])  
>>> np.diff(x)  
array([[2, 3, 4],
       [5, 1, 2]])
>>> np.diff(x, axis=0)  
array([[-1,  2,  0, -2]])
>>> x = np.arange('1066-10-13', '1066-10-16', dtype=np.datetime64)  
>>> np.diff(x)  
array([1, 1], dtype='timedelta64[D]')
dask.array.digitize(a, bins, right=False)

Return the indices of the bins to which each value in input array belongs.

This docstring was copied from numpy.digitize.

Some inconsistencies with the Dask version may exist.

right    order of bins    returned index i satisfies
False    increasing       bins[i-1] <= x < bins[i]
True     increasing       bins[i-1] < x <= bins[i]
False    decreasing       bins[i-1] > x >= bins[i]
True     decreasing       bins[i-1] >= x > bins[i]

If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate.
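
A short NumPy sketch of the out-of-bounds case, with arbitrary sample values. The mask inside is an illustrative way to keep only indices that can safely be used to look up bin edges.

```python
import numpy as np

bins = np.array([0.0, 1.0, 2.5])
x = np.array([-3.0, 0.5, 2.5, 9.0])

inds = np.digitize(x, bins)
# -3.0 lies below bins[0] and maps to 0; 2.5 and 9.0 are at or above
# bins[-1] and map to len(bins).

# Only indices strictly between 0 and len(bins) have both bin edges.
inside = (inds > 0) & (inds < len(bins))
left_edges = bins[inds[inside] - 1]
```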

Parameters
xarray_like (Not supported in Dask)

Input array to be binned. Prior to NumPy 1.10.0, this array had to be 1-dimensional, but can now have any shape.

binsarray_like

Array of bins. It has to be 1-dimensional and monotonic.

rightbool, optional

Indicates whether the intervals include the right or the left bin edge. The default (right=False) means that the interval does not include the right edge; the left bin end is closed in this case, i.e., bins[i-1] <= x < bins[i] is the default behavior for monotonically increasing bins.

Returns
indicesndarray of ints

Output array of indices, of same shape as x.

Raises
ValueError

If bins is not monotonic.

TypeError

If the type of the input is complex.

See also

bincount, histogram, unique, searchsorted

Notes

If values in x are such that they fall outside the bin range, attempting to index bins with the indices that digitize returns will result in an IndexError.

New in version 1.10.0.

np.digitize is implemented in terms of np.searchsorted. This means that a binary search is used to bin the values, which scales much better for larger number of bins than the previous linear search. It also removes the requirement for the input array to be 1-dimensional.

For monotonically increasing bins, the following are equivalent:

np.digitize(x, bins, right=True)
np.searchsorted(bins, x, side='left')

Note that as the order of the arguments is reversed, the side must be too. The searchsorted call is marginally faster, as it does not do any monotonicity checks. Perhaps more importantly, it supports all dtypes.
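
The equivalence can be confirmed numerically for increasing bins. This NumPy sketch also checks the mirrored pairing of right=False with side='right'; the variable names are illustrative.

```python
import numpy as np

x = np.array([1.2, 10.0, 12.4, 15.5, 20.0])
bins = np.array([0, 5, 10, 15, 20])

right_closed = np.digitize(x, bins, right=True)
via_left = np.searchsorted(bins, x, side='left')

left_closed = np.digitize(x, bins, right=False)
via_right = np.searchsorted(bins, x, side='right')
```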

Examples

>>> x = np.array([0.2, 6.4, 3.0, 1.6])  
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])  
>>> inds = np.digitize(x, bins)  
>>> inds  
array([1, 4, 3, 2])
>>> for n in range(x.size):  
...   print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
...
0.0 <= 0.2 < 1.0
4.0 <= 6.4 < 10.0
2.5 <= 3.0 < 4.0
1.0 <= 1.6 < 2.5
>>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.])  
>>> bins = np.array([0, 5, 10, 15, 20])  
>>> np.digitize(x,bins,right=True)  
array([1, 2, 3, 4, 4])
>>> np.digitize(x,bins,right=False)  
array([1, 3, 3, 4, 5])
dask.array.dot(a, b, out=None)

This docstring was copied from numpy.dot.

Some inconsistencies with the Dask version may exist.

Dot product of two arrays. Specifically,

  • If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).

  • If both a and b are 2-D arrays, it is matrix multiplication, but using matmul() or a @ b is preferred.

  • If either a or b is 0-D (scalar), it is equivalent to multiply() and using numpy.multiply(a, b) or a * b is preferred.

  • If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.

  • If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:

    dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
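
The last two cases in the list above can be spot-checked element by element. This is a NumPy sketch with arbitrary shapes; the index values i, j, k, m are picked only for the check.

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
v = np.arange(4)
b = np.arange(5 * 4 * 6).reshape(5, 4, 6)

# N-D with 1-D: sum product over the last axis of both.
nd_1d = np.dot(a, v)

# N-D with M-D: last axis of a against the second-to-last axis of b.
nd_md = np.dot(a, b)

i, j, k, m = 1, 2, 3, 4
single = np.sum(a[i, j, :] * b[k, :, m])
```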
    
Parameters
aarray_like

First argument.

barray_like

Second argument.

outndarray, optional

Output argument. This must have the exact kind that would be returned if it was not used. In particular, it must have the right type, must be C-contiguous, and its dtype must be the dtype that would be returned for dot(a,b). This is a performance feature. Therefore, if these conditions are not met, an exception is raised, instead of attempting to be flexible.

Returns
outputndarray

Returns the dot product of a and b. If a and b are both scalars or both 1-D arrays then a scalar is returned; otherwise an array is returned. If out is given, then it is returned.

Raises
ValueError

If the last dimension of a is not the same size as the second-to-last dimension of b.

See also

vdot

Complex-conjugating dot product.

tensordot

Sum products over arbitrary axes.

einsum

Einstein summation convention.

matmul

‘@’ operator as method with out parameter.

Examples

>>> np.dot(3, 4)  
12

Neither argument is complex-conjugated:

>>> np.dot([2j, 3j], [2j, 3j])  
(-13+0j)

For 2-D arrays it is the matrix product:

>>> a = [[1, 0], [0, 1]]  
>>> b = [[4, 1], [2, 2]]  
>>> np.dot(a, b)  
array([[4, 1],
       [2, 2]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))  
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))  
>>> np.dot(a, b)[2,3,2,1,2,2]  
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])  
499128
dask.array.dstack(tup, allow_unknown_chunksizes=False)

Stack arrays in sequence depth wise (along third axis).

This docstring was copied from numpy.dstack.

Some inconsistencies with the Dask version may exist.

This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1). Rebuilds arrays divided by dsplit.
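
The reshape-then-concatenate description can be reproduced by hand for 1-D input, as in this NumPy sketch; manual is an illustrative name.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

stacked = np.dstack((a, b))

# Reshape each (N,) array to (1, N, 1), then join along the third axis.
manual = np.concatenate([a.reshape(1, 3, 1), b.reshape(1, 3, 1)], axis=2)
```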

This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.

Parameters
tupsequence of arrays

The arrays must have the same shape along all but the third axis. 1-D or 2-D arrays must have the same shape.

Returns
stackedndarray

The array formed by stacking the given arrays. It will be at least 3-D.

See also

stack

Join a sequence of arrays along a new axis.

vstack

Stack along first axis.

hstack

Stack along second axis.

concatenate

Join a sequence of arrays along an existing axis.

dsplit

Split array along third axis.

Examples

>>> a = np.array((1,2,3))  
>>> b = np.array((2,3,4))  
>>> np.dstack((a,b))  
array([[[1, 2],
        [2, 3],
        [3, 4]]])
>>> a = np.array([[1],[2],[3]])  
>>> b = np.array([[2],[3],[4]])  
>>> np.dstack((a,b))  
array([[[1, 2]],
       [[2, 3]],
       [[3, 4]]])
dask.array.ediff1d(ary, to_end=None, to_begin=None)

The differences between consecutive elements of an array.

This docstring was copied from numpy.ediff1d.

Some inconsistencies with the Dask version may exist.

Parameters
aryarray_like

If necessary, will be flattened before the differences are taken.

to_endarray_like, optional

Number(s) to append at the end of the returned differences.

to_beginarray_like, optional

Number(s) to prepend at the beginning of the returned differences.

Returns
ediff1dndarray

The differences. Loosely, this is ary.flat[1:] - ary.flat[:-1].
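
The loose definition can be spelled out for 2-D input, where flattening happens first. A NumPy sketch with illustrative names:

```python
import numpy as np

y = np.array([[1, 2, 4], [1, 6, 24]])

flat = y.ravel()
manual = flat[1:] - flat[:-1]   # differences of the flattened array
result = np.ediff1d(y)
```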

See also

diff, gradient

Notes

When applied to masked arrays, this function drops the mask information if the to_begin and/or to_end parameters are used.

Examples

>>> x = np.array([1, 2, 4, 7, 0])  
>>> np.ediff1d(x)  
array([ 1,  2,  3, -7])
>>> np.ediff1d(x, to_begin=-99, to_end=np.array([88, 99]))  
array([-99,   1,   2, ...,  -7,  88,  99])

The returned array is always 1D.

>>> y = [[1, 2, 4], [1, 6, 24]]  
>>> np.ediff1d(y)  
array([ 1,  2, -3,  5, 18])
dask.array.empty(*args, **kwargs)

Blocked variant of empty

Follows the signature of empty exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.

Original signature follows below.

empty(shape, dtype=float, order='C')

Return a new array of given shape and type, without initializing entries.

Parameters
shapeint or tuple of int

Shape of the empty array, e.g., (2, 3) or 2.

dtypedata-type, optional

Desired output data-type for the array, e.g., numpy.int8. Default is numpy.float64.

order{‘C’, ‘F’}, optional, default: ‘C’

Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

Returns
outndarray

Array of uninitialized (arbitrary) data of the given shape, dtype, and order. Object arrays will be initialized to None.

See also

empty_like

Return an empty array with shape and type of input.

ones

Return a new array setting values to one.

zeros

Return a new array setting values to zero.

full

Return a new array of given shape filled with value.

Notes

empty, unlike zeros, does not set the array values to zero, and may therefore be marginally faster. On the other hand, it requires the user to manually set all the values in the array, and should be used with caution.

Examples

>>> np.empty([2, 2])
array([[ -9.74499359e+001,   6.69583040e-309],
       [  2.13182611e-314,   3.06959433e-309]])         #uninitialized
>>> np.empty([2, 2], dtype=int)
array([[-1073741821, -1067949133],
       [  496041986,    19249760]])                     #uninitialized
dask.array.empty_like(a, dtype=None, order='C', chunks=None, name=None, shape=None)

Return a new array with the same shape and type as a given array.

Parameters
aarray_like

The shape and data-type of a define these same attributes of the returned array.

dtypedata-type, optional

Overrides the data type of the result.

order{‘C’, ‘F’}, optional

Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.

chunkssequence of ints

The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.

namestr, optional

An optional keyname for the array. Defaults to hashing the input keyword arguments.

shapeint or sequence of ints, optional.

Overrides the shape of the result.
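
To illustrate the remark about the last block under chunks, the following pure-Python sketch computes block sizes along one axis. The helper block_sizes is hypothetical and written only for this example; it is not part of the Dask API and does not reproduce Dask's chunk normalization.

```python
def block_sizes(length, chunk):
    """Hypothetical helper: sizes of consecutive blocks along one axis."""
    full, remainder = divmod(length, chunk)
    sizes = (chunk,) * full
    if remainder:
        sizes += (remainder,)   # the last block is smaller
    return sizes

uneven = block_sizes(10, 4)   # 10 % 4 != 0, so the last block is short
even = block_sizes(12, 4)     # 12 % 4 == 0, so all blocks are equal
```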

Returns
outndarray

Array of uninitialized (arbitrary) data with the same shape and type as a.

See also

ones_like

Return an array of ones with shape and type of input.

zeros_like

Return an array of zeros with shape and type of input.

empty

Return a new uninitialized array.

ones

Return a new array setting values to one.

zeros

Return a new array setting values to zero.

Notes

This function does not initialize the returned array; to do that use zeros_like or ones_like instead. It may be marginally faster than the functions that do set the array values.

dask.array.einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe', optimize=False)

This docstring was copied from numpy.einsum.

Some inconsistencies with the Dask version may exist.

Evaluates the Einstein summation convention on the operands.

Using the Einstein summation convention, many common multi-dimensional, linear algebraic array operations can be represented in a simple fashion. In implicit mode einsum computes these values.

In explicit mode, einsum provides further flexibility to compute other array operations that might not be considered classical Einstein summation operations, by disabling, or forcing summation over specified subscript labels.

See the notes and examples for clarification.

Parameters
subscriptsstr

Specifies the subscripts for summation as a comma-separated list of subscript labels. An implicit (classical Einstein summation) calculation is performed unless the explicit indicator ‘->’ is included as well as subscript labels of the precise output form.

operandslist of array_like

These are the arrays for the operation.

outndarray, optional

If provided, the calculation is done into this array.

dtype{data-type, None}, optional

If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal casting parameter to allow the conversions. Default is None.

order{‘C’, ‘F’, ‘A’, ‘K’}, optional

Controls the memory layout of the output. ‘C’ means it should be C contiguous. ‘F’ means it should be Fortran contiguous. ‘A’ means it should be ‘F’ if the inputs are all ‘F’, ‘C’ otherwise. ‘K’ means it should be as close to the layout of the inputs as possible, including arbitrarily permuted axes. Default is ‘K’.

casting{‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional

Controls what kind of data casting may occur. Setting this to ‘unsafe’ is not recommended, as it can adversely affect accumulations.

  • ‘no’ means the data types should not be cast at all.

  • ‘equiv’ means only byte-order changes are allowed.

  • ‘safe’ means only casts which can preserve values are allowed.

  • ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.

  • ‘unsafe’ means any data conversions may be done.

Default is ‘safe’.

optimize{False, True, ‘greedy’, ‘optimal’}, optional

Controls whether intermediate optimization should occur. No optimization occurs if False; True defaults to the ‘greedy’ algorithm. Also accepts an explicit contraction list from the np.einsum_path function. See np.einsum_path for more details. Defaults to False.

Returns
outputndarray

The calculation based on the Einstein summation convention.

See also

einsum_path, dot, inner, outer, tensordot, linalg.multi_dot

Notes

New in version 1.6.0.

The Einstein summation convention can be used to compute many multi-dimensional, linear algebraic array operations. einsum provides a succinct way of representing these.

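A non-exhaustive list of these operations, which can be computed by einsum, is shown below along with examples:

  • Trace of an array, numpy.trace.

  • Return a diagonal, numpy.diag.

  • Array axis summations, numpy.sum.

  • Transpositions and permutations, numpy.transpose.

  • Matrix multiplication and dot product, numpy.matmul and numpy.dot.

  • Vector inner and outer products, numpy.inner and numpy.outer.

  • Broadcasting, element-wise and scalar multiplication, numpy.multiply.

  • Tensor contractions, numpy.tensordot.

  • Chained array operations, in efficient calculation order, numpy.einsum_path.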

The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Whenever a label is repeated it is summed, so np.einsum('i,i', a, b) is equivalent to np.inner(a,b). If a label appears only once, it is not summed, so np.einsum('i', a) produces a view of a with no changes. A further example np.einsum('ij,jk', a, b) describes traditional matrix multiplication and is equivalent to np.matmul(a,b). Repeated subscript labels in one operand take the diagonal. For example, np.einsum('ii', a) is equivalent to np.trace(a).
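
The equivalences named in this paragraph can be checked with a few small arrays. This is a NumPy sketch; the operand values are arbitrary.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = np.arange(9, 18).reshape(3, 3)
u = np.arange(3)
v = np.arange(3, 6)

inner = np.einsum('i,i', u, v)       # repeated label across operands: summed
matmul = np.einsum('ij,jk', a, b)    # j is summed, i and k remain
trace = np.einsum('ii', a)           # repeated label within one operand
```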

In implicit mode, the chosen subscripts are important since the axes of the output are reordered alphabetically. This means that np.einsum('ij', a) doesn’t affect a 2D array, while np.einsum('ji', a) takes its transpose. Additionally, np.einsum('ij,jk', a, b) returns a matrix multiplication, while np.einsum('ij,jh', a, b) returns the transpose of the multiplication since subscript ‘h’ precedes subscript ‘i’.

In explicit mode the output can be directly controlled by specifying output subscript labels. This requires the identifier ‘->’ as well as the list of output subscript labels. This feature increases the flexibility of the function since summing can be disabled or forced when required. The call np.einsum('i->', a) is like np.sum(a, axis=-1), and np.einsum('ii->i', a) is like np.diag(a). The difference is that einsum does not allow broadcasting by default. Additionally np.einsum('ij,jh->ih', a, b) directly specifies the order of the output subscript labels and therefore returns matrix multiplication, unlike the example above in implicit mode.
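
The difference between implicit and explicit output ordering, and the two explicit reductions mentioned above, can be seen in this NumPy sketch with arbitrary operands.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)

implicit = np.einsum('ij,jh', a, b)       # output axes sorted: 'hi'
explicit = np.einsum('ij,jh->ih', a, b)   # output axes as requested

sq = np.arange(9).reshape(3, 3)
total = np.einsum('i->', np.arange(4))    # sum over the only axis
diag = np.einsum('ii->i', sq)             # diagonal, not summed
```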

To enable and control broadcasting, use an ellipsis. Default NumPy-style broadcasting is done by adding an ellipsis to the left of each term, like np.einsum('...ii->...i', a). To take the trace along the first and last axes, you can do np.einsum('i...i', a), or to do a matrix-matrix product with the left-most indices instead of rightmost, one can do np.einsum('ij...,jk...->ik...', a, b).
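
The three ellipsis forms from this paragraph are sketched below with NumPy. The shapes are arbitrary and chosen so that each result can be compared with a more familiar call.

```python
import numpy as np

# Diagonal of each 3x3 matrix in a stack of two.
stack = np.arange(2 * 3 * 3).reshape(2, 3, 3)
diags = np.einsum('...ii->...i', stack)

# Trace over the first and last axes.
cube = np.arange(3 * 4 * 3).reshape(3, 4, 3)
traces = np.einsum('i...i', cube)

# Matrix product over the two left-most axes, batched over the last one.
a = np.arange(2 * 3 * 5).reshape(2, 3, 5)
b = np.arange(3 * 4 * 5).reshape(3, 4, 5)
prod = np.einsum('ij...,jk...->ik...', a, b)
```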

When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as np.einsum('ii->i', a) produces a view (changed in version 1.10.0).

einsum also provides an alternative way to provide the subscripts and operands as einsum(op0, sublist0, op1, sublist1, ..., [sublistout]). If the output sublist is not provided in this format, einsum is evaluated in implicit mode; otherwise it is evaluated explicitly. The examples below show the corresponding einsum call in both formats.

New in version 1.10.0.

Views returned from einsum are now writeable whenever the input array is writeable. For example, np.einsum('ijk...->kji...', a) will now have the same effect as np.swapaxes(a, 0, 2) and np.einsum('ii->i', a) will return a writeable view of the diagonal of a 2D array.

New in version 1.12.0.

Added the optimize argument which will optimize the contraction order of an einsum expression. For a contraction with three or more operands this can greatly increase the computational efficiency at the cost of a larger memory footprint during computation.

Typically a ‘greedy’ algorithm is applied which empirical tests have shown returns the optimal path in the majority of cases. In some cases ‘optimal’ will return the superlative path through a more expensive, exhaustive search. For iterative calculations it may be advisable to calculate the optimal path once and reuse that path by supplying it as an argument. An example is given below.

See numpy.einsum_path() for more details.

Examples

>>> a = np.arange(25).reshape(5,5)  
>>> b = np.arange(5)  
>>> c = np.arange(6).reshape(2,3)  

Trace of a matrix:

>>> np.einsum('ii', a)  
60
>>> np.einsum(a, [0,0])  
60
>>> np.trace(a)  
60

Extract the diagonal (requires explicit form):

>>> np.einsum('ii->i', a)  
array([ 0,  6, 12, 18, 24])
>>> np.einsum(a, [0,0], [0])  
array([ 0,  6, 12, 18, 24])
>>> np.diag(a)  
array([ 0,  6, 12, 18, 24])

Sum over an axis (requires explicit form):

>>> np.einsum('ij->i', a)  
array([ 10,  35,  60,  85, 110])
>>> np.einsum(a, [0,1], [0])  
array([ 10,  35,  60,  85, 110])
>>> np.sum(a, axis=1)  
array([ 10,  35,  60,  85, 110])

For higher dimensional arrays summing a single axis can be done with ellipsis:

>>> np.einsum('...j->...', a)  
array([ 10,  35,  60,  85, 110])
>>> np.einsum(a, [Ellipsis,1], [Ellipsis])  
array([ 10,  35,  60,  85, 110])

Compute a matrix transpose, or reorder any number of axes:

>>> np.einsum('ji', c)  
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum('ij->ji', c)  
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.einsum(c, [1,0])  
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> np.transpose(c)  
array([[0, 3],
       [1, 4],
       [2, 5]])

Vector inner products:

>>> np.einsum('i,i', b, b)  
30
>>> np.einsum(b, [0], b, [0])  
30
>>> np.inner(b,b)  
30

Matrix vector multiplication:

>>> np.einsum('ij,j', a, b)  
array([ 30,  80, 130, 180, 230])
>>> np.einsum(a, [0,1], b, [1])  
array([ 30,  80, 130, 180, 230])
>>> np.dot(a, b)  
array([ 30,  80, 130, 180, 230])
>>> np.einsum('...j,j', a, b)  
array([ 30,  80, 130, 180, 230])

Broadcasting and scalar multiplication:

>>> np.einsum('..., ...', 3, c)  
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(',ij', 3, c)  
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.einsum(3, [Ellipsis], c, [Ellipsis])  
array([[ 0,  3,  6],
       [ 9, 12, 15]])
>>> np.multiply(3, c)  
array([[ 0,  3,  6],
       [ 9, 12, 15]])

Vector outer product:

>>> np.einsum('i,j', np.arange(2)+1, b)  
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.einsum(np.arange(2)+1, [0], b, [1])  
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])
>>> np.outer(np.arange(2)+1, b)  
array([[0, 1, 2, 3, 4],
       [0, 2, 4, 6, 8]])

Tensor contraction:

>>> a = np.arange(60.).reshape(3,4,5)  
>>> b = np.arange(24.).reshape(4,3,2)  
>>> np.einsum('ijk,jil->kl', a, b)  
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.einsum(a, [0,1,2], b, [1,0,3], [2,3])  
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> np.tensordot(a,b, axes=([1,0],[0,1]))  
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])

Writeable returned arrays (since version 1.10.0):

>>> a = np.zeros((3, 3))  
>>> np.einsum('ii->i', a)[:] = 1  
>>> a  
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Example of ellipsis use:

>>> a = np.arange(6).reshape((3,2))  
>>> b = np.arange(12).reshape((4,3))  
>>> np.einsum('ki,jk->ij', a, b)  
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('ki,...k->i...', a, b)  
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])
>>> np.einsum('k...,jk', a, b)  
array([[10, 28, 46, 64],
       [13, 40, 67, 94]])

Chained array operations. For more complicated contractions, speed ups might be achieved by repeatedly computing a ‘greedy’ path or pre-computing the ‘optimal’ path and repeatedly applying it, using an einsum_path insertion (since version 1.12.0). Performance improvements can be particularly significant with larger arrays:

>>> a = np.ones(64).reshape(2,4,8)  

Basic einsum: ~1520ms (benchmarked on 3.1GHz Intel i5.)

>>> for iteration in range(500):  
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a)

Sub-optimal einsum (due to repeated path calculation time): ~330ms

>>> for iteration in range(500):  
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')

Greedy einsum (faster optimal path approximation): ~160ms

>>> for iteration in range(500):  
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='greedy')

Optimal einsum (best usage pattern in some use cases): ~110ms

>>> path = np.einsum_path('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize='optimal')[0]  
>>> for iteration in range(500):  
...     _ = np.einsum('ijk,ilm,njm,nlk,abc->',a,a,a,a,a, optimize=path)
dask.array.exp(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.exp.

Some inconsistencies with the Dask version may exist.

Calculate the exponential of all elements in the input array.

Parameters
x : array_like

Input values.

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out : ndarray or scalar

Output array, element-wise exponential of x. This is a scalar if x is a scalar.

See also

expm1

Calculate exp(x) - 1 for all elements in the array.

exp2

Calculate 2**x for all elements in the array.

Notes

The irrational number e is also known as Euler’s number. It is approximately 2.718281, and is the base of the natural logarithm, ln (this means that, if \(x = \ln y = \log_e y\), then \(e^x = y\)). For real input, exp(x) is always positive.

For complex arguments, x = a + ib, we can write \(e^x = e^a e^{ib}\). The first term, \(e^a\), is already known (it is the real argument, described above). The second term, \(e^{ib}\), is \(\cos b + i \sin b\), a function with magnitude 1 and a periodic phase.
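The decomposition above can be verified numerically. The following is a minimal sketch with arbitrarily chosen values for a and b; it is not part of the original docstring:

```python
import numpy as np

a, b = 0.5, 2.0          # arbitrary real and imaginary parts
x = a + 1j * b

lhs = np.exp(x)
rhs = np.exp(a) * (np.cos(b) + 1j * np.sin(b))

magnitude = np.abs(lhs)   # equals e**a
phase = np.angle(lhs)     # equals b here because -pi < b <= pi
```

For values of b outside (-pi, pi], np.angle returns the phase wrapped into that interval, which is the periodic behaviour the note describes.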

References

[1] Wikipedia, “Exponential function”, https://en.wikipedia.org/wiki/Exponential_function

[2] M. Abramowitz and I. A. Stegun, “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables,” Dover, 1964, p. 69, http://www.math.sfu.ca/~cbm/aands/page_69.htm

Examples

Plot the magnitude and phase of exp(x) in the complex plane:

>>> import matplotlib.pyplot as plt  
>>> x = np.linspace(-2*np.pi, 2*np.pi, 100)  
>>> xx = x + 1j * x[:, np.newaxis] # a + ib over complex plane  
>>> out = np.exp(xx)  
>>> plt.subplot(121)  
>>> plt.imshow(np.abs(out),  
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='gray')
>>> plt.title('Magnitude of exp(x)')  
>>> plt.subplot(122)  
>>> plt.imshow(np.angle(out),  
...            extent=[-2*np.pi, 2*np.pi, -2*np.pi, 2*np.pi], cmap='hsv')
>>> plt.title('Phase (angle) of exp(x)')  
>>> plt.show()  
dask.array.expm1(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.expm1.

Some inconsistencies with the Dask version may exist.

Calculate exp(x) - 1 for all elements in the array.

Parameters
x : array_like

Input values.

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out : ndarray or scalar

Element-wise exponential minus one: out = exp(x) - 1. This is a scalar if x is a scalar.

See also

log1p

log(1 + x), the inverse of expm1.

Notes

This function provides greater precision than exp(x) - 1 for small values of x.

Examples

The true value of exp(1e-10) - 1 is 1.00000000005e-10 to about 32 significant digits. This example shows the superiority of expm1 in this case.

>>> np.expm1(1e-10)  
1.00000000005e-10
>>> np.exp(1e-10) - 1  
1.000000082740371e-10
dask.array.eye(N, chunks='auto', M=None, k=0, dtype=<class 'float'>)

Return a 2-D Array with ones on the diagonal and zeros elsewhere.

Parameters
N : int

Number of rows in the output.

chunks : int, str

How to chunk the array. Must be one of the following forms:

  • A blocksize like 1000.

  • A size in bytes, like “100 MiB” which will choose a uniform block-like shape

  • The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size

M : int, optional

Number of columns in the output. If None, defaults to N.

k : int, optional

Index of the diagonal: 0 (the default) refers to the main diagonal, a positive value refers to an upper diagonal, and a negative value to a lower diagonal.

dtype : data-type, optional

Data-type of the returned array.

Returns
I : Array of shape (N,M)

An array where all elements are equal to zero, except for the k-th diagonal, whose values are equal to one.
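A short sketch of how the parameters combine, assuming dask is installed and importable as dask.array. The sizes, the chunk size, and the variable names are arbitrary choices for illustration:

```python
import dask.array as da
import numpy as np

# 4x5 array with ones on the first upper diagonal, split into blocks of size 2.
arr = da.eye(4, chunks=2, M=5, k=1, dtype=int)

result = arr.compute()   # materialise the lazy Dask array as a NumPy array
```

The computed values should match np.eye(4, 5, k=1, dtype=int); only the chunked, lazy representation differs.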

dask.array.fabs(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.fabs.

Some inconsistencies with the Dask version may exist.

Compute the absolute values element-wise.

This function returns the absolute values (positive magnitude) of the data in x. Complex values are not handled; use absolute to find the absolute values of complex data.

Parameters
x : array_like

The array of numbers for which the absolute values are required. If x is a scalar, the result y will also be a scalar.

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y : ndarray or scalar

The absolute values of x. The returned values are always floats. This is a scalar if x is a scalar.

See also

absolute

Absolute values including complex types.

Examples

>>> np.fabs(-1)  
1.0
>>> np.fabs([-1.2, 1.2])  
array([ 1.2,  1.2])
dask.array.fix(*args, **kwargs)

Round to nearest integer towards zero.

This docstring was copied from numpy.fix.

Some inconsistencies with the Dask version may exist.

Round an array of floats element-wise to nearest integer towards zero. The rounded values are returned as floats.

Parameters
x : array_like (Not supported in Dask)

An array of floats to be rounded

y : ndarray, optional

Output array

Returns
out : ndarray of floats (Not supported in Dask)

The array of rounded numbers

See also

trunc, floor, ceil
around

Round to given number of decimals

Examples

>>> np.fix(3.14)  
3.0
>>> np.fix(3)  
3.0
>>> np.fix([2.1, 2.9, -2.1, -2.9])  
array([ 2.,  2., -2., -2.])
dask.array.flatnonzero(a)

Return indices that are non-zero in the flattened version of a.

This docstring was copied from numpy.flatnonzero.

Some inconsistencies with the Dask version may exist.

This is equivalent to np.nonzero(np.ravel(a))[0].

Parameters
a : array_like

Input data.

Returns
res : ndarray

Output array, containing the indices of the elements of a.ravel() that are non-zero.

See also

nonzero

Return the indices of the non-zero elements of the input array.

ravel

Return a 1-D array containing the elements of the input array.

Examples

>>> x = np.arange(-2, 3)  
>>> x  
array([-2, -1,  0,  1,  2])
>>> np.flatnonzero(x)  
array([0, 1, 3, 4])

Use the indices of the non-zero elements as an index array to extract these elements:

>>> x.ravel()[np.flatnonzero(x)]  
array([-2, -1,  1,  2])
dask.array.flip(m, axis)

Reverse element order along axis.

Parameters
axis : int

Axis to reverse element order of.

Returns
reversed array : ndarray
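A minimal usage sketch, assuming dask is installed and importable as dask.array. The input array and chunking are arbitrary choices for illustration:

```python
import dask.array as da
import numpy as np

# Wrap a small NumPy array: [[0, 1, 2], [3, 4, 5]]
m = da.from_array(np.arange(6).reshape(2, 3), chunks=(1, 3))

rows_reversed = da.flip(m, 0).compute()   # reverse along axis 0
cols_reversed = da.flip(m, 1).compute()   # reverse along axis 1
```

Reversing along axis 0 swaps the two rows, while reversing along axis 1 reverses the order of the elements within each row.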
dask.array.flipud(m)

Flip array in the up/down direction.

This docstring was copied from numpy.flipud.

Some inconsistencies with the Dask version may exist.

Flip the entries in each column in the up/down direction. Rows are preserved, but appear in a different order than before.

Parameters
m : array_like

Input array.

Returns
out : array_like

A view of m with the rows reversed. Since a view is returned, this operation is \(\mathcal O(1)\).

See also

fliplr

Flip array in the left/right direction.

rot90

Rotate array counterclockwise.

Notes

Equivalent to m[::-1,...]. Does not require the array to be two-dimensional.

Examples

>>> A = np.diag([1.0, 2, 3])  
>>> A  
array([[1.,  0.,  0.],
       [0.,  2.,  0.],
       [0.,  0.,  3.]])
>>> np.flipud(A)  
array([[0.,  0.,  3.],
       [0.,  2.,  0.],
       [1.,  0.,  0.]])
>>> A = np.random.randn(2,3,5)  
>>> np.all(np.flipud(A) == A[::-1,...])  
True
>>> np.flipud([1,2])  
array([2, 1])
dask.array.fliplr(m)

Flip array in the left/right direction.

This docstring was copied from numpy.fliplr.

Some inconsistencies with the Dask version may exist.

Flip the entries in each row in the left/right direction. Columns are preserved, but appear in a different order than before.

Parameters
m : array_like

Input array, must be at least 2-D.

Returns
f : ndarray

A view of m with the columns reversed. Since a view is returned, this operation is \(\mathcal O(1)\).

See also

flipud

Flip array in the up/down direction.

rot90

Rotate array counterclockwise.

Notes

Equivalent to m[:,::-1]. Requires the array to be at least 2-D.

Examples

>>> A = np.diag([1.,2.,3.])  
>>> A  
array([[1.,  0.,  0.],
       [0.,  2.,  0.],
       [0.,  0.,  3.]])
>>> np.fliplr(A)  
array([[0.,  0.,  1.],
       [0.,  2.,  0.],
       [3.,  0.,  0.]])
>>> A = np.random.randn(2,3,5)  
>>> np.all(np.fliplr(A) == A[:,::-1,...])  
True
dask.array.floor(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.floor.

Some inconsistencies with the Dask version may exist.

Return the floor of the input, element-wise.

The floor of the scalar x is the largest integer i, such that i <= x. It is often denoted as \(\lfloor x \rfloor\).

Parameters
x : array_like

Input data.

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y : ndarray or scalar

The floor of each element in x. This is a scalar if x is a scalar.

See also

ceil, trunc, rint

Notes

Some spreadsheet programs calculate the “floor-towards-zero”, in other words floor(-2.5) == -2. NumPy instead uses the definition of floor where floor(-2.5) == -3.
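The two conventions can be compared directly. In this small sketch, np.trunc stands in for the “floor-towards-zero” behaviour; the sample values are arbitrary:

```python
import numpy as np

x = np.array([-2.5, -0.5, 0.5, 2.5])

floored = np.floor(x)    # rounds toward negative infinity
truncated = np.trunc(x)  # rounds toward zero ("floor-towards-zero")
```

The two results differ only for the negative, non-integer inputs: floor(-2.5) gives -3 whereas trunc(-2.5) gives -2.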

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])  
>>> np.floor(a)  
array([-2., -2., -1.,  0.,  1.,  1.,  2.])
dask.array.fmax(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.fmax.

Some inconsistencies with the Dask version may exist.

Element-wise maximum of array elements.

Compares two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.

Parameters
x1, x2 : array_like

The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y : ndarray or scalar

The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.

See also

fmin

Element-wise minimum of two arrays, ignores NaNs.

maximum

Element-wise maximum of two arrays, propagates NaNs.

amax

The maximum value of an array along a given axis, propagates NaNs.

nanmax

The maximum value of an array along a given axis, ignores NaNs.

minimum, amin, nanmin

Notes

New in version 1.3.0.

fmax is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.
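The difference in NaN handling between fmax and maximum, and the np.where equivalence for NaN-free input, can be sketched as follows. The sample arrays are arbitrary and chosen only for illustration:

```python
import numpy as np

x1 = np.array([1.0, np.nan, 3.0, np.nan])
x2 = np.array([2.0, 5.0, np.nan, np.nan])

ignoring = np.fmax(x1, x2)        # NaN only where both inputs are NaN
propagating = np.maximum(x1, x2)  # NaN wherever either input is NaN

# Without NaNs, fmax matches the np.where formulation.
p = np.array([2, 3, 4])
q = np.array([1, 5, 2])
same = np.array_equal(np.fmax(p, q), np.where(p >= q, p, q))
```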

Examples

>>> np.fmax([2, 3, 4], [1, 5, 2])  
array([2, 5, 4])
>>> np.fmax(np.eye(2), [0.5, 2])  
array([[ 1. ,  2. ],
       [ 0.5,  2. ]])
>>> np.fmax([np.nan, 0, np.nan],[0, np.nan, np.nan])  
array([ 0.,  0., nan])
dask.array.fmin(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.fmin.

Some inconsistencies with the Dask version may exist.

Element-wise minimum of array elements.

Compares two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then the non-NaN element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are ignored when possible.

Parameters
x1, x2 : array_like

The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y : ndarray or scalar

The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.

See also

fmax

Element-wise maximum of two arrays, ignores NaNs.

minimum

Element-wise minimum of two arrays, propagates NaNs.

amin

The minimum value of an array along a given axis, propagates NaNs.

nanmin

The minimum value of an array along a given axis, ignores NaNs.

maximum, amax, nanmax

Notes

New in version 1.3.0.

fmin is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.

Examples

>>> np.fmin([2, 3, 4], [1, 5, 2])  
array([1, 3, 2])
>>> np.fmin(np.eye(2), [0.5, 2])  
array([[ 0.5,  0. ],
       [ 0. ,  1. ]])
>>> np.fmin([np.nan, 0, np.nan],[0, np.nan, np.nan])  
array([ 0.,  0., nan])
dask.array.fmod(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.fmod.

Some inconsistencies with the Dask version may exist.

Return the element-wise remainder of division.

This is the NumPy implementation of the C library function fmod. The remainder has the same sign as the dividend x1. It is equivalent to the Matlab(TM) rem function and should not be confused with the Python modulus operator x1 % x2.

Parameters
x1 : array_like

Dividend.

x2 : array_like

Divisor. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y : array_like

The remainder of the division of x1 by x2. This is a scalar if both x1 and x2 are scalars.

See also

remainder

Equivalent to the Python % operator.

divide

Notes

The result of the modulo operation for negative dividend and divisors is bound by conventions. For fmod, the sign of result is the sign of the dividend, while for remainder the sign of the result is the sign of the divisor. The fmod function is equivalent to the Matlab(TM) rem function.

Examples

>>> np.fmod([-3, -2, -1, 1, 2, 3], 2)  
array([-1,  0, -1,  1,  0,  1])
>>> np.remainder([-3, -2, -1, 1, 2, 3], 2)  
array([1, 0, 1, 1, 0, 1])
>>> np.fmod([5, 3], [2, 2.])  
array([ 1.,  1.])
>>> a = np.arange(-3, 3).reshape(3, 2)  
>>> a  
array([[-3, -2],
       [-1,  0],
       [ 1,  2]])
>>> np.fmod(a, [2,2])  
array([[-1,  0],
       [-1,  0],
       [ 1,  0]])
dask.array.frexp(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.frexp.

Some inconsistencies with the Dask version may exist.

Decompose the elements of x into mantissa and twos exponent.

Returns (mantissa, exponent), where x = mantissa * 2**exponent. The mantissa lies in the open interval (-1, 1), while the twos exponent is a signed integer.

Parameters
x : array_like

Array of numbers to be decomposed.

out1 : ndarray, optional

Output array for the mantissa. Must have the same shape as x.

out2 : ndarray, optional

Output array for the exponent. Must have the same shape as x.

out : ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where : array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
mantissa : ndarray

Floating values between -1 and 1. This is a scalar if x is a scalar.

exponent : ndarray

Integer exponents of 2. This is a scalar if x is a scalar.

See also

ldexp

Compute y = x1 * 2**x2, the inverse of frexp.

Notes

Complex dtypes are not supported; they will raise a TypeError.

Examples

>>> x = np.arange(9)  
>>> y1, y2 = np.frexp(x)  
>>> y1  
array([ 0.   ,  0.5  ,  0.5  ,  0.75 ,  0.5  ,  0.625,  0.75 ,  0.875,
        0.5  ])
>>> y2  
array([0, 1, 2, 2, 3, 3, 3, 3, 4])
>>> y1 * 2**y2  
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.])
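Since ldexp is described as the inverse of frexp, a round trip should reproduce the input. A small sketch with arbitrarily chosen sample values:

```python
import numpy as np

x = np.array([0.1, -3.0, 8.0, 1e-3])

mantissa, exponent = np.frexp(x)
rebuilt = np.ldexp(mantissa, exponent)   # mantissa * 2**exponent
```

Because the decomposition only separates the binary exponent from the significand, rebuilt is expected to equal x exactly, and each non-zero mantissa has a magnitude in [0.5, 1).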
dask.array.fromfunction(func, chunks='auto', shape=None, dtype=None, **kwargs)

Construct an array by executing a function over each coordinate.

This docstring was copied from numpy.fromfunction.

Some inconsistencies with the Dask version may exist.

The resulting array therefore has a value fn(x, y, z) at coordinate (x, y, z).

Parameters
function : callable (Not supported in Dask)

The function is called with N parameters, where N is the rank of shape. Each parameter represents the coordinates of the array varying along a specific axis. For example, if shape were (2, 2), then the parameters would be array([[0, 0], [1, 1]]) and array([[0, 1], [0, 1]])

shape : (N,) tuple of ints

Shape of the output array, which also determines the shape of the coordinate arrays passed to function.

dtype : data-type, optional

Data-type of the coordinate arrays passed to function. By default, dtype is float.

Returns
fromfunction : any

The result of the call to function is passed back directly. Therefore the shape of fromfunction is completely determined by function. If function returns a scalar value, the shape of fromfunction would not match the shape parameter.

See also

indices, meshgrid

Notes

Keywords other than dtype are passed to function.
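The keyword forwarding can be illustrated with the NumPy function this docstring was copied from. In the sketch below, shifted_sum and its offset parameter are hypothetical names introduced only for the example:

```python
import numpy as np

def shifted_sum(i, j, offset=0):
    # i and j are coordinate arrays; offset arrives via **kwargs.
    return i + j + offset

grid = np.fromfunction(shifted_sum, (2, 3), dtype=int, offset=10)
```

Here dtype controls the coordinate arrays passed as i and j, while offset is handed through to shifted_sum unchanged.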

Examples

>>> np.fromfunction(lambda i, j: i == j, (3, 3), dtype=int)  
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]])
>>> np.fromfunction(lambda i, j: i + j, (3, 3), dtype=int)  
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
dask.array.frompyfunc(func, nin, nout)

This docstring was copied from numpy.frompyfunc.

Some inconsistencies with the Dask version may exist.

Takes an arbitrary Python function and returns a NumPy ufunc.

Can be used, for example, to add broadcasting to a built-in Python function (see Examples section).

Parameters
func : Python function object

An arbitrary Python function.

nin : int

The number of input arguments.

nout : int

The number of objects returned by func.

Returns
out : ufunc

Returns a NumPy universal function (ufunc) object.

See also

vectorize

Evaluates pyfunc over input arrays using broadcasting rules of numpy.

Notes

The returned ufunc always returns PyObject arrays.

Examples

Use frompyfunc to add broadcasting to the Python function oct:

>>> oct_array = np.frompyfunc(oct, 1, 1)  
>>> oct_array(np.array((10, 30, 100)))  
array(['0o12', '0o36', '0o144'], dtype=object)
>>> np.array((oct(10), oct(30), oct(100))) # for comparison  
array(['0o12', '0o36', '0o144'], dtype='<U5')
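The nout parameter matters when the wrapped function returns more than one value. As a sketch using the built-in divmod (chosen here only for illustration), with NumPy's frompyfunc:

```python
import numpy as np

# divmod takes two inputs and returns two outputs, so nin=2 and nout=2.
divmod_ufunc = np.frompyfunc(divmod, 2, 2)

quotient, remainder = divmod_ufunc(np.array([7, 8, 9]), 4)
```

Both results are object arrays, consistent with the note that the returned ufunc always produces PyObject arrays, and the scalar 4 is broadcast against the array input.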
dask.array.full(shape, fill_value, *args, **kwargs)

Blocked variant of full

Follows the signature of full exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.

Original signature follows below.

Return a new array of given shape and type, filled with fill_value.

Parameters
shape : int or sequence of ints

Shape of the new array, e.g., (2, 3) or 2.

fill_value : scalar

Fill value.

dtype : data-type, optional

The desired data-type for the array. The default, None, means np.array(fill_value).dtype.

order : {‘C’, ‘F’}, optional

Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.

Returns
out : ndarray

Array of fill_value with the given shape, dtype, and order.

See also

full_like

Return a new array with shape of input filled with value.

empty

Return a new uninitialized array.

ones

Return a new array setting values to one.

zeros

Return a new array setting values to zero.

Examples

>>> np.full((2, 2), np.inf)
array([[inf, inf],
       [inf, inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])
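The examples above use NumPy. A hedged sketch of the blocked variant with the extra chunks keyword, assuming dask is installed and importable as dask.array (the shape and chunk sizes are arbitrary):

```python
import dask.array as da
import numpy as np

# chunks is the Dask-specific keyword: here each block is 2x2.
x = da.full((4, 4), 7, chunks=(2, 2))

blocks = x.chunks          # block sizes along each axis
values = x.compute()       # materialise as a NumPy array of sevens
```

The array is built lazily, one np.full-style block per chunk, and only becomes a NumPy array when compute() is called.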
dask.array.full_like(a, fill_value, order='C', dtype=None, chunks=None, name=None, shape=None)

Return a full array with the same shape and type as a given array.

Parameters
a : array_like

The shape and data-type of a define these same attributes of the returned array.

fill_value : scalar

Fill value.

dtype : data-type, optional

Overrides the data type of the result.

order : {‘C’, ‘F’}, optional

Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.

chunks : sequence of ints

The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.

name : str, optional

An optional keyname for the array. Defaults to hashing the input keyword arguments.

shape : int or sequence of ints, optional

Overrides the shape of the result.

Returns
out : ndarray

Array of fill_value with the same shape and type as a.

See also

zeros_like

Return an array of zeros with shape and type of input.

ones_like

Return an array of ones with shape and type of input.

empty_like

Return an empty array with shape and type of input.

zeros

Return a new array setting values to zero.

ones

Return a new array setting values to one.

empty

Return a new uninitialized array.

full

Fill a new array.
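As a rough illustration of the parameters above, the sketch below builds one array that inherits shape and dtype from a template, and one that overrides both. The names a, b and c are illustrative, not part of the API.

```python
import dask.array as da

a = da.ones((4, 6), chunks=(2, 3))  # template array (float64)

# Same shape and dtype as `a`, every element set to 3.5.
b = da.full_like(a, 3.5)

# `dtype` and `shape` override what would otherwise be taken from `a`.
c = da.full_like(a, 3, dtype="int32", shape=(2, 2))

print(b.shape, b.dtype)
print(c.compute())
```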

dask.array.gradient(f, *varargs, **kwargs)

Return the gradient of an N-dimensional array.

This docstring was copied from numpy.gradient.

Some inconsistencies with the Dask version may exist.

The gradient is computed using second order accurate central differences in the interior points and either first or second order accurate one-sided (forward or backward) differences at the boundaries. The returned gradient hence has the same shape as the input array.

Parameters
farray_like

An N-dimensional array containing samples of a scalar function.

varargslist of scalar or array, optional

Spacing between f values. Default unitary spacing for all dimensions. Spacing can be specified using:

  1. single scalar to specify a sample distance for all dimensions.

  2. N scalars to specify a constant sample distance for each dimension. i.e. dx, dy, dz, …

  3. N arrays to specify the coordinates of the values along each dimension of f. The length of the array must match the size of the corresponding dimension.

  4. Any combination of N scalars/arrays with the meaning of 2. and 3.

If axis is given, the number of varargs must equal the number of axes. Default: 1.

edge_order{1, 2}, optional

Gradient is calculated using N-th order accurate differences at the boundaries. Default: 1.

New in version 1.9.1.

axisNone or int or tuple of ints, optional

Gradient is calculated only along the given axis or axes. The default (axis = None) is to calculate the gradient for all the axes of the input array. axis may be negative, in which case it counts from the last to the first axis.

New in version 1.11.0.

Returns
gradientndarray or list of ndarray

A set of ndarrays (or a single ndarray if there is only one dimension) corresponding to the derivatives of f with respect to each dimension. Each derivative has the same shape as f.

Notes

Assuming that \(f\in C^{3}\) (i.e., \(f\) has at least 3 continuous derivatives) and let \(h_{*}\) be a non-homogeneous stepsize, we minimize the “consistency error” \(\eta_{i}\) between the true gradient and its estimate from a linear combination of the neighboring grid-points:

\[\eta_{i} = f_{i}^{\left(1\right)} - \left[ \alpha f\left(x_{i}\right) + \beta f\left(x_{i} + h_{d}\right) + \gamma f\left(x_{i}-h_{s}\right) \right]\]

By substituting \(f(x_{i} + h_{d})\) and \(f(x_{i} - h_{s})\) with their Taylor series expansion, this translates into solving the following linear system:

\[\begin{split}\left\{ \begin{array}{r} \alpha+\beta+\gamma=0 \\ \beta h_{d}-\gamma h_{s}=1 \\ \beta h_{d}^{2}+\gamma h_{s}^{2}=0 \end{array} \right.\end{split}\]

The resulting approximation of \(f_{i}^{(1)}\) is the following:

\[\hat f_{i}^{(1)} = \frac{ h_{s}^{2}f\left(x_{i} + h_{d}\right) + \left(h_{d}^{2} - h_{s}^{2}\right)f\left(x_{i}\right) - h_{d}^{2}f\left(x_{i}-h_{s}\right)} { h_{s}h_{d}\left(h_{d} + h_{s}\right)} + \mathcal{O}\left(\frac{h_{d}h_{s}^{2} + h_{s}h_{d}^{2}}{h_{d} + h_{s}}\right)\]

It is worth noting that if \(h_{s}=h_{d}\) (i.e., data are evenly spaced) we find the standard second order approximation:

\[\hat f_{i}^{(1)}= \frac{f\left(x_{i+1}\right) - f\left(x_{i-1}\right)}{2h} + \mathcal{O}\left(h^{2}\right)\]

With a similar procedure the forward/backward approximations used for boundaries can be derived.

References

[1] Quarteroni A., Sacco R., Saleri F. (2007) Numerical Mathematics (Texts in Applied Mathematics). New York: Springer.

[2] Durran D. R. (1999) Numerical Methods for Wave Equations in Geophysical Fluid Dynamics. New York: Springer.

[3] Fornberg B. (1988) Generation of Finite Difference Formulas on Arbitrarily Spaced Grids, Mathematics of Computation 51, no. 184: 699-706.

Examples

>>> f = np.array([1, 2, 4, 7, 11, 16], dtype=float)  
>>> np.gradient(f)  
array([1. , 1.5, 2.5, 3.5, 4.5, 5. ])
>>> np.gradient(f, 2)  
array([0.5 ,  0.75,  1.25,  1.75,  2.25,  2.5 ])

Spacing can also be specified with an array that represents the coordinates of the values of f along the dimensions. For instance, a uniform spacing:

>>> x = np.arange(f.size)  
>>> np.gradient(f, x)  
array([1. ,  1.5,  2.5,  3.5,  4.5,  5. ])

Or a non uniform one:

>>> x = np.array([0., 1., 1.5, 3.5, 4., 6.], dtype=float)  
>>> np.gradient(f, x)  
array([1. ,  3. ,  3.5,  6.7,  6.9,  2.5])

For two dimensional arrays, the return will be two arrays ordered by axis. In this example the first array stands for the gradient in rows and the second one in columns direction:

>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float))  
[array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]]), array([[1. , 2.5, 4. ],
       [1. , 1. , 1. ]])]

In this example the spacing is also specified: uniform for axis=0 and non-uniform for axis=1:

>>> dx = 2.  
>>> y = [1., 1.5, 3.5]  
>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), dx, y)  
[array([[ 1. ,  1. , -0.5],
       [ 1. ,  1. , -0.5]]), array([[2. , 2. , 2. ],
       [2. , 1.7, 0.5]])]

It is possible to specify how boundaries are treated using edge_order:

>>> x = np.array([0, 1, 2, 3, 4])  
>>> f = x**2  
>>> np.gradient(f, edge_order=1)  
array([1.,  2.,  4.,  6.,  7.])
>>> np.gradient(f, edge_order=2)  
array([0., 2., 4., 6., 8.])

The axis keyword can be used to specify a subset of axes along which the gradient is calculated:

>>> np.gradient(np.array([[1, 2, 6], [3, 4, 5]], dtype=float), axis=0)  
array([[ 2.,  2., -1.],
       [ 2.,  2., -1.]])
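The examples above use NumPy arrays. The sketch below shows the same first example with a chunked Dask array, assuming two blocks of three samples; axis=0 is passed explicitly so that a single array is requested for the one axis.

```python
import numpy as np
import dask.array as da

values = np.array([1, 2, 4, 7, 11, 16], dtype=float)
f = da.from_array(values, chunks=3)  # two blocks of three samples

# Lazy gradient along axis 0; neighbouring blocks share boundary samples
# internally so the central differences match the unchunked result.
g = da.gradient(f, axis=0)

print(g.compute())
```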
dask.array.histogram(a, bins=None, range=None, normed=False, weights=None, density=None)

Blocked variant of numpy.histogram().

Parameters
aarray_like

Input data. The histogram is computed over the flattened array.

binsint or sequence of scalars, optional

Either an iterable specifying the bins, or the number of bins together with a range argument. The range is required in the latter case because computing the min and max over blocked arrays is an expensive operation that must be performed explicitly. If bins is an int, it defines the number of equal-width bins in the given range (10, by default). If bins is a sequence, it defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths.

range(float, float), optional

The lower and upper range of the bins. If not provided, range is simply (a.min(), a.max()). Values outside the range are ignored. The first element of the range must be less than or equal to the second. range affects the automatic bin computation as well. While bin width is computed to be optimal based on the actual data within range, the bin count will fill the entire range including portions containing no data.

normedbool, optional

This is equivalent to the density argument, but produces incorrect results for unequal bin widths. It should not be used.

weightsarray_like, optional

A dask.array.Array of weights, of the same block structure as a. Each value in a only contributes its associated weight towards the bin count (instead of 1). If density is True, the weights are normalized, so that the integral of the density over the range remains 1.

densitybool, optional

If False, the result will contain the number of samples in each bin. If True, the result is the value of the probability density function at the bin, normalized such that the integral over the range is 1. Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function. Overrides the normed keyword if given. If density is True, bins cannot be a single-number delayed value. It must be a concrete number, or a (possibly-delayed) array/sequence of the bin edges.

Returns
histdask Array

The values of the histogram. See density and weights for a description of the possible semantics.

bin_edgesdask Array of dtype float

Return the bin edges (length(hist)+1).

Examples

Using number of bins and range:

>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array(np.arange(10000), chunks=10)
>>> h, bins = da.histogram(x, bins=10, range=[0, 10000])
>>> bins
array([    0.,  1000.,  2000.,  3000.,  4000.,  5000.,  6000.,  7000.,
        8000.,  9000., 10000.])
>>> h.compute()
array([1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000])

Explicitly specifying the bins:

>>> h, bins = da.histogram(x, bins=np.array([0, 5000, 10000]))
>>> bins
array([    0,  5000, 10000])
>>> h.compute()
array([5000, 5000])
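The density option described in the parameters can be sketched as follows, assuming a small hand-made sample; the names data, edges and widths are illustrative. With density=True the histogram values times the bin widths should sum to 1.

```python
import numpy as np
import dask.array as da

data = da.from_array(
    np.array([0.5, 1.5, 1.5, 2.5, 3.5, 3.5, 3.5, 3.5]), chunks=4
)

# An integer bin count needs an explicit range for blocked arrays.
h, edges = da.histogram(data, bins=4, range=[0, 4], density=True)

density = h.compute()
widths = np.diff(np.asarray(edges))  # works for NumPy or Dask bin edges
print(density)
print((density * widths).sum())      # integral of the density over the range
```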
dask.array.hstack(tup, allow_unknown_chunksizes=False)

Stack arrays in sequence horizontally (column wise).

This docstring was copied from numpy.hstack.

Some inconsistencies with the Dask version may exist.

This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.

This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.

Parameters
tupsequence of ndarrays

The arrays must have the same shape along all but the second axis, except 1-D arrays which can be any length.

Returns
stackedndarray

The array formed by stacking the given arrays.

See also

stack

Join a sequence of arrays along a new axis.

vstack

Stack arrays in sequence vertically (row wise).

dstack

Stack arrays in sequence depth wise (along third axis).

concatenate

Join a sequence of arrays along an existing axis.

hsplit

Split array along second axis.

block

Assemble arrays from blocks.

Examples

>>> a = np.array((1,2,3))  
>>> b = np.array((2,3,4))  
>>> np.hstack((a,b))  
array([1, 2, 3, 2, 3, 4])
>>> a = np.array([[1],[2],[3]])  
>>> b = np.array([[2],[3],[4]])  
>>> np.hstack((a,b))  
array([[1, 2],
       [2, 3],
       [3, 4]])
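The second NumPy example above carries over to Dask arrays. This is a minimal sketch, assuming column arrays chunked into blocks of two rows:

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([[1], [2], [3]]), chunks=2)
b = da.from_array(np.array([[2], [3], [4]]), chunks=2)

# Column-wise stacking of two (3, 1) arrays gives a lazy (3, 2) array.
stacked = da.hstack((a, b))

print(stacked.shape)
print(stacked.compute())
```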
dask.array.hypot(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.hypot.

Some inconsistencies with the Dask version may exist.

Given the “legs” of a right triangle, return its hypotenuse.

Equivalent to sqrt(x1**2 + x2**2), element-wise. If x1 or x2 is scalar_like (i.e., unambiguously cast-able to a scalar type), it is broadcast for use with each element of the other argument. (See Examples)

Parameters
x1, x2array_like

Leg of the triangle(s). If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
zndarray

The hypotenuse of the triangle(s). This is a scalar if both x1 and x2 are scalars.

Examples

>>> np.hypot(3*np.ones((3, 3)), 4*np.ones((3, 3)))  
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])

Example showing broadcast of scalar_like argument:

>>> np.hypot(3*np.ones((3, 3)), [4])  
array([[ 5.,  5.,  5.],
       [ 5.,  5.,  5.],
       [ 5.,  5.,  5.]])
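A sketch of the same computation on Dask arrays, assuming 3x3 inputs split into blocks of at most 2x2 (the names legs_a, legs_b and hyp are illustrative):

```python
import dask.array as da

legs_a = 3 * da.ones((3, 3), chunks=2)
legs_b = 4 * da.ones((3, 3), chunks=2)

# Element-wise sqrt(legs_a**2 + legs_b**2); lazy until compute() is called.
hyp = da.hypot(legs_a, legs_b)

print(hyp.compute())
```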
dask.array.imag(*args, **kwargs)

Return the imaginary part of the complex argument.

This docstring was copied from numpy.imag.

Some inconsistencies with the Dask version may exist.

Parameters
valarray_like (Not supported in Dask)

Input array.

Returns
outndarray or scalar

The imaginary component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.

See also

real, angle, real_if_close

Examples

>>> a = np.array([1+2j, 3+4j, 5+6j])  
>>> a.imag  
array([2.,  4.,  6.])
>>> a.imag = np.array([8, 10, 12])  
>>> a  
array([1. +8.j,  3.+10.j,  5.+12.j])
>>> np.imag(1 + 1j)  
1.0
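For a Dask array the function is applied block by block. A minimal sketch, assuming a small complex array split into two blocks:

```python
import numpy as np
import dask.array as da

z = da.from_array(np.array([1 + 2j, 3 + 4j, 5 + 6j]), chunks=2)

# Imaginary part of each element, computed lazily per block.
im = da.imag(z)

print(im.compute())
```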
dask.array.indices(dimensions, dtype=<class 'int'>, chunks='auto')

Implements NumPy’s indices for Dask Arrays.

Generates a grid of indices covering the dimensions provided.

The final array has the shape (len(dimensions), *dimensions). The chunks are used to specify the chunking for axis 1 up to len(dimensions). The 0th axis always has chunks of length 1.

Parameters
dimensionssequence of ints

The shape of the index grid.

dtypedtype, optional

Type to use for the array. Default is int.

chunkssequence of ints, str

The size of each block. Must be one of the following forms:

  • A blocksize like (500, 1000)

  • A size in bytes, like “100 MiB” which will choose a uniform block-like shape

  • The word “auto” which acts like the above, but uses a configuration value array.chunk-size for the chunk size

Note that the last block will have fewer samples if len(array) % chunks != 0.

Returns
griddask array
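The shape and chunking rules above can be checked with a small grid. This is a sketch under the assumption of a 2x3 index grid with blocks of shape (1, 3) along the trailing axes:

```python
import numpy as np
import dask.array as da

# Index grid for a 2x3 array; `chunks` applies to the trailing axes only.
grid = da.indices((2, 3), chunks=(1, 3))

print(grid.shape)   # (len(dimensions), *dimensions)
print(grid.chunks)  # the leading axis is split into chunks of length 1
print(grid.compute())
```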
dask.array.insert(arr, obj, values, axis)

Insert values along the given axis before the given indices.

This docstring was copied from numpy.insert.

Some inconsistencies with the Dask version may exist.

Parameters
arrarray_like

Input array.

objint, slice or sequence of ints

Object that defines the index or indices before which values is inserted.

New in version 1.8.0.

Support for multiple insertions when obj is a single scalar or a sequence with one element (similar to calling insert multiple times).

valuesarray_like

Values to insert into arr. If the type of values is different from that of arr, values is converted to the type of arr. values should be shaped so that arr[...,obj,...] = values is legal.

axisint, optional

Axis along which to insert values. If axis is None then arr is flattened first.

Returns
outndarray

A copy of arr with values inserted. Note that insert does not occur in-place: a new array is returned. If axis is None, out is a flattened array.

See also

append

Append elements at the end of an array.

concatenate

Join a sequence of arrays along an existing axis.

delete

Delete elements from an array.

Notes

Note that for higher dimensional inserts obj=0 behaves very different from obj=[0] just like arr[:,0,:] = values is different from arr[:,[0],:] = values.

Examples

>>> a = np.array([[1, 1], [2, 2], [3, 3]])  
>>> a  
array([[1, 1],
       [2, 2],
       [3, 3]])
>>> np.insert(a, 1, 5)  
array([1, 5, 1, ..., 2, 3, 3])
>>> np.insert(a, 1, 5, axis=1)  
array([[1, 5, 1],
       [2, 5, 2],
       [3, 5, 3]])

Difference between sequence and scalars:

>>> np.insert(a, [1], [[1],[2],[3]], axis=1)  
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> np.array_equal(np.insert(a, 1, [1, 2, 3], axis=1),  
...                np.insert(a, [1], [[1],[2],[3]], axis=1))
True
>>> b = a.flatten()  
>>> b  
array([1, 1, 2, 2, 3, 3])
>>> np.insert(b, [2, 2], [5, 6])  
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, slice(2, 4), [5, 6])  
array([1, 1, 5, ..., 2, 3, 3])
>>> np.insert(b, [2, 2], [7.13, False]) # type casting  
array([1, 1, 7, ..., 2, 3, 3])
>>> x = np.arange(8).reshape(2, 4)  
>>> idx = (1, 3)  
>>> np.insert(x, idx, 999, axis=1)  
array([[  0, 999,   1,   2, 999,   3],
       [  4, 999,   5,   6, 999,   7]])
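The Dask signature shown at the top of this entry has no default for axis, so it is passed explicitly in the sketch below. The example assumes a small array chunked into blocks of two and compares against the NumPy result.

```python
import numpy as np
import dask.array as da

values = np.array([[1, 1], [2, 2], [3, 3]])
a = da.from_array(values, chunks=2)

# Insert a column of 5s before column index 1.
result = da.insert(a, 1, 5, axis=1)

print(result.compute())
```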
dask.array.invert(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.invert.

Some inconsistencies with the Dask version may exist.

Compute bit-wise inversion, or bit-wise NOT, element-wise.

Computes the bit-wise NOT of the underlying binary representation of the integers in the input arrays. This ufunc implements the C/Python operator ~.

For signed integer inputs, the two’s complement is returned. In a two’s-complement system negative numbers are represented by the two’s complement of the absolute value. This is the most common method of representing signed integers on computers [1]. A N-bit two’s-complement system can represent every integer in the range \(-2^{N-1}\) to \(+2^{N-1}-1\).

Parameters
xarray_like

Only integer and boolean types are handled.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Result. This is a scalar if x is a scalar.

See also

bitwise_and, bitwise_or, bitwise_xor
logical_not
binary_repr

Return the binary representation of the input number as a string.

Notes

bitwise_not is an alias for invert:

>>> np.bitwise_not is np.invert  
True

References

[1] Wikipedia, “Two’s complement”, https://en.wikipedia.org/wiki/Two's_complement

Examples

The number 13 is represented by 00001101. The invert or bit-wise NOT of 13 is then:

>>> x = np.invert(np.array(13, dtype=np.uint8))  
>>> x  
242
>>> np.binary_repr(x, width=8)  
'11110010'

The result depends on the bit-width:

>>> x = np.invert(np.array(13, dtype=np.uint16))  
>>> x  
65522
>>> np.binary_repr(x, width=16)  
'1111111111110010'

When using signed integer types the result is the two’s complement of the result for the unsigned type:

>>> np.invert(np.array([13], dtype=np.int8))  
array([-14], dtype=int8)
>>> np.binary_repr(-14, width=8)  
'11110010'

Booleans are accepted as well:

>>> np.invert(np.array([True, False]))  
array([False,  True])
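A minimal sketch of the same operation on a chunked Dask array, assuming unsigned 8-bit input so that each result is 255 minus the input value:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([13, 0, 255], dtype=np.uint8), chunks=2)

# Bit-wise NOT of every element, evaluated lazily per block.
inverted = da.invert(x)

print(inverted.compute())
```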
dask.array.isclose(arr1, arr2, rtol=1e-05, atol=1e-08, equal_nan=False)

Returns a boolean array where two arrays are element-wise equal within a tolerance.

This docstring was copied from numpy.isclose.

Some inconsistencies with the Dask version may exist.

The tolerance values are positive, typically very small numbers. The relative difference (rtol * abs(b)) and the absolute difference atol are added together to compare against the absolute difference between a and b.

Warning

The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes).

Parameters
a, barray_like

Input arrays to compare.

rtolfloat

The relative tolerance parameter (see Notes).

atolfloat

The absolute tolerance parameter (see Notes).

equal_nanbool

Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.

Returns
yarray_like

Returns a boolean array of where a and b are equal within the given tolerance. If both a and b are scalars, returns a single boolean value.

See also

allclose

Notes

New in version 1.7.0.

For finite values, isclose uses the following equation to test whether two floating point values are equivalent.

absolute(a - b) <= (atol + rtol * absolute(b))

Unlike the built-in math.isclose, the above equation is not symmetric in a and b – it assumes b is the reference value – so that isclose(a, b) might be different from isclose(b, a). Furthermore, the default value of atol is not zero, and is used to determine what small values should be considered close to zero. The default value is appropriate for expected values of order unity: if the expected values are significantly smaller than one, it can result in false positives. atol should be carefully selected for the use case at hand. A zero value for atol will result in False if either a or b is zero.

Examples

>>> np.isclose([1e10,1e-7], [1.00001e10,1e-8])  
array([ True, False])
>>> np.isclose([1e10,1e-8], [1.00001e10,1e-9])  
array([ True, True])
>>> np.isclose([1e10,1e-8], [1.0001e10,1e-9])  
array([False,  True])
>>> np.isclose([1.0, np.nan], [1.0, np.nan])  
array([ True, False])
>>> np.isclose([1.0, np.nan], [1.0, np.nan], equal_nan=True)  
array([ True, True])
>>> np.isclose([1e-8, 1e-7], [0.0, 0.0])  
array([ True, False])
>>> np.isclose([1e-100, 1e-7], [0.0, 0.0], atol=0.0)  
array([False, False])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.0])  
array([ True,  True])
>>> np.isclose([1e-10, 1e-10], [1e-20, 0.999999e-10], atol=0.0)  
array([False,  True])
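The asymmetry described in the Notes can be made concrete. In this sketch the values 1.0 and 1.1 and the tolerance rtol=0.095 are chosen for illustration only; the allowance is rtol times the absolute value of the second argument, so swapping the arguments changes the outcome.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([1.0]), chunks=1)
b = da.from_array(np.array([1.1]), chunks=1)

# |a - b| is about 0.1 and atol is 0, so only rtol * |second argument| counts.
forward = da.isclose(a, b, rtol=0.095, atol=0.0).compute()   # allowance 0.1045
backward = da.isclose(b, a, rtol=0.095, atol=0.0).compute()  # allowance 0.095

print(forward, backward)
```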
dask.array.iscomplex(*args, **kwargs)

Returns a bool array that is True where the input element is complex.

This docstring was copied from numpy.iscomplex.

Some inconsistencies with the Dask version may exist.

What is tested is whether the input has a non-zero imaginary part, not if the input type is complex.

Parameters
xarray_like (Not supported in Dask)

Input array.

Returns
outndarray of bools

Output array.

See also

isreal
iscomplexobj

Return True if x is a complex type or an array of complex numbers.

Examples

>>> np.iscomplex([1+1j, 1+0j, 4.5, 3, 2, 2j])  
array([ True, False, False, False, False,  True])
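The same test on a Dask array, as a minimal sketch assuming the input is split into two blocks of three elements:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1 + 1j, 1 + 0j, 4.5, 3, 2, 2j]), chunks=3)

# True only where the imaginary part is non-zero.
mask = da.iscomplex(x)

print(mask.compute())
```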
dask.array.isfinite(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.isfinite.

Some inconsistencies with the Dask version may exist.

Test element-wise for finiteness (neither infinity nor Not a Number).

The result is returned as a boolean array.

Parameters
xarray_like

Input values.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray, bool

True where x is not positive infinity, negative infinity, or NaN; false otherwise. This is a scalar if x is a scalar.

Notes

Not a Number, positive infinity and negative infinity are considered to be non-finite.

NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity, and that positive infinity is not equivalent to negative infinity, although infinity is equivalent to positive infinity. Errors result if the second argument is also supplied when x is a scalar input, or if first and second arguments have different shapes.

Examples

>>> np.isfinite(1)  
True
>>> np.isfinite(0)  
True
>>> np.isfinite(np.nan)  
False
>>> np.isfinite(np.inf)  
False
>>> np.isfinite(-np.inf)  
False
>>> np.isfinite([np.log(-1.),1.,np.log(0)])  
array([False,  True, False])
>>> x = np.array([-np.inf, 0., np.inf])  
>>> y = np.array([2, 2, 2])  
>>> np.isfinite(x, y)  
array([0, 1, 0])
>>> y  
array([0, 1, 0])
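One common use of the resulting mask is to drop non-finite values. The sketch below assumes a small Dask array with NaN and infinite entries; the names finite and clean are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1.0, np.nan, np.inf, -np.inf, 0.0]), chunks=2)

finite = da.isfinite(x)

# Boolean indexing gives an array whose length is only known once computed.
clean = x[finite]

print(finite.compute())
print(clean.compute())
```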
dask.array.isin(element, test_elements, assume_unique=False, invert=False)

Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.

Parameters
elementarray_like

Input array.

test_elementsarray_like

The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.

assume_uniquebool, optional

If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

invertbool, optional

If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).

Returns
isinndarray, bool

Has the same shape as element. The values element[isin] are in test_elements.

See also

in1d

Flattened version of this function.

numpy.lib.arraysetops

Module with a number of other functions for performing set operations on arrays.

Notes

isin is an element-wise function version of the Python keyword in. isin(a, b) is roughly equivalent to np.array([item in b for item in a]) if a and b are 1-D sequences.

element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor’s way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.

New in version 1.13.0.

Examples

>>> element = 2*np.arange(4).reshape((2, 2))
>>> element
array([[0, 2],
       [4, 6]])
>>> test_elements = [1, 2, 4, 8]
>>> mask = np.isin(element, test_elements)
>>> mask
array([[False,  True],
       [ True, False]])
>>> element[mask]
array([2, 4])

The indices of the matched values can be obtained with nonzero:

>>> np.nonzero(mask)
(array([0, 1]), array([1, 0]))

The test can also be inverted:

>>> mask = np.isin(element, test_elements, invert=True)
>>> mask
array([[ True, False],
       [False,  True]])
>>> element[mask]
array([0, 6])

Because of how array handles sets, the following does not work as expected:

>>> test_set = {1, 2, 4, 8}
>>> np.isin(element, test_set)
array([[False, False],
       [False, False]])

Casting the set to a list gives the expected result:

>>> np.isin(element, list(test_set))
array([[False,  True],
       [ True, False]])
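The first example above can be repeated with a Dask array as element. A minimal sketch, assuming a 2x2 input with one element per block:

```python
import numpy as np
import dask.array as da

element = da.from_array(2 * np.arange(4).reshape((2, 2)), chunks=1)
test_elements = [1, 2, 4, 8]

# Membership test per element, plus the inverted form.
mask = da.isin(element, test_elements)
inverted = da.isin(element, test_elements, invert=True)

print(mask.compute())
print(inverted.compute())
```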
dask.array.isinf(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.isinf.

Some inconsistencies with the Dask version may exist.

Test element-wise for positive or negative infinity.

Returns a boolean array of the same shape as x, True where x == +/-inf, otherwise False.

Parameters
xarray_like

Input values

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
ybool (scalar) or boolean ndarray

True where x is positive or negative infinity, false otherwise. This is a scalar if x is a scalar.

Notes

NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).

Errors result if the second argument is supplied when the first argument is a scalar, or if the first and second arguments have different shapes.

Examples

>>> np.isinf(np.inf)  
True
>>> np.isinf(np.nan)  
False
>>> np.isinf(-np.inf)  
True
>>> np.isinf([np.inf, -np.inf, 1.0, np.nan])  
array([ True,  True, False, False])
>>> x = np.array([-np.inf, 0., np.inf])  
>>> y = np.array([2, 2, 2])  
>>> np.isinf(x, y)  
array([1, 0, 1])
>>> y  
array([1, 0, 1])
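Because the result is a boolean Dask array, it can feed further lazy reductions. The sketch below, with illustrative names, counts the infinite entries of a small array:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([np.inf, -np.inf, 1.0, np.nan]), chunks=2)

mask = da.isinf(x)
n_inf = mask.sum()  # lazy count of infinite entries

print(mask.compute())
print(int(n_inf.compute()))
```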
dask.array.isneginf(*args, **kwargs)

This docstring was copied from numpy.equal.

Some inconsistencies with the Dask version may exist.

Return (x1 == x2) element-wise.

Parameters
x1, x2array_like

Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.

See also

not_equal, greater_equal, less_equal, greater, less

Examples

>>> np.equal([0, 1, 3], np.arange(3))  
array([ True,  True, False])

What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:

>>> np.equal(1, np.ones(1))  
array([ True])
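The text above is inherited from numpy.equal, but isneginf is called with a single array. The following is a minimal sketch, assuming it flags elements equal to negative infinity in the same way as numpy.isneginf:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([-np.inf, 0.0, np.inf, np.nan]), chunks=2)

# True only where the element is negative infinity.
neg = da.isneginf(x)

print(neg.compute())
```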
dask.array.isnan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.isnan.

Some inconsistencies with the Dask version may exist.

Test element-wise for NaN and return result as a boolean array.

Parameters
xarray_like

Input array.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or bool

True where x is NaN, false otherwise. This is a scalar if x is a scalar.

See also

isinf, isneginf, isposinf, isfinite, isnat

Notes

NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.

Examples

>>> np.isnan(np.nan)  
True
>>> np.isnan(np.inf)  
False
>>> np.isnan([np.log(-1.),1.,np.log(0)])  
array([ True, False, False])
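A typical use of the mask is replacing NaN values. This sketch assumes a fill value of 0.0 and combines isnan with dask.array.where; the names mask and filled are illustrative.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([np.nan, 1.0, np.inf, np.nan]), chunks=2)

mask = da.isnan(x)

# Replace NaN with 0.0 and keep every other value (including inf) unchanged.
filled = da.where(mask, 0.0, x)

print(mask.compute())
print(filled.compute())
```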
dask.array.isnull(values)

pandas.isnull for dask arrays

dask.array.isposinf(*args, **kwargs)

This docstring was copied from numpy.equal.

Some inconsistencies with the Dask version may exist.

Return (x1 == x2) element-wise.

Parameters
x1, x2array_like

Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

Output array, element-wise comparison of x1 and x2. Typically of type bool, unless dtype=object is passed. This is a scalar if both x1 and x2 are scalars.

See also

not_equal, greater_equal, less_equal, greater, less

Examples

>>> np.equal([0, 1, 3], np.arange(3))  
array([ True,  True, False])

What is compared are values, not types. So an int (1) and an array of length one can evaluate as True:

>>> np.equal(1, np.ones(1))  
array([ True])
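
The text above is the numpy.equal docstring, so it does not describe what isposinf tests. Going by the function's name, and assuming it mirrors numpy.isposinf, it flags elements equal to positive infinity. A minimal sketch under that assumption, which also compares the result with an explicit equality test:

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([-np.inf, 0.0, np.inf]), chunks=2)

# Assumed behaviour: True only where the element is +inf.
pos_inf = da.isposinf(x).compute()

# The same mask written as an explicit comparison with +inf.
same_as_equal = da.equal(x, np.inf).compute()
```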
dask.array.isreal(*args, **kwargs)

Returns a bool array, where True if input element is real.

This docstring was copied from numpy.isreal.

Some inconsistencies with the Dask version may exist.

If an element has complex type with zero imaginary part, the return value for that element is True.

Parameters
xarray_like (Not supported in Dask)

Input array.

Returns
outndarray, bool

Boolean array of same shape as x.

See also

iscomplex
isrealobj

Return True if x is not a complex type.

Examples

>>> np.isreal([1+1j, 1+0j, 4.5, 3, 2, 2j])  
array([False,  True,  True,  True,  True, False])
dask.array.ldexp(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.ldexp.

Some inconsistencies with the Dask version may exist.

Returns x1 * 2**x2, element-wise.

The mantissas x1 and twos exponents x2 are used to construct floating point numbers x1 * 2**x2.

Parameters
x1array_like

Array of multipliers.

x2array_like, int

Array of twos exponents. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or scalar

The result of x1 * 2**x2. This is a scalar if both x1 and x2 are scalars.

See also

frexp

Return (y1, y2) from x = y1 * 2**y2, inverse to ldexp.

Notes

Complex dtypes are not supported; they will raise a TypeError.

ldexp is useful as the inverse of frexp. If used by itself, it is clearer to simply use the expression x1 * 2**x2.

Examples

>>> np.ldexp(5, np.arange(4))  
array([ 5., 10., 20., 40.], dtype=float16)
>>> x = np.arange(6)  
>>> np.ldexp(*np.frexp(x))  
array([ 0.,  1.,  2.,  3.,  4.,  5.])
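
The frexp/ldexp round trip can also be written with dask arrays. This is a sketch that assumes da.frexp returns a (mantissa, exponent) pair of dask arrays, as numpy.frexp does for NumPy arrays; it also evaluates the plain expression x1 * 2**x2 mentioned in the notes.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(6, dtype="f8"), chunks=3)

# Decompose into mantissa and power-of-two exponent, then rebuild.
mantissa, exponent = da.frexp(x)
roundtrip = da.ldexp(mantissa, exponent).compute()

# Equivalent plain expression.
direct = (mantissa * 2.0 ** exponent).compute()
```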
dask.array.linspace(start, stop, num=50, endpoint=True, retstep=False, chunks='auto', dtype=None)

Return num evenly spaced values over the closed interval [start, stop].

Parameters
startscalar

The starting value of the sequence.

stopscalar

The last value of the sequence.

numint, optional

Number of samples to include in the returned dask array, including the endpoints. Default is 50.

endpointbool, optional

If True, stop is the last sample. Otherwise, it is not included. Default is True.

retstepbool, optional

If True, return (samples, step), where step is the spacing between samples. Default is False.

chunksint

The number of samples in each block. Note that the last block will have fewer samples if num % chunks != 0.

dtypedtype, optional

The type of the output array.

Returns
samplesdask array
stepfloat, optional

Only returned if retstep is True. Size of spacing between samples.
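
A short sketch of how num, chunks and retstep interact, using the signature shown above. With num=5 and chunks=2 the last block holds the single remaining sample.

```python
import numpy as np
import dask.array as da

# Five samples on [0, 1], two samples per block, and also return the spacing.
samples, step = da.linspace(0.0, 1.0, num=5, chunks=2, retstep=True)

block_sizes = samples.chunks   # sizes of the blocks along the only axis
values = samples.compute()     # concrete NumPy array
```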

dask.array.log(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.log.

Some inconsistencies with the Dask version may exist.

Natural logarithm, element-wise.

The natural logarithm log is the inverse of the exponential function, so that log(exp(x)) = x. The natural logarithm is the logarithm in base e.

Parameters
xarray_like

Input value.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

The natural logarithm of x, element-wise. This is a scalar if x is a scalar.

See also

log10, log2, log1p, emath.log

Notes

Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

1

M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/

2

Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log([1, np.e, np.e**2, 0])  
array([  0.,   1.,   2., -Inf])
dask.array.log10(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.log10.

Some inconsistencies with the Dask version may exist.

Return the base 10 logarithm of the input array, element-wise.

Parameters
xarray_like

Input values.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

The logarithm to the base 10 of x, element-wise. NaNs are returned where x is negative. This is a scalar if x is a scalar.

See also

emath.log10

Notes

Logarithm is a multivalued function: for each x there is an infinite number of z such that 10**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log10 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log10 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log10 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

1

M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/

2

Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log10([1e-15, -3.])  
array([-15.,  nan])
dask.array.log1p(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.log1p.

Some inconsistencies with the Dask version may exist.

Return the natural logarithm of one plus the input array, element-wise.

Calculates log(1 + x).

Parameters
xarray_like

Input values.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

Natural logarithm of 1 + x, element-wise. This is a scalar if x is a scalar.

See also

expm1

exp(x) - 1, the inverse of log1p.

Notes

For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy.

Logarithm is a multivalued function: for each x there is an infinite number of z such that exp(z) = 1 + x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log1p always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log1p is a complex analytical function that has a branch cut [-inf, -1] and is continuous from above on it. log1p handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

References

1

M. Abramowitz and I.A. Stegun, “Handbook of Mathematical Functions”, 10th printing, 1964, pp. 67. http://www.math.sfu.ca/~cbm/aands/

2

Wikipedia, “Logarithm”. https://en.wikipedia.org/wiki/Logarithm

Examples

>>> np.log1p(1e-99)  
1e-99
>>> np.log(1 + 1e-99)  
0.0
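
The same precision difference shows up on chunked arrays. The sketch below assumes da.log1p and da.log apply the corresponding NumPy functions block by block; the input values are arbitrary small numbers chosen for illustration.

```python
import numpy as np
import dask.array as da

tiny = da.from_array(np.array([1e-99, 1e-20, 1e-10]), chunks=2)

# log1p keeps the information carried by very small x.
accurate = da.log1p(tiny).compute()

# 1 + x rounds to exactly 1.0 for the two smallest inputs, so log gives 0.
naive = da.log(1 + tiny).compute()
```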
dask.array.log2(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.log2.

Some inconsistencies with the Dask version may exist.

Base-2 logarithm of x.

Parameters
xarray_like

Input values.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

Base-2 logarithm of x. This is a scalar if x is a scalar.

See also

log, log10, log1p, emath.log2

Notes

New in version 1.3.0.

Logarithm is a multivalued function: for each x there is an infinite number of z such that 2**z = x. The convention is to return the z whose imaginary part lies in [-pi, pi].

For real-valued input data types, log2 always returns real output. For each value that cannot be expressed as a real number or infinity, it yields nan and sets the invalid floating point error flag.

For complex-valued input, log2 is a complex analytical function that has a branch cut [-inf, 0] and is continuous from above on it. log2 handles the floating-point negative zero as an infinitesimal negative number, conforming to the C99 standard.

Examples

>>> x = np.array([0, 1, 2, 2**4])  
>>> np.log2(x)  
array([-Inf,   0.,   1.,   4.])
>>> xi = np.array([0+1.j, 1, 2+0.j, 4.j])  
>>> np.log2(xi)  
array([ 0.+2.26618007j,  0.+0.j        ,  1.+0.j        ,  2.+2.26618007j])
dask.array.logaddexp(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logaddexp.

Some inconsistencies with the Dask version may exist.

Logarithm of the sum of exponentiations of the inputs.

Calculates log(exp(x1) + exp(x2)). This function is useful in statistics where the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the logarithm of the calculated probability is stored. This function allows adding probabilities stored in such a fashion.

Parameters
x1, x2array_like

Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
resultndarray

Logarithm of exp(x1) + exp(x2). This is a scalar if both x1 and x2 are scalars.

See also

logaddexp2

Logarithm of the sum of exponentiations of inputs in base 2.

Notes

New in version 1.3.0.

Examples

>>> prob1 = np.log(1e-50)  
>>> prob2 = np.log(2.5e-50)  
>>> prob12 = np.logaddexp(prob1, prob2)  
>>> prob12  
-113.87649168120691
>>> np.exp(prob12)  
3.5000000000000057e-50
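
To illustrate why the log-domain form matters, the sketch below adds log-probabilities whose exponentials underflow to zero in double precision. The values are made up for illustration; the naive computation is done in NumPy for comparison.

```python
import numpy as np
import dask.array as da

a = da.from_array(np.array([-1000.0, -1001.0]), chunks=1)
b = da.from_array(np.array([-1000.0, -1000.0]), chunks=1)

# Stable: stays in the log domain throughout.
stable = da.logaddexp(a, b).compute()

# Naive: exp(-1000) underflows to 0.0, so the logarithm is -inf.
with np.errstate(divide="ignore"):
    naive = np.log(np.exp(-1000.0) + np.exp(-1000.0))
```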
dask.array.logaddexp2(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logaddexp2.

Some inconsistencies with the Dask version may exist.

Logarithm of the sum of exponentiations of the inputs in base-2.

Calculates log2(2**x1 + 2**x2). This function is useful in machine learning when the calculated probabilities of events may be so small as to exceed the range of normal floating point numbers. In such cases the base-2 logarithm of the calculated probability can be used instead. This function allows adding probabilities stored in such a fashion.

Parameters
x1, x2array_like

Input values. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
resultndarray

Base-2 logarithm of 2**x1 + 2**x2. This is a scalar if both x1 and x2 are scalars.

See also

logaddexp

Logarithm of the sum of exponentiations of the inputs.

Notes

New in version 1.3.0.

Examples

>>> prob1 = np.log2(1e-50)  
>>> prob2 = np.log2(2.5e-50)  
>>> prob12 = np.logaddexp2(prob1, prob2)  
>>> prob1, prob2, prob12  
(-166.09640474436813, -164.77447664948076, -164.28904982231052)
>>> 2**prob12  
3.4999999999999914e-50
dask.array.logical_and(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logical_and.

Some inconsistencies with the Dask version may exist.

Compute the truth value of x1 AND x2 element-wise.

Parameters
x1, x2array_like

Input arrays. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or bool

Boolean result of the logical AND operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.

Examples

>>> np.logical_and(True, False)  
False
>>> np.logical_and([True, False], [False, False])  
array([False, False])
>>> x = np.arange(5)  
>>> np.logical_and(x>1, x<4)  
array([False, False,  True,  True, False])
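
A typical use with dask arrays is building a range mask. This sketch assumes da.logical_and accepts the boolean dask arrays produced by comparisons; the & operator on boolean arrays gives the same mask.

```python
import dask.array as da

x = da.arange(10, chunks=4)

# Elements strictly between 1 and 4.
in_range = da.logical_and(x > 1, x < 4)
selected_count = int(in_range.sum().compute())

# Operator form of the same mask.
also_in_range = ((x > 1) & (x < 4)).compute()
```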
dask.array.logical_not(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logical_not.

Some inconsistencies with the Dask version may exist.

Compute the truth value of NOT x element-wise.

Parameters
xarray_like

Logical NOT is applied to the elements of x.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
ybool or ndarray of bool

Boolean result with the same shape as x of the NOT operation on elements of x. This is a scalar if x is a scalar.

Examples

>>> np.logical_not(3)  
False
>>> np.logical_not([True, False, 0, 1])  
array([False,  True,  True, False])
>>> x = np.arange(5)  
>>> np.logical_not(x<3)  
array([False, False, False,  True,  True])
dask.array.logical_or(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logical_or.

Some inconsistencies with the Dask version may exist.

Compute the truth value of x1 OR x2 element-wise.

Parameters
x1, x2array_like

Logical OR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or bool

Boolean result of the logical OR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.

Examples

>>> np.logical_or(True, False)  
True
>>> np.logical_or([True, False], [False, False])  
array([ True, False])
>>> x = np.arange(5)  
>>> np.logical_or(x < 1, x > 3)  
array([ True, False, False, False,  True])
dask.array.logical_xor(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.logical_xor.

Some inconsistencies with the Dask version may exist.

Compute the truth value of x1 XOR x2, element-wise.

Parameters
x1, x2array_like

Logical XOR is applied to the elements of x1 and x2. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
ybool or ndarray of bool

Boolean result of the logical XOR operation applied to the elements of x1 and x2; the shape is determined by broadcasting. This is a scalar if both x1 and x2 are scalars.

Examples

>>> np.logical_xor(True, False)  
True
>>> np.logical_xor([True, True, False, False], [True, False, True, False])  
array([False,  True,  True, False])
>>> x = np.arange(5)  
>>> np.logical_xor(x < 1, x > 3)  
array([ True, False, False, False,  True])

Simple example showing support of broadcasting

>>> np.logical_xor(0, np.eye(2))  
array([[ True, False],
       [False,  True]])
dask.array.map_blocks(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)

Map a function across all blocks of a dask array.

Parameters
funccallable

Function to apply to every block in the array.

argsdask arrays or other objects
dtypenp.dtype, optional

The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.

chunkstuple, optional

Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.

drop_axisnumber or iterable, optional

Dimensions lost by the function.

new_axisnumber or iterable, optional

New dimensions created by the function. Note that these are applied after drop_axis (if present).

tokenstring, optional

The key prefix to use for the output array. If not provided, will be determined from the function name.

namestring, optional

The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.

**kwargs :

Other keyword arguments to pass to function. Values must be constants (not dask.arrays)

See also

dask.array.blockwise

Generalized operation with control over block alignment.

Examples

>>> import numpy as np
>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])

The da.map_blocks function can also accept multiple arrays.

>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])

If the function changes shape of the blocks then you must provide chunks explicitly.

>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))

You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.

>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))

If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.

>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])

If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.
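
Dimensions can also be removed with drop_axis. The following is a minimal sketch, assuming the dropped axis sits in a single chunk so that each block sees complete rows; the array shape and chunking are arbitrary choices for illustration.

```python
import numpy as np
import dask.array as da

# Two blocks of two rows each; the second axis is not split.
x = da.ones((4, 6), chunks=(2, 6))

# Each block of shape (2, 6) is reduced to shape (2,), so axis 1 is dropped.
row_sums = da.map_blocks(lambda b: b.sum(axis=1), x, drop_axis=1, dtype=x.dtype)
result = row_sums.compute()
```

Here chunks is left unspecified, and the output keeps the block structure of the input along the remaining axis.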

Map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.

>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))

The relevant attribute to match is numblocks.

>>> x.numblocks
(10,)
>>> y.numblocks
(10,)

If these match (up to broadcasting rules), then we can map arbitrary functions across blocks:

>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])

Your block function gets information about where it is in the array by accepting a special block_info keyword argument.

>>> def func(block, block_info=None):
...     pass

This will receive the following information:

>>> block_info  
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}

For each positional and keyword argument that is a dask array (the argument positions are the keys of the outer dictionary), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, chunk index 4 along the first dimension), and the array location (for example, the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the chunk shape and dtype that should be returned.
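
As a concrete use of this dictionary, the sketch below adds each block's chunk index to its values. The function name add_block_index is hypothetical; the keys it reads ('chunk-location' under argument position 0) are the ones listed above.

```python
import numpy as np
import dask.array as da

def add_block_index(block, block_info=None):
    # block_info[0] describes the first array argument; its
    # 'chunk-location' entry is a tuple with one index per dimension.
    index = block_info[0]["chunk-location"][0]
    return block + index

x = da.zeros(6, chunks=2, dtype="i8")
result = da.map_blocks(add_block_index, x, dtype="i8").compute()
```

With three blocks of two elements each, the blocks receive indices 0, 1 and 2.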

These features can be combined to synthesize an array from scratch, for example:

>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.int64)
dask.array<func, shape=(8,), dtype=int64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])

You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.

>>> x.map_blocks(lambda x: x + 1, token='increment')
dask.array<increment, shape=(1000,), dtype=int64, chunksize=(100,), chunktype=numpy.ndarray>
dask.array.matmul(x1, x2, /, out=None, *, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.matmul.

Some inconsistencies with the Dask version may exist.

Matrix product of two arrays.

Parameters
x1, x2array_like

Input arrays, scalars not allowed.

outndarray, optional

A location into which the result is stored. If provided, it must have a shape that matches the signature (n,k),(k,m)->(n,m). If not provided or None, a freshly-allocated array is returned.

**kwargs

For other keyword-only arguments, see the ufunc docs.

New in version 1.16: Now handles ufunc kwargs

Returns
yndarray

The matrix product of the inputs. This is a scalar only when both x1, x2 are 1-d vectors.

Raises
ValueError

If the last dimension of x1 is not the same size as the second-to-last dimension of x2.

If a scalar value is passed in.

See also

vdot

Complex-conjugating dot product.

tensordot

Sum products over arbitrary axes.

einsum

Einstein summation convention.

dot

alternative matrix product with different broadcasting rules.

Notes

The behavior depends on the arguments in the following way.

  • If both arguments are 2-D they are multiplied like conventional matrices.

  • If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

  • If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.

  • If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

matmul differs from dot in two important ways:

  • Multiplication by scalars is not allowed, use * instead.

  • Stacks of matrices are broadcast together as if the matrices were elements, respecting the signature (n,k),(k,m)->(n,m):

    >>> a = np.ones([9, 5, 7, 4])  
    >>> c = np.ones([9, 5, 4, 3])  
    >>> np.dot(a, c).shape  
    (9, 5, 7, 9, 5, 3)
    >>> np.matmul(a, c).shape  
    (9, 5, 7, 3)
    >>> # n is 7, k is 4, m is 3
    

The matmul function implements the semantics of the @ operator introduced in Python 3.5 following PEP 465.

Examples

For 2-D arrays it is the matrix product:

>>> a = np.array([[1, 0],  
...               [0, 1]])
>>> b = np.array([[4, 1],  
...               [2, 2]])
>>> np.matmul(a, b)  
array([[4, 1],
       [2, 2]])

For 2-D mixed with 1-D, the result is the usual.

>>> a = np.array([[1, 0],  
...               [0, 1]])
>>> b = np.array([1, 2])  
>>> np.matmul(a, b)  
array([1, 2])
>>> np.matmul(b, a)  
array([1, 2])

Broadcasting is conventional for stacks of arrays

>>> a = np.arange(2 * 2 * 4).reshape((2, 2, 4))  
>>> b = np.arange(2 * 2 * 4).reshape((2, 4, 2))  
>>> np.matmul(a,b).shape  
(2, 2, 2)
>>> np.matmul(a, b)[0, 1, 1]  
98
>>> sum(a[0, 1, :] * b[0 , :, 1])  
98

Vector, vector returns the scalar inner product, but neither argument is complex-conjugated:

>>> np.matmul([2j, 3j], [2j, 3j])  
(-13+0j)

Scalar multiplication raises an error.

>>> np.matmul([1,2], 3)  
Traceback (most recent call last):
...
ValueError: matmul: Input operand 1 does not have enough dimensions ...

New in version 1.10.0.
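
The stacked-matrix case from the notes can be reproduced with dask arrays. This is a sketch assuming da.matmul follows the same broadcasting rules as numpy.matmul and that the @ operator is available on dask arrays; the chunking is an arbitrary choice that splits only the leading batch axis.

```python
import numpy as np
import dask.array as da

a = da.ones((9, 5, 7, 4), chunks=(3, 5, 7, 4))
c = da.ones((9, 5, 4, 3), chunks=(3, 5, 4, 3))

# Matrices live in the last two axes: (7, 4) times (4, 3) gives (7, 3).
product = da.matmul(a, c)
same = a @ c

values = product.compute()
```

Each output entry is a sum over k = 4 ones.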

dask.array.max(a, axis=None, keepdims=False, split_every=None, out=None)

Return the maximum of an array or maximum along an axis.

This docstring was copied from numpy.max.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input data.

axisNone or int or tuple of ints, optional

Axis or axes along which to operate. By default, flattened input is used.

New in version 1.7.0.

If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before.

outndarray, optional

Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the amax method of sub-classes of ndarray; any non-default value will be. If the sub-class’s method does not implement keepdims, an exception will be raised.

initialscalar, optional (Not supported in Dask)

The minimum value of an output element. Must be present to allow computation on empty slice. See numpy.ufunc.reduce for details.

New in version 1.15.0.

wherearray_like of bool, optional (Not supported in Dask)

Elements to compare for the maximum. See numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
amaxndarray or scalar

Maximum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

See also

amin

The minimum value of an array along a given axis, propagating any NaNs.

nanmax

The maximum value of an array along a given axis, ignoring any NaNs.

maximum

Element-wise maximum of two arrays, propagating any NaNs.

fmax

Element-wise maximum of two arrays, ignoring any NaNs.

argmax

Return the indices of the maximum values.

nanmin, minimum, fmin

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.

Don’t use amax for element-wise comparison of 2 arrays; when a.shape[0] is 2, maximum(a[0], a[1]) is faster than amax(a, axis=0).
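The two notes above can be illustrated with a small NumPy sketch. The array a is made up for illustration; it has exactly two rows and one NaN entry.

```python
import numpy as np

a = np.array([[1.0, 7.0, 3.0],
              [4.0, 2.0, np.nan]])

# Reducing over axis 0 of a two-row array ...
reduced = np.max(a, axis=0)
# ... gives the same values as the element-wise maximum of the two rows.
elementwise = np.maximum(a[0], a[1])

# max propagates the NaN in the last column; nanmax ignores it.
ignoring_nan = np.nanmax(a, axis=0)
```

The last column of reduced is NaN, while ignoring_nan keeps the only valid value in that column.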

Examples

>>> a = np.arange(4).reshape((2,2))  
>>> a  
array([[0, 1],
       [2, 3]])
>>> np.amax(a)           # Maximum of the flattened array  
3
>>> np.amax(a, axis=0)   # Maxima along the first axis  
array([2, 3])
>>> np.amax(a, axis=1)   # Maxima along the second axis  
array([1, 3])
>>> np.amax(a, where=[False, True], initial=-1, axis=0)  
array([-1,  3])
>>> b = np.arange(5, dtype=float)  
>>> b[2] = np.nan  
>>> np.amax(b)  
nan
>>> np.amax(b, where=~np.isnan(b), initial=-1)  
4.0
>>> np.nanmax(b)  
4.0

You can use an initial value to compute the maximum of an empty slice, or to initialize it to a different value:

>>> np.max([[-50], [10]], axis=-1, initial=0)  
array([ 0, 10])

Notice that the initial value is used as one of the elements for which the maximum is determined, unlike the default argument of Python’s max function, which is only used for empty iterables.

>>> np.max([5], initial=6)  
6
>>> max([5], default=6)  
5
dask.array.maximum(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.maximum.

Some inconsistencies with the Dask version may exist.

Element-wise maximum of array elements.

Compares two arrays and returns a new array containing the element-wise maxima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.

Parameters
x1, x2array_like

The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or scalar

The maximum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.

See also

minimum

Element-wise minimum of two arrays, propagates NaNs.

fmax

Element-wise maximum of two arrays, ignores NaNs.

amax

The maximum value of an array along a given axis, propagates NaNs.

nanmax

The maximum value of an array along a given axis, ignores NaNs.

fmin, amin, nanmin

Notes

The maximum is equivalent to np.where(x1 >= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.
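A minimal sketch of this equivalence for NaN-free inputs, using illustrative arrays with different but broadcastable shapes:

```python
import numpy as np

x1 = np.array([[2, 3, 4],
               [9, 0, 1]])       # shape (2, 3)
x2 = np.array([1, 5, 2])         # shape (3,), broadcast against each row of x1

via_maximum = np.maximum(x1, x2)
via_where = np.where(x1 >= x2, x1, x2)
```

Both expressions produce a (2, 3) result with the larger value at each position.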

Examples

>>> np.maximum([2, 3, 4], [1, 5, 2])  
array([2, 5, 4])
>>> np.maximum(np.eye(2), [0.5, 2]) # broadcasting  
array([[ 1. ,  2. ],
       [ 0.5,  2. ]])
>>> np.maximum([np.nan, 0, np.nan], [0, np.nan, np.nan])  
array([nan, nan, nan])
>>> np.maximum(np.inf, 1)  
inf
dask.array.mean(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Compute the arithmetic mean along the specified axis.

This docstring was copied from numpy.mean.

Some inconsistencies with the Dask version may exist.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

Parameters
aarray_like

Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.

axisNone or int or tuple of ints, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

New in version 1.7.0.

If this is a tuple of ints, a mean is performed over multiple axes, instead of a single axis or all the axes as before.

dtypedata-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for floating point inputs, it is the same as the input dtype.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the mean method of sub-classes of ndarray; any non-default value will be. If the sub-class’s method does not implement keepdims, an exception will be raised.

Returns
mndarray, see dtype parameter above

If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned.

See also

average

Weighted average

std, var, nanmean, nanstd, nanvar

Notes

The arithmetic mean is the sum of the elements along the axis divided by the number of elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

By default, float16 results are computed using float32 intermediates for extra precision.
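The definition above (sum divided by the number of elements) also holds when several axes are reduced at once. The following NumPy sketch uses an illustrative 3-D array and also shows how keepdims lets the result broadcast against the input.

```python
import numpy as np

a = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Mean over axes 0 and 2 together: sum divided by the number of reduced elements.
mean_02 = np.mean(a, axis=(0, 2))
manual = a.sum(axis=(0, 2)) / (a.shape[0] * a.shape[2])

# keepdims=True keeps the reduced axes with size one, so the result
# broadcasts against the input, for example to center the data.
centered = a - np.mean(a, axis=(0, 2), keepdims=True)
```

After centering, the mean over the same axes is zero up to floating-point error.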

Examples

>>> a = np.array([[1, 2], [3, 4]])  
>>> np.mean(a)  
2.5
>>> np.mean(a, axis=0)  
array([2., 3.])
>>> np.mean(a, axis=1)  
array([1.5, 3.5])

In single precision, mean can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)  
>>> a[0, :] = 1.0  
>>> a[1, :] = 0.1  
>>> np.mean(a)  
0.54999924

Computing the mean in float64 is more accurate:

>>> np.mean(a, dtype=np.float64)  
0.55000000074505806 # may vary
dask.array.median(a, axis=None, keepdims=False, out=None)

Compute the median along the specified axis.

This docstring was copied from numpy.median.

Some inconsistencies with the Dask version may exist.

This works by automatically rechunking the reduced axes to a single chunk and then calling the numpy.median function across the remaining dimensions.

Returns the median of the array elements.
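As a rough illustration of the chunking behavior, the sketch below builds a Dask array that is split into two chunks along axis 0 and takes the median along that axis. The data and chunk sizes are made up for illustration, and the axis is passed explicitly.

```python
import numpy as np
import dask.array as da

data = np.array([[10, 7, 4],
                 [3, 2, 1],
                 [8, 6, 5],
                 [0, 9, 11]])

# Two chunks along axis 0; the median along axis 0 needs all rows of a column.
x = da.from_array(data, chunks=(2, 3))

med = da.median(x, axis=0)   # lazy Dask array
result = med.compute()       # NumPy array with one median per column
```

The computed values should agree with np.median(data, axis=0), even though the input was stored in more than one chunk along the reduced axis.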

Parameters
aarray_like

Input array or object that can be converted to an array.

axis{int, sequence of int, None}, optional

Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.

overwrite_inputbool, optional (Not supported in Dask)

If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original array a.

New in version 1.9.0.

Returns
medianndarray

A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.

See also

mean, percentile

Notes

Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted, i.e., V_sorted[(N-1)/2] when N is odd, and the average of the two middle values of V_sorted when N is even.
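The definition can be spelled out for one odd-length and one even-length vector. This is a plain NumPy sketch with made-up values, compared against np.median.

```python
import numpy as np

odd = np.array([7, 1, 5, 3, 9])    # N = 5
even = np.array([7, 1, 5, 3])      # N = 4

odd_sorted = np.sort(odd)
even_sorted = np.sort(even)

# Odd N: the single middle element, at index (N - 1) // 2.
odd_median = odd_sorted[(odd.size - 1) // 2]

# Even N: the average of the two middle elements.
mid = even.size // 2
even_median = (even_sorted[mid - 1] + even_sorted[mid]) / 2
```

Both hand-computed values match what np.median returns for the same inputs.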

Examples

>>> a = np.array([[10, 7, 4], [3, 2, 1]])  
>>> a  
array([[10,  7,  4],
       [ 3,  2,  1]])
>>> np.median(a)  
3.5
>>> np.median(a, axis=0)  
array([6.5, 4.5, 2.5])
>>> np.median(a, axis=1)  
array([7.,  2.])
>>> m = np.median(a, axis=0)  
>>> out = np.zeros_like(m)  
>>> np.median(a, axis=0, out=m)  
array([6.5,  4.5,  2.5])
>>> m  
array([6.5,  4.5,  2.5])
>>> b = a.copy()  
>>> np.median(b, axis=1, overwrite_input=True)  
array([7.,  2.])
>>> assert not np.all(a==b)  
>>> b = a.copy()  
>>> np.median(b, axis=None, overwrite_input=True)  
3.5
>>> assert not np.all(a==b)  
dask.array.meshgrid(*xi, **kwargs)

Return coordinate matrices from coordinate vectors.

This docstring was copied from numpy.meshgrid.

Some inconsistencies with the Dask version may exist.

Make N-D coordinate arrays for vectorized evaluations of N-D scalar/vector fields over N-D grids, given one-dimensional coordinate arrays x1, x2,…, xn.

Changed in version 1.9: 1-D and 0-D cases are allowed.

Parameters
x1, x2,…, xnarray_like

1-D arrays representing the coordinates of a grid.

indexing{‘xy’, ‘ij’}, optional

Cartesian (‘xy’, default) or matrix (‘ij’) indexing of output. See Notes for more details.

New in version 1.7.0.

sparsebool, optional

If True a sparse grid is returned in order to conserve memory. Default is False.

New in version 1.7.0.

copybool, optional

If False, views into the original arrays are returned in order to conserve memory. Default is True. Please note that sparse=False, copy=False will likely return non-contiguous arrays. Furthermore, more than one element of a broadcast array may refer to a single memory location. If you need to write to the arrays, make copies first.

New in version 1.7.0.

Returns
X1, X2,…, XNndarray

For vectors x1, x2,…, xn with lengths Ni=len(xi), return (N1, N2, N3,...Nn) shaped arrays if indexing=‘ij’ or (N2, N1, N3,...Nn) shaped arrays if indexing=‘xy’, with the elements of xi repeated to fill the matrix along the first dimension for x1, the second for x2 and so on.

See also

index_tricks.mgrid

Construct a multi-dimensional “meshgrid” using indexing notation.

index_tricks.ogrid

Construct an open multi-dimensional “meshgrid” using indexing notation.

Notes

This function supports both indexing conventions through the indexing keyword argument. Giving the string ‘ij’ returns a meshgrid with matrix indexing, while ‘xy’ returns a meshgrid with Cartesian indexing. In the 2-D case with inputs of length M and N, the outputs are of shape (N, M) for ‘xy’ indexing and (M, N) for ‘ij’ indexing. In the 3-D case with inputs of length M, N and P, outputs are of shape (N, M, P) for ‘xy’ indexing and (M, N, P) for ‘ij’ indexing. The difference is illustrated by the following code snippet:

xv, yv = np.meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]

xv, yv = np.meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]

In the 1-D and 0-D case, the indexing and sparse keywords have no effect.
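The output shapes for the two indexing conventions can be checked directly. This NumPy sketch uses inputs of length M = 3 and N = 2; the variable names are illustrative.

```python
import numpy as np

x = np.linspace(0, 1, 3)   # M = 3
y = np.linspace(0, 1, 2)   # N = 2

xv_xy, yv_xy = np.meshgrid(x, y, indexing='xy')   # shapes (N, M) = (2, 3)
xv_ij, yv_ij = np.meshgrid(x, y, indexing='ij')   # shapes (M, N) = (3, 2)

# In the 2-D case the two conventions are transposes of each other.
transposed = (np.array_equal(xv_xy, xv_ij.T)
              and np.array_equal(yv_xy, yv_ij.T))
```

With ‘ij’ indexing, xv_ij[i, j] equals x[i]; with ‘xy’ indexing, xv_xy[j, i] equals x[i].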

Examples

>>> nx, ny = (3, 2)  
>>> x = np.linspace(0, 1, nx)  
>>> y = np.linspace(0, 1, ny)  
>>> xv, yv = np.meshgrid(x, y)  
>>> xv  
array([[0. , 0.5, 1. ],
       [0. , 0.5, 1. ]])
>>> yv  
array([[0.,  0.,  0.],
       [1.,  1.,  1.]])
>>> xv, yv = np.meshgrid(x, y, sparse=True)  # make sparse output arrays  
>>> xv  
array([[0. ,  0.5,  1. ]])
>>> yv  
array([[0.],
       [1.]])

meshgrid is very useful to evaluate functions on a grid.

>>> import matplotlib.pyplot as plt  
>>> x = np.arange(-5, 5, 0.1)  
>>> y = np.arange(-5, 5, 0.1)  
>>> xx, yy = np.meshgrid(x, y, sparse=True)  
>>> z = np.sin(xx**2 + yy**2) / (xx**2 + yy**2)  
>>> h = plt.contourf(x,y,z)  
>>> plt.show()  
dask.array.min(a, axis=None, keepdims=False, split_every=None, out=None)

Return the minimum of an array or minimum along an axis.

This docstring was copied from numpy.min.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input data.

axisNone or int or tuple of ints, optional

Axis or axes along which to operate. By default, flattened input is used.

New in version 1.7.0.

If this is a tuple of ints, the minimum is selected over multiple axes, instead of a single axis or all the axes as before.

outndarray, optional

Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output. See ufuncs-output-type for more details.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the amin method of sub-classes of ndarray; any non-default value will be. If the sub-class’s method does not implement keepdims, an exception will be raised.

initialscalar, optional (Not supported in Dask)

The maximum value of an output element. Must be present to allow computation on empty slice. See numpy.ufunc.reduce for details.

New in version 1.15.0.

wherearray_like of bool, optional (Not supported in Dask)

Elements to compare for the minimum. See numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
aminndarray or scalar

Minimum of a. If axis is None, the result is a scalar value. If axis is given, the result is an array of dimension a.ndim - 1.

See also

amax

The maximum value of an array along a given axis, propagating any NaNs.

nanmin

The minimum value of an array along a given axis, ignoring any NaNs.

minimum

Element-wise minimum of two arrays, propagating any NaNs.

fmin

Element-wise minimum of two arrays, ignoring any NaNs.

argmin

Return the indices of the minimum values.

nanmax, maximum, fmax

Notes

NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmin.

Don’t use amin for element-wise comparison of 2 arrays; when a.shape[0] is 2, minimum(a[0], a[1]) is faster than amin(a, axis=0).

Examples

>>> a = np.arange(4).reshape((2,2))  
>>> a  
array([[0, 1],
       [2, 3]])
>>> np.amin(a)           # Minimum of the flattened array  
0
>>> np.amin(a, axis=0)   # Minima along the first axis  
array([0, 1])
>>> np.amin(a, axis=1)   # Minima along the second axis  
array([0, 2])
>>> np.amin(a, where=[False, True], initial=10, axis=0)  
array([10,  1])
>>> b = np.arange(5, dtype=float)  
>>> b[2] = np.nan  
>>> np.amin(b)  
nan
>>> np.amin(b, where=~np.isnan(b), initial=10)  
0.0
>>> np.nanmin(b)  
0.0
>>> np.min([[-50], [10]], axis=-1, initial=0)  
array([-50,   0])

Notice that the initial value is used as one of the elements for which the minimum is determined, unlike the default argument of Python’s min function, which is only used for empty iterables:

>>> np.min([6], initial=5)  
5
>>> min([6], default=5)  
6
dask.array.minimum(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.minimum.

Some inconsistencies with the Dask version may exist.

Element-wise minimum of array elements.

Compares two arrays and returns a new array containing the element-wise minima. If one of the elements being compared is a NaN, then that element is returned. If both elements are NaNs then the first is returned. The latter distinction is important for complex NaNs, which are defined as at least one of the real or imaginary parts being a NaN. The net effect is that NaNs are propagated.

Parameters
x1, x2array_like

The arrays holding the elements to be compared. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or scalar

The minimum of x1 and x2, element-wise. This is a scalar if both x1 and x2 are scalars.

See also

maximum

Element-wise maximum of two arrays, propagates NaNs.

fmin

Element-wise minimum of two arrays, ignores NaNs.

amin

The minimum value of an array along a given axis, propagates NaNs.

nanmin

The minimum value of an array along a given axis, ignores NaNs.

fmax, amax, nanmax

Notes

The minimum is equivalent to np.where(x1 <= x2, x1, x2) when neither x1 nor x2 is NaN, but it is faster and does proper broadcasting.
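When NaNs are present, minimum and fmin (listed under See also) behave differently. A short NumPy sketch with made-up inputs:

```python
import numpy as np

x1 = np.array([np.nan, 0.0, 3.0])
x2 = np.array([1.0, np.nan, 2.0])

propagating = np.minimum(x1, x2)   # NaN wherever either input is NaN
ignoring = np.fmin(x1, x2)         # takes the non-NaN value when only one input is NaN
```

Only the last position, where both inputs are valid numbers, gives the same value from both functions.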

Examples

>>> np.minimum([2, 3, 4], [1, 5, 2])  
array([1, 3, 2])
>>> np.minimum(np.eye(2), [0.5, 2]) # broadcasting  
array([[ 0.5,  0. ],
       [ 0. ,  1. ]])
>>> np.minimum([np.nan, 0, np.nan],[0, np.nan, np.nan])  
array([nan, nan, nan])
>>> np.minimum(-np.inf, 1)  
-inf
dask.array.modf(x, [out1, out2, ]/, [out=(None, None), ]*, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.modf.

Some inconsistencies with the Dask version may exist.

Return the fractional and integral parts of an array, element-wise.

The fractional and integral parts are negative if the given number is negative.

Parameters
xarray_like

Input array.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y1ndarray

Fractional part of x. This is a scalar if x is a scalar.

y2ndarray

Integral part of x. This is a scalar if x is a scalar.

See also

divmod

divmod(x, 1) is equivalent to modf with the return values switched, except it always has a positive remainder.

Notes

For integer input the return values are floats.
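The sign convention and the relationship to divmod can be seen on a positive and a negative value. This is a NumPy sketch; the input values are chosen only for illustration.

```python
import numpy as np

x = np.array([3.5, -3.5])

# modf: both parts carry the sign of the input.
frac, whole = np.modf(x)

# divmod(x, 1): floor quotient and a non-negative remainder,
# returned in the opposite order (integral part first).
quot, rem = np.divmod(x, 1)

# Integer input still produces floating-point output.
int_frac, int_whole = np.modf(np.array([2, -7]))
```

For -3.5, modf gives a fractional part of -0.5 and an integral part of -3.0, while divmod gives a quotient of -4.0 and a remainder of 0.5.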

Examples

>>> np.modf([0, 3.5])  
(array([ 0. ,  0.5]), array([ 0.,  3.]))
>>> np.modf(-0.5)  
(-0.5, -0.0)
dask.array.moment(a, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)
dask.array.moveaxis(a, source, destination)

Move axes of an array to new positions.

This docstring was copied from numpy.moveaxis.

Some inconsistencies with the Dask version may exist.

Other axes remain in their original order.

New in version 1.11.0.

Parameters
anp.ndarray

The array whose axes should be reordered.

sourceint or sequence of int

Original positions of the axes to move. These must be unique.

destinationint or sequence of int

Destination positions for each of the original axes. These must also be unique.

Returns
resultnp.ndarray

Array with moved axes. This array is a view of the input array.

See also

transpose

Permute the dimensions of an array.

swapaxes

Interchange two axes of an array.

Examples

>>> x = np.zeros((3, 4, 5))  
>>> np.moveaxis(x, 0, -1).shape  
(4, 5, 3)
>>> np.moveaxis(x, -1, 0).shape  
(5, 3, 4)

These all achieve the same result:

>>> np.transpose(x).shape  
(5, 4, 3)
>>> np.swapaxes(x, 0, -1).shape  
(5, 4, 3)
>>> np.moveaxis(x, [0, 1], [-1, -2]).shape  
(5, 4, 3)
>>> np.moveaxis(x, [0, 1, 2], [-1, -2, -3]).shape  
(5, 4, 3)
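The statement that the result is a view comes from the NumPy docstring and can be checked for NumPy arrays as sketched below. Dask arrays are evaluated lazily, so this check is about NumPy behavior only.

```python
import numpy as np

x = np.zeros((3, 4, 5))
moved = np.moveaxis(x, 0, -1)      # shape (4, 5, 3)

# The moved array shares its memory with x ...
shares = np.shares_memory(x, moved)

# ... so writing to moved[j, k, i] changes x[i, j, k].
moved[1, 2, 0] = 1.0
```

After the assignment, x[0, 1, 2] holds the value written through the view.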
dask.array.nanargmax(x, axis=None, split_every=None, out=None)

Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.

This docstring was copied from numpy.nanmax.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

New in version 1.8.0.

keepdimsbool, optional (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If the sub-class’s method does not implement keepdims, an exception will be raised.

New in version 1.8.0.

Returns
nanmaxndarray

An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

See also

nanmin

The minimum value of an array along a given axis, ignoring any NaNs.

amax

The maximum value of an array along a given axis, propagating any NaNs.

fmax

Element-wise maximum of two arrays, ignoring any NaNs.

maximum

Element-wise maximum of two arrays, propagating any NaNs.

isnan

Shows which elements are Not a Number (NaN).

isfinite

Shows which elements are neither NaN nor infinity.

amin, fmin, minimum

Notes

NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has an integer type the function is equivalent to np.max.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nanmax(a)  
3.0
>>> np.nanmax(a, axis=0)  
array([3.,  2.])
>>> np.nanmax(a, axis=1)  
array([2.,  3.])

When positive infinity and negative infinity are present:

>>> np.nanmax([1, 2, np.nan, -np.inf])  
2.0
>>> np.nanmax([1, 2, np.nan, np.inf])  
inf
dask.array.nanargmin(x, axis=None, split_every=None, out=None)

Return the minimum of an array or minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.

This docstring was copied from numpy.nanmin.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like (Not supported in Dask)

Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

New in version 1.8.0.

keepdimsbool, optional (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If the sub-class’s method does not implement keepdims, an exception will be raised.

New in version 1.8.0.

Returns
nanminndarray

An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

See also

nanmax

The maximum value of an array along a given axis, ignoring any NaNs.

amin

The minimum value of an array along a given axis, propagating any NaNs.

fmin

Element-wise minimum of two arrays, ignoring any NaNs.

minimum

Element-wise minimum of two arrays, propagating any NaNs.

isnan

Shows which elements are Not a Number (NaN).

isfinite

Shows which elements are neither NaN nor infinity.

amax, fmax, maximum

Notes

NumPy uses the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has an integer type the function is equivalent to np.min.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nanmin(a)  
1.0
>>> np.nanmin(a, axis=0)  
array([1.,  2.])
>>> np.nanmin(a, axis=1)  
array([1.,  3.])

When positive infinity and negative infinity are present:

>>> np.nanmin([1, 2, np.nan, np.inf])  
1.0
>>> np.nanmin([1, 2, np.nan, -np.inf])  
-inf
dask.array.nancumprod(x, axis, dtype=None, out=None)

Return the cumulative product of array elements over a given axis treating Not a Numbers (NaNs) as one. The cumulative product does not change when NaNs are encountered and leading NaNs are replaced by ones.

This docstring was copied from numpy.nancumprod.

Some inconsistencies with the Dask version may exist.

Ones are returned for slices that are all-NaN or empty.

New in version 1.12.0.

Parameters
aarray_like (Not supported in Dask)

Input array.

axisint, optional

Axis along which the cumulative product is computed. By default the input is flattened.

dtypedtype, optional

Type of the returned array, as well as of the accumulator in which the elements are multiplied. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used instead.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type of the resulting values will be cast if necessary.

Returns
nancumprodndarray

A new array holding the result is returned unless out is specified, in which case it is returned.

See also

numpy.cumprod

Cumulative product across array propagating NaNs.

isnan

Show which elements are NaN.

Examples

>>> np.nancumprod(1)  
array([1])
>>> np.nancumprod([1])  
array([1])
>>> np.nancumprod([1, np.nan])  
array([1.,  1.])
>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nancumprod(a)  
array([1.,  2.,  6.,  6.])
>>> np.nancumprod(a, axis=0)  
array([[1.,  2.],
       [3.,  2.]])
>>> np.nancumprod(a, axis=1)  
array([[1.,  2.],
       [3.,  3.]])
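Treating NaNs as one can be pictured as replacing each NaN with 1 before an ordinary cumulative product. The NumPy sketch below uses made-up data with a leading NaN and also shows the all-NaN case.

```python
import numpy as np

a = np.array([np.nan, 2.0, np.nan, 3.0])

result = np.nancumprod(a)

# Replace each NaN with 1, then take an ordinary cumulative product.
manual = np.cumprod(np.where(np.isnan(a), 1.0, a))

# An all-NaN input yields ones.
all_nan = np.nancumprod(np.array([np.nan, np.nan]))
```

The leading NaN becomes 1, and the running product is unchanged at the position of the second NaN.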
dask.array.nancumsum(x, axis, dtype=None, out=None)

Return the cumulative sum of array elements over a given axis treating Not a Numbers (NaNs) as zero. The cumulative sum does not change when NaNs are encountered and leading NaNs are replaced by zeros.

This docstring was copied from numpy.nancumsum.

Some inconsistencies with the Dask version may exist.

Zeros are returned for slices that are all-NaN or empty.

New in version 1.12.0.

Parameters
aarray_like (Not supported in Dask)

Input array.

axisint, optional

Axis along which the cumulative sum is computed. The default (None) is to compute the cumsum over the flattened array.

dtypedtype, optional

Type of the returned array and of the accumulator in which the elements are summed. If dtype is not specified, it defaults to the dtype of a, unless a has an integer dtype with a precision less than that of the default platform integer. In that case, the default platform integer is used.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output but the type will be cast if necessary. See ufuncs-output-type for more details.

Returns
nancumsumndarray

A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.

See also

numpy.cumsum

Cumulative sum across array propagating NaNs.

isnan

Show which elements are NaN.

Examples

>>> np.nancumsum(1)  
array([1])
>>> np.nancumsum([1])  
array([1])
>>> np.nancumsum([1, np.nan])  
array([1.,  1.])
>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nancumsum(a)  
array([1.,  3.,  6.,  6.])
>>> np.nancumsum(a, axis=0)  
array([[1.,  2.],
       [4.,  2.]])
>>> np.nancumsum(a, axis=1)  
array([[1.,  3.],
       [3.,  3.]])
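For contrast with numpy.cumsum (listed under See also), the NumPy sketch below runs both functions on the same made-up input containing a NaN.

```python
import numpy as np

a = np.array([1.0, np.nan, 2.0])

propagating = np.cumsum(a)      # every entry from the NaN onward is NaN
ignoring = np.nancumsum(a)      # the NaN contributes zero to the running sum
```

With cumsum, one NaN makes all later partial sums NaN; with nancumsum the running total simply stays the same at the NaN position.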
dask.array.nanmax(a, axis=None, keepdims=False, split_every=None, out=None)

Return the maximum of an array or maximum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.

This docstring was copied from numpy.nanmax.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Array containing numbers whose maximum is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the maximum is computed. The default is to compute the maximum of the flattened array.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

New in version 1.8.0.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the max method of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.

New in version 1.8.0.

Returns
nanmaxndarray

An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

See also

nanmin

The minimum value of an array along a given axis, ignoring any NaNs.

amax

The maximum value of an array along a given axis, propagating any NaNs.

fmax

Element-wise maximum of two arrays, ignoring any NaNs.

maximum

Element-wise maximum of two arrays, propagating any NaNs.

isnan

Shows which elements are Not a Number (NaN).

isfinite

Shows which elements are neither NaN nor infinity.

amin, fmin, minimum

Notes

NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has an integer type the function is equivalent to np.max.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nanmax(a)  
3.0
>>> np.nanmax(a, axis=0)  
array([3.,  2.])
>>> np.nanmax(a, axis=1)  
array([2.,  3.])

When positive infinity and negative infinity are present:

>>> np.nanmax([1, 2, np.nan, -np.inf])  
2.0
>>> np.nanmax([1, 2, np.nan, np.inf])  
inf
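
The all-NaN case mentioned in the summary is not covered by the examples above. The following sketch, written against NumPy under the assumption that the copied docstring describes the same values, records the warning instead of letting it print.

```python
import warnings
import numpy as np

a = np.array([[1.0, np.nan], [np.nan, np.nan]])

# Record warnings so the all-NaN second row can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    row_max = np.nanmax(a, axis=1)

# The first row ignores its NaN; the second row has nothing left, so NaN is returned.
print(row_max)
print([w.category.__name__ for w in caught])
```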
dask.array.nanmean(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Compute the arithmetic mean along the specified axis, ignoring NaNs.

This docstring was copied from numpy.nanmean.

Some inconsistencies with the Dask version may exist.

Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

For all-NaN slices, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters
aarray_like

Array containing numbers whose mean is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the means are computed. The default is to compute the mean of the flattened array.

dtypedata-type, optional

Type to use in computing the mean. For integer inputs, the default is float64; for inexact inputs, it is the same as the input dtype.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.

Returns
mndarray, see dtype parameter above

If out=None, returns a new array containing the mean values, otherwise a reference to the output array is returned. NaN is returned for slices that contain only NaNs.

See also

average

Weighted average

mean

Arithmetic mean taken while not ignoring NaNs

var, nanvar

Notes

The arithmetic mean is the sum of the non-NaN elements along the axis divided by the number of non-NaN elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])  
>>> np.nanmean(a)  
2.6666666666666665
>>> np.nanmean(a, axis=0)  
array([2.,  4.])
>>> np.nanmean(a, axis=1)  
array([1.,  3.5]) # may vary
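
The note about float32 precision can be illustrated with a small sketch. It uses NumPy directly and made-up data; the size of the float32 error depends on the data and the NumPy version, so only the dtypes and the agreement of the float64 accumulator with a float64 reference are shown.

```python
import numpy as np

# float32 data: half ones, half 0.1, plus one missing value.
a = np.empty((2, 512 * 512), dtype=np.float32)
a[0, :] = 1.0
a[1, :] = 0.1
a[0, 0] = np.nan

m32 = np.nanmean(a)                    # accumulates in float32
m64 = np.nanmean(a, dtype=np.float64)  # accumulates in float64

# Reference value computed entirely in float64.
reference = np.nanmean(a.astype(np.float64))

print(type(m32).__name__, type(m64).__name__)  # float32 float64
print(abs(m64 - reference), abs(float(m32) - reference))
```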

dask.array.nanmedian(a, axis=None, keepdims=False, out=None)

Compute the median along the specified axis, while ignoring NaNs.

This docstring was copied from numpy.nanmedian.

Some inconsistencies with the Dask version may exist.

This works by automatically chunking the reduced axes to a single chunk and then calling the numpy.nanmedian function across the remaining dimensions.

Returns the median of the array elements.

New in version 1.9.0.

Parameters
aarray_like

Input array or object that can be converted to an array.

axis{int, sequence of int, None}, optional

Axis or axes along which the medians are computed. The default is to compute the median along a flattened version of the array. A sequence of axes is supported since version 1.9.0.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type (of the output) will be cast if necessary.

overwrite_inputbool, optional (Not supported in Dask)

If True, then allow use of memory of input array a for calculations. The input array will be modified by the call to median. This will save memory when you do not need to preserve the contents of the input array. Treat the input as undefined, but it will probably be fully or partially sorted. Default is False. If overwrite_input is True and a is not already an ndarray, an error will be raised.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If this is anything but the default value it will be passed through (in the special case of an empty array) to the mean function of the underlying array. If the array is a sub-class and mean does not have the kwarg keepdims this will raise a RuntimeError.

Returns
medianndarray

A new array holding the result. If the input contains integers or floats smaller than float64, then the output data-type is np.float64. Otherwise, the data-type of the output is the same as that of the input. If out is specified, that array is returned instead.

See also

mean, median, percentile

Notes

Given a vector V of length N, the median of V is the middle value of a sorted copy of V, V_sorted - i.e., V_sorted[(N-1)/2], when N is odd and the average of the two middle values of V_sorted when N is even.
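
The definition above can be written out directly. The helper nanmedian_1d below is a hypothetical name used only for illustration, not part of NumPy or Dask; it drops NaNs, sorts, and picks the middle value or the average of the two middle values.

```python
import numpy as np

def nanmedian_1d(v):
    """Median of the non-NaN entries of a 1-D sequence (illustrative helper)."""
    v = np.asarray(v, dtype=float)
    v_sorted = np.sort(v[~np.isnan(v)])  # drop NaNs, then sort
    n = v_sorted.size
    if n == 0:
        return np.nan
    mid = (n - 1) // 2
    if n % 2 == 1:
        return v_sorted[mid]                           # odd N: the middle value
    return 0.5 * (v_sorted[mid] + v_sorted[mid + 1])   # even N: mean of the two middle values

odd = [10.0, np.nan, 4.0, 3.0, 2.0, 1.0]   # five valid values
even = [10.0, np.nan, 4.0, 3.0, 2.0]       # four valid values

print(nanmedian_1d(odd), np.nanmedian(odd))    # 3.0 3.0
print(nanmedian_1d(even), np.nanmedian(even))  # 3.5 3.5
```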

Examples

>>> a = np.array([[10.0, 7, 4], [3, 2, 1]])  
>>> a[0, 1] = np.nan  
>>> a  
array([[10., nan,  4.],
       [ 3.,  2.,  1.]])
>>> np.median(a)  
nan
>>> np.nanmedian(a)  
3.0
>>> np.nanmedian(a, axis=0)  
array([6.5, 2. , 2.5])
>>> np.median(a, axis=1)  
array([nan,  2.])
>>> b = a.copy()  
>>> np.nanmedian(b, axis=1, overwrite_input=True)  
array([7.,  2.])
>>> assert not np.all(a==b)  
>>> b = a.copy()  
>>> np.nanmedian(b, axis=None, overwrite_input=True)  
3.0
>>> assert not np.all(a==b)  
dask.array.nanmin(a, axis=None, keepdims=False, split_every=None, out=None)

Return the minimum of an array or the minimum along an axis, ignoring any NaNs. When all-NaN slices are encountered a RuntimeWarning is raised and NaN is returned for that slice.

This docstring was copied from numpy.nanmin.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Array containing numbers whose minimum is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the minimum is computed. The default is to compute the minimum of the flattened array.

outndarray, optional

Alternate output array in which to place the result. The default is None; if provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details.

New in version 1.8.0.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the min method of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.

New in version 1.8.0.

Returns
nanminndarray

An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, an ndarray scalar is returned. The same dtype as a is returned.

See also

nanmax

The maximum value of an array along a given axis, ignoring any NaNs.

amin

The minimum value of an array along a given axis, propagating any NaNs.

fmin

Element-wise minimum of two arrays, ignoring any NaNs.

minimum

Element-wise minimum of two arrays, propagating any NaNs.

isnan

Shows which elements are Not a Number (NaN).

isfinite

Shows which elements are neither NaN nor infinity.

amax, fmax, maximum

Notes

NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity. Positive infinity is treated as a very large number and negative infinity is treated as a very small (i.e. negative) number.

If the input has an integer type the function is equivalent to np.min.

Examples

>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nanmin(a)  
1.0
>>> np.nanmin(a, axis=0)  
array([1.,  2.])
>>> np.nanmin(a, axis=1)  
array([1.,  3.])

When positive infinity and negative infinity are present:

>>> np.nanmin([1, 2, np.nan, np.inf])  
1.0
>>> np.nanmin([1, 2, np.nan, -np.inf])  
-inf
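
The keepdims behaviour described under Parameters is easiest to see when the reduced result is combined with the original array. This is a minimal NumPy sketch; the array values are illustrative.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, np.nan]])

# keepdims=True leaves the reduced axis in place with size one ...
row_min = np.nanmin(a, axis=1, keepdims=True)

# ... so the result broadcasts against a without any reshaping.
shifted = a - row_min

print(row_min.shape)  # (2, 1)
print(shifted)
```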
dask.array.nanprod(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Return the product of array elements over a given axis treating Not a Numbers (NaNs) as ones.

This docstring was copied from numpy.nanprod.

Some inconsistencies with the Dask version may exist.

One is returned for slices that are all-NaN or empty.

New in version 1.10.0.

Parameters
aarray_like

Array containing numbers whose product is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the product is computed. The default is to compute the product of the flattened array.

dtypedata-type, optional

The type of the returned array and of the accumulator in which the elements are multiplied. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.

outndarray, optional

Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details. The casting of NaN to integer can yield unexpected results.

keepdimsbool, optional

If True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

Returns
nanprodndarray

A new array holding the result is returned unless out is specified, in which case it is returned.

See also

numpy.prod

Product across array propagating NaNs.

isnan

Show which elements are NaN.

Examples

>>> np.nanprod(1)  
1
>>> np.nanprod([1])  
1
>>> np.nanprod([1, np.nan])  
1.0
>>> a = np.array([[1, 2], [3, np.nan]])  
>>> np.nanprod(a)  
6.0
>>> np.nanprod(a, axis=0)  
array([3., 2.])
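
The statement that one is returned for all-NaN or empty slices can be checked with a short NumPy sketch; it also adds the axis=1 case missing from the examples above.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, np.nan]])

by_row = np.nanprod(a, axis=1)          # NaN in the second row is treated as one
all_nan = np.nanprod([np.nan, np.nan])  # nothing but NaNs
empty = np.nanprod([])                  # no elements at all

print(by_row)           # [2. 3.]
print(all_nan, empty)   # 1.0 1.0
```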
dask.array.nanstd(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)

Compute the standard deviation along the specified axis, while ignoring NaNs.

This docstring was copied from numpy.nanstd.

Some inconsistencies with the Dask version may exist.

Returns the standard deviation, a measure of the spread of a distribution, of the non-NaN array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters
aarray_like

Calculate the standard deviation of the non-NaN values.

axis{int, tuple of int, None}, optional

Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.

dtypedtype, optional

Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.

ddofint, optional

Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If this value is anything but the default it is passed through as-is to the relevant functions of the sub-classes. If these functions do not have a keepdims kwarg, a RuntimeError will be raised.

Returns
standard_deviationndarray, see dtype parameter above.

If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.

See also

var, mean, std
nanvar, nanmean
ufuncs-output-type

Notes

The standard deviation is the square root of the average of the squared deviations from the mean: std = sqrt(mean(abs(x - x.mean())**2)).

The average squared deviation is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.

Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.

For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])  
>>> np.nanstd(a)  
1.247219128924647
>>> np.nanstd(a, axis=0)  
array([1., 0.])
>>> np.nanstd(a, axis=1)  
array([0.,  0.5]) # may vary
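
The formula in the Notes, including the effect of ddof on the divisor, can be reproduced by hand. This is a sketch against NumPy using the same array as the examples above; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1.0, np.nan], [3.0, 4.0]])

x = a[~np.isnan(a)]                 # the non-NaN elements: 1, 3, 4
n = x.size
sq_dev = np.abs(x - x.mean()) ** 2  # squared deviations from the mean

std_ddof0 = np.sqrt(sq_dev.sum() / n)        # divisor N
std_ddof1 = np.sqrt(sq_dev.sum() / (n - 1))  # divisor N - ddof with ddof=1

print(std_ddof0, np.nanstd(a))
print(std_ddof1, np.nanstd(a, ddof=1))
```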

dask.array.nansum(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Return the sum of array elements over a given axis treating Not a Numbers (NaNs) as zero.

This docstring was copied from numpy.nansum.

Some inconsistencies with the Dask version may exist.

In NumPy versions <= 1.9.0 NaN is returned for slices that are all-NaN or empty. In later versions zero is returned.

Parameters
aarray_like

Array containing numbers whose sum is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the sum is computed. The default is to compute the sum of the flattened array.

dtypedata-type, optional

The type of the returned array and of the accumulator in which the elements are summed. By default, the dtype of a is used. An exception is when a has an integer type with less precision than the platform (u)intp. In that case, the default will be either (u)int32 or (u)int64 depending on whether the platform is 32 or 64 bits. For inexact inputs, dtype must be inexact.

New in version 1.8.0.

outndarray, optional

Alternate output array in which to place the result. The default is None. If provided, it must have the same shape as the expected output, but the type will be cast if necessary. See ufuncs-output-type for more details. The casting of NaN to integer can yield unexpected results.

New in version 1.8.0.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

If the value is anything but the default, then keepdims will be passed through to the mean or sum methods of sub-classes of ndarray. If the sub-class's method does not implement keepdims, an exception will be raised.

New in version 1.8.0.

Returns
nansumndarray.

A new array holding the result is returned unless out is specified, in which case it is returned. The result has the same size as a, and the same shape as a if axis is not None or a is a 1-d array.

See also

numpy.sum

Sum across array propagating NaNs.

isnan

Show which elements are NaN.

isfinite

Show which elements are not NaN or +/-inf.

Notes

If both positive and negative infinity are present, the sum will be Not A Number (NaN).

Examples

>>> np.nansum(1)  
1
>>> np.nansum([1])  
1
>>> np.nansum([1, np.nan])  
1.0
>>> a = np.array([[1, 1], [1, np.nan]])  
>>> np.nansum(a)  
3.0
>>> np.nansum(a, axis=0)  
array([2.,  1.])
>>> np.nansum([1, np.nan, np.inf])  
inf
>>> np.nansum([1, np.nan, -np.inf])  
-inf
>>> from numpy.testing import suppress_warnings  
>>> with suppress_warnings() as sup:  
...     sup.filter(RuntimeWarning)
...     np.nansum([1, np.nan, np.inf, -np.inf]) # both +/- infinity present
nan
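
For contrast with the NaN-propagating sum listed under See also, here is a minimal NumPy sketch. It also shows the zero returned for all-NaN and empty input on current NumPy versions.

```python
import numpy as np

a = np.array([[1.0, np.nan], [np.nan, np.nan]])

plain = np.sum(a, axis=1)        # NaN propagates through every row
ignoring = np.nansum(a, axis=1)  # NaNs count as zero, so the all-NaN row sums to 0
empty = np.nansum([])            # an empty input also sums to 0

print(plain)
print(ignoring)  # [1. 0.]
print(empty)     # 0.0
```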
dask.array.nanvar(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)

Compute the variance along the specified axis, while ignoring NaNs.

This docstring was copied from numpy.nanvar.

Some inconsistencies with the Dask version may exist.

Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.

For all-NaN slices or slices with zero degrees of freedom, NaN is returned and a RuntimeWarning is raised.

New in version 1.8.0.

Parameters
aarray_like

Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.

axis{int, tuple of int, None}, optional

Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.

dtypedata-type, optional

Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.

outndarray, optional

Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.

ddofint, optional

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of non-NaN elements. By default ddof is zero.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original a.

Returns
variancendarray, see dtype parameter above

If out is None, return a new array containing the variance, otherwise return a reference to the output array. If ddof is >= the number of non-NaN elements in a slice or the slice contains only NaNs, then the result for that slice is NaN.

See also

std

Standard deviation

mean

Average

var

Variance while not ignoring NaNs

nanstd, nanmean
ufuncs-output-type

Notes

The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).

The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.

Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.

For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32. Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

For this function to work on sub-classes of ndarray, they must define sum with the kwarg keepdims.

Examples

>>> a = np.array([[1, np.nan], [3, 4]])  
>>> np.nanvar(a)  
1.5555555555555554
>>> np.nanvar(a, axis=0)  
array([1.,  0.])
>>> np.nanvar(a, axis=1)  
array([0.,  0.25])  # may vary
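
The rule that a slice with ddof greater than or equal to its number of non-NaN elements yields NaN can be seen with the same array. This NumPy sketch silences the accompanying RuntimeWarning and also checks that the variance is the square of nanstd.

```python
import warnings
import numpy as np

a = np.array([[1.0, np.nan], [3.0, 4.0]])

# Row 0 has one non-NaN element, so ddof=1 leaves zero degrees of freedom.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    var_ddof1 = np.nanvar(a, axis=1, ddof=1)

# The variance is the square of the standard deviation.
consistent = np.allclose(np.nanstd(a, axis=0) ** 2, np.nanvar(a, axis=0))

print(var_ddof1)
print(consistent)  # True
```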

dask.array.nan_to_num(*args, **kwargs)

Replace NaN with zero and infinity with large finite numbers (default behaviour) or with the numbers defined by the user using the nan, posinf and/or neginf keywords.

This docstring was copied from numpy.nan_to_num.

Some inconsistencies with the Dask version may exist.

If x is inexact, NaN is replaced by zero or by the user defined value in nan keyword, infinity is replaced by the largest finite floating point values representable by x.dtype or by the user defined value in posinf keyword and -infinity is replaced by the most negative finite floating point values representable by x.dtype or by the user defined value in neginf keyword.

For complex dtypes, the above is applied to each of the real and imaginary components of x separately.

If x is not inexact, then no replacements are made.

Parameters
xscalar or array_like (Not supported in Dask)

Input data.

copybool, optional (Not supported in Dask)

Whether to create a copy of x (True) or to replace values in-place (False). The in-place operation only occurs if casting to an array does not require a copy. Default is True.

New in version 1.13.

nanint, float, optional (Not supported in Dask)

Value to be used to fill NaN values. If no value is passed then NaN values will be replaced with 0.0.

New in version 1.17.

posinfint, float, optional (Not supported in Dask)

Value to be used to fill positive infinity values. If no value is passed then positive infinity values will be replaced with the largest finite value representable by x.dtype.

New in version 1.17.

neginfint, float, optional (Not supported in Dask)

Value to be used to fill negative infinity values. If no value is passed then negative infinity values will be replaced with the most negative finite value representable by x.dtype.

New in version 1.17.

Returns
outndarray

x, with the non-finite values replaced. If copy is False, this may be x itself.

See also

isinf

Shows which elements are positive or negative infinity.

isneginf

Shows which elements are negative infinity.

isposinf

Shows which elements are positive infinity.

isnan

Shows which elements are Not a Number (NaN).

isfinite

Shows which elements are finite (not NaN, not infinity).

Notes

NumPy uses the IEEE Standard for Binary Floating-Point for Arithmetic (IEEE 754). This means that Not a Number is not equivalent to infinity.

Examples

>>> np.nan_to_num(np.inf)  
1.7976931348623157e+308
>>> np.nan_to_num(-np.inf)  
-1.7976931348623157e+308
>>> np.nan_to_num(np.nan)  
0.0
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])  
>>> np.nan_to_num(x)  
array([ 1.79769313e+308, -1.79769313e+308,  0.00000000e+000, # may vary
       -1.28000000e+002,  1.28000000e+002])
>>> np.nan_to_num(x, nan=-9999, posinf=33333333, neginf=33333333)  
array([ 3.3333333e+07,  3.3333333e+07, -9.9990000e+03, 
       -1.2800000e+02,  1.2800000e+02])
>>> y = np.array([complex(np.inf, np.nan), np.nan, complex(np.nan, np.inf)])  
>>> np.nan_to_num(y)  
array([  1.79769313e+308 +0.00000000e+000j, # may vary
         0.00000000e+000 +0.00000000e+000j,
         0.00000000e+000 +1.79769313e+308j])
>>> np.nan_to_num(y, nan=111111, posinf=222222)  
array([222222.+111111.j, 111111.     +0.j, 111111.+222222.j])
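
Two points from the description, that the replacement values depend on x.dtype and that non-inexact input is left alone, can be checked with a short NumPy sketch.

```python
import numpy as np

# Integer input is not inexact, so no replacements are made.
ints = np.array([1, 2, 3])
unchanged = np.nan_to_num(ints)

# For float32 the infinities map to the float32 limits, not the float64 ones.
x32 = np.array([np.inf, -np.inf, np.nan], dtype=np.float32)
cleaned = np.nan_to_num(x32)

print(unchanged)
print(cleaned.dtype)  # float32
print(cleaned[0] == np.finfo(np.float32).max)  # True
```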
dask.array.nextafter(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.nextafter.

Some inconsistencies with the Dask version may exist.

Return the next floating-point value after x1 towards x2, element-wise.

Parameters
x1array_like

Values to find the next representable value of.

x2array_like

The direction where to look for the next representable value of x1. If x1.shape != x2.shape, they must be broadcastable to a common shape (which becomes the shape of the output).

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
outndarray or scalar

The next representable values of x1 in the direction of x2. This is a scalar if both x1 and x2 are scalars.

Examples

>>> eps = np.finfo(np.float64).eps  
>>> np.nextafter(1, 2) == eps + 1  
True
>>> np.nextafter([1, 2], [2, 1]) == [eps + 1, 2 - eps]  
array([ True,  True])
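
A few further cases help show what "next representable value" means. This is a NumPy sketch; the step size depends on the dtype, and stepping from zero lands on the smallest positive subnormal number.

```python
import numpy as np

eps64 = np.finfo(np.float64).eps
eps32 = np.finfo(np.float32).eps

# The gap just above 1.0 is the machine epsilon of the dtype in use.
step64 = np.nextafter(np.float64(1), np.float64(2)) - np.float64(1)
step32 = np.nextafter(np.float32(1), np.float32(2)) - np.float32(1)

tiny = np.nextafter(0.0, 1.0)  # smallest positive float64
same = np.nextafter(1.0, 1.0)  # no direction to move in, so x1 is returned

print(step64 == eps64, step32 == eps32)  # True True
print(tiny > 0, same == 1.0)             # True True
```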
dask.array.nonzero(a)

Return the indices of the elements that are non-zero.

This docstring was copied from numpy.nonzero.

Some inconsistencies with the Dask version may exist.

Returns a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The values in a are always tested and returned in row-major, C-style order.

To group the indices by element, rather than dimension, use argwhere, which returns a row for each non-zero element.

Note

When called on a zero-d array or scalar, nonzero(a) is treated as nonzero(atleast_1d(a)).

Deprecated since version 1.17.0: Use atleast_1d explicitly if this behavior is deliberate.

Parameters
aarray_like

Input array.

Returns
tuple_of_arraystuple

Indices of elements that are non-zero.

See also

flatnonzero

Return indices that are non-zero in the flattened version of the input array.

ndarray.nonzero

Equivalent ndarray method.

count_nonzero

Counts the number of non-zero elements in the input array.

Notes

While the nonzero values can be obtained with a[nonzero(a)], it is recommended to use a[a.astype(bool)] or a[a != 0] instead, which will correctly handle 0-d arrays.

Examples

>>> x = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])  
>>> x  
array([[3, 0, 0],
       [0, 4, 0],
       [5, 6, 0]])
>>> np.nonzero(x)  
(array([0, 1, 2, 2]), array([0, 1, 0, 1]))
>>> x[np.nonzero(x)]  
array([3, 4, 5, 6])
>>> np.transpose(np.nonzero(x))  
array([[0, 0],
       [1, 1],
       [2, 0],
       [2, 1]])

A common use for nonzero is to find the indices of an array where a condition is True. Given an array a, the condition a > 3 is a boolean array, and since False is interpreted as 0, np.nonzero(a > 3) yields the indices of a where the condition is true.

>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  
>>> a > 3  
array([[False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
>>> np.nonzero(a > 3)  
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))

Using this result to index a is equivalent to using the mask directly:

>>> a[np.nonzero(a > 3)]  
array([4, 5, 6, 7, 8, 9])
>>> a[a > 3]  # prefer this spelling  
array([4, 5, 6, 7, 8, 9])

nonzero can also be called as a method of the array.

>>> (a > 3).nonzero()  
(array([1, 1, 1, 2, 2, 2]), array([0, 1, 2, 0, 1, 2]))
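
The relationship to argwhere and count_nonzero mentioned above can be made concrete. This is a sketch against NumPy using the same x as the first example.

```python
import numpy as np

x = np.array([[3, 0, 0], [0, 4, 0], [5, 6, 0]])

rows, cols = np.nonzero(x)    # one index array per dimension
by_element = np.argwhere(x)   # one (row, col) pair per non-zero element
count = np.count_nonzero(x)

print(np.array_equal(by_element, np.transpose((rows, cols))))  # True
print(count == len(rows))                                      # True
```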
dask.array.notnull(values)

pandas.notnull for dask arrays

dask.array.ones(*args, **kwargs)

Blocked variant of ones

Follows the signature of ones exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.

Original signature follows below.

Return a new array of given shape and type, filled with ones.

Parameters
shapeint or sequence of ints

Shape of the new array, e.g., (2, 3) or 2.

dtypedata-type, optional

The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.

order{‘C’, ‘F’}, optional, default: C

Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

Returns
outndarray

Array of ones with the given shape, dtype, and order.

See also

ones_like

Return an array of ones with shape and type of input.

empty

Return a new uninitialized array.

zeros

Return a new array setting values to zero.

full

Return a new array of given shape filled with value.

Examples

>>> np.ones(5)
array([1., 1., 1., 1., 1.])
>>> np.ones((5,), dtype=int)
array([1, 1, 1, 1, 1])
>>> np.ones((2, 1))
array([[1.],
       [1.]])
>>> s = (2,2)
>>> np.ones(s)
array([[1.,  1.],
       [1.,  1.]])
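
The examples above come from NumPy. As a minimal sketch of the blocked variant, the following passes the extra chunks keyword described at the top of this entry; the shape and chunk sizes are arbitrary choices for illustration.

```python
import dask.array as da

# chunks=(2, 3) splits the 4x6 array into a 2x2 grid of blocks.
x = da.ones((4, 6), chunks=(2, 3), dtype="int64")

block_layout = x.chunks          # block sizes along each axis
total = int(x.sum().compute())   # nothing is evaluated until compute()
print(block_layout, total)
```

Only the task graph is built when da.ones is called; the blocks of ones are materialized when compute() runs.
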
dask.array.ones_like(a, dtype=None, order='C', chunks=None, name=None, shape=None)

Return an array of ones with the same shape and type as a given array.

Parameters
aarray_like

The shape and data-type of a define these same attributes of the returned array.

dtypedata-type, optional

Overrides the data type of the result.

order{‘C’, ‘F’}, optional

Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.

chunkssequence of ints

The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.

namestr, optional

An optional keyname for the array. Defaults to hashing the input keyword arguments.

shapeint or sequence of ints, optional.

Overrides the shape of the result.

Returns
outndarray

Array of ones with the same shape and type as a.

See also

zeros_like

Return an array of zeros with shape and type of input.

empty_like

Return an empty array with shape and type of input.

zeros

Return a new array setting values to zero.

ones

Return a new array setting values to one.

empty

Return a new uninitialized array.
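
A short sketch of how the dtype and shape overrides interact with the template array. The template and its chunking are made up for illustration.

```python
import numpy as np
import dask.array as da

template = da.zeros((4, 4), chunks=(2, 2), dtype="int32")

same = da.ones_like(template)                        # inherits shape and dtype
as_float = da.ones_like(template, dtype="float64")   # dtype overridden
reshaped = da.ones_like(template, shape=(2, 8), chunks=(2, 4))  # shape overridden

print(same.dtype, as_float.dtype, reshaped.shape)
```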

dask.array.outer(a, b)

Compute the outer product of two vectors.

This docstring was copied from numpy.outer.

Some inconsistencies with the Dask version may exist.

Given two vectors, a = [a0, a1, ..., aM] and b = [b0, b1, ..., bN], the outer product [1] is:

[[a0*b0  a0*b1 ... a0*bN ]
 [a1*b0    .
 [ ...          .
 [aM*b0            aM*bN ]]
Parameters
a(M,) array_like

First input vector. Input is flattened if not already 1-dimensional.

b(N,) array_like

Second input vector. Input is flattened if not already 1-dimensional.

out(M, N) ndarray, optional

A location where the result is stored

New in version 1.9.0.

Returns
out(M, N) ndarray

out[i, j] = a[i] * b[j]

See also

inner
einsum

einsum('i,j->ij', a.ravel(), b.ravel()) is the equivalent.

ufunc.outer

A generalization to N dimensions and other operations. np.multiply.outer(a.ravel(), b.ravel()) is the equivalent.

References

[1] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD, Johns Hopkins University Press, 1996, pg. 8.

Examples

Make a (very coarse) grid for computing a Mandelbrot set:

>>> rl = np.outer(np.ones((5,)), np.linspace(-2, 2, 5))  
>>> rl  
array([[-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.],
       [-2., -1.,  0.,  1.,  2.]])
>>> im = np.outer(1j*np.linspace(2, -2, 5), np.ones((5,)))  
>>> im  
array([[0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j, 0.+2.j],
       [0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j, 0.+1.j],
       [0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
       [0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j, 0.-1.j],
       [0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j, 0.-2.j]])
>>> grid = rl + im  
>>> grid  
array([[-2.+2.j, -1.+2.j,  0.+2.j,  1.+2.j,  2.+2.j],
       [-2.+1.j, -1.+1.j,  0.+1.j,  1.+1.j,  2.+1.j],
       [-2.+0.j, -1.+0.j,  0.+0.j,  1.+0.j,  2.+0.j],
       [-2.-1.j, -1.-1.j,  0.-1.j,  1.-1.j,  2.-1.j],
       [-2.-2.j, -1.-2.j,  0.-2.j,  1.-2.j,  2.-2.j]])

An example using a “vector” of letters:

>>> x = np.array(['a', 'b', 'c'], dtype=object)  
>>> np.outer(x, [1, 2, 3])  
array([['a', 'aa', 'aaa'],
       ['b', 'bb', 'bbb'],
       ['c', 'cc', 'ccc']], dtype=object)
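
The examples above call np.outer. A minimal dask sketch, assuming both inputs are already one-dimensional dask arrays, checks the result against NumPy:

```python
import numpy as np
import dask.array as da

a = da.arange(1, 4, chunks=2)   # [1, 2, 3]
b = da.arange(1, 5, chunks=2)   # [1, 2, 3, 4]

product = da.outer(a, b)        # lazy array of shape (3, 4)
result = product.compute()
expected = np.outer(np.arange(1, 4), np.arange(1, 5))
print(result)
```
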
dask.array.pad(array, pad_width, mode='constant', **kwargs)

Pad an array.

This docstring was copied from numpy.pad.

Some inconsistencies with the Dask version may exist.

Parameters
arrayarray_like of rank N

The array to pad.

pad_width{sequence, array_like, int}

Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.

modestr or function, optional

One of the following string values or a user supplied function.

‘constant’ (default)

Pads with a constant value.

‘edge’

Pads with the edge values of array.

‘linear_ramp’

Pads with the linear ramp between end_value and the array edge value.

‘maximum’

Pads with the maximum value of all or part of the vector along each axis.

‘mean’

Pads with the mean value of all or part of the vector along each axis.

‘median’

Pads with the median value of all or part of the vector along each axis.

‘minimum’

Pads with the minimum value of all or part of the vector along each axis.

‘reflect’

Pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.

‘symmetric’

Pads with the reflection of the vector mirrored along the edge of the array.

‘wrap’

Pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.

‘empty’

Pads with undefined values.

New in version 1.17.

<function>

Padding function, see Notes.

stat_lengthsequence or int, optional

Used in ‘maximum’, ‘mean’, ‘median’, and ‘minimum’. Number of values at edge of each axis used to calculate the statistic value.

((before_1, after_1), … (before_N, after_N)) unique statistic lengths for each axis.

((before, after),) yields same before and after statistic lengths for each axis.

(stat_length,) or int is a shortcut for before = after = statistic length for all axes.

Default is None, to use the entire axis.

constant_valuessequence or scalar, optional

Used in ‘constant’. The values to set the padded values for each axis.

((before_1, after_1), ... (before_N, after_N)) unique pad constants for each axis.

((before, after),) yields same before and after constants for each axis.

(constant,) or constant is a shortcut for before = after = constant for all axes.

Default is 0.

end_valuessequence or scalar, optional

Used in ‘linear_ramp’. The values used for the ending value of the linear_ramp and that will form the edge of the padded array.

((before_1, after_1), ... (before_N, after_N)) unique end values for each axis.

((before, after),) yields same before and after end values for each axis.

(constant,) or constant is a shortcut for before = after = constant for all axes.

Default is 0.

reflect_type{‘even’, ‘odd’}, optional

Used in ‘reflect’, and ‘symmetric’. The ‘even’ style is the default with an unaltered reflection around the edge value. For the ‘odd’ style, the extended part of the array is created by subtracting the reflected values from two times the edge value.

Returns
padndarray

Padded array of rank equal to array with shape increased according to pad_width.

Notes

New in version 1.7.0.

For an array with rank greater than 1, some of the padding of later axes is calculated from padding of previous axes. This is easiest to think about with a rank 2 array where the corners of the padded array are calculated by using padded values from the first axis.

The padding function, if used, should modify a rank 1 array in-place. It has the following signature:

padding_func(vector, iaxis_pad_width, iaxis, kwargs)

where

vectorndarray

A rank 1 array already padded with zeros. Padded values are vector[:iaxis_pad_width[0]] and vector[-iaxis_pad_width[1]:].

iaxis_pad_widthtuple

A 2-tuple of ints, iaxis_pad_width[0] represents the number of values padded at the beginning of vector where iaxis_pad_width[1] represents the number of values padded at the end of vector.

iaxisint

The axis currently being calculated.

kwargsdict

Any keyword arguments the function requires.

Examples

>>> a = [1, 2, 3, 4, 5]  
>>> np.pad(a, (2, 3), 'constant', constant_values=(4, 6))  
array([4, 4, 1, ..., 6, 6, 6])
>>> np.pad(a, (2, 3), 'edge')  
array([1, 1, 1, ..., 5, 5, 5])
>>> np.pad(a, (2, 3), 'linear_ramp', end_values=(5, -4))  
array([ 5,  3,  1,  2,  3,  4,  5,  2, -1, -4])
>>> np.pad(a, (2,), 'maximum')  
array([5, 5, 1, 2, 3, 4, 5, 5, 5])
>>> np.pad(a, (2,), 'mean')  
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> np.pad(a, (2,), 'median')  
array([3, 3, 1, 2, 3, 4, 5, 3, 3])
>>> a = [[1, 2], [3, 4]]  
>>> np.pad(a, ((3, 2), (2, 3)), 'minimum')  
array([[1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1],
       [3, 3, 3, 4, 3, 3, 3],
       [1, 1, 1, 2, 1, 1, 1],
       [1, 1, 1, 2, 1, 1, 1]])
>>> a = [1, 2, 3, 4, 5]  
>>> np.pad(a, (2, 3), 'reflect')  
array([3, 2, 1, 2, 3, 4, 5, 4, 3, 2])
>>> np.pad(a, (2, 3), 'reflect', reflect_type='odd')  
array([-1,  0,  1,  2,  3,  4,  5,  6,  7,  8])
>>> np.pad(a, (2, 3), 'symmetric')  
array([2, 1, 1, 2, 3, 4, 5, 5, 4, 3])
>>> np.pad(a, (2, 3), 'symmetric', reflect_type='odd')  
array([0, 1, 1, 2, 3, 4, 5, 5, 6, 7])
>>> np.pad(a, (2, 3), 'wrap')  
array([4, 5, 1, 2, 3, 4, 5, 1, 2, 3])
>>> def pad_with(vector, pad_width, iaxis, kwargs):  
...     pad_value = kwargs.get('padder', 10)
...     vector[:pad_width[0]] = pad_value
...     vector[-pad_width[1]:] = pad_value
>>> a = np.arange(6)  
>>> a = a.reshape((2, 3))  
>>> np.pad(a, 2, pad_with)  
array([[10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10,  0,  1,  2, 10, 10],
       [10, 10,  3,  4,  5, 10, 10],
       [10, 10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10, 10]])
>>> np.pad(a, 2, pad_with, padder=100)  
array([[100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100,   0,   1,   2, 100, 100],
       [100, 100,   3,   4,   5, 100, 100],
       [100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100]])
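
The examples above use np.pad on in-memory data. The sketch below applies two of the modes to a chunked dask array and compares against NumPy; it deliberately sticks to the 'edge' and 'constant' modes with simple arguments and makes no claim about the other modes.

```python
import numpy as np
import dask.array as da

data = np.arange(1, 6)                  # [1, 2, 3, 4, 5]
x = da.from_array(data, chunks=2)

edge = da.pad(x, (2, 3), mode="edge")   # 2 values before, 3 after
zeros = da.pad(x, 2, mode="constant", constant_values=0)

edge_result = edge.compute()
zeros_result = zeros.compute()
print(edge_result, zeros_result)
```
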
dask.array.percentile(a, q, interpolation='linear', method='default')

Approximate percentile of 1-D array

Parameters
aArray
qarray_like of float

Percentile or sequence of percentiles to compute, which must be between 0 and 100 inclusive.

interpolation{‘linear’, ‘lower’, ‘higher’, ‘midpoint’, ‘nearest’}, optional

The interpolation method to use when the desired percentile lies between two data points i < j. Only valid for method='dask'.

  • ‘linear’: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.

  • ‘lower’: i.

  • ‘higher’: j.

  • ‘nearest’: i or j, whichever is nearest.

  • ‘midpoint’: (i + j) / 2.

method{‘default’, ‘dask’, ‘tdigest’}, optional

Which method to use. By default, dask’s internal custom algorithm ('dask') is used. If set to 'tdigest', tdigest is used for floats and ints, falling back to 'dask' otherwise.

See also

numpy.percentile

Numpy’s equivalent Percentile function
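
Because the result is approximate when the data spans several chunks, it can help to compare it with NumPy on a small input. This is a sketch with made-up data; how close the two results are depends on the data and the chunking.

```python
import numpy as np
import dask.array as da

data = np.linspace(0.0, 100.0, 1001)
x = da.from_array(data, chunks=100)      # the input must be one-dimensional

q = [25, 50, 75]
approx = da.percentile(x, q).compute()   # one value per requested percentile
exact = np.percentile(data, q)
print(approx, exact)
```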

dask.array.piecewise(x, condlist, funclist, *args, **kw)

Evaluate a piecewise-defined function.

This docstring was copied from numpy.piecewise.

Some inconsistencies with the Dask version may exist.

Given a set of conditions and corresponding functions, evaluate each function on the input data wherever its condition is true.

Parameters
xndarray or scalar

The input domain.

condlistlist of bool arrays or bool scalars

Each boolean array corresponds to a function in funclist. Wherever condlist[i] is True, funclist[i](x) is used as the output value.

Each boolean array in condlist selects a piece of x, and should therefore be of the same shape as x.

The length of condlist must correspond to that of funclist. If one extra function is given, i.e. if len(funclist) == len(condlist) + 1, then that extra function is the default value, used wherever all conditions are false.

funclistlist of callables, f(x,*args,**kw), or scalars

Each function is evaluated over x wherever its corresponding condition is True. It should take a 1d array as input and give a 1d array or a scalar value as output. If, instead of a callable, a scalar is provided then a constant function (lambda x: scalar) is assumed.

argstuple, optional

Any further arguments given to piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., 1, 'a'), then each function is called as f(x, 1, 'a').

kwdict, optional

Keyword arguments used in calling piecewise are passed to the functions upon execution, i.e., if called piecewise(..., ..., alpha=1), then each function is called as f(x, alpha=1).

Returns
outndarray

The output is the same shape and type as x and is found by calling the functions in funclist on the appropriate portions of x, as defined by the boolean arrays in condlist. Portions not covered by any condition have a default value of 0.

See also

choose, select, where

Notes

This is similar to choose or select, except that functions are evaluated on elements of x that satisfy the corresponding condition from condlist.

The result is:

      |--
      |funclist[0](x[condlist[0]])
out = |funclist[1](x[condlist[1]])
      |...
      |funclist[n2](x[condlist[n2]])
      |--

Examples

Define the sigma function, which is -1 for x < 0 and +1 for x >= 0.

>>> x = np.linspace(-2.5, 2.5, 6)  
>>> np.piecewise(x, [x < 0, x >= 0], [-1, 1])  
array([-1., -1., -1.,  1.,  1.,  1.])

Define the absolute value, which is -x for x < 0 and x for x >= 0.

>>> np.piecewise(x, [x < 0, x >= 0], [lambda x: -x, lambda x: x])  
array([2.5,  1.5,  0.5,  0.5,  1.5,  2.5])

Apply the same function to a scalar value.

>>> y = -2  
>>> np.piecewise(y, [y < 0, y >= 0], [lambda x: -x, lambda x: x])  
array(2)
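
The absolute-value example above can also be written against a dask array. In this sketch the conditions are themselves dask arrays derived from x, so they share its shape and chunks; the chunk size is arbitrary.

```python
import numpy as np
import dask.array as da

data = np.linspace(-2.5, 2.5, 6)
x = da.from_array(data, chunks=3)

# Each condition selects the elements its function is applied to.
absolute = da.piecewise(x, [x < 0, x >= 0], [lambda v: -v, lambda v: v])
result = absolute.compute()
print(result)
```
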
dask.array.prod(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Return the product of array elements over a given axis.

This docstring was copied from numpy.prod.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input data.

axisNone or int or tuple of ints, optional

Axis or axes along which a product is performed. The default, axis=None, will calculate the product of all the elements in the input array. If axis is negative it counts from the last to the first axis.

New in version 1.7.0.

If axis is a tuple of ints, a product is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.

dtypedtype, optional

The type of the returned array, as well as of the accumulator in which the elements are multiplied. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

outndarray, optional

Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the prod method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

initialscalar, optional (Not supported in Dask)

The starting value for this product. See ~numpy.ufunc.reduce for details.

New in version 1.15.0.

wherearray_like of bool, optional (Not supported in Dask)

Elements to include in the product. See ~numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
product_along_axisndarray, see dtype parameter above.

An array shaped as a but with the specified axis removed. Returns a reference to out if specified.

See also

ndarray.prod

equivalent method

ufuncs-output-type

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow. That means that, on a 32-bit platform:

>>> x = np.array([536870910, 536870910, 536870910, 536870910])  
>>> np.prod(x)  
16 # may vary

The product of an empty array is the neutral element 1:

>>> np.prod([])  
1.0

Examples

By default, calculate the product of all elements:

>>> np.prod([1.,2.])  
2.0

Even when the input array is two-dimensional:

>>> np.prod([[1.,2.],[3.,4.]])  
24.0

But we can also specify the axis over which to multiply:

>>> np.prod([[1.,2.],[3.,4.]], axis=1)  
array([  2.,  12.])

Or select specific elements to include:

>>> np.prod([1., np.nan, 3.], where=[True, False, True])  
3.0

If the type of x is unsigned, then the output type is the unsigned platform integer:

>>> x = np.array([1, 2, 3], dtype=np.uint8)  
>>> np.prod(x).dtype == np.uint  
True

If x is of a signed integer type, then the output type is the default platform integer:

>>> x = np.array([1, 2, 3], dtype=np.int8)  
>>> np.prod(x).dtype == int  
True

You can also start the product with a value other than one:

>>> np.prod([1, 2], initial=5)  
10
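
The examples above call np.prod, and some of them rely on initial and where, which the parameter list marks as not supported in Dask. A minimal sketch restricted to the parameters in the Dask signature:

```python
import numpy as np
import dask.array as da

data = np.array([[1.0, 2.0], [3.0, 4.0]])
x = da.from_array(data, chunks=(1, 2))

total = da.prod(x).compute()               # product of every element
per_row = da.prod(x, axis=1).compute()     # product along each row
kept = da.prod(x, axis=1, keepdims=True)   # reduced axis kept with size one
print(total, per_row, kept.shape)
```
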
dask.array.ptp(a, axis=None)

Range of values (maximum - minimum) along an axis.

This docstring was copied from numpy.ptp.

Some inconsistencies with the Dask version may exist.

The name of the function comes from the acronym for ‘peak to peak’.

Parameters
aarray_like

Input values.

axisNone or int or tuple of ints, optional

Axis along which to find the peaks. By default, flatten the array. axis may be negative, in which case it counts from the last to the first axis.

New in version 1.15.0.

If this is a tuple of ints, a reduction is performed on multiple axes, instead of a single axis or all the axes as before.

outarray_like (Not supported in Dask)

Alternative output array in which to place the result. It must have the same shape and buffer length as the expected output, but the type of the output values will be cast if necessary.

keepdimsbool, optional (Not supported in Dask)

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the ptp method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

Returns
ptpndarray

A new array holding the result, unless out was specified, in which case a reference to out is returned.

Examples

>>> x = np.arange(4).reshape((2,2))  
>>> x  
array([[0, 1],
       [2, 3]])
>>> np.ptp(x, axis=0)  
array([2, 2])
>>> np.ptp(x, axis=1)  
array([1, 1])
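
The same computation on a chunked dask array, as a small sketch that uses only the a and axis arguments from the Dask signature:

```python
import numpy as np
import dask.array as da

data = np.arange(4).reshape((2, 2))
x = da.from_array(data, chunks=(1, 2))

by_column = da.ptp(x, axis=0).compute()   # max - min down each column
by_row = da.ptp(x, axis=1).compute()      # max - min across each row
print(by_column, by_row)
```
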
dask.array.rad2deg(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.rad2deg.

Some inconsistencies with the Dask version may exist.

Convert angles from radians to degrees.

Parameters
xarray_like

Angle in radians.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

The corresponding angle in degrees. This is a scalar if x is a scalar.

See also

deg2rad

Convert angles from degrees to radians.

unwrap

Remove large jumps in angle by wrapping.

Notes

New in version 1.3.0.

rad2deg(x) is 180 * x / pi.

Examples

>>> np.rad2deg(np.pi/2)  
90.0
dask.array.radians(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.radians.

Some inconsistencies with the Dask version may exist.

Convert angles from degrees to radians.

Parameters
xarray_like

Input array in degrees.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray

The corresponding radian values. This is a scalar if x is a scalar.

See also

deg2rad

equivalent function

Examples

Convert a degree array to radians

>>> deg = np.arange(12.) * 30.  
>>> np.radians(deg)  
array([ 0.        ,  0.52359878,  1.04719755,  1.57079633,  2.0943951 ,
        2.61799388,  3.14159265,  3.66519143,  4.1887902 ,  4.71238898,
        5.23598776,  5.75958653])
>>> out = np.zeros((deg.shape))  
>>> ret = np.radians(deg, out)  
>>> ret is out  
True
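
As a sketch of the two conversions on dask arrays, the following converts degrees to radians and back with rad2deg, then checks the round trip. The out argument shown in the NumPy example is left out here.

```python
import numpy as np
import dask.array as da

deg_values = np.arange(12.0) * 30.0
deg = da.from_array(deg_values, chunks=4)

rad = da.radians(deg)     # lazy elementwise conversion
back = da.rad2deg(rad)    # inverse conversion

rad_result = rad.compute()
back_result = back.compute()
print(rad_result[:3], back_result[:3])
```
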
dask.array.ravel(array)

Return a contiguous flattened array.

This docstring was copied from numpy.ravel.

Some inconsistencies with the Dask version may exist.

A 1-D array, containing the elements of the input, is returned. A copy is made only if needed.

As of NumPy 1.10, the returned array will have the same type as the input array (for example, a masked array will be returned for a masked array input).

Parameters
aarray_like (Not supported in Dask)

Input array. The elements in a are read in the order specified by order, and packed as a 1-D array.

order{‘C’,’F’, ‘A’, ‘K’}, optional (Not supported in Dask)

The elements of a are read using this index order. ‘C’ means to index the elements in row-major, C-style order, with the last axis index changing fastest, back to the first axis index changing slowest. ‘F’ means to index the elements in column-major, Fortran-style order, with the first index changing fastest, and the last index changing slowest. Note that the ‘C’ and ‘F’ options take no account of the memory layout of the underlying array, and only refer to the order of axis indexing. ‘A’ means to read the elements in Fortran-like index order if a is Fortran contiguous in memory, C-like order otherwise. ‘K’ means to read the elements in the order they occur in memory, except for reversing the data when strides are negative. By default, ‘C’ index order is used.

Returns
yarray_like

y is an array of the same subtype as a, with shape (a.size,). Note that matrices are special cased for backward compatibility: if a is a matrix, then y is a 1-D ndarray.

See also

ndarray.flat

1-D iterator over an array.

ndarray.flatten

1-D array copy of the elements of an array in row-major order.

ndarray.reshape

Change the shape of an array without changing its data.

Notes

In row-major, C-style order, in two dimensions, the row index varies the slowest, and the column index the quickest. This can be generalized to multiple dimensions, where row-major order implies that the index along the first axis varies slowest, and the index along the last quickest. The opposite holds for column-major, Fortran-style index ordering.

When a view is desired in as many cases as possible, arr.reshape(-1) may be preferable.

Examples

It is equivalent to reshape(-1, order=order).

>>> x = np.array([[1, 2, 3], [4, 5, 6]])  
>>> np.ravel(x)  
array([1, 2, 3, 4, 5, 6])
>>> x.reshape(-1)  
array([1, 2, 3, 4, 5, 6])
>>> np.ravel(x, order='F')  
array([1, 4, 2, 5, 3, 6])

When order is ‘A’, it will preserve the array’s ‘C’ or ‘F’ ordering:

>>> np.ravel(x.T)  
array([1, 4, 2, 5, 3, 6])
>>> np.ravel(x.T, order='A')  
array([1, 2, 3, 4, 5, 6])

When order is ‘K’, it will preserve orderings that are neither ‘C’ nor ‘F’, but won’t reverse axes:

>>> a = np.arange(3)[::-1]; a  
array([2, 1, 0])
>>> a.ravel(order='C')  
array([2, 1, 0])
>>> a.ravel(order='K')  
array([2, 1, 0])
>>> a = np.arange(12).reshape(2,3,2).swapaxes(1,2); a  
array([[[ 0,  2,  4],
        [ 1,  3,  5]],
       [[ 6,  8, 10],
        [ 7,  9, 11]]])
>>> a.ravel(order='C')  
array([ 0,  2,  4,  1,  3,  5,  6,  8, 10,  7,  9, 11])
>>> a.ravel(order='K')  
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
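
Since the order parameter is marked as not supported in Dask, the sketch below flattens in the default row-major order only. Flattening the transpose is one way to obtain the column-major reading of the original array.

```python
import numpy as np
import dask.array as da

data = np.array([[1, 2, 3], [4, 5, 6]])
x = da.from_array(data, chunks=(1, 3))

flat = da.ravel(x).compute()       # row-major reading of x
flat_t = da.ravel(x.T).compute()   # row-major reading of x.T
print(flat, flat_t)
```
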
dask.array.real(*args, **kwargs)

Return the real part of the complex argument.

This docstring was copied from numpy.real.

Some inconsistencies with the Dask version may exist.

Parameters
valarray_like (Not supported in Dask)

Input array.

Returns
outndarray or scalar

The real component of the complex argument. If val is real, the type of val is used for the output. If val has complex elements, the returned type is float.

See also

real_if_close, imag, angle

Examples

>>> a = np.array([1+2j, 3+4j, 5+6j])  
>>> a.real  
array([1.,  3.,  5.])
>>> a.real = 9  
>>> a  
array([9.+2.j,  9.+4.j,  9.+6.j])
>>> a.real = np.array([9, 8, 7])  
>>> a  
array([9.+2.j,  8.+4.j,  7.+6.j])
>>> np.real(1 + 1j)  
1.0
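
A minimal sketch of extracting the real part from a complex dask array. The in-place assignments shown in the NumPy examples above are not exercised here.

```python
import numpy as np
import dask.array as da

data = np.array([1 + 2j, 3 + 4j, 5 + 6j])
z = da.from_array(data, chunks=2)

real_part = da.real(z)             # lazy float array
real_values = real_part.compute()
print(real_values)
```
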
dask.array.rechunk(x, chunks='auto', threshold=None, block_size_limit=None)

Convert blocks in dask array x for new chunks.

Parameters
x: dask array

Array to be rechunked.

chunks: int, tuple, dict or str, optional

The new block dimensions to create. -1 indicates the full size of the corresponding dimension. Default is “auto” which automatically determines chunk sizes.

threshold: int, optional

The graph growth factor under which we don’t bother introducing an intermediate step.

block_size_limit: int, optional

The maximum block size (in bytes) we want to produce. Defaults to the configuration value array.chunk-size.

Examples

>>> import dask.array as da
>>> x = da.ones((1000, 1000), chunks=(100, 100))

Specify uniform chunk sizes with a tuple

>>> y = x.rechunk((1000, 10))

Or chunk only specific dimensions with a dictionary

>>> y = x.rechunk({0: 1000})

Use the value -1 to specify that you want a single chunk along a dimension, or the value "auto" to specify that dask can freely rechunk a dimension to attain blocks of a uniform block size

>>> y = x.rechunk({0: -1, 1: 'auto'}, block_size_limit=1e8)
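
To see what each call does to the block layout, the number of blocks per axis can be inspected before and after. This sketch uses the function form da.rechunk, which takes the array as its first argument:

```python
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))

tall = da.rechunk(x, (1000, 10))          # one block of rows, 100 blocks of columns
rows_only = da.rechunk(x, {0: 1000})      # axis 1 keeps its original chunks
single = da.rechunk(x, {0: -1, 1: -1})    # -1 means one chunk spanning the axis

print(x.numblocks, tall.numblocks, rows_only.numblocks, single.numblocks)
```
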
dask.array.reduction(x, chunk, aggregate, axis=None, keepdims=False, dtype=None, split_every=None, combine=None, name=None, out=None, concatenate=True, output_size=1, meta=None)

General version of reductions

Parameters
x: Array

Data being reduced along one or more axes

chunk: callable(x_chunk, axis, keepdims)

First function to be executed when resolving the dask graph. This function is applied in parallel to all original chunks of x. See below for function parameters.

combine: callable(x_chunk, axis, keepdims), optional

Function used for intermediate recursive aggregation (see split_every below). If omitted, it defaults to aggregate. If the reduction can be performed in fewer than 3 steps, it will not be invoked at all.

aggregate: callable(x_chunk, axis, keepdims)

Last function to be executed when resolving the dask graph, producing the final output. It is always invoked, even when the reduced Array has a single chunk along the reduced axes.

axis: int or sequence of ints, optional

Axis or axes to aggregate upon. If omitted, aggregate along all axes.

keepdims: boolean, optional

Whether the reduction function should preserve the reduced axes, leaving them at size output_size, or remove them.

dtype: np.dtype, optional

Force output dtype. Defaults to x.dtype if omitted.

split_every: int >= 2 or dict(axis: int), optional

Determines the depth of the recursive aggregation. If set to a value greater than or equal to the number of input chunks, the aggregation will be performed in two steps: one chunk function per input chunk and a single aggregate function at the end. If set to less than that, an intermediate combine function will be used, so that any one combine or aggregate function has no more than split_every inputs. The depth of the aggregation graph will be \(\log_{\text{split\_every}}(\text{input chunks along reduced axes})\). Setting to a low value can reduce cache size and network transfers, at the cost of more CPU and a larger dask graph.

Omit to let dask heuristically decide a good default. A default can also be set globally with the split_every key in dask.config.

name: str, optional

Prefix of the keys of the intermediate and output nodes. If omitted it defaults to the function names.

out: Array, optional

Another dask array whose contents will be replaced. Omit to create a new one. Note that, unlike in numpy, this setting gives no performance benefits whatsoever, but can still be useful if one needs to preserve the references to a previously existing Array.

concatenate: bool, optional

If True (the default), the outputs of the chunk/combine functions are concatenated into a single np.array before being passed to the combine/aggregate functions. If False, the input of combine and aggregate will be either a list of the raw outputs of the previous step or a single output, and the function will have to concatenate it itself. It can be useful to set this to False if the chunk and/or combine steps do not produce np.arrays.

output_size: int >= 1, optional

Size of the output of the aggregate function along the reduced axes. Ignored if keepdims is False.

Returns
dask array
Function Parameters
x_chunk: numpy.ndarray

Individual input chunk. For chunk functions, it is one of the original chunks of x. For combine and aggregate functions, it’s the concatenation of the outputs produced by the previous chunk or combine functions. If concatenate=False, it’s a list of the raw outputs from the previous functions.

axis: tuple

Normalized list of axes to reduce upon, e.g. (0, ). Scalar, negative, and None axes have been normalized away. Note that some numpy reduction functions cannot reduce along multiple axes at once and strictly require an int in input. Such functions have to be wrapped to cope.

keepdims: bool

Whether the reduction function should preserve the reduced axes or remove them.
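
The following sketch builds a sum-of-squares reduction from the pieces described above. The helper names chunk_sum_of_squares and aggregate_sum are hypothetical, combine is left to default to aggregate, and split_every=2 is chosen only so that an intermediate step occurs on this small input.

```python
import numpy as np
import dask.array as da


def chunk_sum_of_squares(x_chunk, axis, keepdims):
    # Runs once per original block: square, then reduce within the block.
    return np.sum(x_chunk ** 2, axis=axis, keepdims=keepdims)


def aggregate_sum(x_chunk, axis, keepdims):
    # Receives concatenated partial results and finishes the reduction.
    return np.sum(x_chunk, axis=axis, keepdims=keepdims)


data = np.arange(12, dtype="float64").reshape((3, 4))
x = da.from_array(data, chunks=(1, 2))

sum_sq = da.reduction(
    x,
    chunk=chunk_sum_of_squares,
    aggregate=aggregate_sum,
    axis=0,
    dtype=x.dtype,
    split_every=2,  # three blocks along axis 0, so partial results are combined first
)
result = sum_sq.compute()
print(result)
```

Squaring happens only in the chunk function. The later steps must add partial sums without squaring again, which is why chunk and aggregate differ here.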

dask.array.repeat(a, repeats, axis=None)

Repeat elements of an array.

This docstring was copied from numpy.repeat.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input array.

repeatsint or array of ints

The number of repetitions for each element. repeats is broadcasted to fit the shape of the given axis.

axisint, optional

The axis along which to repeat values. By default, use the flattened input array, and return a flat output array.

Returns
repeated_arrayndarray

Output array which has the same shape as a, except along the given axis.

See also

tile

Tile an array.

Examples

>>> np.repeat(3, 4)  
array([3, 3, 3, 3])
>>> x = np.array([[1,2],[3,4]])  
>>> np.repeat(x, 2)  
array([1, 1, 2, 2, 3, 3, 4, 4])
>>> np.repeat(x, 3, axis=1)  
array([[1, 1, 1, 2, 2, 2],
       [3, 3, 3, 4, 4, 4]])
>>> np.repeat(x, [1, 2], axis=0)  
array([[1, 2],
       [3, 4],
       [3, 4]])
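
The examples above call np.repeat. This sketch repeats a dask array with an integer repeats value and an explicit axis, which is the combination it assumes to be supported; it makes no claim about array-valued repeats or axis=None on multi-dimensional input.

```python
import numpy as np
import dask.array as da

data = np.array([[1, 2], [3, 4]])
x = da.from_array(data, chunks=(1, 2))

wide = da.repeat(x, 3, axis=1)   # each column repeated three times
tall = da.repeat(x, 2, axis=0)   # each row repeated twice

wide_result = wide.compute()
tall_result = tall.compute()
print(wide_result, tall_result)
```
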
dask.array.reshape(x, shape)

Reshape array to new shape

This is a parallelized version of the np.reshape function with the following limitations:

  1. It assumes that the array is stored in row-major order

  2. It only allows for reshapings that merge or split dimensions, like (1, 2, 3, 4) -> (1, 6, 4) or (64,) -> (4, 4, 4)

When communication is necessary this algorithm depends on the logic within rechunk. It endeavors to keep chunk sizes roughly the same when possible.
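
The two reshapings named in the list above can be sketched as follows. The input data and chunk sizes are arbitrary, and the results are checked against np.reshape.

```python
import numpy as np
import dask.array as da

flat = da.arange(64, chunks=16)
cube = da.reshape(flat, (4, 4, 4))       # (64,) -> (4, 4, 4)

block_data = np.arange(24).reshape((1, 2, 3, 4))
block = da.from_array(block_data, chunks=(1, 1, 3, 4))
merged = da.reshape(block, (1, 6, 4))    # (1, 2, 3, 4) -> (1, 6, 4)

cube_result = cube.compute()
merged_result = merged.compute()
print(cube.shape, merged.shape)
```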

dask.array.result_type(*arrays_and_dtypes)

This docstring was copied from numpy.result_type.

Some inconsistencies with the Dask version may exist.

Returns the type that results from applying the NumPy type promotion rules to the arguments.

Type promotion in NumPy works similarly to the rules in languages like C++, with some slight differences. When both scalars and arrays are used, the array’s type takes precedence and the actual value of the scalar is taken into account.

For example, calculating 3*a, where a is an array of 32-bit floats, intuitively should result in a 32-bit float output. If the 3 is a 32-bit integer, the NumPy rules indicate it can’t convert losslessly into a 32-bit float, so a 64-bit float should be the result type. By examining the value of the constant, ‘3’, we see that it fits in an 8-bit integer, which can be cast losslessly into the 32-bit float.

Parameters
arrays_and_dtypes: list of arrays and dtypes

The operands of some operation whose result type is needed.

Returns
out: dtype

The result type.

See also

dtype, promote_types, min_scalar_type, can_cast

Notes

New in version 1.6.0.

The specific algorithm used is as follows.

Categories are determined by first checking which of boolean, integer (int/uint), or floating point (float/complex) is the maximum kind among all the arrays and among all the scalars.

If there are only scalars or the maximum category of the scalars is higher than the maximum category of the arrays, the data types are combined with promote_types() to produce the return value.

Otherwise, min_scalar_type is called on each array, and the resulting data types are all combined with promote_types() to produce the return value.

The set of int values is not a subset of the uint values for types with the same number of bits, something not reflected in min_scalar_type(), but handled as a special case in result_type.

Examples

>>> np.result_type(3, np.arange(7, dtype='i1'))  
dtype('int8')
>>> np.result_type('i4', 'c8')  
dtype('complex128')
>>> np.result_type(3.0, -2)  
dtype('float64')
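The Notes above refer to promote_types() and to the int/uint special case. The short sketch below only uses dtype arguments (no Python scalars), so it does not depend on how scalar values are inspected. The variable names are illustrative.

```python
import numpy as np

# Same-width signed and unsigned integers promote to a wider signed type,
# because neither int8 nor uint8 can hold every value of the other.
int_uint = np.promote_types(np.int8, np.uint8)

# A 32-bit integer cannot be represented losslessly in a 32-bit float.
int_float = np.promote_types(np.int32, np.float32)

# With only dtypes as arguments, result_type agrees with promote_types.
from_result_type = np.result_type(np.int32, np.float32)
```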
dask.array.rint(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.rint.

Some inconsistencies with the Dask version may exist.

Round elements of the array to the nearest integer.

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Output array is same shape and type as x. This is a scalar if x is a scalar.

See also

ceil, floor, trunc

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])  
>>> np.rint(a)  
array([-2., -2., -0.,  0.,  2.,  2.,  2.])
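The example above shows -1.5 and 1.5 both rounding away from zero, which can suggest "round half up". A small additional sketch, using NumPy directly and illustrative input values, shows that values exactly halfway between two integers go to the nearest even integer instead:

```python
import numpy as np

# Inputs that lie exactly halfway between two integers.
halves = np.array([0.5, 1.5, 2.5, 3.5, -0.5, -2.5])

# Each tie is resolved toward the even neighbour: 0, 2, 2, 4, -0, -2.
rounded = np.rint(halves)
```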
dask.array.roll(array, shift, axis=None)

Roll array elements along a given axis.

This docstring was copied from numpy.roll.

Some inconsistencies with the Dask version may exist.

Elements that roll beyond the last position are re-introduced at the first.

Parameters
a: array_like (Not supported in Dask)

Input array.

shift: int or tuple of ints

The number of places by which elements are shifted. If a tuple, then axis must be a tuple of the same size, and each of the given axes is shifted by the corresponding number. If an int while axis is a tuple of ints, then the same value is used for all given axes.

axis: int or tuple of ints, optional

Axis or axes along which elements are shifted. By default, the array is flattened before shifting, after which the original shape is restored.

Returns
res: ndarray

Output array, with the same shape as a.

See also

rollaxis

Roll the specified axis backwards, until it lies in a given position.

Notes

New in version 1.12.0.

Supports rolling over multiple dimensions simultaneously.

Examples

>>> x = np.arange(10)  
>>> np.roll(x, 2)  
array([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
>>> np.roll(x, -2)  
array([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
>>> x2 = np.reshape(x, (2,5))  
>>> x2  
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> np.roll(x2, 1)  
array([[9, 0, 1, 2, 3],
       [4, 5, 6, 7, 8]])
>>> np.roll(x2, -1)  
array([[1, 2, 3, 4, 5],
       [6, 7, 8, 9, 0]])
>>> np.roll(x2, 1, axis=0)  
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, -1, axis=0)  
array([[5, 6, 7, 8, 9],
       [0, 1, 2, 3, 4]])
>>> np.roll(x2, 1, axis=1)  
array([[4, 0, 1, 2, 3],
       [9, 5, 6, 7, 8]])
>>> np.roll(x2, -1, axis=1)  
array([[1, 2, 3, 4, 0],
       [6, 7, 8, 9, 5]])
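The examples above shift one axis at a time. As a sketch of the tuple form described under shift and axis, the following rolls both axes in a single call and compares the result with two successive single-axis rolls. It uses NumPy directly; the variable names are illustrative.

```python
import numpy as np

x2 = np.arange(10).reshape(2, 5)

# Shift rows by 1 and columns by 2 in one call.
both = np.roll(x2, shift=(1, 2), axis=(0, 1))

# The same shifts applied one axis at a time.
stepwise = np.roll(np.roll(x2, 1, axis=0), 2, axis=1)
```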
dask.array.rollaxis(a, axis, start=0)
dask.array.round(a, decimals=0)

Round an array to the given number of decimals.

This docstring was copied from numpy.round.

Some inconsistencies with the Dask version may exist.

See also

around

Equivalent function; see around for details.

dask.array.sign(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.sign.

Some inconsistencies with the Dask version may exist.

Returns an element-wise indication of the sign of a number.

The sign function returns -1 if x < 0, 0 if x==0, 1 if x > 0. nan is returned for nan inputs.

For complex inputs, the sign function returns sign(x.real) + 0j if x.real != 0 else sign(x.imag) + 0j.

complex(nan, 0) is returned for complex nan inputs.

Parameters
x: array_like

Input values.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The sign of x. This is a scalar if x is a scalar.

Notes

There is more than one definition of sign in common use for complex numbers. The definition used here is equivalent to \(x/\sqrt{x*x}\) which is different from a common alternative, \(x/|x|\).

Examples

>>> np.sign([-5., 4.5])  
array([-1.,  1.])
>>> np.sign(0)  
0
>>> np.sign(5-2j)  
(1+0j)
dask.array.signbit(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.signbit.

Some inconsistencies with the Dask version may exist.

Returns element-wise True where signbit is set (less than zero).

Parameters
x: array_like

The input value(s).

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
result: ndarray of bool

Output array, or reference to out if that was supplied. This is a scalar if x is a scalar.

Examples

>>> np.signbit(-1.2)  
True
>>> np.signbit(np.array([1, -2.3, 2.1]))  
array([False,  True, False])
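The phrase "less than zero" is a close but not exact description of the sign bit. The sketch below, using NumPy directly with illustrative values, contrasts signbit with a plain comparison for negative zero:

```python
import numpy as np

values = np.array([-1.0, -0.0, 0.0, 1.0])

# True wherever the IEEE sign bit is set, including negative zero.
bits = np.signbit(values)

# True only for values strictly below zero; -0.0 compares equal to 0.0.
negative = values < 0
```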
dask.array.sin(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.sin.

Some inconsistencies with the Dask version may exist.

Trigonometric sine, element-wise.

Parameters
x: array_like

Angle, in radians (\(2 \pi\) rad equals 360 degrees).

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: array_like

The sine of each element of x. This is a scalar if x is a scalar.

See also

arcsin, sinh, cos

Notes

The sine is one of the fundamental functions of trigonometry (the mathematical study of triangles). Consider a circle of radius 1 centered on the origin. A ray comes in from the \(+x\) axis, makes an angle at the origin (measured counter-clockwise from that axis), and departs from the origin. The \(y\) coordinate of the outgoing ray’s intersection with the unit circle is the sine of that angle. It ranges from -1 for \(x=3\pi / 2\) to +1 for \(\pi / 2.\) The function has zeroes where the angle is a multiple of \(\pi\). Sines of angles between \(\pi\) and \(2\pi\) are negative. The numerous properties of the sine and related functions are included in any standard trigonometry text.

Examples

Print sine of one angle:

>>> np.sin(np.pi/2.)  
1.0

Print sines of an array of angles given in degrees:

>>> np.sin(np.array((0., 30., 45., 60., 90.)) * np.pi / 180. )  
array([ 0.        ,  0.5       ,  0.70710678,  0.8660254 ,  1.        ])

Plot the sine function:

>>> import matplotlib.pylab as plt  
>>> x = np.linspace(-np.pi, np.pi, 201)  
>>> plt.plot(x, np.sin(x))  
>>> plt.xlabel('Angle [rad]')  
>>> plt.ylabel('sin(x)')  
>>> plt.axis('tight')  
>>> plt.show()  
dask.array.sinh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.sinh.

Some inconsistencies with the Dask version may exist.

Hyperbolic sine, element-wise.

Equivalent to 1/2 * (np.exp(x) - np.exp(-x)) or -1j * np.sin(1j*x).

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The corresponding hyperbolic sine values. This is a scalar if x is a scalar.

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83.

Examples

>>> np.sinh(0)  
0.0
>>> np.sinh(np.pi*1j/2)  
1j
>>> np.sinh(np.pi*1j) # (exact value is 0)  
1.2246063538223773e-016j
>>> # Discrepancy due to vagaries of floating point arithmetic.
>>> # Example of providing the optional output parameter
>>> out1 = np.array([0], dtype='d')  
>>> out2 = np.sinh([0.1], out1)  
>>> out2 is out1  
True
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.sinh(np.zeros((3,3)),np.zeros((2,2)))  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
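The two equivalent expressions given in the sinh description can be checked numerically. This is a small sketch using NumPy directly; the sample points are arbitrary.

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 9)

# 1/2 * (exp(x) - exp(-x))
from_exp = 0.5 * (np.exp(x) - np.exp(-x))

# -1j * sin(1j*x); the result is complex with a (near) zero imaginary part.
from_sin = -1j * np.sin(1j * x)

agree_exp = np.allclose(np.sinh(x), from_exp)
agree_sin = np.allclose(np.sinh(x), from_sin.real) and np.allclose(from_sin.imag, 0.0)
```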
dask.array.sqrt(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.sqrt.

Some inconsistencies with the Dask version may exist.

Return the non-negative square-root of an array, element-wise.

Parameters
x: array_like

The values whose square-roots are required.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

An array of the same shape as x, containing the positive square-root of each element in x. If any element in x is complex, a complex array is returned (and the square-roots of negative reals are calculated). If all of the elements in x are real, so is y, with negative elements returning nan. If out was provided, y is a reference to it. This is a scalar if x is a scalar.

See also

lib.scimath.sqrt

A version which returns complex numbers when given negative reals.

Notes

Consistent with common convention, sqrt has as its branch cut the real “interval” [-inf, 0), and is continuous from above on it. A branch cut is a curve in the complex plane across which a given complex function fails to be continuous.

Examples

>>> np.sqrt([1,4,9])  
array([ 1.,  2.,  3.])
>>> np.sqrt([4, -1, -3+4J])  
array([ 2.+0.j,  0.+1.j,  1.+2.j])
>>> np.sqrt([4, -1, np.inf])  
array([ 2., nan, inf])
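To illustrate the difference from lib.scimath.sqrt mentioned under See also, the sketch below compares the two on a negative real input. It uses NumPy directly and assumes np.emath, the public alias for that module.

```python
import numpy as np

# np.sqrt stays in the real domain, so a negative real gives nan.
# errstate silences the "invalid value" RuntimeWarning for this call.
with np.errstate(invalid="ignore"):
    real_result = np.sqrt(-1.0)

# np.emath.sqrt switches to the complex domain for negative reals.
complex_result = np.emath.sqrt(-1.0)
```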
dask.array.square(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.square.

Some inconsistencies with the Dask version may exist.

Return the element-wise square of the input.

Parameters
x: array_like

Input data.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
out: ndarray or scalar

Element-wise x*x, of the same shape and dtype as x. This is a scalar if x is a scalar.

Examples

>>> np.square([-1j, 1])  
array([-1.-0.j,  1.+0.j])
dask.array.squeeze(a, axis=None)

Remove single-dimensional entries from the shape of an array.

This docstring was copied from numpy.squeeze.

Some inconsistencies with the Dask version may exist.

Parameters
a: array_like

Input data.

axis: None or int or tuple of ints, optional

New in version 1.7.0.

Selects a subset of the single-dimensional entries in the shape. If an axis is selected with shape entry greater than one, an error is raised.

Returns
squeezed: ndarray

The input array, but with all or a subset of the dimensions of length 1 removed. This is always a itself or a view into a.

Raises
ValueError

If axis is not None, and an axis being squeezed is not of length 1

See also

expand_dims

The inverse operation, adding singleton dimensions

reshape

Insert, remove, and combine dimensions, and resize existing ones

Examples

>>> x = np.array([[[0], [1], [2]]])  
>>> x.shape  
(1, 3, 1)
>>> np.squeeze(x).shape  
(3,)
>>> np.squeeze(x, axis=0).shape  
(3, 1)
>>> np.squeeze(x, axis=1).shape  
Traceback (most recent call last):
...
ValueError: cannot select an axis to squeeze out which has size not equal to one
>>> np.squeeze(x, axis=2).shape  
(1, 3)
dask.array.stack(seq, axis=0, allow_unknown_chunksizes=False)

Stack arrays along a new axis

Given a sequence of dask arrays, form a new dask array by stacking them along a new dimension (axis=0 by default)

Parameters
seq: list of dask.arrays
axis: int

Dimension along which to align all of the arrays

allow_unknown_chunksizes: bool

Allow unknown chunksizes, such as come from converting from dask dataframes. Dask.array is unable to verify that chunks line up. If data comes from differently aligned sources then this can cause unexpected results.

See also

concatenate

Examples

Create slices

>>> import dask.array as da
>>> import numpy as np
>>> data = [da.from_array(np.ones((4, 4)), chunks=(2, 2))
...          for i in range(3)]
>>> x = da.stack(data, axis=0)
>>> x.shape
(3, 4, 4)
>>> da.stack(data, axis=1).shape
(4, 3, 4)
>>> da.stack(data, axis=-1).shape
(4, 4, 3)

Result is a new dask Array

dask.array.std(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)

Compute the standard deviation along the specified axis.

This docstring was copied from numpy.std.

Some inconsistencies with the Dask version may exist.

Returns the standard deviation, a measure of the spread of a distribution, of the array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

Parameters
a: array_like

Calculate the standard deviation of these values.

axis: None or int or tuple of ints, optional

Axis or axes along which the standard deviation is computed. The default is to compute the standard deviation of the flattened array.

New in version 1.7.0.

If this is a tuple of ints, a standard deviation is performed over multiple axes, instead of a single axis or all the axes as before.

dtype: dtype, optional

Type to use in computing the standard deviation. For arrays of integer type the default is float64, for arrays of float types it is the same as the array type.

out: ndarray, optional

Alternative output array in which to place the result. It must have the same shape as the expected output but the type (of the calculated values) will be cast if necessary.

ddof: int, optional

Means Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. By default ddof is zero.

keepdims: bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the std method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

Returns
standard_deviation: ndarray, see dtype parameter above.

If out is None, return a new array containing the standard deviation, otherwise return a reference to the output array.

See also

var, mean, nanmean, nanstd, nanvar

Notes

The standard deviation is the square root of the average of the squared deviations from the mean, i.e., std = sqrt(mean(abs(x - x.mean())**2)).

The average squared deviation is normally calculated as (abs(x - x.mean())**2).sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of the infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables. The standard deviation computed in this function is the square root of the estimated variance, so even with ddof=1, it will not be an unbiased estimate of the standard deviation per se.

Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.

For floating-point input, the std is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, 2], [3, 4]])  
>>> np.std(a)  
1.1180339887498949 # may vary
>>> np.std(a, axis=0)  
array([1.,  1.])
>>> np.std(a, axis=1)  
array([0.5,  0.5])

In single precision, std() can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)  
>>> a[0, :] = 1.0  
>>> a[1, :] = 0.1  
>>> np.std(a)  
0.45000005

Computing the standard deviation in float64 is more accurate:

>>> np.std(a, dtype=np.float64)  
0.44999999925494177 # may vary
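The effect of ddof described in the Notes can be reproduced by hand. The sketch below uses NumPy directly with illustrative data and compares the explicit formula, with divisors N and N - 1, against np.std:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
n = a.size

# Squared deviations from the mean.
squared_dev = np.abs(a - a.mean()) ** 2

# Divisor N (ddof=0, the default).
population = np.sqrt(squared_dev.sum() / n)

# Divisor N - 1 (ddof=1).
sample = np.sqrt(squared_dev.sum() / (n - 1))
```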
dask.array.sum(a, axis=None, dtype=None, keepdims=False, split_every=None, out=None)

Sum of array elements over a given axis.

This docstring was copied from numpy.sum.

Some inconsistencies with the Dask version may exist.

Parameters
a: array_like

Elements to sum.

axis: None or int or tuple of ints, optional

Axis or axes along which a sum is performed. The default, axis=None, will sum all of the elements of the input array. If axis is negative it counts from the last to the first axis.

New in version 1.7.0.

If axis is a tuple of ints, a sum is performed on all of the axes specified in the tuple instead of a single axis or all the axes as before.

dtype: dtype, optional

The type of the returned array and of the accumulator in which the elements are summed. The dtype of a is used by default unless a has an integer dtype of less precision than the default platform integer. In that case, if a is signed then the platform integer is used while if a is unsigned then an unsigned integer of the same precision as the platform integer is used.

out: ndarray, optional

Alternative output array in which to place the result. It must have the same shape as the expected output, but the type of the output values will be cast if necessary.

keepdims: bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the sum method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

initial: scalar, optional (Not supported in Dask)

Starting value for the sum. See ~numpy.ufunc.reduce for details.

New in version 1.15.0.

where: array_like of bool, optional (Not supported in Dask)

Elements to include in the sum. See ~numpy.ufunc.reduce for details.

New in version 1.17.0.

Returns
sum_along_axis: ndarray

An array with the same shape as a, with the specified axis removed. If a is a 0-d array, or if axis is None, a scalar is returned. If an output array is specified, a reference to out is returned.

See also

ndarray.sum

Equivalent method.

add.reduce

Equivalent functionality of add.

cumsum

Cumulative sum of array elements.

trapz

Integration of array values using the composite trapezoidal rule.

mean, average

Notes

Arithmetic is modular when using integer types, and no error is raised on overflow.

The sum of an empty array is the neutral element 0:

>>> np.sum([])  
0.0

For floating point numbers the numerical precision of sum (and np.add.reduce) is in general limited by directly adding each number individually to the result causing rounding errors in every step. However, often numpy will use a numerically better approach (partial pairwise summation) leading to improved precision in many use-cases. This improved precision is always provided when no axis is given. When axis is given, it will depend on which axis is summed. Technically, to provide the best speed possible, the improved precision is only used when the summation is along the fast axis in memory. Note that the exact precision may vary depending on other parameters. In contrast to NumPy, Python’s math.fsum function uses a slower but more precise approach to summation. Especially when summing a large number of lower precision floating point numbers, such as float32, numerical errors can become significant. In such cases it can be advisable to use dtype="float64" to use a higher precision for the output.
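The rounding error from adding each number individually, and the more precise result from math.fsum, can be seen with the standard library alone. The explicit loop below is an illustrative stand-in for step-by-step accumulation; it is not how NumPy or Dask sum internally.

```python
import math

values = [0.1] * 10

# Add each number to a running total; every step rounds to a float.
running = 0.0
for v in values:
    running += v

# math.fsum tracks the lost low-order bits and rounds once at the end.
exact = math.fsum(values)
```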

Examples

>>> np.sum([0.5, 1.5])  
2.0
>>> np.sum([0.5, 0.7, 0.2, 1.5], dtype=np.int32)  
1
>>> np.sum([[0, 1], [0, 5]])  
6
>>> np.sum([[0, 1], [0, 5]], axis=0)  
array([0, 6])
>>> np.sum([[0, 1], [0, 5]], axis=1)  
array([1, 5])
>>> np.sum([[0, 1], [np.nan, 5]], where=[False, True], axis=1)  
array([1., 5.])

If the accumulator is too small, overflow occurs:

>>> np.ones(128, dtype=np.int8).sum(dtype=np.int8)  
-128

You can also start the sum with a value other than zero:

>>> np.sum([10], initial=5)  
15
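The keepdims parameter is described as making the result broadcast correctly against the input. A short sketch of that, using NumPy directly with illustrative data, normalizes each row by its total:

```python
import numpy as np

a = np.array([[1.0, 3.0], [2.0, 6.0]])

# With keepdims=True the reduced axis stays as a length-1 dimension.
row_totals = a.sum(axis=1, keepdims=True)

# Shape (2, 2) divided by shape (2, 1) broadcasts row by row.
normalized = a / row_totals
```

Without keepdims=True the totals would have shape (2,), and the division would broadcast along columns instead of rows.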
dask.array.take(a, indices, axis=0)

Take elements from an array along an axis.

This docstring was copied from numpy.take.

Some inconsistencies with the Dask version may exist.

When axis is not None, this function does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis. A call such as np.take(arr, indices, axis=3) is equivalent to arr[:,:,:,indices,...].

Explained without fancy indexing, this is equivalent to the following use of ndindex, which sets each of ii, jj, and kk to a tuple of indices:

Ni, Nk = a.shape[:axis], a.shape[axis+1:]
Nj = indices.shape
for ii in ndindex(Ni):
    for jj in ndindex(Nj):
        for kk in ndindex(Nk):
            out[ii + jj + kk] = a[ii + (indices[jj],) + kk]
Parameters
a: array_like (Ni…, M, Nk…)

The source array.

indices: array_like (Nj…)

The indices of the values to extract.

New in version 1.8.0.

Also allow scalars for indices.

axis: int, optional

The axis over which to select values. By default, the flattened input array is used.

out: ndarray, optional (Ni…, Nj…, Nk…)

If provided, the result will be placed in this array. It should be of the appropriate shape and dtype. Note that out is always buffered if mode='raise'; use other modes for better performance.

mode: {‘raise’, ‘wrap’, ‘clip’}, optional (Not supported in Dask)

Specifies how out-of-bounds indices will behave.

  • ‘raise’ – raise an error (default)

  • ‘wrap’ – wrap around

  • ‘clip’ – clip to the range

‘clip’ mode means that all indices that are too large are replaced by the index that addresses the last element along that axis. Note that this disables indexing with negative numbers.

Returns
out: ndarray (Ni…, Nj…, Nk…)

The returned array has the same type as a.

See also

compress

Take elements using a boolean mask

ndarray.take

equivalent method

take_along_axis

Take elements by matching the array and the index arrays

Notes

By eliminating the inner loop in the description above, and using s_ to build simple slice objects, take can be expressed in terms of applying fancy indexing to each 1-d slice:

Ni, Nk = a.shape[:axis], a.shape[axis+1:]
for ii in ndindex(Ni):
    for kk in ndindex(Nk):
        out[ii + s_[...,] + kk] = a[ii + s_[:,] + kk][indices]

For this reason, it is equivalent to (but faster than) the following use of apply_along_axis:

out = np.apply_along_axis(lambda a_1d: a_1d[indices], axis, a)
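The loop formulations above can be run as written once the output array is allocated. The sketch below does so with NumPy directly and compares all of them with np.take. The input shapes, the index values, and the names out_loop and out_slices are illustrative assumptions.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
indices = np.array([[2, 0], [1, 1]])
axis = 1

Ni, Nk = a.shape[:axis], a.shape[axis + 1:]
Nj = indices.shape

# Triple loop over leading (ii), index (jj), and trailing (kk) positions.
out_loop = np.empty(Ni + Nj + Nk, dtype=a.dtype)
for ii in np.ndindex(Ni):
    for jj in np.ndindex(Nj):
        for kk in np.ndindex(Nk):
            out_loop[ii + jj + kk] = a[ii + (indices[jj],) + kk]

# Fancy indexing applied to each 1-d slice along `axis`.
out_slices = np.empty(Ni + Nj + Nk, dtype=a.dtype)
for ii in np.ndindex(Ni):
    for kk in np.ndindex(Nk):
        out_slices[ii + np.s_[...,] + kk] = a[ii + np.s_[:,] + kk][indices]

expected = np.take(a, indices, axis=axis)
```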

Examples

>>> a = [4, 3, 5, 7, 6, 8]  
>>> indices = [0, 1, 4]  
>>> np.take(a, indices)  
array([4, 3, 6])

In this example if a is an ndarray, “fancy” indexing can be used.

>>> a = np.array(a)  
>>> a[indices]  
array([4, 3, 6])

If indices is not one dimensional, the output also has these dimensions.

>>> np.take(a, [[0, 1], [2, 3]])  
array([[4, 3],
       [5, 7]])
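The mode parameter is marked as not supported in Dask, so the following sketch applies to NumPy's np.take only. It shows how out-of-bounds and negative indices are treated under 'clip' and 'wrap'; the index values are illustrative.

```python
import numpy as np

a = np.array([4, 3, 5, 7, 6, 8])

# 'clip': -1 is clipped to index 0, and 10 to the last index, 5.
clipped = np.take(a, [-1, 2, 10], mode="clip")

# 'wrap': indices are taken modulo len(a), so -1 -> 5 and 10 -> 4.
wrapped = np.take(a, [-1, 2, 10], mode="wrap")
```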
dask.array.tan(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.tan.

Some inconsistencies with the Dask version may exist.

Compute tangent element-wise.

Equivalent to np.sin(x)/np.cos(x) element-wise.

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The corresponding tangent values. This is a scalar if x is a scalar.

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972.

Examples

>>> from math import pi  
>>> np.tan(np.array([-pi,pi/2,pi]))  
array([  1.22460635e-16,   1.63317787e+16,  -1.22460635e-16])
>>>
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d')  
>>> out2 = np.tan([0.1], out1)  
>>> out2 is out1  
True
>>>
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tan(np.zeros((3,3)),np.zeros((2,2)))  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
dask.array.tanh(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.tanh.

Some inconsistencies with the Dask version may exist.

Compute hyperbolic tangent element-wise.

Equivalent to np.sinh(x)/np.cosh(x) or -1j * np.tan(1j*x).

Parameters
x: array_like

Input array.

out: ndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

where: array_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
y: ndarray

The corresponding hyperbolic tangent values. This is a scalar if x is a scalar.

Notes

If out is provided, the function writes the result into it, and returns a reference to out. (See Examples)

References

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions. New York, NY: Dover, 1972, pg. 83. http://www.math.sfu.ca/~cbm/aands/

[2] Wikipedia, “Hyperbolic function”, https://en.wikipedia.org/wiki/Hyperbolic_function

Examples

>>> np.tanh((0, np.pi*1j, np.pi*1j/2))  
array([ 0. +0.00000000e+00j,  0. -1.22460635e-16j,  0. +1.63317787e+16j])
>>> # Example of providing the optional output parameter illustrating
>>> # that what is returned is a reference to said parameter
>>> out1 = np.array([0], dtype='d')  
>>> out2 = np.tanh([0.1], out1)  
>>> out2 is out1  
True
>>> # Example of ValueError due to provision of shape mis-matched `out`
>>> np.tanh(np.zeros((3,3)),np.zeros((2,2)))  
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,3) (2,2)
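
The two identities quoted in the description of tanh can be checked numerically. The following is a minimal sketch using plain NumPy; the sample grid and the names `direct`, `via_ratio`, and `via_tan` are illustrative assumptions, not part of the API.

```python
import numpy as np

# Arbitrary real sample points, chosen only for illustration.
x = np.linspace(-2.0, 2.0, 9)

direct = np.tanh(x)
via_ratio = np.sinh(x) / np.cosh(x)   # sinh(x) / cosh(x)
via_tan = -1j * np.tan(1j * x)        # complex-valued; imaginary part is ~0

print(np.allclose(direct, via_ratio), np.allclose(direct, via_tan))
```

The second form returns a complex array, so the ratio form is usually the more convenient one for real input.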
dask.array.tensordot(lhs, rhs, axes=2)

Compute tensor dot product along specified axes.

This docstring was copied from numpy.tensordot.

Some inconsistencies with the Dask version may exist.

Given two tensors, a and b, and an array_like object containing two array_like objects, (a_axes, b_axes), sum the products of a’s and b’s elements (components) over the axes specified by a_axes and b_axes. The third argument can be a single non-negative integer_like scalar, N; if it is such, then the last N dimensions of a and the first N dimensions of b are summed over.

Parameters
a, barray_like

Tensors to “dot”.

axesint or (2,) array_like
  • integer_like If an int N, sum over the last N axes of a and the first N axes of b in order. The sizes of the corresponding axes must match.

  • (2,) array_like Or, a list of axes to be summed over, first sequence applying to a, second to b. Both elements array_like must be of the same length.

Returns
outputndarray

The tensor dot product of the input.

See also

dot, einsum

Notes

Three common use cases are:
  • axes = 0 : tensor product \(a\otimes b\)

  • axes = 1 : tensor dot product \(a\cdot b\)

  • axes = 2 : (default) tensor double contraction \(a:b\)

When axes is integer_like, the sequence for evaluation will be: first the -Nth axis in a and 0th axis in b, and the -1th axis in a and (N-1)th axis in b last.

When there is more than one axis to sum over - and they are not the last (first) axes of a (b) - the argument axes should consist of two sequences of the same length, with the first axis to sum over given first in both sequences, the second axis second, and so forth.

The shape of the result consists of the non-contracted axes of the first tensor, followed by the non-contracted axes of the second.

Examples

A “traditional” example:

>>> a = np.arange(60.).reshape(3,4,5)  
>>> b = np.arange(24.).reshape(4,3,2)  
>>> c = np.tensordot(a,b, axes=([1,0],[0,1]))  
>>> c.shape  
(5, 2)
>>> c  
array([[4400., 4730.],
       [4532., 4874.],
       [4664., 5018.],
       [4796., 5162.],
       [4928., 5306.]])
>>> # A slower but equivalent way of computing the same...
>>> d = np.zeros((5,2))  
>>> for i in range(5):  
...   for j in range(2):
...     for k in range(3):
...       for n in range(4):
...         d[i,j] += a[k,n,i] * b[n,k,j]
>>> c == d  
array([[ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True],
       [ True,  True]])

An extended example taking advantage of the overloading of + and *:

>>> a = np.array(range(1, 9))  
>>> a.shape = (2, 2, 2)  
>>> A = np.array(('a', 'b', 'c', 'd'), dtype=object)  
>>> A.shape = (2, 2)  
>>> a; A  
array([[[1, 2],
        [3, 4]],
       [[5, 6],
        [7, 8]]])
array([['a', 'b'],
       ['c', 'd']], dtype=object)
>>> np.tensordot(a, A) # third argument default is 2 for double-contraction  
array(['abbcccdddd', 'aaaaabbbbbbcccccccdddddddd'], dtype=object)
>>> np.tensordot(a, A, 1)  
array([[['acc', 'bdd'],
        ['aaacccc', 'bbbdddd']],
       [['aaaaacccccc', 'bbbbbdddddd'],
        ['aaaaaaacccccccc', 'bbbbbbbdddddddd']]], dtype=object)
>>> np.tensordot(a, A, 0) # tensor product (result too long to incl.)  
array([[[[['a', 'b'],
          ['c', 'd']],
          ...
>>> np.tensordot(a, A, (0, 1))  
array([[['abbbbb', 'cddddd'],
        ['aabbbbbb', 'ccdddddd']],
       [['aaabbbbbbb', 'cccddddddd'],
        ['aaaabbbbbbbb', 'ccccdddddddd']]], dtype=object)
>>> np.tensordot(a, A, (2, 1))  
array([[['abb', 'cdd'],
        ['aaabbbb', 'cccdddd']],
       [['aaaaabbbbbb', 'cccccdddddd'],
        ['aaaaaaabbbbbbbb', 'cccccccdddddddd']]], dtype=object)
>>> np.tensordot(a, A, ((0, 1), (0, 1)))  
array(['abbbcccccddddddd', 'aabbbbccccccdddddddd'], dtype=object)
>>> np.tensordot(a, A, ((2, 1), (1, 0)))  
array(['acccbbdddd', 'aaaaacccccccbbbbbbdddddddd'], dtype=object)
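
The shape rule and the three integer cases from the Notes can be illustrated with small arrays. This is a sketch in plain NumPy under the assumption that an explicit einsum expression is an acceptable reference; the subscript string and variable names are chosen for this example only.

```python
import numpy as np

a = np.arange(60.).reshape(3, 4, 5)
b = np.arange(24.).reshape(4, 3, 2)

# Contract axis 1 of a with axis 0 of b, and axis 0 of a with axis 1 of b.
c = np.tensordot(a, b, axes=([1, 0], [0, 1]))
# The same contraction written out with explicit index labels.
c_einsum = np.einsum("kni,nkj->ij", a, b)

m = np.arange(6.).reshape(2, 3)
n = np.arange(12.).reshape(3, 4)
outer = np.tensordot(m, n, axes=0)    # tensor product, shape (2, 3, 3, 4)
matmul = np.tensordot(m, n, axes=1)   # same values as m @ n, shape (2, 4)

s = np.arange(9.).reshape(3, 3)
double = np.tensordot(s, s, axes=2)   # double contraction, a 0-d result

print(c.shape, outer.shape, matmul.shape)
```

In each case the result keeps the non-contracted axes of the first argument followed by those of the second.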
dask.array.tile(A, reps)

Construct an array by repeating A the number of times given by reps.

This docstring was copied from numpy.tile.

Some inconsistencies with the Dask version may exist.

If reps has length d, the result will have dimension of max(d, A.ndim).

If A.ndim < d, A is promoted to be d-dimensional by prepending new axes. So a shape (3,) array is promoted to (1, 3) for 2-D replication, or shape (1, 1, 3) for 3-D replication. If this is not the desired behavior, promote A to d-dimensions manually before calling this function.

If A.ndim > d, reps is promoted to A.ndim by pre-pending 1’s to it. Thus for an A of shape (2, 3, 4, 5), a reps of (2, 2) is treated as (1, 1, 2, 2).

Note: Although tile may be used for broadcasting, it is strongly recommended to use NumPy’s broadcasting operations and functions.

Parameters
Aarray_like

The input array.

repsarray_like

The number of repetitions of A along each axis.

Returns
cndarray

The tiled output array.

See also

repeat

Repeat elements of an array.

broadcast_to

Broadcast an array to a new shape

Examples

>>> a = np.array([0, 1, 2])  
>>> np.tile(a, 2)  
array([0, 1, 2, 0, 1, 2])
>>> np.tile(a, (2, 2))  
array([[0, 1, 2, 0, 1, 2],
       [0, 1, 2, 0, 1, 2]])
>>> np.tile(a, (2, 1, 2))  
array([[[0, 1, 2, 0, 1, 2]],
       [[0, 1, 2, 0, 1, 2]]])
>>> b = np.array([[1, 2], [3, 4]])  
>>> np.tile(b, 2)  
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
>>> np.tile(b, (2, 1))  
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])
>>> c = np.array([1,2,3,4])  
>>> np.tile(c,(4,1))  
array([[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]])
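
The promotion rules for reps and A described above can be confirmed by inspecting result shapes. A minimal NumPy sketch, with array contents and names chosen arbitrarily for illustration:

```python
import numpy as np

# A.ndim > len(reps): reps (2, 2) is treated as (1, 1, 2, 2).
A = np.zeros((2, 3, 4, 5))
tiled = np.tile(A, (2, 2))
same = np.tile(A, (1, 1, 2, 2))

# A.ndim < len(reps): a shape (3,) array is promoted to (1, 3) first.
v = np.array([0, 1, 2])
promoted = np.tile(v, (2, 2))

# Repeating rows can also be expressed as a broadcast, which avoids a copy.
rows_tile = np.tile(v, (4, 1))
rows_bcast = np.broadcast_to(v, (4, 3))

print(tiled.shape, promoted.shape)
```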
dask.array.topk(a, k, axis=-1, split_every=None)

Extract the k largest elements from a on the given axis, and return them sorted from largest to smallest. If k is negative, extract the -k smallest elements instead, and return them sorted from smallest to largest.

This performs best when k is much smaller than the chunk size. All results will be returned in a single chunk along the given axis.

Parameters
a: Array

Data being sorted.

k: int
axis: int, optional
split_every: int >=2, optional

See reduce(). This parameter matters most when k is of the same order of magnitude as the chunk size or larger: it keeps all, or a large part, of the input array from being loaded into memory at once, which would also increase network transfer when running on a distributed cluster.

Returns
Selection of a with size abs(k) along the given axis.

Examples

>>> import dask.array as da
>>> x = np.array([5, 1, 3, 6])
>>> d = da.from_array(x, chunks=2)
>>> d.topk(2).compute()
array([6, 5])
>>> d.topk(-2).compute()
array([1, 3])
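
The same selection can be applied along an axis of a multi-dimensional array. The sketch below assumes a small 2-D input with arbitrary chunking and compares the result against a sort-based NumPy reference; the data and chunk sizes are illustrative only.

```python
import numpy as np
import dask.array as da

x = np.array([[5, 1, 3, 6],
              [2, 9, 4, 7]])
d = da.from_array(x, chunks=(1, 2))   # arbitrary chunking for illustration

# Two largest per row, sorted from largest to smallest.
largest = da.topk(d, 2, axis=-1).compute()
# Negative k: two smallest per row, sorted from smallest to largest.
smallest = da.topk(d, -2, axis=-1).compute()

# Sort-based references computed with NumPy.
ref_largest = -np.sort(-x, axis=-1)[:, :2]
ref_smallest = np.sort(x, axis=-1)[:, :2]

print(largest)
print(smallest)
```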
dask.array.transpose(a, axes=None)

Permute the dimensions of an array.

This docstring was copied from numpy.transpose.

Some inconsistencies with the Dask version may exist.

Parameters
aarray_like

Input array.

axeslist of ints, optional

By default, reverse the dimensions, otherwise permute the axes according to the values given.

Returns
pndarray

a with its axes permuted. A view is returned whenever possible.

See also

moveaxis
argsort

Notes

Use transpose(a, argsort(axes)) to invert the transposition of tensors when using the axes keyword argument.

Transposing a 1-D array returns an unchanged view of the original array.

Examples

>>> x = np.arange(4).reshape((2,2))  
>>> x  
array([[0, 1],
       [2, 3]])
>>> np.transpose(x)  
array([[0, 2],
       [1, 3]])
>>> x = np.ones((1, 2, 3))  
>>> np.transpose(x, (1, 0, 2)).shape  
(2, 1, 3)
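
The note on inverting a transposition with argsort can be verified directly. A minimal NumPy sketch, where the permutation `axes` is an arbitrary choice for illustration:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
axes = (2, 0, 1)                      # arbitrary permutation

y = np.transpose(x, axes)             # y.shape[i] == x.shape[axes[i]]
restored = np.transpose(y, np.argsort(axes))

print(y.shape, restored.shape)
```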
dask.array.tril(m, k=0)

Lower triangle of an array with elements above the k-th diagonal zeroed.

Parameters
marray_like, shape (M, M)

Input array.

kint, optional

Diagonal above which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.

Returns
trilndarray, shape (M, M)

Lower triangle of m, of same shape and data-type as m.

See also

triu

upper triangle of an array

dask.array.triu(m, k=0)

Upper triangle of an array with elements below the k-th diagonal zeroed.

Parameters
marray_like, shape (M, N)

Input array.

kint, optional

Diagonal below which to zero elements. k = 0 (the default) is the main diagonal, k < 0 is below it and k > 0 is above.

Returns
triundarray, shape (M, N)

Upper triangle of m, of same shape and data-type as m.

See also

tril

lower triangle of an array
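
tril and triu are complementary: the lower triangle up to diagonal k plus the upper triangle from diagonal k + 1 reconstructs the input. The sketch below assumes a square input with square chunks, which keeps it within the most restrictive shape stated above; values and chunk sizes are illustrative.

```python
import numpy as np
import dask.array as da

m = np.arange(1, 17).reshape(4, 4)
d = da.from_array(m, chunks=(2, 2))   # square chunks, chosen for illustration

lower = da.tril(d).compute()          # main diagonal and below
upper = da.triu(d, k=1).compute()     # strictly above the main diagonal

print(lower)
print(upper)
```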

dask.array.trunc(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])

This docstring was copied from numpy.trunc.

Some inconsistencies with the Dask version may exist.

Return the truncated value of the input, element-wise.

The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.

Parameters
xarray_like

Input data.

outndarray, None, or tuple of ndarray and None, optional

A location into which the result is stored. If provided, it must have a shape that the inputs broadcast to. If not provided or None, a freshly-allocated array is returned. A tuple (possible only as a keyword argument) must have length equal to the number of outputs.

wherearray_like, optional

This condition is broadcast over the input. At locations where the condition is True, the out array will be set to the ufunc result. Elsewhere, the out array will retain its original value. Note that if an uninitialized out array is created via the default out=None, locations within it where the condition is False will remain uninitialized.

**kwargs

For other keyword-only arguments, see the ufunc docs.

Returns
yndarray or scalar

The truncated value of each element in x. This is a scalar if x is a scalar.

See also

ceil, floor, rint

Notes

New in version 1.3.0.

Examples

>>> a = np.array([-1.7, -1.5, -0.2, 0.2, 1.5, 1.7, 2.0])  
>>> np.trunc(a)  
array([-1., -1., -0.,  0.,  1.,  1.,  2.])
dask.array.unique(ar, return_index=False, return_inverse=False, return_counts=False)

Find the unique elements of an array.

This docstring was copied from numpy.unique.

Some inconsistencies with the Dask version may exist.

Returns the sorted unique elements of an array. There are three optional outputs in addition to the unique elements:

  • the indices of the input array that give the unique values

  • the indices of the unique array that reconstruct the input array

  • the number of times each unique value comes up in the input array

Parameters
ararray_like

Input array. Unless axis is specified, this will be flattened if it is not already 1-D.

return_indexbool, optional

If True, also return the indices of ar (along the specified axis, if provided, or in the flattened array) that result in the unique array.

return_inversebool, optional

If True, also return the indices of the unique array (for the specified axis, if provided) that can be used to reconstruct ar.

return_countsbool, optional

If True, also return the number of times each unique item appears in ar.

New in version 1.9.0.

axisint or None, optional (Not supported in Dask)

The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis, see the notes for more details. Object arrays or structured arrays that contain objects are not supported if the axis kwarg is used. The default is None.

New in version 1.13.0.

Returns
uniquendarray

The sorted unique values.

unique_indicesndarray, optional

The indices of the first occurrences of the unique values in the original array. Only provided if return_index is True.

unique_inversendarray, optional

The indices to reconstruct the original array from the unique array. Only provided if return_inverse is True.

unique_countsndarray, optional

The number of times each of the unique values comes up in the original array. Only provided if return_counts is True.

New in version 1.9.0.

See also

numpy.lib.arraysetops

Module with a number of other functions for performing set operations on arrays.

Notes

When an axis is specified the subarrays indexed by the axis are sorted. This is done by making the specified axis the first dimension of the array (move the axis to the first dimension to keep the order of the other axes) and then flattening the subarrays in C order. The flattened subarrays are then viewed as a structured type with each element given a label, with the effect that we end up with a 1-D array of structured types that can be treated in the same way as any other 1-D array. The result is that the flattened subarrays are sorted in lexicographic order starting with the first element.

Examples

>>> np.unique([1, 1, 2, 2, 3, 3])  
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])  
>>> np.unique(a)  
array([1, 2, 3])

Return the unique rows of a 2D array (NumPy only; axis is not supported in Dask):

>>> a = np.array([[1, 0, 0], [1, 0, 0], [2, 3, 4]])  
>>> np.unique(a, axis=0)  
array([[1, 0, 0], [2, 3, 4]])

Return the indices of the original array that give the unique values:

>>> a = np.array(['a', 'b', 'b', 'c', 'a'])  
>>> u, indices = np.unique(a, return_index=True)  
>>> u  
array(['a', 'b', 'c'], dtype='<U1')
>>> indices  
array([0, 1, 3])
>>> a[indices]  
array(['a', 'b', 'c'], dtype='<U1')

Reconstruct the input array from the unique values:

>>> a = np.array([1, 2, 6, 4, 2, 3, 2])  
>>> u, indices = np.unique(a, return_inverse=True)  
>>> u  
array([1, 2, 3, 4, 6])
>>> indices  
array([0, 1, 4, ..., 1, 2, 1])
>>> u[indices]  
array([1, 2, 6, ..., 2, 3, 2])
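
The three optional outputs can be requested together, which makes their relationship to the input easier to see. A minimal sketch in plain NumPy, reusing the input from the example above; the variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 6, 4, 2, 3, 2])
u, first_idx, inverse, counts = np.unique(
    a, return_index=True, return_inverse=True, return_counts=True
)

# first_idx picks the first occurrence of each unique value from a,
# inverse rebuilds a from u, and counts sums to the input length.
print(u, first_idx, counts)
```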
dask.array.unravel_index(indices, shape, order='C')

This docstring was copied from numpy.unravel_index.

Some inconsistencies with the Dask version may exist.

Converts a flat index or array of flat indices into a tuple of coordinate arrays.

Parameters
indicesarray_like

An integer array whose elements are indices into the flattened version of an array of dimensions shape. Before version 1.6.0, this function accepted just one index value.

shapetuple of ints

The shape of the array to use for unraveling indices.

Changed in version 1.16.0: Renamed from dims to shape.

order{‘C’, ‘F’}, optional

Determines whether the indices should be viewed as indexing in row-major (C-style) or column-major (Fortran-style) order.

New in version 1.6.0.

Returns
unraveled_coordstuple of ndarray

Each array in the tuple has the same shape as the indices array.

See also

ravel_multi_index

Examples

>>> np.unravel_index([22, 41, 37], (7,6))  
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index([31, 41, 13], (7,6), order='F')  
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index(1621, (6,7,8,9))  
(3, 1, 4, 1)
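
The coordinates returned for each order follow from simple index arithmetic, and ravel_multi_index performs the inverse mapping. A minimal NumPy sketch using the shape from the examples above:

```python
import numpy as np

shape = (7, 6)

# Row-major (C): flat = row * 6 + col
flat_c = np.array([22, 41, 37])
rows, cols = np.unravel_index(flat_c, shape)

# Column-major (F): flat = row + 7 * col
flat_f = np.array([31, 41, 13])
rows_f, cols_f = np.unravel_index(flat_f, shape, order='F')

# Round trip back to flat indices.
back = np.ravel_multi_index((rows, cols), shape)

print(rows, cols)
```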
dask.array.var(a, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)

Compute the variance along the specified axis.

This docstring was copied from numpy.var.

Some inconsistencies with the Dask version may exist.

Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.

Parameters
aarray_like

Array containing numbers whose variance is desired. If a is not an array, a conversion is attempted.

axisNone or int or tuple of ints, optional

Axis or axes along which the variance is computed. The default is to compute the variance of the flattened array.

New in version 1.7.0.

If this is a tuple of ints, a variance is performed over multiple axes, instead of a single axis or all the axes as before.

dtypedata-type, optional

Type to use in computing the variance. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.

outndarray, optional

Alternate output array in which to place the result. It must have the same shape as the expected output, but the type is cast if necessary.

ddofint, optional

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.

keepdimsbool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

If the default value is passed, then keepdims will not be passed through to the var method of sub-classes of ndarray, however any non-default value will be. If the sub-class’ method does not implement keepdims any exceptions will be raised.

Returns
variancendarray, see dtype parameter above

If out=None, returns a new array containing the variance; otherwise, a reference to the output array is returned.

See also

std, mean, nanmean, nanstd, nanvar
ufuncs-output-type

Notes

The variance is the average of the squared deviations from the mean, i.e., var = mean(abs(x - x.mean())**2).

The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead. In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.

Note that for complex numbers, the absolute value is taken before squaring, so that the result is always real and nonnegative.

For floating-point input, the variance is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-accuracy accumulator using the dtype keyword can alleviate this issue.

Examples

>>> a = np.array([[1, 2], [3, 4]])  
>>> np.var(a)  
1.25
>>> np.var(a, axis=0)  
array([1.,  1.])
>>> np.var(a, axis=1)  
array([0.25,  0.25])

In single precision, var() can be inaccurate:

>>> a = np.zeros((2, 512*512), dtype=np.float32)  
>>> a[0, :] = 1.0  
>>> a[1, :] = 0.1  
>>> np.var(a)  
0.20250003

Computing the variance in float64 is more accurate:

>>> np.var(a, dtype=np.float64)  
0.20249999932944759 # may vary
>>> ((1-0.55)**2 + (0.1-0.55)**2)/2  
0.2025
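
The effect of ddof on the divisor can be seen on a small data set whose variance is easy to compute by hand. The sketch below assumes a one-dimensional Dask array with an arbitrary chunk size; the data values are illustrative.

```python
import numpy as np
import dask.array as da

# Mean is 5; squared deviations sum to 32.
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
d = da.from_array(x, chunks=4)

population = da.var(d).compute()        # divisor N = 8
sample = da.var(d, ddof=1).compute()    # divisor N - ddof = 7

# The definition from the Notes, evaluated with NumPy.
by_hand = np.mean(np.abs(x - x.mean()) ** 2)

print(population, sample)
```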
dask.array.vdot(a, b)

This docstring was copied from numpy.vdot.

Some inconsistencies with the Dask version may exist.

Return the dot product of two vectors.

The vdot(a, b) function handles complex numbers differently than dot(a, b). If the first argument is complex the complex conjugate of the first argument is used for the calculation of the dot product.

Note that vdot handles multidimensional arrays differently than dot: it does not perform a matrix product, but flattens input arguments to 1-D vectors first. Consequently, it should only be used for vectors.

Parameters
aarray_like

If a is complex the complex conjugate is taken before calculation of the dot product.

barray_like

Second argument to the dot product.

Returns
outputndarray

Dot product of a and b. Can be an int, float, or complex depending on the types of a and b.

See also

dot

Return the dot product without using the complex conjugate of the first argument.

Examples

>>> a = np.array([1+2j,3+4j])  
>>> b = np.array([5+6j,7+8j])  
>>> np.vdot(a, b)  
(70-8j)
>>> np.vdot(b, a)  
(70+8j)

Note that higher-dimensional arrays are flattened!

>>> a = np.array([[1, 4], [5, 6]])  
>>> b = np.array([[4, 1], [2, 2]])  
>>> np.vdot(a, b)  
30
>>> np.vdot(b, a)  
30
>>> 1*4 + 4*1 + 5*2 + 6*2  
30
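
Both behaviours described above, conjugation of the first argument and flattening of multidimensional input, can be written out explicitly. A minimal NumPy sketch reusing the arrays from the examples:

```python
import numpy as np

a = np.array([1+2j, 3+4j])
b = np.array([5+6j, 7+8j])

with_conj = np.vdot(a, b)              # conjugates a first
explicit = np.dot(np.conj(a), b)       # the same thing spelled out
plain_dot = np.dot(a, b)               # no conjugation

A = np.array([[1, 4], [5, 6]])
B = np.array([[4, 1], [2, 2]])
flat = np.dot(A.ravel(), B.ravel())    # what vdot computes for 2-D input

print(with_conj, plain_dot, flat)
```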
dask.array.vstack(tup, allow_unknown_chunksizes=False)

Stack arrays in sequence vertically (row wise).

This docstring was copied from numpy.vstack.

Some inconsistencies with the Dask version may exist.

This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit.

This function makes most sense for arrays with up to 3 dimensions. For instance, for pixel-data with a height (first axis), width (second axis), and r/g/b channels (third axis). The functions concatenate, stack and block provide more general stacking and concatenation operations.

Parameters
tupsequence of ndarrays

The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.

Returns
stackedndarray

The array formed by stacking the given arrays, will be at least 2-D.

See also

stack

Join a sequence of arrays along a new axis.

hstack

Stack arrays in sequence horizontally (column wise).

dstack

Stack arrays in sequence depth wise (along third dimension).

concatenate

Join a sequence of arrays along an existing axis.

vsplit

Split array into a list of multiple sub-arrays vertically.

block

Assemble arrays from blocks.

Examples

>>> a = np.array([1, 2, 3])  
>>> b = np.array([2, 3, 4])  
>>> np.vstack((a,b))  
array([[1, 2, 3],
       [2, 3, 4]])
>>> a = np.array([[1], [2], [3]])  
>>> b = np.array([[2], [3], [4]])  
>>> np.vstack((a,b))  
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
dask.array.where(condition[, x, y])

This docstring was copied from numpy.where.

Some inconsistencies with the Dask version may exist.

Return elements chosen from x or y depending on condition.

Note

When only condition is provided, this function is a shorthand for np.asarray(condition).nonzero(). Using nonzero directly should be preferred, as it behaves correctly for subclasses. The rest of this documentation covers only the case where all three arguments are provided.

Parameters
conditionarray_like, bool

Where True, yield x, otherwise yield y.

x, yarray_like

Values from which to choose. x, y and condition need to be broadcastable to some shape.

Returns
outndarray

An array with elements from x where condition is True, and elements from y elsewhere.

See also

choose
nonzero

The function that is called when x and y are omitted

Notes

If all the arrays are 1-D, where is equivalent to:

[xv if c else yv
 for c, xv, yv in zip(condition, x, y)]

Examples

>>> a = np.arange(10)  
>>> a  
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.where(a < 5, a, 10*a)  
array([ 0,  1,  2,  3,  4, 50, 60, 70, 80, 90])

This can be used on multidimensional arrays too:

>>> np.where([[True, False], [True, True]],  
...          [[1, 2], [3, 4]],
...          [[9, 8], [7, 6]])
array([[1, 8],
       [3, 4]])

The shapes of x, y, and the condition are broadcast together:

>>> x, y = np.ogrid[:3, :4]  
>>> np.where(x < y, x, 10 + y)  # both x and 10+y are broadcast  
array([[10,  0,  0,  0],
       [10, 11,  1,  1],
       [10, 11, 12,  2]])
>>> a = np.array([[0, 1, 2],  
...               [0, 2, 4],
...               [0, 3, 6]])
>>> np.where(a < 4, a, -1)  # -1 is broadcast  
array([[ 0,  1,  2],
       [ 0,  2, -1],
       [ 0,  3, -1]])
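
The 1-D equivalence given in the Notes, and the single-argument shorthand mentioned at the top, can be checked side by side. A minimal NumPy sketch with arbitrary illustrative values:

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

chosen = np.where(condition, x, y)
by_hand = [xv if c else yv for c, xv, yv in zip(condition, x, y)]

# With only the condition, where returns the same tuple as nonzero.
idx_where = np.where(condition)
idx_nonzero = np.asarray(condition).nonzero()

print(chosen)
```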
dask.array.zeros(*args, **kwargs)

Blocked variant of zeros

Follows the signature of zeros exactly except that it also features optional keyword arguments chunks: int, tuple, or dict and name: str.

Original signature follows below.

zeros(shape, dtype=float, order='C')

Return a new array of given shape and type, filled with zeros.

Parameters
shapeint or tuple of ints

Shape of the new array, e.g., (2, 3) or 2.

dtypedata-type, optional

The desired data-type for the array, e.g., numpy.int8. Default is numpy.float64.

order{‘C’, ‘F’}, optional, default: ‘C’

Whether to store multi-dimensional data in row-major (C-style) or column-major (Fortran-style) order in memory.

Returns
outndarray

Array of zeros with the given shape, dtype, and order.

See also

zeros_like

Return an array of zeros with shape and type of input.

empty

Return a new uninitialized array.

ones

Return a new array setting values to one.

full

Return a new array of given shape filled with value.

Examples

>>> np.zeros(5)
array([ 0.,  0.,  0.,  0.,  0.])
>>> np.zeros((5,), dtype=int)
array([0, 0, 0, 0, 0])
>>> np.zeros((2, 1))
array([[ 0.],
       [ 0.]])
>>> s = (2,2)
>>> np.zeros(s)
array([[ 0.,  0.],
       [ 0.,  0.]])
>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
array([(0, 0), (0, 0)],
      dtype=[('x', '<i4'), ('y', '<i4')])
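
The examples above use the NumPy function. For the blocked variant, the chunks keyword determines how the result is split. A minimal sketch, with the shape and chunk sizes chosen arbitrarily for illustration:

```python
import numpy as np
import dask.array as da

z = da.zeros((4, 6), chunks=(2, 3), dtype=int)

# The array is lazy; chunks records the block sizes along each axis.
print(z.shape, z.chunks)

values = z.compute()   # materialise as a NumPy array
print(values.sum())
```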
dask.array.zeros_like(a, dtype=None, order='C', chunks=None, name=None, shape=None)

Return an array of zeros with the same shape and type as a given array.

Parameters
aarray_like

The shape and data-type of a define these same attributes of the returned array.

dtypedata-type, optional

Overrides the data type of the result.

order{‘C’, ‘F’}, optional

Whether to store multidimensional data in C- or Fortran-contiguous (row- or column-wise) order in memory.

chunkssequence of ints

The number of samples on each block. Note that the last block will have fewer samples if len(array) % chunks != 0.

namestr, optional

An optional keyname for the array. Defaults to hashing the input keyword arguments.

shapeint or sequence of ints, optional.

Overrides the shape of the result.

Returns
outndarray

Array of zeros with the same shape and type as a.

See also

ones_like

Return an array of ones with shape and type of input.

empty_like

Return an empty array with shape and type of input.

zeros

Return a new array setting values to zero.

ones

Return a new array setting values to one.

empty

Return a new uninitialized array.

dask.array.linalg.cholesky(a, lower=False)

Returns the Cholesky decomposition, \(A = L L^*\) or \(A = U^* U\) of a Hermitian positive-definite matrix A.

Parameters
a(M, M) array_like

Matrix to be decomposed

lowerbool, optional

Whether to compute the upper or lower triangular Cholesky factorization. Default is upper-triangular.

Returns
c(M, M) Array

Upper- or lower-triangular Cholesky factor of a.
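
Both factor conventions can be checked by multiplying the factor back together. The sketch below assumes a small real symmetric positive-definite matrix built for the purpose and square chunks; the construction of `spd` and the chunk size are illustrative choices.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
m = rng.random((4, 4))
spd = m @ m.T + 4.0 * np.eye(4)        # symmetric positive definite by construction

d = da.from_array(spd, chunks=(2, 2))  # square chunks, chosen for illustration

upper = da.linalg.cholesky(d).compute()              # default: A = U^T U
lower = da.linalg.cholesky(d, lower=True).compute()  # A = L L^T

print(np.allclose(upper.T @ upper, spd), np.allclose(lower @ lower.T, spd))
```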

dask.array.linalg.inv(a)

Compute the inverse of a matrix with LU decomposition and forward / backward substitutions.

Parameters
aarray_like

Square matrix to be inverted.

Returns
ainvArray

Inverse of the matrix a.

dask.array.linalg.lstsq(a, b)

Return the least-squares solution to a linear matrix equation using QR decomposition.

Solves the equation a x = b by computing a vector x that minimizes the squared Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over-determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation.

Parameters
a(M, N) array_like

“Coefficient” matrix.

b(M,) or (M, K) array_like

Ordinate or “dependent variable” values.

Returns
x(N,) Array

Least-squares solution. If b is two-dimensional, the solutions are in the K columns of x.

residuals(1,) Array

Sums of residuals; squared Euclidean 2-norm for each column in b - a*x.

rankArray

Rank of matrix a.

s(min(M, N),) Array

Singular values of a.
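
A straight-line fit is a convenient way to see the returned values. The sketch below assumes a tall-and-skinny coefficient matrix held in a single column of chunks, with b chunked to match its rows; the data (an exact line y = 2t + 1) and chunk sizes are illustrative.

```python
import numpy as np
import dask.array as da

t = np.arange(20.0)
a = np.column_stack([t, np.ones_like(t)])   # columns: slope term, intercept term
b = 2.0 * t + 1.0                           # exact line, so residuals are ~0

da_a = da.from_array(a, chunks=(10, 2))     # one column of chunks
da_b = da.from_array(b, chunks=10)          # row chunks match da_a

x, residuals, rank, s = da.linalg.lstsq(da_a, da_b)
coeffs = x.compute()
res = residuals.compute()

print(coeffs)
```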

dask.array.linalg.lu(a)

Compute the LU decomposition of a matrix.

Returns
p: Array, permutation matrix
l: Array, lower triangular matrix with unit diagonal.
u: Array, upper triangular matrix

Examples

>>> p, l, u = da.linalg.lu(x)  
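
The one-line example above leaves x undefined. A fuller sketch follows, assuming a small square matrix with square chunks; the matrix is made diagonally dominant purely so the illustration is well conditioned.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
a = rng.random((4, 4)) + 4.0 * np.eye(4)   # diagonally dominant, for illustration
x = da.from_array(a, chunks=(2, 2))        # square chunks

p, l, u = da.linalg.lu(x)
p, l, u = p.compute(), l.compute(), u.compute()

# The factors multiply back to the input: a = p l u.
print(np.allclose(p @ l @ u, a))
```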
dask.array.linalg.norm(x, ord=None, axis=None, keepdims=False)

Matrix or vector norm.

This docstring was copied from numpy.linalg.norm.

Some inconsistencies with the Dask version may exist.

This function is able to return one of eight different matrix norms, or one of an infinite number of vector norms (described below), depending on the value of the ord parameter.

Parameters
xarray_like

Input array. If axis is None, x must be 1-D or 2-D, unless ord is None. If both axis and ord are None, the 2-norm of x.ravel will be returned.

ord{non-zero int, inf, -inf, ‘fro’, ‘nuc’}, optional

Order of the norm (see table under Notes). inf means numpy’s inf object. The default is None.

axis{None, int, 2-tuple of ints}, optional.

If axis is an integer, it specifies the axis of x along which to compute the vector norms. If axis is a 2-tuple, it specifies the axes that hold 2-D matrices, and the matrix norms of these matrices are computed. If axis is None then either a vector norm (when x is 1-D) or a matrix norm (when x is 2-D) is returned. The default is None.

New in version 1.8.0.

keepdimsbool, optional

If this is set to True, the axes which are normed over are left in the result as dimensions with size one. With this option the result will broadcast correctly against the original x.

New in version 1.10.0.

Returns
nfloat or ndarray

Norm of the matrix or vector(s).

Notes

For values of ord <= 0, the result is, strictly speaking, not a mathematical ‘norm’, but it may still be useful for various numerical purposes.

The following norms can be calculated:

ord      norm for matrices               norm for vectors
None     Frobenius norm                  2-norm
‘fro’    Frobenius norm                  --
‘nuc’    nuclear norm                    --
inf      max(sum(abs(x), axis=1))        max(abs(x))
-inf     min(sum(abs(x), axis=1))        min(abs(x))
0        --                              sum(x != 0)
1        max(sum(abs(x), axis=0))        as below
-1       min(sum(abs(x), axis=0))        as below
2        2-norm (largest sing. value)    as below
-2       smallest singular value         as below
other    --                              sum(abs(x)**ord)**(1./ord)

The Frobenius norm is given by [1]:

\(||A||_F = [\sum_{i,j} abs(a_{i,j})^2]^{1/2}\)

The nuclear norm is the sum of the singular values.

References

[1] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD, Johns Hopkins University Press, 1985, pg. 15

Examples

>>> from numpy import linalg as LA  
>>> a = np.arange(9) - 4  
>>> a  
array([-4, -3, -2, ...,  2,  3,  4])
>>> b = a.reshape((3, 3))  
>>> b  
array([[-4, -3, -2],
       [-1,  0,  1],
       [ 2,  3,  4]])
>>> LA.norm(a)  
7.745966692414834
>>> LA.norm(b)  
7.745966692414834
>>> LA.norm(b, 'fro')  
7.745966692414834
>>> LA.norm(a, np.inf)  
4.0
>>> LA.norm(b, np.inf)  
9.0
>>> LA.norm(a, -np.inf)  
0.0
>>> LA.norm(b, -np.inf)  
2.0
>>> LA.norm(a, 1)  
20.0
>>> LA.norm(b, 1)  
7.0
>>> LA.norm(a, -1)  
-4.6566128774142013e-010
>>> LA.norm(b, -1)  
6.0
>>> LA.norm(a, 2)  
7.745966692414834
>>> LA.norm(b, 2)  
7.3484692283495345
>>> LA.norm(a, -2)  
0.0
>>> LA.norm(b, -2)  
1.8570331885190563e-016 # may vary
>>> LA.norm(a, 3)  
5.8480354764257312 # may vary
>>> LA.norm(a, -3)  
0.0

Using the axis argument to compute vector norms:

>>> c = np.array([[ 1, 2, 3],  
...               [-1, 1, 4]])
>>> LA.norm(c, axis=0)  
array([ 1.41421356,  2.23606798,  5.        ])
>>> LA.norm(c, axis=1)  
array([ 3.74165739,  4.24264069])
>>> LA.norm(c, ord=1, axis=1)  
array([ 6.,  6.])

Using the axis argument to compute matrix norms:

>>> m = np.arange(8).reshape(2,2,2)  
>>> LA.norm(m, axis=(1,2))  
array([  3.74165739,  11.22497216])
>>> LA.norm(m[0, :, :]), LA.norm(m[1, :, :])  
(3.7416573867739413, 11.224972160321824)
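
Several rows of the table in the Notes can be checked against their defining expressions. The sketch below uses plain NumPy and the arrays a and b from the examples above; the dictionary `checks` is only a convenience for this illustration.

```python
import numpy as np
from numpy import linalg as LA

a = np.arange(9) - 4
b = a.reshape((3, 3))

checks = {
    "fro": np.isclose(LA.norm(b, "fro"), np.sqrt(np.sum(np.abs(b) ** 2))),
    "nuc": np.isclose(LA.norm(b, "nuc"), np.sum(LA.svd(b, compute_uv=False))),
    "inf": np.isclose(LA.norm(b, np.inf), np.max(np.sum(np.abs(b), axis=1))),
    "-inf": np.isclose(LA.norm(b, -np.inf), np.min(np.sum(np.abs(b), axis=1))),
    "1": np.isclose(LA.norm(b, 1), np.max(np.sum(np.abs(b), axis=0))),
    # Vector case, "other" row with ord = 3.
    "vec3": np.isclose(LA.norm(a, 3), np.sum(np.abs(a) ** 3) ** (1.0 / 3)),
}

print(all(checks.values()))
```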
dask.array.linalg.qr(a)

Compute the QR factorization of a matrix.

Parameters
aArray
Returns
q: Array, orthonormal
r: Array, upper-triangular

See also

numpy.linalg.qr

Equivalent NumPy Operation

dask.array.linalg.tsqr

Implementation for tall-and-skinny arrays

dask.array.linalg.sfqr

Implementation for short-and-fat arrays

Examples

>>> q, r = da.linalg.qr(x)  
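
The one-line example above leaves x undefined. A fuller sketch follows, assuming a tall-and-skinny input held in a single column of chunks; the random data and chunk sizes are illustrative.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
a = rng.random((20, 3))
x = da.from_array(a, chunks=(10, 3))   # tall-and-skinny: one column of chunks

q, r = da.linalg.qr(x)
q, r = q.compute(), r.compute()

# q has orthonormal columns, r is upper-triangular, and q r rebuilds a.
print(q.shape, r.shape)
```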
dask.array.linalg.solve(a, b, sym_pos=False)

Solve the equation a x = b for x. By default, use LU decomposition and forward / backward substitutions. When sym_pos is True, use Cholesky decomposition.

Parameters
a(M, M) array_like

A square matrix.

b(M,) or (M, N) array_like

Right-hand side matrix in a x = b.

sym_posbool

Assume a is symmetric and positive definite. If True, use Cholesky decomposition.

Returns
x(M,) or (M, N) Array

Solution to the system a x = b. Shape of the return matches the shape of b.
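
A minimal sketch of the default (LU-based) path, assuming a small square system with square chunks for a and matching chunks for b. The matrix is made diagonally dominant only to keep the illustration well conditioned; names and sizes are arbitrary.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
a = rng.random((4, 4)) + 4.0 * np.eye(4)
b = rng.random(4)

da_a = da.from_array(a, chunks=(2, 2))
da_b = da.from_array(b, chunks=2)      # chunks line up with those of da_a

x = da.linalg.solve(da_a, da_b).compute()

# Substituting the solution back reproduces the right-hand side.
print(np.allclose(a @ x, b))
```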

dask.array.linalg.solve_triangular(a, b, lower=False)

Solve the equation a x = b for x, assuming a is a triangular matrix.

Parameters
a: (M, M) array_like

A triangular matrix

b: (M,) or (M, N) array_like

Right-hand side matrix in a x = b

lower: bool, optional

Use only data contained in the lower triangle of a. Default is to use upper triangle.

Returns
x: (M,) or (M, N) array

Solution to the system a x = b. Shape of return matches b.

dask.array.linalg.svd(a)

Compute the singular value decomposition of a matrix.

Returns
u: Array, unitary / orthogonal
s: Array, singular values in decreasing order (largest first)
v: Array, unitary / orthogonal

See also

np.linalg.svd

Equivalent NumPy Operation

dask.array.linalg.tsqr

Implementation for tall-and-skinny arrays

Examples

>>> u, s, v = da.linalg.svd(x)  
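
As with qr, the example above does not define x. The sketch below uses an assumed 600 x 8 input stored as a single column of blocks and compares the result against numpy.linalg.svd. Note that the third return value multiplies on the right as-is, that is, x is approximately (u * s) @ v.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)
x_np = rng.standard_normal((600, 8))
x = da.from_array(x_np, chunks=(200, 8))   # tall-and-skinny, one column of blocks

u, s, v = da.linalg.svd(x)
u_np, s_np, v_np = da.compute(u, s, v)

reconstructs = np.allclose((u_np * s_np) @ v_np, x_np)
matches_numpy = np.allclose(s_np, np.linalg.svd(x_np, compute_uv=False))
decreasing = bool(np.all(np.diff(s_np) <= 0))   # largest singular value first
print(reconstructs, matches_numpy, decreasing)
```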
dask.array.linalg.svd_compressed(a, k, n_power_iter=0, seed=None, compute=False)

Randomly compressed rank-k thin Singular Value Decomposition.

This computes the approximate singular value decomposition of a large array. This algorithm is generally faster than the normal algorithm but does not provide exact results. One can balance between performance and accuracy with input parameters (see below).

Parameters
a: Array

Input array

k: int

Rank of the desired thin SVD decomposition.

n_power_iter: int

Number of power iterations, useful when the singular values decay slowly. Error decreases exponentially as n_power_iter increases. In practice, set n_power_iter <= 4.

compute: bool

Whether or not to compute data at each use. Recomputing the input while performing several passes reduces memory pressure, but means that we have to compute the input multiple times. This is a good choice if the data is larger than memory and cheap to recreate.

Returns
u: Array, unitary / orthogonal
s: Array, singular values in decreasing order (largest first)
v: Array, unitary / orthogonal

References

N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev., Survey and Review section, Vol. 53, num. 2, pp. 217-288, June 2011 https://arxiv.org/abs/0909.4061

Examples

>>> u, s, vt = svd_compressed(x, 20)  
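
A hedged sketch of how the approximation behaves on a matrix whose rank is known. The input is constructed (as an assumption for this illustration) to have exact rank 5, so a rank-5 compressed SVD should recover it almost exactly. For general matrices the result is only approximate.

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(0)

# Assumed input: a 200x50 matrix of exact rank 5.
x_np = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 50))
x = da.from_array(x_np, chunks=(50, 50))

k = 5
u, s, vt = da.linalg.svd_compressed(x, k, seed=0)
u_np, s_np, vt_np = da.compute(u, s, vt)

exact_s = np.linalg.svd(x_np, compute_uv=False)[:k]
values_match = np.allclose(s_np, exact_s)
reconstructs = np.allclose((u_np * s_np) @ vt_np, x_np, atol=1e-6)
print(u_np.shape, s_np.shape, vt_np.shape, values_match, reconstructs)
```

When the singular values decay slowly rather than dropping to zero, increasing n_power_iter is the documented way to trade extra passes over the data for accuracy.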
dask.array.linalg.sfqr(data, name=None)

Direct Short-and-Fat QR

Currently, this is a quick hack for non-tall-and-skinny matrices that are one chunk tall and (unless they are one chunk wide) have chunks that are wider than they are tall.

Q [R_1 R_2 …] = [A_1 A_2 …]

It computes the factorization Q R_1 = A_1, then computes the other R_k’s in parallel.

Parameters
data: Array

See also

dask.array.linalg.qr

Main user API that uses this function

dask.array.linalg.tsqr

Variant for tall-and-skinny case
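
The factorization described above can be sketched in plain NumPy. This is an illustration of the idea under assumed block shapes (three 4 x 6 blocks), not the dask implementation: Q comes from the first block only, and every other R_k is obtained independently as Q^T A_k.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed input: one chunk tall, three chunks wide, each block wider than tall.
blocks = [rng.standard_normal((4, 6)) for _ in range(3)]   # A_1, A_2, A_3

# Factor only the first block: Q R_1 = A_1, with Q square (4 x 4).
q, r1 = np.linalg.qr(blocks[0])

# The remaining R_k are independent of each other, so they can run in parallel.
other_rs = [q.T @ a_k for a_k in blocks[1:]]

r = np.hstack([r1] + other_rs)    # [R_1 R_2 R_3]
a = np.hstack(blocks)             # [A_1 A_2 A_3]
reconstructs = np.allclose(q @ r, a)
print(q.shape, r.shape, reconstructs)
```

Because the blocks are at least as wide as they are tall, Q is square and orthogonal, which is why Q^T A_k recovers R_k exactly.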

dask.array.linalg.tsqr(data, compute_svd=False, _max_vchunk_size=None)

Direct Tall-and-Skinny QR algorithm

As presented in:

A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. https://arxiv.org/abs/1301.1071

This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fits in memory.

Parameters
data: Array
compute_svd: bool

Whether to compute the SVD rather than the QR decomposition

_max_vchunk_size: Integer

Used internally in recursion to set the maximum row dimension of chunks in subsequent recursive calls.

See also

dask.array.linalg.qr

Powered by this algorithm

dask.array.linalg.svd

Powered by this algorithm

dask.array.linalg.sfqr

Variant for short-and-fat arrays

Notes

With k blocks of size (m, n), this algorithm has memory use that scales as k * n * n.

The implementation here is the recursive variant due to the ultimate need for one “single core” QR decomposition. In the non-recursive version of the algorithm, given k blocks, after k QR decompositions of (m, n) blocks, there will be a “single core” QR decomposition that will have to work with a (k * n, n) matrix.

Here, recursion is applied as necessary to ensure that k * n is not larger than m (if m / n >= 2). In particular, this is done to ensure that single core computations do not have to work on blocks larger than (m, n).

Where blocks are irregular, the above logic is applied with the “height” of the “tallest” block used in place of m.

Consider use of the rechunk method to control this behavior. Taller blocks will reduce overall memory use (assuming that many of them still fit in memory at once).
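
The non-recursive variant described in these notes can be sketched in plain NumPy. This is an illustration under assumed sizes (k = 4 blocks of shape (50, 3)), not the dask implementation, which adds recursion on top of the same steps.

```python
import numpy as np

rng = np.random.default_rng(0)
k, m, n = 4, 50, 3                       # k blocks, each of shape (m, n)
a = rng.standard_normal((k * m, n))
blocks = np.split(a, k)                  # a single column of k row-blocks

# Step 1: an independent QR decomposition of every (m, n) block.
local = [np.linalg.qr(block) for block in blocks]
local_qs = [q_i for q_i, _ in local]                 # each (m, n)
stacked_r = np.vstack([r_i for _, r_i in local])     # shape (k * n, n)

# Step 2: the "single core" QR decomposition of the stacked R factors.
q2, r = np.linalg.qr(stacked_r)          # q2 is (k * n, n), r is (n, n)

# Step 3: combine each local Q with the matching (n, n) slice of q2.
q = np.vstack([q_i @ q2_i for q_i, q2_i in zip(local_qs, np.split(q2, k))])

print(stacked_r.shape, np.allclose(q @ r, a))
```

The (k * n, n) intermediate in step 2 is the quantity that the memory estimate above refers to; taller blocks mean a smaller k and therefore a smaller stacked matrix.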

dask.array.ma.average(a, axis=None, weights=None, returned=False)

Return the weighted average of array over the given axis.

This docstring was copied from numpy.ma.average.

Some inconsistencies with the Dask version may exist.

Parameters
a: array_like

Data to be averaged. Masked entries are not taken into account in the computation.

axis: int, optional

Axis along which to average a. If None, averaging is done over the flattened array.

weights: array_like, optional

The importance that each element has in the computation of the average. The weights array can either be 1-D (in which case its length must be the size of a along the given axis) or of the same shape as a. If weights=None, then all data in a are assumed to have a weight equal to one. The 1-D calculation is:

avg = sum(a * weights) / sum(weights)

The only constraint on weights is that sum(weights) must not be 0.

returned: bool, optional

Flag indicating whether a tuple (result, sum of weights) should be returned as output (True), or just the result (False). Default is False.

Returns
average, [sum_of_weights]: (tuple of) scalar or MaskedArray

The average along the specified axis. When returned is True, return a tuple with the average as the first element and the sum of the weights as the second element. The return type is np.float64 if a is of integer type or of a float type smaller than float64; otherwise it is the input data type. If returned, sum_of_weights is always float64.

Examples

>>> a = np.ma.array([1., 2., 3., 4.], mask=[False, False, True, True])  
>>> np.ma.average(a, weights=[3, 1, 0, 0])  
1.25
>>> x = np.ma.arange(6.).reshape(3, 2)  
>>> x  
masked_array(
  data=[[0., 1.],
        [2., 3.],
        [4., 5.]],
  mask=False,
  fill_value=1e+20)
>>> avg, sumweights = np.ma.average(x, axis=0, weights=[1, 2, 3],  
...                                 returned=True)
>>> avg  
masked_array(data=[2.6666666666666665, 3.6666666666666665],
             mask=[False, False],
       fill_value=1e+20)
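
A small worked check of the 1-D formula avg = sum(a * weights) / sum(weights), using numpy.ma as in the examples above. The weights here are chosen (as an assumption for illustration) to be non-zero at the masked positions, to show that masked entries contribute to neither the numerator nor the sum of weights.

```python
import numpy as np

a = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, False, True, True])
weights = np.array([3.0, 1.0, 5.0, 5.0])   # last two weights sit on masked entries

avg, sum_of_weights = np.ma.average(a, weights=weights, returned=True)

# The same calculation by hand, restricted to the unmasked entries.
keep = ~np.ma.getmaskarray(a)
manual = np.sum(a.data[keep] * weights[keep]) / np.sum(weights[keep])

print(avg, sum_of_weights, manual)
```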
dask.array.ma.filled(a, fill_value=None)

Return input as an array with masked data replaced by a fill value.

This docstring was copied from numpy.ma.filled.

Some inconsistencies with the Dask version may exist.

If a is not a MaskedArray, a itself is returned. If a is a MaskedArray and fill_value is None, fill_value is set to a.fill_value.

Parameters
a: MaskedArray or array_like

An input object.

fill_value: array_like, optional

Can be scalar or non-scalar. If non-scalar, the resulting filled array should be broadcastable over input array. Default is None.

Returns
a: ndarray

The filled array.

See also

compressed

Examples

>>> x = np.ma.array(np.arange(9).reshape(3, 3), mask=[[1, 0, 0],  
...                                                   [1, 0, 0],
...                                                   [0, 0, 0]])
>>> x.filled()  
array([[999999,      1,      2],
       [999999,      4,      5],
       [     6,      7,      8]])
>>> x.filled(fill_value=333)  
array([[333,   1,   2],
       [333,   4,   5],
       [  6,   7,   8]])
>>> x.filled(fill_value=np.arange(3))  
array([[0, 1, 2],
       [0, 4, 5],
       [6, 7, 8]])
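
The examples above are the NumPy ones copied with the docstring. A minimal sketch of the dask equivalent follows, assuming the same data and mask and an arbitrary chunking; da.ma.masked_array is used to build the masked dask array.

```python
import numpy as np
import dask.array as da

data = da.from_array(np.arange(9).reshape(3, 3), chunks=(2, 3))
mask = da.from_array(
    np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 0, 0]], dtype=bool),
    chunks=(2, 3),
)

x = da.ma.masked_array(data, mask=mask)   # lazy masked dask array
filled = da.ma.filled(x, fill_value=333)  # still lazy
result = filled.compute()                 # plain ndarray

print(result)
```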
dask.array.ma.fix_invalid(a, fill_value=None)

Return input with invalid data masked and replaced by a fill value.

This docstring was copied from numpy.ma.fix_invalid.

Some inconsistencies with the Dask version may exist.

Invalid data means values of nan, inf, etc.

Parameters
a: array_like

Input array, a (subclass of) ndarray.

mask: sequence, optional (Not supported in Dask)

Mask. Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.

copy: bool, optional (Not supported in Dask)

Whether to use a copy of a (True) or to fix a in place (False). Default is True.

fill_value: scalar, optional

Value used for fixing invalid data. Default is None, in which case the a.fill_value is used.

Returns
b: MaskedArray

The input array with invalid entries fixed.

Notes

A copy is performed by default.

Examples

>>> x = np.ma.array([1., -1, np.nan, np.inf], mask=[1] + [0]*3)  
>>> x  
masked_array(data=[--, -1.0, nan, inf],
             mask=[ True, False, False, False],
       fill_value=1e+20)
>>> np.ma.fix_invalid(x)  
masked_array(data=[--, -1.0, --, --],
             mask=[ True, False,  True,  True],
       fill_value=1e+20)
>>> fixed = np.ma.fix_invalid(x)  
>>> fixed.data  
array([ 1.e+00, -1.e+00,  1.e+20,  1.e+20])
>>> x.data  
array([ 1., -1., nan, inf])
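
A minimal sketch of the dask version, with assumed input values and chunk size. Only a and fill_value are passed, since mask and copy are marked as not supported in Dask.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1.0, np.nan, np.inf, -1.0]), chunks=2)

fixed = da.ma.fix_invalid(x, fill_value=0.0)   # lazy masked dask array
result = fixed.compute()                       # numpy.ma.MaskedArray

mask = np.ma.getmaskarray(result)   # True where the input was nan or inf
data = np.ma.getdata(result)        # invalid entries replaced by fill_value

print(mask, data)
```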
dask.array.ma.getdata(a)

Return the data of a masked array as an ndarray.

This docstring was copied from numpy.ma.getdata.

Some inconsistencies with the Dask version may exist.

Return the data of a (if any) as an ndarray if a is a MaskedArray, else return a as a ndarray or subclass (depending on subok) if not.

Parameters
a: array_like

Input MaskedArray, alternatively a ndarray or a subclass thereof.

subok: bool (Not supported in Dask)

Whether to force the output to be a pure ndarray (False) or to return a subclass of ndarray if appropriate (True, default).

See also

getmask

Return the mask of a masked array, or nomask.

getmaskarray

Return the mask of a masked array, or full array of False.

Examples

>>> import numpy.ma as ma  
>>> a = ma.masked_equal([[1,2],[3,4]], 2)  
>>> a  
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getdata(a)  
array([[1, 2],
       [3, 4]])

Equivalently use the MaskedArray data attribute.

>>> a.data  
array([[1, 2],
       [3, 4]])
dask.array.ma.getmaskarray(a)

Return the mask of a masked array, or full boolean array of False.

This docstring was copied from numpy.ma.getmaskarray.

Some inconsistencies with the Dask version may exist.

Return the mask of arr as an ndarray if arr is a MaskedArray and the mask is not nomask, else return a full boolean array of False of the same shape as arr.

Parameters
arr: array_like (Not supported in Dask)

Input MaskedArray for which the mask is required.

See also

getmask

Return the mask of a masked array, or nomask.

getdata

Return the data of a masked array as an ndarray.

Examples

>>> import numpy.ma as ma  
>>> a = ma.masked_equal([[1,2],[3,4]], 2)  
>>> a  
masked_array(
  data=[[1, --],
        [3, 4]],
  mask=[[False,  True],
        [False, False]],
  fill_value=2)
>>> ma.getmaskarray(a)  
array([[False,  True],
       [False, False]])

Result when mask == nomask

>>> b = ma.masked_array([[1,2],[3,4]])  
>>> b  
masked_array(
  data=[[1, 2],
        [3, 4]],
  mask=False,
  fill_value=999999)
>>> ma.getmaskarray(b)  
array([[False, False],
       [False, False]])
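
For comparison with the NumPy examples, here is a short sketch of getdata and getmaskarray applied to a masked dask array. The input values match the example above; the chunking (one row per chunk) is an assumption chosen so that one chunk contains no masked entries.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([[1, 2], [3, 4]]), chunks=(1, 2))
a = da.ma.masked_equal(x, 2)

data = da.ma.getdata(a).compute()        # underlying values, mask dropped
mask = da.ma.getmaskarray(a).compute()   # full boolean mask, never nomask

print(data)
print(mask)
```

Both results are ordinary NumPy arrays with the same shape as the input, even for the chunk where nothing was masked.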
dask.array.ma.masked_array(data, mask=False, fill_value=None, **kwargs)

An array class with possibly masked values.

This docstring was copied from numpy.ma.masked_array.

Some inconsistencies with the Dask version may exist.

Masked values of True exclude the corresponding element from any computation.

Construction:

x = MaskedArray(data, mask=nomask, dtype=None, copy=False, subok=True,
                ndmin=0, fill_value=None, keep_mask=True, hard_mask=None,
                shrink=True, order=None)
Parameters
data: array_like

Input data.

mask: sequence, optional

Mask. Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.

dtype: dtype, optional (Not supported in Dask)

Data type of the output. If dtype is None, the type of the data argument (data.dtype) is used. If dtype is not None and different from data.dtype, a copy is performed.

copy: bool, optional (Not supported in Dask)

Whether to copy the input data (True), or to use a reference instead. Default is False.

subok: bool, optional (Not supported in Dask)

Whether to return a subclass of MaskedArray if possible (True) or a plain MaskedArray. Default is True.

ndmin: int, optional (Not supported in Dask)

Minimum number of dimensions. Default is 0.

fill_value: scalar, optional

Value used to fill in the masked values when necessary. If None, a default based on the data-type is used.

keep_mask: bool, optional (Not supported in Dask)

Whether to combine mask with the mask of the input data, if any (True), or to use only mask for the output (False). Default is True.

hard_mask: bool, optional (Not supported in Dask)

Whether to use a hard mask or not. With a hard mask, masked values cannot be unmasked. Default is False.

shrink: bool, optional (Not supported in Dask)

Whether to force compression of an empty mask. Default is True.

order: {‘C’, ‘F’, ‘A’}, optional (Not supported in Dask)

Specify the order of the array. If order is ‘C’, then the array will be in C-contiguous order (last-index varies the fastest). If order is ‘F’, then the returned array will be in Fortran-contiguous order (first-index varies the fastest). If order is ‘A’ (default), then the returned array may be in any order (either C-, Fortran-contiguous, or even discontiguous), unless a copy is required, in which case it will be C-contiguous.

dask.array.ma.masked_equal(a, value)

Mask an array where equal to a given value.

This docstring was copied from numpy.ma.masked_equal.

Some inconsistencies with the Dask version may exist.

This function is a shortcut to masked_where, with condition = (x == value). For floating point arrays, consider using masked_values(x, value).

See also

masked_where

Mask where a condition is met.

masked_values

Mask using floating point equality.

Examples

>>> import numpy.ma as ma  
>>> a = np.arange(4)  
>>> a  
array([0, 1, 2, 3])
>>> ma.masked_equal(a, 2)  
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=2)
dask.array.ma.masked_greater(x, value, copy=True)

Mask an array where greater than a given value.

This function is a shortcut to masked_where, with condition = (x > value).

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater(a, 2)
masked_array(data=[0, 1, 2, --],
             mask=[False, False, False,  True],
       fill_value=999999)
dask.array.ma.masked_greater_equal(x, value, copy=True)

Mask an array where greater than or equal to a given value.

This function is a shortcut to masked_where, with condition = (x >= value).

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_greater_equal(a, 2)
masked_array(data=[0, 1, --, --],
             mask=[False, False,  True,  True],
       fill_value=999999)
dask.array.ma.masked_inside(x, v1, v2)

Mask an array inside a given interval.

This docstring was copied from numpy.ma.masked_inside.

Some inconsistencies with the Dask version may exist.

Shortcut to masked_where, where condition is True for x inside the interval [v1,v2] (v1 <= x <= v2). The boundaries v1 and v2 can be given in either order.

See also

masked_where

Mask where a condition is met.

Notes

The array x is prefilled with its filling value.

Examples

>>> import numpy.ma as ma  
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]  
>>> ma.masked_inside(x, -0.3, 0.3)  
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)

The order of v1 and v2 doesn’t matter.

>>> ma.masked_inside(x, 0.3, -0.3)  
masked_array(data=[0.31, 1.2, --, --, -0.4, -1.1],
             mask=[False, False,  True,  True, False, False],
       fill_value=1e+20)
dask.array.ma.masked_invalid(a)

Mask an array where invalid values occur (NaNs or infs).

This docstring was copied from numpy.ma.masked_invalid.

Some inconsistencies with the Dask version may exist.

This function is a shortcut to masked_where, with condition = ~(np.isfinite(a)). Any pre-existing mask is conserved. Only applies to arrays with a dtype where NaNs or infs make sense (i.e. floating point types), but accepts any array_like object.

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma  
>>> a = np.arange(5, dtype=float)  
>>> a[2] = np.nan
>>> a[3] = np.inf
>>> a  
array([ 0.,  1., nan, inf,  4.])
>>> ma.masked_invalid(a)  
masked_array(data=[0.0, 1.0, --, --, 4.0],
             mask=[False, False,  True,  True, False],
       fill_value=1e+20)
dask.array.ma.masked_less(x, value, copy=True)

Mask an array where less than a given value.

This function is a shortcut to masked_where, with condition = (x < value).

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less(a, 2)
masked_array(data=[--, --, 2, 3],
             mask=[ True,  True, False, False],
       fill_value=999999)
dask.array.ma.masked_less_equal(x, value, copy=True)

Mask an array where less than or equal to a given value.

This function is a shortcut to masked_where, with condition = (x <= value).

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_less_equal(a, 2)
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
dask.array.ma.masked_not_equal(x, value, copy=True)

Mask an array where not equal to a given value.

This function is a shortcut to masked_where, with condition = (x != value).

See also

masked_where

Mask where a condition is met.

Examples

>>> import numpy.ma as ma
>>> a = np.arange(4)
>>> a
array([0, 1, 2, 3])
>>> ma.masked_not_equal(a, 2)
masked_array(data=[--, --, 2, --],
             mask=[ True,  True, False,  True],
       fill_value=999999)
dask.array.ma.masked_outside(x, v1, v2)

Mask an array outside a given interval.

This docstring was copied from numpy.ma.masked_outside.

Some inconsistencies with the Dask version may exist.

Shortcut to masked_where, where condition is True for x outside the interval [v1,v2] (x < v1)|(x > v2). The boundaries v1 and v2 can be given in either order.

See also

masked_where

Mask where a condition is met.

Notes

The array x is prefilled with its filling value.

Examples

>>> import numpy.ma as ma  
>>> x = [0.31, 1.2, 0.01, 0.2, -0.4, -1.1]  
>>> ma.masked_outside(x, -0.3, 0.3)  
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)

The order of v1 and v2 doesn’t matter.

>>> ma.masked_outside(x, 0.3, -0.3)  
masked_array(data=[--, --, 0.01, 0.2, --, --],
             mask=[ True,  True, False, False,  True,  True],
       fill_value=1e+20)
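
The interval conditions for masked_inside (v1 <= x <= v2) and masked_outside ((x < v1) | (x > v2)) are exact complements. A short sketch with a dask array, using the sample values from the examples and an assumed chunk size, makes that visible.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([0.31, 1.2, 0.01, 0.2, -0.4, -1.1]), chunks=3)

inside = da.ma.getmaskarray(da.ma.masked_inside(x, -0.3, 0.3)).compute()
outside = da.ma.getmaskarray(da.ma.masked_outside(x, -0.3, 0.3)).compute()

# Every element is masked by exactly one of the two functions.
complementary = bool(np.all(inside ^ outside))
print(inside, outside, complementary)
```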
dask.array.ma.masked_values(x, value, rtol=1e-05, atol=1e-08, shrink=True)

Mask using floating point equality.

This docstring was copied from numpy.ma.masked_values.

Some inconsistencies with the Dask version may exist.

Return a MaskedArray, masked where the data in array x are approximately equal to value, determined using isclose. The default tolerances for masked_values are the same as those for isclose.

For integer types, exact equality is used, in the same way as masked_equal.

The fill_value is set to value and the mask is set to nomask if possible.

Parameters
x: array_like

Array to mask.

value: float

Masking value.

rtol, atol: float, optional

Tolerance parameters passed on to isclose

copy: bool, optional (Not supported in Dask)

Whether to return a copy of x.

shrink: bool, optional

Whether to collapse a mask full of False to nomask.

Returns
result: MaskedArray

The result of masking x where approximately equal to value.

See also

masked_where

Mask where a condition is met.

masked_equal

Mask where equal to a given value (integers).

Examples

>>> import numpy.ma as ma  
>>> x = np.array([1, 1.1, 2, 1.1, 3])  
>>> ma.masked_values(x, 1.1)  
masked_array(data=[1.0, --, 2.0, --, 3.0],
             mask=[False,  True, False,  True, False],
       fill_value=1.1)

Note that mask is set to nomask if possible.

>>> ma.masked_values(x, 1.5)  
masked_array(data=[1. , 1.1, 2. , 1.1, 3. ],
             mask=False,
       fill_value=1.5)

For integers, the fill value will be different in general to the result of masked_equal.

>>> x = np.arange(5)  
>>> x  
array([0, 1, 2, 3, 4])
>>> ma.masked_values(x, 2)  
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
>>> ma.masked_equal(x, 2)  
masked_array(data=[0, 1, --, 3, 4],
             mask=[False, False,  True, False, False],
       fill_value=2)
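
The link to isclose stated in the description can be checked directly in NumPy. The sketch below uses an assumed input in which one element differs from 1.1 by 1e-9, which is within the default tolerances but is not exactly equal, so masked_values and masked_equal disagree on it.

```python
import numpy as np
import numpy.ma as ma

x = np.array([1.0, 1.1 + 1e-9, 2.0, 1.1, 3.0])

approx = ma.getmaskarray(ma.masked_values(x, 1.1))   # tolerance-based
exact = ma.getmaskarray(ma.masked_equal(x, 1.1))     # exact comparison

# Same tolerances as the masked_values defaults shown in the signature.
expected = np.isclose(x, 1.1, rtol=1e-05, atol=1e-08)

print(approx, exact)
```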
dask.array.ma.masked_where(condition, a)

Mask an array where a condition is met.

This docstring was copied from numpy.ma.masked_where.

Some inconsistencies with the Dask version may exist.

Return a as an array masked where condition is True. Any masked values of a or condition are also masked in the output.

Parameters
condition: array_like

Masking condition. When condition tests floating point values for equality, consider using masked_values instead.

a: array_like

Array to mask.

copy: bool (Not supported in Dask)

If True (default) make a copy of a in the result. If False modify a in place and return a view.

Returns
result: MaskedArray

The result of masking a where condition is True.

See also

masked_values

Mask using floating point equality.

masked_equal

Mask where equal to a given value.

masked_not_equal

Mask where not equal to a given value.

masked_less_equal

Mask where less than or equal to a given value.

masked_greater_equal

Mask where greater than or equal to a given value.

masked_less

Mask where less than a given value.

masked_greater

Mask where greater than a given value.

masked_inside

Mask inside a given interval.

masked_outside

Mask outside a given interval.

masked_invalid

Mask invalid values (NaNs or infs).

Examples

>>> import numpy.ma as ma  
>>> a = np.arange(4)  
>>> a  
array([0, 1, 2, 3])
>>> ma.masked_where(a <= 2, a)  
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)

Mask array b conditional on a.

>>> b = ['a', 'b', 'c', 'd']  
>>> ma.masked_where(a == 2, b)  
masked_array(data=['a', 'b', --, 'd'],
             mask=[False, False,  True, False],
       fill_value='N/A',
            dtype='<U1')

Effect of the copy argument.

>>> c = ma.masked_where(a <= 2, a)  
>>> c  
masked_array(data=[--, --, --, 3],
             mask=[ True,  True,  True, False],
       fill_value=999999)
>>> c[0] = 99  
>>> c  
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a  
array([0, 1, 2, 3])
>>> c = ma.masked_where(a <= 2, a, copy=False)  
>>> c[0] = 99  
>>> c  
masked_array(data=[99, --, --, 3],
             mask=[False,  True,  True, False],
       fill_value=999999)
>>> a  
array([99,  1,  2,  3])

When condition or a contain masked values.

>>> a = np.arange(4)  
>>> a = ma.masked_where(a == 2, a)  
>>> a  
masked_array(data=[0, 1, --, 3],
             mask=[False, False,  True, False],
       fill_value=999999)
>>> b = np.arange(4)  
>>> b = ma.masked_where(b == 0, b)  
>>> b  
masked_array(data=[--, 1, 2, 3],
             mask=[ True, False, False, False],
       fill_value=999999)
>>> ma.masked_where(a == 3, b)  
masked_array(data=[--, 1, --, --],
             mask=[ True, False,  True,  True],
       fill_value=999999)
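
The examples above use NumPy, including the copy argument that is not supported in Dask. A minimal sketch of the dask call follows, where both the condition and the array are lazy dask arrays; the input values and chunk size are assumptions for illustration.

```python
import numpy as np
import dask.array as da

a = da.arange(8, chunks=4)

# The condition is itself a lazy dask array with the same chunking as a.
masked = da.ma.masked_where(a % 2 == 0, a)

# Replace masked entries with -1 to make the mask visible in a plain ndarray.
result = da.ma.filled(masked, -1).compute()
print(result)
```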
dask.array.ma.set_fill_value(a, fill_value)

Set the filling value of a, if a is a masked array.

This docstring was copied from numpy.ma.set_fill_value.

Some inconsistencies with the Dask version may exist.

This function changes the fill value of the masked array a in place. If a is not a masked array, the function returns silently, without doing anything.

Parameters
a: array_like

Input array.

fill_value: dtype

Filling value. A consistency test is performed to make sure the value is compatible with the dtype of a.

Returns
None

Nothing returned by this function.

See also

maximum_fill_value

Return the default fill value for a dtype.

MaskedArray.fill_value

Return current fill value.

MaskedArray.set_fill_value

Equivalent method.

Examples

>>> import numpy.ma as ma  
>>> a = np.arange(5)  
>>> a  
array([0, 1, 2, 3, 4])
>>> a = ma.masked_where(a < 3, a)  
>>> a  
masked_array(data=[--, --, --, 3, 4],
             mask=[ True,  True,  True, False, False],
       fill_value=999999)
>>> ma.set_fill_value(a, -999)  
>>> a  
masked_array(data=[--, --, --, 3, 4],
             mask=[ True,  True,  True, False, False],
       fill_value=-999)

Nothing happens if a is not a masked array.

>>> a = list(range(5))  
>>> a  
[0, 1, 2, 3, 4]
>>> ma.set_fill_value(a, 100)  
>>> a  
[0, 1, 2, 3, 4]
>>> a = np.arange(5)  
>>> a  
array([0, 1, 2, 3, 4])
>>> ma.set_fill_value(a, 100)  
>>> a  
array([0, 1, 2, 3, 4])
dask.array.overlap.overlap(x, depth, boundary)

Share boundaries between neighboring blocks

Parameters
x: da.Array

A dask array

depth: dict

The size of the shared boundary per axis

boundary: dict

The boundary condition on each axis. Options are ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or an array value. Such a value will fill the boundary with that value.

The depth input specifies how many cells to overlap between neighboring blocks. For example, {0: 2, 2: 5} means share two cells along axis 0 and five cells along axis 2. Axes missing from this input will not be overlapped.

Examples

>>> import numpy as np
>>> import dask.array as da
>>> x = np.arange(64).reshape((8, 8))
>>> d = da.from_array(x, chunks=(4, 4))
>>> d.chunks
((4, 4), (4, 4))
>>> g = da.overlap.overlap(d, depth={0: 2, 1: 1},
...                       boundary={0: 100, 1: 'reflect'})
>>> g.chunks
((8, 8), (6, 6))
>>> np.array(g)
array([[100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [  0,   0,   1,   2,   3,   4,   3,   4,   5,   6,   7,   7],
       [  8,   8,   9,  10,  11,  12,  11,  12,  13,  14,  15,  15],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 16,  16,  17,  18,  19,  20,  19,  20,  21,  22,  23,  23],
       [ 24,  24,  25,  26,  27,  28,  27,  28,  29,  30,  31,  31],
       [ 32,  32,  33,  34,  35,  36,  35,  36,  37,  38,  39,  39],
       [ 40,  40,  41,  42,  43,  44,  43,  44,  45,  46,  47,  47],
       [ 48,  48,  49,  50,  51,  52,  51,  52,  53,  54,  55,  55],
       [ 56,  56,  57,  58,  59,  60,  59,  60,  61,  62,  63,  63],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100]])
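
A smaller 1-D sketch may make the effect of depth easier to see. It assumes the 'none' boundary option, so nothing is added at the outer edges of the array and only the shared cells between the two blocks appear.

```python
import numpy as np
import dask.array as da
from dask.array.overlap import overlap

d = da.from_array(np.arange(8), chunks=4)   # blocks [0..3] and [4..7]

# Share one cell across the internal boundary; leave the array edges alone.
g = overlap(d, depth={0: 1}, boundary={0: "none"})

chunks = g.chunks
values = g.compute()
print(chunks)
print(values)
```

Each block grows by one cell taken from its neighbor, so the values next to the internal boundary (3 and 4) appear in both blocks of the result.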
dask.array.overlap.map_overlap(func, *args, depth=None, boundary=None, trim=True, align_arrays=True, **kwargs)

Map a function over blocks of arrays with some overlap

We share neighboring zones between blocks of the array, map a function, and then trim away the neighboring strips.

Parameters
func: function

The function to apply to each extended block. If multiple arrays are provided, then the function should expect to receive chunks of each array in the same order.

args: dask arrays
depth: int, tuple, dict or list

The number of elements that each block should share with its neighbors. If a tuple or dict then this can be different per axis. If a list then each element of that list must be an int, tuple or dict defining depth for the corresponding array in args. Asymmetric depths may be specified using a dict value of (-/+) tuples. Note that asymmetric depths are currently only supported when boundary is ‘none’. The default value is 0.

boundary: str, tuple, dict or list

How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan. If a list then each element must be a str, tuple or dict defining the boundary for the corresponding array in args. The default value is ‘reflect’.

trim: bool

Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you.

align_arrays: bool

Whether or not to align chunks along equally sized dimensions when multiple arrays are provided. This allows for larger chunks in some arrays to be broken into smaller ones that match chunk sizes in other arrays such that they are compatible for block function mapping. If this is false, then an error will be thrown if arrays do not already have the same number of blocks in each dimension.

**kwargs:

Other keyword arguments valid in map_blocks

Examples

>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()  
array([[12,  13,  14,  15],
       [16,  17,  18,  19],
       [20,  21,  22,  23],
       [24,  25,  26,  27]])

The da.map_overlap function can also accept multiple arrays.

>>> func = lambda x, y: x + y
>>> x = da.arange(8).reshape(2, 4).rechunk((1, 2))
>>> y = da.arange(4).rechunk(2)
>>> da.map_overlap(func, x, y, depth=1).compute() 
array([[ 0,  2,  4,  6],
       [ 4,  6,  8,  10]])

When multiple arrays are given, they do not need to have the same number of dimensions but they must broadcast together. Arrays are aligned block by block (just as in da.map_blocks) so the blocks must have a common chunk size. This common chunking is determined automatically as long as align_arrays is True.

>>> x = da.arange(8, chunks=4)
>>> y = da.arange(8, chunks=2)
>>> r = da.map_overlap(func, x, y, depth=1, align_arrays=True)
>>> len(r.to_delayed())
4
>>> da.map_overlap(func, x, y, depth=1, align_arrays=False).compute()
Traceback (most recent call last):
    ...
ValueError: Shapes do not align {'.0': {2, 4}}

Note also that this function is equivalent to map_blocks by default. A non-zero depth must be defined for any overlap to appear in the arrays provided to func.

>>> func = lambda x: x.sum()
>>> x = da.ones(10, dtype='int')
>>> block_args = dict(chunks=(), drop_axis=0)
>>> da.map_blocks(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args).compute()
10
>>> da.map_overlap(func, x, **block_args, depth=1).compute()
12
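
One way to convince yourself that sharing and trimming work as described is to compare a blockwise result against the same operation on the whole array. The sketch below uses a hypothetical neighbor_sum function and a periodic boundary; both are assumptions chosen for illustration. It calls the Array.map_overlap method, as the first example above does.

```python
import numpy as np
import dask.array as da

def neighbor_sum(block):
    # Sum of each element and its two neighbors. np.roll wraps around inside
    # the block, so only the first and last entries of the extended block are
    # wrong, and those are exactly the cells that get trimmed afterwards.
    return block + np.roll(block, 1) + np.roll(block, -1)

x_np = np.arange(10)
x = da.from_array(x_np, chunks=5)

y = x.map_overlap(neighbor_sum, depth=1, boundary="periodic").compute()

# Reference: the same stencil applied to the full array at once.
expected = neighbor_sum(x_np)
print(y, np.array_equal(y, expected))
```

With depth=0 the wrapped values at the chunk edges would not be trimmed and the two results would differ.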
dask.array.overlap.trim_internal(x, axes, boundary=None)

Trim sides from each block

This couples well with the overlap operation, which may leave excess data on each block

See also

dask.array.chunk.trim
dask.array.map_blocks
dask.array.overlap.trim_overlap(x, depth, boundary=None)

Trim sides from each block.

This couples well with the map_overlap operation which may leave excess data on each block.

dask.array.from_array(x, chunks='auto', name=None, lock=False, asarray=None, fancy=True, getitem=None, meta=None)

Create dask array from something that looks like an array

Input must have a .shape, .ndim, .dtype and support numpy-style slicing.

Parameters
x: array_like
chunks: int, tuple

How to chunk the array. Must be one of the following forms:

  • A blocksize like 1000.

  • A blockshape like (1000, 1000).

  • Explicit sizes of all blocks along all dimensions like ((1000, 1000, 500), (400, 400)).

  • A size in bytes, like “100 MiB”, which will choose a uniform block-like shape.

  • The word “auto”, which acts like the above but takes the chunk size from the configuration value array.chunk-size.

-1 or None as a blocksize indicate the size of the corresponding dimension.

name: str, optional

The key name to use for the array. Defaults to a hash of x. Hashing uses Python’s standard sha1 by default; installing cityhash, xxhash or murmurhash switches to a faster hash, which can speed up the tokenisation step considerably. Use name=False to generate a random name instead of hashing (fast).

Note

Because this name is used as the key in task graphs, you should ensure that it uniquely identifies the data contained within. If you’d like to provide a descriptive name that is still unique, combine the descriptive name with dask.base.tokenize() of the array_like. See Task Graphs for more.

lock: bool or Lock, optional

If x doesn’t support concurrent reads then provide a lock here, or pass in True to have dask.array create one for you.

asarray: bool, optional

If True then call np.asarray on chunks to convert them to numpy arrays. If False then chunks are passed through unchanged. If None (default) then we use True if the __array_function__ method is undefined.

fancy: bool, optional

If x doesn’t support fancy indexing (e.g. indexing with lists or arrays) then set to False. Default is True.

meta: Array-like, optional

The metadata for the resulting dask array. This is the kind of array that will result from slicing the input array. Defaults to the input array.

Examples

>>> x = h5py.File('...')['/data/path']  
>>> a = da.from_array(x, chunks=(1000, 1000))  

If your underlying datastore does not support concurrent reads then include the lock=True keyword argument or lock=mylock if you want multiple arrays to coordinate around the same lock.

>>> a = da.from_array(x, chunks=(1000, 1000), lock=True)  

If your underlying datastore has a .chunks attribute (as h5py and zarr datasets do) then a multiple of that chunk shape will be used if you do not provide a chunk shape.

>>> a = da.from_array(x, chunks='auto')  
>>> a = da.from_array(x, chunks='100 MiB')  
>>> a = da.from_array(x)  

If providing a name, ensure that it is unique

>>> import dask.base
>>> token = dask.base.tokenize(x)  
>>> a = da.from_array(x, name='myarray-' + token)  
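
The accepted forms of the chunks argument can be compared side by side. This is a small sketch on an in-memory NumPy array; the array shape and the chunk values are arbitrary choices for illustration.

```python
import numpy as np
import dask.array as da

data = np.arange(24 * 10).reshape(24, 10)

by_size = da.from_array(data, chunks=8)                      # a blocksize
by_shape = da.from_array(data, chunks=(8, 5))                # a blockshape
explicit = da.from_array(data, chunks=((8, 8, 8), (4, 6)))   # explicit sizes
full_axis = da.from_array(data, chunks=(8, -1))              # -1: whole dimension
by_bytes = da.from_array(data, chunks="1 KiB")               # size in bytes

for arr in (by_size, by_shape, explicit, full_axis, by_bytes):
    print(arr.chunks)
```

The chunks attribute always reports the explicit form, whichever shorthand was used to create the array.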
dask.array.from_delayed(value, shape, dtype=None, meta=None, name=None)

Create a dask array from a dask delayed value

This routine is useful for constructing dask arrays in an ad-hoc fashion using dask delayed, particularly when combined with stack and concatenate.

The dask array will consist of a single chunk.

Examples

>>> import dask
>>> import dask.array as da
>>> value = dask.delayed(np.ones)(5)
>>> array = da.from_delayed(value, (5,), dtype=float)
>>> array
dask.array<from-value, shape=(5,), dtype=float64, chunksize=(5,), chunktype=numpy.ndarray>
>>> array.compute()
array([1., 1., 1., 1., 1.])
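
The combination with stack and concatenate mentioned above might look like the following sketch. The load_row function is a hypothetical stand-in for any expensive loader; its name and return values are invented for this example.

```python
import numpy as np
import dask
import dask.array as da

@dask.delayed
def load_row(i):
    # Hypothetical loader: returns a length-3 row filled with the value i.
    return np.full(3, float(i))

# One single-chunk dask array per delayed value.
rows = [da.from_delayed(load_row(i), shape=(3,), dtype=float) for i in range(4)]

stacked = da.stack(rows)        # new leading axis, one chunk per row
joined = da.concatenate(rows)   # rows placed end to end

print(stacked.shape, stacked.chunks)
print(joined.shape, joined.chunks)
```

Because shape and dtype are supplied up front, nothing is loaded until compute() is called.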
dask.array.from_npy_stack(dirname, mmap_mode='r')

Load dask array from stack of npy files

See da.to_npy_stack for docstring

Parameters
dirname: string

Directory of .npy files

mmap_mode: (None or ‘r’)

Read data in memory map mode

dask.array.from_zarr(url, component=None, storage_options=None, chunks=None, name=None, **kwargs)

Load array from the zarr storage format

See https://zarr.readthedocs.io for details about the format.

Parameters
url: Zarr Array or str or MutableMapping

Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.

component: str or None

If the location is a zarr group rather than an array, this is the subcomponent that should be loaded, something like 'foo/bar'.

storage_options: dict

Any additional parameters for the storage backend (ignored for local paths)

chunks: tuple of ints or tuples of ints

Passed to da.from_array, allows setting the chunks on initialisation, if the chunking scheme in the on-disc dataset is not optimal for the calculations to follow.

name: str, optional

An optional keyname for the array. Defaults to hashing the input

kwargs: passed to zarr.Array.
dask.array.from_tiledb(uri, attribute=None, chunks=None, storage_options=None, **kwargs)

Load array from the TileDB storage format

See https://docs.tiledb.io for more information about TileDB.

Parameters
uri: TileDB array or str

Location of the data to load

attribute: str or None

Attribute selection (single-attribute view on multi-attribute array)

Returns
A Dask Array

Examples

>>> # create a tiledb array
>>> import tiledb, numpy as np, tempfile  
>>> uri = tempfile.NamedTemporaryFile().name  
>>> tiledb.from_numpy(uri, np.arange(0,9).reshape(3,3))  
<tiledb.libtiledb.DenseArray object at 0x...>
>>> # read back the array
>>> import dask.array as da  
>>> tdb_ar = da.from_tiledb(uri)  
>>> tdb_ar.shape  
(3, 3)
>>> tdb_ar.mean().compute()  
4.0
dask.array.store(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)

Store dask arrays in array-like objects, overwrite data in target

This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance, align the block size of the storage target with the block size of your array.

If your data fits in memory then you may prefer calling np.array(myarray) instead.

Parameters
sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds

These should support setitem syntax target[10:20] = ...

lock: boolean or threading.Lock, optional

Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.

regions: tuple of slices or list of tuples of slices

Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.

compute: boolean, optional

If True, compute immediately; otherwise return a dask.delayed.Delayed.

return_stored: boolean, optional

Optionally return the stored result (default False).

Examples

>>> x = ...  
>>> import h5py  
>>> f = h5py.File('myfile.hdf5', mode='a')  
>>> dset = f.create_dataset('/data', shape=x.shape,
...                                  chunks=x.chunks,
...                                  dtype='f8')  
>>> store(x, dset)  

Alternatively store many arrays at the same time

>>> store([x, y, z], [dset1, dset2, dset3])  
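
Any object with numpy-style setitem works as a target, so the behaviour of store and of the regions argument can be tried without HDF5. The sketch below uses plain NumPy arrays as targets; the sizes are arbitrary.

```python
import numpy as np
import dask.array as da

x = da.arange(12, chunks=4)

# Store the whole array into a target of the same shape.
target = np.zeros(12, dtype=x.dtype)
da.store(x, target)

# Store into a window of a larger target: larger[4:16] has the shape of x.
larger = np.zeros(20, dtype=x.dtype)
da.store(x, larger, regions=(slice(4, 16),))

print(target)
print(larger)
```

Elements of the larger target outside the given region are left untouched.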
dask.array.to_hdf5(filename, *args, **kwargs)

Store arrays in HDF5 file

This saves several dask arrays into several datapaths in an HDF5 file. It creates the necessary datasets and handles clean file opening/closing.

>>> da.to_hdf5('myfile.hdf5', '/x', x)  

or

>>> da.to_hdf5('myfile.hdf5', {'/x': x, '/y': y})  

Optionally provide arguments as though to h5py.File.create_dataset

>>> da.to_hdf5('myfile.hdf5', '/x', x, compression='lzf', shuffle=True)  

This can also be used as a method on a single Array

>>> x.to_hdf5('myfile.hdf5', '/x')  

See also

da.store
h5py.File.create_dataset
dask.array.to_zarr(arr, url, component=None, storage_options=None, overwrite=False, compute=True, return_stored=False, **kwargs)

Save array to the zarr storage format

See https://zarr.readthedocs.io for details about the format.

Parameters
arr: dask.array

Data to store

url: Zarr Array or str or MutableMapping

Location of the data. A URL can include a protocol specifier like s3:// for remote data. Can also be any MutableMapping instance, which should be serializable if used in multiple processes.

component: str or None

If the location is a zarr group rather than an array, this is the subcomponent that should be created/over-written.

storage_options: dict

Any additional parameters for the storage backend (ignored for local paths)

overwrite: bool

If the given array already exists, overwrite=False will cause an error, whereas overwrite=True will replace the existing data. Note that this check is done at computation time, not during graph creation.

compute, return_stored: see store()
kwargs: passed to the zarr.create() function, e.g., compression options
Raises
ValueError

If arr has unknown chunk sizes, which is not supported by Zarr.

dask.array.to_npy_stack(dirname, x, axis=0)

Write dask array to a stack of .npy files

This partitions the dask.array along one axis and stores each block along that axis as a single .npy file in the specified directory.

See also

from_npy_stack

Examples

>>> x = da.ones((5, 10, 10), chunks=(2, 4, 4))  
>>> da.to_npy_stack('data/', x, axis=0)  

The .npy files store numpy arrays for x[0:2], x[2:4], and x[4:5] respectively, as is specified by the chunk size along the zeroth axis:

$ tree data/
data/
|-- 0.npy
|-- 1.npy
|-- 2.npy
|-- info

The info file stores the dtype, chunks, and axis information of the array. You can load these stacks with the da.from_npy_stack function.

>>> y = da.from_npy_stack('data/')  
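
A self-contained round trip through a temporary directory is sketched below. The array contents are arbitrary, and mmap_mode=None is chosen here only so that no memory-mapped files remain open when the temporary directory is removed.

```python
import os
import tempfile

import numpy as np
import dask.array as da

data = np.arange(20).reshape(5, 4)
x = da.from_array(data, chunks=(2, 2))

with tempfile.TemporaryDirectory() as dirname:
    da.to_npy_stack(dirname, x, axis=0)
    files = sorted(os.listdir(dirname))

    # Load the stack back; each .npy file becomes one block along axis 0.
    y = da.from_npy_stack(dirname, mmap_mode=None)
    stack_chunks = y.chunks
    restored = y.compute()

print(files)
print(stack_chunks)
```

The chunks along the stacking axis are preserved, while the other axes come back as a single chunk each.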
dask.array.to_tiledb(darray, uri, compute=True, return_stored=False, storage_options=None, **kwargs)

Save array to the TileDB storage format

Save ‘array’ using the TileDB storage manager, to any TileDB-supported URI, including local disk, S3, or HDFS.

See https://docs.tiledb.io for more information about TileDB.

Parameters
darray: dask.array

A dask array to write.

uri:

Any supported TileDB storage location.

storage_options: dict

Dict containing any configuration options for the TileDB backend. See https://docs.tiledb.io/en/stable/tutorials/config.html

compute, return_stored: see store()
Returns
None

Unless return_stored is set to True (False by default)

Notes

TileDB only supports regularly-chunked arrays. TileDB tile extents correspond to form 2 of the dask chunk specification (a blockshape such as (1000, 1000)), and the conversion is done automatically for supported arrays.

Examples

>>> import dask.array as da, tempfile  
>>> uri = tempfile.NamedTemporaryFile().name   
>>> data = da.random.random((5, 5))  
>>> da.to_tiledb(data, uri)  
>>> import tiledb  
>>> tdb_ar = tiledb.open(uri)  
>>> all(tdb_ar == data)  
True
dask.array.fft.fft_wrap(fft_func, kind=None, dtype=None)

Wrap 1D, 2D, and ND real and complex FFT functions

Takes a function that behaves like one of the numpy.fft functions, together with a kind that identifies which numpy.fft function it corresponds to. Kinds are named after the functions in the numpy.fft API.

Supported kinds include:

  • fft

  • fft2

  • fftn

  • ifft

  • ifft2

  • ifftn

  • rfft

  • rfft2

  • rfftn

  • irfft

  • irfft2

  • irfftn

  • hfft

  • ihfft

Examples

>>> parallel_fft = fft_wrap(np.fft.fft)
>>> parallel_ifft = fft_wrap(np.fft.ifft)
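
A short sketch of using a wrapped function on a chunked array. Here kind is passed explicitly for clarity, although it can be inferred from the name of np.fft.fft; the input data is arbitrary. The array is chunked along axis 0 only, so the transformed (last) axis sits in a single chunk.

```python
import numpy as np
import dask.array as da
from dask.array.fft import fft_wrap

parallel_fft = fft_wrap(np.fft.fft, kind="fft")

data = np.arange(32, dtype=float).reshape(4, 8)
x = da.from_array(data, chunks=(2, 8))

# The transform runs block by block along the last axis.
result = parallel_fft(x)

print(result.chunks)
print(np.allclose(result.compute(), np.fft.fft(data)))
```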
dask.array.fft.fft(a, n=None, axis=None)

Wrapping of numpy.fft.fft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.fft docstring follows below:

Compute the one-dimensional discrete Fourier Transform.

This function computes the one-dimensional n-point discrete Fourier Transform (DFT) with the efficient Fast Fourier Transform (FFT) algorithm [CT].

Parameters
a: array_like

Input array, can be complex.

n: int, optional

Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.

axis: int, optional

Axis over which to compute the FFT. If not given, the last axis is used.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.

Raises
IndexError

If axis is larger than the last axis of a.

See also

numpy.fft

for definition of the DFT and conventions used.

ifft

The inverse of fft.

fft2

The two-dimensional FFT.

fftn

The n-dimensional FFT.

rfftn

The n-dimensional FFT of real input.

fftfreq

Frequency bins for given FFT parameters.

Notes

FFT (Fast Fourier Transform) refers to a way the discrete Fourier Transform (DFT) can be calculated efficiently, by using symmetries in the calculated terms. The symmetry is highest when n is a power of 2, and the transform is therefore most efficient for these sizes.

The DFT is defined, with the conventions used in this implementation, in the documentation for the numpy.fft module.

References

CT

Cooley, James W., and John W. Tukey, 1965, “An algorithm for the machine calculation of complex Fourier series,” Math. Comput. 19: 297-301.

Examples

>>> np.fft.fft(np.exp(2j * np.pi * np.arange(8) / 8))
array([-2.33486982e-16+1.14423775e-17j,  8.00000000e+00-1.25557246e-15j,
        2.33486982e-16+2.33486982e-16j,  0.00000000e+00+1.22464680e-16j,
       -1.14423775e-17+2.33486982e-16j,  0.00000000e+00+5.20784380e-16j,
        1.14423775e-17+1.14423775e-17j,  0.00000000e+00+1.22464680e-16j])

In this example, real input has an FFT which is Hermitian, i.e., symmetric in the real part and anti-symmetric in the imaginary part, as described in the numpy.fft documentation:

>>> import matplotlib.pyplot as plt
>>> t = np.arange(256)
>>> sp = np.fft.fft(np.sin(t))
>>> freq = np.fft.fftfreq(t.shape[-1])
>>> plt.plot(freq, sp.real, freq, sp.imag)
[<matplotlib.lines.Line2D object at 0x...>, <matplotlib.lines.Line2D object at 0x...>]
>>> plt.show()
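
The single-chunk requirement stated at the top of this entry can be seen in a small sketch. The signal is the one used in the NumPy example above; the chunk size of 4 is an arbitrary choice that splits the transformed axis into two chunks.

```python
import numpy as np
import dask.array as da

signal = np.exp(2j * np.pi * np.arange(8) / 8)
x = da.from_array(signal, chunks=4)   # two chunks along the FFT axis

# With more than one chunk along the transformed axis the call is rejected.
try:
    da.fft.fft(x)
    chunk_error = None
except ValueError as err:
    chunk_error = err

# Rechunk so that axis 0 is a single chunk, then transform.
single = x.rechunk({0: -1})
spectrum = da.fft.fft(single).compute()

print(chunk_error is not None)
print(np.allclose(spectrum, np.fft.fft(signal)))
```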
dask.array.fft.fft2(a, s=None, axes=None)

Wrapping of numpy.fft.fft2

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.fft2 docstring follows below:

Compute the 2-dimensional discrete Fourier Transform

This function computes the n-dimensional discrete Fourier Transform over any axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). By default, the transform is computed over the last two axes of the input array, i.e., a 2-dimensional FFT.

Parameters
a: array_like

Input array, can be complex

s: sequence of ints, optional

Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.

axes: sequence of ints, optional

Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.

Raises
ValueError

If s and axes have different length, or axes not given and len(s) != 2.

IndexError

If an element of axes is larger than the number of axes of a.

See also

numpy.fft

Overall view of discrete Fourier transforms, with definitions and conventions used.

ifft2

The inverse two-dimensional FFT.

fft

The one-dimensional FFT.

fftn

The n-dimensional FFT.

fftshift

Shifts zero-frequency terms to the center of the array. For two-dimensional input, swaps first and third quadrants, and second and fourth quadrants.

Notes

fft2 is just fftn with a different default for axes.

The output, analogously to fft, contains the term for zero frequency in the low-order corner of the transformed axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of the axes, in order of decreasingly negative frequency.

See fftn for details and a plotting example, and numpy.fft for definitions and conventions used.

Examples

>>> a = np.mgrid[:5, :5][0]
>>> np.fft.fft2(a)
array([[ 50.  +0.j        ,   0.  +0.j        ,   0.  +0.j        , # may vary
          0.  +0.j        ,   0.  +0.j        ],
       [-12.5+17.20477401j,   0.  +0.j        ,   0.  +0.j        ,
          0.  +0.j        ,   0.  +0.j        ],
       [-12.5 +4.0614962j ,   0.  +0.j        ,   0.  +0.j        ,
          0.  +0.j        ,   0.  +0.j        ],
       [-12.5 -4.0614962j ,   0.  +0.j        ,   0.  +0.j        ,
          0.  +0.j        ,   0.  +0.j        ],
       [-12.5-17.20477401j,   0.  +0.j        ,   0.  +0.j        ,
          0.  +0.j        ,   0.  +0.j        ]])
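
For the dask wrapper, a common layout is a stack of images chunked along the leading axis, so that both transformed axes are single chunks. The following is a sketch with random data; the shapes and the seed are arbitrary.

```python
import numpy as np
import dask.array as da

images = np.random.default_rng(0).random((6, 4, 4))

# Chunk only along the stack axis; the last two axes stay whole.
x = da.from_array(images, chunks=(2, 4, 4))

# By default the transform is taken over the last two axes.
spectra = da.fft.fft2(x)

print(spectra.chunks)
print(np.allclose(spectra.compute(), np.fft.fft2(images)))
```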
dask.array.fft.fftn(a, s=None, axes=None)

Wrapping of numpy.fft.fftn

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.fftn docstring follows below:

Compute the N-dimensional discrete Fourier Transform.

This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT).

Parameters
a: array_like

Input array, can be complex.

s: sequence of ints, optional

Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.

axes: sequence of ints, optional

Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the transform over that axis is performed multiple times.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above.

Raises
ValueError

If s and axes have different length.

IndexError

If an element of axes is larger than the number of axes of a.

See also

numpy.fft

Overall view of discrete Fourier transforms, with definitions and conventions used.

ifftn

The inverse of fftn, the inverse n-dimensional FFT.

fft

The one-dimensional FFT, with definitions and conventions used.

rfftn

The n-dimensional FFT of real input.

fft2

The two-dimensional FFT.

fftshift

Shifts zero-frequency terms to centre of array

Notes

The output, analogously to fft, contains the term for zero frequency in the low-order corner of all axes, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.

See numpy.fft for details, definitions and conventions used.

Examples

>>> a = np.mgrid[:3, :3, :3][0]
>>> np.fft.fftn(a, axes=(1, 2))
array([[[ 0.+0.j,   0.+0.j,   0.+0.j], # may vary
        [ 0.+0.j,   0.+0.j,   0.+0.j],
        [ 0.+0.j,   0.+0.j,   0.+0.j]],
       [[ 9.+0.j,   0.+0.j,   0.+0.j],
        [ 0.+0.j,   0.+0.j,   0.+0.j],
        [ 0.+0.j,   0.+0.j,   0.+0.j]],
       [[18.+0.j,   0.+0.j,   0.+0.j],
        [ 0.+0.j,   0.+0.j,   0.+0.j],
        [ 0.+0.j,   0.+0.j,   0.+0.j]]])
>>> np.fft.fftn(a, (2, 2), axes=(0, 1))
array([[[ 2.+0.j,  2.+0.j,  2.+0.j], # may vary
        [ 0.+0.j,  0.+0.j,  0.+0.j]],
       [[-2.+0.j, -2.+0.j, -2.+0.j],
        [ 0.+0.j,  0.+0.j,  0.+0.j]]])
>>> import matplotlib.pyplot as plt
>>> [X, Y] = np.meshgrid(2 * np.pi * np.arange(200) / 12,
...                      2 * np.pi * np.arange(200) / 34)
>>> S = np.sin(X) + np.cos(Y) + np.random.uniform(0, 1, X.shape)
>>> FS = np.fft.fftn(S)
>>> plt.imshow(np.log(np.abs(np.fft.fftshift(FS))**2))
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
dask.array.fft.ifft(a, n=None, axis=None)

Wrapping of numpy.fft.ifft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.ifft docstring follows below:

Compute the one-dimensional inverse discrete Fourier Transform.

This function computes the inverse of the one-dimensional n-point discrete Fourier transform computed by fft. In other words, ifft(fft(a)) == a to within numerical accuracy. For a general description of the algorithm and definitions, see numpy.fft.

The input should be ordered in the same way as is returned by fft, i.e.,

  • a[0] should contain the zero frequency term,

  • a[1:n//2] should contain the positive-frequency terms,

  • a[n//2 + 1:] should contain the negative-frequency terms, in increasing order starting from the most negative frequency.

For an even number of input points, a[n//2] represents the sum of the values at the positive and negative Nyquist frequencies, as the two are aliased together. See numpy.fft for details.

Parameters
a: array_like

Input array, can be complex.

n: int, optional

Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used. See notes about padding issues.

axis: int, optional

Axis over which to compute the inverse DFT. If not given, the last axis is used.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified.

Raises
IndexError

If axis is larger than the last axis of a.

See also

numpy.fft

An introduction, with definitions and general explanations.

fft

The one-dimensional (forward) FFT, of which ifft is the inverse.

ifft2

The two-dimensional inverse FFT.

ifftn

The n-dimensional inverse FFT.

Notes

If the input parameter n is larger than the size of the input, the input is padded by appending zeros at the end. Even though this is the common approach, it might lead to surprising results. If a different padding is desired, it must be performed before calling ifft.

Examples

>>> np.fft.ifft([0, 4, 0, 0])
array([ 1.+0.j,  0.+1.j, -1.+0.j,  0.-1.j]) # may vary

Create and plot a band-limited signal with random phases:

>>> import matplotlib.pyplot as plt
>>> t = np.arange(400)
>>> n = np.zeros((400,), dtype=complex)
>>> n[40:60] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20,)))
>>> s = np.fft.ifft(n)
>>> plt.plot(t, s.real, 'b-', t, s.imag, 'r--')
[<matplotlib.lines.Line2D object at ...>, <matplotlib.lines.Line2D object at ...>]
>>> plt.legend(('real', 'imaginary'))
<matplotlib.legend.Legend object at ...>
>>> plt.show()
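
The identity ifft(fft(a)) == a also holds for the dask wrappers, to within numerical accuracy. A minimal sketch, with an arbitrary input chunked along the axis that is not transformed:

```python
import numpy as np
import dask.array as da

a = np.arange(12, dtype=float).reshape(3, 4)

# One chunk per row; the transformed (last) axis is a single chunk.
x = da.from_array(a, chunks=(1, 4))

# Both transforms stay lazy until compute() is called.
roundtrip = da.fft.ifft(da.fft.fft(x))
recovered = roundtrip.compute()

print(recovered.dtype)
print(np.allclose(recovered, a))
```

The result is complex even for real input, with an imaginary part that is zero up to rounding error.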
dask.array.fft.ifft2(a, s=None, axes=None)

Wrapping of numpy.fft.ifft2

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.ifft2 docstring follows below:

Compute the 2-dimensional inverse discrete Fourier Transform.

This function computes the inverse of the 2-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, ifft2(fft2(a)) == a to within numerical accuracy. By default, the inverse transform is computed over the last two axes of the input array.

The input, analogously to ifft, should be ordered in the same way as is returned by fft2, i.e. it should have the term for zero frequency in the low-order corner of the two axes, the positive frequency terms in the first half of these axes, the term for the Nyquist frequency in the middle of the axes and the negative frequency terms in the second half of both axes, in order of decreasingly negative frequency.

Parameters
a: array_like

Input array, can be complex.

s: sequence of ints, optional

Shape (length of each axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for ifft(x, n). Along each axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.

axes: sequence of ints, optional

Axes over which to compute the FFT. If not given, the last two axes are used. A repeated index in axes means the transform over that axis is performed multiple times. A one-element sequence means that a one-dimensional FFT is performed.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or the last two axes if axes is not given.

Raises
ValueError

If s and axes have different length, or axes not given and len(s) != 2.

IndexError

If an element of axes is larger than the number of axes of a.

See also

numpy.fft

Overall view of discrete Fourier transforms, with definitions and conventions used.

fft2

The forward 2-dimensional FFT, of which ifft2 is the inverse.

ifftn

The inverse of the n-dimensional FFT.

fft

The one-dimensional FFT.

ifft

The one-dimensional inverse FFT.

Notes

ifft2 is just ifftn with a different default for axes.

See ifftn for details and a plotting example, and numpy.fft for definition and conventions used.

Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifft2 is called.

Examples

>>> a = 4 * np.eye(4)
>>> np.fft.ifft2(a)
array([[1.+0.j,  0.+0.j,  0.+0.j,  0.+0.j], # may vary
       [0.+0.j,  0.+0.j,  0.+0.j,  1.+0.j],
       [0.+0.j,  0.+0.j,  1.+0.j,  0.+0.j],
       [0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j]])
dask.array.fft.ifftn(a, s=None, axes=None)

Wrapping of numpy.fft.ifftn

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.ifftn docstring follows below:

Compute the N-dimensional inverse discrete Fourier Transform.

This function computes the inverse of the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, ifftn(fftn(a)) == a to within numerical accuracy. For a description of the definitions and conventions used, see numpy.fft.

The input, analogously to ifft, should be ordered in the same way as is returned by fftn, i.e. it should have the term for zero frequency in all axes in the low-order corner, the positive frequency terms in the first half of all axes, the term for the Nyquist frequency in the middle of all axes and the negative frequency terms in the second half of all axes, in order of decreasingly negative frequency.

Parameters
a: array_like

Input array, can be complex.

s: sequence of ints, optional

Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). This corresponds to n for ifft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used. See notes for issue on ifft zero padding.

axes: sequence of ints, optional

Axes over which to compute the IFFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above.

Raises
ValueError

If s and axes have different length.

IndexError

If an element of axes is larger than the number of axes of a.

See also

numpy.fft

Overall view of discrete Fourier transforms, with definitions and conventions used.

fftn

The forward n-dimensional FFT, of which ifftn is the inverse.

ifft

The one-dimensional inverse FFT.

ifft2

The two-dimensional inverse FFT.

ifftshift

Undoes fftshift, shifts zero-frequency terms to beginning of array.

Notes

See numpy.fft for definitions and conventions used.

Zero-padding, analogously with ifft, is performed by appending zeros to the input along the specified dimension. Although this is the common approach, it might lead to surprising results. If another form of zero padding is desired, it must be performed before ifftn is called.

Examples

>>> a = np.eye(4)
>>> np.fft.ifftn(np.fft.fftn(a, axes=(0,)), axes=(1,))
array([[1.+0.j,  0.+0.j,  0.+0.j,  0.+0.j], # may vary
       [0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j],
       [0.+0.j,  0.+0.j,  1.+0.j,  0.+0.j],
       [0.+0.j,  0.+0.j,  0.+0.j,  1.+0.j]])

Create and plot an image with band-limited frequency content:

>>> import matplotlib.pyplot as plt
>>> n = np.zeros((200,200), dtype=complex)
>>> n[60:80, 20:40] = np.exp(1j*np.random.uniform(0, 2*np.pi, (20, 20)))
>>> im = np.fft.ifftn(n).real
>>> plt.imshow(im)
<matplotlib.image.AxesImage object at 0x...>
>>> plt.show()
dask.array.fft.rfft(a, n=None, axis=None)

Wrapping of numpy.fft.rfft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking, use dask.array.Array.rechunk.

The numpy.fft.rfft docstring follows below:

Compute the one-dimensional discrete Fourier Transform for real input.

This function computes the one-dimensional n-point discrete Fourier Transform (DFT) of a real-valued array by means of an efficient algorithm called the Fast Fourier Transform (FFT).

Parameters
a: array_like

Input array

n: int, optional

Number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.

axis: int, optional

Axis over which to compute the FFT. If not given, the last axis is used.

norm: {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out: complex ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. If n is even, the length of the transformed axis is (n/2)+1. If n is odd, the length is (n+1)/2.

Raises
IndexError

If axis is larger than the last axis of a.

See also

numpy.fft

For definition of the DFT and conventions used.

irfft

The inverse of rfft.

fft

The one-dimensional FFT of general (complex) input.

fftn

The n-dimensional FFT.

rfftn

The n-dimensional FFT of real input.

Notes

When the DFT is computed for purely real input, the output is Hermitian-symmetric, i.e. the negative frequency terms are just the complex conjugates of the corresponding positive-frequency terms, and the negative-frequency terms are therefore redundant. This function does not compute the negative frequency terms, and the length of the transformed axis of the output is therefore n//2 + 1.

When A = rfft(a) and fs is the sampling frequency, A[0] contains the zero-frequency term 0*fs, which is real due to Hermitian symmetry.

If n is even, A[-1] contains the term representing both positive and negative Nyquist frequency (+fs/2 and -fs/2), and must also be purely real. If n is odd, there is no term at fs/2; A[-1] contains the largest positive frequency (fs/2*(n-1)/n), and is complex in the general case.

If the input a contains an imaginary part, it is silently discarded.
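The symmetry properties described in these notes can be checked numerically. The following is a minimal sketch using NumPy only; the seed and the array lengths are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
a_even = rng.standard_normal(8)  # n = 8 (even)
a_odd = rng.standard_normal(7)   # n = 7 (odd)

A_even = np.fft.rfft(a_even)     # length n//2 + 1 = 5
A_odd = np.fft.rfft(a_odd)       # length n//2 + 1 = 4
full = np.fft.fft(a_even)        # all 8 frequency terms

# The negative-frequency terms of the full FFT are the complex conjugates
# of the positive-frequency terms, so rfft does not need to return them.
redundant = np.allclose(full[5:], np.conj(full[3:0:-1]))

# The zero-frequency term is real, and so is the Nyquist term when n is even.
dc_imag = A_even[0].imag
nyquist_imag = A_even[-1].imag
```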

Examples

>>> np.fft.fft([0, 1, 0, 0])
array([ 1.+0.j,  0.-1.j, -1.+0.j,  0.+1.j]) # may vary
>>> np.fft.rfft([0, 1, 0, 0])
array([ 1.+0.j,  0.-1.j, -1.+0.j]) # may vary

Notice how the final element of the fft output is the complex conjugate of the second element, for real input. For rfft, this symmetry is exploited to compute only the non-negative frequency terms.
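The single-chunk requirement stated at the top of this entry can be illustrated with a small Dask array. This is a sketch under the assumption that dask and NumPy are installed; the data and chunk sizes are made up for the example.

```python
import numpy as np
import dask.array as da

x_np = np.arange(8, dtype=float)
x = da.from_array(x_np, chunks=4)     # two chunks along the FFT axis

# With more than one chunk along the transformed axis the call is rejected.
try:
    da.fft.rfft(x)
    multi_chunk_rejected = False
except ValueError:
    multi_chunk_rejected = True

# Rechunk so that axis 0 is a single chunk (-1 means "the whole axis").
x_single = x.rechunk({0: -1})
spectrum = da.fft.rfft(x_single).compute()
```

After rechunking, the result should agree with np.fft.rfft applied to the same data.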

dask.array.fft.rfft2(a, s=None, axes=None)

Wrapping of numpy.fft.rfft2

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.rfft2 docstring follows below:

Compute the 2-dimensional FFT of a real array.

Parameters
a : array

Input array, taken to be real.

s : sequence of ints, optional

Shape of the FFT.

axes : sequence of ints, optional

Axes over which to compute the FFT.

norm : {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out : ndarray

The result of the real 2-D FFT.

See also

rfftn

Compute the N-dimensional discrete Fourier Transform for real input.

Notes

This is really just rfftn with different default behavior. For more details see rfftn.
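As a rough illustration of the note above, the sketch below (NumPy only, with an arbitrary input shape) compares rfft2 with rfftn restricted to the last two axes.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed
a = rng.standard_normal((3, 4, 6))

via_rfft2 = np.fft.rfft2(a)               # transforms the last two axes
via_rfftn = np.fft.rfftn(a, axes=(-2, -1))

same = np.allclose(via_rfft2, via_rfftn)
```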

dask.array.fft.rfftn(a, s=None, axes=None)

Wrapping of numpy.fft.rfftn

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.rfftn docstring follows below:

Compute the N-dimensional discrete Fourier Transform for real input.

This function computes the N-dimensional discrete Fourier Transform over any number of axes in an M-dimensional real array by means of the Fast Fourier Transform (FFT). By default, all axes are transformed, with the real transform performed over the last axis, while the remaining transforms are complex.

Parameters
a : array_like

Input array, taken to be real.

s : sequence of ints, optional

Shape (length along each transformed axis) to use from the input (s[0] refers to axis 0, s[1] to axis 1, etc.). The final element of s corresponds to n for rfft(x, n), while for the remaining axes, it corresponds to n for fft(x, n). Along any axis, if the given shape is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used.

axes : sequence of ints, optional

Axes over which to compute the FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified.

norm : {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out : complex ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s and a, as explained in the parameters section above. The length of the last axis transformed will be s[-1]//2+1, while the remaining transformed axes will have lengths according to s, or unchanged from the input.

Raises
ValueError

If s and axes have different length.

IndexError

If an element of axes is larger than the number of axes of a.

See also

irfftn

The inverse of rfftn, i.e. the inverse of the n-dimensional FFT of real input.

fft

The one-dimensional FFT, with definitions and conventions used.

rfft

The one-dimensional FFT of real input.

fftn

The n-dimensional FFT.

rfft2

The two-dimensional FFT of real input.

Notes

The transform for real input is performed over the last transformation axis, as by rfft, then the transform over the remaining axes is performed as by fftn. The order of the output is as for rfft for the final transformation axis, and as for fftn for the remaining transformation axes.

See fft for details, definitions and conventions used.

Examples

>>> a = np.ones((2, 2, 2))
>>> np.fft.rfftn(a)
array([[[8.+0.j,  0.+0.j], # may vary
        [0.+0.j,  0.+0.j]],
       [[0.+0.j,  0.+0.j],
        [0.+0.j,  0.+0.j]]])
>>> np.fft.rfftn(a, axes=(2, 0))
array([[[4.+0.j,  0.+0.j], # may vary
        [4.+0.j,  0.+0.j]],
       [[0.+0.j,  0.+0.j],
        [0.+0.j,  0.+0.j]]])
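
The output shapes and the "rfft on the last axis, then fftn on the rest" decomposition described in the Notes can be checked with a short NumPy sketch. The input shape and padding sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)            # arbitrary seed
a = rng.standard_normal((4, 6, 5))

full = np.fft.rfftn(a)                    # last axis shrinks to 5//2 + 1 = 3

# Real transform over the last axis, then complex transforms over the others.
two_step = np.fft.fftn(np.fft.rfft(a, axis=-1), axes=(0, 1))

# Zero-pad axes 1 and 2 to length 8; the last axis becomes 8//2 + 1 = 5.
padded = np.fft.rfftn(a, s=(8, 8), axes=(1, 2))
```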
dask.array.fft.irfft(a, n=None, axis=None)

Wrapping of numpy.fft.irfft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.irfft docstring follows below:

Compute the inverse of the n-point DFT for real input.

This function computes the inverse of the one-dimensional n-point discrete Fourier Transform of real input computed by rfft. In other words, irfft(rfft(a), len(a)) == a to within numerical accuracy. (See Notes below for why len(a) is necessary here.)

The input is expected to be in the form returned by rfft, i.e. the real zero-frequency term followed by the complex positive frequency terms in order of increasing frequency. Since the discrete Fourier Transform of real input is Hermitian-symmetric, the negative frequency terms are taken to be the complex conjugates of the corresponding positive frequency terms.

Parameters
a : array_like

The input array.

n : int, optional

Length of the transformed axis of the output. For n output points, n//2+1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.

axis : int, optional

Axis over which to compute the inverse FFT. If not given, the last axis is used.

norm : {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out : ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*(m-1) where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified.

Raises
IndexError

If axis is larger than the last axis of a.

See also

numpy.fft

For definition of the DFT and conventions used.

rfft

The one-dimensional FFT of real input, of which irfft is inverse.

fft

The one-dimensional FFT.

irfft2

The inverse of the two-dimensional FFT of real input.

irfftn

The inverse of the n-dimensional FFT of real input.

Notes

Returns the real valued n-point inverse discrete Fourier transform of a, where a contains the non-negative frequency terms of a Hermitian-symmetric sequence. n is the length of the result, not the input.

If you specify an n such that a must be zero-padded or truncated, the extra/removed values will be added/removed at high frequencies. One can thus resample a series to m points via Fourier interpolation by: a_resamp = irfft(rfft(a), m).

The correct interpretation of the hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, irfft assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the correct length of the real input must be given.

Examples

>>> np.fft.ifft([1, -1j, -1, 1j])
array([0.+0.j,  1.+0.j,  0.+0.j,  0.+0.j]) # may vary
>>> np.fft.irfft([1, -1j, -1])
array([0.,  1.,  0.,  0.])

Notice how the last term in the input to the ordinary ifft is the complex conjugate of the second term, and the output has zero imaginary part everywhere. When calling irfft, the negative frequencies are not specified, and the output array is purely real.
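The need to pass n for odd-length signals can be seen in a small round trip. This is a NumPy-only sketch; the sample values are arbitrary.

```python
import numpy as np

a = np.array([0.0, 1.0, 3.0, -2.0, 4.0, 0.5, -1.0])   # odd length, 7 points
A = np.fft.rfft(a)                                     # 7//2 + 1 = 4 points

default_length = np.fft.irfft(A)        # assumes an even length: 2*(4-1) = 6
exact = np.fft.irfft(A, len(a))         # recovers all 7 points
```

Without n the output has one point fewer than the original signal, so it cannot equal a; with n = len(a) the round trip reproduces a to within numerical accuracy.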

dask.array.fft.irfft2(a, s=None, axes=None)

Wrapping of numpy.fft.irfft2

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.irfft2 docstring follows below:

Compute the 2-dimensional inverse FFT of a real array.

Parameters
a : array_like

The input array

s : sequence of ints, optional

Shape of the real output to the inverse FFT.

axes : sequence of ints, optional

The axes over which to compute the inverse fft. Default is the last two axes.

norm : {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out : ndarray

The result of the inverse real 2-D FFT.

See also

irfftn

Compute the inverse of the N-dimensional FFT of real input.

Notes

This is really irfftn with different defaults. For more details see irfftn.

dask.array.fft.irfftn(a, s=None, axes=None)

Wrapping of numpy.fft.irfftn

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.irfftn docstring follows below:

Compute the inverse of the N-dimensional FFT of real input.

This function computes the inverse of the N-dimensional discrete Fourier Transform for real input over any number of axes in an M-dimensional array by means of the Fast Fourier Transform (FFT). In other words, irfftn(rfftn(a), a.shape) == a to within numerical accuracy. (The a.shape is necessary like len(a) is for irfft, and for the same reason.)

The input should be ordered in the same way as is returned by rfftn, i.e. as for irfft for the final transformation axis, and as for ifftn along all the other axes.

Parameters
a : array_like

Input array.

s : sequence of ints, optional

Shape (length of each transformed axis) of the output (s[0] refers to axis 0, s[1] to axis 1, etc.). s is also the number of input points used along this axis, except for the last axis, where s[-1]//2+1 points of the input are used. Along any axis, if the shape indicated by s is smaller than that of the input, the input is cropped. If it is larger, the input is padded with zeros. If s is not given, the shape of the input along the axes specified by axes is used, except for the last axis, which is taken to be 2*(m-1) where m is the length of the input along that axis.

axes : sequence of ints, optional

Axes over which to compute the inverse FFT. If not given, the last len(s) axes are used, or all axes if s is also not specified. Repeated indices in axes means that the inverse transform over that axis is performed multiple times.

norm : {None, “ortho”}, optional

New in version 1.10.0.

Normalization mode (see numpy.fft). Default is None.

Returns
out : ndarray

The truncated or zero-padded input, transformed along the axes indicated by axes, or by a combination of s or a, as explained in the parameters section above. The length of each transformed axis is as given by the corresponding element of s, or the length of the input in every axis except for the last one if s is not given. In the final transformed axis the length of the output when s is not given is 2*(m-1) where m is the length of the final transformed axis of the input. To get an odd number of output points in the final axis, s must be specified.

Raises
ValueError

If s and axes have different length.

IndexError

If an element of axes is larger than the number of axes of a.

See also

rfftn

The forward n-dimensional FFT of real input, of which irfftn is the inverse.

fft

The one-dimensional FFT, with definitions and conventions used.

irfft

The inverse of the one-dimensional FFT of real input.

irfft2

The inverse of the two-dimensional FFT of real input.

Notes

See fft for definitions and conventions used.

See rfft for definitions and conventions used for real input.

The correct interpretation of the hermitian input depends on the shape of the original data, as given by s. This is because each input shape could correspond to either an odd or even length signal. By default, irfftn assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. When performing the final complex to real transform, the last value is thus treated as purely real. To avoid losing information, the correct shape of the real input must be given.

Examples

>>> a = np.zeros((3, 2, 2))
>>> a[0, 0, 0] = 3 * 2 * 2
>>> np.fft.irfftn(a)
array([[[1.,  1.],
        [1.,  1.]],
       [[1.,  1.],
        [1.,  1.]],
       [[1.,  1.],
        [1.,  1.]]])
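
As with irfft, the original shape must be supplied when the last axis has odd length. A minimal NumPy sketch, with an arbitrary input shape and axes passed explicitly alongside s:

```python
import numpy as np

rng = np.random.default_rng(3)                 # arbitrary seed
a = rng.standard_normal((3, 4, 5))             # odd length on the last axis

A = np.fft.rfftn(a)                            # shape (3, 4, 5//2 + 1)
default_shape = np.fft.irfftn(A)               # last axis becomes 2*(3-1) = 4
exact = np.fft.irfftn(A, s=a.shape, axes=(0, 1, 2))
```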
dask.array.fft.hfft(a, n=None, axis=None)

Wrapping of numpy.fft.hfft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.hfft docstring follows below:

Compute the FFT of a signal that has Hermitian symmetry, i.e., a real spectrum.

Parameters
a : array_like

The input array.

n : int, optional

Length of the transformed axis of the output. For n output points, n//2 + 1 input points are necessary. If the input is longer than this, it is cropped. If it is shorter than this, it is padded with zeros. If n is not given, it is taken to be 2*(m-1) where m is the length of the input along the axis specified by axis.

axis : int, optional

Axis over which to compute the FFT. If not given, the last axis is used.

norm : {None, “ortho”}, optional

Normalization mode (see numpy.fft). Default is None.

New in version 1.10.0.

Returns
out : ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n, or, if n is not given, 2*m - 2 where m is the length of the transformed axis of the input. To get an odd number of output points, n must be specified, for instance as 2*m - 1 in the typical case.

Raises
IndexError

If axis is larger than the last axis of a.

See also

rfft

Compute the one-dimensional FFT for real input.

ihfft

The inverse of hfft.

Notes

hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd.

  • even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,

  • odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.

The correct interpretation of the hermitian input depends on the length of the original data, as given by n. This is because each input shape could correspond to either an odd or even length signal. By default, hfft assumes an even output length which puts the last entry at the Nyquist frequency; aliasing with its symmetric counterpart. By Hermitian symmetry, the value is thus treated as purely real. To avoid losing information, the shape of the full signal must be given.

Examples

>>> signal = np.array([1, 2, 3, 4, 3, 2])
>>> np.fft.fft(signal)
array([15.+0.j,  -4.+0.j,   0.+0.j,  -1.-0.j,   0.+0.j,  -4.+0.j]) # may vary
>>> np.fft.hfft(signal[:4]) # Input first half of signal
array([15.,  -4.,   0.,  -1.,   0.,  -4.])
>>> np.fft.hfft(signal, 6)  # Input entire signal and truncate
array([15.,  -4.,   0.,  -1.,   0.,  -4.])
>>> signal = np.array([[1, 1.j], [-1.j, 2]])
>>> np.conj(signal.T) - signal   # check Hermitian symmetry
array([[ 0.-0.j,  -0.+0.j], # may vary
       [ 0.+0.j,  0.-0.j]])
>>> freq_spectrum = np.fft.hfft(signal)
>>> freq_spectrum
array([[ 1.,  1.],
       [ 2., -2.]])
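
The even and odd round-trip identities from the Notes can be exercised directly. The sketch below uses NumPy only; the input is a made-up half of a Hermitian signal whose first and last entries are real, which the even-length case requires.

```python
import numpy as np

# Non-redundant half of a Hermitian-symmetric signal (illustrative values).
a = np.array([1.0, 2.0 + 1.0j, 3.0 - 0.5j, 4.0])
m = len(a)

even = np.fft.ihfft(np.fft.hfft(a, 2 * m - 2))   # full signal of length 6
odd = np.fft.ihfft(np.fft.hfft(a, 2 * m - 1))    # full signal of length 7
```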
dask.array.fft.ihfft(a, n=None, axis=None)

Wrapping of numpy.fft.ihfft

The axis along which the FFT is applied must have only one chunk. To change the array’s chunking use dask.array.Array.rechunk.

The numpy.fft.ihfft docstring follows below:

Compute the inverse FFT of a signal that has Hermitian symmetry.

Parameters
a : array_like

Input array.

n : int, optional

Length of the inverse FFT, the number of points along transformation axis in the input to use. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.

axis : int, optional

Axis over which to compute the inverse FFT. If not given, the last axis is used.

norm : {None, “ortho”}, optional

Normalization mode (see numpy.fft). Default is None.

New in version 1.10.0.

Returns
out : complex ndarray

The truncated or zero-padded input, transformed along the axis indicated by axis, or the last one if axis is not specified. The length of the transformed axis is n//2 + 1.

See also

hfft, irfft

Notes

hfft/ihfft are a pair analogous to rfft/irfft, but for the opposite case: here the signal has Hermitian symmetry in the time domain and is real in the frequency domain. So here it’s hfft for which you must supply the length of the result if it is to be odd:

  • even: ihfft(hfft(a, 2*len(a) - 2)) == a, within roundoff error,

  • odd: ihfft(hfft(a, 2*len(a) - 1)) == a, within roundoff error.

Examples

>>> spectrum = np.array([ 15, -4, 0, -1, 0, -4])
>>> np.fft.ifft(spectrum)
array([1.+0.j,  2.+0.j,  3.+0.j,  4.+0.j,  3.+0.j,  2.+0.j]) # may vary
>>> np.fft.ihfft(spectrum)
array([ 1.-0.j,  2.-0.j,  3.-0.j,  4.-0.j]) # may vary
dask.array.fft.fftfreq(n, d=1.0, chunks='auto')

Return the Discrete Fourier Transform sample frequencies.

This docstring was copied from numpy.fft.fftfreq.

Some inconsistencies with the Dask version may exist.

The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.

Given a window length n and a sample spacing d:

f = [0, 1, ...,   n/2-1,     -n/2, ..., -1] / (d*n)   if n is even
f = [0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1] / (d*n)   if n is odd
Parameters
n : int

Window length.

d : scalar, optional

Sample spacing (inverse of the sampling rate). Defaults to 1.

Returns
f : ndarray

Array of length n containing the sample frequencies.

Examples

>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5], dtype=float)  
>>> fourier = np.fft.fft(signal)  
>>> n = signal.size  
>>> timestep = 0.1  
>>> freq = np.fft.fftfreq(n, d=timestep)  
>>> freq  
array([ 0.  ,  1.25,  2.5 , ..., -3.75, -2.5 , -1.25])
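
The two formulas for f given above can be written out directly and compared with the library result. In this sketch the helper name fftfreq_by_formula is hypothetical and exists only for illustration.

```python
import numpy as np

def fftfreq_by_formula(n, d=1.0):
    """Build the sample frequencies from the even/odd formulas above."""
    positive = np.arange(0, (n - 1) // 2 + 1)   # 0, 1, ..., (n-1)//2
    negative = np.arange(-(n // 2), 0)          # -(n//2), ..., -1
    return np.concatenate([positive, negative]) / (d * n)

even_case = fftfreq_by_formula(8, d=0.1)
odd_case = fftfreq_by_formula(7, d=0.1)
```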
dask.array.fft.rfftfreq(n, d=1.0, chunks='auto')

Return the Discrete Fourier Transform sample frequencies (for usage with rfft, irfft).

This docstring was copied from numpy.fft.rfftfreq.

Some inconsistencies with the Dask version may exist.

The returned float array f contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second.

Given a window length n and a sample spacing d:

f = [0, 1, ...,     n/2-1,     n/2] / (d*n)   if n is even
f = [0, 1, ..., (n-1)/2-1, (n-1)/2] / (d*n)   if n is odd

Unlike fftfreq (but like scipy.fftpack.rfftfreq) the Nyquist frequency component is considered to be positive.

Parameters
n : int

Window length.

d : scalar, optional

Sample spacing (inverse of the sampling rate). Defaults to 1.

Returns
f : ndarray

Array of length n//2 + 1 containing the sample frequencies.

Examples

>>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5, -3, 4], dtype=float)  
>>> fourier = np.fft.rfft(signal)  
>>> n = signal.size  
>>> sample_rate = 100  
>>> freq = np.fft.fftfreq(n, d=1./sample_rate)  
>>> freq  
array([  0.,  10.,  20., ..., -30., -20., -10.])
>>> freq = np.fft.rfftfreq(n, d=1./sample_rate)  
>>> freq  
array([  0.,  10.,  20.,  30.,  40.,  50.])
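
The Dask version additionally takes a chunks argument. The following sketch assumes dask and NumPy are installed and uses an arbitrary chunk size of 4 to show how the result is split.

```python
import numpy as np
import dask.array as da

n, sample_rate = 10, 100

# Lazy Dask array of n//2 + 1 = 6 frequencies, split into chunks of at most 4.
freq = da.fft.rfftfreq(n, d=1.0 / sample_rate, chunks=4)
values = freq.compute()
```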
dask.array.fft.fftshift(x, axes=None)

Shift the zero-frequency component to the center of the spectrum.

This docstring was copied from numpy.fft.fftshift.

Some inconsistencies with the Dask version may exist.

This function swaps half-spaces for all axes listed (defaults to all). Note that y[0] is the Nyquist component only if len(x) is even.

Parameters
x : array_like

Input array.

axes : int or shape tuple, optional

Axes over which to shift. Default is None, which shifts all axes.

Returns
y : ndarray

The shifted array.

See also

ifftshift

The inverse of fftshift.

Examples

>>> freqs = np.fft.fftfreq(10, 0.1)  
>>> freqs  
array([ 0.,  1.,  2., ..., -3., -2., -1.])
>>> np.fft.fftshift(freqs)  
array([-5., -4., -3., -2., -1.,  0.,  1.,  2.,  3.,  4.])

Shift the zero-frequency component only along the second axis:

>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)  
>>> freqs  
array([[ 0.,  1.,  2.],
       [ 3.,  4., -4.],
       [-3., -2., -1.]])
>>> np.fft.fftshift(freqs, axes=(1,))  
array([[ 2.,  0.,  1.],
       [-4.,  3.,  4.],
       [-1., -3., -2.]])
dask.array.fft.ifftshift(x, axes=None)

The inverse of fftshift. Although identical for even-length x, the functions differ by one sample for odd-length x.

This docstring was copied from numpy.fft.ifftshift.

Some inconsistencies with the Dask version may exist.

Parameters
x : array_like

Input array.

axes : int or shape tuple, optional

Axes over which to calculate. Defaults to None, which shifts all axes.

Returns
y : ndarray

The shifted array.

See also

fftshift

Shift zero-frequency component to the center of the spectrum.

Examples

>>> freqs = np.fft.fftfreq(9, d=1./9).reshape(3, 3)  
>>> freqs  
array([[ 0.,  1.,  2.],
       [ 3.,  4., -4.],
       [-3., -2., -1.]])
>>> np.fft.ifftshift(np.fft.fftshift(freqs))  
array([[ 0.,  1.,  2.],
       [ 3.,  4., -4.],
       [-3., -2., -1.]])
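
The one-sample difference between fftshift and ifftshift for odd-length input can be seen on a small array. A NumPy-only sketch with an arbitrary length of 5:

```python
import numpy as np

x = np.arange(5)                        # odd length

shifted = np.fft.fftshift(x)            # array([3, 4, 0, 1, 2])
unshifted = np.fft.ifftshift(x)         # array([2, 3, 4, 0, 1])
round_trip = np.fft.ifftshift(np.fft.fftshift(x))

# For even length the two functions coincide.
y = np.arange(6)
even_same = np.array_equal(np.fft.fftshift(y), np.fft.ifftshift(y))
```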
dask.array.random.beta(a, b, size=None, chunks='auto', **kwargs)

Draw samples from a Beta distribution.

This docstring was copied from numpy.random.mtrand.RandomState.beta.

Some inconsistencies with the Dask version may exist.

The Beta distribution is a special case of the Dirichlet distribution, and is related to the Gamma distribution. It has the probability distribution function

\[f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1} (1 - x)^{\beta - 1},\]

where the normalization, B, is the beta function,

\[B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} dt.\]

It is often seen in Bayesian inference and order statistics.

Note

New code should use the beta method of a default_rng() instance instead; see random-quick-start.

Parameters
a : float or array_like of floats

Alpha, positive (>0).

b : float or array_like of floats

Beta, positive (>0).

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a and b are both scalars. Otherwise, np.broadcast(a, b).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized beta distribution.

See also

Generator.beta

which should be used for new code.
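
A minimal sketch of drawing chunked Beta samples with Dask, assuming dask and NumPy are installed. The seed, parameters, sample count and chunk size are arbitrary; the sample mean is compared with the known mean a / (a + b) of the Beta distribution.

```python
import dask.array as da

rs = da.random.RandomState(0)            # seeded Dask random state
a, b = 2.0, 5.0

# 100,000 samples stored as four chunks of 25,000 each; nothing is
# computed until .compute() is called.
x = rs.beta(a, b, size=(100_000,), chunks=25_000)

sample_mean = float(x.mean().compute())
expected_mean = a / (a + b)
```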

dask.array.random.binomial(n, p, size=None, chunks='auto', **kwargs)

Draw samples from a binomial distribution.

This docstring was copied from numpy.random.mtrand.RandomState.binomial.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a binomial distribution with specified parameters, n trials and p probability of success, where n is an integer >= 0 and p is in the interval [0,1]. (n may be input as a float, but it is truncated to an integer in use.)

Note

New code should use the binomial method of a default_rng() instance instead; see random-quick-start.

Parameters
n : int or array_like of ints

Parameter of the distribution, >= 0. Floats are also accepted, but they will be truncated to integers.

p : float or array_like of floats

Parameter of the distribution, >= 0 and <=1.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized binomial distribution, where each sample is equal to the number of successes over the n trials.

See also

scipy.stats.binom

probability density function, distribution or cumulative density function, etc.

Generator.binomial

which should be used for new code.

Notes

The probability density for the binomial distribution is

\[P(N) = \binom{n}{N}p^N(1-p)^{n-N},\]

where \(n\) is the number of trials, \(p\) is the probability of success, and \(N\) is the number of successes.

When estimating the standard error of a proportion in a population by using a random sample, the normal distribution works well unless the product p*n <=5, where p = population proportion estimate, and n = number of samples, in which case the binomial distribution is used instead. For example, a sample of 15 people shows 4 who are left handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4, so the binomial distribution should be used in this case.

References

1

Dalgaard, Peter, “Introductory Statistics with R”, Springer-Verlag, 2002.

2

Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.

3

Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.

4

Weisstein, Eric W. “Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/BinomialDistribution.html

5

Wikipedia, “Binomial distribution”, https://en.wikipedia.org/wiki/Binomial_distribution

Examples

Draw samples from the distribution:

>>> n, p = 10, .5  # number of trials, probability of each trial  
>>> s = np.random.binomial(n, p, 1000)  
# result of flipping a coin 10 times, tested 1000 times.

A real world example. A company drills 9 wild-cat oil exploration wells, each with an estimated probability of success of 0.1. All nine wells fail. What is the probability of that happening?

Let’s do 20,000 trials of the model, and count the number that generate zero positive results.

>>> sum(np.random.binomial(9, 0.1, 20000) == 0)/20000.  
# answer = 0.38885, or about 39% (the exact probability is 0.9**9, about 0.387).
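
The simulated estimate can be compared with the value given by the formula for P(N) in the Notes. This sketch uses the Generator API recommended for new code; the seed is arbitrary.

```python
import math
import numpy as np

n, p = 9, 0.1                     # nine wells, 10% success probability each

# P(N = 0) from the binomial formula: C(n, 0) * p**0 * (1 - p)**n
exact = math.comb(n, 0) * p**0 * (1 - p) ** (n - 0)

rng = np.random.default_rng(0)    # arbitrary seed
estimate = np.mean(rng.binomial(n, p, 20_000) == 0)
```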
dask.array.random.chisquare(df, size=None, chunks='auto', **kwargs)

Draw samples from a chi-square distribution.

This docstring was copied from numpy.random.mtrand.RandomState.chisquare.

Some inconsistencies with the Dask version may exist.

When df independent random variables, each with standard normal distributions (mean 0, variance 1), are squared and summed, the resulting distribution is chi-square (see Notes). This distribution is often used in hypothesis testing.

Note

New code should use the chisquare method of a default_rng() instance instead; see random-quick-start.

Parameters
df : float or array_like of floats

Number of degrees of freedom, must be > 0.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized chi-square distribution.

Raises
ValueError

When df <= 0 or when an inappropriate size (e.g. size=-1) is given.

See also

Generator.chisquare

which should be used for new code.

Notes

The variable obtained by summing the squares of df independent, standard normally distributed random variables:

\[Q = \sum_{i=1}^{k} X^2_i\]

with k = df, is chi-square distributed with k degrees of freedom, denoted

\[Q \sim \chi^2_k.\]

The probability density function of the chi-squared distribution is

\[p(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)} x^{k/2 - 1} e^{-x/2},\]

where \(\Gamma\) is the gamma function,

\[\Gamma(x) = \int_0^{\infty} t^{x - 1} e^{-t} dt.\]

References

1

NIST “Engineering Statistics Handbook” https://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm

Examples

>>> np.random.chisquare(2,4)  
array([ 1.89920014,  9.00867716,  3.13710533,  5.62318272]) # random
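
The construction in the Notes (summing the squares of df standard normal variables) can be compared with direct sampling. A NumPy sketch with arbitrary seed and sample size, checked against the known mean k and variance 2k of the chi-square distribution:

```python
import numpy as np

k = 3                                          # degrees of freedom (df)
rng = np.random.default_rng(0)                 # arbitrary seed

normals = rng.standard_normal((200_000, k))
q = (normals**2).sum(axis=1)                   # sum of k squared normals

direct = rng.chisquare(k, size=200_000)        # sampled directly
```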
dask.array.random.choice(a, size=None, replace=True, p=None, chunks='auto')

Generates a random sample from a given 1-D array

This docstring was copied from numpy.random.mtrand.RandomState.choice.

Some inconsistencies with the Dask version may exist.

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; see random-quick-start.

Parameters
a : 1-D array-like or int

If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if a were np.arange(a)

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

replace : boolean, optional

Whether the sample is with or without replacement

p : 1-D array-like, optional

The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.

Returns
samples : single item or ndarray

The generated random samples

Raises
ValueError

If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint, shuffle, permutation
Generator.choice

which should be used in new code

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)  
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)

Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])  
array([3, 3, 0]) # random

Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)  
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]

Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])  
array([2, 3, 0]) # random

Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']  
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])  
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
      dtype='<U11')
dask.array.random.exponential(scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from an exponential distribution.

This docstring was copied from numpy.random.mtrand.RandomState.exponential.

Some inconsistencies with the Dask version may exist.

Its probability density function is

\[f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),\]

for x > 0 and 0 elsewhere. \(\beta\) is the scale parameter, which is the inverse of the rate parameter \(\lambda = 1/\beta\). The rate parameter is an alternative, widely used parameterization of the exponential distribution [3].

The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [1], or the time between page requests to Wikipedia [2].

Note

New code should use the exponential method of a default_rng() instance instead; see random-quick-start.

Parameters
scale : float or array_like of floats

The scale parameter, \(\beta = 1/\lambda\). Must be non-negative.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized exponential distribution.

See also

Generator.exponential

which should be used for new code.

References

1

Peyton Z. Peebles Jr., “Probability, Random Variables and Random Signal Principles”, 4th ed, 2001, p. 57.

2

Wikipedia, “Poisson process”, https://en.wikipedia.org/wiki/Poisson_process

3

Wikipedia, “Exponential distribution”, https://en.wikipedia.org/wiki/Exponential_distribution

dask.array.random.f(dfnum, dfden, size=None, chunks='auto', **kwargs)

Draw samples from an F distribution.

This docstring was copied from numpy.random.mtrand.RandomState.f.

Some inconsistencies with the Dask version may exist.

Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters must be greater than zero.

The F distribution (also known as the Fisher distribution) is a continuous probability distribution that arises in ANOVA tests. An F variate is the ratio of two chi-square variates, each divided by its degrees of freedom.

Note

New code should use the f method of a default_rng() instance instead; see random-quick-start.

Parameters
dfnumfloat or array_like of floats

Degrees of freedom in numerator, must be > 0.

dfdenfloat or array_like of floats

Degrees of freedom in denominator, must be > 0.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum and dfden are both scalars. Otherwise, np.broadcast(dfnum, dfden).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized Fisher distribution.

See also

scipy.stats.f

probability density function, distribution or cumulative density function, etc.

Generator.f

which should be used for new code.

Notes

The F statistic is used to compare in-group variances to between-group variances. Calculating the distribution depends on the sampling, and so it is a function of the respective degrees of freedom in the problem. The variable dfnum is the between-groups degrees of freedom, the number of groups (samples) minus one, while dfden is the within-groups degrees of freedom, the sum of the number of observations in each group minus the number of groups.

References

1

Glantz, Stanton A. “Primer of Biostatistics.”, McGraw-Hill, Fifth Edition, 2002.

2

Wikipedia, “F-distribution”, https://en.wikipedia.org/wiki/F-distribution

Examples

An example from Glantz[1], pp 47-40:

Two groups were compared: children of diabetics (25 people) and children of people without diabetes (25 controls). Fasting blood glucose was measured; the case group had a mean value of 86.1 and the controls a mean value of 82.2. Standard deviations were 2.09 and 2.49 respectively. Are these data consistent with the null hypothesis that the parents’ diabetic status does not affect their children’s blood glucose levels? Calculating the F statistic from the data gives a value of 36.01.
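The quoted F statistic can be approximately reproduced from the summary statistics with a one-way ANOVA calculation. The following is a sketch in plain Python with illustrative variable names; it gives about 35.98, slightly below the quoted 36.01, presumably because the published means and standard deviations are rounded.

```python
# Summary statistics quoted in the example.
n_per_group = 25
means = [86.1, 82.2]       # cases, controls
std_devs = [2.09, 2.49]    # cases, controls

n_groups = len(means)
grand_mean = sum(means) / n_groups   # valid because both groups have the same size

# Between-groups mean square, with dfnum = n_groups - 1 = 1.
dfnum = n_groups - 1
ms_between = n_per_group * sum((m - grand_mean) ** 2 for m in means) / dfnum

# Within-groups mean square, with dfden = total observations - n_groups = 48.
dfden = n_groups * n_per_group - n_groups
ms_within = sum((n_per_group - 1) * s ** 2 for s in std_devs) / dfden

f_statistic = ms_between / ms_within
```

The degrees of freedom computed here are the same dfnum = 1 and dfden = 48 used when drawing samples from the distribution.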

Draw samples from the distribution:

>>> dfnum = 1. # between group degrees of freedom  
>>> dfden = 48. # within groups degrees of freedom  
>>> s = np.random.f(dfnum, dfden, 1000)  

The lower bound for the top 1% of the samples is:

>>> np.sort(s)[-10]  
7.61988120985 # random

So there is about a 1% chance that the F statistic will exceed 7.62. The measured value is 36, so the null hypothesis is rejected at the 1% level.
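The same threshold can be cross-checked by building F variates directly as a ratio of two chi-square variates, each divided by its degrees of freedom. This is a standard-library sketch, not the Dask or NumPy implementation; the helper name draw_f and the seed are assumptions made for illustration.

```python
import random


def draw_f(dfnum, dfden, rng):
    """One F variate as a ratio of two scaled chi-square variates."""
    # A chi-square variate with k degrees of freedom is Gamma(shape=k/2, scale=2).
    chi2_num = rng.gammavariate(dfnum / 2.0, 2.0)
    chi2_den = rng.gammavariate(dfden / 2.0, 2.0)
    return (chi2_num / dfnum) / (chi2_den / dfden)


rng = random.Random(12345)
n = 100_000
samples = sorted(draw_f(1.0, 48.0, rng) for _ in range(n))
threshold_1pct = samples[int(0.99 * n)]   # lower bound of the top 1% of samples
```

With a larger sample the estimated threshold settles a little above 7, still far below the measured value of 36, so the conclusion is unchanged.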

dask.array.random.gamma(shape, scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a Gamma distribution.

This docstring was copied from numpy.random.mtrand.RandomState.gamma.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale (sometimes designated “theta”), where both parameters are > 0.

Note

New code should use the gamma method of a default_rng() instance instead; see random-quick-start.

Parameters
shapefloat or array_like of floats

The shape of the gamma distribution. Must be non-negative.

scalefloat or array_like of floats, optional

The scale of the gamma distribution. Must be non-negative. Default is equal to 1.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape and scale are both scalars. Otherwise, np.broadcast(shape, scale).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized gamma distribution.

See also

scipy.stats.gamma

probability density function, distribution or cumulative density function, etc.

Generator.gamma

which should be used for new code.

Notes

The probability density for the Gamma distribution is

\[p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},\]

where \(k\) is the shape and \(\theta\) the scale, and \(\Gamma\) is the Gamma function.

The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.
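As a quick numerical check of the density above, the sketch below integrates it with a simple trapezoid rule and recovers the mean \(k\theta\) and standard deviation \(\sqrt{k}\,\theta\). The function names and the integration cutoff are illustrative assumptions; only the Python standard library is used.

```python
import math


def gamma_pdf(x, k, theta):
    """Density x**(k-1) * exp(-x/theta) / (theta**k * Gamma(k))."""
    return x ** (k - 1) * math.exp(-x / theta) / (theta ** k * math.gamma(k))


def trapezoid(f, a, b, steps):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h


k, theta = 2.0, 2.0
upper = 80.0   # the density is negligible beyond this point for these parameters
area = trapezoid(lambda x: gamma_pdf(x, k, theta), 0.0, upper, 8000)
mean = trapezoid(lambda x: x * gamma_pdf(x, k, theta), 0.0, upper, 8000)
second = trapezoid(lambda x: x * x * gamma_pdf(x, k, theta), 0.0, upper, 8000)
std = math.sqrt(second - mean ** 2)
```

For shape 2 and scale 2 this gives an area of about 1, a mean of about 4 and a standard deviation of about 2*sqrt(2).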

References

1

Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html

2

Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution

Examples

Draw samples from the distribution:

>>> shape, scale = 2., 2.  # mean=4, std=2*sqrt(2)  
>>> s = np.random.gamma(shape, scale, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> import scipy.special as sps  
>>> count, bins, ignored = plt.hist(s, 50, density=True)  
>>> y = bins**(shape-1)*(np.exp(-bins/scale) /  
...                      (sps.gamma(shape)*scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')  
>>> plt.show()  
dask.array.random.geometric(p, size=None, chunks='auto', **kwargs)

Draw samples from the geometric distribution.

This docstring was copied from numpy.random.mtrand.RandomState.geometric.

Some inconsistencies with the Dask version may exist.

Bernoulli trials are experiments with one of two outcomes: success or failure (an example of such an experiment is flipping a coin). The geometric distribution models the number of trials that must be run in order to achieve success. It is therefore supported on the positive integers, k = 1, 2, ....

The probability mass function of the geometric distribution is

\[f(k) = (1 - p)^{k - 1} p\]

where p is the probability of success of an individual trial.
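The probability mass function can be evaluated directly. The sketch below (plain Python, with an illustrative function name) shows that the probability of succeeding on the first trial equals p, that the probabilities sum to 1, and that the mean number of trials is 1/p.

```python
def geometric_pmf(k, p):
    """Probability that the first success occurs on trial k (k = 1, 2, ...)."""
    return (1.0 - p) ** (k - 1) * p


p = 0.35
# Terms beyond k = 200 are far below double precision for this p.
pmf_values = [geometric_pmf(k, p) for k in range(1, 201)]

prob_first_trial = pmf_values[0]       # equals p
total_probability = sum(pmf_values)    # approaches 1
mean_trials = sum(k * q for k, q in enumerate(pmf_values, start=1))  # approaches 1 / p
```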

Note

New code should use the geometric method of a default_rng() instance instead; see random-quick-start.

Parameters
pfloat or array_like of floats

The probability of success of an individual trial.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized geometric distribution.

See also

Generator.geometric

which should be used for new code.

Examples

Draw ten thousand values from the geometric distribution, with the probability of an individual success equal to 0.35:

>>> z = np.random.geometric(p=0.35, size=10000)  

How many trials succeeded after a single run?

>>> (z == 1).sum() / 10000.  
0.34889999999999999 #random
dask.array.random.gumbel(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a Gumbel distribution.

This docstring was copied from numpy.random.mtrand.RandomState.gumbel.

Some inconsistencies with the Dask version may exist.

Draw samples from a Gumbel distribution with specified location and scale. For more information on the Gumbel distribution, see Notes and References below.

Note

New code should use the gumbel method of a default_rng() instance instead; see random-quick-start.

Parameters
locfloat or array_like of floats, optional

The location of the mode of the distribution. Default is 0.

scalefloat or array_like of floats, optional

The scale parameter of the distribution. Default is 1. Must be non-negative.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized Gumbel distribution.

See also

scipy.stats.gumbel_l
scipy.stats.gumbel_r
scipy.stats.genextreme
weibull
Generator.gumbel

which should be used for new code.

Notes

The Gumbel (or Smallest Extreme Value (SEV) or the Smallest Extreme Value Type I) distribution is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. The Gumbel is a special case of the Extreme Value Type I distribution for maximums from distributions with “exponential-like” tails.

The probability density for the Gumbel distribution is

\[p(x) = \frac{e^{-(x - \mu)/ \beta}}{\beta} e^{ -e^{-(x - \mu)/ \beta}},\]

where \(\mu\) is the mode, a location parameter, and \(\beta\) is the scale parameter.

The Gumbel (named for German mathematician Emil Julius Gumbel) was used very early in the hydrology literature, for modeling the occurrence of flood events. It is also used for modeling maximum wind speed and rainfall rates. It is a “fat-tailed” distribution - the probability of an event in the tail of the distribution is larger than if one used a Gaussian, hence the surprisingly frequent occurrence of 100-year floods. Floods were initially modeled as a Gaussian process, which underestimated the frequency of extreme events.

It is one of a class of extreme value distributions, the Generalized Extreme Value (GEV) distributions, which also includes the Weibull and Frechet.

The function has a mean of \(\mu + 0.57721\beta\) and a variance of \(\frac{\pi^2}{6}\beta^2\).
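The mean and variance expressions can be inverted to estimate the location and scale from data (a method-of-moments fit). The sketch below draws Gumbel variates by inverting the CDF \(F(x) = e^{-e^{-(x-\mu)/\beta}}\) and then recovers the parameters. It relies only on the Python standard library; the helper name draw_gumbel and the seed are illustrative assumptions.

```python
import math
import random


def draw_gumbel(mu, beta, rng):
    """One Gumbel variate by inverting F(x) = exp(-exp(-(x - mu) / beta))."""
    u = rng.random()
    while u == 0.0:   # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return mu - beta * math.log(-math.log(u))


rng = random.Random(2024)
mu, beta = 0.0, 0.1
samples = [draw_gumbel(mu, beta, rng) for _ in range(100_000)]

n = len(samples)
sample_mean = sum(samples) / n
sample_std = math.sqrt(sum((x - sample_mean) ** 2 for x in samples) / n)

# Invert mean = mu + 0.57721 * beta and variance = pi**2 / 6 * beta**2.
beta_hat = sample_std * math.sqrt(6.0) / math.pi
mu_hat = sample_mean - 0.57721 * beta_hat
```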

References

1

Gumbel, E. J., “Statistics of Extremes,” New York: Columbia University Press, 1958.

2

Reiss, R.-D. and Thomas, M., “Statistical Analysis of Extreme Values from Insurance, Finance, Hydrology and Other Fields,” Basel: Birkhauser Verlag, 2001.

Examples

Draw samples from the distribution:

>>> mu, beta = 0, 0.1 # location and scale  
>>> s = np.random.gumbel(mu, beta, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 30, density=True)  
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)  
...          * np.exp( -np.exp( -(bins - mu) /beta) ),
...          linewidth=2, color='r')
>>> plt.show()  

Show how an extreme value distribution can arise from a Gaussian process and compare to a Gaussian:

>>> means = []  
>>> maxima = []  
>>> for i in range(1000):  
...    a = np.random.normal(mu, beta, 1000)
...    means.append(a.mean())
...    maxima.append(a.max())
>>> count, bins, ignored = plt.hist(maxima, 30, density=True)  
>>> beta = np.std(maxima) * np.sqrt(6) / np.pi  
>>> mu = np.mean(maxima) - 0.57721*beta  
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)  
...          * np.exp(-np.exp(-(bins - mu)/beta)),
...          linewidth=2, color='r')
>>> plt.plot(bins, 1/(beta * np.sqrt(2 * np.pi))  
...          * np.exp(-(bins - mu)**2 / (2 * beta**2)),
...          linewidth=2, color='g')
>>> plt.show()  
dask.array.random.hypergeometric(ngood, nbad, nsample, size=None, chunks='auto', **kwargs)

Draw samples from a Hypergeometric distribution.

This docstring was copied from numpy.random.mtrand.RandomState.hypergeometric.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a hypergeometric distribution with specified parameters, ngood (ways to make a good selection), nbad (ways to make a bad selection), and nsample (number of items sampled, which is less than or equal to the sum ngood + nbad).

Note

New code should use the hypergeometric method of a default_rng() instance instead; see random-quick-start.

Parameters
ngoodint or array_like of ints

Number of ways to make a good selection. Must be nonnegative.

nbadint or array_like of ints

Number of ways to make a bad selection. Must be nonnegative.

nsampleint or array_like of ints

Number of items sampled. Must be at least 1 and at most ngood + nbad.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if ngood, nbad, and nsample are all scalars. Otherwise, np.broadcast(ngood, nbad, nsample).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized hypergeometric distribution. Each sample is the number of good items within a randomly selected subset of size nsample taken from a set of ngood good items and nbad bad items.

See also

scipy.stats.hypergeom

probability density function, distribution or cumulative density function, etc.

Generator.hypergeometric

which should be used for new code.

Notes

The probability density for the Hypergeometric distribution is

\[P(x) = \frac{\binom{g}{x}\binom{b}{n-x}}{\binom{g+b}{n}},\]

where \(0 \le x \le n\) and \(n-b \le x \le g\)

for P(x) the probability of x good results in the drawn sample, g = ngood, b = nbad, and n = nsample.

Consider an urn with black and white marbles in it, ngood of them black and nbad white. If you draw nsample marbles without replacement, then the hypergeometric distribution describes the distribution of black marbles in the drawn sample.

Note that this distribution is very similar to the binomial distribution, except that in this case, samples are drawn without replacement, whereas in the Binomial case samples are drawn with replacement (or the sample space is infinite). As the sample space becomes large, this distribution approaches the binomial.

References

1

Lentner, Marvin, “Elementary Applied Statistics”, Bogden and Quigley, 1972.

2

Weisstein, Eric W. “Hypergeometric Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricDistribution.html

3

Wikipedia, “Hypergeometric distribution”, https://en.wikipedia.org/wiki/Hypergeometric_distribution

Examples

Draw samples from the distribution:

>>> ngood, nbad, nsamp = 100, 2, 10  
# number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)  
>>> from matplotlib.pyplot import hist  
>>> hist(s)  
#   note that it is very unlikely to grab both bad items

Suppose you have an urn with 15 white and 15 black marbles. If you pull 15 marbles at random, how likely is it that 12 or more of them are one color?

>>> s = np.random.hypergeometric(15, 15, 15, 100000)  
>>> sum(s>=12)/100000. + sum(s<=3)/100000.  
#   answer = 0.003 ... pretty unlikely!
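The simulated answer to the urn question can be compared with the exact value obtained from the formula in the Notes. This is a sketch using math.comb from the Python standard library; the function name hypergeom_pmf is an illustrative assumption.

```python
from math import comb


def hypergeom_pmf(x, ngood, nbad, nsample):
    """P(x good items in the sample), from the formula in the Notes."""
    if x < max(0, nsample - nbad) or x > min(nsample, ngood):
        return 0.0
    return comb(ngood, x) * comb(nbad, nsample - x) / comb(ngood + nbad, nsample)


ngood, nbad, nsample = 15, 15, 15
# "12 or more of one color" means x >= 12 or x <= 3.
prob_lopsided = sum(
    hypergeom_pmf(x, ngood, nbad, nsample)
    for x in range(nsample + 1)
    if x >= 12 or x <= 3
)
total = sum(hypergeom_pmf(x, ngood, nbad, nsample) for x in range(nsample + 1))
```

The exact probability is about 0.0028, in line with the simulated value of roughly 0.003.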
dask.array.random.laplace(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from the Laplace or double exponential distribution with specified location (or mean) and scale (decay).

This docstring was copied from numpy.random.mtrand.RandomState.laplace.

Some inconsistencies with the Dask version may exist.

The Laplace distribution is similar to the Gaussian/normal distribution, but is sharper at the peak and has fatter tails. It represents the difference between two independent, identically distributed exponential random variables.
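The statement about differences of exponential variables can be checked empirically. The sketch below subtracts two independent exponential draws and compares the sample moments with those of a Laplace distribution centred at 0, whose variance is twice the squared scale. It uses the standard-library random module as a stand-in; the seed and sample size are arbitrary choices.

```python
import random

rng = random.Random(7)
scale = 1.0   # each exponential has mean `scale`
n = 100_000

# expovariate takes a rate, so pass 1 / scale.
samples = [
    rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    for _ in range(n)
]

sample_mean = sum(samples) / n                               # close to 0
sample_var = sum((x - sample_mean) ** 2 for x in samples) / n  # close to 2 * scale**2
```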

Note

New code should use the laplace method of a default_rng() instance instead; see random-quick-start.

Parameters
locfloat or array_like of floats, optional

The position, \(\mu\), of the distribution peak. Default is 0.

scalefloat or array_like of floats, optional

\(\lambda\), the exponential decay. Default is 1. Must be non-negative.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized Laplace distribution.

See also

Generator.laplace

which should be used for new code.

Notes

It has the probability density function

\[f(x; \mu, \lambda) = \frac{1}{2\lambda} \exp\left(-\frac{|x - \mu|}{\lambda}\right).\]

The first law of Laplace, from 1774, states that the frequency of an error can be expressed as an exponential function of the absolute magnitude of the error, which leads to the Laplace distribution. For many problems in economics and health sciences, this distribution seems to model the data better than the standard Gaussian distribution.

References

1

Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.

2

Kotz, Samuel, et al. “The Laplace Distribution and Generalizations,” Birkhauser, 2001.

3

Weisstein, Eric W. “Laplace Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LaplaceDistribution.html

4

Wikipedia, “Laplace distribution”, https://en.wikipedia.org/wiki/Laplace_distribution

Examples

Draw samples from the distribution:

>>> loc, scale = 0., 1.  
>>> s = np.random.laplace(loc, scale, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 30, density=True)  
>>> x = np.arange(-8., 8., .01)  
>>> pdf = np.exp(-abs(x-loc)/scale)/(2.*scale)  
>>> plt.plot(x, pdf)  

Plot Gaussian for comparison:

>>> g = (1/(scale * np.sqrt(2 * np.pi)) *  
...      np.exp(-(x - loc)**2 / (2 * scale**2)))
>>> plt.plot(x,g)  
dask.array.random.logistic(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a logistic distribution.

This docstring was copied from numpy.random.mtrand.RandomState.logistic.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a logistic distribution with specified parameters, loc (location or mean, also median), and scale (>0).

Note

New code should use the logistic method of a default_rng() instance instead; see random-quick-start.

Parameters
locfloat or array_like of floats, optional

Parameter of the distribution. Default is 0.

scalefloat or array_like of floats, optional

Parameter of the distribution. Must be non-negative. Default is 1.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized logistic distribution.

See also

scipy.stats.logistic

probability density function, distribution or cumulative density function, etc.

Generator.logistic

which should be used for new code.

Notes

The probability density for the Logistic distribution is

\[P(x) = \frac{e^{-(x-\mu)/s}}{s(1+e^{-(x-\mu)/s})^2},\]

where \(\mu\) = location and \(s\) = scale.
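One way to sanity-check the density is to confirm that it is the derivative of the logistic cumulative distribution function \(1/(1+e^{-(x-\mu)/s})\). The sketch below does this with central differences in plain Python; the function names, step size and test points are illustrative assumptions.

```python
import math


def logistic_pdf(x, mu, s):
    """Density exp(-(x - mu) / s) / (s * (1 + exp(-(x - mu) / s))**2)."""
    z = math.exp(-(x - mu) / s)
    return z / (s * (1.0 + z) ** 2)


def logistic_cdf(x, mu, s):
    """Cumulative distribution function, 1 / (1 + exp(-(x - mu) / s))."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


mu, s = 10.0, 1.0
h = 1e-5
points = [6.0, 8.5, 10.0, 11.5, 14.0]
# Central differences of the CDF should match the density at each point.
max_gap = max(
    abs((logistic_cdf(x + h, mu, s) - logistic_cdf(x - h, mu, s)) / (2 * h)
        - logistic_pdf(x, mu, s))
    for x in points
)
peak_density = logistic_pdf(mu, mu, s)   # 1 / (4 * s) at the location
```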

The Logistic distribution is used in Extreme Value problems where it can act as a mixture of Gumbel distributions, in Epidemiology, and by the World Chess Federation (FIDE) where it is used in the Elo ranking system, assuming the performance of each player is a logistically distributed random variable.

References

1

Reiss, R.-D. and Thomas M. (2001), “Statistical Analysis of Extreme Values, from Insurance, Finance, Hydrology and Other Fields,” Birkhauser Verlag, Basel, pp 132-133.

2

Weisstein, Eric W. “Logistic Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LogisticDistribution.html

3

Wikipedia, “Logistic distribution”, https://en.wikipedia.org/wiki/Logistic_distribution

Examples

Draw samples from the distribution:

>>> loc, scale = 10, 1  
>>> s = np.random.logistic(loc, scale, 10000)  
>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, bins=50)  

# plot against distribution

>>> def logist(x, loc, scale):  
...     return np.exp((loc-x)/scale)/(scale*(1+np.exp((loc-x)/scale))**2)
>>> lgst_val = logist(bins, loc, scale)  
>>> plt.plot(bins, lgst_val * count.max() / lgst_val.max())  
>>> plt.show()  
dask.array.random.lognormal(mean=0.0, sigma=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a log-normal distribution.

This docstring was copied from numpy.random.mtrand.RandomState.lognormal.

Some inconsistencies with the Dask version may exist.

Draw samples from a log-normal distribution with specified mean, standard deviation, and array shape. Note that the mean and standard deviation are not the values for the distribution itself, but of the underlying normal distribution it is derived from.

Note

New code should use the lognormal method of a default_rng() instance instead; see random-quick-start.

Parameters
meanfloat or array_like of floats, optional

Mean value of the underlying normal distribution. Default is 0.

sigmafloat or array_like of floats, optional

Standard deviation of the underlying normal distribution. Must be non-negative. Default is 1.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and sigma are both scalars. Otherwise, np.broadcast(mean, sigma).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized log-normal distribution.

See also

scipy.stats.lognorm

probability density function, distribution, cumulative density function, etc.

Generator.lognormal

which should be used for new code.

Notes

A variable x has a log-normal distribution if log(x) is normally distributed. The probability density function for the log-normal distribution is:

\[p(x) = \frac{1}{\sigma x \sqrt{2\pi}} \exp\left(-\frac{(\ln(x)-\mu)^2}{2\sigma^2}\right)\]

where \(\mu\) is the mean and \(\sigma\) is the standard deviation of the normally distributed logarithm of the variable. A log-normal distribution results if a random variable is the product of a large number of independent, identically-distributed variables in the same way that a normal distribution results if the variable is the sum of a large number of independent, identically-distributed variables.
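Because the parameters describe the underlying normal distribution, the mean of the samples themselves is \(e^{\mu + \sigma^2/2}\) rather than \(\mu\). The sketch below illustrates this with random.lognormvariate from the Python standard library, used here as a stand-in for the Dask and NumPy samplers; the seed and sample size are arbitrary.

```python
import math
import random

rng = random.Random(42)
mu, sigma = 3.0, 1.0   # parameters of the underlying normal distribution
n = 200_000

samples = [rng.lognormvariate(mu, sigma) for _ in range(n)]
logs = [math.log(x) for x in samples]

mean_of_logs = sum(logs) / n                                              # close to mu
std_of_logs = math.sqrt(sum((v - mean_of_logs) ** 2 for v in logs) / n)   # close to sigma
sample_mean = sum(samples) / n             # close to exp(mu + sigma**2 / 2), not mu
expected_mean = math.exp(mu + sigma ** 2 / 2.0)
```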

References

1

Limpert, E., Stahel, W. A., and Abbt, M., “Log-normal Distributions across the Sciences: Keys and Clues,” BioScience, Vol. 51, No. 5, May, 2001. https://stat.ethz.ch/~stahel/lognormal/bioscience.pdf

2

Reiss, R.D. and Thomas, M., “Statistical Analysis of Extreme Values,” Basel: Birkhauser Verlag, 2001, pp. 31-32.

Examples

Draw samples from the distribution:

>>> mu, sigma = 3., 1. # mean and standard deviation  
>>> s = np.random.lognormal(mu, sigma, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 100, density=True, align='mid')  
>>> x = np.linspace(min(bins), max(bins), 10000)  
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))  
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, linewidth=2, color='r')  
>>> plt.axis('tight')  
>>> plt.show()  

Demonstrate that the products of random samples from a normal distribution (shifted here to have mean 10, so that the values are positive) can be fit well by a log-normal probability density function.

>>> # Generate a thousand samples: each is the product of 100 random
>>> # values, drawn from a normal distribution.
>>> b = []  
>>> for i in range(1000):  
...    a = 10. + np.random.standard_normal(100)
...    b.append(np.prod(a))
>>> b = np.array(b) / np.min(b) # rescale so that the smallest value is 1  
>>> count, bins, ignored = plt.hist(b, 100, density=True, align='mid')  
>>> sigma = np.std(np.log(b))  
>>> mu = np.mean(np.log(b))  
>>> x = np.linspace(min(bins), max(bins), 10000)  
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))  
...        / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, color='r', linewidth=2)  
>>> plt.show()  
dask.array.random.logseries(p, size=None, chunks='auto', **kwargs)

Draw samples from a logarithmic series distribution.

This docstring was copied from numpy.random.mtrand.RandomState.logseries.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a log series distribution with specified shape parameter, 0 < p < 1.

Note

New code should use the logseries method of a default_rng() instance instead; see random-quick-start.

Parameters
pfloat or array_like of floats

Shape parameter for the distribution. Must be in the range (0, 1).

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if p is a scalar. Otherwise, np.array(p).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized logarithmic series distribution.

See also

scipy.stats.logser

probability density function, distribution or cumulative density function, etc.

Generator.logseries

which should be used for new code.

Notes

The probability density for the Log Series distribution is

\[P(k) = \frac{-p^k}{k \ln(1-p)},\]

where p = probability.

The log series distribution is frequently used to represent species richness and occurrence, first proposed by Fisher, Corbet, and Williams in 1943 [2]. It may also be used to model the numbers of occupants seen in cars [3].
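The formula above can be checked numerically: the probabilities sum to 1, and the mean matches the closed form \(-p / ((1-p)\ln(1-p))\). This is a plain-Python sketch; the function name logseries_pmf and the truncation point are illustrative assumptions.

```python
import math


def logseries_pmf(k, p):
    """P(k) = -p**k / (k * ln(1 - p)) for k = 1, 2, ..."""
    return -(p ** k) / (k * math.log(1.0 - p))


p = 0.6
ks = range(1, 400)   # p**400 is far below double precision
total = sum(logseries_pmf(k, p) for k in ks)
mean = sum(k * logseries_pmf(k, p) for k in ks)
closed_form_mean = -p / ((1.0 - p) * math.log(1.0 - p))
```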

References

1

Buzas, Martin A.; Culver, Stephen J., Understanding regional species diversity through the log series distribution of occurrences: BIODIVERSITY RESEARCH Diversity & Distributions, Volume 5, Number 5, September 1999, pp. 187-195(9).

2

Fisher, R.A., A.S. Corbet, and C.B. Williams. 1943. The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12:42-58.

3

D. J. Hand, F. Daly, D. Lunn, E. Ostrowski, A Handbook of Small Data Sets, CRC Press, 1994.

4

Wikipedia, “Logarithmic distribution”, https://en.wikipedia.org/wiki/Logarithmic_distribution

Examples

Draw samples from the distribution:

>>> a = .6  
>>> s = np.random.logseries(a, 10000)  
>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s)  

# plot against distribution

>>> def logseries(k, p):  
...     return -p**k/(k*np.log(1-p))
>>> plt.plot(bins, logseries(bins, a)*count.max()/  
...          logseries(bins, a).max(), 'r')
>>> plt.show()  
dask.array.random.negative_binomial(n, p, size=None, chunks='auto', **kwargs)

Draw samples from a negative binomial distribution.

This docstring was copied from numpy.random.mtrand.RandomState.negative_binomial.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a negative binomial distribution with specified parameters, n successes and p probability of success where n is > 0 and p is in the interval [0, 1].

Note

New code should use the negative_binomial method of a default_rng() instance instead; see random-quick-start.

Parameters
nfloat or array_like of floats

Parameter of the distribution, > 0.

pfloat or array_like of floats

Parameter of the distribution, >= 0 and <=1.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if n and p are both scalars. Otherwise, np.broadcast(n, p).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized negative binomial distribution, where each sample is equal to N, the number of failures that occurred before a total of n successes was reached.

See also

Generator.negative_binomial

which should be used for new code.

Notes

The probability mass function of the negative binomial distribution is

\[P(N;n,p) = \frac{\Gamma(N+n)}{N!\Gamma(n)}p^{n}(1-p)^{N},\]

where \(n\) is the number of successes, \(p\) is the probability of success, \(N+n\) is the number of trials, and \(\Gamma\) is the gamma function. When \(n\) is an integer, \(\frac{\Gamma(N+n)}{N!\Gamma(n)} = \binom{N+n-1}{N}\), which is the more common form of this term in the pmf. The negative binomial distribution gives the probability of N failures given n successes, with a success on the last trial.

If one throws a die repeatedly until the third time a “1” appears, then the probability distribution of the number of non-“1”s that appear before the third “1” is a negative binomial distribution.

References

1

Weisstein, Eric W. “Negative Binomial Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NegativeBinomialDistribution.html

2

Wikipedia, “Negative binomial distribution”, https://en.wikipedia.org/wiki/Negative_binomial_distribution

Examples

Draw samples from the distribution:

A real-world example. A company drills wild-cat oil exploration wells, each with an estimated probability of success of 0.1. What is the probability of having one success for each successive well, that is, what is the probability of a single success after drilling 5 wells, after 6 wells, etc.?

>>> s = np.random.negative_binomial(1, 0.1, 100000)  
>>> for i in range(1, 11): 
...    probability = sum(s<i) / 100000.
...    print(i, "wells drilled, probability of one success =", probability)
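The simulated probabilities can be compared with exact values from the probability mass function in the Notes. For a single required success, the probability of succeeding within the first i wells reduces to 1 - 0.9**i. The sketch below uses only the Python standard library; the function name negative_binomial_pmf is an illustrative assumption.

```python
import math


def negative_binomial_pmf(failures, n, p):
    """P(N; n, p) from the Notes, using log-gamma so that n may be non-integer."""
    log_coeff = (math.lgamma(failures + n)
                 - math.lgamma(failures + 1)
                 - math.lgamma(n))
    return math.exp(log_coeff) * p ** n * (1.0 - p) ** failures


n_successes, p_success = 1, 0.1
# P(N < i): the first success arrives within the first i wells.
exact = {
    i: sum(negative_binomial_pmf(k, n_successes, p_success) for k in range(i))
    for i in range(1, 11)
}
```

For example, exact[5] is about 0.41, the chance of at least one success in the first five wells.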
dask.array.random.noncentral_chisquare(df, nonc, size=None, chunks='auto', **kwargs)

Draw samples from a noncentral chi-square distribution.

This docstring was copied from numpy.random.mtrand.RandomState.noncentral_chisquare.

Some inconsistencies with the Dask version may exist.

The noncentral \(\chi^2\) distribution is a generalization of the \(\chi^2\) distribution.

Note

New code should use the noncentral_chisquare method of a default_rng() instance instead; see random-quick-start.

Parameters
dffloat or array_like of floats

Degrees of freedom, must be > 0.

Changed in version 1.10.0: Earlier NumPy versions required dfnum > 1.

noncfloat or array_like of floats

Non-centrality, must be non-negative.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df and nonc are both scalars. Otherwise, np.broadcast(df, nonc).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized noncentral chi-square distribution.

See also

Generator.noncentral_chisquare

which should be used for new code.

Notes

The probability density function for the noncentral Chi-square distribution is

\[P(x;df,nonc) = \sum^{\infty}_{i=0} \frac{e^{-nonc/2}(nonc/2)^{i}}{i!} P_{Y_{df+2i}}(x),\]

where \(Y_{q}\) is the Chi-square with q degrees of freedom.
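The sum above is a Poisson-weighted mixture of central chi-square densities, which suggests a two-step sampling scheme: draw i from a Poisson distribution with mean nonc/2, then draw a central chi-square with df + 2i degrees of freedom. The sketch below follows that scheme with the Python standard library. It is not the Dask or NumPy implementation; the helper names and the seed are illustrative assumptions.

```python
import math
import random


def draw_poisson(lam, rng):
    """Poisson variate via Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def draw_noncentral_chisquare(df, nonc, rng):
    """Draw i ~ Poisson(nonc / 2), then a central chi-square with df + 2*i degrees."""
    i = draw_poisson(nonc / 2.0, rng)
    # A chi-square variate with q degrees of freedom is Gamma(shape=q/2, scale=2).
    return rng.gammavariate((df + 2 * i) / 2.0, 2.0)


rng = random.Random(99)
df, nonc = 3.0, 20.0
samples = [draw_noncentral_chisquare(df, nonc, rng) for _ in range(50_000)]
sample_mean = sum(samples) / len(samples)   # theoretical mean is df + nonc
```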

References

1

Wikipedia, “Noncentral chi-squared distribution” https://en.wikipedia.org/wiki/Noncentral_chi-squared_distribution

Examples

Draw values from the distribution and plot the histogram

>>> import matplotlib.pyplot as plt  
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),  
...                   bins=200, density=True)
>>> plt.show()  

Draw values from a noncentral chisquare with very small noncentrality, and compare to a chisquare.

>>> plt.figure()  
>>> values = plt.hist(np.random.noncentral_chisquare(3, .0000001, 100000),  
...                   bins=np.arange(0., 25, .1), density=True)
>>> values2 = plt.hist(np.random.chisquare(3, 100000),  
...                    bins=np.arange(0., 25, .1), density=True)
>>> plt.plot(values[1][0:-1], values[0]-values2[0], 'ob')  
>>> plt.show()  

Demonstrate how large values of non-centrality lead to a more symmetric distribution.

>>> plt.figure()  
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),  
...                   bins=200, density=True)
>>> plt.show()  
dask.array.random.noncentral_f(dfnum, dfden, nonc, size=None, chunks='auto', **kwargs)

Draw samples from the noncentral F distribution.

This docstring was copied from numpy.random.mtrand.RandomState.noncentral_f.

Some inconsistencies with the Dask version may exist.

Samples are drawn from an F distribution with specified parameters, dfnum (degrees of freedom in numerator) and dfden (degrees of freedom in denominator), where both parameters must be > 0. nonc is the non-centrality parameter.

Note

New code should use the noncentral_f method of a default_rng() instance instead; see random-quick-start.

Parameters
dfnum : float or array_like of floats

Numerator degrees of freedom, must be > 0.

Changed in version 1.14.0: Earlier NumPy versions required dfnum > 1.

dfden : float or array_like of floats

Denominator degrees of freedom, must be > 0.

nonc : float or array_like of floats

Non-centrality parameter, the sum of the squares of the numerator means, must be >= 0.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if dfnum, dfden, and nonc are all scalars. Otherwise, np.broadcast(dfnum, dfden, nonc).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized noncentral Fisher distribution.

See also

Generator.noncentral_f

which should be used for new code.

Notes

When calculating the power of an experiment (power = probability of rejecting the null hypothesis when a specific alternative is true), the non-central F statistic becomes important. When the null hypothesis is true, the F statistic follows a central F distribution. When the null hypothesis is not true, it follows a non-central F distribution.

References

1

Weisstein, Eric W. “Noncentral F-Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/NoncentralF-Distribution.html

2

Wikipedia, “Noncentral F-distribution”, https://en.wikipedia.org/wiki/Noncentral_F-distribution

Examples

In a study, testing for a specific alternative to the null hypothesis requires use of the Noncentral F distribution. We need to calculate the area in the tail of the distribution that exceeds the value of the F distribution for the null hypothesis. We’ll plot the two probability distributions for comparison.

>>> dfnum = 3 # between group deg of freedom  
>>> dfden = 20 # within groups degrees of freedom  
>>> nonc = 3.0  
>>> nc_vals = np.random.noncentral_f(dfnum, dfden, nonc, 1000000)  
>>> NF = np.histogram(nc_vals, bins=50, density=True)  
>>> c_vals = np.random.f(dfnum, dfden, 1000000)  
>>> F = np.histogram(c_vals, bins=50, density=True)  
>>> import matplotlib.pyplot as plt  
>>> plt.plot(F[1][1:], F[0])  
>>> plt.plot(NF[1][1:], NF[0])  
>>> plt.show()  
dask.array.random.normal(loc=0.0, scale=1.0, size=None, chunks='auto', **kwargs)

Draw random samples from a normal (Gaussian) distribution.

This docstring was copied from numpy.random.mtrand.RandomState.normal.

Some inconsistencies with the Dask version may exist.

The probability density function of the normal distribution, first derived by De Moivre and 200 years later by both Gauss and Laplace independently [2], is often called the bell curve because of its characteristic shape (see the example below).

The normal distribution occurs often in nature. For example, it describes the commonly occurring distribution of samples influenced by a large number of tiny, random disturbances, each with its own unique distribution [2].

Note

New code should use the normal method of a default_rng() instance instead; see random-quick-start.

Parameters
loc : float or array_like of floats

Mean (“centre”) of the distribution.

scale : float or array_like of floats

Standard deviation (spread or “width”) of the distribution. Must be non-negative.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if loc and scale are both scalars. Otherwise, np.broadcast(loc, scale).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized normal distribution.

See also

scipy.stats.norm

probability density function, distribution or cumulative distribution function, etc.

Generator.normal

which should be used for new code.

Notes

The probability density for the Gaussian distribution is

\[p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }} e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },\]

where \(\mu\) is the mean and \(\sigma\) the standard deviation. The square of the standard deviation, \(\sigma^2\), is called the variance.

The function has its peak at the mean, and its “spread” increases with the standard deviation (the function reaches 0.607 times its maximum at \(\mu + \sigma\) and \(\mu - \sigma\) [2]). This implies that normal is more likely to return samples lying close to the mean, rather than those far away.
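The 0.607 factor quoted above is exp(-1/2): the density one standard deviation away from the mean, relative to its peak. A minimal sketch that checks this directly from the formula, using only the standard library (the helper `normal_pdf` is hypothetical, introduced here for illustration):

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density p(x) with mean mu and standard deviation sigma."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 0.0, 0.1
peak = normal_pdf(mu, mu, sigma)                      # maximum, attained at the mean
ratio_right = normal_pdf(mu + sigma, mu, sigma) / peak
ratio_left = normal_pdf(mu - sigma, mu, sigma) / peak
print(ratio_left, ratio_right)                        # both equal exp(-1/2)
```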

References

1

Wikipedia, “Normal distribution”, https://en.wikipedia.org/wiki/Normal_distribution

2

P. R. Peebles Jr., “Central Limit Theorem” in “Probability, Random Variables and Random Signal Principles”, 4th ed., 2001, pp. 51, 51, 125.

Examples

Draw samples from the distribution:

>>> mu, sigma = 0, 0.1 # mean and standard deviation  
>>> s = np.random.normal(mu, sigma, 1000)  

Verify the mean and the standard deviation:

>>> abs(mu - np.mean(s))  
0.0  # may vary
>>> abs(sigma - np.std(s, ddof=1))  
0.0  # may vary

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 30, density=True)  
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *  
...                np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
...          linewidth=2, color='r')
>>> plt.show()  

Two-by-four array of samples from N(3, 6.25):

>>> np.random.normal(3, 2.5, size=(2, 4))  
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
dask.array.random.pareto(a, size=None, chunks='auto', **kwargs)

Draw samples from a Pareto II or Lomax distribution with specified shape.

This docstring was copied from numpy.random.mtrand.RandomState.pareto.

Some inconsistencies with the Dask version may exist.

The Lomax or Pareto II distribution is a shifted Pareto distribution. The classical Pareto distribution can be obtained from the Lomax distribution by adding 1 and multiplying by the scale parameter m (see Notes). The smallest value of the Lomax distribution is zero while for the classical Pareto distribution it is mu, where the standard Pareto distribution has location mu = 1. Lomax can also be considered as a simplified version of the Generalized Pareto distribution (available in SciPy), with the scale set to one and the location set to zero.

The Pareto distribution must be greater than zero, and is unbounded above. It is also known as the “80-20 rule”. In this distribution, 80 percent of the weights are in the lowest 20 percent of the range, while the other 20 percent fill the remaining 80 percent of the range.

Note

New code should use the pareto method of a default_rng() instance instead; see random-quick-start.

Parameters
a : float or array_like of floats

Shape of the distribution. Must be positive.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized Pareto distribution.

See also

scipy.stats.lomax

probability density function, distribution or cumulative distribution function, etc.

scipy.stats.genpareto

probability density function, distribution or cumulative distribution function, etc.

Generator.pareto

which should be used for new code.

Notes

The probability density for the Pareto distribution is

\[p(x) = \frac{am^a}{x^{a+1}}\]

where \(a\) is the shape and \(m\) the scale.
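The relationship between the Lomax (Pareto II) samples and the classical Pareto distribution can be sketched with the standard library alone. This is an illustration under stated assumptions, not the NumPy or Dask implementation: it relies on `random.paretovariate`, which draws from a classical Pareto with minimum 1, so subtracting 1 yields Lomax-style samples, and adding 1 and multiplying by the scale m maps them back.

```python
import random

rng = random.Random(0)           # fixed seed so the sketch is repeatable
a, m = 3.0, 2.0                  # shape and scale

# Classical Pareto with minimum 1, shifted down to a Lomax sample (minimum 0).
lomax = [rng.paretovariate(a) - 1.0 for _ in range(10_000)]

# Add 1 and multiply by m to obtain a classical Pareto with scale m.
classical = [(x + 1.0) * m for x in lomax]

print(min(lomax), min(classical))
```

The smallest Lomax value stays at or above zero, while the smallest classical value stays at or above m.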

The Pareto distribution, named after the Italian economist Vilfredo Pareto, is a power law probability distribution useful in many real world problems. Outside the field of economics it is generally referred to as the Bradford distribution. Pareto developed the distribution to describe the distribution of wealth in an economy. It has also found use in insurance, web page access statistics, oil field sizes, and many other problems, including the download frequency for projects in Sourceforge [1]. It is one of the so-called “fat-tailed” distributions.

References

1

Francis Hunt and Paul Johnson, On the Pareto Distribution of Sourceforge projects.

2

Pareto, V. (1896). Course of Political Economy. Lausanne.

3

Reiss, R.D., Thomas, M.(2001), Statistical Analysis of Extreme Values, Birkhauser Verlag, Basel, pp 23-30.

4

Wikipedia, “Pareto distribution”, https://en.wikipedia.org/wiki/Pareto_distribution

Examples

Draw samples from the distribution:

>>> a, m = 3., 2.  # shape and mode  
>>> s = (np.random.pareto(a, 1000) + 1) * m  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, _ = plt.hist(s, 100, density=True)  
>>> fit = a*m**a / bins**(a+1)  
>>> plt.plot(bins, max(count)*fit/max(fit), linewidth=2, color='r')  
>>> plt.show()  
dask.array.random.poisson(lam=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a Poisson distribution.

This docstring was copied from numpy.random.mtrand.RandomState.poisson.

Some inconsistencies with the Dask version may exist.

The Poisson distribution is the limit of the binomial distribution for large N.

Note

New code should use the poisson method of a default_rng() instance instead; see random-quick-start.

Parameters
lam : float or array_like of floats

Expectation of interval, must be >= 0. A sequence of expectation intervals must be broadcastable over the requested size.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if lam is a scalar. Otherwise, np.array(lam).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized Poisson distribution.

See also

Generator.poisson

which should be used for new code.

Notes

The Poisson distribution

\[f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!}\]

For events with an expected separation \(\lambda\) the Poisson distribution \(f(k; \lambda)\) describes the probability of \(k\) events occurring within the observed interval \(\lambda\).

Because the output is limited to the range of the C int64 type, a ValueError is raised when lam is within 10 sigma of the maximum representable value.
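As a quick check of the probability mass function given in these notes, the following sketch evaluates it with the standard library and confirms that the probabilities sum to one and that the mean equals \(\lambda\). The helper `poisson_pmf` is hypothetical, and the cutoff of 100 terms is an assumption that is adequate for the small \(\lambda\) used here.

```python
import math

def poisson_pmf(k, lam):
    """f(k; lam) = lam^k e^{-lam} / k!, evaluated in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

lam = 5.0
ks = range(0, 100)                                   # truncated support
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
print(total, mean)
```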

References

1

Weisstein, Eric W. “Poisson Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html

2

Wikipedia, “Poisson distribution”, https://en.wikipedia.org/wiki/Poisson_distribution

Examples

Draw samples from the distribution:

>>> import numpy as np  
>>> s = np.random.poisson(5, 10000)  

Display histogram of the sample:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 14, density=True)  
>>> plt.show()  

Draw 100 values each for lambda 100 and 500:

>>> s = np.random.poisson(lam=(100., 500.), size=(100, 2))  
dask.array.random.power(a, size=None, chunks='auto', **kwargs)

Draws samples in [0, 1] from a power distribution with positive exponent a - 1.

This docstring was copied from numpy.random.mtrand.RandomState.power.

Some inconsistencies with the Dask version may exist.

Also known as the power function distribution.

Note

New code should use the power method of a default_rng() instance instead; see random-quick-start.

Parameters
a : float or array_like of floats

Parameter of the distribution. Must be positive.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized power distribution.

Raises
ValueError

If a <= 0.

See also

Generator.power

which should be used for new code.

Notes

The probability density function is

\[P(x; a) = ax^{a-1}, 0 \le x \le 1, a>0.\]

The power function distribution is just the inverse of the Pareto distribution. It may also be seen as a special case of the Beta distribution.

It is used, for example, in modeling the over-reporting of insurance claims.
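Integrating the density above gives the cumulative distribution function \(x^a\) on [0, 1], so a uniform variate u can be mapped to a power-distributed sample as \(u^{1/a}\). A minimal sketch of that inverse-transform idea, using only the standard library; it illustrates the distribution and is not a description of how NumPy or Dask generate samples, and `sample_power` is a hypothetical helper:

```python
import random

def sample_power(a, n, seed=0):
    """Inverse-transform sampling: if U ~ Uniform[0, 1), then U**(1/a) has CDF x**a."""
    rng = random.Random(seed)
    return [rng.random() ** (1.0 / a) for _ in range(n)]

a = 5.0
samples = sample_power(a, 100_000)
sample_mean = sum(samples) / len(samples)
# The mean of the density a * x**(a - 1) on [0, 1] is a / (a + 1).
print(sample_mean, a / (a + 1.0))
```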

References

1

Christian Kleiber, Samuel Kotz, “Statistical size distributions in economics and actuarial sciences”, Wiley, 2003.

2

Heckert, N. A. and Filliben, James J. “NIST Handbook 148: Dataplot Reference Manual, Volume 2: Let Subcommands and Library Functions”, National Institute of Standards and Technology Handbook Series, June 2003. https://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/powpdf.pdf

Examples

Draw samples from the distribution:

>>> a = 5. # shape  
>>> samples = 1000  
>>> s = np.random.power(a, samples)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, bins=30)  
>>> x = np.linspace(0, 1, 100)  
>>> y = a*x**(a-1.)  
>>> normed_y = samples*np.diff(bins)[0]*y  
>>> plt.plot(x, normed_y)  
>>> plt.show()  

Compare the power function distribution to the inverse of the Pareto.

>>> from scipy import stats 
>>> rvs = np.random.power(5, 1000000)  
>>> rvsp = np.random.pareto(5, 1000000)  
>>> xx = np.linspace(0,1,100)  
>>> powpdf = stats.powerlaw.pdf(xx,5)  
>>> plt.figure()  
>>> plt.hist(rvs, bins=50, density=True)  
>>> plt.plot(xx,powpdf,'r-')  
>>> plt.title('np.random.power(5)')  
>>> plt.figure()  
>>> plt.hist(1./(1.+rvsp), bins=50, density=True)  
>>> plt.plot(xx,powpdf,'r-')  
>>> plt.title('inverse of 1 + np.random.pareto(5)')  
>>> plt.figure()  
>>> plt.hist(1./stats.pareto.rvs(5, size=1000000), bins=50, density=True)  
>>> plt.plot(xx,powpdf,'r-')  
>>> plt.title('inverse of stats.pareto(5)')  
dask.array.random.randint(low, high=None, size=None, chunks='auto', dtype='l', **kwargs)

Return random integers from low (inclusive) to high (exclusive).

This docstring was copied from numpy.random.mtrand.RandomState.randint.

Some inconsistencies with the Dask version may exist.

Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).

Note

New code should use the integers method of a default_rng() instance instead; see random-quick-start.

Parameters
low : int or array-like of ints

Lowest (signed) integers to be drawn from the distribution (unless high=None, in which case this parameter is one above the highest such integer).

high : int or array-like of ints, optional

If provided, one above the largest (signed) integer to be drawn from the distribution (see above for behavior if high=None). If array-like, must contain integer values.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

dtype : dtype, optional

Desired dtype of the result. All dtypes are determined by their name, i.e., ‘int64’, ‘int’, etc, so byteorder is not available and a specific precision may have different C types depending on the platform. The default value is np.int_.

New in version 1.11.0.

Returns
out : int or ndarray of ints

size-shaped array of random integers from the appropriate distribution, or a single such random int if size not provided.

See also

random_integers

similar to randint, only for the closed interval [low, high], and 1 is the lowest value if high is omitted.

Generator.integers

which should be used for new code.

Examples

>>> np.random.randint(2, size=10)  
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0]) # random
>>> np.random.randint(1, size=10)  
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

Generate a 2 x 4 array of ints between 0 and 4, inclusive:

>>> np.random.randint(5, size=(2, 4))  
array([[4, 0, 2, 1], # random
       [3, 2, 2, 0]])

Generate a 1 x 3 array with 3 different upper bounds

>>> np.random.randint(1, [3, 5, 10])  
array([2, 2, 9]) # random

Generate a 1 by 3 array with 3 different lower bounds

>>> np.random.randint([1, 5, 7], 10)  
array([9, 8, 7]) # random

Generate a 2 by 4 array using broadcasting with dtype of uint8

>>> np.random.randint([1, 3, 5, 7], [[10], [20]], dtype=np.uint8)  
array([[ 8,  6,  9,  7], # random
       [ 1, 16,  9, 12]], dtype=uint8)
dask.array.random.random(size=None, chunks='auto', **kwargs)

Return random floats in the half-open interval [0.0, 1.0).

This docstring was copied from numpy.random.mtrand.RandomState.random_sample.

Some inconsistencies with the Dask version may exist.

Results are from the “continuous uniform” distribution over the stated interval. To sample \(Unif[a, b), b > a\) multiply the output of random_sample by (b-a) and add a:

(b - a) * random_sample() + a
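The rescaling formula above can be sketched in plain Python. This is an illustration using the standard library's `random.random()`, which also returns floats in [0.0, 1.0); the helper name `uniform_half_open` is hypothetical.

```python
import random

def uniform_half_open(a, b, n, seed=0):
    """Draw n floats from [a, b) by rescaling draws from [0.0, 1.0)."""
    if not b > a:
        raise ValueError("b must be greater than a")
    rng = random.Random(seed)
    return [(b - a) * rng.random() + a for _ in range(n)]

# Six values from [-5, 0), matching the interval used in the example below.
draws = uniform_half_open(-5.0, 0.0, 6)
print(draws)
```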

Note

New code should use the random method of a default_rng() instance instead; see random-quick-start.

Parameters
size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Returns
out : float or ndarray of floats

Array of random floats of shape size (unless size=None, in which case a single float is returned).

See also

Generator.random

which should be used for new code.

Examples

>>> np.random.random_sample()  
0.47108547995356098 # random
>>> type(np.random.random_sample())  
<class 'float'>
>>> np.random.random_sample((5,))  
array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428]) # random

Three-by-two array of random numbers from [-5, 0):

>>> 5 * np.random.random_sample((3, 2)) - 5  
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
dask.array.random.random_sample(size=None, chunks='auto', **kwargs)

Return random floats in the half-open interval [0.0, 1.0).

This docstring was copied from numpy.random.mtrand.RandomState.random_sample.

Some inconsistencies with the Dask version may exist.

Results are from the “continuous uniform” distribution over the stated interval. To sample \(Unif[a, b), b > a\) multiply the output of random_sample by (b-a) and add a:

(b - a) * random_sample() + a

Note

New code should use the random method of a default_rng() instance instead; see random-quick-start.

Parameters
size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Returns
out : float or ndarray of floats

Array of random floats of shape size (unless size=None, in which case a single float is returned).

See also

Generator.random

which should be used for new code.

Examples

>>> np.random.random_sample()  
0.47108547995356098 # random
>>> type(np.random.random_sample())  
<class 'float'>
>>> np.random.random_sample((5,))  
array([ 0.30220482,  0.86820401,  0.1654503 ,  0.11659149,  0.54323428]) # random

Three-by-two array of random numbers from [-5, 0):

>>> 5 * np.random.random_sample((3, 2)) - 5  
array([[-3.99149989, -0.52338984], # random
       [-2.99091858, -0.79479508],
       [-1.23204345, -1.75224494]])
dask.array.random.rayleigh(scale=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a Rayleigh distribution.

This docstring was copied from numpy.random.mtrand.RandomState.rayleigh.

Some inconsistencies with the Dask version may exist.

The \(\chi\) and Weibull distributions are generalizations of the Rayleigh.

Note

New code should use the rayleigh method of a default_rng() instance instead; see random-quick-start.

Parameters
scale : float or array_like of floats, optional

Scale, also equals the mode. Must be non-negative. Default is 1.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized Rayleigh distribution.

See also

Generator.rayleigh

which should be used for new code.

Notes

The probability density function for the Rayleigh distribution is

\[P(x;scale) = \frac{x}{scale^2}e^{\frac{-x^2}{2 \cdotp scale^2}}\]

The Rayleigh distribution would arise, for example, if the East and North components of the wind velocity had identical zero-mean Gaussian distributions. Then the wind speed would have a Rayleigh distribution.
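The wind-speed description above can be simulated directly. The sketch below uses only the standard library and assumes independent components with a common standard deviation `scale`. It compares the sample mean of the resulting speeds with the Rayleigh mean, scale * sqrt(pi / 2).

```python
import math
import random

rng = random.Random(0)
scale = 3.0                      # common standard deviation of both components
n = 100_000

# Speed is the Euclidean norm of two independent zero-mean Gaussian components.
speeds = [math.hypot(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
          for _ in range(n)]

sample_mean = sum(speeds) / n
rayleigh_mean = scale * math.sqrt(math.pi / 2.0)   # mean of a Rayleigh(scale)
print(sample_mean, rayleigh_mean)
```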

References

1

Brighton Webs Ltd., “Rayleigh Distribution,” https://web.archive.org/web/20090514091424/http://brighton-webs.co.uk:80/distributions/rayleigh.asp

2

Wikipedia, “Rayleigh distribution” https://en.wikipedia.org/wiki/Rayleigh_distribution

Examples

Draw values from the distribution and plot the histogram

>>> from matplotlib.pyplot import hist  
>>> values = hist(np.random.rayleigh(3, 100000), bins=200, density=True)  

Wave heights tend to follow a Rayleigh distribution. If the mean wave height is 1 meter, what fraction of waves are likely to be larger than 3 meters?

>>> meanvalue = 1  
>>> modevalue = np.sqrt(2 / np.pi) * meanvalue  
>>> s = np.random.rayleigh(modevalue, 1000000)  

The percentage of waves larger than 3 meters is:

>>> 100.*sum(s>3)/1000000.  
0.087300000000000003 # random
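The simulated percentage above can be cross-checked analytically. Integrating the density in the Notes gives the tail probability P(X > x) = exp(-x^2 / (2 scale^2)). The sketch below evaluates it with the standard library for the same mean wave height; the variable names are chosen here for illustration.

```python
import math

mean_height = 1.0
# Rayleigh scale (mode) that gives the requested mean height.
scale = math.sqrt(2.0 / math.pi) * mean_height

# Tail probability of the Rayleigh distribution beyond the threshold.
threshold = 3.0
tail_percent = 100.0 * math.exp(-threshold ** 2 / (2.0 * scale ** 2))
print(tail_percent)
```

The result is a little under 0.09 percent, in line with the simulated value.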
dask.array.random.standard_cauchy(size=None, chunks='auto', **kwargs)

Draw samples from a standard Cauchy distribution with mode = 0.

This docstring was copied from numpy.random.mtrand.RandomState.standard_cauchy.

Some inconsistencies with the Dask version may exist.

Also known as the Lorentz distribution.

Note

New code should use the standard_cauchy method of a default_rng() instance instead; see random-quick-start.

Parameters
size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Returns
samples : ndarray or scalar

The drawn samples.

See also

Generator.standard_cauchy

which should be used for new code.

Notes

The probability density function for the full Cauchy distribution is

\[P(x; x_0, \gamma) = \frac{1}{\pi \gamma \bigl[ 1+ (\frac{x-x_0}{\gamma})^2 \bigr] }\]

and the Standard Cauchy distribution just sets \(x_0=0\) and \(\gamma=1\).

The Cauchy distribution arises in the solution to the driven harmonic oscillator problem, and also describes spectral line broadening. It also describes the distribution of values at which a line tilted at a random angle will cut the x axis.
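The "line tilted at a random angle" description can be turned into a small simulation. The sketch assumes a line through the point (0, 1) whose angle from the vertical is uniform on (-pi/2, pi/2); such a line crosses the x axis at tan(theta). It uses only the standard library and is an illustration, not the sampler used by NumPy or Dask.

```python
import math
import random

rng = random.Random(0)
n = 100_000

# Uniform angle from the vertical; the x-axis intercept is tan(theta).
samples = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

# The quartiles of the standard Cauchy are -1 and +1, so about half of the
# samples should fall inside (-1, 1).
inside = sum(1 for x in samples if -1.0 < x < 1.0) / n
print(inside)
```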

When studying hypothesis tests that assume normality, seeing how the tests perform on data from a Cauchy distribution is a good indicator of their sensitivity to a heavy-tailed distribution, since the Cauchy looks very much like a Gaussian distribution, but with heavier tails.

References

1

NIST/SEMATECH e-Handbook of Statistical Methods, “Cauchy Distribution”, https://www.itl.nist.gov/div898/handbook/eda/section3/eda3663.htm

2

Weisstein, Eric W. “Cauchy Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/CauchyDistribution.html

3

Wikipedia, “Cauchy distribution” https://en.wikipedia.org/wiki/Cauchy_distribution

Examples

Draw samples and plot the distribution:

>>> import matplotlib.pyplot as plt  
>>> s = np.random.standard_cauchy(1000000)  
>>> s = s[(s>-25) & (s<25)]  # truncate distribution so it plots well  
>>> plt.hist(s, bins=100)  
>>> plt.show()  
dask.array.random.standard_exponential(size=None, chunks='auto', **kwargs)

Draw samples from the standard exponential distribution.

This docstring was copied from numpy.random.mtrand.RandomState.standard_exponential.

Some inconsistencies with the Dask version may exist.

standard_exponential is identical to the exponential distribution with a scale parameter of 1.

Note

New code should use the standard_exponential method of a default_rng() instance instead; see random-quick-start.

Parameters
size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Returns
out : float or ndarray

Drawn samples.

See also

Generator.standard_exponential

which should be used for new code.

Examples

Output a 3x8000 array:

>>> n = np.random.standard_exponential((3, 8000))  
dask.array.random.standard_gamma(shape, size=None, chunks='auto', **kwargs)

Draw samples from a standard Gamma distribution.

This docstring was copied from numpy.random.mtrand.RandomState.standard_gamma.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a Gamma distribution with specified parameters, shape (sometimes designated “k”) and scale=1.

Note

New code should use the standard_gamma method of a default_rng() instance instead; see random-quick-start.

Parameters
shape : float or array_like of floats

Parameter, must be non-negative.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if shape is a scalar. Otherwise, np.array(shape).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized standard gamma distribution.

See also

scipy.stats.gamma

probability density function, distribution or cumulative distribution function, etc.

Generator.standard_gamma

which should be used for new code.

Notes

The probability density for the Gamma distribution is

\[p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},\]

where \(k\) is the shape and \(\theta\) the scale, and \(\Gamma\) is the Gamma function.

The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant.

References

1

Weisstein, Eric W. “Gamma Distribution.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GammaDistribution.html

2

Wikipedia, “Gamma distribution”, https://en.wikipedia.org/wiki/Gamma_distribution

Examples

Draw samples from the distribution:

>>> shape, scale = 2., 1. # shape and scale  
>>> s = np.random.standard_gamma(shape, 1000000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> import scipy.special as sps  
>>> count, bins, ignored = plt.hist(s, 50, density=True)  
>>> y = bins**(shape-1) * ((np.exp(-bins/scale))/  
...                       (sps.gamma(shape) * scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')  
>>> plt.show()  
dask.array.random.standard_normal(size=None, chunks='auto', **kwargs)

Draw samples from a standard Normal distribution (mean=0, stdev=1).

This docstring was copied from numpy.random.mtrand.RandomState.standard_normal.

Some inconsistencies with the Dask version may exist.

Note

New code should use the standard_normal method of a default_rng() instance instead; see random-quick-start.

Parameters
size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

Returns
out : float or ndarray

A floating-point array of shape size of drawn samples, or a single sample if size was not specified.

See also

normal

Equivalent function with additional loc and scale arguments for setting the mean and standard deviation.

Generator.standard_normal

which should be used for new code.

Notes

For random samples from \(N(\mu, \sigma^2)\), use one of:

mu + sigma * np.random.standard_normal(size=...)
np.random.normal(mu, sigma, size=...)
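The shift-and-scale recipe above can be sketched with the standard library, using `random.gauss(0.0, 1.0)` as a stand-in for a standard normal draw. The sample size and seed are arbitrary choices for this illustration.

```python
import random
import statistics

rng = random.Random(0)
mu, sigma = 3.0, 2.5

# Draw standard normal values, then shift by mu and scale by sigma.
z = [rng.gauss(0.0, 1.0) for _ in range(50_000)]
x = [mu + sigma * value for value in z]

# The sample mean and standard deviation should be close to mu and sigma.
print(statistics.fmean(x), statistics.stdev(x))
```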

Examples

>>> np.random.standard_normal()  
2.1923875335537315 #random
>>> s = np.random.standard_normal(8000)  
>>> s  
array([ 0.6888893 ,  0.78096262, -0.89086505, ...,  0.49876311,  # random
       -0.38672696, -0.4685006 ])                                # random
>>> s.shape  
(8000,)
>>> s = np.random.standard_normal(size=(3, 4, 2))  
>>> s.shape  
(3, 4, 2)

Two-by-four array of samples from \(N(3, 6.25)\):

>>> 3 + 2.5 * np.random.standard_normal(size=(2, 4))  
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677],   # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]])  # random
dask.array.random.standard_t(df, size=None, chunks='auto', **kwargs)

Draw samples from a standard Student’s t distribution with df degrees of freedom.

This docstring was copied from numpy.random.mtrand.RandomState.standard_t.

Some inconsistencies with the Dask version may exist.

A special case of the hyperbolic distribution. As df gets large, the result resembles that of the standard normal distribution (standard_normal).

Note

New code should use the standard_t method of a default_rng() instance instead; see random-quick-start.

Parameters
df : float or array_like of floats

Degrees of freedom, must be > 0.

size : int or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if df is a scalar. Otherwise, np.array(df).size samples are drawn.

Returns
out : ndarray or scalar

Drawn samples from the parameterized standard Student’s t distribution.

See also

Generator.standard_t

which should be used for new code.

Notes

The probability density function for the t distribution is

\[P(x, df) = \frac{\Gamma(\frac{df+1}{2})}{\sqrt{\pi df} \Gamma(\frac{df}{2})}\Bigl( 1+\frac{x^2}{df} \Bigr)^{-(df+1)/2}\]

The t test is based on an assumption that the data come from a Normal distribution. The t test provides a way to test whether the sample mean (that is the mean calculated from the data) is a good estimate of the true mean.

The derivation of the t-distribution was first published in 1908 by William Gosset while working for the Guinness Brewery in Dublin. Due to proprietary issues, he had to publish under a pseudonym, and so he used the name Student.

References

1

Dalgaard, Peter, “Introductory Statistics With R”, Springer, 2002.

2

Wikipedia, “Student’s t-distribution” https://en.wikipedia.org/wiki/Student%27s_t-distribution

Examples

From Dalgaard page 83 [1], suppose the daily energy intake for 11 women in kilojoules (kJ) is:

>>> intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515,
...                    7515, 8230, 8770])

Does their energy intake deviate systematically from the recommended value of 7725 kJ?

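With N = 11 observations we have N - 1 = 10 degrees of freedom, so we compare the t statistic against a t distribution with df = 10. First draw reference samples and compute the sample mean and standard deviation: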

>>> s = np.random.standard_t(10, size=100000)  
>>> np.mean(intake)  
6753.636363636364
>>> intake.std(ddof=1)  
1142.1232221373727

Calculate the t statistic, setting the ddof parameter to the unbiased value so the divisor in the standard deviation will be degrees of freedom, N-1.

>>> t = (np.mean(intake)-7725)/(intake.std(ddof=1)/np.sqrt(len(intake)))  
>>> import matplotlib.pyplot as plt  
>>> h = plt.hist(s, bins=100, density=True)  

For a one-sided t-test, how far out in the distribution does the t statistic appear?

>>> np.sum(s<t) / float(len(s))  
0.0090699999999999999  #random

So the p-value is about 0.009: if the null hypothesis were true, a t statistic at least this low would be observed only about 0.9% of the time, which is strong evidence against the null hypothesis.
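The t statistic used in this example can be reproduced without NumPy. A minimal sketch with the standard library, where `statistics.stdev` divides by N - 1 and so plays the role of `ddof=1`:

```python
import math
import statistics

intake = [5260.0, 5470, 5640, 6180, 6390, 6515, 6805, 7515, 7515, 8230, 8770]
recommended = 7725.0

sample_mean = statistics.mean(intake)
sample_sd = statistics.stdev(intake)                 # unbiased, divisor N - 1
standard_error = sample_sd / math.sqrt(len(intake))
t_stat = (sample_mean - recommended) / standard_error
print(t_stat)
```

The statistic comes out at roughly -2.82, which is the value compared against the simulated t samples above.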

dask.array.random.triangular(left, mode, right, size=None, chunks='auto', **kwargs)

Draw samples from the triangular distribution over the interval [left, right].

This docstring was copied from numpy.random.mtrand.RandomState.triangular.

Some inconsistencies with the Dask version may exist.

The triangular distribution is a continuous probability distribution with lower limit left, peak at mode, and upper limit right. Unlike the other distributions, these parameters directly define the shape of the pdf.

Note

New code should use the triangular method of a default_rng() instance instead; see random-quick-start.

Parameters
leftfloat or array_like of floats

Lower limit.

modefloat or array_like of floats

The value where the peak of the distribution occurs. The value must fulfill the condition left <= mode <= right.

rightfloat or array_like of floats

Upper limit, must be larger than left.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if left, mode, and right are all scalars. Otherwise, np.broadcast(left, mode, right).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized triangular distribution.

See also

Generator.triangular

which should be used for new code.

Notes

The probability density function for the triangular distribution is

\[\begin{split}P(x;l, m, r) = \begin{cases} \frac{2(x-l)}{(r-l)(m-l)}& \text{for $l \leq x \leq m$},\\ \frac{2(r-x)}{(r-l)(r-m)}& \text{for $m \leq x \leq r$},\\ 0& \text{otherwise}. \end{cases}\end{split}\]

The triangular distribution is often used in ill-defined problems where the underlying distribution is not known, but some knowledge of the limits and mode exists. Often it is used in simulations.
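
As a rough check of the density above, the sketch below evaluates it on a grid and compares a sample mean with the known mean (left + mode + right) / 3. The helper triangular_pdf is a hypothetical name introduced here, and it assumes left < mode < right strictly so that neither denominator is zero.

```python
import numpy as np

def triangular_pdf(x, left, mode, right):
    """Density of the triangular distribution; assumes left < mode < right."""
    x = np.asarray(x, dtype=float)
    rising = 2 * (x - left) / ((right - left) * (mode - left))
    falling = 2 * (right - x) / ((right - left) * (right - mode))
    inside = np.where(x <= mode, rising, falling)
    return np.where((x < left) | (x > right), 0.0, inside)

left, mode, right = -3.0, 0.0, 8.0

# The density should integrate to 1; approximate with a Riemann sum.
grid = np.linspace(left, right, 11001)
dx = grid[1] - grid[0]
total = np.sum(triangular_pdf(grid, left, mode, right)) * dx

# The peak height at the mode is 2 / (right - left).
peak = float(triangular_pdf(mode, left, mode, right))

# The sample mean should be close to (left + mode + right) / 3.
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
sample_mean = rng.triangular(left, mode, right, size=100_000).mean()
```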

References

1

Wikipedia, “Triangular distribution” https://en.wikipedia.org/wiki/Triangular_distribution

Examples

Draw values from the distribution and plot the histogram:

>>> import matplotlib.pyplot as plt  
>>> h = plt.hist(np.random.triangular(-3, 0, 8, 100000), bins=200,  
...              density=True)
>>> plt.show()  
dask.array.random.uniform(low=0.0, high=1.0, size=None, chunks='auto', **kwargs)

Draw samples from a uniform distribution.

This docstring was copied from numpy.random.mtrand.RandomState.uniform.

Some inconsistencies with the Dask version may exist.

Samples are uniformly distributed over the half-open interval [low, high) (includes low, but excludes high). In other words, any value within the given interval is equally likely to be drawn by uniform.

Note

New code should use the uniform method of a default_rng() instance instead; see random-quick-start.

Parameters
lowfloat or array_like of floats, optional

Lower boundary of the output interval. All values generated will be greater than or equal to low. The default value is 0.

highfloat or array_like of floats

Upper boundary of the output interval. All values generated will be less than high. The default value is 1.0.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if low and high are both scalars. Otherwise, np.broadcast(low, high).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized uniform distribution.

See also

randint

Discrete uniform distribution, yielding integers.

random_integers

Discrete uniform distribution over the closed interval [low, high].

random_sample

Floats uniformly distributed over [0, 1).

random

Alias for random_sample.

rand

Convenience function that accepts dimensions as input, e.g., rand(2,2) would generate a 2-by-2 array of floats, uniformly distributed over [0, 1).

Generator.uniform

which should be used for new code.

Notes

The probability density function of the uniform distribution is

\[p(x) = \frac{1}{b - a}\]

anywhere within the interval [a, b), and zero elsewhere.

When high == low, values of low will be returned. If high < low, the results are officially undefined and may eventually raise an error, i.e. do not rely on this function to behave when passed arguments satisfying that inequality condition.
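
The half-open interval and the high == low case described above can be illustrated with the Generator interface. This is a small sketch under assumed values (the seed, bounds, and variable names are illustrative); it shows that a uniform draw on [low, high) is an affine rescaling of a standard uniform draw on [0, 1).

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
low, high = -1.0, 0.0

# Direct draw from the half-open interval [low, high).
s = rng.uniform(low, high, size=1000)

# Equivalent construction: rescale standard uniforms on [0, 1).
u = rng.random(1000)
rescaled = low + (high - low) * u

# Degenerate case: when high == low, every value equals low.
degenerate = rng.uniform(2.0, 2.0, size=5)
```

The case high < low is deliberately not exercised here, since its behavior is documented as undefined.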

Examples

Draw samples from the distribution:

>>> s = np.random.uniform(-1,0,1000)  

All values are within the given interval:

>>> np.all(s >= -1)  
True
>>> np.all(s < 0)  
True

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> count, bins, ignored = plt.hist(s, 15, density=True)  
>>> plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')  
>>> plt.show()  
dask.array.random.vonmises(mu, kappa, size=None, chunks='auto', **kwargs)

Draw samples from a von Mises distribution.

This docstring was copied from numpy.random.mtrand.RandomState.vonmises.

Some inconsistencies with the Dask version may exist.

Samples are drawn from a von Mises distribution with specified mode (mu) and dispersion (kappa), on the interval [-pi, pi].

The von Mises distribution (also known as the circular normal distribution) is a continuous probability distribution on the unit circle. It may be thought of as the circular analogue of the normal distribution.

Note

New code should use the vonmises method of a default_rng() instance instead; see random-quick-start.

Parameters
mufloat or array_like of floats

Mode (“center”) of the distribution.

kappafloat or array_like of floats

Dispersion of the distribution, has to be >=0.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mu and kappa are both scalars. Otherwise, np.broadcast(mu, kappa).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized von Mises distribution.

See also

scipy.stats.vonmises

probability density function, distribution, or cumulative distribution function, etc.

Generator.vonmises

which should be used for new code.

Notes

The probability density for the von Mises distribution is

\[p(x) = \frac{e^{\kappa \cos(x-\mu)}}{2\pi I_0(\kappa)},\]

where \(\mu\) is the mode and \(\kappa\) the dispersion, and \(I_0(\kappa)\) is the modified Bessel function of order 0.
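
The density can be evaluated with NumPy alone, because np.i0 provides the modified Bessel function of order 0. The sketch below is illustrative only: vonmises_pdf is a hypothetical helper name, and the parameter values and seed are assumptions.

```python
import numpy as np

def vonmises_pdf(x, mu, kappa):
    """Von Mises density on [-pi, pi]; np.i0 is the order-0 modified Bessel function."""
    return np.exp(kappa * np.cos(x - mu)) / (2 * np.pi * np.i0(kappa))

mu, kappa = 0.0, 4.0

# Riemann sum over one full period; the density should integrate to 1.
grid = np.linspace(-np.pi, np.pi, 2000, endpoint=False)
dx = 2 * np.pi / grid.size
total = np.sum(vonmises_pdf(grid, mu, kappa)) * dx

# Samples fall in [-pi, pi] and concentrate around mu.
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
s = rng.vonmises(mu, kappa, size=10_000)
```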

The von Mises distribution is named for Richard Edler von Mises, who was born in Austria-Hungary, in what is now Ukraine. He fled to the United States in 1939 and became a professor at Harvard. He worked in probability theory, aerodynamics, fluid mechanics, and philosophy of science.

References

1

Abramowitz, M. and Stegun, I. A. (Eds.). “Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th printing,” New York: Dover, 1972.

2

von Mises, R., “Mathematical Theory of Probability and Statistics”, New York: Academic Press, 1964.

Examples

Draw samples from the distribution:

>>> mu, kappa = 0.0, 4.0 # mean and dispersion  
>>> s = np.random.vonmises(mu, kappa, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> from scipy.special import i0  
>>> plt.hist(s, 50, density=True)  
>>> x = np.linspace(-np.pi, np.pi, num=51)  
>>> y = np.exp(kappa*np.cos(x-mu))/(2*np.pi*i0(kappa))  
>>> plt.plot(x, y, linewidth=2, color='r')  
>>> plt.show()  
dask.array.random.wald(mean, scale, size=None, chunks='auto', **kwargs)

Draw samples from a Wald, or inverse Gaussian, distribution.

This docstring was copied from numpy.random.mtrand.RandomState.wald.

Some inconsistencies with the Dask version may exist.

As the scale approaches infinity, the distribution becomes more like a Gaussian. Some references claim that the Wald is an inverse Gaussian with mean equal to 1, but this is by no means universal.

The inverse Gaussian distribution was first studied in relationship to Brownian motion. In 1956 M.C.K. Tweedie used the name inverse Gaussian because there is an inverse relationship between the time to cover a unit distance and distance covered in unit time.

Note

New code should use the wald method of a default_rng() instance instead; see random-quick-start.

Parameters
meanfloat or array_like of floats

Distribution mean, must be > 0.

scalefloat or array_like of floats

Scale parameter, must be > 0.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if mean and scale are both scalars. Otherwise, np.broadcast(mean, scale).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized Wald distribution.

See also

Generator.wald

which should be used for new code.

Notes

The probability density function for the Wald distribution is

\[P(x;mean,scale) = \sqrt{\frac{scale}{2\pi x^3}}\, e^{\frac{-scale(x-mean)^2}{2\cdot mean^2 x}}\]

As noted above, the inverse Gaussian distribution first arose from attempts to model Brownian motion. It is also a competitor to the Weibull for use in reliability modeling and in modeling stock returns and interest rate processes.
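
The density above can be checked numerically against samples. The following sketch assumes the standard moments of the inverse Gaussian distribution, E[X] = mean and Var[X] = mean**3 / scale; wald_pdf is a hypothetical helper, and the parameter values, seed, and integration range are illustrative choices.

```python
import numpy as np

def wald_pdf(x, mean, scale):
    """Wald (inverse Gaussian) density for x > 0."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(scale / (2 * np.pi * x**3)) * np.exp(
        -scale * (x - mean) ** 2 / (2 * mean**2 * x)
    )

mean, scale = 3.0, 2.0

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
s = rng.wald(mean, scale, size=200_000)

# Compare sample moments with E[X] = mean and Var[X] = mean**3 / scale.
sample_mean = s.mean()
sample_var = s.var()

# Riemann-sum check that the density integrates to about 1 over (0, 400].
grid = np.linspace(1e-6, 400.0, 400_001)
total = np.sum(wald_pdf(grid, mean, scale)) * (grid[1] - grid[0])
```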

References

1

Brighton Webs Ltd., Wald Distribution, https://web.archive.org/web/20090423014010/http://www.brighton-webs.co.uk:80/distributions/wald.asp

2

Chhikara, Raj S., and Folks, J. Leroy, “The Inverse Gaussian Distribution: Theory, Methodology, and Applications”, CRC Press, 1988.

3

Wikipedia, “Inverse Gaussian distribution” https://en.wikipedia.org/wiki/Inverse_Gaussian_distribution

Examples

Draw values from the distribution and plot the histogram:

>>> import matplotlib.pyplot as plt  
>>> h = plt.hist(np.random.wald(3, 2, 100000), bins=200, density=True)  
>>> plt.show()  
dask.array.random.weibull(a, size=None, chunks='auto', **kwargs)

Draw samples from a Weibull distribution.

This docstring was copied from numpy.random.mtrand.RandomState.weibull.

Some inconsistencies with the Dask version may exist.

Draw samples from a 1-parameter Weibull distribution with the given shape parameter a.

\[X = (-\ln(U))^{1/a}\]

Here, U is drawn from the uniform distribution over (0,1].

The more common 2-parameter Weibull, including a scale parameter \(\lambda\), is just \(X = \lambda(-\ln(U))^{1/a}\).
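
The transformation above can be applied directly to uniform draws. This is a minimal sketch assuming the Generator interface; the scale value lam = 2.0, the seed, and the variable names are illustrative. The expected mean used for comparison is the standard result lam * Gamma(1 + 1/a).

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
a, lam = 5.0, 2.0               # shape and an illustrative scale

# rng.random() draws from [0, 1); 1 - u lies in (0, 1], so log never sees 0.
u = 1.0 - rng.random(100_000)
x = lam * (-np.log(u)) ** (1.0 / a)

# Same distribution via the built-in 1-parameter sampler, rescaled by lam.
y = lam * rng.weibull(a, size=100_000)

# Theoretical mean of the 2-parameter Weibull: lam * Gamma(1 + 1/a).
expected_mean = lam * math.gamma(1.0 + 1.0 / a)
```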

Note

New code should use the weibull method of a default_rng() instance instead; see random-quick-start.

Parameters
afloat or array_like of floats

Shape parameter of the distribution. Must be nonnegative.

sizeint or tuple of ints, optional

Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if a is a scalar. Otherwise, np.array(a).size samples are drawn.

Returns
outndarray or scalar

Drawn samples from the parameterized Weibull distribution.

See also

scipy.stats.weibull_max
scipy.stats.weibull_min
scipy.stats.genextreme
gumbel
Generator.weibull

which should be used for new code.

Notes

The Weibull (or Type III asymptotic extreme value distribution for smallest values, SEV Type III, or Rosin-Rammler distribution) is one of a class of Generalized Extreme Value (GEV) distributions used in modeling extreme value problems. This class includes the Gumbel and Frechet distributions.

The probability density for the Weibull distribution is

\[p(x) = \frac{a} {\lambda}(\frac{x}{\lambda})^{a-1}e^{-(x/\lambda)^a},\]

where \(a\) is the shape and \(\lambda\) the scale.

The function has its peak (the mode) at \(\lambda(\frac{a-1}{a})^{1/a}\).

When a = 1, the Weibull distribution reduces to the exponential distribution.
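
Both statements, the location of the mode and the reduction to the exponential distribution, can be verified on a grid. The sketch below is illustrative; weibull_pdf is a hypothetical helper and the grid range is an assumption.

```python
import numpy as np

def weibull_pdf(x, a, lam=1.0):
    """Two-parameter Weibull density for x >= 0 (shape a, scale lam)."""
    z = np.asarray(x, dtype=float) / lam
    return (a / lam) * z ** (a - 1) * np.exp(-(z**a))

a, lam = 5.0, 1.0
grid = np.linspace(0.0, 3.0, 300_001)

# Location of the numerical maximum versus the closed-form mode.
numeric_mode = grid[np.argmax(weibull_pdf(grid, a, lam))]
formula_mode = lam * ((a - 1) / a) ** (1 / a)

# With a = 1 the density is exp(-x / lam) / lam, the exponential density.
exp_match = np.allclose(weibull_pdf(grid, 1.0, lam), np.exp(-grid / lam) / lam)
```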

References

1

Waloddi Weibull, Royal Technical University, Stockholm, 1939 “A Statistical Theory Of The Strength Of Materials”, Ingeniorsvetenskapsakademiens Handlingar Nr 151, 1939, Generalstabens Litografiska Anstalts Forlag, Stockholm.

2

Waloddi Weibull, “A Statistical Distribution Function of Wide Applicability”, Journal Of Applied Mechanics ASME Paper 1951.

3

Wikipedia, “Weibull distribution”, https://en.wikipedia.org/wiki/Weibull_distribution

Examples

Draw samples from the distribution:

>>> a = 5. # shape  
>>> s = np.random.weibull(a, 1000)  

Display the histogram of the samples, along with the probability density function:

>>> import matplotlib.pyplot as plt  
>>> x = np.arange(1,100.)/50.  
>>> def weib(x,n,a):  
...     return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)
>>> count, bins, ignored = plt.hist(s)  
>>> scale = count.max()/weib(x, 1., 5.).max()  
>>> plt.plot(x, weib(x, 1., 5.)*scale)  
>>> plt.show()  
dask.array.random.zipf(a, size=None, chunks='auto', **kwargs)

Standard distributions

dask.array.stats.ttest_ind(a, b, axis=0, equal_var=True)

Calculate the T-test for the means of two independent samples of scores.

This docstring was copied from scipy.stats.ttest_ind.

Some inconsistencies with the Dask version may exist.

This is a two-sided test for the null hypothesis that 2 independent samples have identical average (expected) values. This test assumes that the populations have identical variances by default.

Parameters
a, barray_like

The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).

axisint or None, optional

Axis along which to compute test. If None, compute over the whole arrays, a, and b.

equal_varbool, optional

If True (default), perform a standard independent 2 sample test that assumes equal population variances [1]. If False, perform Welch’s t-test, which does not assume equal population variance [2].

New in version 0.11.0.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional (Not supported in Dask)

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat or array

The calculated t-statistic.

pvaluefloat or array

The two-tailed p-value.

Notes

We can use this test if we observe two independent samples from the same or different populations, e.g. exam scores of boys and girls or of two ethnic groups. The test measures whether the average (expected) value differs significantly across samples. If we observe a large p-value, for example larger than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages.
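
To make the difference between the equal-variance test and Welch's test concrete, the sketch below computes both t statistics by hand with NumPy. It is not the SciPy or Dask implementation: t_statistics is a hypothetical helper, it covers 1-D samples only, and it returns statistics without p-values. The sample sizes and seed are illustrative.

```python
import numpy as np

def t_statistics(a, b):
    """Return (pooled t, Welch t, Welch degrees of freedom) for 1-D samples."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n1, n2 = a.size, b.size
    v1, v2 = a.var(ddof=1), b.var(ddof=1)
    diff = a.mean() - b.mean()

    # Pooled variance: assumes both populations share one variance.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = diff / np.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Welch: each sample keeps its own variance estimate.
    se2 = v1 / n1 + v2 / n2
    t_welch = diff / np.sqrt(se2)
    df_welch = se2**2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t_pooled, t_welch, df_welch

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
x = rng.normal(5, 10, size=500)
y_same_n = rng.normal(5, 20, size=500)
y_small = rng.normal(5, 20, size=100)

equal_n = t_statistics(x, y_same_n)    # the two t values coincide
unequal_n = t_statistics(x, y_small)   # the two t values differ
```

With equal sample sizes the two statistics are algebraically identical and only the degrees of freedom differ; with unequal sizes and unequal variances the statistics themselves diverge.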

References

1

https://en.wikipedia.org/wiki/T-test#Independent_two-sample_t-test

2

https://en.wikipedia.org/wiki/Welch%27s_t-test

Examples

>>> from scipy import stats  
>>> np.random.seed(12345678)  

Test with samples with identical means:

>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500)  
>>> rvs2 = stats.norm.rvs(loc=5,scale=10,size=500)  
>>> stats.ttest_ind(rvs1,rvs2)  
(0.26833823296239279, 0.78849443369564776)
>>> stats.ttest_ind(rvs1,rvs2, equal_var = False)  
(0.26833823296239279, 0.78849452749500748)

ttest_ind underestimates p for unequal variances:

>>> rvs3 = stats.norm.rvs(loc=5, scale=20, size=500)  
>>> stats.ttest_ind(rvs1, rvs3)  
(-0.46580283298287162, 0.64145827413436174)
>>> stats.ttest_ind(rvs1, rvs3, equal_var = False)  
(-0.46580283298287162, 0.64149646246569292)

When n1 != n2, the equal variance t-statistic is no longer equal to the unequal variance t-statistic:

>>> rvs4 = stats.norm.rvs(loc=5, scale=20, size=100)  
>>> stats.ttest_ind(rvs1, rvs4)  
(-0.99882539442782481, 0.3182832709103896)
>>> stats.ttest_ind(rvs1, rvs4, equal_var = False)  
(-0.69712570584654099, 0.48716927725402048)

T-test with different means, variance, and n:

>>> rvs5 = stats.norm.rvs(loc=8, scale=20, size=100)  
>>> stats.ttest_ind(rvs1, rvs5)  
(-1.4679669854490653, 0.14263895620529152)
>>> stats.ttest_ind(rvs1, rvs5, equal_var = False)  
(-0.94365973617132992, 0.34744170334794122)
dask.array.stats.ttest_1samp(a, popmean, axis=0, nan_policy='propagate')

Calculate the T-test for the mean of ONE group of scores.

This docstring was copied from scipy.stats.ttest_1samp.

Some inconsistencies with the Dask version may exist.

This is a two-sided test for the null hypothesis that the expected value (mean) of a sample of independent observations a is equal to the given population mean, popmean.

Parameters
aarray_like

Sample observation.

popmeanfloat or array_like

Expected value in null hypothesis. If array_like, then it must have the same shape as a excluding the axis dimension.

axisint or None, optional

Axis along which to compute test. If None, compute over the whole array a.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat or array

t-statistic.

pvaluefloat or array

Two-sided p-value.

Examples

>>> from scipy import stats  
>>> np.random.seed(7654567)  # fix seed to get the same result  
>>> rvs = stats.norm.rvs(loc=5, scale=10, size=(50,2))  

Test whether the mean of the random sample equals the true mean (5.0), and whether it equals a different value (0.0). We reject the null hypothesis in the second case and don’t reject it in the first case.

>>> stats.ttest_1samp(rvs,5.0)  
(array([-0.68014479, -0.04323899]), array([ 0.49961383,  0.96568674]))
>>> stats.ttest_1samp(rvs,0.0)  
(array([ 2.77025808,  4.11038784]), array([ 0.00789095,  0.00014999]))

Examples using axis and non-scalar dimension for population mean.

>>> stats.ttest_1samp(rvs,[5.0,0.0])  
(array([-0.68014479,  4.11038784]), array([  4.99613833e-01,   1.49986458e-04]))
>>> stats.ttest_1samp(rvs.T,[5.0,0.0],axis=1)  
(array([-0.68014479,  4.11038784]), array([  4.99613833e-01,   1.49986458e-04]))
>>> stats.ttest_1samp(rvs,[[5.0],[0.0]])  
(array([[-0.68014479, -0.04323899],
       [ 2.77025808,  4.11038784]]), array([[  4.99613833e-01,   9.65686743e-01],
       [  7.89094663e-03,   1.49986458e-04]]))
dask.array.stats.ttest_rel(a, b, axis=0, nan_policy='propagate')

Calculate the t-test on TWO RELATED samples of scores, a and b.

This docstring was copied from scipy.stats.ttest_rel.

Some inconsistencies with the Dask version may exist.

This is a two-sided test for the null hypothesis that 2 related or repeated samples have identical average (expected) values.

Parameters
a, barray_like

The arrays must have the same shape.

axisint or None, optional

Axis along which to compute test. If None, compute over the whole arrays, a, and b.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat or array

t-statistic.

pvaluefloat or array

Two-sided p-value.

Notes

Examples for use are scores of the same set of students in different exams, or repeated sampling from the same units. The test measures whether the average score differs significantly across samples (e.g. exams). If we observe a large p-value, for example greater than 0.05 or 0.1, then we cannot reject the null hypothesis of identical average scores. If the p-value is smaller than the threshold, e.g. 1%, 5% or 10%, then we reject the null hypothesis of equal averages. Small p-values are associated with large t-statistics.
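
The paired statistic is a one-sample t statistic computed on the element-wise differences. The sketch below illustrates this with NumPy only; the data are synthetic, and the names before, after, t_paired, and t_indep are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
before = rng.normal(5, 10, size=500)
after = before + rng.normal(0.5, 0.2, size=500)  # synthetic paired scores

# The paired test is a one-sample test on the differences against 0.
d = before - after
n = d.size
t_paired = d.mean() / (d.std(ddof=1) / np.sqrt(n))

# Ignoring the pairing (independent-samples formula, equal n) gives a
# statistic of much smaller magnitude, because the variation shared by
# both measurements of each unit is not removed.
se_indep = np.sqrt(before.var(ddof=1) / n + after.var(ddof=1) / n)
t_indep = (before.mean() - after.mean()) / se_indep
```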

References

https://en.wikipedia.org/wiki/T-test#Dependent_t-test_for_paired_samples

Examples

>>> from scipy import stats  
>>> np.random.seed(12345678) # fix random seed to get same numbers  
>>> rvs1 = stats.norm.rvs(loc=5,scale=10,size=500)  
>>> rvs2 = (stats.norm.rvs(loc=5,scale=10,size=500) +  
...         stats.norm.rvs(scale=0.2,size=500))
>>> stats.ttest_rel(rvs1,rvs2)  
(0.24101764965300962, 0.80964043445811562)
>>> rvs3 = (stats.norm.rvs(loc=8,scale=10,size=500) +  
...         stats.norm.rvs(scale=0.2,size=500))
>>> stats.ttest_rel(rvs1,rvs3)  
(-3.9995108708727933, 7.3082402191726459e-005)
dask.array.stats.chisquare(f_obs, f_exp=None, ddof=0, axis=0)

Calculate a one-way chi-square test.

This docstring was copied from scipy.stats.chisquare.

Some inconsistencies with the Dask version may exist.

The chi-square test tests the null hypothesis that the categorical data has the given frequencies.

Parameters
f_obsarray_like

Observed frequencies in each category.

f_exparray_like, optional

Expected frequencies in each category. By default the categories are assumed to be equally likely.

ddofint, optional

“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.

axisint or None, optional

The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.

Returns
chisqfloat or ndarray

The chi-squared test statistic. The value is a float if axis is None or f_obs and f_exp are 1-D.

pfloat or ndarray

The p-value of the test. The value is a float if ddof and the return value chisq are scalars.

See also

scipy.stats.power_divergence

Notes

This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.

The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not chi-square, in which case this test is not appropriate.
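
The statistic and the role of ddof can be reproduced by hand for 1-D input. The sketch below is an illustration, not the SciPy or Dask implementation: chi2_sf and chisquare_by_hand are hypothetical helpers, and the tail probability uses the closed-form expressions for the chi-squared distribution with integer degrees of freedom.

```python
import math
import numpy as np

def chi2_sf(x, df):
    """Upper tail probability of a chi-squared variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)**i / i!
        terms = (half**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: erfc term plus a finite sum over odd double factorials.
    total, denom = 0.0, 1.0
    for j in range(1, (df - 1) // 2 + 1):
        denom *= 2 * j - 1
        total += x ** (j - 0.5) / denom
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2 / math.pi) * math.exp(-half) * total)

def chisquare_by_hand(f_obs, f_exp=None, ddof=0):
    """Pearson statistic and p-value with k - 1 - ddof degrees of freedom."""
    f_obs = np.asarray(f_obs, dtype=float)
    if f_exp is None:
        f_exp = np.full_like(f_obs, f_obs.mean())  # equally likely categories
    else:
        f_exp = np.asarray(f_exp, dtype=float)
    stat = float(np.sum((f_obs - f_exp) ** 2 / f_exp))
    return stat, chi2_sf(stat, f_obs.size - 1 - ddof)

stat, p = chisquare_by_hand([16, 18, 16, 14, 12, 12])
stat1, p1 = chisquare_by_hand([16, 18, 16, 14, 12, 12], ddof=1)
```

Raising ddof leaves the statistic unchanged and only lowers the degrees of freedom used for the p-value.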

References

1

Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171022032306/http://vassarstats.net:80/textbook/ch8pt1.html

2

“Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test

Examples

When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies.

>>> from scipy.stats import chisquare  
>>> chisquare([16, 18, 16, 14, 12, 12])  
(2.0, 0.84914503608460956)

With f_exp the expected frequencies can be given.

>>> chisquare([16, 18, 16, 14, 12, 12], f_exp=[16, 16, 16, 16, 16, 8])  
(3.5, 0.62338762774958223)

When f_obs is 2-D, by default the test is applied to each column.

>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T  
>>> obs.shape  
(6, 2)
>>> chisquare(obs)  
(array([ 2.        ,  6.66666667]), array([ 0.84914504,  0.24663415]))

By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.

>>> chisquare(obs, axis=None)  
(23.31034482758621, 0.015975692534127565)
>>> chisquare(obs.ravel())  
(23.31034482758621, 0.015975692534127565)

ddof is the change to make to the default degrees of freedom.

>>> chisquare([16, 18, 16, 14, 12, 12], ddof=1)  
(2.0, 0.73575888234288467)

The calculation of the p-values is done by broadcasting the chi-squared statistic with ddof.

>>> chisquare([16, 18, 16, 14, 12, 12], ddof=[0,1,2])  
(2.0, array([ 0.84914504,  0.73575888,  0.5724067 ]))

f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we use axis=1:

>>> chisquare([16, 18, 16, 14, 12, 12],  
...           f_exp=[[16, 16, 16, 16, 16, 8], [8, 20, 20, 16, 12, 12]],
...           axis=1)
(array([ 3.5 ,  9.25]), array([ 0.62338763,  0.09949846]))
dask.array.stats.power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None)

Cressie-Read power divergence statistic and goodness of fit test.

This docstring was copied from scipy.stats.power_divergence.

Some inconsistencies with the Dask version may exist.

This function tests the null hypothesis that the categorical data has the given frequencies, using the Cressie-Read power divergence statistic.

Parameters
f_obsarray_like

Observed frequencies in each category.

f_exparray_like, optional

Expected frequencies in each category. By default the categories are assumed to be equally likely.

ddofint, optional

“Delta degrees of freedom”: adjustment to the degrees of freedom for the p-value. The p-value is computed using a chi-squared distribution with k - 1 - ddof degrees of freedom, where k is the number of observed frequencies. The default value of ddof is 0.

axisint or None, optional

The axis of the broadcast result of f_obs and f_exp along which to apply the test. If axis is None, all values in f_obs are treated as a single data set. Default is 0.

lambda_float or str, optional

The power in the Cressie-Read power divergence statistic. The default is 1. For convenience, lambda_ may be assigned one of the following strings, in which case the corresponding numerical value is used:

String              Value   Description
"pearson"             1     Pearson's chi-squared statistic.
                            In this case, the function is
                            equivalent to `stats.chisquare`.
"log-likelihood"      0     Log-likelihood ratio. Also known as
                            the G-test [3].
"freeman-tukey"      -1/2   Freeman-Tukey statistic.
"mod-log-likelihood" -1     Modified log-likelihood ratio.
"neyman"             -2     Neyman's statistic.
"cressie-read"        2/3   The power recommended in [5].

Returns
statisticfloat or ndarray

The Cressie-Read power divergence test statistic. The value is a float if axis is None or if f_obs and f_exp are 1-D.

pvaluefloat or ndarray

The p-value of the test. The value is a float if ddof and the return value stat are scalars.

See also

chisquare

Notes

This test is invalid when the observed or expected frequencies in each category are too small. A typical rule is that all of the observed and expected frequencies should be at least 5.

When lambda_ is less than zero, the formula for the statistic involves dividing by f_obs, so a warning or error may be generated if any value in f_obs is 0.

Similarly, a warning or error may be generated if any value in f_exp is zero when lambda_ >= 0.

The default degrees of freedom, k-1, are for the case when no parameters of the distribution are estimated. If p parameters are estimated by efficient maximum likelihood then the correct degrees of freedom are k-1-p. If the parameters are estimated in a different way, then the dof can be between k-1-p and k-1. However, it is also possible that the asymptotic distribution is not chi-square, in which case this test is not appropriate.
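
To show where the divisions by f_obs and f_exp mentioned above come from, the sketch below computes the statistic for 1-D frequencies. It is an illustration under assumptions, not the SciPy or Dask implementation: power_divergence_stat and LAMBDA_NAMES are hypothetical names, no p-value is computed, and masked arrays and the axis argument are not handled.

```python
import numpy as np

LAMBDA_NAMES = {  # string aliases accepted for lambda_
    "pearson": 1.0, "log-likelihood": 0.0, "freeman-tukey": -0.5,
    "mod-log-likelihood": -1.0, "neyman": -2.0, "cressie-read": 2.0 / 3.0,
}

def power_divergence_stat(f_obs, f_exp=None, lambda_=1.0):
    """Cressie-Read statistic for 1-D frequencies (statistic only)."""
    f_obs = np.asarray(f_obs, dtype=float)
    f_exp = (np.full_like(f_obs, f_obs.mean()) if f_exp is None
             else np.asarray(f_exp, dtype=float))
    lam = LAMBDA_NAMES[lambda_] if isinstance(lambda_, str) else lambda_
    if lam == 1:
        terms = (f_obs - f_exp) ** 2 / f_exp            # Pearson chi-squared
    elif lam == 0:
        terms = 2.0 * f_obs * np.log(f_obs / f_exp)     # G-test, limit lam -> 0
    elif lam == -1:
        terms = 2.0 * f_exp * np.log(f_exp / f_obs)     # limit lam -> -1
    else:
        # General case; a negative power of f_obs / f_exp divides by f_obs.
        terms = 2.0 * f_obs * ((f_obs / f_exp) ** lam - 1) / (lam * (lam + 1))
    return float(terms.sum())

obs = [16, 18, 16, 14, 12, 12]
g_stat = power_divergence_stat(obs, lambda_="log-likelihood")
pearson_stat = power_divergence_stat(obs, lambda_="pearson")
```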

This function handles masked arrays. If an element of f_obs or f_exp is masked, then data at that position is ignored, and does not count towards the size of the data set.

New in version 0.13.0.

References

1

Lowry, Richard. “Concepts and Applications of Inferential Statistics”. Chapter 8. https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html

2

“Chi-squared test”, https://en.wikipedia.org/wiki/Chi-squared_test

3

“G-test”, https://en.wikipedia.org/wiki/G-test

4

Sokal, R. R. and Rohlf, F. J. “Biometry: the principles and practice of statistics in biological research”, New York: Freeman (1981)

5

Cressie, N. and Read, T. R. C., “Multinomial Goodness-of-Fit Tests”, J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984), pp. 440-464.

Examples

(See chisquare for more examples.)

When just f_obs is given, it is assumed that the expected frequencies are uniform and given by the mean of the observed frequencies. Here we perform a G-test (i.e. use the log-likelihood ratio statistic):

>>> from scipy.stats import power_divergence  
>>> power_divergence([16, 18, 16, 14, 12, 12], lambda_='log-likelihood')  
(2.006573162632538, 0.84823476779463769)

The expected frequencies can be given with the f_exp argument:

>>> power_divergence([16, 18, 16, 14, 12, 12],  
...                  f_exp=[16, 16, 16, 16, 16, 8],
...                  lambda_='log-likelihood')
(3.3281031458963746, 0.6495419288047497)

When f_obs is 2-D, by default the test is applied to each column.

>>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T  
>>> obs.shape  
(6, 2)
>>> power_divergence(obs, lambda_="log-likelihood")  
(array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))

By setting axis=None, the test is applied to all data in the array, which is equivalent to applying the test to the flattened array.

>>> power_divergence(obs, axis=None)  
(23.31034482758621, 0.015975692534127565)
>>> power_divergence(obs.ravel())  
(23.31034482758621, 0.015975692534127565)

ddof is the change to make to the default degrees of freedom.

>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)  
(2.0, 0.73575888234288467)

The calculation of the p-values is done by broadcasting the test statistic with ddof.

>>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])  
(2.0, array([ 0.84914504,  0.73575888,  0.5724067 ]))

f_obs and f_exp are also broadcast. In the following, f_obs has shape (6,) and f_exp has shape (2, 6), so the result of broadcasting f_obs and f_exp has shape (2, 6). To compute the desired chi-squared statistics, we must use axis=1:

>>> power_divergence([16, 18, 16, 14, 12, 12],  
...                  f_exp=[[16, 16, 16, 16, 16, 8],
...                         [8, 20, 20, 16, 12, 12]],
...                  axis=1)
(array([ 3.5 ,  9.25]), array([ 0.62338763,  0.09949846]))
dask.array.stats.skew(a, axis=0, bias=True, nan_policy='propagate')

Compute the sample skewness of a data set.

This docstring was copied from scipy.stats.skew.

Some inconsistencies with the Dask version may exist.

For normally distributed data, the skewness should be about zero. For unimodal continuous distributions, a skewness value greater than zero means that there is more weight in the right tail of the distribution. The function skewtest can be used to determine if the skewness value is close enough to zero, statistically speaking.

Parameters
andarray

Input array.

axisint or None, optional

Axis along which skewness is calculated. Default is 0. If None, compute over the whole array a.

biasbool, optional

If False, then the calculations are corrected for statistical bias.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
skewnessndarray

The skewness of values along an axis, returning 0 where all values are equal.

Notes

The sample skewness is computed as the Fisher-Pearson coefficient of skewness, i.e.

\[g_1=\frac{m_3}{m_2^{3/2}}\]

where

\[m_i=\frac{1}{N}\sum_{n=1}^N(x[n]-\bar{x})^i\]

is the biased sample \(i\texttt{th}\) central moment, and \(\bar{x}\) is the sample mean. If bias is False, the calculations are corrected for bias and the value computed is the adjusted Fisher-Pearson standardized moment coefficient, i.e.

\[G_1=\frac{k_3}{k_2^{3/2}}= \frac{\sqrt{N(N-1)}}{N-2}\frac{m_3}{m_2^{3/2}}.\]
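
The two formulas translate directly into NumPy. The sketch below is a 1-D illustration rather than the library implementation; sample_skewness is a hypothetical helper, and it does not special-case constant input, where the second moment is zero.

```python
import numpy as np

def sample_skewness(x, bias=True):
    """Fisher-Pearson skewness g1, or the adjusted G1 when bias=False."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered**2)   # biased second central moment
    m3 = np.mean(centered**3)   # biased third central moment
    g1 = m3 / m2**1.5
    if bias:
        return g1
    return np.sqrt(n * (n - 1)) / (n - 2) * g1

data = [2, 8, 0, 4, 1, 9, 9, 0]
g1 = sample_skewness(data)
G1 = sample_skewness(data, bias=False)
```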

References

1

Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000. Section 2.2.24.1

Examples

>>> from scipy.stats import skew  
>>> skew([1, 2, 3, 4, 5])  
0.0
>>> skew([2, 8, 0, 4, 1, 9, 9, 0])  
0.2650554122698573
dask.array.stats.skewtest(a, axis=0, nan_policy='propagate')

Test whether the skew is different from the normal distribution.

This docstring was copied from scipy.stats.skewtest.

Some inconsistencies with the Dask version may exist.

This function tests the null hypothesis that the skewness of the population that the sample was drawn from is the same as that of a corresponding normal distribution.

Parameters
aarray

The data to be tested.

axisint or None, optional

Axis along which statistics are calculated. Default is 0. If None, compute over the whole array a.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat

The computed z-score for this test.

pvaluefloat

Two-sided p-value for the hypothesis test.

Notes

The sample size must be at least 8.

References

1

R. B. D’Agostino, A. J. Belanger and R. B. D’Agostino Jr., “A suggestion for using powerful and informative tests of normality”, American Statistician 44, pp. 316-321, 1990.

Examples

>>> from scipy.stats import skewtest  
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8])  
SkewtestResult(statistic=1.0108048609177787, pvalue=0.3121098361421897)
>>> skewtest([2, 8, 0, 4, 1, 9, 9, 0])  
SkewtestResult(statistic=0.44626385374196975, pvalue=0.6554066631275459)
>>> skewtest([1, 2, 3, 4, 5, 6, 7, 8000])  
SkewtestResult(statistic=3.571773510360407, pvalue=0.0003545719905823133)
>>> skewtest([100, 100, 100, 100, 100, 100, 100, 101])  
SkewtestResult(statistic=3.5717766638478072, pvalue=0.000354567720281634)
dask.array.stats.kurtosis(a, axis=0, fisher=True, bias=True, nan_policy='propagate')

Compute the kurtosis (Fisher or Pearson) of a dataset.

This docstring was copied from scipy.stats.kurtosis.

Some inconsistencies with the Dask version may exist.

Kurtosis is the fourth central moment divided by the square of the variance. If Fisher’s definition is used, then 3.0 is subtracted from the result to give 0.0 for a normal distribution.

If bias is False then the kurtosis is calculated using k statistics to eliminate bias coming from biased moment estimators.

Use kurtosistest to see if result is close enough to normal.

Parameters
aarray

Data for which the kurtosis is calculated.

axisint or None, optional

Axis along which the kurtosis is calculated. Default is 0. If None, compute over the whole array a.

fisherbool, optional

If True, Fisher’s definition is used (normal ==> 0.0). If False, Pearson’s definition is used (normal ==> 3.0).

biasbool, optional

If False, then the calculations are corrected for statistical bias.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. ‘propagate’ returns nan, ‘raise’ throws an error, ‘omit’ performs the calculations ignoring nan values. Default is ‘propagate’.

Returns
kurtosisarray

The kurtosis of values along an axis. If all values are equal, return -3 for Fisher’s definition and 0 for Pearson’s definition.

References

1

Zwillinger, D. and Kokoska, S. (2000). CRC Standard Probability and Statistics Tables and Formulae. Chapman & Hall: New York. 2000.

Examples

In Fisher’s definition, the kurtosis of the normal distribution is zero. In the following example, the kurtosis is close to zero, because it was calculated from the dataset, not from the continuous distribution.

>>> from scipy.stats import norm, kurtosis  
>>> data = norm.rvs(size=1000, random_state=3)  
>>> kurtosis(data)  
-0.06928694200380558

The distribution with a higher kurtosis has a heavier tail. The zero valued kurtosis of the normal distribution in Fisher’s definition can serve as a reference point.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import scipy.stats as stats  
>>> from scipy.stats import kurtosis  
>>> x = np.linspace(-5, 5, 100)  
>>> ax = plt.subplot()  
>>> distnames = ['laplace', 'norm', 'uniform']  
>>> for distname in distnames:  
...     if distname == 'uniform':
...         dist = getattr(stats, distname)(loc=-2, scale=4)
...     else:
...         dist = getattr(stats, distname)
...     data = dist.rvs(size=1000)
...     kur = kurtosis(data, fisher=True)
...     y = dist.pdf(x)
...     ax.plot(x, y, label="{}, {}".format(distname, round(kur, 3)))
...     ax.legend()

The Laplace distribution has a heavier tail than the normal distribution. The uniform distribution (which has negative kurtosis) has the thinnest tail.
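The definition given at the top of this entry (fourth central moment divided by the square of the variance, minus 3.0 under Fisher's definition) can be written out directly. The sketch below is a plain-Python illustration with a hypothetical helper name, `sample_kurtosis`; it covers only the biased estimator on a flat list and does not attempt the k-statistics correction used when bias is False.

```python
def sample_kurtosis(values, fisher=True):
    """Biased sample kurtosis of a flat list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Biased central moments: m_2 is the variance, m_4 the fourth moment.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    pearson = m4 / m2 ** 2
    # Fisher's definition shifts the value so a normal distribution gives 0.
    return pearson - 3.0 if fisher else pearson

print(sample_kurtosis([1, 2, 3, 4, 5]))                # Fisher
print(sample_kurtosis([1, 2, 3, 4, 5], fisher=False))  # Pearson
```

The two definitions always differ by exactly 3.0, so they rank datasets identically.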

dask.array.stats.kurtosistest(a, axis=0, nan_policy='propagate')

Test whether a dataset has normal kurtosis.

This docstring was copied from scipy.stats.kurtosistest.

Some inconsistencies with the Dask version may exist.

This function tests the null hypothesis that the kurtosis of the population from which the sample was drawn is that of the normal distribution: kurtosis = 3(n-1)/(n+1).

Parameters
aarray

Array of the sample data.

axisint or None, optional

Axis along which to compute test. Default is 0. If None, compute over the whole array a.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat

The computed z-score for this test.

pvaluefloat

The two-sided p-value for the hypothesis test.

Notes

Valid only for n>20. This function uses the method described in [1].

References

1

see e.g. F. J. Anscombe, W. J. Glynn, “Distribution of the kurtosis statistic b2 for normal samples”, Biometrika, vol. 70, pp. 227-234, 1983.

Examples

>>> import numpy as np
>>> from scipy.stats import kurtosistest
>>> kurtosistest(list(range(20)))  
KurtosistestResult(statistic=-1.7058104152122062, pvalue=0.08804338332528348)
>>> np.random.seed(28041990)  
>>> s = np.random.normal(0, 1, 1000)  
>>> kurtosistest(s)  
KurtosistestResult(statistic=1.2317590987707365, pvalue=0.21803908613450895)
dask.array.stats.normaltest(a, axis=0, nan_policy='propagate')

Test whether a sample differs from a normal distribution.

This docstring was copied from scipy.stats.normaltest.

Some inconsistencies with the Dask version may exist.

This function tests the null hypothesis that a sample comes from a normal distribution. It is based on D’Agostino and Pearson’s [1], [2] test that combines skew and kurtosis to produce an omnibus test of normality.

Parameters
aarray_like

The array containing the sample to be tested.

axisint or None, optional

Axis along which to compute test. Default is 0. If None, compute over the whole array a.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
statisticfloat or array

s^2 + k^2, where s is the z-score returned by skewtest and k is the z-score returned by kurtosistest.

pvaluefloat or array

A 2-sided chi squared probability for the hypothesis test.

References

1

D’Agostino, R. B. (1971), “An omnibus test of normality for moderate and large sample size”, Biometrika, 58, 341-348

2

D’Agostino, R. and Pearson, E. S. (1973), “Tests for departure from normality”, Biometrika, 60, 613-622

Examples

>>> import numpy as np
>>> from scipy import stats
>>> pts = 1000  
>>> np.random.seed(28041990)  
>>> a = np.random.normal(0, 1, size=pts)  
>>> b = np.random.normal(2, 1, size=pts)  
>>> x = np.concatenate((a, b))  
>>> k2, p = stats.normaltest(x)  
>>> alpha = 1e-3  
>>> print("p = {:g}".format(p))  
p = 3.27207e-11
>>> if p < alpha:  # null hypothesis: x comes from a normal distribution  
...     print("The null hypothesis can be rejected")
... else:
...     print("The null hypothesis cannot be rejected")
The null hypothesis can be rejected
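How the returned statistic and p-value relate to the two z-scores can be sketched as follows. The helper `omnibus_from_z` is hypothetical, and the sketch assumes the chi-squared reference distribution has two degrees of freedom (one per combined z-score), for which the survival function has the closed form exp(-x / 2).

```python
import math

def omnibus_from_z(s, k):
    """Combine a skewtest z-score s and a kurtosistest z-score k."""
    k2 = s * s + k * k
    # Chi-squared survival function with 2 degrees of freedom (assumed here).
    pvalue = math.exp(-k2 / 2.0)
    return k2, pvalue

statistic, pvalue = omnibus_from_z(1.0, 2.0)
print(statistic, pvalue)
```

Because the z-scores enter squared, a large deviation in either skewness or kurtosis is enough to produce a small p-value.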
dask.array.stats.f_oneway(*args)

Perform one-way ANOVA.

This docstring was copied from scipy.stats.f_oneway.

Some inconsistencies with the Dask version may exist.

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Parameters
sample1, sample2, …array_like

The sample measurements for each group.

Returns
statisticfloat

The computed F-value of the test.

pvaluefloat

The associated p-value from the F-distribution.

Notes

The ANOVA test has important assumptions that must be satisfied in order for the associated p-value to be valid.

  1. The samples are independent.

  2. Each sample is from a normally distributed population.

  3. The population standard deviations of the groups are all equal. This property is known as homoscedasticity.

If these assumptions are not true for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (scipy.stats.kruskal) although with some loss of power.

The algorithm is from Heiman [2], pp. 394-7.

References

1

R. Lowry, “Concepts and Applications of Inferential Statistics”, Chapter 14, 2014, http://vassarstats.net/textbook/

2

G.W. Heiman, “Understanding research methods and statistics: An integrated introduction for psychology”, Houghton, Mifflin and Company, 2001.

3

G.H. McDonald, “Handbook of Biological Statistics”, One-way ANOVA. http://www.biostathandbook.com/onewayanova.html

Examples

>>> import scipy.stats as stats  

[3] Here are some data on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).

>>> tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,  
...              0.0659, 0.0923, 0.0836]
>>> newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835,  
...            0.0725]
>>> petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]  
>>> magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764,  
...            0.0689]
>>> tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]  
>>> stats.f_oneway(tillamook, newport, petersburg, magadan, tvarminne)  
(7.1210194716424473, 0.00028122423145345439)
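To make the F-value above less opaque, the following plain-Python sketch computes it from the between-group and within-group sums of squares, using the same mussel data. The helper name `one_way_f` is hypothetical, and this is the textbook formula rather than the SciPy or Dask code path; the p-value is left out because it needs the F-distribution, which the standard library does not provide.

```python
def one_way_f(*groups):
    """F statistic of a one-way ANOVA for two or more lists of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
             0.0659, 0.0923, 0.0836]
newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835, 0.0725]
petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764, 0.0689]
tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]

f_value = one_way_f(tillamook, newport, petersburg, magadan, tvarminne)
print(f_value)  # should agree with the statistic shown above
```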
dask.array.stats.moment(a, moment=1, axis=0, nan_policy='propagate')

Calculate the nth moment about the mean for a sample.

This docstring was copied from scipy.stats.moment.

Some inconsistencies with the Dask version may exist.

A moment is a specific quantitative measure of the shape of a set of points. It is often used to calculate coefficients of skewness and kurtosis due to its close relationship with them.

Parameters
aarray_like

Input array.

momentint or array_like of ints, optional

Order of central moment that is returned. Default is 1.

axisint or None, optional

Axis along which the central moment is computed. Default is 0. If None, compute over the whole array a.

nan_policy{‘propagate’, ‘raise’, ‘omit’}, optional

Defines how to handle when input contains nan. The following options are available (default is ‘propagate’):

  • ‘propagate’: returns nan

  • ‘raise’: throws an error

  • ‘omit’: performs the calculations ignoring nan values

Returns
n-th central momentndarray or float

The appropriate moment along the given axis or over all values if axis is None. The denominator for the moment calculation is the number of observations, no degrees of freedom correction is done.

See also

kurtosis, skew, describe

Notes

The k-th central moment of a data sample is:

\[m_k = \frac{1}{n} \sum_{i = 1}^n (x_i - \bar{x})^k\]

Where n is the number of samples and x-bar is the mean. This function uses exponentiation by squares [1] for efficiency.

References

1

https://eli.thegreenplace.net/2009/03/21/efficient-integer-exponentiation-algorithms

Examples

>>> from scipy.stats import moment  
>>> moment([1, 2, 3, 4, 5], moment=1)  
0.0
>>> moment([1, 2, 3, 4, 5], moment=2)  
2.0
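The central-moment formula in the Notes, together with the idea of exponentiation by squaring, can be sketched in plain Python. Both helper names are hypothetical, the code works on a flat list only, and it is an illustration of the technique rather than the library's implementation.

```python
def power_by_squaring(base, exponent):
    """Raise base to a non-negative integer exponent in O(log exponent) steps."""
    result = 1.0
    while exponent > 0:
        if exponent & 1:      # use the current square if this bit is set
            result *= base
        base *= base          # square for the next bit
        exponent >>= 1
    return result

def central_moment(values, k=1):
    """k-th central moment with denominator n (no degrees-of-freedom correction)."""
    n = len(values)
    mean = sum(values) / n
    return sum(power_by_squaring(v - mean, k) for v in values) / n

print(central_moment([1, 2, 3, 4, 5], k=1))
print(central_moment([1, 2, 3, 4, 5], k=2))
```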
dask.array.image.imread(filename, imread=None, preprocess=None)

Read a stack of images into a dask array

Parameters
filename: string

A globstring like ‘myfile.*.png’

imread: function (optional)

Optionally provide custom imread function. Function should expect a filename and produce a numpy array. Defaults to skimage.io.imread.

preprocess: function (optional)

Optionally provide custom function to preprocess the image. Function should expect a numpy array for a single image.

Returns
Dask array of all images stacked along the first dimension. All images will be treated as individual chunks.

Examples

>>> from dask.array.image import imread
>>> im = imread('2015-*-*.png')  
>>> im.shape  
(365, 1000, 1000, 3)
dask.array.gufunc.apply_gufunc(func, signature, *args, **kwargs)

Apply a generalized ufunc or similar python function to arrays.

signature determines if the function consumes or produces core dimensions. The remaining dimensions in given input arrays (*args) are considered loop dimensions and are required to broadcast naturally against each other.

In other terms, this function is like np.vectorize, but for the blocks of dask arrays. If the function itself shall also be vectorized use vectorize=True for convenience.

Parameters
funccallable

Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.

signature: string

Specifies what core dimensions are consumed and produced by func, according to the specification of the NumPy gufunc signature [2].

*argsnumeric

Input arrays or scalars to the callable function.

axes: List of tuples, optional, keyword only

A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.

axis: int, optional, keyword only

A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].

keepdims: bool, optional, keyword only

If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.

output_dtypesOptional, dtype or list of dtypes, keyword only

Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.

output_sizesdict, optional, keyword only

Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.

vectorize: bool, keyword only

If set to True, np.vectorize is applied to func for convenience. Defaults to False.

allow_rechunk: Optional, bool, keyword only

Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.

**kwargsdict

Extra keyword arguments to pass to func

Returns
Single dask.array.Array or tuple of dask.array.Array

References

1

https://docs.scipy.org/doc/numpy/reference/ufuncs.html

2

https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html

Examples

>>> import dask.array as da
>>> import numpy as np
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> mean, std = da.apply_gufunc(stats, "(i)->(),()", a)
>>> mean.compute().shape
(10, 20)
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> a = da.random.normal(size=(   20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> c = da.apply_gufunc(outer_product, "(i),(j)->(i,j)", a, b, vectorize=True)
>>> c.compute().shape
(10, 20, 30, 40)
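The output shapes in the examples above follow from a simple rule: core dimensions named in the signature are taken from the trailing axes of each input, the remaining loop dimensions are broadcast against each other, and the output core dimensions are appended. The sketch below models only that shape bookkeeping in plain Python. The helper `gufunc_output_shapes` is hypothetical, is not part of Dask or NumPy, and does not handle new output core dimensions (the case covered by output_sizes).

```python
import re

def _parse_side(side):
    """Turn '(i),(j)' into [('i',), ('j',)] and '()' into [()]."""
    groups = re.findall(r"\(([^)]*)\)", side)
    return [tuple(name for name in group.split(",") if name) for group in groups]

def gufunc_output_shapes(signature, *shapes):
    """Predict output shapes from a gufunc signature and input shapes."""
    inputs, outputs = signature.replace(" ", "").split("->")
    in_core, out_core = _parse_side(inputs), _parse_side(outputs)
    core_sizes, loops = {}, []
    for shape, core in zip(shapes, in_core):
        split = len(shape) - len(core)
        loops.append(shape[:split])            # leading axes are loop dimensions
        core_sizes.update(zip(core, shape[split:]))  # trailing axes are core
    ndim = max(len(loop) for loop in loops)
    loop_shape = []
    for axis in range(-ndim, 0):
        # Sizes on this right-aligned axis, ignoring broadcastable 1s.
        sizes_here = {loop[axis] for loop in loops if len(loop) >= -axis} - {1}
        if len(sizes_here) > 1:
            raise ValueError("loop dimensions do not broadcast")
        loop_shape.append(sizes_here.pop() if sizes_here else 1)
    return [tuple(loop_shape) + tuple(core_sizes[n] for n in core)
            for core in out_core]

print(gufunc_output_shapes("(i)->(),()", (10, 20, 30)))
print(gufunc_output_shapes("(i),(j)->(i,j)", (20, 30), (10, 1, 40)))
```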
dask.array.gufunc.as_gufunc(signature=None, **kwargs)

Decorator for dask.array.gufunc.

Parameters
signatureString

Specifies what core dimensions are consumed and produced by func, according to the specification of the NumPy gufunc signature [2].

axes: List of tuples, optional, keyword only

A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.

axis: int, optional, keyword only

A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].

keepdims: bool, optional, keyword only

If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.

output_dtypesOptional, dtype or list of dtypes, keyword only

Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.

output_sizesdict, optional, keyword only

Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.

vectorize: bool, keyword only

If set to True, np.vectorize is applied to func for convenience. Defaults to False.

allow_rechunk: Optional, bool, keyword only

Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.

Returns
Decorator for pyfunc that itself returns a gufunc.

References

1

https://docs.scipy.org/doc/numpy/reference/ufuncs.html

2

https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html

Examples

>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> @da.as_gufunc("(i)->(),()", output_dtypes=(float, float))
... def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> mean, std = stats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=(   20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> @da.as_gufunc("(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
... def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> c = outer_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
dask.array.gufunc.gufunc(pyfunc, **kwargs)

Binds pyfunc into dask.array.apply_gufunc when called.

Parameters
pyfunccallable

Function to call like func(*args, **kwargs) on input arrays (*args) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1] (if this is not the case, set vectorize=True). If this function returns multiple outputs, output_core_dims has to be set as well.

signatureString, keyword only

Specifies what core dimensions are consumed and produced by func, according to the specification of the NumPy gufunc signature [2].

axes: List of tuples, optional, keyword only

A list of tuples with indices of axes a generalized ufunc should operate on. For instance, for a signature of "(i,j),(j,k)->(i,k)" appropriate for matrix multiplication, the base elements are two-dimensional matrices and these are taken to be stored in the two last axes of each argument. The corresponding axes keyword would be [(-2, -1), (-2, -1), (-2, -1)]. For simplicity, for generalized ufuncs that operate on 1-dimensional arrays (vectors), a single integer is accepted instead of a single-element tuple, and for generalized ufuncs for which all outputs are scalars, the output tuples can be omitted.

axis: int, optional, keyword only

A single axis over which a generalized ufunc should operate. This is a short-cut for ufuncs that operate over a single, shared core dimension, equivalent to passing in axes with entries of (axis,) for each single-core-dimension argument and () for all others. For instance, for a signature "(i),(i)->()", it is equivalent to passing in axes=[(axis,), (axis,), ()].

keepdims: bool, optional, keyword only

If this is set to True, axes which are reduced over will be left in the result as a dimension with size one, so that the result will broadcast correctly against the inputs. This option can only be used for generalized ufuncs that operate on inputs that all have the same number of core dimensions and with outputs that have no core dimensions, i.e., with signatures like "(i),(i)->()" or "(m,m)->()". If used, the location of the dimensions in the output can be controlled with axes and axis.

output_dtypesOptional, dtype or list of dtypes, keyword only

Valid numpy dtype specification or list thereof. If not given, a call of func with a small set of data is performed in order to try to automatically determine the output dtypes.

output_sizesdict, optional, keyword only

Optional mapping from dimension names to sizes for outputs. Only used if new core dimensions (not found on inputs) appear on outputs.

vectorize: bool, keyword only

If set to True, np.vectorize is applied to func for convenience. Defaults to False.

allow_rechunk: Optional, bool, keyword only

Allows rechunking, otherwise chunk sizes need to match and core dimensions are to consist only of one chunk. Warning: enabling this can increase memory usage significantly. Defaults to False.

Returns
Wrapped function

References

1

https://docs.scipy.org/doc/numpy/reference/ufuncs.html

2

https://docs.scipy.org/doc/numpy/reference/c-api/generalized-ufuncs.html

Examples

>>> import dask.array as da
>>> import numpy as np
>>> a = da.random.normal(size=(10,20,30), chunks=(5, 10, 30))
>>> def stats(x):
...     return np.mean(x, axis=-1), np.std(x, axis=-1)
>>> gustats = da.gufunc(stats, signature="(i)->(),()", output_dtypes=(float, float))
>>> mean, std = gustats(a)
>>> mean.compute().shape
(10, 20)
>>> a = da.random.normal(size=(   20,30), chunks=(10, 30))
>>> b = da.random.normal(size=(10, 1,40), chunks=(5, 1, 40))
>>> def outer_product(x, y):
...     return np.einsum("i,j->ij", x, y)
>>> guouter_product = da.gufunc(outer_product, signature="(i),(j)->(i,j)", output_dtypes=float, vectorize=True)
>>> c = guouter_product(a, b)
>>> c.compute().shape
(10, 20, 30, 40)
dask.array.core.map_blocks(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)

Map a function across all blocks of a dask array.

Parameters
funccallable

Function to apply to every block in the array.

argsdask arrays or other objects
dtypenp.dtype, optional

The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.

chunkstuple, optional

Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.

drop_axisnumber or iterable, optional

Dimensions lost by the function.

new_axisnumber or iterable, optional

New dimensions created by the function. Note that these are applied after drop_axis (if present).

tokenstring, optional

The key prefix to use for the output array. If not provided, will be determined from the function name.

namestring, optional

The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.

**kwargs :

Other keyword arguments to pass to function. Values must be constants (not dask.arrays)

See also

dask.array.blockwise

Generalized operation with control over block alignment.

Examples

>>> import dask.array as da
>>> import numpy as np
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])

The da.map_blocks function can also accept multiple arrays.

>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])

If the function changes shape of the blocks then you must provide chunks explicitly.

>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))

You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.

>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))

If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.

>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])

If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.

map_blocks aligns blocks by block position without regard to shape. In the following example we have two arrays with the same number of blocks but with different shapes and chunk sizes.

>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))

The relevant attribute to match is numblocks.

>>> x.numblocks
(10,)
>>> y.numblocks
(10,)

If these match (up to broadcasting rules) then we can map arbitrary functions across blocks

>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])

Your block function gets information about where it is in the array by accepting a special block_info keyword argument.

>>> def func(block, block_info=None):
...     pass

This will receive the following information:

>>> block_info  
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}

For each argument and keyword argument that is a dask array (their positions are the top-level keys), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, chunk index 4 along the first dimension), and the array location (for example, the slice corresponding to 400:500). The same information is provided for the output, with the key None, plus the shape and dtype that should be returned.
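The relationship between 'chunk-location' and 'array-location' in the block_info shown above can be reproduced with a cumulative sum over the chunk sizes. The sketch below is plain Python with a hypothetical helper name, `array_location`; it assumes chunks are given as a tuple of per-dimension size tuples, in the style of the chunks attribute of a dask array.

```python
from itertools import accumulate

def array_location(chunks, chunk_location):
    """Map a block position to (start, stop) pairs in the full array."""
    location = []
    for sizes, index in zip(chunks, chunk_location):
        # Block boundaries along this dimension: 0, s0, s0 + s1, ...
        bounds = [0] + list(accumulate(sizes))
        location.append((bounds[index], bounds[index + 1]))
    return location

# Ten chunks of 100 elements, block index 4 along the only dimension.
print(array_location(((100,) * 10,), (4,)))  # [(400, 500)]
```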

These features can be combined to synthesize an array from scratch, for example:

>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])

You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.

>>> y.map_blocks(lambda x: x + 1, token='increment')
dask.array<increment, shape=(100,), dtype=int64, chunksize=(10,), chunktype=numpy.ndarray>
dask.array.core.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)

Tensor operation: Generalized inner and outer products

A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.

Parameters
funccallable

Function to apply to individual tuples of blocks

out_inditerable

Block pattern of the output, something like ‘ijk’ or (1, 2, 3)

*argssequence of Array, index pairs

Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)

**kwargsdict

Extra keyword arguments to pass to function

dtypenp.dtype

Datatype of resulting array.

concatenatebool, keyword only

If true, concatenate arrays along dummy indices; otherwise provide lists.

adjust_chunksdict

Dictionary mapping index to function to be applied to chunk sizes

new_axesdict, keyword only

New indexes and their dimension lengths

Examples

2D embarrassingly parallel operation from two arrays, x, and y.

>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8')  # z = x + y  

Outer product multiplying x by y, two 1-d vectors

>>> z = blockwise(operator.mul, 'ij', x, 'i', y, 'j', dtype='f8')  

z = x.T

>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)  

The transpose case above is illustrative because it does the same transposition both on each in-memory block by calling np.transpose and on the order of the blocks themselves, by switching the order of the index ij -> ji.

We can compose these same patterns with more variables and more complex in-memory functions

z = X + Y.T

>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')  

Any index, like i, missing from the output index is interpreted as a contraction (note that this differs from the Einstein convention; repeated indices do not imply contraction). In the case of a contraction, the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead, pass concatenate=True.

Inner product multiplying x by y, two 1-d vectors

>>> def sequence_dot(x_blocks, y_blocks):
...     result = 0
...     for x, y in zip(x_blocks, y_blocks):
...         result += x.dot(y)
...     return result
>>> z = blockwise(sequence_dot, '', x, 'i', y, 'i', dtype='f8')  
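The index rules described above (output blocks enumerated over the output index, inputs looked up by their own index, contracted indices delivered as a sequence of blocks) can be modelled without Dask. The toy below is a sketch under loud assumptions: `toy_blockwise` is a hypothetical helper, blocks are plain lists stored in dicts keyed by block position, and contracted blocks are passed as a flat list. It is meant to show the bookkeeping only, not how Dask builds its task graph.

```python
from itertools import product

def toy_blockwise(func, out_ind, *args):
    """Apply func per output block; args alternate (blocks, index) pairs."""
    pairs = list(zip(args[0::2], args[1::2]))
    # Number of blocks along each index letter, read off the block keys.
    nblocks = {}
    for blocks, ind in pairs:
        for key in blocks:
            for letter, pos in zip(ind, key):
                nblocks[letter] = max(nblocks.get(letter, 0), pos + 1)
    out = {}
    for out_key in product(*(range(nblocks[l]) for l in out_ind)):
        where = dict(zip(out_ind, out_key))
        call_args = []
        for blocks, ind in pairs:
            # Letters absent from the output index are contracted.
            contracted = [l for l in ind if l not in out_ind]
            if contracted:
                seq = []
                for c_key in product(*(range(nblocks[l]) for l in contracted)):
                    full = dict(where)
                    full.update(zip(contracted, c_key))
                    seq.append(blocks[tuple(full[l] for l in ind)])
                call_args.append(seq)  # func receives a list of blocks
            else:
                call_args.append(blocks[tuple(where[l] for l in ind)])
        out[out_key] = func(*call_args)
    return out

x_blocks = {(0,): [1, 2], (1,): [3, 4]}
y_blocks = {(0,): [5, 6], (1,): [7, 8]}

def sequence_dot(xs, ys):
    return sum(a * b for x, y in zip(xs, ys) for a, b in zip(x, y))

def block_outer(x, y):
    return [[a * b for b in y] for a in x]

inner = toy_blockwise(sequence_dot, "", x_blocks, "i", y_blocks, "i")
outer = toy_blockwise(block_outer, "ij", x_blocks, "i", y_blocks, "j")
print(inner)          # {(): 70}
print(outer[(1, 0)])  # [[15, 18], [20, 24]]
```

In the inner product the index i is missing from the output, so each call sees every block of x and y; in the outer product no index is contracted, so each call sees exactly one block of each input.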

Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.

>>> def f(x):
...     return x[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': 5}, dtype=x.dtype)  

New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).

>>> z = blockwise(f, 'az', x, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype)  

If the applied function changes the size of each chunk, you can specify this with an adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.

>>> def double(x):
...     return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)  

Include literals by indexing with None

>>> y = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)

dask.array.core.normalize_chunks(chunks, shape=None, limit=None, dtype=None, previous_chunks=None)

Normalize chunks to tuple of tuples

This takes in a variety of input types and information and produces a full tuple-of-tuples result for chunks, suitable to be passed to Array or rechunk or any other operation that creates a Dask array.

Parameters
chunks: tuple, int, dict, or string

The chunks to be normalized. See examples below for more details

shape: Tuple[int]

The shape of the array

limit: int (optional)

The maximum block size to target in bytes, if freedom is given to choose

dtype: np.dtype
previous_chunks: Tuple[Tuple[int]] optional

Chunks from a previous array that we should use for inspiration when rechunking auto dimensions. If not provided but auto-chunking exists then auto-dimensions will prefer square-like chunk shapes.

Examples

Specify uniform chunk sizes

>>> normalize_chunks((2, 2), shape=(5, 6))
((2, 2, 1), (2, 2, 2))

Also passes through fully explicit tuple-of-tuples

>>> normalize_chunks(((2, 2, 1), (2, 2, 2)), shape=(5, 6))
((2, 2, 1), (2, 2, 2))

Cleans up lists to tuples

>>> normalize_chunks([[2, 2], [3, 3]])
((2, 2), (3, 3))

Expands integer inputs 10 -> (10, 10)

>>> normalize_chunks(10, shape=(30, 5))
((10, 10, 10), (5,))

Expands dict inputs

>>> normalize_chunks({0: 2, 1: 3}, shape=(6, 6))
((2, 2, 2), (3, 3))

The values -1 and None get mapped to full size

>>> normalize_chunks((5, -1), shape=(10, 10))
((5, 5), (10,))

Use the value “auto” to automatically determine chunk sizes along certain dimensions. This uses the limit= and dtype= keywords to determine how large to make the chunks. The term “auto” can be used anywhere an integer can be used. See array chunking documentation for more information.

>>> normalize_chunks(("auto",), shape=(20,), limit=5, dtype='uint8')
((5, 5, 5, 5),)

You can also use byte sizes (see dask.utils.parse_bytes) in place of “auto” to ask for a particular size

>>> normalize_chunks("1kiB", shape=(2000,), dtype='float32')
((250, 250, 250, 250, 250, 250, 250, 250),)

Respects null dimensions

>>> normalize_chunks((), shape=(0, 0))
((0,), (0,))

Array Methods

class dask.array.Array

Parallel Dask Array

A parallel nd-array comprised of many numpy arrays arranged in a grid.

This constructor is for advanced uses only. For normal use see the da.from_array function.

Parameters
dask: dict

Task dependency graph

name: string

Name of array in dask

shape: tuple of ints

Shape of the entire array

chunks: iterable of tuples

Block sizes along each dimension

dtype: str or dtype

Typecode or data-type for the new Dask Array

meta: empty ndarray

Empty ndarray created with same NumPy backend, ndim and dtype as the Dask Array being created (overrides dtype)

all(axis=None, out=None, keepdims=False)

This docstring was copied from numpy.ndarray.all.

Some inconsistencies with the Dask version may exist.

Returns True if all elements evaluate to True.

Refer to numpy.all for full documentation.

See also

numpy.all

equivalent function

any(axis=None, out=None, keepdims=False)

This docstring was copied from numpy.ndarray.any.

Some inconsistencies with the Dask version may exist.

Returns True if any of the elements of a evaluate to True.

Refer to numpy.any for full documentation.

See also

numpy.any

equivalent function

argmax(axis=None, out=None)

This docstring was copied from numpy.ndarray.argmax.

Some inconsistencies with the Dask version may exist.

Return indices of the maximum values along the given axis.

Refer to numpy.argmax for full documentation.

See also

numpy.argmax

equivalent function

argmin(axis=None, out=None)

This docstring was copied from numpy.ndarray.argmin.

Some inconsistencies with the Dask version may exist.

Return indices of the minimum values along the given axis of a.

Refer to numpy.argmin for detailed documentation.

See also

numpy.argmin

equivalent function

argtopk(self, k, axis=-1, split_every=None)

The indices of the top k elements of an array.

See da.argtopk for docstring
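
As a minimal sketch of how argtopk relates to topk, the following uses a small made-up array split into chunks of two. The variable names and data are illustrative assumptions.

```python
import numpy as np
import dask.array as da

values = np.array([5, 1, 3, 6, 2, 4])
x = da.from_array(values, chunks=2)

# The two largest elements, sorted from largest to smallest.
top_values = x.topk(2).compute()

# The positions of those elements in the full array (not within a chunk).
top_indices = x.argtopk(2).compute()
```

Indexing the original data with top_indices should reproduce top_values, which is a convenient sanity check when the indices are used to look up rows in a related array.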

astype(self, dtype, **kwargs)

Copy of the array, cast to a specified type.

Parameters
dtype: str or dtype

Typecode or data-type to which the array is cast.

casting: {‘no’, ‘equiv’, ‘safe’, ‘same_kind’, ‘unsafe’}, optional

Controls what kind of data casting may occur. Defaults to ‘unsafe’ for backwards compatibility.

  • ‘no’ means the data types should not be cast at all.

  • ‘equiv’ means only byte-order changes are allowed.

  • ‘safe’ means only casts which can preserve values are allowed.

  • ‘same_kind’ means only safe casts or casts within a kind, like float64 to float32, are allowed.

  • ‘unsafe’ means any data conversions may be done.

copy: bool, optional

By default, astype always returns a newly allocated array. If this is set to False and the dtype requirement is satisfied, the input array is returned instead of a copy.
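
A short sketch of the casting parameter in practice, using made-up data. It assumes that a cast disallowed by the chosen casting rule raises TypeError, as it does in NumPy; the try block covers both graph construction and computation so the sketch does not depend on when the error is raised.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.array([1.7, -2.2, 3.9]), chunks=2)

# Default casting='unsafe': float -> int is allowed and truncates toward zero.
as_int = x.astype('int32')
int_values = as_int.compute()

# casting='safe' refuses a cast that could lose information.
try:
    x.astype('int32', casting='safe').compute()
    safe_cast_rejected = False
except TypeError:
    safe_cast_rejected = True
```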

property blocks

Slice an array by blocks

This allows blockwise slicing of a Dask array. You can perform normal NumPy-style slicing, but rather than slicing elements of the array you slice along blocks. For example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.

You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.

Returns
A Dask array

Examples

>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.blocks[0].compute()
array([0, 1])
>>> x.blocks[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.blocks[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.blocks[[-1, 0]].compute()
array([8, 9, 0, 1])
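
The x.blocks[0, ::2] form mentioned in the description needs a two-dimensional array. A minimal sketch, with an illustrative 4 x 6 array cut into 2 x 2 blocks:

```python
import numpy as np
import dask.array as da

a = np.arange(24).reshape(4, 6)
x = da.from_array(a, chunks=(2, 2))   # a 2 x 3 grid of 2 x 2 blocks

block_grid = x.numblocks

# First row of blocks, every other block along the second axis.
first_row_every_other = x.blocks[0, ::2]
selected_shape = first_row_every_other.shape
selected = first_row_every_other.compute()
```

The result stays two-dimensional even though the first index is an integer, and it covers rows 0 to 1 and columns 0, 1, 4, 5 of the original data.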
choose(choices, out=None, mode='raise')

This docstring was copied from numpy.ndarray.choose.

Some inconsistencies with the Dask version may exist.

Use an index array to construct a new array from a set of choices.

Refer to numpy.choose for full documentation.

See also

numpy.choose

equivalent function

clip(min=None, max=None, out=None, **kwargs)

This docstring was copied from numpy.ndarray.clip.

Some inconsistencies with the Dask version may exist.

Return an array whose values are limited to [min, max]. One of max or min must be given.

Refer to numpy.clip for full documentation.

See also

numpy.clip

equivalent function

compute_chunk_sizes(self)

Compute the chunk sizes for a Dask array. This is especially useful when the chunk sizes are unknown (e.g., when indexing one Dask array with another).

Notes

This function modifies the Dask array in-place.

Examples

>>> import dask.array as da
>>> import numpy as np
>>> x = da.from_array([-2, -1, 0, 1, 2], chunks=2)
>>> x.chunks
((2, 2, 1),)
>>> y = x[x <= 0]
>>> y.chunks
((nan, nan, nan),)
>>> y.compute_chunk_sizes()  # in-place computation
dask.array<getitem, shape=(3,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> y.chunks
((2, 1, 0),)

copy(self)

Copy array. This is a no-op for dask.arrays, which are immutable.

cumprod(axis=None, dtype=None, out=None)

This docstring was copied from numpy.ndarray.cumprod.

Some inconsistencies with the Dask version may exist.

Return the cumulative product of the elements along the given axis.

Refer to numpy.cumprod for full documentation.

See also

numpy.cumprod

equivalent function

cumsum(axis=None, dtype=None, out=None)

This docstring was copied from numpy.ndarray.cumsum.

Some inconsistencies with the Dask version may exist.

Return the cumulative sum of the elements along the given axis.

Refer to numpy.cumsum for full documentation.

See also

numpy.cumsum

equivalent function

dot(b, out=None)

This docstring was copied from numpy.ndarray.dot.

Some inconsistencies with the Dask version may exist.

Dot product of two arrays.

Refer to numpy.dot for full documentation.

See also

numpy.dot

equivalent function

Examples

>>> a = np.eye(2)  
>>> b = np.ones((2, 2)) * 2  
>>> a.dot(b)  
array([[2.,  2.],
       [2.,  2.]])

This array method can be conveniently chained:

>>> a.dot(b).dot(b)  
array([[8.,  8.],
       [8.,  8.]])

flatten([order])

This docstring was copied from numpy.ndarray.ravel.

Some inconsistencies with the Dask version may exist.

Return a flattened array.

Refer to numpy.ravel for full documentation.

See also

numpy.ravel

equivalent function

ndarray.flat

a flat iterator on the array.

property itemsize

Length of one array element in bytes

map_blocks(func, *args, name=None, token=None, dtype=None, chunks=None, drop_axis=[], new_axis=None, meta=None, **kwargs)

Map a function across all blocks of a dask array.

Parameters
func: callable

Function to apply to every block in the array.

args: dask arrays or other objects
dtype: np.dtype, optional

The dtype of the output array. It is recommended to provide this. If not provided, will be inferred by applying the function to a small set of fake data.

chunks: tuple, optional

Chunk shape of resulting blocks if the function does not preserve shape. If not provided, the resulting array is assumed to have the same block structure as the first input array.

drop_axis: number or iterable, optional

Dimensions lost by the function.

new_axis: number or iterable, optional

New dimensions created by the function. Note that these are applied after drop_axis (if present).

token: string, optional

The key prefix to use for the output array. If not provided, will be determined from the function name.

name: string, optional

The key name to use for the output array. Note that this fully specifies the output key name, and must be unique. If not provided, will be determined by a hash of the arguments.

**kwargs:

Other keyword arguments to pass to function. Values must be constants (not dask.arrays)

See also

dask.array.blockwise

Generalized operation with control over block alignment.

Examples

>>> import dask.array as da
>>> x = da.arange(6, chunks=3)
>>> x.map_blocks(lambda x: x * 2).compute()
array([ 0,  2,  4,  6,  8, 10])

The da.map_blocks function can also accept multiple arrays.

>>> d = da.arange(5, chunks=2)
>>> e = da.arange(5, chunks=2)
>>> f = da.map_blocks(lambda a, b: a + b**2, d, e)
>>> f.compute()
array([ 0,  2,  6, 12, 20])

If the function changes shape of the blocks then you must provide chunks explicitly.

>>> y = x.map_blocks(lambda x: x[::2], chunks=((2, 2),))

You have a bit of freedom in specifying chunks. If all of the output chunk sizes are the same, you can provide just that chunk size as a single tuple.

>>> a = da.arange(18, chunks=(6,))
>>> b = a.map_blocks(lambda x: x[:3], chunks=(3,))

If the function changes the dimension of the blocks you must specify the created or destroyed dimensions.

>>> b = a.map_blocks(lambda x: x[None, :, None], chunks=(1, 6, 1),
...                  new_axis=[0, 2])

If chunks is specified but new_axis is not, then it is inferred to add the necessary number of axes on the left.

map_blocks aligns blocks by block positions without regard to shape. In the following example we have two arrays with the same number of blocks but with different shape and chunk sizes.

>>> x = da.arange(1000, chunks=(100,))
>>> y = da.arange(100, chunks=(10,))

The relevant attribute to match is numblocks.

>>> x.numblocks
(10,)
>>> y.numblocks
(10,)

If these match (up to broadcasting rules) then we can map arbitrary functions across blocks

>>> def func(a, b):
...     return np.array([a.max(), b.max()])
>>> da.map_blocks(func, x, y, chunks=(2,), dtype='i8')
dask.array<func, shape=(20,), dtype=int64, chunksize=(2,), chunktype=numpy.ndarray>
>>> _.compute()
array([ 99,   9, 199,  19, 299,  29, 399,  39, 499,  49, 599,  59, 699,
        69, 799,  79, 899,  89, 999,  99])

Your block function can get information about where it is in the array by accepting a special block_info keyword argument.

>>> def func(block, block_info=None):
...     pass

This will receive the following information:

>>> block_info  
{0: {'shape': (1000,),
     'num-chunks': (10,),
     'chunk-location': (4,),
     'array-location': [(400, 500)]},
 None: {'shape': (1000,),
        'num-chunks': (10,),
        'chunk-location': (4,),
        'array-location': [(400, 500)],
        'chunk-shape': (100,),
        'dtype': dtype('float64')}}

For each argument and keyword argument that is a dask array (indexed first by argument position), you will receive the shape of the full array, the number of chunks of the full array in each dimension, the chunk location (for example, chunk index 4 along the first dimension), and the array location (for example, the slice corresponding to 400:500). The same information is provided for the output, under the key None, plus the chunk shape and dtype that should be returned.

These features can be combined to synthesize an array from scratch, for example:

>>> def func(block_info=None):
...     loc = block_info[None]['array-location'][0]
...     return np.arange(loc[0], loc[1])
>>> da.map_blocks(func, chunks=((4, 4),), dtype=np.float64)
dask.array<func, shape=(8,), dtype=float64, chunksize=(4,), chunktype=numpy.ndarray>
>>> _.compute()
array([0, 1, 2, 3, 4, 5, 6, 7])
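
block_info can also be read for an input array rather than the output. The following sketch labels every element with the index of the chunk it lives in; the function name label_with_chunk_index and the data are illustrative assumptions.

```python
import numpy as np
import dask.array as da


def label_with_chunk_index(block, block_info=None):
    # block_info[0] describes the first array argument;
    # 'chunk-location' is the position of this block in the chunk grid.
    chunk_index = block_info[0]['chunk-location'][0]
    return np.full_like(block, chunk_index)


x = da.arange(6, chunks=2)   # three chunks of two elements
labels = x.map_blocks(label_with_chunk_index, dtype=x.dtype).compute()
```

Passing dtype explicitly matters here: the function cannot run on placeholder data without a real block_info, so the output dtype is stated up front instead of relying on inference.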

You may specify the key name prefix of the resulting task in the graph with the optional token keyword argument.

>>> x.map_blocks(lambda x: x + 1, token='increment')
dask.array<increment, shape=(1000,), dtype=int64, chunksize=(100,), chunktype=numpy.ndarray>

map_overlap(self, func, depth, boundary=None, trim=True, **kwargs)

Map a function over blocks of the array with some overlap

We share neighboring zones between blocks of the array, then map a function, then trim away the neighboring strips.

Parameters
func: function

The function to apply to each extended block

depth: int, tuple, or dict

The number of elements that each block should share with its neighbors. If a tuple or dict, this can be different per axis.

boundary: str, tuple, dict

How to handle the boundaries. Values include ‘reflect’, ‘periodic’, ‘nearest’, ‘none’, or any constant value like 0 or np.nan

trim: bool

Whether or not to trim depth elements from each block after calling the map function. Set this to False if your mapping function already does this for you

**kwargs:

Other keyword arguments valid in map_blocks

Examples

>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])

>>> x = np.arange(16).reshape((4, 4))
>>> d = da.from_array(x, chunks=(2, 2))
>>> d.map_overlap(lambda x: x + x.size, depth=1).compute()
array([[16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
>>> func = lambda x: x + x.size
>>> depth = {0: 1, 1: 1}
>>> boundary = {0: 'reflect', 1: 'none'}
>>> d.map_overlap(func, depth, boundary).compute()  
array([[12,  13,  14,  15],
       [16,  17,  18,  19],
       [20,  21,  22,  23],
       [24,  25,  26,  27]])

max(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

This docstring was copied from numpy.ndarray.max.

Some inconsistencies with the Dask version may exist.

Return the maximum along a given axis.

Refer to numpy.amax for full documentation.

See also

numpy.amax

equivalent function

mean(axis=None, dtype=None, out=None, keepdims=False)

This docstring was copied from numpy.ndarray.mean.

Some inconsistencies with the Dask version may exist.

Returns the average of the array elements along given axis.

Refer to numpy.mean for full documentation.

See also

numpy.mean

equivalent function

min(axis=None, out=None, keepdims=False, initial=<no value>, where=True)

This docstring was copied from numpy.ndarray.min.

Some inconsistencies with the Dask version may exist.

Return the minimum along a given axis.

Refer to numpy.amin for full documentation.

See also

numpy.amin

equivalent function

moment(self, order, axis=None, dtype=None, keepdims=False, ddof=0, split_every=None, out=None)

Calculate the nth centralized moment.

Parameters
order: int

Order of the moment that is returned, must be >= 2.

axis: int, optional

Axis along which the central moment is computed. The default is to compute the moment of the flattened array.

dtype: data-type, optional

Type to use in computing the moment. For arrays of integer type the default is float64; for arrays of float types it is the same as the array type.

keepdims: bool, optional

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the original array.

ddof: int, optional

“Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default ddof is zero.

Returns
moment: ndarray

References

[1] Pebay, Philippe (2008), “Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments”, Technical Report SAND2008-6212, Sandia National Laboratories.
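
A minimal sketch comparing moment against the textbook definition of a central moment, the mean of (x - mean(x)) ** order with the default ddof=0. The data and variable names are illustrative assumptions.

```python
import numpy as np
import dask.array as da

a = np.array([1.0, 2.0, 3.0, 10.0])
x = da.from_array(a, chunks=2)

second = float(x.moment(2).compute())   # second central moment
third = float(x.moment(3).compute())    # third central moment

# Reference values computed directly from the definition with NumPy.
expected_second = float(np.mean((a - a.mean()) ** 2))
expected_third = float(np.mean((a - a.mean()) ** 3))

# With ddof=0 the second central moment coincides with the variance.
variance = float(x.var().compute())
```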

property nbytes

Number of bytes in array

nonzero()

This docstring was copied from numpy.ndarray.nonzero.

Some inconsistencies with the Dask version may exist.

Return the indices of the elements that are non-zero.

Refer to numpy.nonzero for full documentation.

See also

numpy.nonzero

equivalent function

property partitions

Slice an array by partitions. Alias of dask array .blocks attribute.

This alias allows you to write agnostic code that works with both dask arrays and dask dataframes.

This allows blockwise slicing of a Dask array. You can perform normal NumPy-style slicing, but rather than slicing elements of the array you slice along blocks. For example, x.blocks[0, ::2] produces a new dask array with every other block in the first row of blocks.

You can index blocks in any way that could index a numpy array of shape equal to the number of blocks in each dimension (available as array.numblocks). The dimension of the output array will be the same as the dimension of this array, even if integer indices are passed. This does not support slicing with np.newaxis or multiple lists.

Returns
A Dask array

Examples

>>> import dask.array as da
>>> x = da.arange(10, chunks=2)
>>> x.partitions[0].compute()
array([0, 1])
>>> x.partitions[:3].compute()
array([0, 1, 2, 3, 4, 5])
>>> x.partitions[::2].compute()
array([0, 1, 4, 5, 8, 9])
>>> x.partitions[[-1, 0]].compute()
array([8, 9, 0, 1])
>>> all(x.partitions[:].compute() == x.blocks[:].compute())
True

prod(axis=None, dtype=None, out=None, keepdims=False, initial=1, where=True)

This docstring was copied from numpy.ndarray.prod.

Some inconsistencies with the Dask version may exist.

Return the product of the array elements over the given axis.

Refer to numpy.prod for full documentation.

See also

numpy.prod

equivalent function

ravel([order])

This docstring was copied from numpy.ndarray.ravel.

Some inconsistencies with the Dask version may exist.

Return a flattened array.

Refer to numpy.ravel for full documentation.

See also

numpy.ravel

equivalent function

ndarray.flat

a flat iterator on the array.

rechunk(self, chunks='auto', threshold=None, block_size_limit=None)

See da.rechunk for docstring
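
Since the method only points at da.rechunk, here is a brief sketch of two common ways to pass chunks. The array shape and chunk sizes are illustrative assumptions.

```python
import dask.array as da

x = da.ones((6, 6), chunks=(2, 2))

# Explicit chunk shape: 3 rows per chunk, all 6 columns in one chunk.
y = x.rechunk((3, 6))

# Dict form: only axis 0 changes; -1 means one chunk spanning the whole axis.
z = x.rechunk({0: -1})

y_chunks = y.chunks
z_chunks = z.chunks
```

The dict form is handy when only one axis needs to change, because axes that are not mentioned keep their existing chunking.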

repeat(repeats, axis=None)

This docstring was copied from numpy.ndarray.repeat.

Some inconsistencies with the Dask version may exist.

Repeat elements of an array.

Refer to numpy.repeat for full documentation.

See also

numpy.repeat

equivalent function

reshape(shape, order='C')

This docstring was copied from numpy.ndarray.reshape.

Some inconsistencies with the Dask version may exist.

Returns an array containing the same data with a new shape.

Refer to numpy.reshape for full documentation.

See also

numpy.reshape

equivalent function

Notes

Unlike the free function numpy.reshape, this method on ndarray allows the elements of the shape parameter to be passed in as separate arguments. For example, a.reshape(10, 11) is equivalent to a.reshape((10, 11)).

round(decimals=0, out=None)

This docstring was copied from numpy.ndarray.round.

Some inconsistencies with the Dask version may exist.

Return a with each element rounded to the given number of decimals.

Refer to numpy.around for full documentation.

See also

numpy.around

equivalent function

property size

Number of elements in array

squeeze(axis=None)

This docstring was copied from numpy.ndarray.squeeze.

Some inconsistencies with the Dask version may exist.

Remove single-dimensional entries from the shape of a.

Refer to numpy.squeeze for full documentation.

See also

numpy.squeeze

equivalent function

std(axis=None, dtype=None, out=None, ddof=0, keepdims=False)

This docstring was copied from numpy.ndarray.std.

Some inconsistencies with the Dask version may exist.

Returns the standard deviation of the array elements along given axis.

Refer to numpy.std for full documentation.

See also

numpy.std

equivalent function

store(sources, targets, lock=True, regions=None, compute=True, return_stored=False, **kwargs)

Store dask arrays in array-like objects, overwriting data in the target

This stores dask arrays into objects that support numpy-style setitem indexing. It stores values chunk by chunk so that it does not have to fill up memory. For best performance you can align the block size of the storage target with the block size of your array.

If your data fits in memory then you may prefer calling np.array(myarray) instead.

Parameters
sources: Array or iterable of Arrays
targets: array-like or Delayed or iterable of array-likes and/or Delayeds

These should support setitem syntax target[10:20] = ...

lock: boolean or threading.Lock, optional

Whether or not to lock the data stores while storing. Pass True (lock each file individually), False (don’t lock) or a particular threading.Lock object to be shared among all writes.

regions: tuple of slices or list of tuples of slices

Each region tuple in regions should be such that target[region].shape = source.shape for the corresponding source and target in sources and targets, respectively. If this is a tuple, the contents will be assumed to be slices, so do not provide a tuple of tuples.

compute: boolean, optional

If true compute immediately, return dask.delayed.Delayed otherwise

return_stored: boolean, optional

Optionally return the stored result (default False).

Examples

>>> x = ...  
>>> import h5py  
>>> f = h5py.File('myfile.hdf5', mode='a')  
>>> dset = f.create_dataset('/data', shape=x.shape,
...                                  chunks=x.chunks,
...                                  dtype='f8')  
>>> store(x, dset)  

Alternatively store many arrays at the same time

>>> store([x, y, z], [dset1, dset2, dset3])  
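
Any object with numpy-style setitem works as a target, so the behaviour can be tried without HDF5. The sketch below stores into plain NumPy arrays, once into a target of the same shape and once into a sub-region of a larger target via regions. All names and shapes are illustrative assumptions.

```python
import numpy as np
import dask.array as da

data = np.arange(16.0).reshape(4, 4)
x = da.from_array(data, chunks=(2, 2))

# Target with the same shape as the source: written chunk by chunk.
target = np.zeros((4, 4))
da.store(x, target)

# Larger target: regions selects where the source lands.
# target[region].shape must equal x.shape, here rows 2..5 and columns 0..3.
big_target = np.zeros((6, 4))
da.store(x, big_target, regions=(slice(2, 6), slice(0, 4)))
```

After the second call, the first two rows of big_target are still zero and the remaining four rows hold the source data.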
sum(axis=None, dtype=None, out=None, keepdims=False, initial=0, where=True)

This docstring was copied from numpy.ndarray.sum.

Some inconsistencies with the Dask version may exist.

Return the sum of the array elements over the given axis.

Refer to numpy.sum for full documentation.

See also

numpy.sum

equivalent function

swapaxes(axis1, axis2)

This docstring was copied from numpy.ndarray.swapaxes.

Some inconsistencies with the Dask version may exist.

Return a view of the array with axis1 and axis2 interchanged.

Refer to numpy.swapaxes for full documentation.

See also

numpy.swapaxes

equivalent function

to_dask_dataframe(self, columns=None, index=None, meta=None)

Convert dask Array to dask Dataframe

Parameters
columns: list or string

List of column names if DataFrame, single string if Series

index: dask.dataframe.Index, optional

An optional dask Index to use for the output Series or DataFrame.

The default output index depends on whether the array has any unknown chunks. If there are any unknown chunks, the output has None for all the divisions (one per chunk). If all the chunks are known, a default index with known divisions is created.

Specifying index can be useful if you’re conforming a Dask Array to an existing dask Series or DataFrame, and you would like the indices to match.

meta: object, optional

An optional meta parameter can be passed for dask to specify the concrete dataframe type to use for partitions of the Dask dataframe. By default, pandas DataFrame is used.

to_delayed(self, optimize_graph=True)

Convert into an array of dask.delayed objects, one per chunk.

Parameters
optimize_graph: bool, optional

If True [default], the graph is optimized before converting into dask.delayed objects.
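
A minimal sketch of what to_delayed returns and how the pieces can be computed, using an illustrative one-dimensional array. It assumes the returned container is a NumPy object array whose shape matches the block grid.

```python
import numpy as np
import dask
import dask.array as da

x = da.arange(6, chunks=3)

blocks = x.to_delayed()            # one Delayed object per chunk
grid_shape = blocks.shape
first_block = blocks[0].compute()  # each element computes to a single chunk

# Compute every chunk in one call and reassemble the result.
pieces = dask.compute(*blocks.ravel())
reassembled = np.concatenate(pieces)
```

Working with the delayed blocks is useful when each chunk needs custom handling, for example writing each one to its own file.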

to_hdf5(self, filename, datapath, **kwargs)

Store array in HDF5 file

>>> x.to_hdf5('myfile.hdf5', '/x')  

Optionally provide arguments as though to h5py.File.create_dataset

>>> x.to_hdf5('myfile.hdf5', '/x', compression='lzf', shuffle=True)  

See also

da.store
h5py.File.create_dataset

to_svg(self, size=500)

Convert chunks from Dask Array into an SVG Image

Parameters
chunks: tuple
size: int

Rough size of the image

Returns
text: An svg string depicting the array as a grid of chunks

Examples

>>> x.to_svg(size=500)

to_tiledb(self, uri, *args, **kwargs)

Save array to the TileDB storage manager

See function to_tiledb() for argument documentation.

See https://docs.tiledb.io for details about the format and engine.

to_zarr(self, *args, **kwargs)

Save array to the zarr storage format

See https://zarr.readthedocs.io for details about the format.

See function to_zarr() for parameters.

topk(self, k, axis=-1, split_every=None)

The top k elements of an array.

See da.topk for docstring

trace(offset=0, axis1=0, axis2=1, dtype=None, out=None)

This docstring was copied from numpy.ndarray.trace.

Some inconsistencies with the Dask version may exist.

Return the sum along diagonals of the array.

Refer to numpy.trace for full documentation.

See also

numpy.trace

equivalent function

transpose(*axes)

This docstring was copied from numpy.ndarray.transpose.

Some inconsistencies with the Dask version may exist.

Returns a view of the array with axes transposed.

For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2-D column vector, an additional dimension must be added. np.atleast_2d(a).T achieves this, as does a[:, np.newaxis]. For a 2-D array, this is a standard matrix transpose. For an n-D array, if axes are given, their order indicates how the axes are permuted (see Examples). If axes are not provided and a.shape = (i[0], i[1], ... i[n-2], i[n-1]), then a.transpose().shape = (i[n-1], i[n-2], ... i[1], i[0]).

Parameters
axes: None, tuple of ints, or n ints

  • None or no argument: reverses the order of the axes.

  • tuple of ints: i in the j-th place in the tuple means a’s i-th axis becomes a.transpose()’s j-th axis.

  • n ints: same as an n-tuple of the same ints (this form is intended simply as a “convenience” alternative to the tuple form)

Returns
out: ndarray

View of a, with axes suitably permuted.

See also

ndarray.T

Array property returning the array transposed.

ndarray.reshape

Give a new shape to an array without changing its data.

Examples

>>> a = np.array([[1, 2], [3, 4]])  
>>> a  
array([[1, 2],
       [3, 4]])
>>> a.transpose()  
array([[1, 3],
       [2, 4]])
>>> a.transpose((1, 0))  
array([[1, 3],
       [2, 4]])
>>> a.transpose(1, 0)  
array([[1, 3],
       [2, 4]])

var(axis=None, dtype=None, out=None, ddof=0, keepdims=False)

This docstring was copied from numpy.ndarray.var.

Some inconsistencies with the Dask version may exist.

Returns the variance of the array elements, along given axis.

Refer to numpy.var for full documentation.

See also

numpy.var

equivalent function

view(self, dtype=None, order='C')

Get a view of the array as a new data type

Parameters
dtype:

The dtype by which to view the array. The default, None, results in the view having the same data-type as the original array.

order: string

‘C’ or ‘F’ (Fortran) ordering

This reinterprets the bytes of the array under a new dtype. If that dtype does not have the same size as the original array then the shape will change.

Beware that both numpy and dask.array can behave oddly when taking shape-changing views of arrays under Fortran ordering. Under some versions of NumPy this function will fail when taking shape-changing views of Fortran ordered arrays if the first dimension has chunks of size one.
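
A short sketch of how the shape responds to the item size of the new dtype under the default 'C' ordering. The data and variable names are illustrative assumptions.

```python
import numpy as np
import dask.array as da

a = np.arange(4, dtype='int32')
x = da.from_array(a, chunks=2)

# Same item size (4 bytes -> 4 bytes): the shape is unchanged.
same_size = x.view('float32')

# Smaller item size (4 bytes -> 2 bytes): twice as many elements.
narrower = x.view('int16')
narrower_values = narrower.compute()
```

The values in narrower_values depend on the byte order of the machine, so the natural check is against the equivalent NumPy view rather than against hard-coded numbers.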
property vindex

Vectorized indexing with broadcasting.

This is equivalent to numpy’s advanced indexing, using arrays that are broadcast against each other. This allows for pointwise indexing:

>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> x = da.from_array(x, chunks=2)
>>> x.vindex[[0, 1, 2], [0, 1, 2]].compute()
array([1, 5, 9])

Mixed basic/advanced indexing with slices/arrays is also supported. The order of dimensions in the result follows those proposed for ndarray.vindex: the subspace spanned by arrays is followed by all slices.

Note: vindex provides more general functionality than standard indexing, but it also has fewer optimizations and can be significantly slower.
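
To make the difference from ordinary indexing concrete, the sketch below contrasts pointwise selection through vindex with an orthogonal (outer) selection built from two separate indexing steps. The index lists are illustrative assumptions.

```python
import numpy as np
import dask.array as da

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
x = da.from_array(a, chunks=2)

rows = [0, 2]
cols = [1, 0]

# Pointwise: picks the elements at (0, 1) and (2, 0).
pointwise = x.vindex[rows, cols].compute()

# Orthogonal selection for comparison: whole rows first, then whole columns.
outer = x[rows][:, cols].compute()
```

Pointwise selection returns one value per index pair, while the orthogonal form returns the full rows-by-columns sub-array.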