API

Code documentation of the public application programming interface provided by this library.

kde1d(x, n=1024, limits=None)

Estimates the 1d density from discrete observations.

The input x is a list or array of numbers representing discrete observations of a random variable. The observations are binned on a grid of n points spanning the given limits or, if none are specified, the range of the data. If n is not a power of two, it is rounded up to the next one.
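The rounding of n can be illustrated with a small standalone helper. This is only a sketch of the documented behavior, not the library's code; the name next_power_of_two is hypothetical.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() counts the bits needed to represent n - 1,
    # so shifting 1 by that amount yields the next power of two.
    return 1 << (n - 1).bit_length()


print(next_power_of_two(1000))  # 1024
print(next_power_of_two(1024))  # 1024, already a power of two
```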

The limits may be given as a tuple (xmin, xmax) or as a single number denoting the upper bound of a range centered at zero. If either bound is None, it will be inferred from the data.
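The accepted forms of limits could be normalized roughly as follows. This is a sketch under assumptions: resolve_limits_1d is a hypothetical name, and a missing bound is simply taken from the smallest or largest observation, whereas the library may infer it differently (for example with a margin around the data).

```python
def resolve_limits_1d(x, limits=None):
    """Turn a `limits` argument into a concrete (xmin, xmax) pair (hypothetical helper)."""
    if limits is None:
        xmin = xmax = None
    elif isinstance(limits, (int, float)):
        # A single number is the upper bound of a range centered at zero.
        xmin, xmax = -limits, limits
    else:
        xmin, xmax = limits
    # Assumption: a missing bound falls back to the data's extreme value.
    if xmin is None:
        xmin = min(x)
    if xmax is None:
        xmax = max(x)
    if xmin >= xmax:
        raise ValueError("lower limit must be below upper limit")
    return (xmin, xmax)


data = [0.5, 2.0, -1.5]
print(resolve_limits_1d(data))             # (-1.5, 2.0)
print(resolve_limits_1d(data, 3))          # (-3, 3)
print(resolve_limits_1d(data, (None, 5)))  # (-1.5, 5)
```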

After binning, the function determines the optimal bandwidth according to the diffusion-based method. It then smooths the binned data over the grid using a Gaussian kernel with a standard deviation corresponding to that bandwidth.
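The binning and smoothing steps can be sketched in plain Python. The bandwidth is passed in as a fixed number here; selecting it with the diffusion-based method is the library's job and is not reproduced. The function names are hypothetical, and the direct double loop is chosen for readability, not speed.

```python
import math


def bin_counts(x, n, xmin, xmax):
    """Histogram x into n equal-width bins on [xmin, xmax]; weights sum to 1."""
    width = (xmax - xmin) / n
    counts = [0.0] * n
    inside = 0
    for value in x:
        if xmin <= value <= xmax:
            # Clamp so that value == xmax lands in the last bin.
            index = min(int((value - xmin) / width), n - 1)
            counts[index] += 1.0
            inside += 1
    if inside == 0:
        raise ValueError("no observations fall within the limits")
    return [c / inside for c in counts], width


def gaussian_smooth(weights, width, bandwidth):
    """Smooth bin weights with a Gaussian kernel of standard deviation `bandwidth`."""
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for i in range(len(weights)):
        total = 0.0
        for j, w in enumerate(weights):
            distance = (i - j) * width
            total += w * norm * math.exp(-0.5 * (distance / bandwidth) ** 2)
        density.append(total)
    return density


samples = [-1.2, -0.4, 0.0, 0.1, 0.3, 0.9, 1.5]
weights, width = bin_counts(samples, n=16, xmin=-2.0, xmax=2.0)
density = gaussian_smooth(weights, width, bandwidth=0.5)
# The integral over the grid is slightly below 1 because some kernel
# mass falls outside the limits.
print(sum(density) * width)
```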

Returns the estimated density and the grid upon which it was computed, as well as the optimal bandwidth value the algorithm determined. Raises ValueError if the algorithm did not converge.

kde2d(x, y, n=256, limits=None)

Estimates the 2d density from discrete observations.

The inputs are two lists or arrays x and y of numbers representing discrete observations of a random variable with two coordinate components. The observations are binned on a grid of n×n points. If n is not a power of two, it is rounded up to the next one.
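Binning paired observations on an n×n grid might look like the following sketch. The name bin_counts_2d and the index convention (first index along x, second along y) are assumptions made for illustration. The length check mirrors the documented ValueError for inputs of unequal length.

```python
def bin_counts_2d(x, y, n, xlim, ylim):
    """Histogram (x, y) pairs into an n-by-n grid; weights sum to 1 (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    (xmin, xmax), (ymin, ymax) = xlim, ylim
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    grid = [[0.0] * n for _ in range(n)]
    inside = 0
    for xi, yi in zip(x, y):
        if xmin <= xi <= xmax and ymin <= yi <= ymax:
            # Clamp so that points on the upper edges land in the last bin.
            i = min(int((xi - xmin) / dx), n - 1)
            j = min(int((yi - ymin) / dy), n - 1)
            grid[i][j] += 1.0
            inside += 1
    if inside == 0:
        raise ValueError("no observations fall within the limits")
    return [[c / inside for c in row] for row in grid]


xs = [0.1, 0.9, 0.9]
ys = [0.1, 0.9, 0.1]
grid = bin_counts_2d(xs, ys, 2, (0.0, 1.0), (0.0, 1.0))
print(grid)
```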

Data limits may be specified as a tuple of tuples ((xmin, xmax), (ymin, ymax)). If any of these values is None, it will be inferred from the data. Either tuple, or both of them, may also be replaced by a single value denoting the upper bound of a range centered at zero.
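The different ways of writing 2d limits could be resolved along these lines. Again this is a sketch: both function names are hypothetical, and missing bounds are assumed to fall back to the extreme values of the data, which may differ from what the library does.

```python
def resolve_axis(values, limit):
    """Resolve one axis' limit into a (low, high) pair (hypothetical helper)."""
    if limit is None:
        low = high = None
    elif isinstance(limit, (int, float)):
        # A single number is the upper bound of a range centered at zero.
        low, high = -limit, limit
    else:
        low, high = limit
    # Assumption: missing bounds are taken from the data's extremes.
    if low is None:
        low = min(values)
    if high is None:
        high = max(values)
    return (low, high)


def resolve_limits_2d(x, y, limits=None):
    """Return ((xmin, xmax), (ymin, ymax)) for any accepted form of `limits`."""
    if limits is None or isinstance(limits, (int, float)):
        # One value (or nothing at all) applies to both axes.
        xlimit = ylimit = limits
    else:
        xlimit, ylimit = limits
    return (resolve_axis(x, xlimit), resolve_axis(y, ylimit))


xs = [0.5, 2.0, -1.5]
ys = [10.0, 30.0, 20.0]
print(resolve_limits_2d(xs, ys))                  # ((-1.5, 2.0), (10.0, 30.0))
print(resolve_limits_2d(xs, ys, 3))               # ((-3, 3), (-3, 3))
print(resolve_limits_2d(xs, ys, ((0, None), 40)))  # ((0, 2.0), (-40, 40))
```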

After binning, the function determines the optimal bandwidth according to the diffusion-based method. It then smooths the binned data over the grid using a Gaussian kernel with a standard deviation corresponding to that bandwidth.

Returns the estimated density and the grid (along each of the two axes) upon which it was computed, as well as the optimal bandwidth values (per axis) that the algorithm determined. Raises ValueError if the algorithm did not converge or if x and y differ in length.