Histogram tools

This module contains functions to build histograms, compute cumulative distribution functions and generate random values from custom distributions, on either GPU or CPU.

histogram_tools.binarySearchCpu(x, y)

Count the values of one array that are less than or equal to the values of another array, using binary search.

https://www.enjoyalgorithms.com/blog/count-values-less-than-equal-to-in-another-array

Parameters

x : 1d-array

“Reference” array.

y : 1d-array

Array in which to count the number of elements less than or equal to the values in x. It must be sorted.

Returns

high : int

Number of elements of y that are less than or equal to the values in x.
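
The counting step can be illustrated with Python's standard bisect module. This is a minimal sketch, not the module's own implementation; the helper name count_less_equal is hypothetical. It relies on the fact that, in a sorted array, the right-most insertion point of a value equals the number of elements less than or equal to it.

```python
from bisect import bisect_right


def count_less_equal(value, sorted_values):
    """Count the elements of `sorted_values` that are <= `value` (hypothetical helper)."""
    # bisect_right performs a binary search and returns the insertion point
    # located after any entries equal to `value`, i.e. the wanted count.
    return bisect_right(sorted_values, value)


y = [1, 2, 2, 5, 9]      # must already be sorted
x = [0, 2, 6, 10]        # reference values
counts = [count_less_equal(v, y) for v in x]
print(counts)            # [0, 3, 4, 5]
```

Each lookup costs O(log n) comparisons, which is why the searched array has to be sorted beforehand.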

histogram_tools.computeCdf(x_axis, data, mode, normed)

Compute the empirical cumulative distribution function (CDF). This is a wrapper which calls the GPU or the CPU version depending on whether cupy and a GPU are available.

Parameters

x_axis : cupy array

x-axis of the CDF.

data : cupy array

Data used to create the CDF.

mode : string

If "ccdf", the survival function (the complement of the CDF) is computed instead.

normed : bool

If True, the CDF is normed so that its maximum is equal to 1.

Returns

cdf : cupy array

CDF of data.
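
A wrapper of this kind has to decide at run time which backend is usable. The sketch below shows one possible way to do it and is not taken from the module: the helper get_array_module and the flag GPU_AVAILABLE are hypothetical names, and the check assumes that a failing import or a failing device query means the CPU path should be used.

```python
import numpy as np

try:
    import cupy as cp
    # Querying the device count raises if no CUDA driver or device is usable.
    GPU_AVAILABLE = cp.cuda.runtime.getDeviceCount() > 0
except Exception:
    cp = None
    GPU_AVAILABLE = False


def get_array_module():
    """Return cupy when a GPU is usable, numpy otherwise (hypothetical helper)."""
    return cp if GPU_AVAILABLE else np


xp = get_array_module()
data = xp.asarray([3.0, 1.0, 2.0])
print(xp.__name__, xp.sort(data))
```

Because cupy mirrors most of the numpy interface, code written against xp can run unchanged on either backend.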

histogram_tools.computeCdfCpu(x_axis, data, mode, normed)

Compute the empirical cumulative distribution function (CDF) on CPU.

Parameters

x_axis : cupy array

x-axis of the CDF.

data : array

Data used to create the CDF.

mode : string

If "ccdf", the survival function (the complement of the CDF) is computed instead.

normed : bool

If True, the CDF is normed so that its maximum is equal to 1.

Returns

cdf : cupy array

CDF of data.
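
As an illustration of what an empirical CDF on CPU involves, here is a minimal numpy sketch. It is not the module's implementation: the function name empirical_cdf and the default mode value "cdf" are assumptions (the text only names "ccdf"), and the normalisation divides by the maximum, following the description of normed.

```python
import numpy as np


def empirical_cdf(x_axis, data, mode="cdf", normed=True):
    """Empirical CDF (or survival function) of `data` evaluated on `x_axis`."""
    data = np.sort(np.asarray(data, dtype=float))
    # For each abscissa, count the samples that are <= that abscissa.
    counts = np.searchsorted(data, x_axis, side="right").astype(float)
    if mode == "ccdf":
        # Survival function: number of samples strictly above each abscissa.
        counts = data.size - counts
    if normed and counts.max() > 0:
        counts /= counts.max()
    return counts


x_axis = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
data = np.array([3.0, 1.0, 2.0, 2.0])
print(empirical_cdf(x_axis, data))               # [0.   0.25 0.75 1.   1.  ]
print(empirical_cdf(x_axis, data, mode="ccdf"))  # [1.   0.75 0.25 0.   0.  ]
```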

histogram_tools.computeCdfGPU(x_axis, data, mode, normed)

Compute the empirical cumulative distribution function (CDF) on GPU with CUDA.

Parameters

x_axis : cupy array

x-axis of the CDF.

data : cupy array

Data used to create the CDF.

mode : string

If "ccdf", the survival function (the complement of the CDF) is computed instead.

normed : bool

If True, the CDF is normed so that its maximum is equal to 1.

Returns

cdf : cupy array

CDF of data.

histogram_tools.compute_data_histogram(data_null, bin_bounds, wl_scale, **kwargs)

Calculate the histogram of the null depth for each spectral channel. By default, the histogram is normalised by its integral, unless specified otherwise in **kwargs.

Parameters

data_null : 2d-array (wl, number of points)

Sequence of null depths. The first axis corresponds to the spectral dispersion.

bin_bounds : tuple-like (scalar, scalar)

Boundaries of the null depth range. Values outside this range are pruned from data_null when making the histogram.

wl_scale : 1d-array

Wavelength scale.

**kwargs : extra-keywords

Use normed=False to not normalise the histogram by its sum.

Returns

null_pdf : 2d-array (wavelength size, number of bins)

Histogram of the null depth per spectral channel.

null_pdf_err : TYPE

Error on the histogram frequency per spectral channel, assuming the number of elements per bin follows a binomial distribution.
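
The per-channel histogram can be sketched with numpy as follows. This is an illustration under stated assumptions rather than the module's code: the name channel_histograms and the n_bins argument are hypothetical (how the number of bins is chosen is not described here), and the counts are normalised by their sum. Dividing additionally by the bin width would give a normalisation by the integral.

```python
import numpy as np


def channel_histograms(data_null, bin_bounds, n_bins=10, normed=True):
    """Histogram each spectral channel (row) of `data_null` over `bin_bounds`."""
    edges = np.linspace(bin_bounds[0], bin_bounds[1], n_bins + 1)
    pdfs = []
    for channel in np.asarray(data_null):
        # Values outside [edges[0], edges[-1]] are ignored by np.histogram.
        counts = np.histogram(channel, bins=edges)[0].astype(float)
        if normed and counts.sum() > 0:
            counts /= counts.sum()
        pdfs.append(counts)
    return np.array(pdfs), edges


rng = np.random.default_rng(0)
data_null = rng.normal(0.0, 0.5, size=(3, 1000))   # 3 channels, 1000 points each
null_pdf, edges = channel_histograms(data_null, (-1.0, 1.0), n_bins=8)
print(null_pdf.shape)   # (3, 8)
```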

histogram_tools.create_histogram_model(params_to_fit, xbins, wl_scale0, instrument_model, instrument_args, rvu_forfit, cdfs, rvus, **kwargs)

Monte-Carlo simulator of the instrument model to give a histogram.

To avoid memory overflow, the total number of samples can be split into smaller chunks. The resulting histogram is the same as if the simulation were run with the total number of samples in one go.

Parameters

xbins : 2D array

1st axis = wavelength.

params_to_fit : tuple-like

List of the parameters to fit.

wl_scale0 : 1D array

Wavelength scale.

instrument_model : function

Function simulating the instrument.

instrument_args : tuple, must contain the same type of data (all floats or all arrays of the same shape)

List of arguments to pass to instrument_model which are not fitted.

cdfs : tuple. Put first the CDFs of quantities which do not depend on the wavelength.

For wavelength-dependent quantities, 1st axis = wavelength.

rvus : tuple. Put first the CDFs of quantities which do not depend on the wavelength.

For wavelength-dependent quantities, 1st axis = wavelength.

**kwargs : keywords

n_samp_per_loop (int): number of samples for the MC simulation per loop.

nloop (int): number of loops.

Returns

1d-array

Model of the histogram.
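
The chunking argument can be checked on a toy case: bin counts are additive, so summing the counts of each chunk and normalising once at the end gives the same histogram as processing all samples together. The sketch below is an illustration only; accumulate_histogram is a hypothetical helper and normally distributed samples stand in for the output of the instrument model.

```python
import numpy as np


def accumulate_histogram(chunks, bin_edges):
    """Sum the bin counts of successive sample chunks, then normalise by the total."""
    counts = np.zeros(len(bin_edges) - 1)
    for samples in chunks:
        counts += np.histogram(samples, bins=bin_edges)[0]
    total = counts.sum()
    return counts / total if total > 0 else counts


rng = np.random.default_rng(1)
samples = rng.normal(size=1000)          # stand-in for the instrument model output
bin_edges = np.linspace(-3.0, 3.0, 13)

n_samp_per_loop, nloop = 250, 4
chunks = samples.reshape(nloop, n_samp_per_loop)

hist_chunked = accumulate_histogram(chunks, bin_edges)
hist_one_go = accumulate_histogram([samples], bin_edges)
print(np.allclose(hist_chunked, hist_one_go))   # True
```

Only one chunk of samples needs to be held in memory at a time, which is the point of splitting the simulation into loops.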

histogram_tools.getErrorBinomNorm(pdf, data_size, normed)

Calculate the error of the PDF, assuming that the number of elements in a bin is a random variable following a binomial distribution.

Parameters

pdf : array

Normalized PDF for which the error is calculated.

data_size : int

Number of elements used to calculate the PDF.

normed : bool

Set to True if pdf is normalised, False otherwise.

Returns

pdf_err : array

Error of the PDF.
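
For a bin with probability p filled from N independent samples, the count follows a binomial distribution with variance N p (1 - p), so the error on the frequency is sqrt(p (1 - p) / N). The sketch below applies this textbook formula; it is not necessarily what the module computes. The name binomial_error is hypothetical, and a normalised pdf is assumed to be normalised by its sum, so that each bin value is a probability.

```python
import numpy as np


def binomial_error(pdf, data_size, normed=True):
    """Binomial standard deviation of each histogram bin (hypothetical helper)."""
    pdf = np.asarray(pdf, dtype=float)
    # Convert raw counts to per-bin probabilities when the input is not normalised.
    p = pdf if normed else pdf / data_size
    err = np.sqrt(p * (1.0 - p) / data_size)    # error on the frequency
    # Return the error in the same units as the input (frequency or counts).
    return err if normed else err * data_size


print(binomial_error([0.5], 100))                  # [0.05]
print(binomial_error([50.0], 100, normed=False))   # [5.]
```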

histogram_tools.getErrorCDF(data_null, data_null_err, null_axis)

Calculate the error of the CDF. It uses the cupy library.

Parameters

data_null : array

Null depth measurements used to create the CDF.

data_null_err : array

Error on the null depth measurements.

null_axis : array

Abscissa of the CDF.

Returns

array

Error of the CDF.

histogram_tools.getErrorNull(data_dic, dark_dic)

Compute the error of the null depth.

Parameters

data_dic : dict

Dictionary of the data from load_data.

dark_dic : dict

Dictionary of the dark from load_data.

Returns

std_null : array

Array of the error on the null depths.

histogram_tools.getErrorPDF(data_null, data_null_err, null_axis)

Calculate the error of the PDF. It uses the cupy library.

Parameters

data_null : array

Null depth measurements used to create the PDF.

data_null_err : array

Error on the null depth measurements.

null_axis : array

Abscissa of the CDF.

Returns

array

Error of the PDF.

histogram_tools.get_cdf(data)

Get the CDF of measured quantities. This function works on CPU and GPU.

Parameters

data : array

Data from which the CDF is computed.

wl_scale0 : array

Wavelength axis.

Returns

axes : 2d-array

Axis of the CDF; the first axis is the wavelength.

cdfs : 2d-array

CDF; the first axis is the wavelength.

histogram_tools.get_dark_cdf(dk, wl_scale0)

Get the CDF used to generate random values (RV) from measured dark distributions.

Parameters

dk : array-like

Dark data.

wl_scale0 : array

Wavelength axis.

Returns

dark_axis : nd-array

Axis of the CDF.

dark_cdf : nd-array

CDF.

histogram_tools.rv_generator(absc, cdf, nsamp, rvu=None)

Random values generator based on the CDF.

Parameters

absc : cupy array

Abscissa of the CDF.

cdf : cupy array

Arbitrary normalized CDF used to generate the random values.

nsamp : int

Number of values to generate.

rvu : TYPE, optional

Sequence of uniformly distributed random values to reuse instead of drawing new ones. The default is None.

Returns

output_samples : cupy array

Sequence of random values following the CDF.
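
Generating random values from an arbitrary CDF is commonly done by inverse transform sampling: draw uniform values in [0, 1] and map them through the inverse of the CDF. The numpy sketch below illustrates the idea on CPU and is not the module's implementation. The name sample_from_cdf and the seed argument are hypothetical, the inversion uses linear interpolation (the actual function may use a search instead), and the CDF is assumed to be increasing and normalised to 1.

```python
import numpy as np


def sample_from_cdf(absc, cdf, nsamp, rvu=None, seed=None):
    """Draw `nsamp` values following `cdf` defined on the abscissa `absc`."""
    if rvu is None:
        # No sequence supplied: draw fresh uniform values in [0, 1).
        rvu = np.random.default_rng(seed).uniform(0.0, 1.0, nsamp)
    # Invert the CDF: look up each uniform value on the CDF axis and
    # interpolate the matching abscissa.
    return np.interp(rvu, cdf, absc)


absc = np.linspace(0.0, 2.0, 5)     # [0, 0.5, 1, 1.5, 2]
cdf = absc / 2.0                    # CDF of a uniform distribution on [0, 2]
rvu = np.array([0.1, 0.5, 0.9])
print(sample_from_cdf(absc, cdf, 3, rvu=rvu))   # [0.2 1.  1.8]
```

Passing the same rvu twice returns the same samples, which keeps a simulation reproducible between calls.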