Documentation

beinf (zero- and one-inflated beta distribution)

class beinf.beinf_gen(momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None, seed=None)[source]

A zero- and one-inflated beta (BEINF) random variable.

The probability density function for BEINF is:
\[f(x;a,b,p,q) = p(1-q)\delta(x)+(1-p)f_{\mathrm{beta}}(x;a,b) + pq\delta(x-1)\]

where

\[f_{\mathrm{beta}}(x;a,b)=\frac{1}{\mathrm{B}(a,b)}x^{a-1}(1-x)^{b-1},\]

\(\delta(x)\) is the delta function, and \(\mathrm{B}(a,b)\) is the beta function (scipy.special.beta).

beinf takes \(a>0\), \(b>0\), \(0\leq p \leq 1\), \(0\leq q \leq 1\) as shape parameters.
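The density above is a three-part mixture: a point mass \(p(1-q)\) at 0, a point mass \(pq\) at 1, and a beta density carrying the remaining mass \(1-p\). The following is a minimal standard-library sketch of drawing from such a mixture. The function name sample_beinf is hypothetical and is not part of the beinf module; it only illustrates the roles of the four shape parameters.

```python
import random


def sample_beinf(a, b, p, q, size, seed=None):
    """Draw `size` values from a BEINF(a, b, p, q) mixture (illustrative only)."""
    if not (a > 0 and b > 0 and 0 <= p <= 1 and 0 <= q <= 1):
        raise ValueError("require a > 0, b > 0, 0 <= p <= 1, 0 <= q <= 1")
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        if rng.random() < p:
            # Discrete part: 1 with probability q, otherwise 0.
            draws.append(1.0 if rng.random() < q else 0.0)
        else:
            # Continuous part: beta(a, b) on the open interval (0, 1).
            draws.append(rng.betavariate(a, b))
    return draws


samples = sample_beinf(a=2.0, b=5.0, p=0.3, q=0.25, size=20000, seed=0)
frac_zero = samples.count(0.0) / len(samples)  # should be near p*(1-q) = 0.225
frac_one = samples.count(1.0) / len(samples)   # should be near p*q = 0.075
```

With a large sample, the observed fractions of exact zeros and ones should approach \(p(1-q)\) and \(pq\), respectively.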

beinf is an instance of a subclass of rv_continuous, and therefore inherits all of the methods of rv_continuous. Some of those methods are overridden here:

_argcheck

_cdf

_pdf

_ppf

Additional methods added to beinf are:

beta_moments(data_sub)
Computes the parameters for the beta distribution using the method of moments.
cdf_eval(x,data_params,data)
Chooses the appropriate method for evaluating the cumulative distribution function for data at points x, and computes it.
check_cases(data)
Checks to see if data satisfies any of cases 1-4.
ecdf(x, X_samp, p=None, q=None)
The empirical distribution function of X_samp, evaluated at points x.
fit(data)
This replaces the fit method in rv_continuous, but is not subclassed. Computes the MLE parameters for the BEINF distribution.
fit_beta(data_sub)
Computes the MLE parameters for the beta distribution.
beta_moments(data_sub)[source]

Computes the method of moments estimates of a and b for the beta distribution.

Args:
data_sub (ndarray):
The data to be fit to the beta distribution. The values in data_sub should lie on the open interval (0,1).
Returns: a,b (floats):
The shape parameters for the beta distribution.
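As a rough illustration of the method of moments for the beta distribution, the sketch below matches the sample mean \(m\) and variance \(v\) to the beta mean and variance, giving \(a = m\,c\) and \(b = (1-m)\,c\) with \(c = m(1-m)/v - 1\). The function name is hypothetical, and the use of the population variance (rather than the sample variance) is an assumption; the library's own choice is not stated here.

```python
import statistics


def beta_method_of_moments(data_sub):
    """Method-of-moments estimates (a, b) for data on the open interval (0, 1)."""
    m = statistics.fmean(data_sub)
    v = statistics.pvariance(data_sub)  # assumption: population variance
    if v <= 0 or v >= m * (1.0 - m):
        raise ValueError("variance must satisfy 0 < v < m * (1 - m)")
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common


# Mean 0.5 and variance 0.05 give c = 4, so a and b are both close to 2.
a_hat, b_hat = beta_method_of_moments([0.2, 0.4, 0.6, 0.8])
```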
cdf_eval(x, data_params, data)[source]

Evaluates the cumulative distribution function (empirically or parametrically) for a random variable \(X\sim\mathrm{BEINF}(a,b,p,q)\). When \(a\) and \(b\) are unknown (i.e. when a=np.inf and b=np.inf), the cdf is evaluated using ecdf(). When \(a\) and \(b\) are known, or when \(p=1\), the cdf is evaluated using cdf(). This method is used after applying TAQM (see the tutorial on using the taqm class).

Args:
x (float or ndarray):
The value(s) at which the cdf or ecdf is evaluated.
data_params (ndarray):
An array containing the four parameters a, b, p, and q.
data (ndarray):
The data which defines the ecdf when \(a\) and \(b\) are unknown (i.e. when a=np.inf and b=np.inf). This array may contain the values np.inf, 0, and 1.
Returns: cdf_vals (ndarray):
The cdf or ecdf for \(X\sim\mathrm{BEINF}(a,b,p,q)\) evaluated at x.
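The selection rule described for cdf_eval can be sketched as a small dispatch on the parameter array. This is only an illustration of the rule as documented; the helper name choose_cdf_method is hypothetical, and giving \(p=1\) precedence over the np.inf check is an assumption based on the wording above.

```python
import math


def choose_cdf_method(data_params):
    """Return "cdf" or "ecdf" for parameters ordered as (a, b, p, q)."""
    a, b, p, q = data_params
    if p == 1:
        # All mass sits on 0 and 1, so the parametric cdf needs no a or b.
        return "cdf"
    if math.isinf(a) and math.isinf(b):
        # a and b could not be fit: fall back to the empirical cdf.
        return "ecdf"
    return "cdf"


method_known = choose_cdf_method((2.0, 5.0, 0.3, 0.25))
method_unknown = choose_cdf_method((math.inf, math.inf, 0.3, 0.25))
method_all_discrete = choose_cdf_method((math.inf, math.inf, 1.0, 0.4))
```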
check_cases(data)[source]

Checks whether data satisfies any of cases 1-4 described in Section 3b of Dirkson et al, 2018. When True, the parameters a and b for the beta-portion of the BEINF distribution cannot be fit.

Args:
data (ndarray):
The values for which to test if cases 1-4 apply.
Returns: True or False
True if any of the cases is satisfied; False if none of them is.
ecdf(x, X_samp, p=None, q=None)[source]

Computes the empirical cumulative distribution function (ecdf) for a random variable \(X\sim\mathrm{BEINF}(a,b,p,q)\) when the parameters \(a\) and \(b\) (or additionally \(p\) and \(q\)) are unknown.

Args:
x (float or ndarray):
The value(s) at which the ecdf is evaluated.
X_samp (float or ndarray):
A sample that lies on either: the open interval (0,1) when p and q are included as arguments, or the closed interval [0,1] when p and q are not included as arguments.
p (float, optional):
Shape parameter(s) for the beinf distribution.
q (float, optional):
Shape parameter(s) for the beinf distribution.
Returns: ecdf_vals (ndarray):
The ecdf for X_samp, evaluated at x.
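A possible reading of the two modes is sketched below in plain Python. Without p and q, the sample on [0,1] is used directly. With p and q, the sample on (0,1) describes only the continuous part, and the sketch combines it with the point masses as \(p(1-q) + (1-p)F_n(x)\) for \(0 \leq x < 1\). That combination follows from the BEINF density but is an assumption about how ecdf() works internally, and the name ecdf_sketch is hypothetical.

```python
from bisect import bisect_right


def ecdf_sketch(x, X_samp, p=None, q=None):
    """Empirical cdf of X_samp at x (scalar or list); returns a list."""
    ordered = sorted(X_samp)
    n = len(ordered)

    def plain(xi):
        # Fraction of sample values less than or equal to xi.
        return bisect_right(ordered, xi) / n

    xs = list(x) if isinstance(x, (list, tuple)) else [x]
    if p is None or q is None:
        return [plain(xi) for xi in xs]

    vals = []
    for xi in xs:
        if xi < 0:
            vals.append(0.0)
        elif xi >= 1:
            vals.append(1.0)
        else:
            # Point mass at 0 plus the re-weighted continuous part.
            vals.append(p * (1.0 - q) + (1.0 - p) * plain(xi))
    return vals


sample = [0.2, 0.4, 0.6, 0.8]
plain_vals = ecdf_sketch(0.5, sample)
mixed_vals = ecdf_sketch([-0.1, 0.0, 0.5, 1.0], sample, p=0.5, q=0.5)
```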
fit(data)[source]

Computes the MLE parameters \(a\), \(b\), \(p\), and \(q\) for the BEINF distribution to the given data.

Args:
data (ndarray):
The data to be fit to the BEINF distribution.
Returns: a, b, p, q (floats):
The shape parameters for the BEINF distribution. When \(a\) and \(b\) cannot be fit, they are returned as a=np.inf and b=np.inf.
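Because the BEINF likelihood factorizes into a discrete part and a beta part, the maximum likelihood estimates of \(p\) and \(q\) have closed forms: \(\hat{p}\) is the fraction of values equal to 0 or 1, and \(\hat{q}\) is the fraction of those that equal 1. The sketch below computes these and separates the interior values that a beta fit would use. The function name is hypothetical, and returning q=0.0 when the data contain no zeros or ones is an assumption, since \(q\) is not identifiable in that case.

```python
def fit_inflation_params(data):
    """Closed-form MLEs of p and q, plus the values strictly inside (0, 1)."""
    n = len(data)
    n_zero = sum(1 for v in data if v == 0)
    n_one = sum(1 for v in data if v == 1)
    n_disc = n_zero + n_one
    p_hat = n_disc / n
    # Assumption: report q = 0.0 when there are no zeros or ones at all.
    q_hat = n_one / n_disc if n_disc > 0 else 0.0
    interior = [v for v in data if 0 < v < 1]
    return p_hat, q_hat, interior


p_hat, q_hat, interior = fit_inflation_params(
    [0.0, 0.0, 1.0, 0.3, 0.5, 0.7, 0.2, 0.9]
)
```

The interior values would then be passed to a beta fit such as fit_beta(), which is not reproduced here.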
fit_beta(data_sub)[source]

Computes the MLE parameters \(a\) and \(b\) for the beta distribution to the given data.

Args:
data_sub (ndarray):
The data to be fit to the beta distribution. The values in data_sub should lie on the open interval (0,1).
Returns: a,b (floats):
The shape parameters for the beta distribution.

taqm (trend-adjusted quantile mapping)

class taqm.taqm[source]

Contains the methods needed for performing trend-adjusted quantile mapping (TAQM). It relies on the methods from the beinf class.

calibrate(x_params,y_params,x_t_params,X,Y,X_t,trust_sharp_fcst=False)
calibrated forecast BEINF parameters and calibrated forecast ensemble
fit_params(X,Y,X_t)
BEINF parameters for the TAMH, TAOH, and the raw forecast
lin(a1,b1,T)
linear equation values
piecewise_lin(a1,b1,a2,b2,t_b,T)
piece-wise linear equation values
trend_adjust_1p(data_all,tau_t,t)
trend-adjusted values using a single period
trend_adjust_2p(data_all,tau_t,t,t_b=1999)
trend-adjusted values using two periods
unpack_params(array)
the individual parameters stored in array
calibrate(x_params, y_params, x_t_params, X, Y, X_t, trust_sharp_fcst=False)[source]

Calibrates the raw forecast BEINF parameters \(a_{x_t}\), \(b_{x_t}\), \(p_{x_t}\) and \(q_{x_t}\). This method carries out the calibration step described in section 5c of Dirkson et al., 2018.

Args:
x_params (ndarray):
An array containing the four parameters of the BEINF distribution for the TAMH ensemble time series.
y_params (ndarray):
An array containing the four parameters of the BEINF distribution for the TAOH time series.
x_t_params (ndarray):
An array containing the four parameters of the BEINF distribution for the raw forecast ensemble.
trust_sharp_fcst (boolean, optional):
True to revert to the raw forecast when \(p_{x_t}=1\). False to revert to the TAOH distribution when \(p_{x_t}=1\).
Returns: x_t_cal_params (ndarray), X_t_cal_beta (ndarray):

x_t_cal_params contains the four BEINF distribution parameters for the calibrated forecast: \(a_{\hat{x}_t}\), \(b_{\hat{x}_t}\), \(p_{\hat{x}_t}\) and \(q_{\hat{x}_t}\). When \(a_{\hat{x}_t}\) and \(b_{\hat{x}_t}\) could not be fit, they are returned as a=np.inf and b=np.inf.

X_t_cal_beta contains the calibrated forecast ensemble (np.inf replaces 0’s and 1’s in the ensemble). This array contains all np.inf values when any of \(p_y=1\), \(p_x=1\), or \(p_{x_t}=1\), or when all parameters in x_t_cal_params are defined (none are equal to np.inf).

fit_params(X, Y, X_t)[source]

Fits X (the TAMH ensemble time series), Y (the TAOH time series), and X_t (the raw forecast ensemble) to the BEINF distribution. This method carries out the fitting procedure described in section 5b of Dirkson et al., 2018.

Args:
X (ndarray):
The TAMH ensemble time series of size NxM.
Y (ndarray):
The TAOH time series of size N.
X_t (ndarray):
The raw forecast ensemble of size M.
Returns: x_params (ndarray), y_params (ndarray), x_t_params (ndarray):
The shape parameters \(a\), \(b\), \(p\), \(q\) for the BEINF distribution fitted to each X, Y, and X_t (see beinf.fit()).
lin(a1, b1, T)[source]

Evaluates the linear equation

(1)\[z = a_1 T + b_1 \]

at \(T\).

Args:
a1 (float):
The slope in (1).
b1 (float):
The z-intercept in (1).
T (float or ndarray):
The point(s) at which (1) is evaluated.
Returns: z (float or ndarray):
The value of (1) at each point T. Values less than 0 and greater than 1 are clipped to 0 and 1, respectively.
piecewise_lin(a1, b1, a2, b2, t_b, T)[source]

Evaluates the piece-wise linear equation

(2)\[\begin{split}z = \begin{cases} a_1 T + b_1, & T<t_b \\ a_2 T + b_2, & T>t_b \end{cases} \end{split}\]

at \(T\).

Args:
a1, a2 (floats):
The slopes in (2).
b1, b2 (floats):
The z-intercepts in (2).
t_b (float):
The breakpoint for \(z\) in (2).
T (float or ndarray):
The point(s) at which (2) is evaluated.
Returns: z (float or ndarray):
The value of (2) at each point T. Values less than 0 and greater than 1 are clipped to 0 and 1, respectively.
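Equations (1) and (2), together with the clipping to [0,1] described in their Returns entries, can be sketched for scalar \(T\) as follows. The function names are hypothetical stand-ins for lin() and piecewise_lin(). Eq. (2) does not state which branch applies at \(T=t_b\), so using the second branch there is an assumption of this sketch.

```python
def clip01(z):
    """Clip z to the closed interval [0, 1]."""
    return min(1.0, max(0.0, z))


def lin_sketch(a1, b1, T):
    """Eq. (1): z = a1*T + b1, clipped to [0, 1]."""
    return clip01(a1 * T + b1)


def piecewise_lin_sketch(a1, b1, a2, b2, t_b, T):
    """Eq. (2): first branch for T < t_b, second branch otherwise, clipped."""
    if T < t_b:
        return clip01(a1 * T + b1)
    # Assumption: T == t_b falls on the second branch.
    return clip01(a2 * T + b2)


z_inside = lin_sketch(0.1, 0.2, 3.0)       # about 0.5, no clipping needed
z_high = lin_sketch(0.5, 0.0, 4.0)         # 2.0 before clipping, so 1.0
z_low = lin_sketch(-0.5, 0.0, 4.0)         # -2.0 before clipping, so 0.0
z_before = piecewise_lin_sketch(0.1, 0.0, -0.1, 1.0, 5.0, 2.0)
z_after = piecewise_lin_sketch(0.1, 0.0, -0.1, 1.0, 5.0, 8.0)
```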
trend_adjust_1p(data_all, tau_t, t)[source]

Linearly detrends data_all and re-centers it about its linear least squares fit evaluated at \(T=t\). This trend adjustment may be preferable to trend_adjust_2p() when the hindcast record covers only the more recent period.

Args:
data_all (ndarray):
A time series of size N, or an ensemble time series of size NxM, where M is the number of ensemble members.
tau_t (ndarray):
All hindcast years excluding the forecast year.
t (float):
The forecast year.
Returns: data_ta (ndarray):
Trend-adjusted values with same shape as data_all.
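A minimal sketch of single-period trend adjustment for a one-dimensional time series is given below: fit a straight line by ordinary least squares, subtract the fitted value at each hindcast year, and add back the fitted value at the forecast year. The function name is hypothetical, the NxM ensemble case is not handled, and no clipping of the fit or of the result to [0,1] is applied; whether the library clips at this step is not stated here.

```python
def trend_adjust_1p_sketch(data_all, tau_t, t):
    """Detrend a 1-D series linearly and re-center it on the fit at year t."""
    n = len(tau_t)
    mean_T = sum(tau_t) / n
    mean_z = sum(data_all) / n
    sxx = sum((T - mean_T) ** 2 for T in tau_t)
    sxy = sum((T - mean_T) * (z - mean_z) for T, z in zip(tau_t, data_all))
    a1 = sxy / sxx               # least squares slope
    b1 = mean_z - a1 * mean_T    # least squares intercept
    fit_at_t = a1 * t + b1
    # Residual about the trend line, shifted to the fitted value at year t.
    return [z - (a1 * T + b1) + fit_at_t for T, z in zip(tau_t, data_all)]


years = [2000, 2001, 2002, 2003, 2004]
series = [0.2, 0.3, 0.4, 0.5, 0.6]   # exactly linear, slope 0.1 per year
adjusted = trend_adjust_1p_sketch(series, years, 2005)
```

For an exactly linear series the residuals vanish, so every adjusted value equals the fitted value at the forecast year (about 0.7 here).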
trend_adjust_2p(data_all, tau_t, t, t_b=1999)[source]

Piece-wise linearly detrend data_all and re-center it about its non-linear least squares fit to Eq. (2) evaluated at \(T=\) t. This method carries out the trend-adjustment technique described in section 5a of Dirkson et al, 2018. The non-linear least squares fit constrains Eq. (2) to be continuous at \(T=t_b\).

Args:
data_all (ndarray):
A time series of size N, or an ensemble time series of size NxM, where M is the number of ensemble members.
tau_t (ndarray):
All hindcast years excluding the forecast year t.
t (float):
The forecast year.
t_b (float):
The breakpoint year in Eq. (2).
Returns: data_ta (ndarray):
Trend-adjusted values with same shape as data_all.
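The continuity constraint at \(T=t_b\) can be written out explicitly. Setting the two branches of Eq. (2) equal at the breakpoint gives

\[a_1 t_b + b_1 = a_2 t_b + b_2 \quad\Longrightarrow\quad b_2 = b_1 + (a_1 - a_2)\,t_b,\]

so only three of the four coefficients are free once \(t_b\) is fixed. This is just the algebra implied by the constraint; how the fit is parameterized internally is not specified here.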
unpack_params(array)[source]

Unpacks the individual parameters a, b, p, q from array.

Args:
array (ndarray):
Array containing the four parameters a,b,p,q for a BEINF distribution.
Returns: a,b,p,q (floats):
The individual parameters for the BEINF distribution.