Documentation
beinf (zero- and one-inflated beta distribution)
-
class beinf.beinf_gen(momtype=1, a=None, b=None, xtol=1e-14, badvalue=None, name=None, longname=None, shapes=None, extradoc=None, seed=None)
A zero- and one-inflated beta (BEINF) random variable.
- The probability density function for BEINF is:
- \[f(x;a,b,p,q) = p(1-q)\delta(x)+(1-p)f_{\mathrm{beta}}(x;a,b) + pq\delta(x-1)\]
where
\[f_{\mathrm{beta}}(x;a,b)=\frac{1}{\mathrm{B}(a,b)}x^{a-1}(1-x)^{b-1},\]
\(\delta(x)\) is the Dirac delta function, and \(\mathrm{B}(a,b)\) is the beta function (scipy.special.beta).
beinf takes \(a>0\), \(b>0\), \(0\leq p \leq 1\), \(0\leq q \leq 1\) as shape parameters.
beinf is an instance of a subclass of rv_continuous, and therefore inherits all of the methods within rv_continuous. Some of those methods have been overridden here:
_argcheck
_cdf
_pdf
_ppf
Additional methods added to beinf are:
beta_moments(data_sub)
- Computes the parameters for the beta distribution using the method of moments.
cdf_eval(x,data_params,data)
- Chooses the appropriate method for evaluating the cumulative distribution function for data at points x, and computes it.
check_cases(data)
- Checks to see if data satisfies any of cases 1-4.
ecdf(x, X_samp, p=None, q=None)
- The empirical distribution function of X_samp, evaluated at points x.
fit(data)
- This replaces the fit method in rv_continuous, but is not an override of it. Computes the MLE parameters for the BEINF distribution.
fit_beta(data_sub)
- Computes the MLE parameters for the beta distribution.
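Integrating the BEINF density given for this class yields a cumulative distribution function with a jump of \(p(1-q)\) at 0 and a jump of \(pq\) at 1. The following is a minimal sketch of that cdf, assuming scipy.stats.beta for the continuous part. The name beinf_cdf is hypothetical; this is not the package's own _cdf implementation.

```python
import numpy as np
from scipy.stats import beta


def beinf_cdf(x, a, b, p, q):
    """Sketch of the BEINF(a, b, p, q) cdf (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # On [0, 1): point mass p(1-q) at 0 plus the beta cdf weighted by (1-p).
    interior = p * (1.0 - q) + (1.0 - p) * beta.cdf(x, a, b)
    # Below 0 the cdf is 0; at and above 1 the remaining mass pq is included.
    return np.where(x < 0.0, 0.0, np.where(x >= 1.0, 1.0, interior))
```

Evaluated at x=0 the sketch returns \(p(1-q)\), the probability of an exact zero.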
-
beta_moments(data_sub)
Computes the method of moments estimates of a and b for the beta distribution.
- Args:
- data_sub (ndarray):
- The data to be fit to the beta distribution. The values in data_sub should lie on the open interval (0,1).
- Returns: a, b (floats):
- The shape parameters for the beta distribution.
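For reference, the standard method of moments estimates for a beta distribution with sample mean \(m\) and sample variance \(v\) are \(a = m\,(m(1-m)/v - 1)\) and \(b = (1-m)\,(m(1-m)/v - 1)\). A minimal sketch follows. The function name is hypothetical, and the use of the unbiased variance (ddof=1) is an assumption that may differ from the package.

```python
import numpy as np


def beta_method_of_moments(data_sub):
    """Sketch: method of moments estimates of the beta shape parameters."""
    data_sub = np.asarray(data_sub, dtype=float)
    m = data_sub.mean()
    v = data_sub.var(ddof=1)  # assumption: unbiased sample variance
    # The estimates are only valid (positive) when v < m * (1 - m).
    common = m * (1.0 - m) / v - 1.0
    return m * common, (1.0 - m) * common
```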
-
cdf_eval(x, data_params, data)
Evaluates the cumulative distribution function (empirically or parametrically) for a random variable \(X\sim\mathrm{BEINF}(a,b,p,q)\). When \(a\) and \(b\) are unknown (i.e. when a=np.inf and b=np.inf), the cdf is evaluated using ecdf(). When \(a\) and \(b\) are known or when \(p=1\), the cdf is evaluated using cdf(). This is used after applying TAQM (see the tutorial on using the taqm class).
- Args:
- x (float or ndarray):
- The value(s) at which the cdf (or ecdf) is evaluated.
- data_params (ndarray):
- An array containing the four parameters a, b, p, and q.
- data (ndarray):
- The data which defines the ecdf when \(a\) and \(b\) are unknown (i.e. when a=np.inf and b=np.inf). This array may contain the values np.inf, 0, and 1.
- Returns: cdf_vals (ndarray):
- The cdf or ecdf for \(X\sim\mathrm{BEINF}(a,b,p,q)\) evaluated at x.
-
check_cases(data)
Checks whether data satisfies any of cases 1-4 described in Section 3b of Dirkson et al., 2018. When True, the parameters a and b for the beta portion of the BEINF distribution cannot be fit.
- Args:
- data (ndarray):
- The values for which to test if cases 1-4 apply.
- Returns: True or False
- Boolean value. True if any of the cases is satisfied; False if none of them is.
-
ecdf(x, X_samp, p=None, q=None)
Computes the empirical cumulative distribution function (ecdf) for a random variable \(X\sim\mathrm{BEINF}(a,b,p,q)\) when the parameters \(a\) and \(b\) (or additionally \(p\) and \(q\)) are unknown.
- Args:
- x (float or ndarray):
- The value(s) at which the ecdf is evaluated.
- X_samp (float or ndarray):
- A sample that lies on either: the open interval (0,1) when p and q are included as arguments, or the closed interval [0,1] when p and q are not included as arguments.
- p (float, optional):
- Shape parameter(s) for the beinf distribution.
- q (float, optional):
- Shape parameter(s) for the beinf distribution.
- Returns: ecdf_vals (ndarray):
- The ecdf for X_samp, evaluated at x.
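A minimal sketch of an empirical cdf along these lines is given below. It assumes that, when p and q are supplied, the ecdf of the interior sample takes the place of the beta cdf in the BEINF cdf; that combination rule and the name ecdf_sketch are assumptions, not the package's confirmed behavior.

```python
import numpy as np


def ecdf_sketch(x, X_samp, p=None, q=None):
    """Sketch: empirical cdf of X_samp at x, optionally with known p and q."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    X_samp = np.atleast_1d(np.asarray(X_samp, dtype=float))
    # Fraction of sample values less than or equal to each x.
    frac = (X_samp[None, :] <= x[:, None]).mean(axis=1)
    if p is None or q is None:
        # Sample on [0, 1]: the plain ecdf already includes the endpoints.
        return frac
    # Assumption: sample on (0, 1) stands in for the beta part of the cdf.
    vals = p * (1.0 - q) + (1.0 - p) * frac
    return np.where(x < 0.0, 0.0, np.where(x >= 1.0, 1.0, vals))
```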
-
fit(data)
Computes the MLE parameters \(a\), \(b\), \(p\), and \(q\) of the BEINF distribution for the given data.
- Args:
- data (ndarray):
- The data to be fit to the BEINF distribution.
- Returns: a, b, p, q (floats):
- The shape parameters for the BEINF distribution. When \(a\) and \(b\) cannot be fit, they are returned as a=np.inf and b=np.inf.
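Because the BEINF density places mass \(p(1-q)\) at 0 and \(pq\) at 1, the likelihood factorizes, and the maximum likelihood estimates of \(p\) and \(q\) reduce to simple counts: \(\hat{p}\) is the fraction of values equal to 0 or 1, and \(\hat{q}\) is the fraction of those endpoint values equal to 1. A minimal sketch follows. The function name is hypothetical, and returning q=0 when no endpoint values are present is an assumption, since \(q\) is not identified in that case.

```python
import numpy as np


def beinf_pq_mle(data):
    """Sketch: MLE of the inflation parameters p and q from counts."""
    data = np.asarray(data, dtype=float)
    n0 = np.count_nonzero(data == 0.0)  # exact zeros
    n1 = np.count_nonzero(data == 1.0)  # exact ones
    n_end = n0 + n1
    p_hat = n_end / data.size
    # Assumption: q is set to 0.0 when there are no endpoint values.
    q_hat = n1 / n_end if n_end > 0 else 0.0
    return p_hat, q_hat
```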
-
fit_beta(data_sub)
Computes the MLE parameters \(a\) and \(b\) of the beta distribution for the given data.
- Args:
- data_sub (ndarray):
- The data to be fit to the beta distribution. The values in data_sub should lie on the open interval (0,1).
- Returns: a, b (floats):
- The shape parameters for the beta distribution.
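One way to obtain maximum likelihood estimates of the two beta shape parameters is scipy.stats.beta.fit with the location and scale held fixed at 0 and 1. The sketch below illustrates that approach; it is an assumption that this resembles what fit_beta does internally, and the name fit_beta_sketch is hypothetical.

```python
import numpy as np
from scipy.stats import beta


def fit_beta_sketch(data_sub):
    """Sketch: MLE of beta shape parameters for data on the open interval (0, 1)."""
    data_sub = np.asarray(data_sub, dtype=float)
    # Fixing loc=0 and scale=1 leaves only the shape parameters a and b free.
    a_hat, b_hat, _loc, _scale = beta.fit(data_sub, floc=0, fscale=1)
    return a_hat, b_hat
```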
taqm (trend-adjusted quantile mapping)
-
class taqm.taqm
Contains the methods needed for performing trend-adjusted quantile mapping (TAQM). It relies on the methods from the beinf class.
calibrate(x_params,y_params,x_t_params,X,Y,X_t,trust_sharp_fcst=False)
- calibrated forecast BEINF parameters and calibrated forecast ensemble
fit_data(X,Y,X_t)
- BEINF parameters for the TAMH, TAOH, and the raw forecast
lin(a1,b1,T)
- linear equation values
piecewise_lin(a1,b1,a2,b2,t_b,T)
- piece-wise linear equation values
trend_adjust_1p(data_all,tau_t,t)
- trend-adjusted values using a single period
trend_adjust_2p(data_all,tau_t,t,t_b=1999)
- trend-adjusted values using two periods
unpack_params(array)
- the individual parameters stored in array
-
calibrate(x_params, y_params, x_t_params, X, Y, X_t, trust_sharp_fcst=False)
Calibrates the raw forecast BEINF parameters \(a_{x_t}\), \(b_{x_t}\), \(p_{x_t}\) and \(q_{x_t}\). This method carries out the calibration step described in section 5c of Dirkson et al., 2018.
- Args:
- x_params (ndarray):
- An array containing the four parameters of the BEINF distribution for the TAMH ensemble time series.
- y_params (ndarray):
- An array containing the four parameters of the BEINF distribution for the TAOH time series.
- x_t_params (ndarray):
- An array containing the four parameters of the BEINF distribution for the raw forecast ensemble.
- trust_sharp_fcst (boolean, optional):
- True to revert to the raw forecast when \(p_{x_t}=1\). False to revert to the TAOH distribution when \(p_{x_t}=1\).
- Returns: x_t_cal_params (ndarray), X_t_cal_beta (ndarray):
- x_t_cal_params contains the four BEINF distribution parameters for the calibrated forecast: \(a_{\hat{x}_t}\), \(b_{\hat{x}_t}\), \(p_{\hat{x}_t}\) and \(q_{\hat{x}_t}\). When \(a_{\hat{x}_t}\) and \(b_{\hat{x}_t}\) could not be fit, they are returned as a=np.inf and b=np.inf.
- X_t_cal_beta contains the calibrated forecast ensemble (np.inf replaces 0's and 1's in the ensemble). This array consists entirely of np.inf values when any of \(p_y=1\), \(p_x=1\), or \(p_{x_t}=1\), or when all parameters in x_t_cal_params are defined (none are equal to np.inf).
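As background, generic quantile mapping sends a forecast value through the cdf of the model-historical distribution and then through the inverse cdf of the observed-historical distribution. The sketch below shows only that generic idea for two beta distributions. It is not the full calibration step of section 5c, which also handles \(p\), \(q\), and trust_sharp_fcst; the function and argument names are hypothetical.

```python
import numpy as np
from scipy.stats import beta


def quantile_map_beta(x_t, a_x, b_x, a_y, b_y):
    """Sketch: generic quantile mapping between two beta distributions."""
    x_t = np.asarray(x_t, dtype=float)
    # Cumulative probability of each forecast value under the model distribution.
    u = beta.cdf(x_t, a_x, b_x)
    # Value with the same cumulative probability under the observed distribution.
    return beta.ppf(u, a_y, b_y)
```

When the two distributions share the same parameters, the mapping leaves the forecast values unchanged.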
-
fit_params(X, Y, X_t)
Fits X (the TAMH ensemble time series), Y (the TAOH time series), and X_t (the raw forecast ensemble) to the BEINF distribution. This method carries out the fitting procedure described in section 5b of Dirkson et al., 2018.
- Args:
- X (ndarray):
- The TAMH ensemble time series of size NxM.
- Y (ndarray):
- The TAOH time series of size N.
- X_t (ndarray):
- The raw forecast ensemble of size M.
- Returns: x_params (ndarray), y_params (ndarray), x_t_params (ndarray):
- The shape parameters \(a\), \(b\), \(p\), \(q\) for the BEINF distribution fitted to each of X, Y, and X_t (see beinf.fit()).
-
lin(a1, b1, T)
Evaluates the linear equation
\[z = a_1 T + b_1 \tag{1}\]
at \(T\).
- Args:
- Returns: z (float or ndarray):
- The value of (1) at each point T. Values less than 0 and greater than 1 are clipped to 0 and 1, respectively.
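A minimal sketch of Eq. (1) with the clipping described above, using numpy.clip. The name lin_sketch is hypothetical.

```python
import numpy as np


def lin_sketch(a1, b1, T):
    """Sketch: evaluate z = a1*T + b1 and clip the result to [0, 1]."""
    T = np.asarray(T, dtype=float)
    return np.clip(a1 * T + b1, 0.0, 1.0)
```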
-
piecewise_lin(a1, b1, a2, b2, t_b, T)
Evaluates the piece-wise linear equation
\[z = \begin{cases} a_1 T + b_1, & T<t_b \\ a_2 T + b_2, & T>t_b \end{cases} \tag{2}\]
at \(T\).
- Args:
- Returns: z (float or ndarray):
- The value of (2) at each point T. Values less than 0 and greater than 1 are clipped to 0 and 1, respectively.
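A minimal sketch of Eq. (2) with the clipping described above. Eq. (2) does not state which branch applies at exactly \(T=t_b\); assigning that point to the second branch is an assumption here, and it makes no difference when the two lines meet at \(t_b\). The name piecewise_lin_sketch is hypothetical.

```python
import numpy as np


def piecewise_lin_sketch(a1, b1, a2, b2, t_b, T):
    """Sketch: evaluate the two-segment linear equation and clip to [0, 1]."""
    T = np.asarray(T, dtype=float)
    # Assumption: T == t_b is evaluated with the second segment.
    z = np.where(T < t_b, a1 * T + b1, a2 * T + b2)
    return np.clip(z, 0.0, 1.0)
```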
-
trend_adjust_1p(data_all, tau_t, t)
Linearly detrend data_all and re-center it about its linear least squares fit evaluated at \(T=t\). This trend adjustment may be preferable to trend_adjust_2p() if the hindcast record covers only the more recent period.
- Args:
- data_all (ndarray):
- A time series of size N, or an ensemble time series of size NxM, where M is the number of ensemble members.
- tau_t (ndarray):
- All hindcast years excluding the forecast year.
- t (float):
- The forecast year.
- Returns: data_ta (ndarray):
- Trend-adjusted values with same shape as data_all.
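The detrend-and-re-center operation can be sketched for a one-dimensional time series as follows: fit a line to the hindcast values, subtract the fitted line, and add back the fitted value at the forecast year. This sketch uses numpy.polyfit, handles only the size-N case, and applies no clipping; how the package treats NxM ensembles is not assumed here. The function name is hypothetical.

```python
import numpy as np


def trend_adjust_1p_sketch(data, tau_t, t):
    """Sketch: linear trend adjustment of a 1-D time series to year t."""
    data = np.asarray(data, dtype=float)
    tau_t = np.asarray(tau_t, dtype=float)
    # Ordinary least squares line through the hindcast values.
    a1, b1 = np.polyfit(tau_t, data, deg=1)
    fit_hist = a1 * tau_t + b1  # fitted trend at each hindcast year
    fit_t = a1 * t + b1         # fitted trend at the forecast year
    # Remove the trend, then re-center about the trend value at year t.
    return data - fit_hist + fit_t
```

For a series that lies exactly on a line, every adjusted value equals the line's value at the forecast year.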
-
trend_adjust_2p(data_all, tau_t, t, t_b=1999)
Piece-wise linearly detrend data_all and re-center it about its non-linear least squares fit to Eq. (2) evaluated at \(T=t\). This method carries out the trend-adjustment technique described in section 5a of Dirkson et al., 2018. The non-linear least squares fit constrains Eq. (2) to be continuous at \(T=t_b\).
- Args:
- data_all (ndarray):
- A time series of size N, or an ensemble time series of size NxM, where M is the number of ensemble members.
- tau_t (ndarray):
- All hindcast years excluding the forecast year t.
- t (float):
- The forecast year.
- t_b (float):
- The breakpoint year in Eq. (2).
- Returns: data_ta (ndarray):
- Trend-adjusted values with same shape as data_all.
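With the breakpoint \(t_b\) fixed and clipping ignored, a version of Eq. (2) that is continuous at \(T=t_b\) can be written as \(z = c + a_1\min(T-t_b,0) + a_2\max(T-t_b,0)\), which is linear in \((c, a_1, a_2)\). The sketch below uses that simplification and substitutes ordinary linear least squares (numpy.linalg.lstsq) for the non-linear fit described above, so it is an illustration rather than the documented procedure. It handles only a one-dimensional series, and the function name is hypothetical.

```python
import numpy as np


def trend_adjust_2p_sketch(data, tau_t, t, t_b=1999):
    """Sketch: two-period trend adjustment of a 1-D time series to year t."""
    data = np.asarray(data, dtype=float)
    tau_t = np.asarray(tau_t, dtype=float)

    def design(T):
        # Columns: value at t_b, slope before t_b, slope after t_b.
        d = np.atleast_1d(np.asarray(T, dtype=float)) - t_b
        return np.column_stack(
            [np.ones_like(d), np.minimum(d, 0.0), np.maximum(d, 0.0)]
        )

    # Swapped-in technique: linear least squares on the continuous model.
    coef, *_ = np.linalg.lstsq(design(tau_t), data, rcond=None)
    fit_hist = design(tau_t) @ coef  # fitted trend at each hindcast year
    fit_t = (design(t) @ coef)[0]    # fitted trend at the forecast year
    return data - fit_hist + fit_t
```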