Title: | Bayesian Modelling of Extremal Dependence in Time Series |
---|---|
Description: | Characterisation of the extremal dependence structure of time series, avoiding pre-processing and filtering as done typically with peaks-over-threshold methods. It uses the conditional approach of Heffernan and Tawn (2004) <DOI:10.1111/j.1467-9868.2004.02050.x> which is very flexible in terms of extremal and asymptotic dependence structures, and Bayesian methods improve efficiency and allow for deriving measures of uncertainty. For example, the extremal index, related to the size of clusters in time, can be estimated and samples from its posterior distribution obtained. |
Authors: | Thomas Lugrin |
Maintainer: | Thomas Lugrin <[email protected]> |
License: | GPL (>=2) |
Version: | 0.3.3 |
Built: | 2024-10-14 06:18:01 UTC |
Source: | https://github.com/tlugrin/tsxtreme |
Characterisation of the extremal dependence structure of time series, avoiding pre-processing and filtering as done typically with peaks-over-threshold methods. It uses the conditional approach of Heffernan and Tawn (2004) <DOI:10.1111/j.1467-9868.2004.02050.x> which is very flexible in terms of extremal and asymptotic dependence structures, and Bayesian methods improve efficiency and allow for deriving measures of uncertainty. For example, the extremal index, related to the size of clusters in time, can be estimated and samples from its posterior distribution obtained.
Index of help topics:
bayesfit Traces from MCMC output bayesparams Parameters for the semi-parametric approach dep2fit Dependence model fit (stepwise) depfit Dependence model fit depmeasure Dependence measures estimates depmeasures Estimate dependence measures dlapl The Laplace Distribution stepfit Estimates from stepwise fit theta2fit Fit time series extremes thetaruns Runs estimator tsxtreme-package Bayesian Modelling of Extremal Dependence in Time Series
The Heffernan–Tawn conditional formulation for a stationary time series with Laplace marginal distribution states that for a large enough threshold
there exist scale parameters
and
such that
with non-degenerate; the equality holds exactly only when
tends to infinity.
There are mainly 3 functions provided by this package, which allow estimation of extremal dependence measures and fitting the Heffernan–Tawn model using Dirichlet processes.
depfit
fits the Heffernan–Tawn model using a Bayesian semi-parametric approach.
thetafit
computes posterior samples of the threshold-based index of Ledford and Tawn (2003) based on inference in depfit
.
chifit
computes posterior samples of the extremal measure of dependence of Coles, Heffernan and Tawn (1999) at any extremal level.
Some corresponding functions using the stepwise approach of Heffernan and Tawn (2004) are also part of the package, namely dep2fit
and theta2fit
.
The empirical estimation of the extremal index can be done using thetaruns
and some basic functions handling the Laplace distribution are also available in dlapl
.
Thomas Lugrin
Maintainer: Thomas Lugrin <[email protected]>
Coles, S., Heffernan, J. E. and Tawn, J. A. (1999) Dependence measures for extreme value analyses. Extremes, 2, 339–365.
Davison, A. C. and Smith, R. L. (1990) Models for exceedances over high thresholds. Journal of the Royal Statistical Society Series B, 52, 393–442.
Heffernan, J. E. and Tawn, J. A. (2004) A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society Series B, 66, 497–546.
Ledford, W. A. and Tawn, J. A. (2003) Diagnostics for dependence within time series extremes. Journal of the Royal Statistical Society Series B, 65, 521–543.
Lugrin, T., Davison, A. C. and Tawn, J. A. (2016) Bayesian uncertainty management in temporal dependence of extremes. Extremes, 19, 491–515.
Test or show objects of class "bayesfit".
is.bayesfit(x)
is.bayesfit(x)
x |
an arbitrary R object. |
Default plot shows samples of residual densities (which==1
), residual distribution with credible interval ( and
posterior quantiles;
which==2
), and joint posterior distribution of and
(
which==3
) for each lag successively. which
can be any composition of 1,2 and 3.
An object of class "bayesfit" is a list containing MCMC traces for:
a , b
|
Heffernan-Tawn parameters. |
sd , mean , w
|
standard deviations, means and weights of the mixture components. |
prec |
precision parameter of the Dirichlet process. |
ci |
auxiliary variable; components' indices for each observation. |
noo |
number of observations in each mixture component. |
noc |
number of non-empty components in the mixture. |
prop.sd |
standard deviations of the proposal distributions for |
And len
, the length of the traces, i.e., the number of iterations saved.
Create, test or show objects of class "bayesparams".
bayesparams(prop.a = 0.02, prop.b = 0.02, prior.mu = c(0, 10), prior.nu = c(2, 1/2), prior.eta = c(2, 2), trunc = 100, comp.saved = 15, maxit = 30000, burn = 5000, thin = 1, adapt = 5000, batch.size = 125, mode = 1) is.bayesparams(x)
bayesparams(prop.a = 0.02, prop.b = 0.02, prior.mu = c(0, 10), prior.nu = c(2, 1/2), prior.eta = c(2, 2), trunc = 100, comp.saved = 15, maxit = 30000, burn = 5000, thin = 1, adapt = 5000, batch.size = 125, mode = 1) is.bayesparams(x)
prop.a , prop.b
|
standard deviation for the Gaussian proposal of the Heffernan–Tawn parameters. |
prior.mu |
mean and standard deviation of the Gaussian prior for the components' means. |
prior.nu |
shape and rate of the inverse gamma prior for the components' variances. |
prior.eta |
shape and scale of the gamma prior for the precision parameter of the Dirichlet process. |
trunc |
integer; value of the truncation for the approximation of the infinite sum in the stick-breaking representation. |
comp.saved |
number of first components to be saved and returned. |
maxit |
maximum number of iterations. |
burn |
number of first iterations to discard. |
thin |
positive integer; spacing between iterations to be saved. Default is 1, i.e., all iterations are saved. |
adapt |
integer; number of iterations during which an adaption algorithm is applied to the proposal variances of |
batch.size |
size of batches used in the adaption algorithm. It has no effect if |
mode |
verbosity; 0 for debug mode, 1 (default) for standard output, and 2 for silent. |
x |
an arbitrary R object. |
prop.a
is a vector of length 5 with the standard deviations for each region of the RAMA for the (Gaussian) proposal for . If a scalar is given, 5 identical values are assumed.
prop.b
is a vector of length 3 with the standard deviations for each region of the RAMA for the (Gaussian) proposal for . If a scalar is provided, 3 identical values are assumed.
comp.saved
has no impact on the calculations: its only purpose is to prevent from storing huge amounts of empty components.
The regional adaption scheme targets a acceptance probability. It splits
in
regions for
and
in
regions for
. The decision to increase/decrease the proposal standard deviation is based on the past
batch.size
MCMC iterations, so too low values yield inefficient adaption, while too large values yield slow adaption.
Default values for the hyperparameters are chosen to get reasonably uninformative priors.
is.bayesparams(bayesparams()) # TRUE ## use defaults, change max number of iteration of MCMC par <- bayesparams(maxit=1e5)
is.bayesparams(bayesparams()) # TRUE ## use defaults, change max number of iteration of MCMC par <- bayesparams(maxit=1e5)
The conditional Heffernan–Tawn model is used to fit the dependence in time of a stationary series. A standard 2-stage procedure is used.
dep2fit(ts, u.mar = 0, u.dep, lapl = FALSE, method.mar = c("mle","mom","pwm"), nlag = 1, conditions = TRUE)
dep2fit(ts, u.mar = 0, u.dep, lapl = FALSE, method.mar = c("mle","mom","pwm"), nlag = 1, conditions = TRUE)
ts |
numeric vector; time series to be fitted. |
u.mar |
marginal threshold; used when transforming the time series to Laplace scale. |
u.dep |
dependence threshold; level above which the dependence is modelled. |
lapl |
logical; is |
method.mar |
a character string defining the method used to estimate the marginal GPD; either |
nlag |
integer; number of lags to be considered when modelling the dependence in time. |
conditions |
logical; should conditions on |
Consider a stationary time series with Laplace marginal distribution; the fitting procedure consists of fitting
with the number of lags considered. A likelihood is maximised assuming
, then an empirical distribution for the
is derived using the estimates of
and
and the relation
conditions
implements additional conditions suggested by Keef, Papastathopoulos and Tawn (2013) on the ordering of conditional quantiles. These conditions help with getting a consistent fit by shrinking the domain in which live.
alpha |
parameter controlling the conditional extremal expectation. |
beta |
parameter controlling the conditional extremal expectation and variance. |
res |
empirical residual of the model. |
pars.se |
vector of length 2 giving the estimated standard errors for |
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale margin ar <- qlapl(pnorm(ar)) ## fit model without constraints... fit1 <- dep2fit(ts=ar, u.mar = 0.95, u.dep=0.98, conditions=FALSE) fit1$a; fit1$b ## ...and compare with a fit with constraints fit2 <- dep2fit(ts=ar, u.mar = 0.95, u.dep=0.98, conditions=TRUE) fit2$a; fit2$b# should be similar, as true parameters lie well within the constraints
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale margin ar <- qlapl(pnorm(ar)) ## fit model without constraints... fit1 <- dep2fit(ts=ar, u.mar = 0.95, u.dep=0.98, conditions=FALSE) fit1$a; fit1$b ## ...and compare with a fit with constraints fit2 <- dep2fit(ts=ar, u.mar = 0.95, u.dep=0.98, conditions=TRUE) fit2$a; fit2$b# should be similar, as true parameters lie well within the constraints
Bayesian semiparametrics are used to fit the Heffernan–Tawn model to time series. Options are available to impose a structure in time on the model.
depfit(ts, u.mar = 0, u.dep=u.mar, lapl = FALSE, method.mar=c("mle","mom","pwm"), nlag = 1, par = bayesparams(), submodel = c("fom", "none", "ugm"))
depfit(ts, u.mar = 0, u.dep=u.mar, lapl = FALSE, method.mar=c("mle","mom","pwm"), nlag = 1, par = bayesparams(), submodel = c("fom", "none", "ugm"))
ts |
numeric vector; time series to be fitted. |
u.mar |
marginal threshold; used when transforming the time series to Laplace scale. |
u.dep |
dependence threshold; level above which the dependence is modelled. |
lapl |
logical; is |
method.mar |
a character string defining the method used to estimate the marginal GPD; either |
nlag |
integer; number of lags to be considered when modelling the dependence in time. |
par |
an object of class 'bayesparams'. |
submodel |
a character string; "fom" for first order Markov, "none" for no particular time structure, or "ugm" for univariate Gaussian mixture (see details). |
submodel
can be "fom"
to impose a first order Markov structure on the model parameters and
(see
thetafit
for more details); it can take the value "none"
to impose no particular structure in time; it can also be "ugm"
which can be applied to density estimation, as it corresponds to setting (see examples).
An object of class 'bayesfit' with elements:
a |
posterior trace of |
b |
posterior trace of |
sd |
posterior trace of the components' standard deviations. |
mean |
posterior trace of the components' means. |
w |
posterior trace of the components' assigned weights. |
prec |
posterior trace of the precision parameter. |
noo |
posterior trace of the number of observations per component. |
noc |
posterior trace of the number of components containing at least one observation. |
prop.sd |
trace of proposal standard deviations in the 5+3 regions of the adaption scheme for |
len |
length of the returned traces. |
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) ## rescale the margin ar <- qlapl(pnorm(ar)) ## fit the data params <- bayesparams() params$maxit <- 100# bigger numbers would be params$burn <- 10 # more sensible... params$thin <- 4 fit <- depfit(ts=ar, u.mar=0.95, u.dep=0.98, par=params) ######## ## density estimation with submodel=="ugm" data <- MASS::galaxies/1e3 dens <- depfit(ts=data, par=params, submodel="ugm")
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) ## rescale the margin ar <- qlapl(pnorm(ar)) ## fit the data params <- bayesparams() params$maxit <- 100# bigger numbers would be params$burn <- 10 # more sensible... params$thin <- 4 fit <- depfit(ts=ar, u.mar=0.95, u.dep=0.98, par=params) ######## ## density estimation with submodel=="ugm" data <- MASS::galaxies/1e3 dens <- depfit(ts=data, par=params, submodel="ugm")
Test or show objects of class "depmeasure".
is.depmeasure(x)
is.depmeasure(x)
x |
an arbitrary R object. |
An object of class 'depmeasure' is a list which contains:
fit |
an object of class 'bayesfit' |
distr |
an array with the samples used for the estimation. |
probs , levels
|
points —probability and original scale respectively— at which the dependence measure is estimated |
Depending on the dependence measure, theta
or chi
, a matrix with levels on row-entries and mean, median and specified quantiles of the posterior distribution of theta or chi respectively.
Appropriate marginal transforms are done before the fit using standard procedures, before the dependence model is fitted to the data. Then the posterior distribution of a measure of dependence is derived. thetafit
gives posterior samples for the extremal index and
chifit
does the same for the coefficient of extremal dependence .
thetafit(ts, lapl = FALSE, nlag = 1, R = 1000, S = 500, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle", "mom","pwm"), method = c("prop", "MCi"), silent = FALSE, fit = TRUE, prev.fit=bayesfit(), par = bayesparams(), submodel = c("fom", "none"), levels=c(.025,.975)) chifit(ts, lapl = FALSE, nlag = 1, R = 1000, S = 500, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle", "mom","pwm"), method = c("prop", "MCi"), silent = FALSE, fit = TRUE, prev.fit=bayesfit(), par = bayesparams(), submodel = c("fom", "none"), levels=c(.025,.975))
thetafit(ts, lapl = FALSE, nlag = 1, R = 1000, S = 500, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle", "mom","pwm"), method = c("prop", "MCi"), silent = FALSE, fit = TRUE, prev.fit=bayesfit(), par = bayesparams(), submodel = c("fom", "none"), levels=c(.025,.975)) chifit(ts, lapl = FALSE, nlag = 1, R = 1000, S = 500, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle", "mom","pwm"), method = c("prop", "MCi"), silent = FALSE, fit = TRUE, prev.fit=bayesfit(), par = bayesparams(), submodel = c("fom", "none"), levels=c(.025,.975))
ts |
a vector, the time series for which to estimate the extremal index |
lapl |
logical; |
nlag |
the run-length; an integer larger or equal to 1. |
R |
the number of samples per MCMC iteration drawn from the sampled posterior distributions; used for the estimation of the dependence measure. |
S |
the number of posterior distributions sampled to be used for the estimation of the dependence measure. |
u.mar |
probability; threshold used for marginal transformation if |
u.dep |
probability; threshold used for the extremal dependence model. |
probs |
vector of probabilities; the values of |
method.mar |
a character string defining the method used to estimate the marginal GPD; either |
method |
a character string defining the method used to estimate the dependence measure; either |
silent |
logical ( |
fit |
logical; |
prev.fit |
an object of class 'bayesfit'. Needed if |
par |
an object of class ' |
submodel |
a character string, either |
levels |
vector of probabilites; the quantiles of the posterior distribution of the extremal measure to be computed. |
The sub-asymptotic extremal index is defined as
whose limit as and
go to
appropriately is the extremal index
. The extremal index can be interpreted as the inverse of the asymptotic mean cluster size (see
thetaruns)
.
The sub-asymptotic coefficient of extremal dependence is
whose limit defines asymptotic dependence (
) or asymptotic independence (
).
Both types of extremal dependence measures can be estimated either using a
* proportion method (method == "prop"
), sampling from the conditional probability given and counting the proportion of sampled points falling in the region of interest, or
* Monte Carlo integration (method == "MCi"
), sampling replicates from the marginal exponential tail distribution and evaluating the conditional tail distribution in these replicates, then taking their mean as an approximation of the integral.
submodel == "fom"
imposes a first order Markov structure to the model, namely a geometrical decrease in and a constant
across lags, i.e.
and
,
.
An object of class 'depmeasure', containing a subset of:
bayesfit |
An object of class 'bayesfit' |
theta |
An array with dimensions |
distr |
An array with dimensions |
chi |
An array with dimensions |
probs |
|
levels |
|
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale the margin (focus on dependence) ar <- qlapl(pnorm(ar)) ## fit the data params <- bayesparams() params$maxit <- 100 # bigger numbers would be params$burn <- 10 # more sensible... params$thin <- 4 theta <- thetafit(ts=ar, R=500, S=100, u.mar=0.95, u.dep=0.98, probs = c(0.98, 0.999), par=params) ## or, same thing in two steps to control fit output before computing theta: fit <- depfit(ts=ar, u.mar=0.95, u.dep=0.98, par=params) plot(fit) theta <- thetafit(ts=ar, R=500, S=100, u.mar=0.95, u.dep=0.98, probs = c(0.98, 0.999), fit=FALSE, prev.fit=fit)
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale the margin (focus on dependence) ar <- qlapl(pnorm(ar)) ## fit the data params <- bayesparams() params$maxit <- 100 # bigger numbers would be params$burn <- 10 # more sensible... params$thin <- 4 theta <- thetafit(ts=ar, R=500, S=100, u.mar=0.95, u.dep=0.98, probs = c(0.98, 0.999), par=params) ## or, same thing in two steps to control fit output before computing theta: fit <- depfit(ts=ar, u.mar=0.95, u.dep=0.98, par=params) plot(fit) theta <- thetafit(ts=ar, R=500, S=100, u.mar=0.95, u.dep=0.98, probs = c(0.98, 0.999), fit=FALSE, prev.fit=fit)
Density, distribution function, quantile function and random generation for the Laplace distribution with location parameter loc
and scale parameter scale
.
dlapl(x, loc = 0, scale = 1, log = FALSE) plapl(q, loc = 0, scale = 1, lower.tail = TRUE, log.p = FALSE) qlapl(p, loc = 0, scale = 1, lower.tail = TRUE, log.p = FALSE) rlapl(n, loc = 0, scale = 1)
dlapl(x, loc = 0, scale = 1, log = FALSE) plapl(q, loc = 0, scale = 1, lower.tail = TRUE, log.p = FALSE) qlapl(p, loc = 0, scale = 1, lower.tail = TRUE, log.p = FALSE) rlapl(n, loc = 0, scale = 1)
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of samples to generate. |
loc |
vector of location parameters. |
scale |
vector of scale parameters. These must be positive. |
lower.tail |
logical; if TRUE (default), probabilities are |
log , log.p
|
logical; if TRUE, probabilities |
If loc
or scale
are not specified, they assume the default values of 0 and 1 respectively.
The Laplace distribution has density
where is the location parameter and
is the scale parameter.
dlapl
gives the density, plapl
gives the distribution function, qlapl
gives the quantile function, and rlapl
generates random deviates.
The length of the result is determined by n
in rlapl
, and is the maximum of the lengths of the numerical arguments for the other functions. Standard R
vector operations are to be assumed.
If sd
=0, the limit as sd
decreases to 0 is returned, i.e., a point mass at loc
. The case sd
<0 is an error and nothing is returned.
Some checks are done previous to standard evaluation, but vector computations have not yet been tested thoroughly! Typically vectors not having lengths multiple of each other return an error.
dexp for the exponential distribution which is the positive part of the Laplace distribution.
## evaluate the density function on a grid of values x <- seq(from=-5, to=5, by=0.1) fx <- dlapl(x, loc=1, scale=.5) ## generate random samples of a mixture of Laplace distributions rnd <- rlapl(1000, loc=c(-5,-3,2), scale=0.5) ## an alternative: rnd <- runif(1000) rnd <- qlapl(rnd, loc=c(-5,-3,2), scale=0.5) ## integrate the Laplace density on [a,b] a <- -1 b <- 7 integral <- plapl(b)-plapl(a)
## evaluate the density function on a grid of values x <- seq(from=-5, to=5, by=0.1) fx <- dlapl(x, loc=1, scale=.5) ## generate random samples of a mixture of Laplace distributions rnd <- rlapl(1000, loc=c(-5,-3,2), scale=0.5) ## an alternative: rnd <- runif(1000) rnd <- qlapl(rnd, loc=c(-5,-3,2), scale=0.5) ## integrate the Laplace density on [a,b] a <- -1 b <- 7 integral <- plapl(b)-plapl(a)
Create, test or show objects of class "stepfit".
stepfit() is.stepfit(x)
stepfit() is.stepfit(x)
x |
an arbitrary R object. |
An object of class "stepfit" is a list containing:
a , b
|
Heffernan–Tawn parameters. |
res |
fitted residuals. |
pars.se |
estimated standard error of |
Appropriate marginal transforms are done before the fit using standard procedures, before the dependence model is fitted to the data. Then the measure of dependence is derived using a method described in Eastoe and Tawn (2012).
theta2fit(ts, lapl = FALSE, nlag = 1, R = 1000, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle","mom","pwm"), method = c("prop", "MCi"), silent = FALSE, R.boot = 0, block.length = m * 5, levels = c(.025,.975))
theta2fit(ts, lapl = FALSE, nlag = 1, R = 1000, u.mar = 0, u.dep, probs = seq(u.dep, 0.9999, length.out = 30), method.mar = c("mle","mom","pwm"), method = c("prop", "MCi"), silent = FALSE, R.boot = 0, block.length = m * 5, levels = c(.025,.975))
ts |
numeric vector; time series to be fitted. |
lapl |
logical; is |
nlag |
integer; number of lags to be considered when modelling the dependence in time. |
R |
integer; the number of samples used for estimating |
u.mar |
marginal threshold; used when transforming the time series to Laplace scale if |
u.dep |
dependence threshold; level above which the dependence is modelled. |
probs |
vector of probabilities; the values of |
method.mar |
a character string defining the method used to estimate the marginal GPD; either |
method |
a character string defining the method used to estimate the dependence measure; either |
silent |
logical ( |
R.boot |
integer; the number of samples used for the block bootstrap for the confidence intervals. |
block.length |
integer; the block length used for the block-bootstrapped confidence intervals. |
levels |
vector of probabilities; the quantiles of the bootstrap distribution of the extremal measure to be computed. |
The standard procedure (method="prop"
) to estimating probabilities from a Heffernan-Tawn fit best illustrated in the bivariate context ():
1. sample from an exponential distribution above
,
2. sample (residuals) from their empirical distribution,
3. compute using the relation
,
4. estimate by calculating the proportion
of
samples above
and multiply
with the marginal survival distribution evaluated at
.
With method="MCi"
a Monte Carlo integration approach is used, where the survivor distribution of is evaluated at pseudo-residuals of the form
where is sampled from an exponential distribution above
. Taking the mean of these survival probabilities, we get the Monte Carlo equivalent of
in the proportion approach.
List containing:
depfit |
an object of class 'stepfit' |
probs |
|
levels |
|
theta |
a matrix with proportion or Monte Carlo estimates of |
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale the margin (focus on dependence) ar <- qlapl(pnorm(ar)) ## fit the data fit <- theta2fit(ts=ar, u.mar=0.95, u.dep=0.98) ## plot theta(x,1) plot(fit) abline(h=1, lty="dotted")
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) plot(ar, type="l") plot(density(ar)) grid <- seq(-3,3,0.01) lines(grid, dnorm(grid), col="blue") ## rescale the margin (focus on dependence) ar <- qlapl(pnorm(ar)) ## fit the data fit <- theta2fit(ts=ar, u.mar=0.95, u.dep=0.98) ## plot theta(x,1) plot(fit) abline(h=1, lty="dotted")
Compute the empirical estimator of the extremal index using the runs method (Smith & Weissman, 1994, JRSSB).
thetaruns(ts, lapl = FALSE, nlag = 1, u.mar = 0, probs = seq(u.mar, 0.995, length.out = 30), method.mar = c("mle", "mom", "pwm"), R.boot = 0, block.length = (nlag+1) * 5, levels = c(0.025, 0.975))
thetaruns(ts, lapl = FALSE, nlag = 1, u.mar = 0, probs = seq(u.mar, 0.995, length.out = 30), method.mar = c("mle", "mom", "pwm"), R.boot = 0, block.length = (nlag+1) * 5, levels = c(0.025, 0.975))
ts |
a vector, the time series for which to estimate the threshold-based extremal index |
lapl |
logical; is |
nlag |
the run-length; an integer larger or equal to 1. |
u.mar |
marginal threshold (probability); used when transforming the time series to Laplace scale if |
probs |
vector of probabilities; the values of |
method.mar |
a character string defining the method used to estimate the marginal GPD; either |
block.length |
integer; the block length used for the block-bootstrapped confidence intervals. |
R.boot |
integer; the number of samples used for the block bootstrap. |
levels |
vector of probabilites; the quantiles of the posterior distribution of the extremal index |
Consider a stationary time series . A characterisation of the extremal index is
In the limit when and
tend to
appropriately,
corresponds to the asymptotic inverse mean cluster size. It also links the generalised extreme value distribution of the independent series
, with the same marginal distribution as
,
with and
the extreme value distributions of
and
respectively.
nlag
corresponds to the run-length and
probs
is a set of values for .
The runs estimator is computed, which consists of counting the proportion of clusters to the number of exceedances of a threshold
; two exceedances of the threshold belong to different clusters if there are at least
non-exceedances inbetween.
An object of class 'depmeasure
' containing:
theta |
matrix; estimates of the extremal index |
nbr.exc |
numeric vector; number of exceedances for each threshold corresponding to the elements in |
probs |
|
levels |
numeric vector; |
nlag |
|
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) ## transform to Laplace scale ar <- qlapl(pnorm(ar)) ## compute empirical estimate theta <- thetaruns(ts=ar, u.mar=.95, probs=c(.95,.98,.99)) ## output plot(theta, ylim=c(.2,1)) abline(h=1, lty="dotted")
## generate data from an AR(1) ## with Gaussian marginal distribution n <- 10000 dep <- 0.5 ar <- numeric(n) ar[1] <- rnorm(1) for(i in 2:n) ar[i] <- rnorm(1, mean=dep*ar[i-1], sd=1-dep^2) ## transform to Laplace scale ar <- qlapl(pnorm(ar)) ## compute empirical estimate theta <- thetaruns(ts=ar, u.mar=.95, probs=c(.95,.98,.99)) ## output plot(theta, ylim=c(.2,1)) abline(h=1, lty="dotted")