Title: | Hawkes and Log-Gaussian Cox Point Processes Using Template Model Builder |
---|---|
Description: | Fit Hawkes and log-Gaussian Cox process models with extensions. Introduced in Hawkes (1971) <doi:10.2307/2334319> a Hawkes process is a self-exciting temporal point process where the occurrence of an event immediately increases the chance of another. We extend this to consider self-inhibiting process and a non-homogeneous background rate. A log-Gaussian Cox process is a Poisson point process where the log-intensity is given by a Gaussian random field. We extend this to a joint likelihood formulation fitting a marked log-Gaussian Cox model. In addition, the package offers functionality to fit self-exciting spatiotemporal point processes. Models are fitted via maximum likelihood using 'TMB' (Template Model Builder). Where included 1) random fields are assumed to be Gaussian and are integrated over using the Laplace approximation and 2) a stochastic partial differential equation model, introduced by Lindgren, Rue, and Lindström. (2011) <doi:10.1111/j.1467-9868.2011.00777.x>, is defined for the field(s). |
Authors: | Charlotte M. Jones-Todd [aut, cre, cph] (<https://orcid.org/0000-0003-1201-2781>, Charlotte Jones-Todd wrote and continued developmend of the main code.), Alec van Helsdingen [aut] (Alec van Helsdingen wrote the Hawkes templates and extended self-exciting TMB templates), Xiangjie Xue [ctb] (Xiangjie Xue worked the early spatio-temporal self-exciting TMB templates), Joseph Reps [ctb] (Joseph Reps worked on the spatio-temporal self-exciting TMB templates), Marsden Fund 3723517 [fnd], Asian Office of Aerospace Research & Development FA2386-21-1-4028 [fnd] |
Maintainer: | Charlotte M. Jones-Todd <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.0.1 |
Built: | 2024-11-15 02:53:06 UTC |
Source: | https://github.com/cmjt/stelfi |
Extract the compensator differences from a fitted Hawkes model.
compensator_differences(obj)
compensator_differences(obj)
obj |
A fitted model object returned by |
## ********************** ### ## A univariate Hawkes model ### ********************** ### data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa), units = "mins")))) params <- c(mu = 0.05, alpha = 0.05, beta = 0.1) fit <- fit_hawkes(times = times, parameters = params) compensator_differences(fit)
## ********************** ### ## A univariate Hawkes model ### ********************** ### data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa), units = "mins")))) params <- c(mu = 0.05, alpha = 0.05, beta = 0.1) fit <- fit_hawkes(times = times, parameters = params) compensator_differences(fit)
Fit a Hawkes process using Template Model Builder (TMB). The function fit_hawkes()
fits a
self-exciting Hawkes process with a constant background rate. Whereas, fit_hawkes_cbf()
fits a Hawkes
processes with a user defined custom background function (non-homogeneous background rate). The function
fit_mhawkes()
fits a multivariate Hawkes process that allows for between- and within-stream
self-excitement.
fit_hawkes( times, parameters = list(), model = 1, marks = c(rep(1, length(times))), tmb_silent = TRUE, optim_silent = TRUE, ... ) fit_hawkes_cbf( times, parameters = list(), model = 1, marks = c(rep(1, length(times))), background, background_integral, background_parameters, background_min, tmb_silent = TRUE, optim_silent = TRUE ) fit_mhawkes( times, stream, parameters = list(), tmb_silent = TRUE, optim_silent = TRUE, ... )
fit_hawkes( times, parameters = list(), model = 1, marks = c(rep(1, length(times))), tmb_silent = TRUE, optim_silent = TRUE, ... ) fit_hawkes_cbf( times, parameters = list(), model = 1, marks = c(rep(1, length(times))), background, background_integral, background_parameters, background_min, tmb_silent = TRUE, optim_silent = TRUE ) fit_mhawkes( times, stream, parameters = list(), tmb_silent = TRUE, optim_silent = TRUE, ... )
times |
A vector of numeric observed time points. |
parameters |
A named list of parameter starting values:
|
model |
A numeric indicator specifying which model to fit:
|
marks |
Optional, a vector of numeric marks, defaults to 1 (i.e., no marks). |
tmb_silent |
Logical, if |
optim_silent |
Logical, if |
... |
Additional arguments to pass to |
background |
A function taking one parameter and an independent variable, returning a scalar. |
background_integral |
The integral of |
background_parameters |
The parameter(s)
for the background function |
background_min |
A function taking one parameter and two points,
returns min of |
stream |
A character vector specifying the stream ID of each observation in |
A univariate Hawkes (Hawkes, AG. 1971) process is a self-exciting temporal point process
with conditional intensity function
. Here
is the constant
baseline rate,
is the instantaneous increase in intensity after an event, and
is the
exponential decay in intensity. The term
describes the historic dependence
and the clustering density of the temporal point process, where the
are the events in
time occurring prior to time
. From this we can derive the following quantities 1)
is the branching ratio, it gives the average number of events triggered by an event, and
2)
gives the rate of decay of the self-excitement.
Including mark information results in the conditional intensity
,
where
is the temporal mark. This model can be fitted with
fit_hawkes()
.
An in-homogenous marked Hawkes process has conditional intensity function
. Here, the
background rate,
, varies in time. Such a model can be fitted
using
fit_hawkes_cbf()
where the parameters of the custom background function are estimated
before being passed to TMB
.
A multivariate Hawkes process that allows for between- and within-stream self-excitement.
The conditional intensity for the (
) stream is given by
, where
. Here,
is the excitement caused by the
stream on the
. Therefore,
is an
matrix where the diagonals represent the within-stream excitement and the off-diagonals
represent the excitement between streams.
A list containing components of the fitted model, see TMB::MakeADFun
. Includes
par
, a numeric vector of estimated parameter values;
objective
, the objective function;
gr
, the TMB calculated gradient function; and
simulate
, (where available) a simulation function.
Hawkes, AG. (1971) Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58: 83–90.
### ********************** ### ## A Hawkes model ### ********************** ### data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa), units = "mins")))) params <- c(mu = 0.05, alpha = 0.05, beta = 0.1) fit <- fit_hawkes(times = times, parameters = params) get_coefs(fit) ### ********************** ### ## A Hawkes model with marks (ETAS-type) ### ********************** ### data("nz_earthquakes", package = "stelfi") earthquakes <- nz_earthquakes[order(nz_earthquakes$origintime),] earthquakes <- earthquakes[!duplicated(earthquakes$origintime), ] times <- earthquakes$origintime times <- as.numeric(difftime(times, min(times), units = "hours")) marks <- earthquakes$magnitude params <- c(mu = 0.05, alpha = 0.05, beta = 1) fit <- fit_hawkes(times = times, parameters = params, marks = marks) get_coefs(fit) ### ********************** ### ## A Hawkes process with a custom background function ### ********************** ### if(require("hawkesbow")) { times <- hawkesbow::hawkes(1000, fun = function(y) {1 + 0.5*sin(y)}, M = 1.5, repr = 0.5, family = "exp", rate = 2)$p ## The background function must take a single parameter and ## the time(s) at which it is evaluated background <- function(params,times) { A = exp(params[[1]]) B = stats::plogis(params[[2]]) * A return(A + B *sin(times)) } ## The background_integral function must take a ## single parameter and the time at which it is evaluated background_integral <- function(params,x) { A = exp(params[[1]]) B = stats::plogis(params[[2]]) * A return((A*x)-B*cos(x)) } param = list(alpha = 0.5, beta = 1.5) background_param = list(1,1) fit <- fit_hawkes_cbf(times = times, parameters = param, background = background, background_integral = background_integral, background_parameters = background_param) get_coefs(fit) } ### ********************** ### ## A multivariate Hawkes model ### ********************** ### data(multi_hawkes, package = "stelfi") fit <- fit_mhawkes(times = multi_hawkes$times, stream = multi_hawkes$stream, parameters = list(mu = c(0.2,0.2), alpha = matrix(c(0.5,0.1,0.1,0.5),byrow = TRUE,nrow = 2), beta = c(0.7,0.7))) get_coefs(fit)
### ********************** ### ## A Hawkes model ### ********************** ### data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa), units = "mins")))) params <- c(mu = 0.05, alpha = 0.05, beta = 0.1) fit <- fit_hawkes(times = times, parameters = params) get_coefs(fit) ### ********************** ### ## A Hawkes model with marks (ETAS-type) ### ********************** ### data("nz_earthquakes", package = "stelfi") earthquakes <- nz_earthquakes[order(nz_earthquakes$origintime),] earthquakes <- earthquakes[!duplicated(earthquakes$origintime), ] times <- earthquakes$origintime times <- as.numeric(difftime(times, min(times), units = "hours")) marks <- earthquakes$magnitude params <- c(mu = 0.05, alpha = 0.05, beta = 1) fit <- fit_hawkes(times = times, parameters = params, marks = marks) get_coefs(fit) ### ********************** ### ## A Hawkes process with a custom background function ### ********************** ### if(require("hawkesbow")) { times <- hawkesbow::hawkes(1000, fun = function(y) {1 + 0.5*sin(y)}, M = 1.5, repr = 0.5, family = "exp", rate = 2)$p ## The background function must take a single parameter and ## the time(s) at which it is evaluated background <- function(params,times) { A = exp(params[[1]]) B = stats::plogis(params[[2]]) * A return(A + B *sin(times)) } ## The background_integral function must take a ## single parameter and the time at which it is evaluated background_integral <- function(params,x) { A = exp(params[[1]]) B = stats::plogis(params[[2]]) * A return((A*x)-B*cos(x)) } param = list(alpha = 0.5, beta = 1.5) background_param = list(1,1) fit <- fit_hawkes_cbf(times = times, parameters = param, background = background, background_integral = background_integral, background_parameters = background_param) get_coefs(fit) } ### ********************** ### ## A multivariate Hawkes model ### ********************** ### data(multi_hawkes, package = "stelfi") fit <- fit_mhawkes(times = multi_hawkes$times, stream = multi_hawkes$stream, parameters = list(mu = c(0.2,0.2), alpha = matrix(c(0.5,0.1,0.1,0.5),byrow = TRUE,nrow = 2), beta = c(0.7,0.7))) get_coefs(fit)
Fit a log-Gaussian Cox process (LGCP) using Template Model Builder (TMB) and the
R_inla
namespace for the SPDE-based construction of the latent field.
fit_lgcp( locs, sf, smesh, tmesh, parameters, covariates, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
fit_lgcp( locs, sf, smesh, tmesh, parameters, covariates, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
locs |
A |
sf |
An |
smesh |
A Delaunay triangulation of the spatial domain returned by |
tmesh |
Optional, a temporal mesh returned by |
parameters |
A named list of parameter starting values:
Default values are used if none are provided. NOTE: these may not always be appropriate. |
covariates |
Optional, a |
tmb_silent |
Logical, if |
nlminb_silent |
Logical, if |
... |
optional extra arguments to pass into |
A log-Gaussian Cox process (LGCP) where the Gaussian random field, ,
has zero mean, variance-covariance matrix
, and covariance function
. The random intensity surface is
,
for design matrix
, coefficients
, and random error
.
Shown in Lindgren et. al., (2011) the stationary solution to the SPDE (stochastic
partial differential equation) is
a random field with a Matérn covariance function,
. Here
controls
the smoothness of the field and
controls the range.
A Markovian random field is obtained when is an integer. Following
Lindgren et. al., (2011) we set
in 2D and therefore fix
. Under these
conditions the solution to the SPDE is a Gaussian Markov Random Field (GMRF). This is the approximation
we use.
The (approximate) spatial range and
the standard deviation of the model,
.
Under
INLA
(Lindgren and Rue, 2015) methodology the practical range is defined as the
distance such that the correlation is .
A list containing components of the fitted model, see TMB::MakeADFun
. Includes
par
, a numeric vector of estimated parameter values;
objective
, the objective function;
gr
, the TMB calculated gradient function; and
simulate
, a simulation function.
Lindgren, F., Rue, H., and Lindström, J. (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73: 423–498.
Lindgren, F. and Rue, H. (2015) Bayesian spatial modelling with R-INLA. Journal of Statistical Software, 63: 1–25.
### ********************** ### ## A spatial only LGCP ### ********************** ### if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) ### ********************** ### ## A spatiotemporal LGCP, AR(1) ### ********************** ### ndays <- 2 locs <- data.frame(x = xyt$x, y = xyt$y, t = xyt$t) w0 <- 2 tmesh <- fmesher::fm_mesh_1d(seq(0, ndays, by = w0)) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, tmesh = tmesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1), atanh_rho = 0.2)) }
### ********************** ### ## A spatial only LGCP ### ********************** ### if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) ### ********************** ### ## A spatiotemporal LGCP, AR(1) ### ********************** ### ndays <- 2 locs <- data.frame(x = xyt$x, y = xyt$y, t = xyt$t) w0 <- 2 tmesh <- fmesher::fm_mesh_1d(seq(0, ndays, by = w0)) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, tmesh = tmesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1), atanh_rho = 0.2)) }
Fit a marked LGCP using Template Model Builder (TMB) and the R_inla
namespace for the SPDE-based construction of the latent field.
fit_mlgcp( locs, sf, marks, smesh, parameters = list(), methods, strfixed = matrix(1, nrow = nrow(locs), ncol = ncol(marks)), fields = rep(1, ncol(marks)), covariates, pp_covariates, marks_covariates, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
fit_mlgcp( locs, sf, marks, smesh, parameters = list(), methods, strfixed = matrix(1, nrow = nrow(locs), ncol = ncol(marks)), fields = rep(1, ncol(marks)), covariates, pp_covariates, marks_covariates, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
locs |
A |
sf |
An |
marks |
A matrix of marks for each observation of the point pattern. |
smesh |
A Delaunay triangulation of the spatial domain returned by |
parameters |
a list of named parameters: log_tau, log_kappa, betamarks, betapp, marks_coefs_pp. |
methods |
An integer value:
|
strfixed |
A matrix of fixed structural parameters, defined for each event and mark.
Defaults to
|
fields |
A binary vector indicating whether there is a new random field for each mark. By default, each mark has its own random field. |
covariates |
Covariate(s) corresponding to each area in the spatial mesh |
pp_covariates |
Which columns of the covariates apply to the point process |
marks_covariates |
Which columns of the covariates apply to the marks. By default, all covariates apply to the marks only. |
tmb_silent |
Logical, if |
nlminb_silent |
Logical, if |
... |
optional extra arguments to pass into |
The random intensity surface of the point process is (as fit_lgcp
)
,
for design matrix
, coefficients
, and random error
.
Each mark, , is jointly modelled and has their own random field
where
are coefficient(s) linking the point process and the mark(s).
depends on the distribution of the marks. If the marks are from a Poisson distribution, it is
the intensity (as with the point process). If the marks are from a Binomial distribution, it is the
success probability, and the user must supply the number of trials for each event (via
strfixed
).
If the marks are normally distributed then this models the mean, and the user must supply
the standard deviation (via strfixed
). The user can choose for the point processes and the marks to
share a common GMRF, i.e. ; this is controlled via the argument
fields
.
A list containing components of the fitted model, see TMB::MakeADFun
. Includes
par
, a numeric vector of estimated parameter values;
objective
, the objective function; and
gr
, the TMB calculated gradient function.
Lindgren, F., Rue, H., and Lindström, J. (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73: 423–498.
### ********************** ### ## A joint likelihood marked LGCP model ### ********************** ### if(requireNamespace("fmesher")){ data(marked, package = "stelfi") loc.d <- 3 * cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)) domain <- sf::st_sf(geometry = sf::st_sfc(sf::st_polygon(list(loc.d)))) smesh <- fmesher::fm_mesh_2d(loc.domain = loc.d, offset = c(0.3, 1), max.edge = c(0.3, 0.7), cutoff = 0.05) locs <- cbind(x = marked$x, y = marked$y) marks <- cbind(m1 = marked$m1) ## Gaussian mark parameters <- list(betamarks = matrix(0, nrow = 1, ncol = ncol(marks)), log_tau = rep(log(1), 2), log_kappa = rep(log(1), 2), marks_coefs_pp = rep(0, ncol(marks)), betapp = 0) fit <- fit_mlgcp(locs = locs, marks = marks, sf = domain, smesh = smesh, parameters = parameters, methods = 0,fields = 1) }
### ********************** ### ## A joint likelihood marked LGCP model ### ********************** ### if(requireNamespace("fmesher")){ data(marked, package = "stelfi") loc.d <- 3 * cbind(c(0, 1, 1, 0, 0), c(0, 0, 1, 1, 0)) domain <- sf::st_sf(geometry = sf::st_sfc(sf::st_polygon(list(loc.d)))) smesh <- fmesher::fm_mesh_2d(loc.domain = loc.d, offset = c(0.3, 1), max.edge = c(0.3, 0.7), cutoff = 0.05) locs <- cbind(x = marked$x, y = marked$y) marks <- cbind(m1 = marked$m1) ## Gaussian mark parameters <- list(betamarks = matrix(0, nrow = 1, ncol = ncol(marks)), log_tau = rep(log(1), 2), log_kappa = rep(log(1), 2), marks_coefs_pp = rep(0, ncol(marks)), betapp = 0) fit <- fit_mlgcp(locs = locs, marks = marks, sf = domain, smesh = smesh, parameters = parameters, methods = 0,fields = 1) }
Fits spatiotemporal Hawkes models. The self-excitement is Gaussian in space and exponentially decaying in time.
fit_stelfi( times, locs, sf, smesh, parameters, covariates, GMRF = FALSE, time_independent = TRUE, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
fit_stelfi( times, locs, sf, smesh, parameters, covariates, GMRF = FALSE, time_independent = TRUE, tmb_silent = TRUE, nlminb_silent = TRUE, ... )
times |
A vector of numeric observed time points. |
locs |
A |
sf |
An |
smesh |
A Delaunay triangulation of the spatial domain returned by |
parameters |
A list of named parameters:
|
covariates |
Optional, a |
GMRF |
Logical, default |
time_independent |
Logical, default |
tmb_silent |
Logical, if |
nlminb_silent |
Logical, if |
... |
Additional arguments to pass to |
Temporal self-excitement follows an exponential decay function.
The self-excitement over space follows a Gaussian distribution centered at the triggering event.
There are two formulations of this model. The default is that the Gaussian function has a fixed spatial
covariance matrix, independent of time. Alternatively, covariance can be directly proportional to time,
meaning that the self-excitement radiates out from the center over time.
This can be appropriate when the mechanism causing self-excitement travels
at a finite speed, but is very memory-intensive. The spatiotemporal intensity function
used by stelfi
is
where
is the background rate,
is the rate of temporal decay,
is the increase in intensity after an event,
are the event times,
are the event locations (in 2D Euclidean space), and
is the spatial self-excitement kernel.
can take two forms:
For time-independent spatial excitement (time_independent = TRUE
),
where
is the density function of
.
For time-dependent spatial excitement (time_independent = FALSE
),
where
is the density function of
.
A list containing components of the fitted model, see TMB::MakeADFun
. Includes
par
, a numeric vector of estimated parameter values;
objective
, the objective function; and
gr
, the TMB calculated gradient function.
fit_hawkes
and fit_lgcp
## No GMRF if(requireNamespace("fmesher")){ data(xyt, package = "stelfi") N <- 50 locs <- data.frame(x = xyt$x[1:N], y = xyt$y[1:N]) times <- xyt$t[1:N] domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) param <- list( mu = 3, alpha = 1, beta = 3, xsigma = 0.2, ysigma = 0.2, rho = 0.8) fit <- fit_stelfi(times = times, locs = locs, sf = domain, smesh = smesh, parameters = param) get_coefs(fit) ## GMRF param <- list( mu = 5, alpha = 1, beta = 3, kappa = 0.9, tau = 1, xsigma = 0.2, ysigma = 0.2, rho = 0.8) fit <- fit_stelfi(times = times, locs = locs, sf = domain, smesh = smesh, parameters = param, GMRF = TRUE) get_coefs(fit) }
## No GMRF if(requireNamespace("fmesher")){ data(xyt, package = "stelfi") N <- 50 locs <- data.frame(x = xyt$x[1:N], y = xyt$y[1:N]) times <- xyt$t[1:N] domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) param <- list( mu = 3, alpha = 1, beta = 3, xsigma = 0.2, ysigma = 0.2, rho = 0.8) fit <- fit_stelfi(times = times, locs = locs, sf = domain, smesh = smesh, parameters = param) get_coefs(fit) ## GMRF param <- list( mu = 5, alpha = 1, beta = 3, kappa = 0.9, tau = 1, xsigma = 0.2, ysigma = 0.2, rho = 0.8) fit <- fit_stelfi(times = times, locs = locs, sf = domain, smesh = smesh, parameters = param, GMRF = TRUE) get_coefs(fit) }
Return parameter estimates from a fitted model.
get_coefs(obj)
get_coefs(obj)
obj |
A fitted model as returned by one of |
A matrix of estimated parameters and standard errors returned by
TMB::sdreport
("report"
).
fit_hawkes
, fit_hawkes_cbf
, fit_lgcp
, fit_mlgcp
, and fit_stelfi
## Hawkes data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) fit <- fit_hawkes(times = times, parameters = params) get_coefs(fit) ## LGCP if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) get_coefs(fit) }
## Hawkes data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) fit <- fit_hawkes(times = times, parameters = params) get_coefs(fit) ## LGCP if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) get_coefs(fit) }
Extract the estimated mean, or standard deviation, of the
values of the Gaussian Markov random field for a fitted log-Gaussian
Cox process model at each node of smesh
.
get_fields(obj, smesh, tmesh, plot = FALSE, sd = FALSE)
get_fields(obj, smesh, tmesh, plot = FALSE, sd = FALSE)
obj |
A fitted model object returned by |
smesh |
A Delaunay triangulation of the spatial domain returned by |
tmesh |
Optional, a temporal mesh returned by |
plot |
Logical, if |
sd |
Logical, if |
A numeric
vector or a list
of returned values at each smesh
node.
if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) get_fields(fit, smesh, plot = TRUE) }
if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) get_fields(fit, smesh, plot = TRUE) }
Calculate the areas (weights) around the mesh nodes that
are within the specified spatial polygon sf
of the domain.
get_weights(mesh, sf, plot = FALSE)
get_weights(mesh, sf, plot = FALSE)
mesh |
A spatial mesh of class |
sf |
An |
plot |
Logical, whether to plot the calculated |
Either a simple features, sf
, object or values returned by geom_sf
.
https://becarioprecario.bitbucket.io/spde-gitbook/.
data(horse_mesh, package = "stelfi") data(horse_sf, package = "stelfi") get_weights(horse_mesh, horse_sf, plot = TRUE)
data(horse_mesh, package = "stelfi") data(horse_sf, package = "stelfi") get_weights(horse_mesh, horse_sf, plot = TRUE)
Example Delaunay triangulation
A fmesher::fm_mesh_2d()
based on the outline of a horse
A dataset containing information of terrorism activity carried out by the Islamic State of Iraq and the Levant (ISIL) in Iraq, 2013 - 2017.
A simple features dataframe, of type POINT
, with 4208 observations and 16 fields:
numeric year 2013–2017
numeric month index 1–12
numeric day 1–31 (zeros are a non-entry)
country (IRAQ)
latitude location
longitude location
x-coord location UTM
y-coord location UTM
logical, was fatal? TRUE
= fatal
number of fatalities per attack
spatial accuracy of event: 1 = most accurate, 5 = worst
character name of attack perpetrators (ISIL)
x coordinate from location projected onto a sphere
y coordinate from location projected onto a sphere
z coordinate from location projected onto a sphere
scaled: number of people per kilometer squared
scaled: luminosity
scaled: time to nearest city in minutes
https://www.start.umd.edu/gtd/
Example marked point pattern data set
A data frame with 159 rows and 5 variables:
x coordinate
y coordinate
mark, Gaussian distributed
mark, Bernoulli distributed
mark, Gamma distributed
fmesher::fm_mesh_2d
into a sf
objectTransform a fmesher::fm_mesh_2d
into a sf
object
mesh_2_sf(mesh)
mesh_2_sf(mesh)
mesh |
A |
A simple features, sf
, object.
Modified from sp
based function suggested by Finn in the
R-inla discussion Google Group
https://groups.google.com/g/r-inla-discussion-group/c/z1n1exlZrKM.
data(horse_mesh, package = "stelfi") sf <- mesh_2_sf(horse_mesh) if(require("ggplot2")) { ggplot(sf) + geom_sf() }
data(horse_mesh, package = "stelfi") sf <- mesh_2_sf(horse_mesh) if(require("ggplot2")) { ggplot(sf) + geom_sf() }
Calculates a number of geometric attributes for a given Delaunay triangulation based on the circumscribed and inscribed circle of each triangle.
meshmetrics(mesh)
meshmetrics(mesh)
mesh |
A |
A triangle's circumcircle (circumscribed circle) is the unique circle that passes through each of its three vertices. A triangle's incircle (inscribed circle) is the largest circle that can be contained within it (i.e., touches it's three edges).
An object of class sf
with the following data for each triangle in the
triangulation
V1
, V2
, and V3
corresponding vertices
of mesh
matches mesh$graph$tv
;
ID
, numeric triangle id;
angleA
, angleB
, and angleC
, the
interior angles;
circumcircle radius, circumradius, circumcircle_R
();
incircle radius incircle_r
();
centroid locations of the circumcircle, circumcenter, (c_Ox, c_Oy
);
centroid locations of the incircle, incenter, (i_Ox, i_Oy
);
the radius-edge ratio radius_edge
,
where
is the minimum edge length;
the radius ratio radius_ratio
;
area
, area ();
quality
a measure of "quality" defined as
,
where
is the length of edge
.
data(horse_mesh, package = "stelfi") metrics <- meshmetrics(horse_mesh) if(require("ggplot2")) { ggplot(metrics) + geom_sf(aes(fill = radius_ratio)) }
data(horse_mesh, package = "stelfi") metrics <- meshmetrics(horse_mesh) if(require("ggplot2")) { ggplot(metrics) + geom_sf(aes(fill = radius_ratio)) }
Two-stream multivariate Hawkes data with ,
, and
=
matrix(c(0.5,0.1,0.1,0.5),byrow = TRUE,nrow = 2)
.
A data frame with 213 observations and 2 variables:
ordered time stamp of observation
character giving stream ID (i.e., Stream 1 or Stream 2) of observation
Earthquake data from Canterbury, New Zealand 16-Jan-2010–24-Dec-2014.
A simple features dataframe, of type POINT
, with 3824 observations and 3 fields:
The UTC time of the event's occurrence
The magnitude of the earthquake
The focal depth of the event (km)
https://quakesearch.geonet.org.nz/
A dataset of recorded murder cases in New Zealand 2004 - 2019.
A simple features dataframe, of type POINT
, with 967 observations and 11 fields:
Biological sex of victim
Age of victim (years)
Month and day of murder
Year
Cause of death
Killer
Name of victim
Date object of observation on single days
Month name of observation
Cause of death as category
NZ region
Data scraped and cleaned by Charlie Timmings, honours student at the University of Auckland from the website https://interactives.stuff.co.nz/2019/the-homicide-report/
A dataset of retweet times of NIWA's viral leopard seal tweet on the 5th Feb 2019 (https://twitter.com/niwa_nz/status/1092610541401587712).
A vector of length 4890 specifying the date and time of retweet (UTC)
https://twitter.com/niwa_nz/status/1092610541401587712
Subset of data sourced from the Bigfoot Field Researchers Organization (BFRO) (https://data.world/timothyrenner/bfro-sightings-data).
A simple features dataframe, of type POINT
, with 972 observations and 27 fields:
Text observation summary
Text location summary
County
State
Season
Report title
Date
Report number
Report classification
Geohash code
Reported weather measure
Reported weather measure
Reported weather measure
Reported weather measure
Reported weather measure
Reported weather measure
Reported measure
Reported weather measure
Reported weather measure
Reported weather measure
Reported weather measure
Text weather summary
Reported weather measure
Reported weather measure
Reported weather measure
Reported weather measure
Year
https://data.world/timothyrenner/bfro-sightings-data
Plots the values of x
at each node of smesh
, with
optional control over resolutions using dims
.
show_field(x, smesh, sf, dims = c(500, 500), clip = FALSE)
show_field(x, smesh, sf, dims = c(500, 500), clip = FALSE)
x |
A vector of values, one value per each |
smesh |
A Delaunay triangulation of the spatial domain returned by |
sf |
Optional, |
dims |
A numeric vector of length 2 specifying
the spatial pixel resolution. Default |
clip |
Logical, if |
A gg
class object, values returned by geom_tile
and geom_sf
.
if(requireNamespace("fmesher")){ if(require("sf")){ data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1)) simdata <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh) show_field(c(simdata$x), smesh = smesh, sf = domain) show_field(c(simdata$x), smesh = smesh, sf = domain, clip = TRUE) } }
if(requireNamespace("fmesher")){ if(require("sf")){ data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1)) simdata <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh) show_field(c(simdata$x), smesh = smesh, sf = domain) show_field(c(simdata$x), smesh = smesh, sf = domain, clip = TRUE) } }
Plots a Hawkes intensity function, options to extend to non-homogeneous background intensity.
Plots a number of goodness-of-fit plots for a fitted Hawkes process. Includes 1) a comparison of the compensator and observed events, 2) a histogram of transformed interarrival times, 3) a Q-Q plot of transformed interarrival times, and 4) the CDF of consecutive interarrival times, In addition, results of a Kolmogorov-Smirnov and Ljung-Box hypothesis test for the compensator differences are printed.
show_hawkes(obj, type = c("fitted", "data", "both")) show_hawkes_GOF(obj, plot = TRUE, return_values = FALSE, tests = TRUE)
show_hawkes(obj, type = c("fitted", "data", "both")) show_hawkes_GOF(obj, plot = TRUE, return_values = FALSE, tests = TRUE)
obj |
Either object returned by |
type |
One of |
plot |
Logical, whether to plot goodness-of-fit plots. Default |
return_values |
Logical, whether to return GOF values. Default |
tests |
Logical, whether to print the results of a Kolmogorov-Smirnov and
Ljung-Box hypothesis test on the compensator differences. Default |
show_hawkes
returns a gtable
object
with geom_line
and/or geom_histogram
values.
show_hawkes_GOF
returns no value unless return_values = TRUE
,
in this case a list of interarrival times is returned.
data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) show_hawkes(list(times = times, params = params)) fit <- fit_hawkes(times = times, parameters = params) show_hawkes(fit) data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) show_hawkes_GOF(list(times = times, params = params)) fit <- fit_hawkes(times = times, parameters = params) show_hawkes_GOF(fit)
data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) show_hawkes(list(times = times, params = params)) fit <- fit_hawkes(times = times, parameters = params) show_hawkes(fit) data(retweets_niwa, package = "stelfi") times <- unique(sort(as.numeric(difftime(retweets_niwa, min(retweets_niwa),units = "mins")))) params <- c(mu = 9, alpha = 3, beta = 10) show_hawkes_GOF(list(times = times, params = params)) fit <- fit_hawkes(times = times, parameters = params) show_hawkes_GOF(fit)
Plots the estimated spatial intensity from
a fitted log-Gaussian Cox process model. If obj
is a
spatiotemporal model then timestamp
provides control
over which temporal index to plot the estimated spatial intensity.
show_lambda( obj, smesh, sf, tmesh, covariates, clip = FALSE, dims = c(500, 500), timestamp = 1 )
show_lambda( obj, smesh, sf, tmesh, covariates, clip = FALSE, dims = c(500, 500), timestamp = 1 )
obj |
A fitted LGCP model object for, for example, |
smesh |
A Delaunay triangulation of the spatial domain returned by |
sf |
An |
tmesh |
Optional, a temporal mesh returned by |
covariates |
Optional, a |
clip |
Logical, if |
dims |
A numeric vector of length 2 specifying
the spatial pixel resolution. Default |
timestamp |
The index of time stamp to plot. Default |
A gg
class object, values returned by geom_tile
and geom_sf
.
fit_lgcp
, show_field
, and get_fields
if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) show_lambda(fit, smesh = smesh, sf = domain) }
if(requireNamespace("fmesher")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) locs <- data.frame(x = xyt$x, y = xyt$y) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) fit <- fit_lgcp(locs = locs, sf = domain, smesh = smesh, parameters = c(beta = 0, log_tau = log(1), log_kappa = log(1))) show_lambda(fit, smesh = smesh, sf = domain) }
Multivariate Hawkes fitted model plot
show_multivariate_hawkes(obj, type = c("fitted", "data", "both"))
show_multivariate_hawkes(obj, type = c("fitted", "data", "both"))
obj |
Either object returned by |
type |
One of |
Simulates a self-exciting Hawkes process.
sim_hawkes(mu, alpha, beta, n = 100, plot = FALSE, seed = 123, method = "1")
sim_hawkes(mu, alpha, beta, n = 100, plot = FALSE, seed = 123, method = "1")
mu |
A numeric specifying base rate of the Hawkes process. |
alpha |
A numeric specifying intensity jump after an event occurrence. |
beta |
A numeric specifying exponential intensity decay |
n |
A numeric depending on method: if |
plot |
Logical, if |
seed |
The seed. Default, |
method |
A character "1" or "2" specifying the method (see details)
to simulate Hawkes process. Default, |
Option of two methods to simulate a Hawkes process:
if method = "1"
then a univariate Hawkes process as
algorithm 2 in Ogata, Y. (1981) is simulated,
if method = "2"
then an accept/reject
framework is used).
A numeric
vector of simulated event times.
sim_hawkes(10.2, 3.1, 8.9) sim_hawkes(10.2, 3.1, 8.9, method = "2")
sim_hawkes(10.2, 3.1, 8.9) sim_hawkes(10.2, 3.1, 8.9, method = "2")
Simulate a realisation of a log-Gaussian Cox process (LGCP) using the
TMB
C++
template. If rho
is supplied in parameters
as well as tmesh
then spatiotemporal (AR(1)) data will be simulated.
sim_lgcp(parameters, sf, smesh, tmesh, covariates, all = FALSE)
sim_lgcp(parameters, sf, smesh, tmesh, covariates, all = FALSE)
parameters |
A named list of parameter starting values:
Default values are used if none are provided. NOTE: these may not always be appropriate. |
sf |
An |
smesh |
A Delaunay triangulation of the spatial domain returned by |
tmesh |
Optional, a temporal mesh returned by |
covariates |
Optional, a |
all |
Logical, if |
A named list. If all = FALSE
then only the simulated values of
the GMRF at each mesh node are returned, x
, alongside the number of
events, y
, simulated at each node.
if(requireNamespace("fmesher")){ if(require("sf")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1)) sim <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh) ## spatiotemporal ndays <- 2 w0 <- 2 tmesh <- fmesher::fm_mesh_1d(seq(0, ndays, by = w0)) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1), atanh_rho = 0.2) sim <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh, tmesh = tmesh) } }
if(requireNamespace("fmesher")){ if(require("sf")) { data(xyt, package = "stelfi") domain <- sf::st_as_sf(xyt$window) bnd <- fmesher::fm_as_segm(as.matrix(sf::st_coordinates(domain)[, 1:2])) smesh <- fmesher::fm_mesh_2d(boundary = bnd, max.edge = 0.75, cutoff = 0.3) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1)) sim <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh) ## spatiotemporal ndays <- 2 w0 <- 2 tmesh <- fmesher::fm_mesh_1d(seq(0, ndays, by = w0)) parameters <- c(beta = 1, log_tau = log(1), log_kappa = log(1), atanh_rho = 0.2) sim <- sim_lgcp(parameters = parameters, sf = domain, smesh = smesh, tmesh = tmesh) } }
Fit Hawkes and log-Gaussian Cox process models with extensions. Introduced in Hawkes (1971) a Hawkes process is a self-exciting temporal point process where the occurrence of an event immediately increases the chance of another. We extend this to consider self-inhibiting process and a non-homogeneous background rate. A log-Gaussian Cox process is a Poisson point process where the log-intensity is given by a Gaussian random field. We extend this to a joint likelihood formulation fitting a marked log-Gaussian Cox model. In addition, the package offers functionality to fit self-exciting spatiotemporal point processes. Models are fitted via maximum likelihood using 'TMB' (Template Model Builder) (Kristensen, Nielsen, Berg, Skaug, and Bell, 2016). Where included 1) random fields are assumed to be Gaussian and are integrated over using the Laplace approximation and 2) a stochastic partial differential equation model, introduced by Lindgren, Rue, and Lindström. (2011), is defined for the field(s).
The functions fit_hawkes
and fit_hawkes_cbf
fit self-exciting Hawkes (Hawkes AG., 1971) processes to temporal point pattern data.
The function fit_lgcp
fit a log-Gaussian Cox process to
either spatial or spatiotemporal point pattern data. If a spatiotemporal
model is fitted a AR1 process is assumed for the temporal progression.
The function fit_mlgcp
fits a joint likelihood model between
the point locations and the mark(s).
The function fit_stelfi
fits self-exciting spatiotemporal
Hawkes models to point pattern data. The self-excitement is Gaussian in space
and exponentially decaying over time. In addition, a GMRF can be included
to account for latent spatial dependency.
Hawkes, AG. (1971) Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58: 83–90.
Lindgren, F., Rue, H., and Lindström, J. (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73: 423–498.
Kristensen, K., Nielsen, A., Berg, C. W., Skaug, H., and Bell B. M. (2016). TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software, 70: 1–21.
A dataset containing the names and number of recorded murders committed by some of the infamous UK serial killers from 1828 to 2015.
A dataframe with 62 rows and 8 variables:
approx number of murders committed
The years of operation
Name of convicted serial killer
Some serial killers were given nicknames
Year the murders began
Year the murders ended
Date, if known, first murder was committed
Est popn in millions at time of first murder
Simulated self-exciting spatiotemporal point pattern of class stppp
A stppp
object with 653 observations
Domain of the point pattern of class owin
Number of observations, 653
x coordinate
y coordinate
none
Timestamp of points
Time frame 0 2