Title: | Estimate Global Clustering in Infectious Disease |
---|---|
Description: | Implements various novel and standard clustering statistics and other analyses useful for understanding the spread of infectious disease. |
Authors: | John Giles [aut, ctb], Henrik Salje [aut, ctb], Justin Lessler [aut, cre], Timothy Willard [ctb] |
Maintainer: | Justin Lessler <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.4.0 |
Built: | 2025-02-24 04:33:31 UTC |
Source: | https://github.com/cran/IDSpatialStats |
Dataset simulated using an agent based model with a spatially heterogeneous population structure. Infectious agents were introduced resulting in agent to agent transmission. The distance between successive cases in a transmission chain were randomly drawn from a uniform distribution U(0,100). Each infectious agent resulted in a single transmission to another agent after a delay of 15 days, reflecting the generation time of dengue. There are 11 transmission chains, each with a different genotype. The genotypes are subdivided into four serotypes.
DengueSimR01
DengueSimR01
Matrix with five columns representing the X and Y coordinates of infected individuals, the time of infection, the genotype of the infecting pathogen and the serotype of the infecting pathogen.
Justin Lessler and Henrik Salje
Dataset simulated using an agent based model with a spatially heterogeneous population structure. Infectious agents were introduced resulting in agent to agent transmission. The distance between successive cases in a transmission chain were randomly drawn from a uniform distribution U(0,100). Each infectious agent resulted in transmissions to two other agents after a delay of 15 days, reflecting the generation time of dengue. There are 11 transmission chains, each with a different genotype. The genotypes are subdivided into four serotypes.
DengueSimR02
DengueSimR02
Matrix with five columns representing the X and Y coordinates of infected individuals, the time of infection, the genotype of the infecting pathogen and the serotype of the infecting pathogen.
Justin Lessler and Henrik Salje
Dataset simulated using an agent based model with a spatially heterogeneous population structure. Infectious agents were introduced resulting in agent to agent transmission. The distance between successive cases in a transmission chain were randomly drawn from a uniform distribution U(0,100). Each infectious agent resulted in transmissions to two other agents after a delay of 15 days, reflecting the generation time of dengue. There are 11 transmission chains, each with a different genotype. The genotypes are subdivided into four serotypes. 500 randomly selected individuals from the underlying population also included.
DengueSimRepresentative
DengueSimRepresentative
Matrix with five columns representing the X and Y coordinates of infected individuals, the time of infection, the genotype of the infecting pathogen and the serotype of the infecting pathogen. Individuals representative from the underlying population have '-999'for time, genotype and serotype.
Justin Lessler and Henrik Salje
this function estimates the mean transmission distance of an epidemic when given the locations and times of symptomatic cases and the mean and standard deviation of the generation time of the infecting pathogen
est.transdist( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, theta.weights = NULL )
est.transdist( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, theta.weights = NULL )
epi.data |
a three-column matrix giving the coordinates ( |
gen.t.mean |
mean generation time of the infecting pathogen |
gen.t.sd |
standard deviation of generation time of the infecting pathogen |
t1 |
time step to begin estimation of transmission distance |
max.sep |
maximum number of time steps allowed between two cases (passed to the |
max.dist |
maximum spatial distance between two cases considered in calculation |
n.transtree.reps |
number of time to simulate transmission trees when estimating the weights of theta (passed to the |
theta.weights |
use external matrix of theta weights. If NULL (default) the matrix of theta weights is automatically estimated by calling the |
a list containing the estimated mean distance of the transmission kernel (mu.est
) and its standard deviation (sigma.est
). Bounded estimates (bound.mu.est
and bound.sigma.est
) are also given for when the assumption of equal mean and standard deviation is violated.
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other est.wt:
est.wt.matrix()
,
est.wt.matrix.weights()
Other transdist:
est.transdist.bootstrap.ci()
,
est.transdist.temporal()
,
est.transdist.temporal.bootstrap.ci()
,
est.transdist.theta.weights()
,
get.transdist.theta()
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=1.5, gen.t.mean=7, gen.t.sd=2, min.cases=50, tot.generations=12, trans.kern.func=dist.func) # Estimate mean and standara deviation of transmission kernel b <- est.transdist(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10) b
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=1.5, gen.t.mean=7, gen.t.sd=2, min.cases=50, tot.generations=12, trans.kern.func=dist.func) # Estimate mean and standara deviation of transmission kernel b <- est.transdist(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10) b
Runs est.trandsdist
on multiple bootstraps of the data and calculates confidence intervals for the mean transmission distance.
est.transdist.bootstrap.ci( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, mean.equals.sd = FALSE, theta.weights = NULL, boot.iter, ci.low = 0.025, ci.high = 0.975, parallel = FALSE, n.cores = NULL )
est.transdist.bootstrap.ci( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, mean.equals.sd = FALSE, theta.weights = NULL, boot.iter, ci.low = 0.025, ci.high = 0.975, parallel = FALSE, n.cores = NULL )
epi.data |
a three-column matrix giving the coordinates ( |
gen.t.mean |
mean generation time of the infecting pathogen |
gen.t.sd |
standard deviation of generation time of the infecting pathogen |
t1 |
time step to begin estimation of transmission distance |
max.sep |
maximum number of time steps allowed between two cases (passed to the |
max.dist |
maximum spatial distance between two cases considered in calculation |
n.transtree.reps |
number of time to simulate transmission trees when estimating the weights of theta (passed to the |
mean.equals.sd |
logical term indicating if the mean and standard deviation of the transmission kernel are expected to be equal (default = FALSE) |
theta.weights |
use external matrix of theta weights. If NULL (default) the matrix of theta weights is automatically estimated by calling the |
boot.iter |
the number of bootstrapped iterations to perform |
ci.low |
low end of the confidence interval (default = 0.025) |
ci.high |
high end of the confidence interval (default = 0.975) |
parallel |
run bootstraps in parallel (default = FALSE) |
n.cores |
number of cores to use when |
a list object containing the point estimate for mean transmission distance and low and high bootstrapped confidence intervals
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other transdist:
est.transdist()
,
est.transdist.temporal()
,
est.transdist.temporal.bootstrap.ci()
,
est.transdist.theta.weights()
,
get.transdist.theta()
set.seed(1) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2.5, gen.t.mean=7, gen.t.sd=2, min.cases=20, tot.generations=5, trans.kern.func=dist.func) # Estimate mean transmission kernel and its bootstrapped confidence intervals b <- est.transdist.bootstrap.ci(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10, mean.equals.sd=TRUE, boot.iter=10, ci.low=0.025, ci.high=0.975, n.cores=2) b
set.seed(1) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2.5, gen.t.mean=7, gen.t.sd=2, min.cases=20, tot.generations=5, trans.kern.func=dist.func) # Estimate mean transmission kernel and its bootstrapped confidence intervals b <- est.transdist.bootstrap.ci(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10, mean.equals.sd=TRUE, boot.iter=10, ci.low=0.025, ci.high=0.975, n.cores=2) b
Estimates the change in mean transmission distance over the duration of the epidemic by running est.trandsdist
on all cases
occuring up to each time point.
est.transdist.temporal( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 10, mean.equals.sd = FALSE, theta.weights = NULL, parallel = FALSE, n.cores = NULL )
est.transdist.temporal( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 10, mean.equals.sd = FALSE, theta.weights = NULL, parallel = FALSE, n.cores = NULL )
epi.data |
a three-column matrix giving the coordinates ( |
gen.t.mean |
mean generation time of the infecting pathogen |
gen.t.sd |
standard deviation of generation time of the infecting pathogen |
t1 |
time step to begin estimation of transmission distance |
max.sep |
maximum number of time steps allowed between two cases (passed to the |
max.dist |
maximum spatial distance between two cases considered in calculation |
n.transtree.reps |
number of time to simulate transmission trees when estimating the weights of theta (passed to the |
mean.equals.sd |
logical term indicating if the mean and standard deviation of the transmission kernel are expected to be equal (default = FALSE) |
theta.weights |
use external matrix of theta weights. If NULL (default) the matrix of theta weights is automatically estimated by calling the |
parallel |
run time steps in parallel (default = FALSE) |
n.cores |
number of cores to use when |
a numeric matrix containing the point estimate for mean transmission distance for each unique time step of the epidemic and the sample size $n$ used to make the estimate NAs are returned for time steps which contain fewer than three cases
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other transdist:
est.transdist()
,
est.transdist.bootstrap.ci()
,
est.transdist.temporal.bootstrap.ci()
,
est.transdist.theta.weights()
,
get.transdist.theta()
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2, gen.t.mean=7, gen.t.sd=2, tot.generations=7, min.cases=30, trans.kern.func=dist.func) a <- a[sample(1:nrow(a), 50),] # subsample a to 50 observations # Estimate mean transmission kernel over time b <- est.transdist.temporal(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=5, mean.equals.sd=TRUE, n.cores=2) b plot(b[,2], pch=19, col='grey', ylim=c(min(b[,2], na.rm=TRUE), max(b[,2], na.rm=TRUE)), xlab='Time step', ylab='Estimated mean of transmission kernel') abline(h=100, col='red', lty=2) axis(3, b[,2]) low <- loess(b[,2] ~ as.vector(1:length(b[,2]))) low <- predict(low, newdata=data.frame(as.vector(1:length(b[,2])))) lines(low, lwd=3, col='blue')
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2, gen.t.mean=7, gen.t.sd=2, tot.generations=7, min.cases=30, trans.kern.func=dist.func) a <- a[sample(1:nrow(a), 50),] # subsample a to 50 observations # Estimate mean transmission kernel over time b <- est.transdist.temporal(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=5, mean.equals.sd=TRUE, n.cores=2) b plot(b[,2], pch=19, col='grey', ylim=c(min(b[,2], na.rm=TRUE), max(b[,2], na.rm=TRUE)), xlab='Time step', ylab='Estimated mean of transmission kernel') abline(h=100, col='red', lty=2) axis(3, b[,2]) low <- loess(b[,2] ~ as.vector(1:length(b[,2]))) low <- predict(low, newdata=data.frame(as.vector(1:length(b[,2])))) lines(low, lwd=3, col='blue')
Estimates bootstrapped confidence intervals for the mean transmission distance over the duration of the epidemic by running est.trandsdist
on all cases
occuring up to each time point.
est.transdist.temporal.bootstrap.ci( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, mean.equals.sd = FALSE, theta.weights = NULL, boot.iter, ci.low = 0.025, ci.high = 0.975, parallel = FALSE, n.cores = NULL )
est.transdist.temporal.bootstrap.ci( epi.data, gen.t.mean, gen.t.sd, t1, max.sep, max.dist, n.transtree.reps = 100, mean.equals.sd = FALSE, theta.weights = NULL, boot.iter, ci.low = 0.025, ci.high = 0.975, parallel = FALSE, n.cores = NULL )
epi.data |
a three-column matrix giving the coordinates ( |
gen.t.mean |
mean generation time of the infecting pathogen |
gen.t.sd |
standard deviation of generation time of the infecting pathogen |
t1 |
time step to begin estimation of transmission distance |
max.sep |
maximum number of time steps allowed between two cases (passed to the |
max.dist |
maximum spatial distance between two cases considered in calculation |
n.transtree.reps |
number of time to simulate transmission trees when estimating the weights of theta (passed to the |
mean.equals.sd |
logical term indicating if the mean and standard deviation of the transmission kernel are expected to be equal (default = FALSE) |
theta.weights |
use external matrix of theta weights. If NULL (default) the matrix of theta weights is automatically estimated by calling the |
boot.iter |
the number of bootstrapped iterations to perform |
ci.low |
low end of the confidence interval (default = 0.025) |
ci.high |
high end of the confidence interval (default = 0.975) |
parallel |
run bootstraps in parallel (default = FALSE) |
n.cores |
number of cores to use when |
a four-column numeric matrix containing the point estimate for mean transmission distance, low and high bootstrapped confidence intervals, and the sample size up to each time step
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other transdist:
est.transdist()
,
est.transdist.bootstrap.ci()
,
est.transdist.temporal()
,
est.transdist.theta.weights()
,
get.transdist.theta()
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2, gen.t.mean=7, gen.t.sd=2, tot.generations=8, min.cases=30, trans.kern.func=dist.func) a <- a[sample(1:nrow(a), 70),] # subsample a to 70 observations # Estimate change in mean transmission kernel over time with confidence intervals b <- est.transdist.temporal.bootstrap.ci(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10, mean.equals.sd=TRUE, boot.iter=10, ci.low=0.025, ci.high=0.975, n.cores=2) plot(b[,2], pch=19, col='grey', ylim=c(min(b[,1:3], na.rm=TRUE), max(b[,2:4], na.rm=TRUE)), xlab='Time step', ylab='Estimated mean of transmission kernel') abline(h=100, col='red', lty=2) axis(3, 1:nrow(b), b[,5]) low <- loess(b[,2] ~ as.vector(1:nrow(b)), span=1) low <- predict(low, newdata=data.frame(as.vector(1:nrow(b)))) lines(low, lwd=3, col='blue') for(i in 3:4) { low <- loess(b[,i] ~ as.vector(1:nrow(b)), span=1) low <- predict(low, newdata=data.frame(as.vector(1:nrow(b)))) lines(low, lty=2, lwd=3, col='blue') }
set.seed(123) # Exponentially distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=2, gen.t.mean=7, gen.t.sd=2, tot.generations=8, min.cases=30, trans.kern.func=dist.func) a <- a[sample(1:nrow(a), 70),] # subsample a to 70 observations # Estimate change in mean transmission kernel over time with confidence intervals b <- est.transdist.temporal.bootstrap.ci(epi.data=a, gen.t.mean=7, gen.t.sd=2, t1=0, max.sep=1e10, max.dist=1e10, n.transtree.reps=10, mean.equals.sd=TRUE, boot.iter=10, ci.low=0.025, ci.high=0.975, n.cores=2) plot(b[,2], pch=19, col='grey', ylim=c(min(b[,1:3], na.rm=TRUE), max(b[,2:4], na.rm=TRUE)), xlab='Time step', ylab='Estimated mean of transmission kernel') abline(h=100, col='red', lty=2) axis(3, 1:nrow(b), b[,5]) low <- loess(b[,2] ~ as.vector(1:nrow(b)), span=1) low <- predict(low, newdata=data.frame(as.vector(1:nrow(b)))) lines(low, lwd=3, col='blue') for(i in 3:4) { low <- loess(b[,i] ~ as.vector(1:nrow(b)), span=1) low <- predict(low, newdata=data.frame(as.vector(1:nrow(b)))) lines(low, lty=2, lwd=3, col='blue') }
This function estimates the weight of each theta value by performing a user defined number of replications with the get.transdist.theta
function. The weights
of each theta are calculated as the number of simulations in which a case at time t1
and t2
are separated by theta transmission events.
est.transdist.theta.weights(case.times, gen.t.mean, t.density, t1, n.rep = 100)
est.transdist.theta.weights(case.times, gen.t.mean, t.density, t1, n.rep = 100)
case.times |
a vector giving the occurrence time for each case |
gen.t.mean |
the mean generation time of the infecting pathogen |
t.density |
a vector giving the generation time density of the infecting pathogen |
t1 |
time step to begin simulation |
n.rep |
number of replications in the simulation (default = 100) |
a three-dimensional array containing the mean normalized theta weights estimated across all replications
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other transdist:
est.transdist()
,
est.transdist.bootstrap.ci()
,
est.transdist.temporal()
,
est.transdist.temporal.bootstrap.ci()
,
get.transdist.theta()
set.seed(1) gen.t.mean <- 7 gen.t.sd <- 2 t1 <- 0 # Normally distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=5, gen.t.mean=gen.t.mean, gen.t.sd=gen.t.sd, min.cases=5, tot.generations=3, trans.kern.func=dist.func) # Get case times a <- a[order(a[,3]),] case.times <- round(a[,3]) unique.times <- unique(case.times) ntimes <- length(unique.times) # Generation time distribution max.t <- round(max(unique.times) - t1) - 1 n.step <- round(max.t/gen.t.mean) gen <- rep(0, max.t*2) for (i in 1:n.step){gen <- gen + dnorm(1:(max.t*2), gen.t.mean*i, gen.t.sd*i)} gen[1] <- 0 # No instantaneous infections t.density <- gen/sum(gen) # Estimation of theta weights matrix b <- est.transdist.theta.weights(case.times=case.times, n.rep=3, gen.t.mean=gen.t.mean, t1=t1, t.density=t.density)
set.seed(1) gen.t.mean <- 7 gen.t.sd <- 2 t1 <- 0 # Normally distributed transmission kernel with mean and standard deviation = 100 dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Simulate epidemic a <- sim.epidemic(R=5, gen.t.mean=gen.t.mean, gen.t.sd=gen.t.sd, min.cases=5, tot.generations=3, trans.kern.func=dist.func) # Get case times a <- a[order(a[,3]),] case.times <- round(a[,3]) unique.times <- unique(case.times) ntimes <- length(unique.times) # Generation time distribution max.t <- round(max(unique.times) - t1) - 1 n.step <- round(max.t/gen.t.mean) gen <- rep(0, max.t*2) for (i in 1:n.step){gen <- gen + dnorm(1:(max.t*2), gen.t.mean*i, gen.t.sd*i)} gen[1] <- 0 # No instantaneous infections t.density <- gen/sum(gen) # Estimation of theta weights matrix b <- est.transdist.theta.weights(case.times=case.times, n.rep=3, gen.t.mean=gen.t.mean, t1=t1, t.density=t.density)
A function which takes the time of each case occurrence, the generation time distribution of the infecting pathogen, and the matrix of basic Wallinga-Teunis weights and estimates the probability that an infectee occurring time step j (columns) was infected by a case occurring at time i (rows).
est.wt.matrix(case.times, gen.t.dist, basic.wt.weights = NULL)
est.wt.matrix(case.times, gen.t.dist, basic.wt.weights = NULL)
case.times |
a vector giving the occurrence time for each case |
gen.t.dist |
a vector giving the generation time distribution for the infecting pathogen |
basic.wt.weights |
a matrix giving the basic normalized Wallinga-Teunis weights for each time step (output from the |
a numerical matrix with the number of columns and rows equal to the number of cases in the epidemic
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other est.wt:
est.transdist()
,
est.wt.matrix.weights()
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) a <- est.wt.matrix(case.times=case.times, gen.t.dist=t.density)
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) a <- est.wt.matrix(case.times=case.times, gen.t.dist=t.density)
A function called by est.wt.matrix
, which calculates the basic weights in the Wallinga-Teunis matrix given
the time of each case occurrence and the generation time distribution of the pathogen. Code adapted from the R0 package.
est.wt.matrix.weights(case.times, gen.t.dist)
est.wt.matrix.weights(case.times, gen.t.dist)
case.times |
a vector giving the occurrence time for each case |
gen.t.dist |
a vector giving the generation time distribution for the infecting pathogen |
a numerical matrix with the number of columns and rows equal to the number of time steps in the epidemic
John Giles, Justin Lessler, and Henrik Salje
Boelle P and Obadia T (2015). R0: Estimation of R0 and Real-Time Reproduction Number from Epidemics. R package version 1.2-6, https://CRAN.R-project.org/package=R0.
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other est.wt:
est.transdist()
,
est.wt.matrix()
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) a <- est.wt.matrix.weights(case.times=case.times, gen.t.dist=t.density)
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) a <- est.wt.matrix.weights(case.times=case.times, gen.t.dist=t.density)
A wrapper function of the Kcross function from the spatstat.explore package (Baddeley et al. 2016) that takes epidemiological data used by IDSpatialStats functions and calculates the cross type K-function based on user defined case type homology
get.cross.K(epi.data, type, hom, het = NULL, r = NULL, correction = "none")
get.cross.K(epi.data, type, hom, het = NULL, r = NULL, correction = "none")
epi.data |
a three-column numerical matrix that contains coordinates ( |
type |
an integer giving the column that contains information on case type. Must be an integer or a character |
hom |
a scalar or vector giving the homotypic case type(s). Equivalent to the 'j' point type used in the cross K function. Must be an integer or character |
het |
a scalar or vector giving the heterotypic case type(s). Equivalent to the 'i' point type used in the cross K function. The default is |
r |
a numeric vector giving the spatial distances |
correction |
type of edge correction to be applied (default set to 'none'). See the Kcross function in the spatstat.explore package for more details |
a data frame with a minimum of three columns giving the radius (r
), the theoretical value of the K function for a Poisson process (theo
), and value of the K function evaluated at radius r
. The column name gives the type of edge correction used
John Giles
Baddeley A, Rubak E, and Turner R. (2016). "Spatial Point Patterns: Methodology and Applications with R". CRC Press.
data(DengueSimR01) k <- get.cross.K(epi.data=DengueSimR01, type=5, hom=2, het=NULL, r=NULL, correction='border') plot(k[,2], type='l', col='red', lty=2, xlab='r', ylab='cross K function') lines(k$border)
data(DengueSimR01) k <- get.cross.K(epi.data=DengueSimR01, type=5, hom=2, het=NULL, r=NULL, correction='border') plot(k[,2], type='l', col='red', lty=2, xlab='r', ylab='cross K function') lines(k$border)
A wrapper function of the pcf function from the spatstat.explore package (Baddeley et al. 2016) that takes epidemiological data used by IDSpatialStats functions and calculates the cross type Pair Correlation Function based on user defined case type homology
get.cross.PCF(epi.data, type, hom, het = NULL, r = NULL, correction = "none")
get.cross.PCF(epi.data, type, hom, het = NULL, r = NULL, correction = "none")
epi.data |
a three-column numerical matrix that contains coordinates ( |
type |
an integer giving the column that contains information on case type. Must be an integer or a character |
hom |
a scalar or vector giving the homotypic case type(s). Equivalent to the 'j' point type used in the cross K function. Must be an integer or character |
het |
a scalar or vector giving the heterotypic case type(s). Equivalent to the 'i' point type used in the cross K function. The default is |
r |
a numeric vector giving the spatial distances |
correction |
type of edge correction to be applied (default set to 'none'). See the pcf function in the spatstat.explore package for more details |
a data frame with two columns giving the radius r
, the theoretical value of the Pair Correlation Function for a Poisson process (theo
), and value of the Pair Correlation Function pcf
John Giles
Baddeley A, Rubak E, and Turner R. (2016). "Spatial Point Patterns: Methodology and Applications with R". CRC Press.
data(DengueSimR01) g <- get.cross.PCF(epi.data=DengueSimR01, type=5, hom=2, het=NULL, r=NULL, correction='none') plot(g$pcf, type='l', xlab='r', ylab='cross PCF') abline(h=1, col='red', lty=2)
data(DengueSimR01) g <- get.cross.PCF(epi.data=DengueSimR01, type=5, hom=2, het=NULL, r=NULL, correction='none') plot(g$pcf, type='l', xlab='r', ylab='cross PCF') abline(h=1, col='red', lty=2)
get.pi
Generalized version of the get.pi
function that takes in an arbitrary function and
returns the probability that a point within a particular range of a point of interest shares the relationship
specified by the passed in function with that point.
get.pi(posmat, fun, r = 1, r.low = rep(0, length(r)), data.frame = TRUE)
get.pi(posmat, fun, r = 1, r.low = rep(0, length(r)), data.frame = TRUE)
posmat |
a matrix with columns x, y and any other named columns
columns needed by |
fun |
a function that takes in two rows of
Note that names from |
r |
the series of spatial distances (or there maximums) we are interested in |
r.low |
the low end of each range, 0 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi value for each distance range that we look at. Where:
Justin Lessler and Henrik Salje
Other get.pi:
get.pi.bootstrap()
,
get.pi.ci()
,
get.pi.permute()
,
get.pi.typed()
,
get.pi.typed.bootstrap()
,
get.pi.typed.permute()
Other spatialtau:
get.tau()
,
get.theta()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.pi<-get.pi(DengueSimR02,sero.type.func,r=r.max,r.low=r.min)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.pi<-get.pi(DengueSimR02,sero.type.func,r=r.max,r.low=r.min)
get.pi
values.Runs get.pi
on multiple bootstraps of the data. Is formulated
such that the relationships between
points and themselves will not be calculated.
get.pi.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 500, data.frame = TRUE )
get.pi.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 500, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to decide relationships |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range. 0 by default |
boot.iter |
the number of bootstrap iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi values for all the distances we looked at
In each bootstrap iteration N observations are drawn from the existing data with replacement. To avoid errors in inference resulting from the same observatin being compared with itself in the bootstrapped data set, original indices are perserved, and pairs of points in the bootstrapped dataset with the same original index are ignored.
Justin Lessler and Henrik Salje
Other get.pi:
get.pi()
,
get.pi.ci()
,
get.pi.permute()
,
get.pi.typed()
,
get.pi.typed.bootstrap()
,
get.pi.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.boot<-get.pi.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=100) pi.ci<-apply(pi.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, pi$pi , type="l") lines(r.mid, pi.ci[1,] , lty=2) lines(r.mid, pi.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.boot<-get.pi.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=100) pi.ci<-apply(pi.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, pi$pi , type="l") lines(r.mid, pi.ci[1,] , lty=2) lines(r.mid, pi.ci[2,] , lty=2)
get.pi
values.Wrapper to get.pi.bootstrap
that takes care of calculating the
confidence intervals based on the bootstrapped values..
get.pi.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
get.pi.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to decide relationships |
r |
the series of spatial distances wer are interested in |
r.low |
the low end of each range. 0 by default |
boot.iter |
the number of bootstrap iterations |
ci.low |
the low end of the ci...0.025 by default |
ci.high |
the high end of the ci...0.975 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
a matrix with a row for the high and low values and a column per distance
Justin Lessler
Other get.pi:
get.pi()
,
get.pi.bootstrap()
,
get.pi.permute()
,
get.pi.typed()
,
get.pi.typed.bootstrap()
,
get.pi.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.ci<-get.pi.ci(x,fun,r=r.max,r.low=r.min,boot.iter=100) plot(r.mid, pi$pi, type="l") lines(r.mid, pi.ci[,2] , lty=2) lines(r.mid, pi.ci[,3] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.ci<-get.pi.ci(x,fun,r=r.max,r.low=r.min,boot.iter=100) plot(r.mid, pi$pi, type="l") lines(r.mid, pi.ci[,2] , lty=2) lines(r.mid, pi.ci[,3] , lty=2)
get.pi
functionDoes permutations to calculate the null distribution of get pi if there were no spatial dependence. Randomly reassigns coordinates to each observation permutations times
get.pi.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
get.pi.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to evaluate |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi values for all the distances we looked at
Other get.pi:
get.pi()
,
get.pi.bootstrap()
,
get.pi.ci()
,
get.pi.typed()
,
get.pi.typed.bootstrap()
,
get.pi.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.null<-get.pi.permute(x,fun,r=r.max,r.low=r.min,permutations=100) null.ci<-apply(pi.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, pi$pi, type="l") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 pi<-get.pi(x,fun,r=r.max,r.low=r.min) pi.null<-get.pi.permute(x,fun,r=r.max,r.low=r.min,permutations=100) null.ci<-apply(pi.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, pi$pi, type="l") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
get.pi
for typed data.Version of the get.pi
function that is optimized for statically typed data. That is
data where we are interested in the probability of points within some distance of points of
typeA are of typeB.
get.pi.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), data.frame = TRUE )
get.pi.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances wer are interested in |
r.low |
the low end of each range....0 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.pi:
get.pi()
,
get.pi.bootstrap()
,
get.pi.ci()
,
get.pi.permute()
,
get.pi.typed.bootstrap()
,
get.pi.typed.permute()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.pi<-get.pi.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.pi<-get.pi.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min)
get.pi.typed
Bootstraps typed pi values. Makes sure distances between a sample and another draw of itself are left out
get.pi.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, data.frame = TRUE )
get.pi.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
boot.iter |
the number of bootstrap iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi values for all the distances we looked at
Other get.pi:
get.pi()
,
get.pi.bootstrap()
,
get.pi.ci()
,
get.pi.permute()
,
get.pi.typed()
,
get.pi.typed.permute()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.pi.bs<-get.pi.typed.bootstrap(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,boot.iter=100)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.pi.bs<-get.pi.typed.bootstrap(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,boot.iter=100)
Does permutations to calculate the null distribution of get pi if there were no spatial dependence. Randomly reassigns coordinates to each observation permutations times
get.pi.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
get.pi.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
pi values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.pi:
get.pi()
,
get.pi.bootstrap()
,
get.pi.ci()
,
get.pi.permute()
,
get.pi.typed()
,
get.pi.typed.bootstrap()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<75) tmp<-cbind(DengueSimR02,type=type) typed.pi<-get.pi.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min) typed.pi.type.null<-get.pi.typed.permute(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,permutations=100)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<75) tmp<-cbind(DengueSimR02,type=type) typed.pi<-get.pi.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min) typed.pi.type.null<-get.pi.typed.permute(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,permutations=100)
get.tau
returns the relative probability (or odds) that points at some distance from an index point share some relationship with that point versus the probability (or odds) any point shares that relationship with that point.
get.tau( posmat, fun, r = 1, r.low = rep(0, length(r)), comparison.type = "representative", data.frame = TRUE )
get.tau( posmat, fun, r = 1, r.low = rep(0, length(r)), comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix with columns x, y and any other named columns columns needed by fun |
fun |
a function that takes in two rows of posmat and returns:
Note that names from |
r |
the series of spatial distances (or there maximums) we are interested in |
r.low |
the low end of each range, 0 by default |
comparison.type |
what type of points are included in the comparison set.
|
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
The tau value for each distance we look at. If comparison.type
is "representative", this is:
tau = get.pi(posmat, fun, r, r.low)/get.pi(posmat,fun,infinity,0)
If comparison.type
is "independent", this is:
tau = get.theta(posmat, fun, r, r.low)/get.theta(posmat,fun,infinity,0)
Justin Lessler and Henrik Salje
Other get.tau:
get.tau.bootstrap()
,
get.tau.ci()
,
get.tau.permute()
,
get.tau.typed()
,
get.tau.typed.bootstrap()
,
get.tau.typed.permute()
Other spatialtau:
get.pi()
,
get.theta()
data(DengueSimR01) data(DengueSimR02) data(DengueSimRepresentative) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } geno.type.func<-function(a,b,tlimit=20){ if(a[4]==b[4]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.type.rep.func<-function(a,b,tlimit=20){ if(a[5]==1&b[5]==1&(abs(a[3]-b[3])<=tlimit)){rc=1} else{if(a[5]==1&b[5]==-999){rc=2}else{rc=3}} return(rc) } sero.tau.R01 <- get.tau(DengueSimR01, sero.type.func, r=r.max, r.low=r.min, comparison.type="independent") geno.tau.R01 <- get.tau(DengueSimR01, geno.type.func, r=r.max, r.low=r.min, comparison.type="independent") sero.tau.R02 <- get.tau(DengueSimR02, sero.type.func, r=r.max, r.low=r.min, comparison.type="independent") geno.tau.R02 <- get.tau(DengueSimR02, geno.type.func, r=r.max, r.low=r.min, comparison.type="independent") sero.tau.representative <- get.tau(DengueSimRepresentative, sero.type.rep.func, r=r.max, r.low=r.min, comparison.type="representative") ## R0 of 1 plot(r.mid,sero.tau.R01$tau,ylim=c(0.3,max(geno.tau.R01$tau)),log="y", cex.axis=1.25,col=rgb(t(col2rgb("blue")/255),alpha=0.6), xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l",las=1,cex.axis=0.75) abline(h=1,lty=2) abline(v=100,lty=1,lwd=2) lines(r.mid,geno.tau.R01$tau,pch=20,col=rgb(t(col2rgb("dark green")/255),alpha=0.6),lwd=1) lines(r.mid,sero.tau.representative$tau,pch=20,col=rgb(t(col2rgb("dark blue")/255),alpha=0.6),lty=2) legend("topright", legend=c("Genotype", "Serotype", "Serotype (representative population)", "Maximum transmission distance"), lwd=1,col=c("dark green","blue","blue","black"), lty=c(1,1,2,1),bty="n") ## R0 of 2 plot(r.mid,sero.tau.R02$tau,ylim=c(0.3,max(geno.tau.R02)),log="y", cex.axis=1.25,col=rgb(t(col2rgb("blue")/255),alpha=0.6), xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l",las=1,cex.axis=0.75) abline(h=1,lty=2) abline(v=100,lty=1,lwd=2) lines(r.mid,geno.tau.R02$tau,pch=20,col=rgb(t(col2rgb("dark green")/255),alpha=0.6),lwd=1) legend("topright", legend=c("Genotype", "Serotype", "Maximum transmission distance"), lwd=1,col=c("dark green","blue","black"),lty=1,bty="n")
data(DengueSimR01) data(DengueSimR02) data(DengueSimRepresentative) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } geno.type.func<-function(a,b,tlimit=20){ if(a[4]==b[4]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.type.rep.func<-function(a,b,tlimit=20){ if(a[5]==1&b[5]==1&(abs(a[3]-b[3])<=tlimit)){rc=1} else{if(a[5]==1&b[5]==-999){rc=2}else{rc=3}} return(rc) } sero.tau.R01 <- get.tau(DengueSimR01, sero.type.func, r=r.max, r.low=r.min, comparison.type="independent") geno.tau.R01 <- get.tau(DengueSimR01, geno.type.func, r=r.max, r.low=r.min, comparison.type="independent") sero.tau.R02 <- get.tau(DengueSimR02, sero.type.func, r=r.max, r.low=r.min, comparison.type="independent") geno.tau.R02 <- get.tau(DengueSimR02, geno.type.func, r=r.max, r.low=r.min, comparison.type="independent") sero.tau.representative <- get.tau(DengueSimRepresentative, sero.type.rep.func, r=r.max, r.low=r.min, comparison.type="representative") ## R0 of 1 plot(r.mid,sero.tau.R01$tau,ylim=c(0.3,max(geno.tau.R01$tau)),log="y", cex.axis=1.25,col=rgb(t(col2rgb("blue")/255),alpha=0.6), xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l",las=1,cex.axis=0.75) abline(h=1,lty=2) abline(v=100,lty=1,lwd=2) lines(r.mid,geno.tau.R01$tau,pch=20,col=rgb(t(col2rgb("dark green")/255),alpha=0.6),lwd=1) lines(r.mid,sero.tau.representative$tau,pch=20,col=rgb(t(col2rgb("dark blue")/255),alpha=0.6),lty=2) legend("topright", legend=c("Genotype", "Serotype", "Serotype (representative population)", "Maximum transmission distance"), lwd=1,col=c("dark green","blue","blue","black"), lty=c(1,1,2,1),bty="n") ## R0 of 2 plot(r.mid,sero.tau.R02$tau,ylim=c(0.3,max(geno.tau.R02)),log="y", cex.axis=1.25,col=rgb(t(col2rgb("blue")/255),alpha=0.6), xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l",las=1,cex.axis=0.75) abline(h=1,lty=2) abline(v=100,lty=1,lwd=2) lines(r.mid,geno.tau.R02$tau,pch=20,col=rgb(t(col2rgb("dark green")/255),alpha=0.6),lwd=1) legend("topright", legend=c("Genotype", "Serotype", "Maximum transmission distance"), lwd=1,col=c("dark green","blue","black"),lty=1,bty="n")
get.tau
values.Runs get.tau
on multiple bootstraps of the data. Is formulated
such that the relationship between points and themselves will not be
calculated
get.tau.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter, comparison.type = "representative", data.frame = TRUE )
get.tau.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter, comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix appropriate for input to |
fun |
a function appropriate as input to |
r |
the series of spatial distances wer are interested in |
r.low |
the low end of each range....0 by default |
boot.iter |
the number of bootstrap iterations |
comparison.type |
the comparison type to pass as input to |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
a matrix containing all bootstrapped values of tau for each distance interval
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.ci()
,
get.tau.permute()
,
get.tau.typed()
,
get.tau.typed.bootstrap()
,
get.tau.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau<-get.tau(x,fun,r=r.max,r.low=r.min) tau.boot<-get.tau.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=50) tau.ci<-apply(tau.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, tau$tau ,ylim=c(min(tau.ci),max(tau.ci)), type="l", log="y") lines(c(0,100),c(1,1), lty=3, col="grey") lines(r.mid, tau.ci[1,] , lty=2) lines(r.mid, tau.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau<-get.tau(x,fun,r=r.max,r.low=r.min) tau.boot<-get.tau.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=50) tau.ci<-apply(tau.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, tau$tau ,ylim=c(min(tau.ci),max(tau.ci)), type="l", log="y") lines(c(0,100),c(1,1), lty=3, col="grey") lines(r.mid, tau.ci[1,] , lty=2) lines(r.mid, tau.ci[2,] , lty=2)
get.tau
valuesWrapper to get.tau.bootstrap
that takes care of calulating
the confidence intervals based on the bootstrapped values
get.tau.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, comparison.type = "representative", ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
get.tau.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, comparison.type = "representative", ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
posmat |
a matrix appropriate for input to |
fun |
a function appropriate as input to |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
boot.iter |
the number of bootstrap iterations |
comparison.type |
the comparison type to pass to get.tau |
ci.low |
the low end of the ci...0.025 by default |
ci.high |
the high end of the ci...0.975 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
a data frame with the point estimate of tau and its low and high confidence interval at each distance
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.bootstrap()
,
get.tau.permute()
,
get.tau.typed()
,
get.tau.typed.bootstrap()
,
get.tau.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau <- get.tau.ci(x,fun,r=r.max,r.low=r.min,boot.iter=50) plot(r.mid, tau$pt.est, ylim=c(1/max(tau[,3:5]), max(tau[,3:5])), type="l", log="y", xlab="Distance", ylab="Tau") lines(r.mid, tau$ci.low , lty=2) lines(r.mid, tau$ci.high, lty=2) lines(c(0,100),c(1,1), lty=3, col="grey")
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau <- get.tau.ci(x,fun,r=r.max,r.low=r.min,boot.iter=50) plot(r.mid, tau$pt.est, ylim=c(1/max(tau[,3:5]), max(tau[,3:5])), type="l", log="y", xlab="Distance", ylab="Tau") lines(r.mid, tau$ci.low , lty=2) lines(r.mid, tau$ci.high, lty=2) lines(c(0,100),c(1,1), lty=3, col="grey")
get.tau
functionDoes permutations to calculate the null distribution of get pi if there were no spatial dependence. Randomly reassigns coordinates to each observation permutations times
get.tau.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, comparison.type = "representative", data.frame = TRUE )
get.tau.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix appropriate for input to |
fun |
a function appropriate for input to |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
comparison.type |
the comparison type to pass as input to |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
tau values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.bootstrap()
,
get.tau.ci()
,
get.tau.typed()
,
get.tau.typed.bootstrap()
,
get.tau.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau<-get.tau(x,fun,r=r.max,r.low=r.min,comparison.type = "independent") tau.null<-get.tau.permute(x,fun,r=r.max,r.low=r.min,permutations=50,comparison.type = "independent") null.ci<-apply(tau.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, tau$tau, ylim=c(1/max(tau$tau),max(tau$tau)), type="l", log="y") lines(c(0,100),c(1,1), lty=3, col="grey") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 tau<-get.tau(x,fun,r=r.max,r.low=r.min,comparison.type = "independent") tau.null<-get.tau.permute(x,fun,r=r.max,r.low=r.min,permutations=50,comparison.type = "independent") null.ci<-apply(tau.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, tau$tau, ylim=c(1/max(tau$tau),max(tau$tau)), type="l", log="y") lines(c(0,100),c(1,1), lty=3, col="grey") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
get.tau
for typed dataVersion of th e get.tau
function that is optimized for
statically typed data. That is data where we want the relationship between
points of type A and points of type B
get.tau.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), comparison.type = "representative", data.frame = TRUE )
get.tau.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
comparison.type |
what type of points are included in the comparison set.
|
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
data frame of tau values for all the distances
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.bootstrap()
,
get.tau.ci()
,
get.tau.permute()
,
get.tau.typed.bootstrap()
,
get.tau.typed.permute()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.tau<-get.tau.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,comparison.type = "independent") plot(r.mid,typed.tau$tau,log="y",cex.axis=1.25, xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l") abline(h=1,lty=2)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.tau<-get.tau.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,comparison.type = "independent") plot(r.mid,typed.tau$tau,log="y",cex.axis=1.25, xlab="Distance (m)",ylab="Tau",cex.main=0.9,lwd=2,type="l") abline(h=1,lty=2)
get.tau.typed
runs bootstrapping for get.tau.typed
get.tau.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, comparison.type = "representative", data.frame = TRUE )
get.tau.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
boot.iter |
the number of bootstrap iterations |
comparison.type |
what type of points are included in the comparison set.
|
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
tau values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.bootstrap()
,
get.tau.ci()
,
get.tau.permute()
,
get.tau.typed()
,
get.tau.typed.permute()
data(DengueSimulationR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 # Lets see if there's a difference in spatial dependence between those that occurred # late versus early in the outbreak type <- 2 - (DengueSimR02[,"time"] < 120) tmp <- cbind(DengueSimR02, type=type) typed.tau <- get.tau.typed(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, comparison.type = "independent") typed.tau.type.bs <- get.tau.typed.bootstrap(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, boot.iter=100, comparison.type = "independent") ci <- apply(typed.tau.type.bs[,-(1:2)], 1, quantile, probs=c(0.025,0.975)) plot(r.mid, typed.tau$tau, log="y", ylim=c(0.1,4), cex.axis=1.25, xlab="Distance (m)", ylab="Tau", cex.main=0.9, lwd=2, type="n") abline(h=1,lty=1) lines(r.mid,typed.tau$tau,pch=20,col=1,lwd=3) lines(r.mid, ci[1,] , lty=2) lines(r.mid, ci[2,] , lty=2)
data(DengueSimulationR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 # Lets see if there's a difference in spatial dependence between those that occurred # late versus early in the outbreak type <- 2 - (DengueSimR02[,"time"] < 120) tmp <- cbind(DengueSimR02, type=type) typed.tau <- get.tau.typed(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, comparison.type = "independent") typed.tau.type.bs <- get.tau.typed.bootstrap(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, boot.iter=100, comparison.type = "independent") ci <- apply(typed.tau.type.bs[,-(1:2)], 1, quantile, probs=c(0.025,0.975)) plot(r.mid, typed.tau$tau, log="y", ylim=c(0.1,4), cex.axis=1.25, xlab="Distance (m)", ylab="Tau", cex.main=0.9, lwd=2, type="n") abline(h=1,lty=1) lines(r.mid,typed.tau$tau,pch=20,col=1,lwd=3) lines(r.mid, ci[1,] , lty=2) lines(r.mid, ci[2,] , lty=2)
get.tau.typed
functionget the null distribution for the get.tau.typed
function
get.tau.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, comparison.type = "representative", data.frame = TRUE )
get.tau.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, comparison.type = "representative", data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
comparison.type |
what type of points are included in the comparison set.
|
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
a matrix with permutation tau values for each distance specified
Justin Lessler and Henrik Salje
Other get.tau:
get.tau()
,
get.tau.bootstrap()
,
get.tau.ci()
,
get.tau.permute()
,
get.tau.typed()
,
get.tau.typed.bootstrap()
data(DengueSimulationR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 #Lets see if there's a difference in spatial dependence by time case occurs type <- 2 - (DengueSimR02[,"time"] < 120) tmp <- cbind(DengueSimR02, type=type) typed.tau <- get.tau.typed(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, comparison.type = "independent") typed.tau.type.null<-get.tau.typed.permute(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, permutations=100, comparison.type = "independent") null.ci <- apply(typed.tau.type.null[,-(1:2)], 1, quantile, probs=c(0.025,0.975)) plot(r.mid, typed.tau$tau, ylim=c(0.3,4), log="y", cex.axis=1.25, xlab="Distance (m)", ylab="Tau", cex.main=0.9, lwd=2, type="n") abline(h=1,lty=1) lines(r.mid,typed.tau$tau,pch=20,col=1,lwd=3) lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
data(DengueSimulationR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) r.mid<-(r.max+r.min)/2 #Lets see if there's a difference in spatial dependence by time case occurs type <- 2 - (DengueSimR02[,"time"] < 120) tmp <- cbind(DengueSimR02, type=type) typed.tau <- get.tau.typed(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, comparison.type = "independent") typed.tau.type.null<-get.tau.typed.permute(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, permutations=100, comparison.type = "independent") null.ci <- apply(typed.tau.type.null[,-(1:2)], 1, quantile, probs=c(0.025,0.975)) plot(r.mid, typed.tau$tau, ylim=c(0.3,4), log="y", cex.axis=1.25, xlab="Distance (m)", ylab="Tau", cex.main=0.9, lwd=2, type="n") abline(h=1,lty=1) lines(r.mid,typed.tau$tau,pch=20,col=1,lwd=3) lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
get.theta
Generalized version of the get.theta
function that takes in an arbitrary function and
returns the odds that a point within a particular range of a point of interest shares the relationship
specified by the passed in function with that point.
get.theta(posmat, fun, r = 1, r.low = rep(0, length(r)), data.frame = TRUE)
get.theta(posmat, fun, r = 1, r.low = rep(0, length(r)), data.frame = TRUE)
posmat |
a matrix with columns x, y and any other named columns columns needed by fun |
fun |
a function that takes in two rows of posmat and returns:
Note that names from |
r |
the series of spatial distances (or there maximums) we are interested in |
r.low |
the low end of each range, 0 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta value for each distance range that we look at. Where:
Justin Lessler and Henrik Salje
Other get.theta:
get.theta.bootstrap()
,
get.theta.ci()
,
get.theta.permute()
,
get.theta.typed()
,
get.theta.typed.bootstrap()
,
get.theta.typed.permute()
Other spatialtau:
get.pi()
,
get.tau()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.theta<-get.theta(DengueSimR02,sero.type.func,r=r.max,r.low=r.min)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) sero.type.func<-function(a,b,tlimit=20){ if(a[5]==b[5]&(abs(a[3]-b[3])<=tlimit)){rc=1} else{rc=2} return(rc) } sero.theta<-get.theta(DengueSimR02,sero.type.func,r=r.max,r.low=r.min)
get.theta
values.Runs get.theta
on multiple bootstraps of the data. Is formulated
such that the relationships between
points and themselves will not be calculated.
get.theta.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 500, data.frame = TRUE )
get.theta.bootstrap( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 500, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to decide relationships |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range. 0 by default |
boot.iter |
the number of bootstrap iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta values for all the distances we looked at
In each bootstrap iteration N observations are drawn from the existing data with replacement. To avoid errors in inference resulting from the same observatin being compared with itself in the bootstrapped data set, original indices are perserved, and pairs of points in the bootstrapped dataset with the same original index are ignored.
Justin Lessler and Henrik Salje
Other get.theta:
get.theta()
,
get.theta.ci()
,
get.theta.permute()
,
get.theta.typed()
,
get.theta.typed.bootstrap()
,
get.theta.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.boot<-get.theta.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=100) theta.ci<-apply(theta.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, theta$theta , type="l") lines(r.mid, theta.ci[1,] , lty=2) lines(r.mid, theta.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.boot<-get.theta.bootstrap(x,fun,r=r.max,r.low=r.min,boot.iter=100) theta.ci<-apply(theta.boot[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, theta$theta , type="l") lines(r.mid, theta.ci[1,] , lty=2) lines(r.mid, theta.ci[2,] , lty=2)
get.theta
values.Wrapper to get.theta.bootstrap
that takes care of calculating the
confience intervals based on the bootstrapped values.
get.theta.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
get.theta.ci( posmat, fun, r = 1, r.low = rep(0, length(r)), boot.iter = 1000, ci.low = 0.025, ci.high = 0.975, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to decide relationships |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range. 0 by default |
boot.iter |
the number of bootstrap iterations |
ci.low |
the low end of the ci...0.025 by default |
ci.high |
the high end of the ci...0.975 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
a matrix with a row for the high and low values and a column per distance
Justin Lessler
Other get.theta:
get.theta()
,
get.theta.bootstrap()
,
get.theta.permute()
,
get.theta.typed()
,
get.theta.typed.bootstrap()
,
get.theta.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.ci<-get.theta.ci(x,fun,r=r.max,r.low=r.min,boot.iter=100) plot(r.mid, theta$theta, type="l") lines(r.mid, theta.ci[,2] , lty=2) lines(r.mid, theta.ci[,3] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.ci<-get.theta.ci(x,fun,r=r.max,r.low=r.min,boot.iter=100) plot(r.mid, theta$theta, type="l") lines(r.mid, theta.ci[,2] , lty=2) lines(r.mid, theta.ci[,3] , lty=2)
get.theta
functionDoes permutations to calculate the null distribution of get theta if there were no spatial dependence. Randomly reassigns coordinates to each observation permutations times
get.theta.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
get.theta.permute( posmat, fun, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
fun |
the function to evaluate |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta values for all the distances we looked at
Other get.theta:
get.theta()
,
get.theta.bootstrap()
,
get.theta.ci()
,
get.theta.typed()
,
get.theta.typed.bootstrap()
,
get.theta.typed.permute()
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.null<-get.theta.permute(x,fun,r=r.max,r.low=r.min,permutations=100) null.ci<-apply(theta.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, theta$theta , type="l") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
#compare normally distributed with uniform points x<-cbind(1,runif(100,-100,100), runif(100,-100,100)) x<-rbind(x, cbind(2,rnorm(100,0,20), rnorm(100,0,20))) colnames(x) <- c("type","x","y") fun<-function(a,b) { if(a[1]!=2) return(3) if (b[1]==2) return(1) return(2) } r.max<-seq(10,100,10) r.min<-seq(0,90,10) r.mid <- (r.max+r.min)/2 theta<-get.theta(x,fun,r=r.max,r.low=r.min) theta.null<-get.theta.permute(x,fun,r=r.max,r.low=r.min,permutations=100) null.ci<-apply(theta.null[,-(1:2)],1,quantile,probs=c(0.25,0.75)) plot(r.mid, theta$theta , type="l") lines(r.mid, null.ci[1,] , lty=2) lines(r.mid, null.ci[2,] , lty=2)
get.theta
for typed data.Version of the get.theta
function that is optimized for statically typed data. That is
data where we are interested in the odds that points within some distance of points of
typeA are of typeB.
get.theta.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), data.frame = TRUE )
get.theta.typed( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances wer are interested in |
r.low |
the low end of each range....0 by default |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.theta:
get.theta()
,
get.theta.bootstrap()
,
get.theta.ci()
,
get.theta.permute()
,
get.theta.typed.bootstrap()
,
get.theta.typed.permute()
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.theta.R01<-get.theta.typed(tmp,typeA=2,typeB=2,r=r.max,r.low=r.min)
data(DengueSimR02) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR02[,"time"]<120) tmp<-cbind(DengueSimR02,type=type) typed.theta.R01<-get.theta.typed(tmp,typeA=2,typeB=2,r=r.max,r.low=r.min)
get.theta.typed
Bootstraps typed pi values. Makes sure distances between a sample and another draw of itself are left out
get.theta.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, data.frame = TRUE )
get.theta.typed.bootstrap( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), boot.iter, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
boot.iter |
the number of bootstrap iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta values for all the distances we looked at
Other get.theta:
get.theta()
,
get.theta.bootstrap()
,
get.theta.ci()
,
get.theta.permute()
,
get.theta.typed()
,
get.theta.typed.permute()
data(DengueSimR01) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR01[,"time"]<75) tmp<-cbind(DengueSimR01,type=type) typed.theta.bs<-get.theta.typed.bootstrap(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,boot.iter=100)
data(DengueSimR01) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR01[,"time"]<75) tmp<-cbind(DengueSimR01,type=type) typed.theta.bs<-get.theta.typed.bootstrap(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min,boot.iter=100)
Does permutations to calculate the null distribution of get theta if there were no spatial dependence. Randomly reassigns coordinates to each observation permutations times
get.theta.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
get.theta.typed.permute( posmat, typeA = -1, typeB = -1, r = 1, r.low = rep(0, length(r)), permutations, data.frame = TRUE )
posmat |
a matrix with columns type, x and y |
typeA |
the "from" type that we are interested in, -1 is wildcard |
typeB |
the "to" type that we are interested i, -1 is wildcard |
r |
the series of spatial distances we are interested in |
r.low |
the low end of each range....0 by default |
permutations |
the number of permute iterations |
data.frame |
logical indicating whether to return results as a data frame (default = TRUE) |
theta values for all the distances we looked at
Justin Lessler and Henrik Salje
Other get.theta:
get.theta()
,
get.theta.bootstrap()
,
get.theta.ci()
,
get.theta.permute()
,
get.theta.typed()
,
get.theta.typed.bootstrap()
data(DengueSimR01) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR01[,"time"]<75) tmp<-cbind(DengueSimR01,type=type) typed.theta.R01<-get.theta.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min) typed.theta.type.null<-get.theta.typed.permute(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, permutations=100)
data(DengueSimR01) r.max<-seq(20,1000,20) r.min<-seq(0,980,20) #Lets see if there's a difference in spatial dependence by time case occurs type<-2-(DengueSimR01[,"time"]<75) tmp<-cbind(DengueSimR01,type=type) typed.theta.R01<-get.theta.typed(tmp,typeA=1,typeB=2,r=r.max,r.low=r.min) typed.theta.type.null<-get.theta.typed.permute(tmp, typeA=1, typeB=2, r=r.max, r.low=r.min, permutations=100)
This function estimates the weights of each theta (number of transmission events separating cases at two time points). A randomized transmission tree is drawn and the number of transmission events separating cases at two time points is calculated based on probabilies found in the Wallinga-Teunis matrix.
get.transdist.theta( wal.teun.mat, cases, gen.t.mean, max.sep, ret.theta.mat = FALSE )
get.transdist.theta( wal.teun.mat, cases, gen.t.mean, max.sep, ret.theta.mat = FALSE )
wal.teun.mat |
a Wallinga-Teunis matrix produced by the |
cases |
a vector of case times for each case |
gen.t.mean |
the mean generation time of the infecting pathogen |
max.sep |
maximum number of transmission events allowed between two cases |
ret.theta.mat |
logical value which returns the matrix of estimated theta values (default = FALSE) |
a three-dimensional array containing normalized theta weights. Columns and rows represent unique case times. The third dimension is the number of transmission events between two cases.
John Giles, Justin Lessler, and Henrik Salje
Salje H, Cummings DAT and Lessler J (2016). “Estimating infectious disease transmission distances using the overall distribution of cases.” Epidemics, 17, pp. 10–18. ISSN 1755-4365, doi: 10.1016/j.epidem.2016.10.001.
Other transdist:
est.transdist()
,
est.transdist.bootstrap.ci()
,
est.transdist.temporal()
,
est.transdist.temporal.bootstrap.ci()
,
est.transdist.theta.weights()
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) gen.time <- 2 # mean generation time wt <- est.wt.matrix(case.times=case.times, gen.t.dist=t.density) ngen <- round((max(case.times) - min(case.times)) / gen.time) + 1 # Number of generations a <- get.transdist.theta(wal.teun.mat=wt, cases=case.times, gen.t.mean=gen.time, max.sep=ngen*2)
case.times <- c(1,2,2,3,3) gen <- c(0, 2/3, 1/3, 0, 0) t.density <- gen/sum(gen) gen.time <- 2 # mean generation time wt <- est.wt.matrix(case.times=case.times, gen.t.dist=t.density) ngen <- round((max(case.times) - min(case.times)) / gen.time) + 1 # Number of generations a <- get.transdist.theta(wal.teun.mat=wt, cases=case.times, gen.t.mean=gen.time, max.sep=ngen*2)
A function which simulates the spatial spread of infections through time given the reproductive number (R
),
a function describing the spatial transmission kernel (trans.kern.func
), and the mean and standard deviation
of the generation time distribution (gen.t.mean
and gen.t.sd
) for the infecting pathogen. The function returns
the location (x
, y
) and time (t
) for each case of infection in the simulation.
sim.epidemic( R, gen.t.mean, gen.t.sd, trans.kern.func, tot.generations = 10, min.cases = 0, max.try = 1000 )
sim.epidemic( R, gen.t.mean, gen.t.sd, trans.kern.func, tot.generations = 10, min.cases = 0, max.try = 1000 )
R |
a scalar or a vector of length |
gen.t.mean |
mean of generation time |
gen.t.sd |
standard deviation of the generation time (assumed to be normally distributed) |
trans.kern.func |
a function for the transmission kernel that takes |
tot.generations |
the total number of generations in the epidemic, where the index case (x,y,t = [0,0,0]) is considered generation zero (default = 10) |
min.cases |
the minimum number of cases in the epidemic (default = 0) |
max.try |
maximum number of tries to acheive the minimum number of cases (default = 1000) |
a numerical matrix with three columns giving the coordinates x
and y
, and time t
of simulated cases
John Giles, Justin Lessler, and Henrik Salje
set.seed(1) dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Exponential transmission kernel with mean = sd = 100 # Simulate epidemic with constant R value a <- sim.epidemic(R=1.5, gen.t.mean=7, gen.t.sd=2, tot.generations=15, min.cases=100, trans.kern.func=dist.func) sim.plot(a) # Simulate an epidemic with variable R value r1 <- 2 r2 <- 0.25 tg <- 25 R <- seq(r1, r2, (r2 -r1)/(tg - 1)) b <- sim.epidemic(R=R, gen.t.mean=7, gen.t.sd=2, tot.generations=tg, min.cases=100, trans.kern.func=dist.func) sim.plot(b)
set.seed(1) dist.func <- alist(n=1, a=1/100, rexp(n, a)) # Exponential transmission kernel with mean = sd = 100 # Simulate epidemic with constant R value a <- sim.epidemic(R=1.5, gen.t.mean=7, gen.t.sd=2, tot.generations=15, min.cases=100, trans.kern.func=dist.func) sim.plot(a) # Simulate an epidemic with variable R value r1 <- 2 r2 <- 0.25 tg <- 25 R <- seq(r1, r2, (r2 -r1)/(tg - 1)) b <- sim.epidemic(R=R, gen.t.mean=7, gen.t.sd=2, tot.generations=tg, min.cases=100, trans.kern.func=dist.func) sim.plot(b)
A simple visualization function which plots the location of the index case and the spatial distribution of subsequent cases, and the epidemic curve showing the case count over time.
sim.plot(sim)
sim.plot(sim)
sim |
a three-column matrix object produced by the |
A two-panel plotted object
John Giles, Justin Lessler, and Henrik Salje