Bayesian spatial-temporal generalized linear models with group-specific terms via Stan

Bayesian inference for stap-glms with group-specific coefficients that have unknown covariance matrices with flexible priors.

stapdnd_glmer(formula, family = gaussian(), subject_data = NULL,
  distance_data = NULL, time_data = NULL, subject_ID = NULL,
  group_ID = NULL, max_distance = NULL, max_time = NULL, weights,
  offset, contrasts = NULL, ..., prior = normal(),
  prior_intercept = normal(), prior_stap = normal(),
  prior_theta = log_normal(location = 1L, scale = 1L),
  prior_aux = exponential(), prior_covariance = decov(),
  adapt_delta = NULL)

Arguments

formula

Same as for glmer. Note that in-formula transformations will not be passed to the final design matrix. Covariates with "scale" in their name are not advised, because this text is parsed for in the final model fit.

family

Same as for glmer, except limited to gaussian, binomial and poisson.

subject_data

A data.frame that contains data specific to the subject or subjects on whom the outcome is measured. Must contain one column holding the subject_ID on which to join distance_data and time_data.

distance_data

A data.frame with at least three columns: (1) an id_key, (2) the sap/tap/stap features, and (3) the distances between the subject with the given id and the built environment feature in column (2). The distance column must be the only column of type "double", and the sap/tap/stap features must be specified in the data.frame exactly as they are in the formula.
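The layout described above can be checked before fitting. The following is a minimal base R sketch, not part of the package: the column names (subj_ID, class, dist), the values, and the formula are hypothetical and chosen only to illustrate the two stated requirements.

```r
# Hypothetical distance data: one row per subject/feature pair.
distance_data <- data.frame(
  subj_ID = c(1L, 1L, 2L, 3L),          # integer id key
  class   = rep("Coffee_Shop", 4),      # feature label, spelled as in the formula
  dist    = c(0.4, 2.1, 1.3, 0.9),      # distances, the only double column
  stringsAsFactors = FALSE
)

# Hypothetical model formula; all.vars() only parses it, so stap() need not exist here.
f <- y ~ sex + stap(Coffee_Shop)

# Requirement 1: exactly one column is stored as a double.
double_cols <- names(distance_data)[vapply(distance_data, is.double, logical(1))]

# Requirement 2: every feature label appears verbatim among the formula's variables.
features_ok <- all(unique(distance_data$class) %in% all.vars(f))

stopifnot(identical(double_cols, "dist"), features_ok)
```

Storing the id key as an integer (note the L suffix) rather than a plain numeric keeps it from being counted as a second double column.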

time_data

Same as distance_data, except with the time that the subject has been exposed to the built environment feature instead of the distance.

subject_ID

name of column to join on between subject_data and bef_data

group_ID

name of column to join on between subject_data and bef_data that uniquely identifies the groups

max_distance

the upper bound on any and all distances included in the model

max_time

the upper bound on any and all times included in the model

weights, offset

Same as glm.

contrasts

Same as glm, but rarely specified.

...

For stap_glmer, further arguments passed to sampling (e.g. iter, chains, cores, etc.). For stap_lmer ... should also contain all relevant arguments to pass to stap_glmer (except family).

prior

The prior distribution for the regression coefficients. prior should be a call to one of the various functions provided by rstap for specifying priors. The subset of these functions that can be used for the prior on the coefficients can be grouped into several "families":

Family	Functions
Student t family	`normal`, `student_t`, `cauchy`
Hierarchical shrinkage family	`hs`, `hs_plus`
Laplace family	`laplace`, `lasso`
Product normal family	`product_normal`

See the priors help page for details on the families and how to specify the arguments for all of the functions in the table above. To omit a prior, i.e., to use a flat (improper) uniform prior, prior can be set to NULL, although this is rarely a good idea.

Note: If prior is from the Student t family or Laplace family, and if the autoscale argument to the function used to specify the prior (e.g. normal) is left at its default and recommended value of TRUE, then the default or user-specified prior scale(s) may be adjusted internally based on the scales of the predictors. See the priors help page and the Prior Distributions vignette for details on the rescaling and the prior_summary function for a summary of the priors used for a particular model.

prior_intercept

The prior distribution for the intercept. prior_intercept can be a call to normal, student_t or cauchy. See the priors help page for details on these functions. To omit a prior on the intercept, i.e., to use a flat (improper) uniform prior, prior_intercept can be set to NULL.

Note: The prior distribution for the intercept is set so it applies to the value when all predictors are centered. If you prefer to specify a prior on the intercept without the predictors being auto-centered, then you have to omit the intercept from the formula and include a column of ones as a predictor, in which case some element of prior specifies the prior on it, rather than prior_intercept. Regardless of how prior_intercept is specified, the reported estimates of the intercept always correspond to a parameterization without centered predictors (i.e., same as in glm).
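The relationship between the centered and uncentered intercepts in the note above can be illustrated with ordinary least squares. This is a base R sketch on simulated data, not the package's internal code; it only shows why an intercept defined at centered predictors can be mapped back to the usual glm-style parameterization.

```r
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)       # simulated predictor, far from zero
y <- 3 + 0.5 * x + rnorm(50)            # simulated outcome

fit_raw      <- lm(y ~ x)               # uncentered parameterization
fit_centered <- lm(y ~ I(x - mean(x)))  # predictor centered at its mean

b0_raw      <- unname(coef(fit_raw)[1])
b0_centered <- unname(coef(fit_centered)[1])
slope       <- unname(coef(fit_raw)[2])

# With a centered predictor the intercept is the expected outcome at the
# average predictor value, which for least squares is mean(y).
centered_is_mean <- isTRUE(all.equal(b0_centered, mean(y)))

# Shifting back by slope * mean(x) recovers the uncentered intercept.
shift_recovers_raw <- isTRUE(all.equal(b0_raw, b0_centered - slope * mean(x)))

stopifnot(centered_is_mean, shift_recovers_raw)
```

In practice this means a location for prior_intercept is easiest to choose by asking what outcome is expected for an average subject, rather than for a subject whose predictors are all zero.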

prior_theta, prior_stap

Priors for the spatial scale parameters (prior_theta) and the spatial effect parameters (prior_stap), respectively.

prior_aux

The prior distribution for the "auxiliary" parameter (if applicable). The "auxiliary" parameter refers to a different parameter depending on the family. For Gaussian models prior_aux controls "sigma", the error standard deviation. For negative binomial models prior_aux controls "reciprocal_dispersion", which is similar to the "size" parameter of rnbinom: smaller values of "reciprocal_dispersion" correspond to greater dispersion. For gamma models prior_aux sets the prior on the "shape" parameter (see e.g., rgamma), and for inverse-Gaussian models it is the so-called "lambda" parameter (which is essentially the reciprocal of a scale parameter). Binomial and Poisson models do not have auxiliary parameters.

prior_aux can be a call to exponential to use an exponential distribution, or normal, student_t or cauchy, which results in a half-normal, half-t, or half-Cauchy prior. See priors for details on these functions. To omit a prior, i.e., to use a flat (improper) uniform prior, set prior_aux to NULL.
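To make the "half" priors concrete, the base R sketch below draws from an exponential, a half-normal and a half-Cauchy distribution. It assumes a location of zero, so that folding at zero and truncating at zero coincide, and the rate and scale values are arbitrary illustrations rather than package defaults.

```r
set.seed(42)
n <- 1000

draws_exponential <- rexp(n, rate = 1)                         # exponential prior
draws_half_normal <- abs(rnorm(n, mean = 0, sd = 2))           # half-normal, scale 2
draws_half_cauchy <- abs(rcauchy(n, location = 0, scale = 2))  # half-Cauchy, scale 2

# Density of the half-normal on [0, Inf): twice the normal density.
half_normal_density <- function(s, scale = 2) 2 * dnorm(s, mean = 0, sd = scale)

# A proper prior integrates to one over the positive half-line.
total_mass <- integrate(half_normal_density, lower = 0, upper = Inf)$value

stopifnot(
  all(draws_exponential >= 0),
  all(draws_half_normal >= 0),
  all(draws_half_cauchy >= 0),
  abs(total_mass - 1) < 1e-6
)
```

All three choices keep the auxiliary parameter positive; they differ mainly in tail weight, with the half-Cauchy allowing much larger values than the half-normal at the same scale.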

prior_covariance

Cannot be NULL; see decov for more information about the default arguments.

adapt_delta

See the adapt_delta help page for details.

Value

A stapreg object is returned for stap_glmer, stap_lmer.

Details

The stap_glmer function is similar in syntax to glmer but rather than performing (restricted) maximum likelihood estimation of generalized linear models, Bayesian estimation is performed via MCMC. The Bayesian model adds priors on the regression coefficients (in the same way as stap_glm) and priors on the terms of a decomposition of the covariance matrices of the group-specific parameters. See priors for more information about the priors.

The stap_lmer function is equivalent to stap_glmer with family = gaussian(link = "identity").

References

Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, Cambridge, UK.

Muth, C., Oravecz, Z., and Gabry, J. (2018). User-friendly Bayesian regression modeling: A tutorial with rstanarm and shinystan. The Quantitative Methods for Psychology, 14(2), 99-119. https://www.tqmp.org/RegularArticles/vol14-2/p099/p099.pdf

Examples

if (FALSE) {
## subset to only include id, class name and distance variables
distdata <- homog_longitudinal_bef_data[,c("subj_ID","measure_ID","class","dist")]
timedata <- homog_longitudinal_bef_data[,c("subj_ID","measure_ID","class","time")]
## distance or time column must be numeric
timedata$time <- as.numeric(timedata$time)
fit <- stap_glmer(y_bern ~ centered_income + sex + centered_age + stap(Coffee_Shop) + (1|subj_ID),
                  family = binomial(link='logit'),
                  subject_data = homog_longitudinal_subject_data,
                  distance_data = distdata,
                  time_data = timedata,
                  subject_ID = 'subj_ID',
                  group_ID = 'measure_ID',
                  prior_intercept = normal(location = 25, scale = 4, autoscale = FALSE),
                  prior = normal(location = 0, scale = 4, autoscale = FALSE),
                  prior_stap = normal(location = 0, scale = 4),
                  prior_theta = list(Coffee_Shop = list(spatial = log_normal(location = 1,
                                                                             scale = 1),
                                                         temporal = log_normal(location = 1,
                                                                               scale = 1))),
                  max_distance = 3, max_time = 50,
                  chains = 4, refresh = -1, verbose = FALSE,
                  iter = 1E3, cores = 1)
}