
This function fits a linear model and constructs simultaneous confidence bands (SCBs) for the mean outcome of the regression, evaluated on a fixed test-set design matrix, using a non-parametric bootstrap.

Usage

SCB_linear_outcome(
  df_fit,
  model,
  grid_df = NULL,
  n_boot = 1000,
  alpha = 0.05,
  grid_df_boot = NULL
)

Arguments

df_fit

A data frame containing the training design matrix used to fit the linear model. Columns may be numeric or factor.

model

A character string representing the formula for the linear model (e.g., "y ~ x1 + x2").

grid_df

A data frame specifying the covariate settings that define the mean outcome for which simultaneous confidence bands (SCBs) are constructed. Each row represents one covariate combination at which predictions and SCBs are evaluated. Column names should match variables in the fitted model, but grid_df may include only the subset of covariates of interest for the SCB (it is not required to cover all model variables). Default is NULL, in which case the SCB is constructed over the fitted values based on df_fit.

n_boot

Number of bootstrap samples used in the non-parametric bootstrap procedure to generate the empirical distribution. Default is 1000.

alpha

Significance level for the confidence band (e.g., 0.05 for 95% confidence). Default is 0.05.

grid_df_boot

An optional data frame specifying the input grid at which predictions are evaluated during bootstrap resampling. This allows SCBs to be constructed on a denser set of covariate values if desired. If NULL, grid_df is used. If grid_df is NULL, grid_df_boot is also set to NULL.
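Because each row of grid_df is one covariate combination, a grid over several covariates can be built with base R's expand.grid. The sketch below is only an illustration: the covariate names x1 and group are hypothetical, and it assumes that any factor levels in the grid should match those used in df_fit, as is usual when predicting from a fitted linear model.

```r
# Hypothetical covariates: x1 (numeric) and group (factor).
# These names are illustrative and must match variables in your own model.
grid_multi <- expand.grid(
  x1 = seq(-1, 1, length.out = 5),
  group = factor(c("a", "b"), levels = c("a", "b"))
)

# One row per combination: 5 values of x1 times 2 levels of group.
n_rows <- nrow(grid_multi)
str(grid_multi)
```

A data frame built this way could then be passed as grid_df, with a denser version of the same construction as a candidate for grid_df_boot.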

Value

A data frame with the following columns:

scb_low

Lower bound of the simultaneous confidence band.

Mean

Predicted mean response from the fitted model.

scb_up

Upper bound of the simultaneous confidence band.

...

All columns from grid_df, representing the prediction grid.
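As a rough illustration of working with a return value of this shape, the sketch below uses a small stand-in data frame with the documented columns. The numbers are made up for illustration and are not output of SCB_linear_outcome; the grid column x1 is likewise an assumption.

```r
# Stand-in for a returned data frame; all values are invented for illustration.
res <- data.frame(
  scb_low = c(-2.6, -1.5, -0.6),
  Mean    = c(-2.0, -1.0,  0.0),
  scb_up  = c(-1.4, -0.5,  0.6),
  x1      = c(-1, 0, 1)
)

# Sanity check: the band should bracket the predicted mean at every grid point.
brackets_mean <- all(res$scb_low <= res$Mean & res$Mean <= res$scb_up)

# Width of the band at each grid point.
band_width <- res$scb_up - res$scb_low

# Plot the predicted mean (solid) with the band limits (dashed).
plot(res$x1, res$Mean, type = "l",
     ylim = range(res$scb_low, res$scb_up),
     xlab = "x1", ylab = "Mean outcome")
lines(res$x1, res$scb_low, lty = 2)
lines(res$x1, res$scb_up, lty = 2)
```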

References

Ren, J., Telschow, F. J. E., & Schwartzman, A. (2024). Inverse set estimation and inversion of simultaneous confidence intervals. Journal of the Royal Statistical Society: Series C (Applied Statistics), 73(4), 1082–1109. doi:10.1093/jrsssc/qlae027

Examples

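# Simulate data from y = -1 + x1 + noise, where the noise has variance 2
set.seed(262)
x1 <- rnorm(100)
epsilon <- rnorm(100, 0, sqrt(2))
y <- -1 + x1 + epsilon
df <- data.frame(x1 = x1, y = y)
# Grid of x1 values at which the band is evaluated
grid <- data.frame(x1 = seq(-1, 1, length.out = 100))
model <- "y ~ x1"
# n_boot = 100 overrides the default of 1000 bootstrap samples
results <- SCB_linear_outcome(df_fit = df, model = model, grid_df = grid, n_boot = 100)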
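To make the idea of a non-parametric bootstrap band concrete, here is a minimal base R sketch on the same simulated data. It is not the package's implementation: it assumes one common construction, in which rows are resampled with replacement, the model is refitted, and the band half-width is the (1 - alpha) quantile of the maximum standardized deviation over the grid, scaled by the bootstrap standard deviation. The names boot_pred, max_dev and band are hypothetical.

```r
set.seed(262)
n <- 100
x1 <- rnorm(n)
y <- -1 + x1 + rnorm(n, 0, sqrt(2))
df <- data.frame(x1 = x1, y = y)
grid <- data.frame(x1 = seq(-1, 1, length.out = 100))
alpha <- 0.05
n_boot <- 200

# Point estimate of the mean outcome on the grid.
fit <- lm(y ~ x1, data = df)
mean_hat <- as.numeric(predict(fit, newdata = grid))

# Resample rows with replacement, refit, and predict on the same grid.
# Result is a matrix with one row per grid point and one column per replicate.
boot_pred <- replicate(n_boot, {
  idx <- sample.int(n, n, replace = TRUE)
  as.numeric(predict(lm(y ~ x1, data = df[idx, ]), newdata = grid))
})

# Bootstrap standard deviation at each grid point.
boot_sd <- apply(boot_pred, 1, sd)

# For each replicate, the largest standardized deviation across the grid.
max_dev <- apply(abs(boot_pred - mean_hat) / boot_sd, 2, max)

# Simultaneous critical value (assumed construction, see the note above).
q <- quantile(max_dev, 1 - alpha, names = FALSE)

band <- data.frame(
  scb_low = mean_hat - q * boot_sd,
  Mean    = mean_hat,
  scb_up  = mean_hat + q * boot_sd,
  x1      = grid$x1
)
head(band)
```

Taking the maximum over the grid before computing the quantile is what makes the band simultaneous rather than pointwise; a pointwise interval would use a separate quantile at each grid point and would generally be narrower.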