pred_S3VS performs prediction using predictors selected by S3VS in linear, generalized linear, and survival models.

pred_S3VS(y, X, family, surv_model = NULL, method)

Arguments

y

Response. If family = "normal", a numeric vector. If family = "binomial", a numeric/integer/logical vector with values in {0,1}. If family = "survival", a list with components time and status (1 = event, 0 = censored).

X

Predictor matrix. It should include the predictors selected by S3VS. It can be a base matrix or any object that as.matrix() can coerce to one. Missing values are not allowed.

family

Model family; one of c("normal","binomial","survival"). Determines which engine is called (pred_LM, pred_GLM, or pred_SURV).

surv_model

Character string specifying the survival model (family = "survival" only). Must be explicitly provided; there is no default. Allowed values are "COX" for proportional hazards models and "AFT" for accelerated failure time models.

method

Character string indicating the prediction method to use. Allowed values depend on family: for family = "normal" (function pred_S3VS_LM), the available options are "NLP", "LASSO", "SCAD", and "MCP"; for family = "binomial" (S3VS_GLM), the available options are "NLP" and "LASSO"; for family = "survival" (S3VS_SURV), the available options are "COXGLMNET" for surv_model = "COX" and .... for surv_model = "AFT". See Details for more information.
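
As a minimal sketch of the argument formats described above, the following base R code builds a response for each family, coerces a data frame of predictors to a matrix, and checks a method string against the options listed for the "normal" and "binomial" families. The data frame df, the lookup list allowed_methods, and the helper check_method() are hypothetical names used only for illustration and are not part of the package. The survival options are left out because the AFT choices are not listed here.

```r
set.seed(1)
n <- 20

# family = "normal": numeric vector
y_normal <- rnorm(n)

# family = "binomial": values in {0, 1}
y_binom <- rbinom(n, size = 1, prob = 0.5)

# family = "survival": list with components time and status
y_surv <- list(
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, size = 1, prob = 0.7)  # 1 = event, 0 = censored
)

# X: coerce a (hypothetical) data frame to a matrix and confirm no missing values
df <- data.frame(V1 = rnorm(n), V2 = rnorm(n))
X_mat <- as.matrix(df)
stopifnot(is.matrix(X_mat), !anyNA(X_mat))

# method: options as listed above (hypothetical lookup, not package code)
allowed_methods <- list(
  normal   = c("NLP", "LASSO", "SCAD", "MCP"),
  binomial = c("NLP", "LASSO")
)

# Hypothetical helper: stop early if the method is not listed for the family
check_method <- function(family, method) {
  opts <- allowed_methods[[family]]
  if (is.null(opts)) stop("family not covered by this sketch: ", family)
  if (!method %in% opts) {
    stop("method must be one of: ", paste(opts, collapse = ", "))
  }
  method
}

check_method("normal", "SCAD")
```

Checking inputs this way before calling pred_S3VS is optional; it only surfaces format problems earlier.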

Value

A list containing:

y.pred

Predicted response

coef

Coefficient estimates of the predictors used for prediction
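
The returned list can be inspected like any other R list. The sketch below is illustrative only: pred_standin is a stand-in built from an ordinary lm() fit so that the code runs without the package, and it is not real pred_S3VS output. The helper in_sample_mse() is a hypothetical name, and the sketch assumes y.pred can be coerced to a numeric vector aligned with y.

```r
# Hypothetical helper: in-sample mean squared error from a list with y.pred
in_sample_mse <- function(pred, y) {
  mean((y - as.numeric(pred$y.pred))^2)
}

# Stand-in list mimicking the documented components (NOT real pred_S3VS output)
set.seed(1)
x <- rnorm(30)
y_obs <- 2 * x + rnorm(30)
fit <- lm(y_obs ~ x)
pred_standin <- list(y.pred = fitted(fit), coef = coef(fit))

# Access the two documented components
head(pred_standin$y.pred)
pred_standin$coef

# Compare predictions with the observed response
in_sample_mse(pred_standin, y_obs)
```

With an actual result, the same accessors would be applied to the object returned by pred_S3VS.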

Author

Nilotpal Sanyal <nsanyal@utep.edu>, Padmore N. Prempeh <pprempeh@albany.edu>

Examples

# Simulate continuous data
set.seed(123)
n <- 100
p <- 150
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("V", 1:p)
y <- X[,1] + 0.5 * X[,2] + rnorm(n)
# Run S3VS for LM
res_lm <- S3VS(y = y, X = X, family = "normal",
               method_xy = "topk", param_xy = list(k=1),
               method_xx = "topk", param_xx = list(k=3),
               vsel_method = "LASSO", method_sel = "conservative", 
               method_rem = "conservative_begin", rem_regout = FALSE, 
               m = 100, nskip = 3, verbose = TRUE, seed = 123)
#> -------------
#> Iteration 1
#> -------------
#> input : V1 V119 V70 
#> selected : V1 
#> -------------
#> Iteration 2
#> -------------
#> input : V2 V43 V17 
#> selected : V2 
#> -------------
#> Iteration 3
#> -------------
#> input : V76 V3 V15 
#> selected :  
#> *** nskip= 1 *** 
#> -------------
#> Iteration 4
#> -------------
#> input : V14 V121 V11 
#> selected :  
#> *** nskip= 2 *** 
#> -------------
#> Iteration 5
#> -------------
#> input : V149 V70 V8 
#> selected :  
#> *** nskip= 3 *** 
#> =================================
#> Number of selected variables: 2
#> Time taken: 0.07 sec
#> =================================
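# Predict using the predictors selected by S3VS
# (drop = FALSE keeps X a matrix even if only one predictor is selected)
pred_lm <- pred_S3VS(y = y, X = X[, res_lm$selected, drop = FALSE], family = "normal", method = "LASSO")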