Screening Predictors As 'Leading Variables' By Evaluating Predictor-Response Associations In Linear Models — get_leadvars_LM • S3VS

get_leadvars_LM screens the predictors and returns a subset of them as "leading variables", based on predictor-response associations in linear models.

get_leadvars_LM(y, X, method = c("topk", "fixedcorthresh", "perccorthresh"), param)

Arguments

y: Response. A numeric vector.

X: Predictor matrix. A base matrix, or an object that as.matrix() can coerce to one. No missing values are allowed.

method: Screening rule, one of c("topk", "fixedcorthresh", "perccorthresh"). The association measure is correlation. "topk" keeps the \(k\) predictors with the largest association values; "fixedcorthresh" keeps predictors whose association is greater than or equal to a specified threshold; "perccorthresh" keeps predictors whose association is at least a given percentage of the highest association.

param: Tuning parameter for method. If "topk", supply an integer \(k\) (keep the top \(k\)); the example on this page passes it as list(k = 2). If "fixedcorthresh", supply a numeric threshold (keep predictors with association \(\ge\) threshold). If "perccorthresh", supply a percentage in \((0,100]\) (keep predictors with association \(\ge\) that percent of the highest association).
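
The three screening rules described above can be sketched in a few lines of base R. This is only an illustration, not the S3VS implementation: the helper screen_by_cor, its rule labels ("topk", "fixed", "percent") and its value argument are hypothetical, and it assumes the association measure is the absolute Pearson correlation between y and each column of X, which the text above does not state explicitly.

```r
# Hypothetical helper (not part of S3VS): illustrates the three screening rules.
# Assumption: association = absolute Pearson correlation of y with each column of X.
screen_by_cor <- function(y, X, rule = c("topk", "fixed", "percent"), value) {
  rule <- match.arg(rule)
  X <- as.matrix(X)
  stopifnot(is.numeric(y), length(y) == nrow(X),
            !anyNA(X), !is.null(colnames(X)))

  # One association value per predictor, named by column
  assoc <- abs(drop(cor(X, y)))
  names(assoc) <- colnames(X)

  switch(rule,
    # keep the `value` predictors with the largest association
    topk = names(sort(assoc, decreasing = TRUE))[seq_len(min(value, length(assoc)))],
    # keep predictors whose association is >= the fixed threshold
    fixed = names(assoc)[assoc >= value],
    # keep predictors whose association is >= value% of the largest association
    percent = names(assoc)[assoc >= (value / 100) * max(assoc)]
  )
}

# Small deterministic toy data: V1 equals y, V3 tracks y loosely, V2 alternates
X_toy <- cbind(V1 = c(1, 2, 3, 4, 5, 6),
               V2 = c(1, -1, 1, -1, 1, -1),
               V3 = c(2, 1, 4, 3, 6, 5))
y_toy <- c(1, 2, 3, 4, 5, 6)

screen_by_cor(y_toy, X_toy, rule = "topk", value = 2)
screen_by_cor(y_toy, X_toy, rule = "fixed", value = 0.5)
screen_by_cor(y_toy, X_toy, rule = "percent", value = 90)
```

Note that the top-k rule returns names ordered by decreasing association, while the two threshold rules in this sketch return them in column order; the ordering used by get_leadvars_LM itself may differ.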

Value

A character vector containing the names of the leading variables.

Author

Nilotpal Sanyal <nsanyal@utep.edu>, Padmore N. Prempeh <pprempeh@albany.edu>

Examples

# Simulate continuous data
set.seed(123)
n <- 100
p <- 150
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("V", 1:p)
y <- X[,1] + 0.5 * X[,2] + rnorm(n)
# Select leading variables
leadvars <- get_leadvars_LM(y = y, X = X, method = "topk", param = list(k=2))
leadvars
#> [1] "V1"   "V136"
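
The printed result contains V136 rather than V2, even though V136 does not enter the simulated model. With 150 predictors and only 100 observations, a pure-noise column can show a larger sample correlation with y than a weak true signal. One way to inspect this is to rank the marginal correlations directly. The snippet below is a sketch that repeats the simulation so it runs on its own, and it assumes the screening is based on absolute Pearson correlation; top_assoc is an illustrative name.

```r
# Re-create the simulated data from the example above
set.seed(123)
n <- 100
p <- 150
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("V", 1:p)
y <- X[, 1] + 0.5 * X[, 2] + rnorm(n)

# Assumption: association = absolute Pearson correlation with y
assoc <- abs(drop(cor(X, y)))
names(assoc) <- colnames(X)

# Five predictors with the largest marginal association, largest first
top_assoc <- head(sort(assoc, decreasing = TRUE), 5)
top_assoc

# Association of the two predictors that actually generate y
assoc[c("V1", "V2")]
```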