get_leadvars_LM selects a subset of predictors as "leading variables" based on predictor-response associations in linear models.

get_leadvars_LM(y, X, method = c("topk", "fixedcorthresh", "perccorthresh"), param)

Arguments

y

Response. A numeric vector.

X

Predictor matrix. A base matrix or any object that as.matrix() can coerce to one. Missing values are not allowed.

method

Screening rule, one of c("topk", "fixedcorthresh", "perccorthresh"). The association measure is correlation. "topk" keeps the \(k\) predictors with the largest association values; "fixedcorthresh" keeps predictors whose association is greater than or equal to a specified threshold; "perccorthresh" keeps predictors whose association is at least a given percentage of the largest association.

param

Tuning parameter for method. If "topk", supply an integer \(k\) (keep the top \(k\)). If "fixedcorthresh", supply a numeric threshold (keep predictors with association \(\ge\) threshold). If "perccorthresh", supply a percentage in \((0,100]\) (keep predictors with association \(\ge\) that percent of the highest association).
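
The three screening rules described above can be illustrated with a minimal sketch. The helper screen_by_cor(), its method labels, and its value argument are hypothetical and are not part of the package. The sketch also assumes that the association is the absolute Pearson correlation between each predictor and the response, which may differ from what get_leadvars_LM computes internally.

```r
# Hypothetical helper (NOT the package's implementation): illustrates the
# three screening rules, assuming association = |Pearson correlation|.
screen_by_cor <- function(y, X, method = c("topk", "fixed", "perc"), value) {
  method <- match.arg(method)
  X <- as.matrix(X)
  stopifnot(is.numeric(y), length(y) == nrow(X),
            !anyNA(X), !is.null(colnames(X)))

  # One association value per predictor column
  assoc <- abs(drop(cor(X, y)))
  names(assoc) <- colnames(X)

  switch(method,
    # keep the k predictors with the largest association
    topk  = names(sort(assoc, decreasing = TRUE))[seq_len(min(value, length(assoc)))],
    # keep predictors with association >= a fixed threshold
    fixed = names(assoc)[assoc >= value],
    # keep predictors with association >= value% of the largest association
    perc  = names(assoc)[assoc >= (value / 100) * max(assoc)]
  )
}

# Small deterministic toy data: "a" equals y, "c" is y plus an alternating term
X_toy <- cbind(a = c(1, 2, 3, 4, 5, 6),
               b = c(1, -1, 1, -1, 1, -1),
               c = c(2, 1, 4, 3, 6, 5))
y_toy <- c(1, 2, 3, 4, 5, 6)

screen_by_cor(y_toy, X_toy, "topk", 2)     # two strongest predictors
screen_by_cor(y_toy, X_toy, "fixed", 0.5)  # association >= 0.5
screen_by_cor(y_toy, X_toy, "perc", 90)    # within 90% of the largest
```

The top-k rule always returns a fixed number of predictors, whereas the two threshold rules return a variable number depending on how the associations are distributed.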

Value

A character vector containing the names of the leading variables.

Author

Nilotpal Sanyal <nsanyal@utep.edu>, Padmore N. Prempeh <pprempeh@albany.edu>

Examples

# Simulate continuous data
set.seed(123)
n <- 100
p <- 150
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("V", 1:p)
y <- X[,1] + 0.5 * X[,2] + rnorm(n)
# Select leading variables
leadvars <- get_leadvars_LM(y = y, X = X, method = "topk", param = list(k=2))
leadvars
#> [1] "V1"   "V136"
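
Because X must be coercible by as.matrix() and must not contain missing values, it can help to prepare the inputs before screening. The following is a minimal base R sketch; the data frame dat and its values are made up for illustration, and dropping incomplete rows is only one possible way to handle missing data.

```r
# Hypothetical data frame with one incomplete row
dat <- data.frame(y  = c(1.2, 0.7, 3.1, 2.4),
                  V1 = c(0.5, NA, 1.9, 1.1),
                  V2 = c(2.0, 1.4, 0.3, 0.9))

# Drop rows with any missing value (an assumption; imputation is an alternative)
dat <- dat[complete.cases(dat), ]

y <- dat$y
X <- as.matrix(dat[, c("V1", "V2")])  # numeric matrix, column names preserved

# Sanity checks before passing y and X to a screening function
stopifnot(is.numeric(y), is.matrix(X), !anyNA(X), length(y) == nrow(X))
```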