Find the optimal projection using various projection pursuit models.

Usage

PPO(X, y, model = "PPR", split = "gini", weights = NULL, ...)

Arguments

X

An n by d numeric matrix (preferred) or data frame.

y

A response vector of length n.

model

Model for projection pursuit.

  • "PPR" (default): projection pursuit regression from ppr. When y is a category label, it is expanded to K binary features.

  • "Log": logistic regression based on nnet.

  • "Rand": a random projection with entries generated from \(\{-1, 1\}\).

  The following models can only be used for classification, i.e. the split must be 'entropy' or 'gini'.

  • "LDA", "PDA", "Lr", "GINI", and "ENTROPY" from library PPtreeViz.

  • The following models are based on the Pursuit package:

    • "holes": Holes index

    • "cm": Central Mass index

    • "friedmantukey": Friedman Tukey index

    • "legendre": Legendre index

    • "laguerrefourier": Laguerre Fourier index

    • "hermite": Hermite index

    • "naturalhermite": Natural Hermite index

    • "kurtosismax": Maximum kurtosis index

    • "kurtosismin": Minimum kurtosis index

    • "moment": Moment index

    • "mf": MF index

    • "chi": Chi-square index
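
Two of the options above can be illustrated in base R. This is a minimal sketch and not the package's internal code: the variable names (`Y`, `theta`, `d`) are illustrative, and the use of `model.matrix()` for the K-column expansion is an assumption about one reasonable way to do it.

```r
# Base-R sketch only; names and approach are illustrative assumptions.
set.seed(1)

# (1) Expand a categorical response into K binary (0/1) columns,
#     as described for model = "PPR" with a class-label y.
y <- factor(c("a", "b", "c", "a", "b"))
Y <- model.matrix(~ y - 1)      # n x K indicator matrix, one column per class
colnames(Y) <- levels(y)

# (2) Draw a random direction with entries in {-1, 1},
#     as described for model = "Rand".
d <- 7                          # number of columns in X (assumed)
theta <- sample(c(-1, 1), d, replace = TRUE)
```

Each row of `Y` has exactly one entry equal to 1, and `theta` has one entry per column of X.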

split

The criterion used for splitting the variable: 'gini', the Gini impurity index (classification, default); 'entropy', information gain (classification); or 'mse', mean squared error (regression).
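
For reference, the three quantities behind these criteria can be written in a few lines of base R. This is a sketch of the textbook definitions for a single node; the helper names (`gini`, `entropy`, `mse`) are hypothetical, the base-2 logarithm is an assumption, and the package may compute or combine them differently (information gain, for instance, is the reduction in entropy achieved by a split).

```r
# Textbook node-level definitions; helper names are hypothetical.
gini <- function(y) {
  p <- table(y) / length(y)     # class proportions
  1 - sum(p^2)
}

entropy <- function(y) {
  p <- table(y) / length(y)
  p <- p[p > 0]                 # avoid log(0)
  -sum(p * log2(p))             # base-2 log is an assumption
}

mse <- function(y) mean((y - mean(y))^2)

labels <- c("a", "a", "b", "b")
gini(labels)      # 0.5
entropy(labels)   # 1
mse(c(1, 2, 3))   # 2/3
```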

weights

Vector of non-negative observational weights; fractional weights are allowed (default NULL).

...

Optional parameters to be passed to the low-level function.

Value

Optimal projection direction.
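
As the examples on this page show, the returned vector has one entry per column of X. A typical use of such a direction is to project each observation onto it and then split along the resulting one-dimensional values. The sketch below uses toy data and a hand-picked `theta` as a stand-in for the output of PPO; the median threshold is only an illustration, not the package's split rule.

```r
# Toy illustration; theta is a hypothetical stand-in for PPO() output.
set.seed(42)
X <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)   # n = 20 observations, d = 3

theta <- c(0.6, -0.8, 0)       # direction of length d

z <- as.vector(X %*% theta)    # one projected value per observation

# Split observations along the projection (median used for illustration).
left <- z <= median(z)
```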

References

Friedman, J. H., & Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistical Association, 76(376), 817-823.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge.

Lee, Y. D., Cook, D., Park, J. W., & Lee, E. K. (2013). PPtree: Projection pursuit classification tree. Electronic Journal of Statistics, 7, 1369-1386.

Cook, D., Buja, A., Lee, E. K., & Wickham, H. (2008). Grand tours, projection pursuit guided tours, and manual controls. In Handbook of data visualization (pp. 295-314). Springer, Berlin, Heidelberg.

Examples

# classification
data(seeds)
(PP <- PPO(seeds[, 1:7], seeds[, 8], model = "Log", split = "entropy"))
#> [1]  -2.309383   2.847194  31.625381  35.243198   3.104818  -1.524764 -36.500103
(PP <- PPO(seeds[, 1:7], seeds[, 8], model = "PPR", split = "entropy"))
#> [1] -0.04663995 -0.01700724 -0.92771564 -0.23227466  0.20425324  0.01478790
#> [7]  0.20245882
(PP <- PPO(seeds[, 1:7], seeds[, 8], model = "LDA", split = "entropy"))
#> [1] -0.18579584 -0.38830262  0.79096768 -0.23851549 -0.33785977  0.02884637
#> [7] -0.13114931

# regression
data(body_fat)
(PP <- PPO(body_fat[, 2:15], body_fat[, 1], model = "Log", split = "mse"))
#>  [1]  0.576428167 -0.660665448 -0.064453715  0.525631193 -0.472349313
#>  [6]  0.536074208 -0.123738306  0.473947709 -0.011705711  0.382371005
#> [11]  0.003063609 -0.412904055  0.583918320  0.593973852
(PP <- PPO(body_fat[, 2:15], body_fat[, 1], model = "Rand", split = "mse"))
#>  [1] -1  1 -1 -1  1  1  1  1  1  1  1  1 -1 -1
(PP <- PPO(body_fat[, 2:15], body_fat[, 1], model = "PPR", split = "mse"))
#>  [1] -0.973615195  0.007627492  0.018938292 -0.002055055  0.013834544
#>  [6]  0.030809259 -0.069084891  0.039733024 -0.039890401 -0.005993149
#> [11] -0.107760827 -0.075608376 -0.006239274  0.158635551