\name{Classifier}
\alias{Classifier}
\title{
A function to predict the class labels of the test set
}
\description{
Given a training set and a classification algorithm, this function builds a classification model on the training set and predicts the class labels of the test set.
}
\usage{
Classifier(train, test = c(), train.label,
           type = c("TSP", "GLM", "GLM_L1", "GLM_L2", "PAM", "SVM",
                    "plsrf_x", "plsrf_x_pv", "RF"),
           CVtype = c("loocv", "k-fold"), outerkfold = 5, innerkfold = 5,
           featurenames = NULL)
}
\arguments{
\item{train}{
A data frame or matrix containing the predictors for the training set, where columns correspond to samples and rows to features.
}
\item{test}{
A data frame or matrix containing the predictors for the test set (optional), where columns correspond to samples and rows to features.
}
\item{train.label}{
A vector of the class labels (0 or 1) of the training set. NOTE: class labels must be numeric, not a factor.
}
\item{type}{
Type of classification algorithm. Nine algorithms are currently available: top scoring pair (TSP), logistic regression (GLM), GLM with L1 (lasso) penalty (GLM_L1), GLM with L2 (ridge) penalty (GLM_L2), prediction analysis for microarrays (PAM), support vector machine (SVM), random forest after partial least squares dimension reduction (plsrf_x), random forest after partial least squares dimension reduction plus prevalidation (plsrf_x_pv), and random forest (RF). NOTE: the TSP, PAM, plsrf_x and plsrf_x_pv algorithms do not work with clinical data.
}
\item{CVtype}{
Cross-validation type: leave-one-out (\code{"loocv"}) or k-fold (\code{"k-fold"}).
}
\item{outerkfold}{
Number of cross-validation folds used in the training phase.
}
\item{innerkfold}{
Number of cross-validation folds used to estimate the model parameters.
}
\item{featurenames}{
Feature names in the molecular data (e.g. gene or probe names). If given, the function also returns the names of the features selected during the training and test phases.
Feature selection works only with the "TSP", "GLM_L1" and "GLM_L2" algorithms. "RF" provides feature importance.
}
}
\value{
A list object \emph{Pred} which contains the following components:
\item{P.train}{Predicted class labels of the training set.}
\item{P.test}{Predicted class labels of the test set, if the test set is given.}
\item{selfeatname_tr}{A list of length \emph{outerkfold} containing the names of the features selected during the training phase, if \emph{featurenames} is given.}
\item{selfeatname_te}{A list containing the names of the features selected during the test phase, if \emph{test} and \emph{featurenames} are given.}
}
\references{
Aik Choo Tan, Daniel Q. Naiman, Lei Xu, Raimond L. Winslow and Donald Geman (2005). Simple Decision Rules for Classifying Human Cancers from Gene Expression Profiles (TSP). \emph{Bioinformatics, 21}, 3896--3904.

Anne-Laure Boulesteix, Christine Porzelius and Martin Daumer (2008). Microarray-based Classification and Clinical Predictors: on Combined Classifiers and Additional Predictive Value. \emph{Bioinformatics, 24}, 1698--1706.
}
\author{
Askar Obulkasim

Maintainer: Askar Obulkasim
}
\seealso{
\code{\link{Classifier.par}}
}
\examples{
data(CNS)

# Columns are samples: use the first 40 for training and the next 20 for testing
train <- CNS$mrna[, 1:40]
test <- CNS$mrna[, 41:60]
train.label <- CNS$class[1:40]

# Lasso-penalized logistic regression with 2-fold outer and inner cross-validation
Pred <- Classifier(train = train, test = test, train.label = train.label,
                   type = "GLM_L1", CVtype = "k-fold",
                   outerkfold = 2, innerkfold = 2)

# Predicted class labels for the training and test sets
Pred$P.train
Pred$P.test
}
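\note{
The input conventions described under Arguments (features in rows, samples in columns, numeric 0/1 class labels) differ from the samples-in-rows layout and factor labels that many R data sets use. The following is a minimal sketch of one way to prepare such data in base R. The toy matrix \code{x}, the factor \code{group} and the choice of \code{"control"} as class 0 are hypothetical and only serve as illustration. Passing the row names as \code{featurenames} is an assumption about the expected format, not something this page confirms.
\preformatted{
# Hypothetical toy data: 6 samples (rows) by 4 features (columns)
set.seed(1)
x <- matrix(rnorm(24), nrow = 6, ncol = 4,
            dimnames = list(paste0("sample", 1:6), paste0("gene", 1:4)))
group <- factor(c("control", "case", "case", "control", "case", "control"),
                levels = c("control", "case"))

# Transpose so that rows are features and columns are samples
train <- t(x)

# Recode the factor as a numeric 0/1 vector: "control" -> 0, "case" -> 1
train.label <- as.numeric(group) - 1

# Assumption: feature names are supplied as a character vector
featurenames <- rownames(train)
}
The resulting \code{train}, \code{train.label} and \code{featurenames} objects have the shapes and types described under Arguments.
}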