\name{anotaResidOutlierTest}
\alias{anotaResidOutlierTest}
\title{Test for normality of residuals}
\description{
One assumption when performing analysis of partial variance (APV) is that the residuals from the regressions are normally distributed. anota assesses this by comparing the Q-Q plots of the residuals to envelopes derived by sampling from the normal distribution.
}
\usage{
anotaResidOutlierTest(anotaQcObj=NULL, confInt=0.01, iter=5,
    generateSingleGraph=FALSE, nGraphs=200, generateSummaryGraph=TRUE,
    residFitPlot=TRUE, useProgBar=TRUE)
}
\arguments{
\item{anotaQcObj}{The object returned by anotaPerformQc.}
\item{confInt}{Controls how many samples from the normal distribution are used to generate the envelope to which the residuals are compared. The default is 0.01, which generates 99 samples from the normal distribution to compare to the actual residuals.}
\item{iter}{How many times the analysis is performed. The default is 5, meaning that 5 sets of samples (each with the size controlled by confInt) are generated. Note that the summary plot is only produced for the last set, but the percentage of outliers for each iteration can be found in the output object.}
\item{generateSingleGraph}{The analysis is performed per identifier, and a plot can be generated for each identifier. Because the number of identifiers is usually high, this typically produces a large number of plots. Default is FALSE.}
\item{nGraphs}{If generateSingleGraph is set to TRUE, nGraphs controls for how many identifiers such single-gene graphs are generated.}
\item{generateSummaryGraph}{The function can generate a summary graph that shows the envelopes generated by sampling from the normal distribution compared to the obtained values for all genes. Default is TRUE, in which case the graph is generated, but only from the last iteration.}
\item{residFitPlot}{Generates a plot of the fitted values against the residuals. Default is TRUE (generate the plot).}
\item{useProgBar}{Should the progress bar be shown?
Default is TRUE (show the progress bar).}
}
\details{
The anotaResidOutlierTest function assesses whether the residuals from the per-identifier linear regressions of translationally active mRNA level ~ cytosolic mRNA level + phenoType are normally distributed. anota generates normal Q-Q plots of the residuals. If the residuals are normally distributed, the data quantiles form a straight diagonal line from bottom left to top right. Because there are typically relatively few data points, anota calculates "envelopes" based on a set of samplings from the normal distribution, using the same number of data points as in the true data (Venables and Ripley 1999).

To enable a comparison, both the actual and the sampled data are centered (mean=0) and scaled (sd=1). The data (both true and sampled) are then sorted, and the true sample is compared to the envelopes of the sampled data at each sort position. The result is presented as a Q-Q plot of the true data in which the envelopes of the sampled data are indicated. With 99 samplings, we expect 1/100 of the values to fall outside the envelopes obtained from the samplings. It is thus possible to assess whether approximately the expected number of outlier residuals is obtained. The result is presented as both a graphical output and an output object.
}
\value{
anotaResidOutlierTest generates a graphical output ("ANOTA_residual_distribution_summary.pdf") showing the Q-Q plots from all genes as well as the envelopes from the sampled data. The obtained percentage of outliers is shown at each rank position and for all positions combined. Optionally, when generateSingleGraph is set to TRUE, the function also generates individual plots (stored as "ANOTA_residual_distributions_single.pdf") for n genes (set by nGraphs). When residFitPlot is set to TRUE, an output comparing the fitted values to the residuals is generated (stored as "ANOTA_residuals_vs_fitted.jpeg").
An output list object with the following slots is also generated:
\item{confInt}{The selected confInt (see function arguments).}
\item{inputResiduals}{The residuals used.}
\item{rnormIter}{The number of sampled data sets.}
\item{outlierMatrixLog}{A logical matrix describing which residuals were outliers in the last iteration of the analysis.}
\item{meanOutlierPerIteration}{The fraction of outliers per iteration.}
\item{obtainedComparedToExpected}{The ratio of the obtained number of outlier residuals to the expected number of outliers given the selected confInt.}
\item{nExpected}{Number of expected outlier residuals.}
\item{nObtained}{Number of obtained outlier residuals.}
}
\author{Ola Larsson \email{ola.larsson@ki.se}, Nahum Sonenberg \email{nahum.sonenberg@mcgill.ca}, Robert Nadon \email{robert.nadon@mcgill.ca}}
\examples{
## See example for \code{\link{anotaPlotSigGenes}}
}
\seealso{\code{\link{anotaPerformQc}}, \code{\link{anotaGetSigGenes}}, \code{\link{anotaPlotSigGenes}}}
\keyword{methods}
\source{Venables, W.N. and Ripley, B.D. (1999). Modern Applied Statistics with S-PLUS. Springer.}
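\section{Illustrative sketch of the envelope comparison}{
The envelope comparison described in the details section can be sketched in plain R for a single vector of residuals. This is a minimal sketch and not the anota implementation: the helper name \code{envelopeOutliers} is hypothetical, and the sketch assumes that the envelope at each sort position is the minimum and maximum of the sampled values and that 1/confInt - 1 samplings are drawn (99 for the default confInt of 0.01).
\preformatted{
## Hypothetical helper (not part of anota): flag residuals that fall
## outside a simulated normal envelope.
envelopeOutliers <- function(resid, confInt = 0.01) {
    n <- length(resid)
    nSamp <- round(1 / confInt) - 1   # assumed: 0.01 gives 99 samplings
    ## Center (mean=0), scale (sd=1) and sort the observed residuals.
    obs <- sort(as.vector(scale(resid)))
    ## One column per sampling, each centered, scaled and sorted.
    sim <- replicate(nSamp, sort(as.vector(scale(rnorm(n)))))
    ## Assumed envelope: range of the sampled values per sort position.
    lower <- apply(sim, 1, min)
    upper <- apply(sim, 1, max)
    (obs < lower) | (obs > upper)
}

set.seed(1)
out <- envelopeOutliers(rnorm(12))
mean(out)   # fraction of residuals outside the envelope
}
The returned logical vector has one entry per sort position. Repeating the call with fresh samplings gives a rough idea of how much the outlier fraction varies between iterations, which is the role the iter argument plays in anotaResidOutlierTest.
}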