\documentclass{article}
% Commands to recognize Rnw file in xemacs
% M-x noweb-mode
% M-x font-lock-mode

% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{rflowcyt overview}
%\VignetteKeywords{FACS, FCS, Flow Cytometry}
%\VignetteDepends{rflowcyt,lattice,locfit}
%\VignettePackage{rflowcyt}


%% Postscript fonts
\usepackage{hyperref}
\usepackage{times}
\usepackage{graphicx}
\usepackage[authoryear,round]{natbib}

%% layout
\textwidth=6.2in
\textheight=8.5in
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

%% Special fonts
\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rargument}[1]{{\textit{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rclass}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\code}[1]{{\texttt{#1}}}


\title{Using RFlowCyt}
\author{A.J. Rossini \and J.Y. Wan \and N. Le Meur}
\date{\today}

\begin{document}
\maketitle
\begin{abstract}
  This manual describes the usage of the functions in the \Rpackage{rflowcyt} library package,
 part of the Bioconductor\footnote{\url{http://www.bioconductor.org/}} project.
 The main categories are Data Mananagement and Retrieval, Flow Cytometry Visualizations,
 Exploratory Analysis, Gating, and Flow Cytometry Hypothesis Testing and Statistical Inference.
 Examples are also shown for each category.
\end{abstract}

\tableofcontents

\clearpage
\section{Data Management and Retrieval}
\label{SEC:data-man-intro}

The \Rpackage{rflowcyt} tools in this category are
used for the following tasks:

\begin{enumerate}
\item \Rfunction{read.FCS} to read in FCS binary files into R-objects of S4 class \Rclass{FCS}
\item \Rfunction{[i,j]} to extract or subset information from the \Robject{data} 
(a \Robject{matrix} object) of the \Robject{FCS} R-object
\item \Rfunction{[[i]]} to extract \Robject{metadata} 
(which is of S4 class \Rclass{FCSmetadata}) portion of the \Robject{FCS} R-object
\item \Rfunction{[i,j]<-} to replace \Robject{data} information
\item \Rfunction{addParameter} to add column variables in the \Robject{data}
\item \Rfunction{[[i]]<-} to replace or add new information to the \Robject{metadata} 
portion of the \Robject{FCS} R-object
\item \Rfunction{as} to coerce among \Rclass{FCS}, \Rclass{data.frame} and \Rclass{matrix} class
  objects
\item \Rfunction{equals} to demonstrate equality between two FCS R objects
\item \Rmethod{print-methods}, \Rmethod{show-methods} print or show methods for
  \Robject{FCS} objects
\item \Rfunction{checkvars} to check for any discrepancies between the
  \Robject{metadata} and the \Robject{data} of the \Robject{FCS} object
\item \Rfunction{fixvars} to fix the \Robject{metadata} to relect the information
  obtained from the \Robject{data} if there are discrepancies
\item \Rfunction{summary-methods} to summarize the \Robject{FCS} R-object with Tukey's
  five number summary of the \Robject{data} and with slot information in the
  \Robject{metadata} and to output an \Robject{FCSsummary} S4 class object
\end{enumerate}

\subsection{Datasets}
\label{SUBSEC:Datasets}

There are four types of data sets that are available in the required
data package \Rpackage{rfcdmin}. The first two types of data set consists of
raw binary Flow Cytometry Standard (FCS) files, and the second type
consists of R-objects of S4 class \Rclass{FCS} and is the result of reading
in FCS binary files using \Rfunction{read.FCS} or \Rfunction{read.series.FCS} function.
Table~\ref{tab:1} summarizes the current reading information for the
raw binary files in the \Rpackage{rfcdmin} data package.

For more information about the binary FCS files and the dataset contain in the \Rpackage{rfcdmin} package, look at the package documentation files using the commands in R:

<<helpDataPack, echo=TRUE, eval=FALSE>>=
help(package="rfcdmin")
@ 

<<Table1BinaryFCSDataFiles, echo=FALSE>>=
table1<-rbind(c("2.0", "UW", "facscan", "8", "0-256"),
              c("3.0", "FHCRC", "DiVa", "10", "0-1024"),
              c("2.0", "BCCRC", "FACSCalibur", "10", "0-1024"))
table1 <- as.data.frame(table1)
## column names are the summary variables
colnames(table1)<-c("FCS Version", "Source", "Machine",
                    "bit resolution", "Integer range")
## rownames are the names of the FCS binary files
rownames(table1)<-c( "facscan256.fcs","SEB-NP22.fcs","A06-H06")
save(table1,file="table1.Rda")
@

<<GenerateTable1BinaryFCSDataFiles,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table1,
         caption="Example FCS binary files in 'rfcdmin' package that can be 
         read in using read.FCS or read.series.FCS.
         (UW: University of Washington, Seattle; FHCRC: Fred Hutchinson Cancer 
         Research Center, Seattle; BCCRC: British Columbia Cancer Research Center, Vancouver)",
         label="tab:1")
}else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table1


\subsubsection{Binary FCS data files} 
\label{SUBSUBSEC:BinaryFCSFiles}

The Flow Cytometry Standard (FCS) binary files consist of a HEADER, a
TEXT, a DATA, and an optional ANALYSIS segment. The HEADER in ASCII
text gives information about the version of the FCS file and the byte
offsets of the beginning and ending of the other segments within the
FCS file. The FCS version 3.0 is currently used and has been updated
from version 2.0 to accommodate data sets longer than 99,999,999 bytes
and allowing for primary and supplemental portions within the TEXT
segment, among other changes. Located after the HEADER, the TEXT
segment in ASCII text includes summary information in keywords such as
the number of observations and names of column variables in the DATA
segment.  The DATA segment that follows consists of the raw binary
data.  The optional ANALYSIS segment includes some results of earlier
data analyses \citep{robin:2001}.

When specific immunofluoroescence signals are received and digitized
by the analog-to-digital converters (ADCs) of a flow cytometer
machine, the measurements are grouped into a number of bins based on
the bit-resolution of the ADC.  Thus, a \textit{n}-bit resolution ADC
will group the data into $2^n$ bins or "channels".  Thus, each
immunofluoroescence measurement variable is actually categorical and
has an integer range from 0 to $2^n$, depending on the ADC
bit-resolution, which is usually 10 or 8 \citep{robin:2001}.

\subsubsection{Reading in the FCS binary file}
\label{SUBSUBSEC:readFCS}
 
The subsequent code allows us to call the \Rpackage{rflowcyt}
library.  If the  \Rpackage{rflowcyt} library is in the working library
location, then the \Rargument{library.location} is a character string
identifying the location of the \Rpackage{rflowcyt} library.

<<rflowcytLibraryCall, echo=TRUE>>=
library(rflowcyt)
if (!require(rfcdmin)) {
  stop("rfcdmin not available?")
}
@


The raw binary FCS files can have the extension "\*.fcs" or no extension at all.
The raw binary FCS files can be read one by one using  the \Rfunction{read.FCS} function or
in a set using the \Rfunction{read.series.FCS} function. The output is either 
a \Robject{FCS} R object of S4 or S3 class or a list of \Robject{FCS} R-objects.
(\textit{NB: Currently, the \Rpackage{rflowcyt} package implements functions and methods with the S4 class.})
  
In order to read in the FCS binary file, the location of the FCS
binary file in the fcs or bccrc directories of the \Rpackage{rfcdmin}
package has to be input as a parameter in the calling for 
\Rfunction{read.FCS} or \Rfunction{read.series.FCS}.


<<FindingLocationOfFCSDirectoryOfrfcdmin, echo=TRUE>>=
fcs.loc <- system.file("fcs", package="rfcdmin")
@

After finding the \*.fcs file location, we will read in the raw binary
file "facscan256.fcs" using \Rfunction{read.FCS} and call it
\Robject{FC.FCSRobj}.  In order to demonstrate a S3-to-S4 class change, we
will incorrectly read in the binary file as an S3 object.

<<readFCSfacscan256binaryFile, echo=TRUE>>=
file.location <- paste(fcs.loc, "facscan256.fcs", sep="/")
FC.FCSRobj <- read.FCS(file.location, UseS3=TRUE, MY.DEBUG=FALSE)
@


The following are \Robject{FCS} R-objects which are readily accessible in R and
can be used for analysis using the tools in the \Rpackage{rflowcyt}
package.  Prior to this release, the FCS class has been S3.  Now the
FCS class among other classes (\Rclass{FCSmetadata}, \Rclass{FCSgate} (described in the
Gating section), \Rclass{FCSggobi} (still in working progress) ) are S4.  Use
\textbf{convertS3toS4} to convert S3-class \Robject{FCS} to S4-class \Robject{FCS} R
objects. To exemplify the conversion, the following demonstrates the
conversion of an S3 \Robject{FCS} R-object to an S4 \Robject{FCS} R-object:

<<convertS3toS4FCS, echo=TRUE>>=
FC.FCSRobj<-convertS3toS4(FC.FCSRobj,
                          myFCSobj.name="FC.FCSRobj",
                          fileName=file.location)
@ 

The \Rfunction{read.series.FCS} function allows to read multiple FCS files. The output is a list of \Robject{FCS} R-object.
<<FindingLocationOfbccrcDirectoryOfrfcdmin, echo=TRUE,eval=FALSE>>=
 pathFiles <- system.file("bccrc", package="rfcdmin")
 drugFiles <- dir(pathFiles)
 drugData <- read.series.FCS(drugFiles, path=pathFiles, MY.DEBUG=FALSE)
@


A \Robject{FCS} R-object has slots \Robject{data} and \Robject{metadata}. 
The \Robject{data} component is a matrix in which the rows are the individual cells or
observations, and the columns are the different immunofluoroescence
measurements or variables.  The metadata is of S4-class \Rclass{FCSmetadata}
and has slots referring to keywords that are in the TEXT segment of
the FCS raw binary file.  Information such as variable names
(\Robject{longnames} and \Robject{shortnames}) and ranges (\Robject{paramranges}) are also
slots in the \Robject{metadata} component.  For more details, see the help files
for \Robject{FCS} and \Robject{FCSmetadata}.


\subsubsection{Opening \Rpackage{rflowcyt} data with \Robject{FCS} R-objects}
\label{SUBSUBSEC:dataFCSRobj}

The \Rpackage{rfcdmin} data package also contains archived  \Robject{FCS} R-object:


<<DataRFCSobjects, echo=TRUE>>=
data(VRCmin)
data(MC.053min)
data(flowcyt.fluors)
@
For more details of the \Robject{FCS} R-object available see \Rfunction{data(package="rfcdmin")}.

\subsubsection{Other S4 class R-objects}
\label{SUBSUBSEC:OtherS4objs}

Besides the \Rclass{FCS} class, other S4 class R-objects
include \Rclass{FCSmetadata}, \Rclass{FCSsummary}, and \Rclass{FCSgate}. 
The \Rclass{FCS} class is the class of all FCS files that are read into R using
\Rfunction{read.FCS} or \Rfunction{read.series.FCS}.  The \Rclass{FCSmetadata} is 
the class of the \Robject{metadata} slot of an \textit{FCS} R-object. 
The \textit{FCSsummary} class is the class of the output of the \Rmethod{summary} method 
implemented on a \Robject{FCS} R-object.  The \Rclass{FCSgate} class
contains the \Rclass{FCS} class and extends it to include gating information
(ie, the information about the indexing of row observations for
subsequent extraction).

The following is a brief summary of the available, generic methods
associated with each class object.

\begin{itemize}
\item \textbf{"new" Generic Method}
Default objects can be made by using the \Rmethod{new(object-contents,
  S4-class-name)} method, where object-contents refer to the contents
of each slot for the specified S4-class-name.

The following commands produce default class objects with no slot
information:

<<NewDefaultS4Object, echo=TRUE>>=
## default S4 objects
new.FCS <- new("FCS")
new.FCSmetadata <- new("FCSmetadata")
new.FCSsummary <- new("FCSsummary")
new.FCSgate <- new("FCSgate")
@ 

\item \textbf{"coercian" Generic Method}
Currently, there are only coercian methods to and from the \Rclass{FCS} class
and the \Rclass{matrix} and \Rclass{data.frame} classes.

A user makes a \Robject{FCS} R-object by the coercian method \Rmethod{as} exemplified in
the following code, where data2 is a \Robject{matrix} or \Robject{data.frame} identifying
the rows as the cell observations and the columns as the different
variables:

<<asFCSobject, echo=TRUE>>=
data2 <- rbind(1:10, 2:11, 3:12)
## coerce data into a matrix object
data2.matrix <- as(data2, "matrix")
## coerce data into a data.frame object
data2.df <- as.data.frame(data2)
## coercing matrix into FCS
test.FCSRobj <- as(data2.matrix, "FCS")
## coercing data.frame into FCS
test.FCSobj2 <- as(data2.df, "FCS")
## coercing a FCS object to a matrix
original.matrix <- as(test.FCSobj2, "matrix")
## coercing a FCS object to a data.frame
original.matrix <- as(test.FCSobj2, "data.frame")
## assigning the metadata
metadata <- new("FCSmetadata", size=dim(data2)[1], nparam=dim(data2)[2],
               fcsinfo=list("comment"="This is a pseudo FCS-R object."))
test.FCSRobj@metadata<-metadata
test.FCSRobj
@

\item \textbf{"is" Generic Method}
The S4 R-object class can be verified by using the \Rmethod{is} method.  

<<isS4object, echo=TRUE>>=
is(MC.053, "matrix")
is(MC.053, "FCS")
is(MC.053@metadata, "FCSmetadata")
is(MC.053, "FCSgate")
@

The \Rclass{FCSsummary} class is exemplified below:

<<isFCSsummary, echo=TRUE>>=
sum.FCS <- summary(MC.053)
is(sum.FCS, "FCSsummary")
@ 
\end{itemize}

\subsection{Tools to access and manipulate \Robject{FCS} R-objects}
\label{SUBSEC:DescriptTools}
\subsubsection{Descriptive tools for \Rclass{FCSmetadata} class R-objects}
\label{SUBSUBSEC:DescriptToolsFCSmetadata}

To access or extract the metadata components, use either the \Rfunction{@}
or \Rfunction{metaData}.

<<ExtractMetadata, echo=TRUE>>=
## returns the same FCSmetadata object
meta1<-st.1829@metadata
meta1<-metaData(st.1829)
@ 

To get a description of the \Rclass{FCSmetadata} class, use for example the \Rfunction{show},
\Rfunction{print}, and \Rfunction{summary} functions.

<<DescribeFCSmetadata, echo=TRUE>>=
show(st.1829@metadata)
@

The following code would output the \Robject{metadata} as a string and is not
shown because of its lengthy output.

<<SummaryFCSmetadata, echo=TRUE, eval=FALSE>>=
summary(st.1829@metadata)
@ 

The slots for an \Rclass{FCSmetadata} are summarized in Table~\ref{tab:4}.

<<table4, echo=FALSE>>=
slotnames<-c( "mode", "size", "nparam", 
"shortnames", "longnames", "paramranges", 
"filename", "objectname", "original", "fcsinfo")
description<-c("Mode", "number of cells/rows", 
"number of column parameters", 
"shortnames of column parameters", 
"longnames of column parameters", 
"Ranges/Max value of the columns", 
"original FCS filename", 
"name of the current object", 
"current object original status", 
"misc.metadata info")
table4 <- data.frame(slotnames, description)
save(table4, file="table4.Rda")
@ 

<<GenerateTable4FCSmetadataslots,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table4,
         caption="FCSmetadata slot descriptions",
         label="tab:4")
}else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table4

Slots and slot components of the metadata can be retreived by using
\Rfunction{@}, \Rfunction{[i]}, or \Rfunction{[[i]]}. Currently to extract
metadata information, we can use a single character string index being
one of the slotnames in Table ~\ref{tab:4} or one of the slotnames in
the \Robject{fcsinfo} slot.  In the case that there is a common slotname that
is also in the \Robject{fcsinfo} slot, only the slot from Table ~\ref{tab:4}
will be retrieved.

A single numeric index or a vector of numeric indices refers to only
the slot positions of the \Robject{fcsinfo} slot.

The following examples extract the column parameter ranges or maximum
value.
<<SlotExtractMeta, echo=TRUE>>=
## extracting the ranges
st.1829@metadata@paramranges
st.1829@metadata["paramranges"]
st.1829@metadata[["$PnR"]]
@ 

A single component of the range can also be retrieved
<<SlotExtractMeta, echo=TRUE>>=
st.1829@metadata[["$P1R"]]
@ 

Items in the \Rclass{FCSmetadata} can be replaced by using \Rfunction{[...]<-} 
or \Rfunction{[[...]]<-}.  

The following example will replace the \Robject{longnames} with dummy names.

<<ReplaceMeta, echo=TRUE>>=
## longnames before the change
st.1829@metadata["longnames"]
## some longname changes
st.1829@metadata["longnames"] <- rep("dummy", length(st.1829@metadata["longnames"]))
## name the third column longname as "wrongname"
st.1829@metadata["$P3S"] <- "wrongname"
## longnames after the change
st.1829@metadata["longnames"]
@

To extract and replace slots of the metadata of a \Robject{FCS} object, use
only \Rfunction{[[..]]} and \Rfunction{[[...]]<-}, respectively.

<<FCSmetadataExtractReplace, echo=TRUE>>=
## extraction
shortnames.1829 <- st.1829[["shortnames"]]
shortnames.1829

##replacement
st.1829[["$PnR"]]
st.1829[["$P1R"]] <- 0
st.1829[["paramranges"]]

st.1829[["newslot"]]
st.1829[["newslot"]] <- "this is even cooler"
st.1829[["newslot"]]
@ 


When using the replacement method for a \Robject{FCSmetadata} R-object
(\textit{i.e.},\Rfunction{[...]<-} or \Rfunction{[[...]]<-}), if the slotname is
not found, then a new slot with the current character index is made
under the \Robject{fcsinfo} slot.  In the following example, we will add a new
slot named \Robject{newslot} to the metadata.


<<AddNewSlotMeta, echo=TRUE>>=
## making a newslot
st.1829@metadata[["newslot"]]<- "wow this is cool"
## newslot is automatically made in the "fcsinfo" slot
st.1829@metadata@fcsinfo[["newslot"]]
@

\subsubsection{Descriptive tools for \Rclass{FCS} class R-objects}
\label{SUBSUBSEC:DescriptToolsFCS}

To access or extract the data components, use either \Rfunction{@} or \Rfunction{fluors}.

<<ExtractMetadata, echo=TRUE>>=
## returns the same FCSmetadata object
meta1<-st.1829@metadata
meta1<-metaData(st.1829)
@ 

<<ExtractData, echo=TRUE>>=
## returns the same data matrix
data1<-st.1829@data
data1<-fluors(st.1829)
summary(data1)
@

A set of descriptive tools are attached to the FCS R-object. The
method \Rmethod{print} (using an \Robject{FCS} object in its signature) will
automatically give a short summary of the \Robject{FCS} R-object without
printing out all the contents of the data and the metadata.  The
following examples are different incantations of the \Rmethod{print} method
for FCS objects:

<<printFCSobject, echo=TRUE>>=
print(unst.1829) 
print(MC.053)
@

A longer and more detailed summary with statistics for the column
variables can be displayed by using the \Rmethod{summary} method, whose output is 
a \Rclass{FCSsummary} S4 class.


To extract and replace components within the data matrix of a 
\Robject{FCS} object, use only \Rfunction{[..]} and 
\Rfunction{[...]<-}, respectively.

<<FCSmetadataExtractReplace, echo=TRUE>>=
## extraction first 10 rows
firstten.1829 <- as(st.1829[1:10,], "matrix")
firstten.1829
## etraction of single element
firstobs.1829 <- as(st.1829[1,1], "matrix")
firstobs.1829
##replacement of first element
st.1829[1,1] <- 99999999
as(st.1829[1,1], "matrix")
st.1829[1,1]<-firstobs.1829
as(st.1829[1,1], "matrix")
st.1829[1,1] 
@ 

Note that the "original" slot within the metadata is only changed to
FALSE when the data is changed.

Changing the metadata itself will not alter the status of the
"original" slot.

<<OriginalStatus, echo=TRUE>>=
## the data was changed so the original flag should be FALSE
st.1829[["original"]]
@ 

The function \Rfunction{dim.FCS} retrieves the dimensions of the data matrix
of a \Robject{FCS} R-object.

<<dim.FCS, echo=TRUE>>=
dim.1829 <- dim.FCS(st.1829)
dim.1829
@ 

A data parameter column can be appended to the data matrix of a
\Robject{FCS} R-object by using the method \Rmethod{addParameter}, which will also
result in the change of the "original" metadata slot to be FALSE.

<<addDataParametertoFCS, echo=TRUE>>=
column.to.add <- rep(0, dim.1829[1])
st.1829 <-addParameter(st.1829, colvar=column.to.add, 
                shortname="test", longname="example", 
                use.shortname=FALSE)
@ 

\clearpage
\section{Data Assessment}
\subsection{Checking Validity of the \Robject{FCS} R-object and Fixing errors}
\label{SUBSEC:checkfixvars}

The method \Rmethod{checkvars} checks the ranges, dimensions, and the column
variable names of the data against what is specified in the
metadata.  If details are not specified in the metadata, then the
available information is added to the metadata.  The output is a
boolean as to whether the object passes the check.  The option,
\Rargument{MY.DEBUG=TRUE}, allows us to view the checking statments.

<<checkvarsFCS, echo=TRUE>>=
st.1829.checkstat <- checkvars(st.1829,
  MY.DEBUG=TRUE)
st.1829.checkstat
@

Because st.1829 has been altered such that there is a discrepancy
between the metadata and the data portions of this \Robject{FCS} object,
\Rfunction{fixvars} will be used to correct major errors.
 
<<fixvarsFCS, echo=TRUE>>=
if (st.1829.checkstat==FALSE){
   ## fix the FCS R object
   st.1829 <- fixvars(st.1829, MY.DEBUG=TRUE)
}
@

The original FCS R-object can be retrieved by using the function
\Rfunction{get}, if the original object is on the current workspace and has been
unchanged.  Alternatively, the original FCS R-object can be obtained
by reading in the binary, fcs file from the /fcs directory (if this
raw binary file exists) of the data package \Rpackage{rfcdmin}.

<<ExtractFCSobj, echo=TRUE>>=
st.1829 <- get(st.1829[["objectname"]])
original.FC.FCSRobj <- read.FCS(FC.FCSRobj[["filename"]], 
  MY.DEBUG=FALSE)
@ 


\subsection{Equality between \Robject{FCS} objects}
\label{SUBSEC:FCSequal}

Two \Robject{FCS} objects can be checked for equality by using the \Rmethod{equals}
method.  The default check is to verify the equality of the metadata
(except for the \Rargument{filename} and the \Rargument{objectname}) and all
the elements of the data.

<<EqualityFCSobjsome, echo=TRUE>>=
## default is to not check the equality 
## of filenames and objectnames and
## only check the equality of the data and 
## the other metadata slots
equals(st.1829, unst.1829)
@ 

The \Rargument{check.filename} and \Rargument{check.objectname} set to TRUE will allow the
equality verification of the \Robject{filename} and \Robject{objectname} slots in the
metadata.

<<EqualityFCSobjall, echo=TRUE>>=
## check equality of everything in the metadata 
## and the data of the FCS objects
equals(st.1829, st.1829, check.filename=TRUE, 
check.objectname=TRUE)
@ 

\subsection{Data quality assessment between \Robject{FCS} objects}
\label{SUBSEC:FCSassessment}

Data assessment is the act of inspecting data, measuring the defects and 
analyzing the cause (and potentially the impact of those defects). 

You may have an experiment where one sample has been divided in several aliquots and 
therefore you should be able to compare some measured parameters like the size 
(FSC - Forward SCatter) or the granularity (SSC - Side SCatter) of the different aliquots. 
In order to assess the data quality, we propose 3 graphical methods to explore 
the data and to detect whether any aliquots or samples were substantially different from 
the others, in a way that were not likely to be biologically motivated and therefore misleading. 

\begin{enumerate}
\item Boxplot (or box-and-whisker) plot is a graphical representation of dispersions 
and extreme scores. Represented in this graphic are minimum, maximum, and quartile 
scores in the form of a box with "whiskers." The box includes the range of scores 
falling into the middle 50\% of the distribution (median) (Inter Quartile Range = 
75 th percentile - 25 th percentile) and the whiskers are lines extended to the 
minimum and maximum scores in the distribution or to mathematically defined (+/-1.5*IQR) 
upper and lower fences. Boxplot summarizes the location and the shape of the distribution. 
For more details, see \Rfunction{boxplot} and \Rfunction{boxplot.FCS} in the 
\Rpackage{graphics} and\Rpackage{rflowcyt} package, respectively .

\item Empirical cumulative distribution function (\Rfunction{ecdf}) reveals 
differences in the distributions. For more details, see \Rfunction{ecdf} and 
\Rfunction{plotECDF.FCS} in the \Rpackage{stats} and\Rpackage{rflowcyt} package, respectively.

\item Density plot reveals the shape of the distributions.For more details, see 
\Rfunction{density.lf} and \Rfunction{plotdensity.FCS} in the \Rpackage{locfit} 
and \Rpackage{rflowcyt} package, respectively.
\end{enumerate}

As an example, we draw the differents plot for two different datasets from the 
British Columbia Cancer Research Center, Vancouver (Canada):
\begin{enumerate}
\item Time course experiment: collection of weekly peripheral blood samples 
from 1 patient (R object \Robject{flowcyt.fluors} in the \Rpackage{rfcdmin} data package).
\item Cell line dataset: Flow Cytometry High Content Screening (FC-HCS) of a 2000-compound 
library by the mean of one cell line (FCS files from A06 to H06 in the \Rpackage{rfcdmin} 
data package -in this experiment, the name of the sample corresponds to its position 
in a 96 wells plates) \citep{gasparetto:smith:2004}. 
\end{enumerate}
For more details, see \Rfunction{help="rfcdmin"}.

\subsubsection{Time course experiment}
\label{SUBSUBSEC:FCSTimeCourse}
This dataset is an abstract of a time couse experiment realised at the British 
Columbia Cancer Research Center (Thanks to R. Brinkman, C. Smith and M. Gasparetto).
It is a collection of weekly peripheral blood samples from 1 patient, divided 
in 8 aliquots at each time point and labeled by 4 markers to identify 8 differents stains.

First, we load the data:
<<GetDataTimeCourse>>=
 require(rfcdmin)
 data(flowcyt.data)
@ 

Then, we can draw a density plot for the Foward SCatter parameter 
(\Rargument{varpos}=1) at the time points 1 and 9.

\clearpage

\begin{figure}[htbp]
 \centering  
<<PlotDensityTimeCourse, echo=TRUE, fig=TRUE>>=
 ## Draw a  density plot for the Foward scatter parameter 
 old.par <- par(no.readonly=TRUE)
 mat <- matrix(c(1:2),1,2,byrow=TRUE)
 nf <- layout(mat,respect=TRUE)
 plotdensity.FCS(flowcyt.data[1:8],
                 varpos=c(1),
                 main="FSC density plot at time point 1",
                 ylim=c(0,0.015), 
                 ylab="density of cells")
 legend(450,0.012,paste("stain",c(1:8),sep=""),col=c(1:8),pch=22)
 
 plotdensity.FCS(flowcyt.data[65:72],
                 varpos=c(1),
                 main="FSC density plot at time point 9",
                 ylim=c(0,0.015),
                 ylab="density of cells")
 legend(450,0.012,paste("stain",c(1:8),sep=""),col=c(1:8),pch=22)
 par(old.par)
@ 
\caption{plotdensity.FCS: Density plots of the forward scatter 
parameter of the different stains, at different 2 different 
time points (\Robject{flowcyt.fluors}  FCS R object)}
\label{fig:plotdensitTimeCourse}
\end{figure}

At the time point 9, we can see that the data for the first 
stain looks peculiar. We can also observe the multimodal distribution of the data. 

\clearpage

To minimize the effect of multimodal distribution on the shape of the distributions, 
we can represent the FSC data \textit{via} a ECDF plot. This representation eventially 
allows us to confirm the previously observed phenomenon.

\begin{figure}[htbp]
  \centering
<<PlotECDFTimeCourse, echo=TRUE, fig=TRUE>>=
 ##Draw an empirical cumulative density plot for the Foward scatter
 ##parameter of the different stains at a particular different time point
 ##(one panel per time point).
print(plotECDF.FCS(flowcyt.data,
                   varpos=c(1),
                   var.list=c(paste("time",1:12,sep="")),
                   group.list=paste("Stain",c(1:8),sep=""),
                   main="ECDF of the FSC for different stains at a particular time point",
                   lwd=2,
                   cex=1.5))

@ 
\caption{plotECDF.FCS: Empirical Cumulative Distribution plot of the forward 
scatter parameter of the different stains, at different 12 different time points 
(\Robject{flowcyt.fluors}  FCS R object)}
\label{fig:plotECDFTimeCourse}
\end{figure}

\clearpage

Finally, we can draw a boxplot for the Foward SCatter parameter (\Rargument{varpos}=1) 
at different time points (\textit{e.g.} time points 1, 3, 7 and 9) 

\begin{figure}[htbp]
  \centering
<<BoxplotTimeCourse, echo=TRUE, fig=TRUE>>=
 ## Draw a boxplot for the Foward SCatter parameter for the time points 1
 ## and 6 (in this experiment, each time point corresponds to a column of
 ## a 96 wells plates)
 old.par <- par(no.readonly=TRUE)
 mat <- matrix(c(1:4),2,2,byrow=TRUE)
 nf <- layout(mat,respect=TRUE)
print(boxplot.FCS(flowcyt.data[1:8],
             varpos=c(1),col=c(1:8),
             main="FSC across stains time point 1",
             names=paste("stain",c(1:8),sep="")))
print( boxplot.FCS(flowcyt.data[17:24],
             varpos=c(1),
             col=c(1:8),
             main="FSC across stains time point 3",
             names=paste("stain",c(1:8),sep="")))

print( boxplot.FCS(flowcyt.data[49:56],
             varpos=c(1),
             col=c(1:8),
             main="FSC across stains time point 7",
             names=paste("stain",c(1:8),sep="")))

print( boxplot.FCS(flowcyt.data[65:72],
             varpos=c(1),
             col=c(1:8),
             main="FSC across stains time point 9",
             names=paste("stain",c(1:8),sep="")))
 par(old.par)
@
\caption{boxplot.FCS: Boxplot of the Foward SCatter parameter of 
the different stains at different 4 different time points 
(\Robject{flowcyt.fluors}  FCS R object)}
\label{fig:BoxplotTimeCourse}
\end{figure}

\clearpage
The boxplot representations also confirm that, at time point 9, the first stain looks peculiar. 
Because of the multimodal distribution, the boxplot representation can be criticized but it 
still give us a good overview of the location of the distribution . For example, we can also see 
that at time point 1 and 3 the median of the FSC parameter is around 200 and that this parameter 
falls to 100 at time point 7 (observations not easily made in the density plot and not available 
in the ECDF representation).

\clearpage

\subsubsection{Cell line experiment}
\label{SUBSUBSEC:FCSCellLine}
This dataset is an abstract of a Flow Cytometry High Content Screening (FC-HCS) of a 2000-compound 
library (the reagent being a cell line) to identify compounds that would enhance the anti-lymphoma 
activity of the therapeutic antibofy rituximab~\citep{gasparetto:smith:2004}. It is a collection 
of 8 FCS files.

First, we read the 8 FCS files, 
<<GetDataCellLine>>=
 if (require(rfcdmin)) {
 ##Obtaining the location of the fcs files in the data
  pathFiles<-system.file("bccrc", package="rfcdmin")
  drugFiles<-dir(pathFiles)
 
 ##Reading in the FCS files
  drugData<-read.series.FCS(drugFiles,path=pathFiles,MY.DEBUG=FALSE)

 ##Extract fluorescent information from the serie of FCS files
 drug.fluors<-lapply(drugData,fluors)
}
@ 

Then, we can draw the different plots for the Foward SCatter parameter (\Rargument{varpos}=1) of 
the cell line aliquots treated with different compounds. As in the previous example, we can observe 
some noise in the data. 

\begin{figure}[htbp]
  \centering
<<PlotDensityCellLine, echo=TRUE, fig=TRUE>>=
 ##Draw a density plot for the Foward SCatter parameter for the
 ##differents aliquots (of the same cell line) tested with different
 ##compounds.
 plotdensity.FCS(drugData,
                 varpos=c(1),
                 main="FSC for aliquots 
treated with different compounds",
                 ylim=c(0,0.005),
                 ylab="Density of cells")
@ 
\caption{plotdensity.FCS: Density of the Foward SCatter parameter of the cell 
line aliquots treated with different compounds.}
\label{fig:plotDensityCellLine}
\end{figure}

\clearpage

\begin{figure}[htbp]
  \centering
<<BoxplotCellLine,echo=TRUE, fig=TRUE>>=
 ##Draw a boxplot for the Foward SCatter parameter
 ##for the differents aliquots (of the same cell line)
 ##tested with different compounds. 
print( boxplot.FCS(drugData,
             varpos=c(1),
             col=c(1:8),
             main="FSC of differents aliquots from 
a cell line treated with different compounds."))
@ 
\caption{boxplot.FCS: Boxplot of the Foward SCatter parameter of the cell line 
aliquots treated with different compounds.}
\label{fig:plotECDFCellLine}
\end{figure}

\clearpage

\begin{figure}[htbp]
  \centering
<<PlotECDFCellLine, echo=TRUE, fig=TRUE>>=
 ##Draw a empirical cumulative density plot for the Foward scatter
 ##parameter for the differents aliquots (of the same cell line)
 ##treated with different compounds. 
print(plotECDF.FCS(drugData,
              varpos=c(1),
              var.list=c("Serie"),
              group.list=paste("compound",c(1:8),sep=""),
              main="ECDF for different aliquots
	treated with diffrent compounds.",
              lwd=2,
              cex=1.5))
@ 
\caption{plotECDF.FCS: Empirical cumulative distribution plot of the Foward 
SCatter parameter of the cell line aliquots treated with different compounds.}
\label{fig:BoxplotCellLine}
\end{figure}


\clearpage
\section{Data Visualizations}
\label{SEC:flowcyt-vis-intro}

In this section, we include visualization tools that help analyze the
multivariate flow cytometry data.  Because each cell has multiple
immunofluoroescence and light scatter measurements, we have made
alternatives to visualize, beyond the ordinary bivariate scatterplots,
the cell distributions based on the different measurements.  The
common approach in the field circumvents the visualization of data on
all variables by selecting a subset of "interesting" cells by a
sequential progression of 1 and 2 dimensional gating steps.  Gating
refers to the selection of a region of cells or observations in a
bivariate or univariate plot by placing boundaries around the region.
These boundaries or thresholds based on a particular
immunofluoroescence or light scatter measurement are refered to as
\textbf{gates}. The sequence of gating steps is based on certain
pairs of measurements or individual measurement, in which the gated
region in a previous step is subsequently gated further in the next
gating step.  First we discuss the bivariate and multivariate plotting
tools and then the gating tools.

\subsection{Bivariate Plotting Tools}
\label{SUBSEC:BivarPlotTools}

The basic bivariate plots are the \Rfunction{ContourScatterPlot} with
hexgonal binning without contours or rectangular binning with
superimposed contour levels and the \Rfunction{parallelCoordinates} plot
which is either an \Rfunction{ImageParCoord} or a
\Rfunction{JointImageParCoord} plot.

\subsubsection{ContourScatterPlot}
\label{SUBSUBSEC:ContourScatterPlot}

The \Rfunction{plotvar.FCS} has the options of plotting specified
variables from an FCS R-object. A univariate histogram or
ContourScatterPlot with hexgonal binning or rectangular binning can be
shown with the appropriate specified options.  Here we will
demonstrate with the FCS R object \Robject{unst.1829} the uses of
\Rfunction{plotvar.FCS}.


\begin{figure}[htbp]
  \centering  
<<plotvarhist, echo=TRUE, fig=TRUE>>=
plotvar.FCS(unst.1829, varpos=c(1))
@ 
\caption{plotvar.FCS: Plotting a single variable histogram with the
  \Robject{unst.1829} FCS R object}
\label{fig:plotvarhist}
\end{figure}

\clearpage

\begin{figure}[htbp]
  \centering  
<<plotvarrectbin, echo=TRUE, fig=FALSE,eval=FALSE>>=
plotvar.FCS(unst.1829, varpos=c(3,4), hexbin.CSPlot=FALSE)
@ 
  \caption{plotvar.FCS: Plotting a bivariate ContourScatterPlot with
    rectangular binning with the \Robject{unst.1829} FCS R object} 
  \label{fig:plotvarrectbin}
\end{figure}
     

The function \Rfunction{ContourScatterPlot} will make an image plot using
rectangular bins of counts produced by the function \Rfunction{make.grid}
by default.  Also by default, there are superimposed contour levels
that are also drawn on the plot with rectangular image binning.  The
\Rfunction{make.grid} function is used by the \Rfunction{ContourScatterPlot} function to
make an count matrix for the number of observations in a
two-dimensional grid layout.  This function will output a matrix of
counts ("z") as well as the total number of observations ("n.cells")
within this matrix.  The count matrix for the image plot has 25 unit
cut-offs and can be changed by the \Rargument{x.grid} and \Rargument{y.grid} options.
Alternatively, if there is a \Rargument{status} or binary response
variable for the data, other values such as the difference in counts,
proportions, normalized proportions, and z statistics can be
calculated by \Rfunction{make.density} for the rectangular bins of the
image plot. Currently, a roughly estimated color legend is available
for this rectangular binning with the \Rfunction{legend.CSP} function.

Alternatively, however, there is an option for hexgonal binning with an appropriate 
legend.  Note that the Bioconductor \Rpackage{hexbin} package is necessary for this plot option.
The hexagonal binning does not have superimposed contour levels nor does it
have the option to estimate other values besides counts in its bins. 

We will demonstrate the use of \Rfunction{ContourScatterPlot} 
to make the same plots exemplified earlier with \Rfunction{plotvar.FCS}.  
These plots are not shown.

The following code extracts the third and the fourth
column variables of the FCS R object \Robject{unst.1829}.

<<obtainTwoColumnvars, echo=TRUE>>=
## obtain the two column variables
xvar<-as(unst.1829[,3], "matrix")
yvar<-as(unst.1829[,4], "matrix")
@ 

The \Rfunction{ContourScatterPlot} function is implemented to make a 
plot with hexgonal binning and a legend.  
Other parameters such as binning style and number of 
bins can also be specified in the signature.

<<CSPhexbin, echo=TRUE, eval=FALSE, fig=FALSE>>=
## hexagon cells without contour lines; default n.hexbins=100
ContourScatterPlot(xvar, yvar,
                   xlab=unst.1829[["longnames"]][3],
                   ylab=unst.1829[["longnames"]][4],
                   main="Individual unst.1829",
                   hexbin.plotted=TRUE)
@ 

A plot can be made that has rectangular binning.  The color of the
image map (via the \Rargument{image.col} option) can be changed as well as the
size of the rectangular bins by \Rargument{x.grid} and \Rargument{y.grid} options.  
A legend can be displayed in a separate plot by setting the option
\Rargument{plot.legend.CSP} = TRUE.

<<CSPrectbin, echo=TRUE, eval=FALSE, fig=FALSE>>=           
## rectangular cells with the contour plot 
ContourScatterPlot(xvar, yvar,
                   xlab = unst.1829[["longnames"]][3],
                   ylab = unst.1829[["longnames"]][4],
                   main = "Individual 042402c1.053",
                   hexbin.plotted = FALSE,
                   numlev = 25, 
                   image.col = heat.colors(15))
@

\subsection{Multivariate Plots}
\label{SUBSEC:multivarplot}

The FCS R-object can be plotted using the generic \Rfunction{plot.FCS} or \Rfunction{plot}
command which will make a pairs plot (by default) or a parallel
coordinates plot.  Here we show a default pairs plot using
rectangular binning : 

\begin{figure}[htbp]
  \centering
<<PlotFCSobj, echo=TRUE,fig=TRUE>>=
## should be able to implement because it is a pairsplot
print(plot(unst.1829))
@
\caption{unst.1829: Default Pairs plotting with rectangular bins}
\label{fig:PlotFCSobj}
\end{figure}

The same plot can be made using hexagonal binning; the code is shown,
but the plot will not be displayed.  \textbf{This is currently broken.}

<<plotFCSHexbin, echo=TRUE, eval=FALSE, fig=FALSE>>=
## plot(st.1829, alternate.hexbinplot=TRUE)
@ 

Additional parameters for the pairsplot of a data matrix can be 
referenced by the \Rfunction{pairs.CSP} function. Currently a
color legend can be plotted in the lower panels for \Rfunction{pairs.CSP}
only for the rectangular binning.  There is currently no legend available 
for pairs.CSP using hexagonal binning.

The parallel coordinates plot tracks each observation whose value is
plotted on the vertical, y-axis through a series of variables on the
horizontal, x-axis.  The observation is tracked by a line from one
variable to the next.  The order of the column variables on the
horizontal axis is the order that is presented in the input data
matrix.

Here we make a parallel coordinates plot for the data portion of the
\Robject{st.1829} FCS R-object.  Because there are too many cell or row
observations, we only show the first 10 observations in this parallel
coordinates plot.

\begin{figure}[htbp]
  \centering
<<ParallelCoordinatesPlot,echo=TRUE,fig=TRUE>>=
par(mfrow=c(1,1))
row.obs<-1:10
parallelCoordinates(as(unst.1829[row.obs,], "matrix"))
@
\caption{Parallel Coordinates plot of the first ten observations 
in the data of unst.1829.}
\label{fig:ParallelCoordinatesPlot}
\end{figure}

It is important to note that all column variables in this plot must
have the same range and scaling.  We can force scaling on a [0,1]
scale by using the option \Rargument{scaled} set to TRUE.  We can also give
group certain observations by color (\Rargument{group.col}), type (\Rargument{group.lty}),
and width (\Rargument{group.lwd}) of line.  New observations can also be added
at a time by setting \Rargument{superimpose} to be TRUE or by using the function 
\Rfunction{add.parallelCoordinates}.  The following example shows these other 
options:

\begin{figure}[htbp]
  \centering
<<ParallelCoordinatesPlot2,echo=TRUE,fig=TRUE>>=
row.obs<-1:10
parallelCoordinates(as(unst.1829[row.obs,], "matrix"), 
                     scaled=TRUE,
                     group=c(rep(1, 5), rep(2, 5)))
@
\caption{Scaled Parallel Coordinates plot of the first ten observations 
in the data of unst.1829, where the first 5 observations are in one group,
and the next five observations are in the second group.}
\label{fig:ParallelCoordinatesPlot2}
\end{figure}

Because there are many cell or row observations, an ImageParCoord or 
JointImageParCoord plot can be used to show all of the row observations
by binning on the y-axis and having the different column variables as labels
on the x-axis.  There are superimposed parallelCoordinates lines on the
colored binning that demonstrate the movement of observations from one bin
of one variable to another bin of the next variable.  In an \Robject{ImageParCoord}, 
these lines represent moves only between two adjacent variables, and 
in a JointImageParCoord, the lines represent movement among all of the 
variables.  The plots are subject to change with the ordering of the 
column variables as labels on the x-axis of the plots. Additional parallelCoordinates
lines can be added to any existing plot using the 
\Rfunction{add.parallelCoordinates} function.

The following series of graphs exemplify the Image parallelCoordinates plots.
Only the first 5 column variables and the first 1000 observations will be
shown.

\begin{figure}[htbp]
  \centering
<<ImageParCoordPlot, echo=TRUE, fig=TRUE>>=
## need to separate legend plotting
output1<-ImageParCoord(unst.1829@data[1:1000, 1:5], 
                       num.bins=16,
                       title="1000 obs 16 bins 5 trans", 
                       ntrans=5,
                       legend.plotted=FALSE,
                       plotted=TRUE,
                       image.plotted=TRUE,
                       lines.plotted=TRUE,
                       MY.DEBUG=FALSE)
@ 
\caption{This plot is an Image Parallel Coordinates plot of the first 
1000 observations and the first 5 column variables
in the "data" of unst.1829.}
\label{fig:ImageParCoordPlot}
\end{figure}


\clearpage

The functions \Rfunction{ImageParCoord} and \Rfunction{JointImageParCoord} can also plot 
histograms and traditional parallel coordinates plots as diagnostics in addition 
to or separately from the image parallel coordinates plots when the option
\Rargument{MY.DEBUG=TRUE}.


<<JointImageParCoordPlot, echo=TRUE, eval=FALSE>>=
## need to separate legend plotting
output3<-JointImageParCoord(unst.1829@data[1:1000,1:5],
                            num.bins=16,
                            title="1000 obs 16 bins 5 trans", 
                            ntrans=5, 
                            legend.plotted=FALSE, 
                            MY.DEBUG=FALSE)
@ 


\subsection{Dynamic Plotting Tools}
\label{SEC:DynamPlot}

Another multi-dimensional tool is \Rfunction{xgobi.FCS} which uses the \Rpackage{xgobi}
library.  We will leave the example for the user because the tool is
interactive.  Generally, by default \Rfunction{xgobi.FCS} will show the first 15
observations across all variables in the input data of the FCS
R-object in a high-level multi-dimensional plot, in which the user is
able to shift among sets of variables, color certain observations, and
rotate visual perspectives of these observations amongst these
variables.  The function \Rfunction{xgobi.FCS} allows the user to input the FCS
R-object, subset amongst the row observations, and subset amongst the
column variables to show in an \Rpackage{xgobi} plot.  Currently \Robject{ggobi} S4 
objects are still being contructed and would extend \Rpackage{xgobi} with more
dynamic plotting and subsetting features.  

The example code for the S3 \Rfunction{xgobi.FCS} is
shown below but is left for the user to run separately.
By default, only the first 15 rows and half of the column 
variables are shown.  If \Rargument{subset.row} and \Rargument{subset.col} are
specified, then these rows and columns will be displayed
for the user to view interactively.  In the second example,
the first 6000 rows with the first 2 column variables are shown.

<<ExampleXgobi, echo=TRUE, eval=FALSE>>=
## plots first 1/15 rows
## plots first 1/2 columns
xgobi.FCS(unst.1829, 
  title="unst.1829 default subset")
## plots all the rows
## plots only the first 3 columns
xgobi.FCS(unst.1829, 
  subset.row=1:6000, 
  subset.col=1:2,
  title="unst.1829: 6000 rows, 2 vars")
@ 

\section{Gating}
\label{SEC:gating}

<<table5, echo=FALSE>>=
slotnames<-c("gate", "history", "extractGatedData.msg", 
"current.data.obs", "data", "metadata")
description<-c("matrix of column indices for row selection", 
"vector of strings describing columns in gate", 
"vector of strings describing extraction of the data", 
"vector of the original row positions in current data", 
"matrix of column variables for rows denoting cells",
"FCSmetadata object")
table5 <- data.frame(slotnames, description)
save(table5, file="table5.Rda")
@ 

<<GenerateTable5FCSgateslots,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table5,
         caption="FCSgate slot descriptions",
         label="tab:5")
} else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table5

<<table6, echo=FALSE>>=
slotnames<-c("uniscut", "bipcut", "bidcut", "biscut", "biscut.quadrant", "")
description<-c("univariate single cut",
"bivariate polygonal cut", "bivariate double cut",
"bivariate single cut",
"values denoting the quadrant to be selected",
"$+$/$+$, $+$/$-$, $-$/$-$, $+$/$-$")
table6 <- data.frame(slotnames, description)
save(table6, file="table6.Rda")
@ 

<<GenerateTable6GateTypes,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table6,
         caption="Types of Gating",
         label="tab:6")
} else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table6

<<table7, echo=FALSE>>=
slotnames<-c("gateNum", "gateName", "type", 
"biscut.quadrant", "data.colpos", "data.colnames",
"IndexValue.In", "gatingrange", "prev.gateNum", 
"prev.gateName", "comment")
description<-c("column position in 'gate' matrix", 
"name of gate index", 
"type of gating", 
"quadrant selected, if gating type is 'biscut'", 
"'data' column variable positions used in gating",
"'data' names of the column variables used in gating",
"value of the gating index denoting inclusion",
"vector of gating thresholds",
"gateNum of previous gating, if any",
"gateName of previous gating, if any",
"comment by user for this gating index")
table7 <- data.frame(slotnames, description)
save(table7, file="table7.Rda")
@

<<GenerateTable7extractGateHistoryOutput,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table7,
         caption=paste("Description of 'extractGateHistory' output:",
                 "Gating Details", sep =" "),
         label="tab:7")
} else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table7

The \Rclass{FCSgate} class extends the S4 FCS class.  
The slots of the S4 FCSgate class are summarized in Table ~\ref{tab:5}.
There are three aspects to gating that are 
summarized below:

\begin{description}

\item[Create Gating Index]
Initially, a gating index will be created.  This binary index will
denote the selection of row observations in the data and 
will be appended as a column to the gate matrix.
The extension of the FCS object to a FCSgate object results from the 
S4 methods \Rfunction{createGate} and \Rfunction{icreateGate}, an interactive
method with user prompts for option values.  Table ~\ref{tab:6} 
summarizes the types of gates or cuts that can be used to 
select the data.  Currently, there are only gates involving one (\textit{i.e.} univariate)
or two (ie, bivariate) column variables of the data.  A single or double cut refers
to the number of thresholds for each variable.  For an example,
if there is a bidcut, then there are two thresholds for each of the two
variables.  The group of observations lying within these bivariate thresholds
are chosen.  In the bivariate polygonal cut "bipcut", the selection ranges
describe a polygonal shape which could be a square or any other closed linear 
shape description.  

\item[Extract Gated Data]
In order to collapse the data given the row selection index,
the method \Rmethod{extractGatedData} will subset the data according
to a specific value of the selection index (\textit{i.e.} \Rargument{IndexValue.In})  and to 
a particular column in the gate matrix.  Information about the 
extraction will be updated in the corresponding element of the 
\Robject{extractGatedData.msg} vector.  The metadata will also be updated
in terms of row size and the "original" flag will be set as FALSE.
The "current.data.obs" will also be subset according to the selection 
index.  In summary, the S4 method \Rmethod{extractGatedData} handles data 
collapsing with a corresponding row selection index of a FCSgate 
class object. 

\item[Extract Gating Information]
The \Rfunction{extractGateHistory} will output a list of values 
and details of a particular gating index.  Table ~\ref{tab:7} 
summarizes the descriptions of the gating information 
that is extracted.

\end{description}

The following subsections exemplify the creation of a gating or selection
binary index, the extraction or subsetting of the data using this newly created
gating index, the extraction of gating details, a description of bivariate gating 
schemes, and other gating functions for high-dimensional plots.

See ~\ref{SUBSEC:ProbBin} for details about subsequent analyses 
after gating \citep{roe:hardy:2001}.

\subsection{Creating Gate Index}  
\label{SUBSEC:createGate}

Using \Rfunction{createGate} or the interactive \Rfunction{icreateGate} will 
result in a binary index that will be appended to the gate matrix.  We
will use the FCS R-object \Robject{unst.1829} for a following demonstration of gating.

First a bivariate double cut gate will be implemented and will
capture the observations between 300 and 600 of the FSC-Height,
first column variable of data, and the Side Scatter, second 
column variable of data.

<<createGate1, echo=TRUE>>=
gate.range.x <- c(300,600)
gate.range.y <- c(300, 600)
unst.1829.gate1 <- createGate(unst.1829, 
  varpos=c(1,2), 
  gatingrange=c(gate.range.x, 
     gate.range.y), 
  type="bidcut", 
  comment="first gate")
@ 

In order to see the gate, we use \Rfunction{plotvar.FCS} and \Rfunction{showgate.FCS}.

Currently, the \Rfunction{showgate.FCS} does not work with \Rfunction{plotvar.FCS}
with the\Rargument{hexbin.CSPlot=TRUE} option.  The following is a hexbin
\Robject{ContourScatterPlot} of the complete data before extraction on the created gate.
Note that the gating thresholds are not shown.

<<HexGate1, echo=TRUE, eval=FALSE>>=
par(mfrow=c(1,1))
data.vars<-1:2
plotvar.FCS(unst.1829.gate1, varpos=data.vars, 
            plotType="ContourScatterPlot",
            hexbin.CSPlot=TRUE)
@ 

(Again, Sweave errors cause the above not to work here).


The gate for the can be shown with the original data with the 
following code:

<<showGate1, echo=TRUE, eval=FALSE>>=
data.vars<-1:2
plotvar.FCS(unst.1829.gate1, varpos=data.vars, 
            plotType="ContourScatterPlot",
            hexbin.CSPlot=FALSE)
showgate.FCS(unst.1829.gate1@data[,data.vars], 
             gatingrange= c(gate.range.x, gate.range.y), 
             Index = unst.1829.gate1@gate[,1], 
             type="bidcut", pchtype=".")
@ 


Alternatively, the corresponding \Rfunction{icreateGate} could be implemented
that would make a plot and prompt the user for information about 
the type of gate desired.  If parameters such as the type of
gate and the gatingrange are known before looking at the data,
these options can be input into\Rfunction{icreateGate}, and the plot will be shown.

The following plot and implementation describes the use of setting a univariate
single cut gate for selection of cells that are $\geq 500$ in value for the
4th data column variable from those selected by the first gate.  The previous 
gate is the first column of gate and the selection value is 1 
(\textit{i.e.} $prev.gateNum=1$ and $prev.IndexValue.In=1$). Setting \Rargument{prompt.all.options} 
to FALSE will surpress other interactive prompts for  
the title and gating color of the plot. 

\begin{figure}[htbp]
  \centering
<<showGate2, echo=TRUE, fig=TRUE>>=
unst.1829.gate2 <- icreateGate(unst.1829.gate1, varpos=4, 
                               gatingrange=500, type="uniscut", 
                               prev.gateNum=1, 
                               prev.IndexValue.In=1,
                               comment="", MY.DEBUG=FALSE, 
                               prompt.all.options=FALSE)
@
\caption{unst.1829: The gating index for fourth column variable 
of the data is shown.  The row observations beyond the vertical 
gate of 500 of uniscut are selected with an \Rargument{IndexValue.In=1}.}
\label{fig:showGate2}
\end{figure}

For a completely interactive gating session,
the user can implement \Rfunction{icreateGate} on a \Robject{FCS} R-object
and input all plotting and gating options 
after each prompt.

\subsection{Data Extraction from Gate Index}
\label{SUBSEC:extractGatedData}

The extraction or row subsetting of the data matrix corresponding 
to a gating index is implemented by \Rfunction{extractGatedData}.

The following extraction the data 
will use the first gating index (\textit{i.e.} the 
first column of the gate matrix specified with 
\Rargument{gateNum=1}) and the selection value of 1 (\textit{i.e.} selection of 
observations with \Rargument{IndexValue.In=1}). 

<<extractGatedData1, echo=TRUE>>=
unst.1829.subset1.1 <- extractGatedData(unst.1829.gate2, 
  gateNum = 1, 
  IndexValue.In = 1, 
  MY.DEBUG = FALSE)
unst.1829.subset1.2 <- extractGatedData(unst.1829.gate1, 
  gateNum=1, 
  IndexValue.In=1, 
  MY.DEBUG=FALSE)
@ 

Both the \Robject{unst.1829.gate1} 
and \Robject{unst.1829.gate2} are \Robject{FCSgate} objects
with the same data but different gate matrices.
The generic method \Rmethod{equals} will only 
evaluate the equality of the \Robject{data}
and \Robject{metadata} slots and not of the gate matrix
for \Robject{FCSgate} objects.

<<FCSgateEquality, echo=TRUE>>=
equals(unst.1829.subset1.1, 
  unst.1829.subset1.2, 
  check.filename=FALSE, 
  check.objectname=FALSE)
@ 

Extraction using the second column index of the gate matrix (\textit{i.e.} \Rargument{gateNum=2})
and selecting those with \Rargument{IndexValue.In=1} could be implemented on either a 
previously extracted \Robject{FCSgate} object or the \Robject{FCSgate} object without extraction.
The output \Robject{unst.1829.subset.2.1} and \Robject{unst.1829.subset2.2} should have the
same \Robject{data} and \Robject{metadata} slots evaluated by \Rfunction{equals}.

<<extractGatedData2, echo=TRUE>>=
unst.1829.subset2.1 <- extractGatedData(unst.1829.subset1.1, 
  gateNum = 2, 
  IndexValue.In = 1, 
  MY.DEBUG = FALSE)
unst.1829.subset2.2 <- extractGatedData(unst.1829.gate2, 
  gateNum = 2, 
  IndexValue.In = 1, 
  MY.DEBUG = FALSE)
equals(unst.1829.subset2.1, 
  unst.1829.subset2.2, 
  check.filename=FALSE, 
  check.objectname=FALSE)
@ 

\subsection{Extraction of Gating Details from "history"}
\label{SUBSEC:extractGateHistory}

The use of \Rfunction{extractGateHistory} extracts 
information for a particular gate index.  The
list output provides an easy way to access the
information that can be used as input for the 
functions \Rfunction{createGate}, \Rfunction{icreateGate},
and \Rfunction{extractGatedData} in subsequent gating implementations.

The extraction of gating information before gated data extraction
is shown in the for gates 1 and 2.

<<extractGateHistory1, echo=TRUE>>=
info.gate1 <- extractGateHistory(unst.1829.gate2, gateNum=1)
info.gate1
info.gate2 <- extractGateHistory(unst.1829.gate2, gateNum=2)
info.gate2
@ 

The extraction of gating information after implementing
"extractGatedData" provides the following output for gates 1 and 2, 
respectively:

<<extractGateHistory2, echo=TRUE>>=
info.gate1.1 <- extractGateHistory(unst.1829.subset2.1, gateNum=1)
info.gate1.1
info.gate2.1 <- extractGateHistory(unst.1829.subset2.1, gateNum=2)
info.gate2.1
@ 

Suppose the next gate is a bivariate double cut on the 5th and 6th column
variables of the "data" matrix.  If this gate is implemented from the 
previous first gate, then this extracted information \Robject{info.gate1} 
is used as well as \Robject{info.gate1.1} to identify the 
previous gating information (\textit{i.e.} \Robject{previous.gateNum} and 
\Robject{previous.IndexValue.In} in the example).

<<createExtractGateHistory, echo=TRUE>>=
gate.range.x <- c(200, 300)
gate.range.y <- c(100, 500)
previous.gateNum <- info.gate1$gateNum
previous.IndexValue.In <-info.gate1$InexValue.In
unst.1829.gate3 <- createGate(unst.1829.gate2, 
  varpos = c(1,2), 
  gatingrange = c(gate.range.x, gate.range.y), 
  type="bidcut", 
  prev.gateNum = previous.gateNum, 
  prev.IndexValue.In = previous.IndexValue.In, 
  comment="first gate")
extractGateHistory(unst.1829.gate3, gateNum=3)
@ 

Subsequent data extraction can be made on the FCSgate 
object \Robject{unst.1829.gate3} using \Rfunction{extractGatedData} 
given a particular gate index column in the gate matrix.

\subsection{Gating Schemes}
\label{SUBSEC:GatingSchemes}

The \Rfunction{FHCRC.HTVNFCS} and the \Rfunction{VRC.HVTNFCS} are functions 
that implement \Rfunction{icreateGate} and \Rfunction{extractGatedData} 
as example gating procedures \citep{roe:hardy:2001}.

The user will be prompted for gating and plotting input with the following examples
and associated FCS R objects (shown and not demonstrated).

<<HVTNFCS, echo=TRUE, eval=FALSE>>=
MC.053.gt <- FHCRC.HVTNFCS(MC.053)
MC.054.gt <- FHCRC.HVTNFCS(MC.054)
MC.055.gt <- FHCRC.HVTNFCS(MC.055)
st.1829.gt <- VRC.HVTNFCS(st.1829)
unst.1829.gt <- VRC.HVTNFCS(unst.1829)
st.DRT.gt <- VRC.HVTNFCS(st.DRT)
unst.DRT.gt <- VRC.HVTNFCS(unst.DRT)
@ 

If the user decides to implement one of the example gating
schemes on his or her own FCS R object, the column variable positions 
can be adjusted for each gate implementation such that 
the variables to be gated may remain the same.  The following example 
shows that for gate 2, column variable positions 7 and 5 refer to cd3 and cd8,
respectively for that data matrix of \Robject{MC.053}, the \Robject{FCS} 
object to be gated. Likewise, column variable
positions that correspond to cd69 and INFgamma are 4 and 3.
      
<<FHCRCgatevarschange, echo=TRUE, eval=FALSE>>=
data(MC.053min)
MC.053[["longnames"]]
FHCRC.HVTNFCS(MC.053, gate2.vars=c(7,5), gate3.vars=c(4,3))
@ 

\subsection{Other Image Gating}
\label{SUBSEC:OtherImageGating}

There are other gating procedures that can be implemented on
high-dimensional plots.  The \Rfunction{gate.IPC} interactive function allows the
user to click on upper and lower bin boundaries for a particular
variable to subset.  The subsequent graphs represent this subset of 
points that move from one variable to the next.  The following code will
be left for the user to implement as an exercise.

<<gateIPC, echo=TRUE, eval=FALSE>>=
st.DRT2 <- st.DRT
st.DRT2@data <- st.DRT@data[1:1000,]
gate.IPC(st.DRT2, 3,
         hist.plotted=FALSE,
         image.plotted=TRUE,
         para.plotted=FALSE,
         lines.plotted=TRUE,
         MY.DEBUG=FALSE)
@ 

Currently, there is still work in progress to gate on the dynamic plots 
\Rfunction{ggobi} and \Rfunction{xgobi}.  See Section ~\ref{SEC:DynamPlot}
for basic plotting usage.

\section{Exploratory Data Analysis}
\label{SEC:expl-data-analys}

The user may decide to use more qualitative means to investigate the data.  
The Patient Rule Induction Method (PRIM) allows the extraction of \textbf{rules} 
defined as subsets that maximizes or minimizes a target function which is 
usually specified as the mean of a binary label \citep{fried:fish:1998}.  
In the flow cytometry setting,
this target function is the mean of binary HIV-protein stimulated (Y=1) 
or unstimulated status (Y=0) for a particular immunofluoroescence 
data subset or box, which ultimately estimates 
a rule through iterative trimmings of the box in the greedy, top-down Peeling Step
and iterative additions into the box during the patient Expansion Step.  A Cross-Validation 
Step implements the same Peeling and Expansion Steps on Testdata Sets. 
Hence, the estimated rules aim at finding distributional
differences between the HIV-protein stimulated and unstimulated cells
in a multi-dimensional setting where many different immunofluoroescence 
measurements are made on the same sample of cells from an individual in an HIV vaccine 
trial.  Again, the results of PRIM are only exploratory because it is a qualitative 
process that needs subjective, sound judgments to arrive at conclusions for each 
step of PRIM. PRIM is regarded as a tool for hypothesis 
generation rather than for inference.

Please refer to the "PRIM.pdf" manual in the \Rpackage{rfcprim} package
for details regarding the functions used on the \Robject{data}
component of the FCS R-objects.

\clearpage
\section{Flow Cytometry Statistical Testing and Inference}
\label{SEC:flowcyt-test-intro}

The testing tools in this section are used to evaluate differences between 
HIV-protein stimulated and unstimulated scenarios,
particularly in the IFNgamma measurement after gating 
described by \cite{roe:hardy:2001}.

Each subsection describes particular tests that are implemented
by \textbf{runflowcytests} and other functions.

\subsection{Probability Binning}
\label{SUBSEC:ProbBin}

The current S3-class object \Robject{ProbBin.FCS} describes 
the equal probability binning of a univariate, immunofluoroescence
measurement (usually of IFN-gamma) after the implementation of a series of 
gating schemes across different immunofluorescence measurements.  
\textit{Equal probability binning} ensures that there are 
equal number of observations, \textbf{N}, within a bin across all bins 
constructed by cut-offs or integer breakpoints of the immunofluorescence measurement.
The final bin may contain more or less than \textbf{N}, the pre-specified 
number within each bin. The function, \Rfunction{breakpoints.ProbBin.FCS}, 
makes the breakpoints or cut-offs for equal probability binning in two ways:
 
\begin{description}

\item[combined] based on the combination of the univariate 
   distributions (usually of INF-gamma) 
   of both the HIV-protein stimulated and unstimulated samples of cells

\item[by.control] based on only the unstimulated HIV-protein sample.
   These breakpoints are then used to make histogram 
   objects from both the HIV-protein stimulated and unstimulated cell 
   samples from an individual \citep{roe:hardy:2001}.

\end{description}

<<table8, echo=FALSE>>=
slotnames<-c("unst.hist", "st.hist", "PB", 
"N.in.bin", "varname")
description<-c("unstimulated histogram", 
"stimulated histogram", 
"'combined'/'by.control'", 
"number per bin for cut-off construction",
"name of distribution/variable")
table8 <- data.frame(slotnames, description)
save(table8, file="table8.Rda")
@ 

<<GenerateTable8ProbBinFCSslots,results=tex, echo=FALSE>>=
if (require(xtable)) {
  xtable(table8,
         caption=paste("Description of 'ProbBin.FCS' S3 list",
                 "output", sep =" "),
         label="tab:8")
} else {
  cat("XTABLE not present; please install to get the right table.")
}
@ %def xtable.table8

The \Rfunction{ProbBin.FCS} object is a S3 list of the following components in Table ~\ref{tab:8}

We will construct two gated objects as described in Section ~\ref{SEC:gating}.
The stimulated gated object is \Robject{st.DRT.gt} and the unstimulated gated object
is \Robject{unst.DRT.gt}. Here we will only gate on the bivariate double cut 
that extracts the lymphocytes from the Forward Scatter and Side Scatter measurements. 
Then we will extract the "IFN-gamma" measurment from each sample and then 
construct a ProbBin.FCS object.

The following implements a "bidcut" gate and plots the image with the gate.

\begin{figure}[htbp]
  \centering
<<FCSicreateGate1, echo=TRUE, fig=TRUE>>=
unst.DRT.gt <- icreateGate(unst.DRT, varpos=c(1,2), 
                           gatingrange=c(300,650, 300, 500), type="bidcut", 
                           comment="", MY.DEBUG=FALSE, 
                           prompt.all.options=FALSE)
@
\caption{unst.DRT.gt: The gating index for first two column variables 
of the data is shown for the selection of the central cluster of lymphocytes.  
The colored points in the center of the bidcut are selected with an 
\Rargument{IndexValue.In = 1}.}
\label{fig:FCSicreateGate1}
\end{figure}

\begin{figure}[htbp]
  \centering
<<FCSicreateGate2, echo=TRUE, fig=TRUE>>=
st.DRT.gt <- icreateGate(st.DRT, varpos=c(1,2), 
gatingrange=c(300,650, 300, 500), type="bidcut", 
comment="", MY.DEBUG=FALSE, 
prompt.all.options=FALSE)
@
\caption{st.DRT.gt: The gating index for first two column variables 
of the data is shown for the selection of the central cluster of lymphocytes.  
The colored points in the center of the
bidcut are selected with an \Rargument{IndexValue.In=1.}}
\label{fig:FCScreateGate2}
\end{figure}

\begin{figure}[htbp]
  \centering
<<FCSicreateGate3, echo=TRUE, fig=TRUE>>=
unst.DRT.gt <- icreateGate(unst.DRT.gt, varpos=c(7,5), 
gatingrange=c(500,1024, 0, 1024), type="bidcut", 
prev.gateNum=1, prev.IndexValue.In=1,
comment="", MY.DEBUG=FALSE, 
prompt.all.options=FALSE)
@
\caption{unst.DRT.gt: The gating index for 7th and 5th column variables 
of the data is shown for the selection of cd3+ cells based on the previous 
gating and selection of lymphocytes (\textit{i.e.}\Rargument{prev.gateNum=1}, 
\Rargument{prev.IndexValue.In=1}). The colored points of the bidcut gate 
are selected with an \Rargument{IndexValue.In = 1}.}
\label{fig:FCSicreateGate3}
\end{figure}

\begin{figure}[htbp]
  \centering
<<FCSicreateGate4, echo=TRUE, fig=TRUE>>=
st.DRT.gt <- icreateGate(st.DRT.gt,
                         varpos=c(7,5),
                         gatingrange=c(500,1024, 0, 1024), 
                         type="bidcut", 
                         prev.gateNum=1,
                         prev.IndexValue.In=1,
                         comment="", 
                         MY.DEBUG=FALSE,
                         prompt.all.options=FALSE)
@
\caption{st.DRT.gt: The gating index for the 7th and 5th column variables 
of the data is shown for the selection of cd3+ based on the previous gating 
and selection of lymphocytes (ie, \Rargument{prev.gateNum=1}, \Rargument{prev.IndexValue.In = 1}).  
The colored points the bidcut gate are selected with an \Rargument{IndexValue.In = 1}.}
\label{fig:FCSicreateGate4}
\end{figure}

We could choose to implement subsequent gates; each gate that is dependent on the
selection of a previous gate.  We leave further gating as an exercise for the user.  
Below is an extraction of the data from the cd3+ lymphocytes (\textit{i.e.} from the second 
gate of cd3+ cells based on the selection of lymphocytes in the first gate).

<<FCSExtractGatedObjs, echo=TRUE>>=
unst.DRT.ex <- extractGatedData(unst.DRT.gt, gateNum=2)
st.DRT.ex <- extractGatedData(st.DRT.gt, gateNum=2)
@ 

We decide to analyze the IFN-gamma distribution among the selected cells.
We obtain this measurement, IFN.unst and IFN.st, from the HIV-protein 
unstimulated and stimulated samples of individual DRT, respectively. 

<<FCSgetIFNgamma, echo=TRUE>>=
IFN.unst <- unlist(as(unst.DRT.ex[,4], "matrix"))
IFN.st <- unlist(as(st.DRT.ex[,4], "matrix"))
@ 

These two distributions are used to implement probability binning \Rargument{by.control}
with 100 observations in each bin based on the control, unstimulated group:

<<ProbBinFCSbycontrol, echo=TRUE>>=
PB.by.control <- ProbBin.FCS(IFN.unst, 
  IFN.st, 100, 
  varname=unst.DRT[["longnames"]][4],
  PBspec="by.control", 
  MY.DEBUG=FALSE)
@ 

Alternatively, these two IFN distributions could have been used 
to implement probability binning constructed by the \Rargument{combined} data
having 100 observations in each bin:

<<ProbBinFCScombined, echo=TRUE>>=
PB.combined <- ProbBin.FCS(IFN.unst, 
  IFN.st, 100, 
  varname=unst.DRT[["longnames"]][4],
  PBspec="combined", 
  MY.DEBUG=FALSE)
@ 

To verify the \Robject{ProbBin.FCS} class objects, the following code using 
\Rmethod{is} can be used:

<<isProbBinFCS, echo=TRUE>>=
is(PB.by.control, "ProbBin.FCS")
is(PB.combined, "ProbBin.FCS")
@

We show the following \Robject{ProbBin.FCS} plots of the 
\Robject{PB.by.control}" object. 

\begin{figure}[htbp]
  \centering
<<plotProbBinFCSunstimul, echo=TRUE, fig=TRUE>>=
plot(PB.by.control, plots.made="unstimulated", freq=TRUE)
@
\caption{PB.by.control: The histogram shows the equal probability that was
implemented on the unstimulated or control IFNgamma data.  Here the counts in each 
bin are about 100}
\label{fig:plotProbBinFCSunstimul}
\end{figure}

\begin{figure}[htbp]
  \centering
<<plotProbBinFCSstimul, echo=TRUE, fig=TRUE>>=
plot(PB.by.control, plots.made="stimulated", freq=TRUE)
@
\caption{PB.by.control: The histogram shows the equal probability that was
implemented on the unstimulated or control IFNgamma data of which whose breaks are applied 
to the stimulated data (which is shown in the above histogram).  Here the counts in each 
bin can be shown setting the options freq=TRUE and labels=TRUE, which will prompt a warning
because the binning is not equidistant.}
\label{fig:plotProbBinFCSstimul}
\end{figure}

The statistics associated with testing the two distributions for differences, assuming the 
null of no difference between the stimulated and unstimulated samples can be referenced in 
\citep{roe:tre:moore:her:2001, baggerly:2001}.  The summary of a \Robject{ProbBin.FCS} object will 
produce statistics that test the difference between the distributions of the stimulated
and unstimulated samples.  See Section ~\ref{SUBSEC:testing:diff}.

<<summaryProbBinFCS, echo=TRUE>>=
summary(PB.by.control)
summary(PB.combined)
@ 

\subsection{Testing for the difference between two univariate distributions}
\label{SUBSEC:testing:diff}

This section describes the tools used to test for the difference 
between the HIV-protein stimulated sample and the HIV-protein 
unstimulated sample in terms of the distribution of an
immunofluoroescence measurement and, in particular, of 
the IFN-gamma measurement.  There 
have been four main testing approaches that are outlined belowed.  The 
null hypothesis is the assumption that both samples originate from 
the same distribution (\textit{i.e.}, there is no difference in two distributions), 
and the alternative is that they are from different
distributions (\textit{i.e.}, the stimulated scenario compared to the unstimulated 
scenario are different in terms of cell densities).  

\begin{description}

\item[WLR.flowcytest] The weighted log rank test (by default when rho=0) 
   tests the difference in survival curves of the stimulated and 
   unstimulated scenarios when all measurements are regarded as having 
   the "event" and "time" is considered to be the IFN-gamma or other 
   immunofluorescence measurement. Thus, at every point on the 
   immunofluorescence, the curves are tested for differences.
   A plot of the survival curves for both samples is also optionally output.

\item[KS.flowcytest] Kolmogorov-Smirnoff test also evaluates the difference in
   distributions for the control and the stimulated samples, but may 
   be more sensitive and result in a higher false positive rate when 
   there are a larger number of data points.

\item[ProbBin.flowcytest] Statistics proposed by Keith A. Baggerly and
     Mario Roederer include Chi-squared and Normal tests for the
     PB metric via probability binning (both based on the control data
     only ("by.control") and based on the combined dataset of both the
     stimulated and the control samples ("combined")
     \citep{roe:tre:moore:her:2001, baggerly:2001}.

\item[pkci2.flowcytest] The method, proposed by Zoe Moodie, PhD, tests 
   the difference of the upper tails of the two distributions rather 
   than the range of the distribution for IFN-gamma or other univariate 
   immunofluorescence measurement.

\item[runflowcytests] This function will run all of the aforementioned tests 
   either separately or together in one call.

\end{description}

As a single example implementing all of the testing tools, we will only
demonstrate the testing with the \Rfunction{runflowcytests}.  Further documentation 
for each individual test can be obtained in the help documentation for the 
following tests: \Rfunction{WLR.flowcytest}, \Rfunction{KS.flowcytest}, \Rfunction{ProbBin.flowcytest}, 
\Rfunction{pkci2.flowcytest}.  Please note that \Rfunction{ProbBin.flowcytest} provides the same 
statistical output as \Rfunction{summary.ProbBin.FCS}. 

<<runflowcytestsExample, echo=TRUE>>=
output.runflowcytests <- runflowcytests(IFN.unst, 
  IFN.st, 
  KS.plotted=FALSE, 
  WLR.plotted=FALSE, 
  PBobj.plotted=FALSE)
@ 

The plots and output for the \Rfunction{KS.flowcytest} and the \Rfunction{WLR.flowcytest} are
shown with the code on the following pages.  The plots for the \Rfunction{ProbBin.flowcytest}
is similar to those shown in Figure ~\ref{fig:plotProbBinFCSunstimul} and 
Figure ~\ref{fig:plotProbBinFCSstimul}.

\begin{figure}[htbp]
  \centering
<<KSflowcytestPlot, echo=TRUE, fig=TRUE>>=
output.KSflowcytest <- KS.flowcytest(IFN.unst, 
  IFN.st, 
  KS.plotted=TRUE,
  MY.DEBUG=FALSE)
@
\caption{KS.flowcytest plot shows the distributions of the stimulated and 
unstimulated samples.}
\label{fig:KSflowcytestPlot}
\end{figure}

\clearpage

\begin{figure}[htbp]
  \centering
<<WLRflowcytestPlot, echo=TRUE, fig=TRUE>>=
output.WLRflowcytest <- WLR.flowcytest(IFN.unst, 
  IFN.st, 
  WLR.plotted=TRUE,
  MY.DEBUG=FALSE)
@
\caption{WLR.flowcytest plot shows the survival curves for the two 
distributions if every data point was regarded as being an event, and time
was regarded as the IFN-gamma measurement.}
\label{fig:WLRflowcytestPlot}
\end{figure}


\subsection{ROC curves for testing tails of two distributions}
\label{SUBSEC:testing-ROC}

For each individual there is a pair of data corresponding to a HIV-protein
stimulated sample and a HIV-protein unstimulated/control sample.  For each individual 
who is either HIV-positive or negative, the 
99.9-th percentile for the unstimulated sample and the percent 
positive for the stimulated sample based on this control-based 
99.9-th percentile was calculated.
Here we exemplify the calculations for the \Robject{IFN.st} and the \Robject{IFN.unst} obtained 
from the gating for the HIV-negative individual 1829. 

First, using \Rfunction{percentile.FCS}, we obtain the 99.9-th percentile based 
on the control, unstimulated sample.

<<percentileFCS, echo=TRUE>>=
unst.percentile <- percentile.FCS(IFN.unst, percent=0.999)
@

Now using \Rfunction{PercentPos.FCS}, we obtain the percent positives for both 
the unstimulated and the stimulated samples, respectively, using the \Rargument{unst.percentile}.
Note that the percent positive for the control sample is about 1 - 0.999.

<<PercentPosFCS, echo=TRUE>>=
PercentPos.FCS(IFN.unst, percentile=unst.percentile)$percent.pos
PercentPos.FCS(IFN.st, percentile = unst.percentile)$percent.pos
@

To evaluate which HIV-protein stimulation results in the 
most sensitive detection of HIV-positive 
status as well as the lowest chance of falsely 
concluding HIV-positive status 
based on a stimulated sample's higher 99.9th percentile 
control-based percent positive 
(\textit{i.e.}, according to the approach used in \Rfunction{pcki2.flowcytest}).  Zoe Moodie, PhD, 
constructed the ROC (Receiver Operating Characteristic) HIV-protein-specific curves
in which the cut-offs are based on the combined stimulated and unstimulated percent 
positives obtained by the previous methods.

The \Robject{PerPosROCmin} data in the \Rpackage{rfcdmin} package exemplifies the
percent positives obtained to plot the ROC curve.

Here we retrieve the example data provided by Zoe Moodie, PhD.

<<callPerPosROC, echo=TRUE>>=
data(PerPosROCmin, package="rfcdmin")
@

The function \Rfunction{ROC.FCS} shows the ROC curve and sensitivity, specificity output after 
the implementation of the functions \Rfunction{percentile.FCS} and 
\Rfunction{PercentPos.FCS} to obtain the percentiles and the percent positives, respectively, for each 
individual's HIV-protein stimulated and unstimulated pair for a particular immunofluorescence 
measurement.

\begin{figure}[htbp]
  \centering
<<ROCexample1, echo=TRUE, fig=TRUE>>=
GAG<-ROC.FCS(hivpos.gag, hivneg.gag)
#plotting the pola stimulated 100* percent positives
POLA<-ROC.FCS(hivpos.pola, hivneg.pola, lineopt=2, colopt=2, overlay=TRUE)
#plotting the polb stimulated 100* percent positives
POLB<-ROC.FCS(hivpos.polb, hivneg.polb, lineopt=4, colopt=3, overlay=TRUE)
legend(0.7, 0.7, c("gag", "polA", "polB"), col = c(1,2,3), lty=c(1,2,4))
@
\caption{The ROC curves are based on the different HIV-proteins used for the 
stimulation of immune responses.  Here the GAG appears to achieve greater 
sensitivity at a lower 1-specificity 
when evaluating the difference in immune responses between an 
HIV-infected and HIV-noninfected
profiles using the \Rfunction{pkci2.flowcytest} approach.}
\label{fig:ROCexample1}
\end{figure}

\clearpage

\section{Future Updates}
\label{SEC:futureupdates}

Most notable future updates include converting the testing and the gating 
into generic S4 class objects.  Currently these objects are all S3.

The dynamic plotting functions will also be converted to S4 generic 
objects with additional visualization tools and methods.

Future work with PRIM include using the algorithm with real datasets and 
displaying the results with the tools provided in the \Rpackage{rflowcyt} package.

\bibliographystyle{plainnat}
\bibliography{rflowcyt}

\end{document}