%\VignetteIndexEntry{An R Package for Analysis of High Throughput Flow Cytometry Data}

\documentclass[12pt]{article}

\usepackage{amsmath,pstricks}
\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}

\SweaveOpts{keep.source=TRUE}

<<echo=FALSE,results=hide>>=
library(gplots)
@

\textwidth=6.2in
\textheight=8.5in
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}
\newcommand{\Rfunction}[1]{{\texttt{#1()}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}  
\newcommand{\Rclass}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\code}[1]{{\texttt{#1}}}
\bibliographystyle{plainnat}

\title{Analysis of High Throughput Flow Cytometry Data using \Rpackage{plateCore}}

\author{Errol Strain, Florian Hahne, Perry Haaland}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

\maketitle
\tableofcontents

\section{Overview}
\Rpackage{plateCore} is a Bioconductor packaged created to make processing and analysis of large, complex flow datasets
in R easier. High throughput flow studies are often run in a 96 or 384-well plate format, with a number of different samples, 
controls, and antibodies-dye conjugates present on the plate. Analyzing the output from the cytometer requires keeping track of the contents
of each well, matching sample wells with control wells, gating each well/channel separately, making the appropriate plots, and
summarizing the results. \Rpackage{plateCore} extends the \Rpackage{flowCore} and \Rpackage{flowViz} packages to work on
\Rclass{flowPlate} objects that represent these large flow datasets. For those familiar with \Rpackage{flowCore} and \Rpackage{flowViz}, 
the gating (filtering), transformation, and other data manipulations for \Rclass{flowPlates} are very similar to \Rclass{flowSets}. 

In this document we show how setup a \Rpackage{plateCore} analysis and provide examples for a publicly available dataset. The 
peripheral blood mononucleocyte (PBMC) dataset is a collection of five 96-well plates that have been stained with 189 different antibody-dye conjugates.
The goal is to identify cells that are positively stained relative to some corresponding negative control.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}

\subsection{Background}
\Rpackage{plateCore} was created to address the need for robust methods to analyze data from
\href{http://www.bd.com/technologies/discovery_platform/BD_FACS_CAP.asp}{BD FACS\texttrademark CAP} experiments.
The current version of FACS\texttrademark CAP has 200+ antibody-dye conjugates on a single 96-well plate,
where the antibodies are specific for different human cell surface markers or proteins. The output
from an experiment is a per cell measurement of protein/marker expression for each antibody.  
FACS\texttrademark CAP is designed to be a tool for screening a large number of surface markers 
on different types of human tissue or blood samples. Analyzing the data requires objective, high throughput
approaches to gating and summarizing large amounts of flow data. 

In addition to BD FACS\texttrademark CAP, \Rpackage{plateCore} also works for any other type of plate-based flow
data, or any collection of samples that can be organized into a \Rclass{flowPlate} object. \Rclass{flowPlates}
are simply a convenient way to structure the data and manage the annotation for a set of related flow samples.
Once the data is in \Rpackage{plateCore}, it's also easy to summarize results across multiple experiments and
to integrate flow results with other types of data using Bioconductor and access statistical methods in R.

\subsection{Bioconductor Flow Tools}
\Rpackage{plateCore} is built on top of \Rpackage{flowCore} and \Rpackage{flowViz}, and the functionality implemented
for \Rclass{flowPlates} is a subset of what is available for \Rclass{flowSets}. Getting your data into a \Rclass{flowSet} 
is the best way to start exploratory analysis, and also gives you access to other flow packages such as \Rpackage{flowQ},
\Rpackage{flowStats}, and \Rpackage{flowClust}. Once the layout of the experiment has been standardized, loading
the data into plateCore allows users to automate portions of the analysis, and makes summarizing the data and 
extracting the results easier.


\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Installation and Getting Started}
  
Install the \Rpackage{plateCore} package and load the sample data containing a plate dataset, compensation files, and 
a table decsribing the layout of the experiment.
<<loadPacks,echo=TRUE,results=hide>>=
library(plateCore)
data(plateCore)
@
<<>>=
ls()
@ 

The \Robject{pbmcPlate} is actually one of the 5 plates that make up the Peripheral
Blood Mononucleocyte Cell (PMBC) dataset in the Section ~\ref{section:PBMC}. 
This particular plate has 96 wells and is stained in the FL1.H (FITC), FL2.H (PE), FL3.H (PerCP.Cy5.5), and FL4.H (APC) channels. \Rpackage{flowCore}
was used to read in the raw FCS files, and the \Robject{pbmcPlate} is a \Rclass{flowSet} resulting from \Rfunction{read.flowset} operation. \Rclass{flowSets}
are modeled on the microarray data structures used in Bioconductor.  The \Robject{pbmcPlate} is comprised of an
\Rclass{exrs} expression matrix, along with the associated \Rclass{phenoData}.  

<<echo=TRUE>>=
pbmcPlate
@

The wellAnnotation \Rclass{data.frame} describes the layout of the plate and the content of each well. Stained wells have a row
for each of the channels of interest, while unstained wells only have a single row. The information in well annotation is similar 
to the \Robject{pData} matrix for \Rclass{phenoData(AffyBatch)}. For \Rclass{pData} all information about a sample is contained on a single row, which
makes it difficult to handle a multiplexed flow dataset since each channel essentially requires multiple columns to 
describe the contents of a well. The wellAnnotation format is easier to create and maintain, and \Rpackage{plateCore}
handles organizing the relevant information into a \Rclass{pData} object and incorporating it into a \Rclass{flowSet}.

<<echo=TRUE>>=
head(wellAnnotation)
@

The first column of the table contains a unique well identifier, which must correspond to one of of
the samples from pbmcPlate. The FCS output from the cytometer usually contains the well id in the filename, and these filenames
are what \Rpackage{flowCore} uses to assign sample names to a \Rclass{flowSet}. These well identifiers must be unique to a 
well, which usually means using 3 character codes like "A01", "B09", "H10", etc.   \Rpackage{plateCore} assumes that the first
character provides the row name and that second two characters give the column on the plate.  The second column in the annotation table
gives the sample type, currently either "Unstained", "Test","Negative.Control", or "Isotype". Each well can only be a single sample type. The
Ab.Name column contains either the name of the antibody, or some other descriptor that will be useful for making plots.
Looking at well B05 in the FL4.H, we see that it is stained with an antibody named CDbd14.  
<<echo=TRUE>>=
wellAnnotation[50,]
@

The antibody names in this PBMC dataset were masked prior to public release, so these names will not correspond to the standard
CD names. The Channel column tells which channel was used to detect fluoresence for the antibody-dye conjugate. Negative.Control
information indicates which well is the negative control (e.g. isotype) for test samples. The current version of plateCore only
supports a single negative control well for each test well, but multiple test wells can use the same negative control.  Each negative
control well should be assigned as it's own negative control for reasons that will be explained in following sections.

Additional columns can be included in wellAnnotation and will be incorporated in later results, but the 5 columns described here 
are the minimum for a \Rclass{flowPlate}.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Creating a \Rclass{flowPlate}}

Making a \Rclass{flowPlate} requires a \Rclass{flowset}, a well annotation table, and a name for the plate that is unique within
the set of flow data under analysis. Later we may want to combine information from multiple plates, so having a unique plate name
identifier (e.g. barcode) makes it easier to track the samples.  Using the sample dataset from the \Rpackage{plateCore}, a 
\Rclass{flowPlate} can be built.
<<echo=TRUE>>=
pbmcFP <- flowPlate(pbmcPlate,wellAnnotation,plateName="PBMC.001")
@

Looking at the first \Rclass{pData} entry from the \Robject{pbmcPlate} and from \Robject{pbmcFP}, we can see how the \Robject{wellAnnotation}
 was incorporated into \Rclass{pData}. 
<<echo=TRUE>>=
pData(phenoData(pbmcPlate))[1,]
pData(phenoData(pbmcFP))[1,]
@

The following examples use the lymphocyte population from \Robject{pbmcFP}, so we select the cells of interest using
a \Rclass{rectangleGate} and \Rclass{norm2Filter}.  Details on how to use \Rfunction{Subset} can be found in the Gating section 
and in \Rpackage{flowCore}. The cells selected by this gate are shown in Figure ~\ref{fig:lymphGate}.
<<echo=TRUE>>=
rectGate <- rectangleGate("FSC-H"=c(300,700),"SSC-H"=c(50,400))
normGate <- norm2Filter("SSC-H","FSC-H",scale.factor=2.5)
pbmcFP.lymph <- Subset(pbmcFP, rectGate & normGate)
@

\begin{figure}
\begin{center}
<<label=lymphGate,fig=TRUE,echo=FALSE>>=

print(xyplot(`SSC-H` ~ `FSC-H` | as.factor(Well.Id),pbmcFP[93:96],smooth=FALSE,displayFilter=TRUE,col=c("red","blue"),
filter=rectangleGate("FSC-H"=c(300,700),"SSC-H"=c(50,400)) & norm2Filter("SSC-H","FSC-H",scale.factor=2.5)))	
  
@
\end{center}
\caption{Forward (FSC) and side-scatter (SSC) dotplots for the first four wells of the PMBC plate. The lymphocyte population is
shown in blue while monocytes are located in the upper right corner. Lymphocytes were selected using the \Rclass{rectangleGate} 
and \Rclass{norm2Filter} gates from \Rpackage{flowCore}.}
\label{fig:lymphGate}
\end{figure}


\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Compensation and Background Correction}

The \Rfunction{compensation} function from \Rpackage{flowCore} is used to create a compensation matrix for the sample data.
Channel names in the compensation \Rclass{flowSet} must match the dataset under analysis, otherwise the matrix will not work correctly. 
Details about how \Rfunction{compensation} works can be found in \Rpackage{flowCore}. We can create a compensation matrix
using the sample data in \Rpackage{plateCore} and the \Rfunction{spillover} from \Rpackage{flowCore}.

<<echo=TRUE>>=
comp.mat <- spillover(x=compensationSet,unstain=sampleNames(compensationSet)[5],
patt=".*H",fsc="FSC-H",ssc="SSC-H",method="median")
@

This \Robject{comp.mat} matrix can then be applied to a \Rclass{flowPate} to correct for the effects of spillover. The difference between compensating
in \Rpackage{flowCore} and compensating in \Rpackage{plateCore} is that \Rpackage{plateCore} only compensates for the dyes listed
in \Robject{wellAnnotation}, whereas \Rpackage{flowCore} compensates each sample the same way. Since \Rpackage{plateCore} does this custom
compensation, it is important to list each dye or fluorophore in the wellAnnotation, even if the sample will not be used for further
analysis. If all the samples are stained the same way, then compensating with either approach should give the same results. The
\Robject{pbmcFP} can be compensated using the \Robject{comp.mat} from above.
<<echo=TRUE,eval=FALSE>>=
pbmcFPcomp <- compensate(pbmcFP.lymph,comp.mat)
@

In this particular case the \Robject{pbmcPlate} has already compensated on the cytometer, so no further compensation is necessary.

The process of analyzing and gating (filtering) the large amount of data in a \Rclass{flowPlate} can be simplified by first correcting
for the effects of cell size on background fluorescence. This step is not necessary in this PBMC lymphocyte example, since the 
level of autofluorescence is very low, but it can have large effects on other cell types that have a wider range of FSC values, such as fibroblasts and stem cells.
This correction will later allows us to define a one-dimensional gate between positive and negative
cells in only the channel interest, instead of having a two dimensional gate that includes the forward scatter (FSC) channel. The background
correction uses the unstained wells and fits a log-log linear model to FSC versus each fluorescence channel (Hahne et al. 2006 Genome Biology).  The
correction is then applied to all the wells on a the plate. The \Rfunction{fixAutoFl} function takes a \Rclass{flowPlate}, and character variables
indicating the FSC channel and the fluorsence channels (chanCols).  
<<echo=TRUE>>=
pbmcFPbgc <- fixAutoFl(pbmcFP.lymph,fsc="FSC-H",chanCols=rownames(comp.mat))
@
Figure ~\ref{fig:bgc1} shows the results of the autofluorescence correction for PBMC lymphoctes where the background fluorescence was
artificially inflated. 


\begin{figure}
\begin{center}
<<label=bgc1,fig=TRUE,echo=FALSE>>=
temp <- pbmcFP.lymph[96]
fsMed <- log10(median(exprs(temp[[1]])[,"FSC-H"]))
temp <- fsApply(plateSet(temp),function(x) {
			flVals <- log10(exprs(x)[,"FL1-H"])
			fsc <-  log10(exprs(x)[,"FSC-H"])
			exprs(x)[,"FL1-H"] <- 10^(flVals+4*(fsc-fsMed))
			x
		})
temp2 <- as(list("0877408774.H12"=temp[[1]]),"flowSet")
wellTemp <- subset(wellAnnotation(pbmcFP),Well.Id=="H12")
temp2 <- flowPlate(temp2,wellTemp,plateName="H12")
temp2 <- fixAutoFl(temp2,fsc="FSC-H",chanCols=c("FL1-H","FL2-H"),unstain="0877408774.H12")
temp <- as(list("H11"=temp[[1]],"H11 (corrected)"=temp2[[1]]),"flowSet")
temp <- transform("FL1-H"=log10) %on% temp
print(flowViz::levelplot(`FL1-H` ~ `FSC-H` | as.factor(name),data=temp, ylim=c(0,2.2)))	
@
\end{center}
\caption{Levelplots showing the effects of cell size (FSC) on fluorescence for the unstained cells before (left) and after (right) the 
background correction. The fluorescence signal intensity generally increases with increasing cell size (FSC). The effect
of cell size on background fluorescence for PBMC lymphocytes was exaggerated in this plot to demonstrate \Rfunction{fixAutoFl}.}
\label{fig:bgc1}
\end{figure}


\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Gating}

Gating on \Rclass{flowplates} is mainly performed using \Rfunction{Subset}. The first argument to this function is a \Rclass{flowPlate} and
the second is a valid \Rpackage{flowCore} filter(s). For exploratory analysis and setting up the initial gates, it is sometimes more
convenient to work with a \Rclass{flowSet}, which can be extracted from a \Rclass{flowPlate} using the \Rfunction{plateSet}
function.    
<<echo=TRUE>>=
fs <- plateSet(pbmcFP)
@

In this PBMC example, we can use a \Rclass{rectangleGate} to separate the lymphocytes from the debris and monocytes. Looking at the 
plots in Figure ~\ref{fig:lymphGate}, the lymphocytes look like they are located from about 400 to 650 on the FSC scale, and
100 to 300 on the SSC scale.  We can use \Rfunction{xyplot} from the \Rpackage{flowViz} package to display events inside the gates.
<<echo=TRUE,eval=FALSE>>=
rectGate <- rectangleGate("FSC-H"=c(400,700),"SSC-H"=c(100,300))
xyplot(`SSC-H` ~ `FSC-H` | as.factor(Well.Id), pbmcFP[1:2], displayFilter=TRUE,
smooth=FALSE, col=c("red","blue"),filter=rectGate)
@
Since we want this gate to be applicable to all the wells, it may help to enlarge the \Rclass{rectangleGate} and then use
a data-driven gate like \Rclass{norm2Filter} to pick out lymphocytes.  The lymphocyte population may drift over the course of analyzing
the plate. \Rclass{norm2Filter} will fit a bivariate normal to the FSC and SSC channels to identify an elliptical 
region of high density. 
<<echo=TRUE,eval=FALSE>>=
rectGate <- rectangleGate("FSC-H"=c(300,700),"SSC-H"=c(50,400))
normGate <- norm2Filter("SSC-H","FSC-H",scale.factor=1.5)
xyplot(`SSC-H` ~ `FSC-H` | as.factor(Well.Id), pbmcFP[1:4], displayFilter=TRUE, smooth=FALSE, col=c("red","blue"),filter=normGate & rectGate)
@
The results of applying the \Rclass{rectangleGate} and \Rclass{norm2filt} are shown in Figure~\ref{fig:lymphGate}.

Once the population of interest has been identified, \Rfunction{Subset} the \Rclass{flowPlate} to select those cells. These same
gates were used to select lymphocytes when we were initially creating the flowSet. If your sample is comprised of multiple
cell populations, as in this PBMC example, then it is necessary to \Rfunction{Subset} before the compensation and background correction steps. The
\Robject{compensationSet} should be processed with the \Rclass{rectangleGate} and \Rclass{norm2Filter}
gates prior to constructing the compensation matrix. 
<<echo=TRUE,eval=FALSE>>=
rectGate <- rectangleGate("FSC-H"=c(300,700),"SSC-H"=c(50,400))
normGate <- norm2Filter("SSC-H","FSC-H",scale.factor=1.5)
pbmcFP.lymph <- Subset(pbmcFP, rectGate & normGate)
@

Now we're ready to set the isotype (Negative.Control) gates that will be used to idenfity 
positively and negatively stained cells. We will estimate the gates using the \Rclass{flowPlate} that
has been subsetted for lymphocytes, compensate, and then background corrected. The default settings for the
\Rfunction{setControlGates} assumes the data is on a linear scale.
<<controlGates,echo=TRUE>>=
pbmcFPbgc <- setControlGates(pbmcFPbgc,gateType="Negative.Control",numMads=5)
pbmcFPbgc <- applyControlGates(pbmcFPbgc)
@
The \Robject{numMads} parameter indicates how far above the isotype population the 
positive-negative threshold is set. The threshold is set by taking the median fluorescence
intensity (MFI) and adding 5 median absolute deviations (MADs). The default value of
5 works well for a number of cell types but may need to be adjusted for specific applications.
Future versions of \Rpackage{plateCore} will use more sophisticated methods of estimating kernel
density to fit distrubutions to the data and have a more robust esimtate of the positive-negative
threshold. Once the control gates have be created, they are applied to the test wells using
the \Rfunction{applyControlGates}.

We see how reasonable the gates look using \Rfunction{xyplot}. Examples of the
test wells associated with the negative control well A03 are shown in Figure ~\ref{fig:isoGate1}.
To check all the gates in the FL1.H channel we would use the following code.  
<<echo=TRUE,eval=FALSE>>=
xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id), 
transform("FL1-H"=log10) %on% pbmcFPbgc, displayFilter=TRUE,
smooth=FALSE,col=c("red","blue"),filter="Negative.Control")
@

\begin{figure}
\begin{center}
<<label=isoGate1,fig=TRUE,echo=FALSE>>=

wells <- unique(subset(pbmcFPbgc@wellAnnotation,Negative.Control=="A03",select="name")[,1])
print(xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id), transform("FL1-H"=log10) %on% pbmcFPbgc[wells], displayFilter=TRUE,smooth=FALSE, col=c("red","blue"),
filter="Negative.Control"))

@
\end{center}
\caption{Dotplots showing the test wells associated with the negative control well A03. The gate was created automatically using
\Rfunction{setControlGates} to estimate the threshold between positive and negative cells based on the staining in A03.
\Rclass{rectangleGates} are drawn around the positive cells in each plot. Positive cells are shown in blue and negative cells in
red.}
\label{fig:isoGate1}
\end{figure}



\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Displaying Data}
Tools for visualizing \Rclass{flowPlates} are based on the collection of functions in \Rpackage{flowViz}, which is itself
based on \Rpackage{lattice}. Currently on \Rfunction{xyplot} works directly on a \Rclass{flowPlate} object, but the
other functions like \Rfunction{levelplot} (shown in Figure \ref{fig:bgc1}) can be used by accessing the \Rclass{flowSet} inside a \Rclass{flowPlate}
via \Rfunction{plateSet}. \Rfunction{xyplot} is the primary tool for making dotplots, showing gates, and creating overlay
plots. 

If the negative control gates have been created using \Rfunction{setControlGates}, then they can be shown using \Rfunction{xyplot}.
The code used to generate Figure ~\ref{fig:isoGate1} is shown below.
<<eval=FALSE>>=
wells <- unique(subset(pbmcFPbgc@wellAnnotation,Negative.Control=="A03",
  select="name")[,1])
xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id), 
  transform("FL1-H"=log10) %on% pbmcFPbgc[wells], 
  displayFilter=TRUE,smooth=FALSE, col=c("red","blue"),
  filter="Negative.Control")
@
The wells of interest are identified by looking for all the wells that have A03 are their negative control. When used for plotting
\Rclass{flowSets}, the filter argument takes a \Rpackage{flowCore} \Rclass{filter}. For \Rclass{flowPlates} the filter can also
accept a character string indicating what type of gate to display.  \Rpackage{plateCore} currently supports "Negative.Control"
gates, and more options will be available in the future.  

In the plots shown in Figure ~\ref{fig:isoGate1}, a number of the test samples are heterogenous in expression for the markers
in the FL1.H channel. For these samples it is clear that the gate created using \Rfunction{setControlGates} looks reasonable.
In other situations where the difference between positive and negative cells is not as distinct, making overlay dotplots
showing both the test and negative control samples on the same graph can help to determine the appropriate gate.
An example of the code use to generate the overlay plots in Figure ~\ref{fig:isoGate2}is shown below.
<<eval=FALSE>>=
wells <- unique(subset(pbmcFPbgc@wellAnnotation,Negative.Control=="A06",
  select="name")[,1])
xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id), 
  transform("FL1-H"=log10) %on% pbmcFPbgc[wells], 
  displayFilter=TRUE,smooth=FALSE, col=c("blue","green"),
  filter="Negative.Control",filterResults="Negative.Control")
@
In order for the gates to display correctly, any transformations must be applied directly on the \Rclass{flowPlate}. This ensures
that the Negative.Control gates, along with the fluorescence signal data, are on the same scale. If the transformation is in the 
formula, as in the following example, the the gates will not show up.
<<eval=FALSE>>=
xyplot(log10(`FL1-H`) ~ `FSC-H` | as.factor(Well.Id), pbmcFPbgc[wells], 
displayFilter=TRUE,smooth=FALSE, col=c("blue","green"),
filter="Negative.Control",filterResults="Negative.Control")
@

\begin{figure}
\begin{center}
<<label=isoGate2,fig=TRUE,echo=FALSE>>=

wells <- unique(subset(pbmcFPbgc@wellAnnotation,Negative.Control=="A06",select="name")[,1])

print(xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id), transform("FL1-H"=log10) %on% pbmcFPbgc[wells], displayFilter=TRUE,smooth=FALSE, col=c("blue","green"),
filter="Negative.Control",filterResults="Negative.Control"))

@
\end{center}
\caption{Overlay dotplots for the test wells associated with the negative control well A06. The cells from A06 (green) are shown in
each dotplot along with the results from the test cells (blue). In cases where the test well population is close to the positive-negative 
threshold, such as well D08, these plots can help determine if the Negative.Control gates should be adjusted.}
\label{fig:isoGate2}
\end{figure}


\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Extracting Results}
Once the Negative.Control gates have been created and applied, we can then use the \Rfunction{summaryStats} to calculate different
metrics of interest from the \Rclass{flowPlate}. Running \Rfunction{summaryStats} on the \Robject{pbmcFPbgc}, 
<<echo=TRUE>>=
pbmcFPbgc <- summaryStats(pbmcFPbgc)
@
will result in additional columns created in the associated \Robject{wellAnnotation} \Rclass{data.frame}.  
<<echo=TRUE>>=
colnames(pbmcFPbgc@wellAnnotation)
@
These new columns include percentage of cells above the Negative.Control gate (Percent.Positive),
the number of cells in the raw data (Total.Count), the number of positive cells (Positive Count),
the median fluorescence intensity (MFI), and the ratio of the test well MFI to the MFI of the negative
control well (MFI.Ratio). The Predict.PP and Gate.Score columns are explained in Section ~\ref{section:Quality}.
In this PMBC example a number of the samples have multiple cell populations, so the MFI and MFI.Ratio
may not be helpful since they are based on all the cells in a well, and not just the positive cells.

Now that we have this information we can display it using \Rfunction{xyplot}. We can add the marker names and percent
positive results to our  dotplots (Figure ~\ref{fig:isoGate3}). The \Rfunction{getGroups} retrieves a list
of wells associated with each negative control for a particular channel.  In this case, the 3rd element
in the controlGroups list corresponds to negative control well A03.
<<eval=FALSE,echo=TRUE>>=
controlGroups <- getGroups(pbmcFPbgc,chan="FL1-H")

print(xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id),
	transform("FL1-H"=log10) %on% pbmcFPbgc[controlGroups[[3]]],
	displayFilter=TRUE,
	smooth=FALSE,
	col=c("red","blue"),
	filter="Negative.Control",
	flowStrip=c("Well.Id","Ab.Name","Percent.Positive")))
@


The \Robject{wellAnnotation} \Rclass{data.frame} can be exported as a comma delimited for a record of the analysis.
<<eval=FALSE>>=
write.csv(pbmcFPbgc@wellAnnotation,file="PMBC.001.csv")
@

\clearpage
\begin{figure}
\begin{center}
<<label=isoGate3,fig=TRUE,echo=FALSE>>=
controlGroups <- getGroups(pbmcFPbgc,chan="FL1-H")

print(xyplot(`FL1-H` ~ `FSC-H` | as.factor(Well.Id),
				transform("FL1-H"=log10) %on% pbmcFPbgc[controlGroups[[3]]],
				displayFilter=TRUE,
				smooth=FALSE,
				col=c("red","blue"),
				filter="Negative.Control",
				flowStrip=c("Well.Id","Ab.Name","Percent.Positive")))
@
\end{center}
\caption{Dotplots and Negative.Control gates for wells associated with control well A03. Negative cells are shown in read and positive
cells in blue. The strip above each plot now contains the Antibody-Marker name, along with the percentage of positive cells in the well.}
\label{fig:isoGate3}
\end{figure}





%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\section{Quality Assessment}
\label{section:Quality}

Quality assessment using Bioconductor flow tools is covered in detail in
the flowQ publication (Le Meur et al. 2007, Cytometry Part A). This example briefly covers
how to check the number of lymphocyte events in each well, scan for fluidic
events, and check the consistency of the isotype gating.

The number of gated lymphocytes in each well of the pbmcFPbgc dataset can be found in the
Total.Count column of the wellAnnotation. 
<<echo=TRUE>>=
summary(wellAnnotation(pbmcFPbgc)$Total.Count)
@
Since each well has at least 858 gated lymphocyte events there are no apparent
problems with the sample acquisition on the plate.

Fluidic events (bubbles, etc.) can cause the cytometer readings to shift. These
type of events can often be identified by plotting FSC vs Time, or through ecdf
plots (as described in Le Meur et al. 2007, Cytometry Part A). An example of an ecdf plot
for FSC looking for a row effect, since the samples are acquired by row, is shown
in Figure ~\ref{fig:ecdf}.  The code used to produce the figure is shown below.
<<eval=FALSE>>=
print(flowViz::ecdfplot(~`FSC-H` | as.factor(Column.Id),
 data=plateSet(pbmcFPbgc), groups=Row.Id, auto.key=TRUE))
@

One approach to evaluating the consistency of isotype-based gating is to plot
the MFI ratio versus the percentage of positive cells. The MFI ratio is the
ratio of the median fluorescence intensity of the test sample to its corresponding
isotype control. If two sample have approximately the same percentage of positive
cells, then their MFI ratios should be similar. The \Rfunction{summaryStats} function
in \Rpackage{plateCore} performs a robust logistic regression on the MFI ratio to the 
percentage of positive cells, and results from the fit are stored in the Predict.PP
and Gate.Score columns of wellAnnotation. Predict.PP gives the estimated percentage
of positive cells based on the MFI ratio, and Gate.Score indicates how many 
standard residuals a sample data point is from the best fit line. Figure ~\ref{fig:mfiRatio}
shows the MFI ratio versus percent positive plot for the PBMC lymphocyte 
example. The code used to generate the plot is shown below. The \Rfunction{glmrob}
function from the \Rpackage{robustbase} package is used to perform the regression.

<<eval=FALSE>>=
mfiPlot(fp,thresh=2,xlab="MFI Ratio (Test MFI / Isotype MFI)",xlim=c(0.1,250),
		ylab="Percentage of cells above the isotype gate",pch=23)
@


\clearpage
\begin{figure}
\begin{center}
<<label=ecdf,fig=TRUE,echo=FALSE>>=
print(flowViz::ecdfplot(~`FSC-H` | as.factor(Column.Id),
				data=plateSet(pbmcFPbgc), groups=Row.Id, auto.key=TRUE))
@
\end{center}
\caption{Empirical Cumulative Distribution Function (ecdf) plot for FSC from the
pmbc lymphocyte example plate. }
\label{fig:ecdf}
\end{figure}


\clearpage
\begin{figure}
\begin{center}
<<label=mfiRatio,fig=TRUE,echo=FALSE>>=
mfiPlot(pbmcFPbgc,thresh=2,xlab="MFI Ratio (Test MFI / Isotype MFI)",xlim=c(0.1,250),
		ylab="Percentage of cells above the isotype gate",pch=23)
@
\end{center}
\caption{Plot of the MFI ratio versus the percentage of positive cells for the 
example PBMC lymphocytes. The robust best fit line is shown in red, and samples
that are more than 2 standard residuals away from the line are shown in red.}
\label{fig:mfiRatio}
\end{figure}

\clearpage
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Example: PBMC Large Panel Study}
\label{section:PBMC}

The PBMC dataset used in this example is available for download from ficcs.org
as the \href{http://www.ficcs.org/data/plateData.tar.gz}{plateData.tar.gz} file. The data consists
of 5 different PBMC samples that were analyzed with 189 different antibodies on 96-well plates. Each plate has a set of
unstained, isotype, and control wells. Antibodies and isotype controls are arrayed 3 per well, but the data was compensated
on the cytometer so there is no need to correct for spillover. This example assumes
that the plates have been unpacked and stored in the folder "data/pbmc". Each 
of the 5 subfolders in the PBMC directory contains 96 fcs files. The text delimited
file describing the layout of the plate is named pmbcPlateLayout.csv.

Processing all 5 plates is memory intensive, so each plate was loaded separately
and the resulting \Robject{flowPlate} was stored in an R data image in the pbmcRData folder. The code used 
to process plate 8774 is shown below. Settings for the remaining 4 plates were identical to 
plate 8774.
<<eval=FALSE>>=
plateName <- "lymph08774"

plateDescription <- read.delim("pmbcPlateLayout.csv",
	as.is=TRUE,header=TRUE,stringsAsFactors=FALSE)

platePBMCraw <- flowPlate(data=read.flowSet(path="data/pbmc/08774"),
	plateDescription,plateName=plateName)

platePBMC <- Subset(platePBMCraw,
	rectangleGate("FSC-H"=c(300,700),"SSC-H"=c(50,400)) & 
	norm2Filter("SSC-H","FSC-H",scale.factor=1.5))

platePBMC <- setControlGates(platePBMC,gateType="Negative.Control")

platePBMC <- applyControlGates(platePBMC)

platePBMC <- summaryStats(platePBMC)

save.image(file=paste("pbmcRData/",plateName,".Rdata",sep=""))
@

Once the 5 plates have been processed, the \Robject{flowPlates} can then be combined
using \Rfunction{fpbind}. In this case the plates were saved using the same name,
platePBMC. The \Robject{flowPlates} were read into a list, and then combined into
a large "virtPlate". 
<<eval=FALSE>>=
fileNames <- list.files("pbmcRData",full.names=TRUE)
plates <- lapply(fileNames,function(x){
			load(x)
			platePBMC
		})
virtPlate <- fpbind(plates[[1]],plates[[2]],plates[[3]],plates[[4]],plates[[5]])
@

Once the plates have been combined, xyplots and densityplots can then be conditioned
on "plateName" to create graphics like Figure ~\ref{fig:expDens}. The R code used to generate
Figure ~\ref{fig:expDens} is shown below.
<<eval=FALSE>>=
print(densityplot(~ `FL2-H` | as.factor(plateName),
	transform("FL2-H"=log10) %on% virtPlate[c("C02","C03","A05")],
	layout=c(3,2),xlim=c(-0.2,2.5),
	filterResult="Negative.Control",lty=c(1,2,3,4),
	col=c('blue','black','red')))
@


\clearpage
\begin{figure}
\centering
\includegraphics{expDens.pdf}
\caption{Density plot for two test antibodies (red,black) and their associated 
isotype control (blue). The isotype gate is indicated with a vertical black 
bar.}
\label{fig:expDens}
\end{figure}


\end{document}