%\VignetteIndexEntry{SurvComp: a package for performance assessment and comparison for survival analysis} %\VignetteDepends{survival, bootstrap, grid, rmeta, SuppDists, prodlim, ipred, survivalROC} %\VignetteSuggests{Hmisc, CPE, clinfun, Biobase, xtable} %\VignetteKeywords{Breast Cancer, Survival Analysis, GeneExpression, DifferentialExpression} %\VignettePackage{survcomp} %documentclass[12pt, a4paper]{article} \documentclass[12pt]{article} \usepackage{helvet} \renewcommand{\familydefault}{\sfdefault} \usepackage{amsmath} \usepackage{hyperref} \usepackage[numbers]{natbib} \usepackage[american]{babel} \usepackage{authblk} \usepackage{graphicx} \usepackage{epstopdf} %\renewcommand\Authfont{\scshape %\renewcommand\Affilfont{\itshape\small} \usepackage{Sweave} \renewcommand{\topfraction}{0.85} \renewcommand{\textfraction}{0.1} %\usepackage{tikz} \textwidth=6.2in \textheight=8.5in %\parskip=.3cm \oddsidemargin=.1in \evensidemargin=.1in \headheight=-.3in %------------------------------------------------------------ % newcommand %------------------------------------------------------------ \newcommand{\scscst}{\scriptscriptstyle} \newcommand{\scst}{\scriptstyle} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Rclass}[1]{\textit{#1}} \newcommand{\Rpackage}[1]{\textit{#1}} \newcommand{\Rexpression}[1]{\texttt{#1}} \newcommand{\Rmethod}[1]{{\texttt{#1}}} \newcommand{\Rfunarg}[1]{{\texttt{#1}}} \begin{document} %------------------------------------------------------------ \title{\vspace{-2cm}\Rpackage{survcomp}: a package for performance assessment and comparison for survival analysis} %------------------------------------------------------------ \author[1]{Benjamin Haibe-Kains} \author[2]{Markus Schr\"{o}der} \author[3]{Catharina Olsen} \author[4]{Christos Sotiriou} \author[3]{Gianluca Bontempi} \author[5,6]{John Quackenbush} \affil[1]{Bioinformatics and Computational Genomics Laboratory, Princess Margaret Cancer Center, University Health Network, Toronto, Ontario, Canada} \affil[2]{UCD School of Biomolecular and Biomedical Science, Conway Institute, University College Dublin, Belfield, Dublin, Ireland} \affil[3]{Machine Learning Group, Universit\'{e} Libre de Bruxelles} \affil[4]{Breast Cancer Translational Research Laboratory, Institut Jules Bordet, Universit\'{e} Libre de Bruxelles} \affil[5]{Computational Biology and Functional Genomics Laboratory, Dana-Farber Cancer Institute, Harvard School of Public Health} \affil[6]{Center for Cancer Computational Biology, Dana-Farber Cancer Institute} \SweaveOpts{highlight=TRUE, tidy=TRUE, keep.space=TRUE, keep.blank.space=FALSE, keep.comment=TRUE} %<>= %library(pgfSweave) %setCacheDir("cache") %options(keep.source=TRUE) %@ \maketitle \tableofcontents %------------------------------------------------------------ \clearpage \section{Introduction} %------------------------------------------------------------ The \Rpackage{SurvComp} package is providing functions to assess and to statistically compare the performance of risk prediction (survival) models. It includes (i) implementation of state-of-the-art statistics developed to measure the performance of risk prediction models and (ii) to combine these statistics estimated from multiple datasets using a meta-analytical framework, functions (iii) to visualize those measurements in a clear and compact way, and (iv) to statistically compare the performance of competitive models. %------------------------------------------------------------ \subsection{Installation} %------------------------------------------------------------ \Rpackage{SurvComp} requires that \Rpackage{survival}, \Rpackage{ipred}, \Rpackage{prodlim}, \Rpackage{survivalROC}, \Rpackage{SuppDists}, \Rpackage{bootstrap} and \Rpackage{R} (>= 2.3.0) are installed. These should be installed automatically when you install \Rpackage{SurvComp}. To install \Rpackage{SurvComp}, source biocLite from bioconductor: <>= source("http://bioconductor.org/biocLite.R") biocLite("survcomp") @ Load the \Rpackage{SurvComp}, into your current workspace: <>== library(survcomp) @ %------------------------------------------------------------ \subsection{Further help} %------------------------------------------------------------ To view the \Rpackage{SurvComp} description and a summary of all the functions within \Rpackage{SurvComp}, type the following: <>== library(help=survcomp) @ %------------------------------------------------------------ \subsection{Citing} %------------------------------------------------------------ We are delighted if you use this package. Please do email us if you find a bug or have a suggestion. We would be very grateful if you could cite: B. Haibe-Kains, C. Desmedt, C. Sotiriou and G. Bontempi (2008) A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all? \textit{Bioinformatics} \textbf{24(19):}2200-2208. %------------------------------------------------------------ \section{A use case: from expression data to survival analysis} %------------------------------------------------------------ We will very briefly demonstrate some of the functions in \Rpackage{SurvComp}. We use the \Robject{breastCancerData} datafile for demonstration purposes, it includes subsets of the datasets \Rpackage{breastCancerMAINZ}, \Rpackage{breastCancerTRANSBIG}, \Rpackage{breastCancerUPP}, \Rpackage{breastCancerUNT}, \Rpackage{breastCancerVDX} and \Rpackage{breastCancerNKI}, available as experimental datapackages on Bioconductor. The six datasets in \Robject{breastCancerData} contain the genes AURKA (also known as STK6, STK7, or STK15), PLAU (also known as uPA), STAT1, VEGF, CASP3, ESR1, and ERBB2, as introduced by Desmedt et al. 2008 \cite{Desmedt2008}. The seven genes represent the proliferation, tumor invasion/metastasis, immune response, angiogenesis, apoptosis phenotypes, and the ER and HER2 signaling, respectively. %------------------------------------------------------------ \subsection{Overview} %------------------------------------------------------------ To use the \Robject{ExpressionSet} object we have to load the \Rpackage{Biobase} package. We also make use of the package \Rpackage{xtable} in order to visualize some of the results as tables in this Vignette. <>== library(Biobase) library(xtable) library(rmeta) library(xtable) @ Loading the \Robject{breastCancerData} object will results in 6 new objects. If you execute \Rmethod{ls()} you will see \Robject{mainz7g},\Robject{transbig7g}, \Robject{upp7g}, \Robject{unt7g}, \Robject{vdx7g} and \Robject{nki7g}. More details about these datasets are available in the \Robject{breastCancerData} manpage (\Rmethod{?breastCancerData}). <>== data(breastCancerData) mainz7g @ Before we can start the analysis, we have to define the annotation for the mentioned seven genes, the datasets we use and a few help-variables. We define the gene symbol list (\Robject{gsList}), the entrez-gene ID list (\Robject{gidList}), the probe names for the Agilent microarray (\Robject{probesNKI}), the probe names for the Affymetrix microarray (\Robject{probesAffy}), a list containing the dataset names (\Robject{datasetList}), spaces for displaying the text in the forestplot at the right place (\Robject{myspace} and \Robject{mybigspace}) and \Robject{tc} for setting the censored time to 10 years. We converted the gene symbols for each gene to lowercase for better separation from the datasets. Table \ref{tab:variables} gives an overview of the gene annotation. <>== gsList <- tolower(fData(mainz7g)[,"Gene.symbol"]) gidList <- fData(mainz7g)[,"Gene.ID"] probesNKI <- as.character(fData(nki7g)[,"probe"]) probesAffy <- fData(mainz7g)[,"probe"] datasetList <- c("MAINZ","TRANSBIG","UPP","UNT","VDX","NKI","","Overall") myspace <- " " mybigspace <- " " tc <- 10 * 365 @ \begin{table}[!t] \begin{center} <>== overview <- as.data.frame(cbind("Gene Symbol"=gsList,"Gene ID"=gidList,"Probes Agilent"=probesNKI,"Probes Affy"=probesAffy)) print(xtable(overview), floating=FALSE) @ \caption{Overview of the annotation of the seven genes.} \label{tab:variables} \end{center} \end{table} %------------------------------------------------------------ \subsection{Computing concordance index, D index and hazard ratio} %------------------------------------------------------------ To compute the concordance index \cite{Harrel1996, Pencina2004} for each gene in each dataset, we have to call the \Rmethod{concordance.index()}) function for each dataset. See '\Rmethod{?concordance.index}' for details. The following command shows the computation of the concordance index for each gene in the \Robject{mainz7g} dataset. <>== cindexall.mainz.small <- t(apply(X=exprs(mainz7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(mainz7g)[ ,"t.dmfs"], z=pData(mainz7g)[ ,"e.dmfs"])) @ <>== cindexall.transbig.small <- t(apply(X=exprs(transbig7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(transbig7g)[ ,"t.dmfs"], z=pData(transbig7g)[ ,"e.dmfs"])) cindexall.vdx.small <- t(apply(X=exprs(vdx7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(vdx7g)[ ,"t.dmfs"], z=pData(vdx7g)[ ,"e.dmfs"])) cindexall.upp.small <- t(apply(X=exprs(upp7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(upp7g)[ ,"t.rfs"], z=pData(upp7g)[ ,"e.rfs"])) cindexall.unt.small <- t(apply(X=exprs(unt7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(unt7g)[ ,"t.dmfs"], z=pData(unt7g)[ ,"e.dmfs"])) cindexall.nki.small <- t(apply(X=exprs(nki7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(c("cindex"=tt$c.index, "cindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(nki7g)[ ,"t.dmfs"], z=pData(nki7g)[ ,"e.dmfs"])) @ To compute the D index \cite{Royston2004} for each gene in each dataset, we have to call the \Rmethod{D.index()}) function. See '\Rmethod{?D.index}' for details. The following command shows the computation of the D index for each gene in the \Robject{mainz7g} dataset. <>== dindexall.mainz.small <- t(apply(X=exprs(mainz7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(mainz7g)[ ,"t.dmfs"], z=pData(mainz7g)[ ,"e.dmfs"])) @ <>== dindexall.transbig.small <- t(apply(X=exprs(transbig7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(transbig7g)[ ,"t.dmfs"], z=pData(transbig7g)[ ,"e.dmfs"])) dindexall.upp.small <- t(apply(X=exprs(upp7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(upp7g)[ ,"t.rfs"], z=pData(upp7g)[ ,"e.rfs"])) dindexall.unt.small <- t(apply(X=exprs(unt7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(unt7g)[ ,"t.dmfs"], z=pData(unt7g)[ ,"e.dmfs"])) dindexall.vdx.small <- t(apply(X=exprs(vdx7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(vdx7g)[ ,"t.dmfs"], z=pData(vdx7g)[ ,"e.dmfs"])) dindexall.nki.small <- t(apply(X=exprs(nki7g), MARGIN=1, function(x, y, z) { tt <- D.index(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("dindex"=tt$d.index, "dindex.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(nki7g)[ ,"t.dmfs"], z=pData(nki7g)[ ,"e.dmfs"])) @ To compute the hazard ratio \cite{Cox1972} for each gene in each dataset, we have to call the \Rmethod{hazard.ratio()}) function. See \Rmethod{?hazard.ratio} for details. Before we compute the hazard ratio, we have to rescale the gene expression data for each dataset to a comparable scale, since the Affymetrix and Agilent microarrays have a different range of their gene expression, which would affect the hazard ratio computation. Table \ref{tab:ranges} gives an overview of the gene expression ranges in the six datasets that are included in \Robject{breastCancerData}. <>== tt <- list(mainz7g, transbig7g, upp7g, unt7g, vdx7g, nki7g) ttNames <- c("MAINZ","TRANSBIG","UPP","UNT","VDX","NKI") dataRange <- NULL for(i in tt) { rr <- range(exprs(i), na.rm=TRUE) dataRange$Min <- rbind(dataRange$Min, rr[1]) dataRange$Max <- rbind(dataRange$Max, rr[2]) } dataRange <- as.data.frame(dataRange) rownames(dataRange) <- ttNames @ \begin{table}[!t] \begin{center} <>== print(xtable(dataRange), floating=FALSE) @ \caption{Overview of the gene expression ranges in the six datasets.} \label{tab:ranges} \end{center} \end{table} Therefore we use the following function to rescale the gene expression values to lie approximately in [-1,1], robust to extreme values (possibly outliers). <>== rescale <- function(x, na.rm=FALSE, q=0.05) { ma <- quantile(x, probs=1-(q/2), na.rm=na.rm) mi <- quantile(x, probs=q/2, na.rm=na.rm) x <- (x - mi) / (ma - mi) return((x - 0.5) * 2) } @ The following command shows the rescaling and the computation of the hazard ratio for each gene in the \Robject{mainz7g} dataset. <>== hratio.mainz.small <- t(apply(X=rescale(exprs(mainz7g) , q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(mainz7g)[ ,"t.dmfs"], z=pData(mainz7g)[ ,"e.dmfs"])) @ <>== hratio.transbig.small <- t(apply(X=rescale(exprs(transbig7g), q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(transbig7g)[ ,"t.dmfs"], z=pData(transbig7g)[ ,"e.dmfs"])) hratio.upp.small <- t(apply(X=rescale(exprs(upp7g), q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(upp7g)[ ,"t.rfs"], z=pData(upp7g)[ ,"e.rfs"])) hratio.unt.small <- t(apply(X=rescale(exprs(unt7g), q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(unt7g)[ ,"t.dmfs"], z=pData(unt7g)[ ,"e.dmfs"])) hratio.vdx.small <- t(apply(X=rescale(exprs(vdx7g), q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(vdx7g)[ ,"t.dmfs"], z=pData(vdx7g)[ ,"e.dmfs"])) hratio.nki.small <- t(apply(X=rescale(exprs(nki7g), q=0.05, na.rm=TRUE), MARGIN=1, function(x, y, z) { tt <- hazard.ratio(x=x, surv.time=y, surv.event=z, na.rm=TRUE); return(c("hratio"=tt$hazard.ratio, "hratio.se"=tt$se, "lower"=tt$lower, "upper"=tt$upper)); }, y=pData(nki7g)[ ,"t.dmfs"], z=pData(nki7g)[ ,"e.dmfs"])) @ To get an overall estimate over all datasets for the concordance index from each gene, we iterate over all the concordance indices of all datasets and combine them with the \Rmethod{combine.est()} function \cite{Cochrane1954} and recalculate the lower- and upper border accordingly. We do that for the D indices and the hazard ratios in the same way. %------------------------------------------------------------ \subsection{Combining estimations across datasets} %------------------------------------------------------------ <>== tt <- as.data.frame(NULL) for(i in 1:7){ tt <- rbind( tt,combine.est( x=cbind( cindexall.mainz.small[i,"cindex"], cindexall.transbig.small[i,"cindex"], cindexall.upp.small[i,"cindex"], cindexall.unt.small[i,"cindex"], cindexall.vdx.small[i,"cindex"], cindexall.nki.small[i,"cindex"]), x.se=cbind (cindexall.mainz.small[i,"cindex.se"], cindexall.transbig.small[i,"cindex.se"], cindexall.upp.small[i,"cindex.se"], cindexall.unt.small[i,"cindex.se"], cindexall.vdx.small[i,"cindex.se"], cindexall.nki.small[i,"cindex.se"]), na.rm=TRUE) ) } tt$lower <- tt$estimate + qnorm(0.025, lower.tail=TRUE) * tt$se tt$upper <- tt$estimate + qnorm(0.025, lower.tail=FALSE) * tt$se rownames(tt) <- gsList colnames(tt) <- c("cindex","cindex.se","lower","upper") ccindex <- tt @ The combined concordance indices for the six datasets are shown in table \ref{tab:cindex}. \begin{table}[!t] \begin{center} <>== print(xtable(ccindex), floating=FALSE) @ \caption{Combined concordance indices of each gene for the six datasets.} \label{tab:cindex} \end{center} \end{table} The combined log2 D indices for the six datasets are shown in table \ref{tab:dindex}. \begin{table}[!t] \begin{center} <>== tt <- as.data.frame(NULL) for(i in 1:7){ tt <- rbind( tt,combine.est( x=cbind( dindexall.mainz.small[i,"dindex"], dindexall.transbig.small[i,"dindex"], dindexall.upp.small[i,"dindex"], dindexall.unt.small[i,"dindex"], dindexall.vdx.small[i,"dindex"], dindexall.nki.small[i,"dindex"]), x.se=cbind(dindexall.mainz.small[i,"dindex.se"], dindexall.transbig.small[i,"dindex.se"], dindexall.upp.small[i,"dindex.se"], dindexall.unt.small[i,"dindex.se"], dindexall.vdx.small[i,"dindex.se"], dindexall.nki.small[i,"dindex.se"]),na.rm=TRUE) ) } tt$lower <- tt$estimate + qnorm(0.025, lower.tail=TRUE) * tt$se tt$upper <- tt$estimate + qnorm(0.025, lower.tail=FALSE) * tt$se rownames(tt) <- gsList colnames(tt) <- c("dindex","dindex.se","lower","upper") cdindex <- tt print(xtable(log2(cdindex)), floating=FALSE) @ \caption{Combined log2 D indices of each gene for the six datasets.} \label{tab:dindex} \end{center} \end{table} The combined log2 hazard ratios for the six datasets are shown in table \ref{tab:hratio}. \begin{table}[!t] \begin{center} <>== tt <- as.data.frame(NULL) for(i in 1:7){ tt <- rbind( tt,combine.est( x=cbind( hratio.mainz.small[i,"hratio"], hratio.transbig.small[i,"hratio"], hratio.upp.small[i,"hratio"], hratio.unt.small[i,"hratio"], hratio.vdx.small[i,"hratio"], hratio.nki.small[i,"hratio"]), x.se=cbind(hratio.mainz.small[i,"hratio.se"], hratio.transbig.small[i,"hratio.se"], hratio.upp.small[i,"hratio.se"], hratio.unt.small[i,"hratio.se"], hratio.vdx.small[i,"hratio.se"], hratio.nki.small[i,"hratio.se"]),na.rm=TRUE) ) } tt$lower <- tt$estimate + qnorm(0.025, lower.tail=TRUE) * tt$se tt$upper <- tt$estimate + qnorm(0.025, lower.tail=FALSE) * tt$se rownames(tt) <- gsList colnames(tt) <- c("hratio","hratio.se","lower","upper") chratio <- tt print(xtable(log2(chratio)), floating=FALSE) @ \caption{Combined log2 hazard ratios of each gene for the six datasets.} \label{tab:hratio} \end{center} \end{table} %------------------------------------------------------------ \subsection{The forestplot.surv} %------------------------------------------------------------ To display the combined concordance indices of each genes over all datasets, we use the \Rmethod{forestplot.surv()} function \cite{Lewis2001}. The resulting forestplot for all concordance indices is: <>== labeltext <- cbind(c("Gene Symbol",gsList),c(rep(myspace,8))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,ccindex$cindex) r.lower <- c(NA,ccindex$lower) r.upper <- c(NA,ccindex$upper) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0.5, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(0.4,0.7,0.05), xlab=paste( "Concordance Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(0.4,1)) @ The resulting forestplot for all D indices is: <>== labeltext <- cbind(c("Gene Symbol",gsList),c(rep(myspace,8))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,log2(cdindex$dindex)) r.lower <- c(NA,log2(cdindex$lower)) r.upper <- c(NA,log2(cdindex$upper)) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(-0.5,1,0.1), xlab=paste("log2 D Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(-0.5,1.25)) @ The resulting forestplot for all hazard ratios is: <>== labeltext <- cbind(c("Gene Symbol",gsList),c(rep(mybigspace,8))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,log2(chratio$hratio)) r.lower <- c(NA,log2(chratio$lower)) r.upper <- c(NA,log2(chratio$upper)) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(-0.5,3.5,0.5), xlab=paste( "log2 Hazard Ratio", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(-0.75,3.5)) @ Taking a more specific look, e.g. at the genes AURKA and VEGF, we create the forestplot the same way as before, showing the concordance indices for both genes in each dataset and the combined estimation over all datasets. <>== tt <- rbind(cindexall.mainz.small[3,], cindexall.transbig.small[3,], cindexall.upp.small[3,], cindexall.unt.small[3,], cindexall.vdx.small[3,], cindexall.nki.small[3,], NA, as.numeric(ccindex[3,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList),c(rep(mybigspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,tt$cindex) r.lower <- c(NA,tt$lower) r.upper <- c(NA,tt$upper) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0.5, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(0.4,0.8,0.05), xlab=paste( "AURKA Concordance Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(0.5,1), is.summary=(c(rep(FALSE,8),TRUE))) @ <>== tt <- rbind(cindexall.mainz.small[5,], cindexall.transbig.small[5,], cindexall.upp.small[5,], cindexall.unt.small[5,], cindexall.vdx.small[5,], cindexall.nki.small[5,], NA, as.numeric(ccindex[5,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList),c(rep(mybigspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,tt$cindex) r.lower <- c(NA,tt$lower) r.upper <- c(NA,tt$upper) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0.5, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(0.4,0.75,0.05), xlab=paste( "VEGF Concordance Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(0.3,0.75), is.summary=(c(rep(FALSE,8),TRUE))) @ More advanced displaying of the genes AURKA and VEGF in a single forestplot with different colors and labels is possible: <>== tt <- rbind(cindexall.mainz.small[3,], cindexall.mainz.small[5,], NA, cindexall.transbig.small[3,], cindexall.transbig.small[5,], NA, cindexall.upp.small[3,], cindexall.upp.small[5,], NA, cindexall.unt.small[3,], cindexall.unt.small[5,], NA, cindexall.vdx.small[3,], cindexall.vdx.small[5,], NA, cindexall.nki.small[3,], cindexall.nki.small[5,], NA, as.numeric(ccindex[3,]), as.numeric(ccindex[5,])) rownames(tt) <- c("MAINZa", "MAINZv", "a", "TRANSBIGa", "TRANSBIGv", "b", "UPPa", "UPPv", "c", "UNTa", "UNTv", "d", "VDXa", "VDXv", "e", "NKIa", "NKIv", "f", "ALLa", "ALLv") tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset", "MAINZ", NA, NA, "TRANSBIG", NA, NA, "UPP", NA, NA, "UNT", NA, NA, "VDX", NA, NA, "NKI", NA, NA, "Overall", NA), c("Gene", rep(c("aurka","vegf",NA), length(datasetList)-2), c("aurka","vegf")), c(rep(mybigspace,21))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,tt$cindex) r.lower <- c(NA,tt$lower) r.upper <- c(NA,tt$upper) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0.5, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(0.4,0.8,0.05), xlab=paste( "AURKA and VEGF Concordance Index", myspace, sep=""), col=meta.colors(line=c(rep(c(NA, "darkblue", "seagreen"),7)), zero="firebrick", box=c(rep(c(NA," royalblue", "forestgreen"),7))), box.size=bs, clip=c(0.3,1), is.summary=(c(rep(FALSE,19), TRUE, TRUE))) @ We display the D indices for both genes in each dataset and the combined estimation over all datasets in the same way. <>== tt <- rbind(dindexall.mainz.small[3,], dindexall.transbig.small[3,], dindexall.upp.small[3,], dindexall.unt.small[3,], dindexall.vdx.small[3,], dindexall.nki.small[3,], NA, as.numeric(cdindex[3,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList),c(rep(mybigspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,log2(tt$dindex)) r.lower <- c(NA,log2(tt$lower)) r.upper <- c(NA,log2(tt$upper)) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(-0.5,2,0.5), xlab=paste("AURKA log2 D Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(-0.25,2), is.summary=(c(rep(FALSE,8), TRUE))) @ <>== tt <- rbind(dindexall.mainz.small[5,], dindexall.transbig.small[5,], dindexall.upp.small[5,], dindexall.unt.small[5,], dindexall.vdx.small[5,], dindexall.nki.small[5,], NA, as.numeric(cdindex[5,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList),c(rep(mybigspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,log2(tt$dindex)) r.lower <- c(NA,log2(tt$lower)) r.upper <- c(NA,log2(tt$upper)) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(-1.25,1.5,0.25), xlab=paste( "VEGF log2 D Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(-1.5,1.75), is.summary=(c(rep(FALSE,8), TRUE))) @ And at last the hazard ratio for the gene AURKA in each dataset and the combined estimation over all datasets. <>== tt <- rbind(hratio.mainz.small[3,], hratio.transbig.small[3,], hratio.upp.small[3,], hratio.unt.small[3,], hratio.vdx.small[3,], hratio.nki.small[3,], NA, as.numeric(chratio[3,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList),c(rep(myspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,log2(tt$hratio)) r.lower <- c(NA,log2(tt$lower)) r.upper <- c(NA,log2(tt$upper)) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0, align=c("l"), graphwidth=grid::unit(2, "inches"), x.ticks=seq(-0.5,3.5,0.5), xlab=paste( "AURKA log2 Hazard Ratio", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(-0.5,3.5),is.summary=(c(rep(FALSE,8), TRUE))) @ The following small loop shows an easy way for creating several forestplots showing the concordance indices for a single gene for all datasets and the combined estimation over all datasets. The same can be done for the D indices and hazard ratios. Since it is not yet possible to combine several forestplots in one figure (e.g. with \Rfunction{par(mfrow=c(2,2))}), we don't display the results of the following loop. <>== for(i in 1:length(gsList)) { tt <- rbind(cindexall.mainz.small[i,], cindexall.transbig.small[i,], cindexall.upp.small[i,], cindexall.unt.small[i,], cindexall.vdx.small[i,], cindexall.nki.small[i,], NA, as.numeric(ccindex[i,])) rownames(tt) <- datasetList tt <- as.data.frame(tt) labeltext <- cbind(c("Dataset",datasetList), c(rep(myspace,length(datasetList)+1))) bs <- rep(0.5, nrow(labeltext)) r.mean <- c(NA,tt$cindex) r.lower <- c(NA,tt$lower) r.upper <- c(NA,tt$upper) x.ticks.lower <- (floor((min(r.mean,na.rm=TRUE) - 0.1) * 10)/10) x.ticks.upper <- (floor((max(r.mean,na.rm=TRUE) + 0.2) * 10)/10) forestplot.surv(labeltext=labeltext, mean=r.mean, lower=r.lower, upper=r.upper, zero=0.5, align=c("l"), graphwidth= grid::unit(2, "inches"), x.ticks=seq(x.ticks.lower,x.ticks.upper,0.05), xlab=paste(gsList[i], " Concordance Index", myspace, sep=""), col=meta.colors(box="royalblue", line="darkblue", zero="darkred"), box.size=bs, clip=c(0.3,0.8),is.summary=(c(rep(FALSE,8), TRUE))) } @ %------------------------------------------------------------ \subsection{Kaplan Meier survival curves} %------------------------------------------------------------ To display a Kaplan Meier curve \cite{Kaplan1958} for all datasets you can use: <>== surv.data <- censor.time(surv.time=c(pData(mainz7g)[ ,"t.dmfs"], pData(transbig7g)[ ,"t.dmfs"], pData(unt7g)[ ,"t.dmfs"], pData(vdx7g)[ ,"t.dmfs"], pData(upp7g)[ ,"t.rfs"], pData(nki7g)[ ,"t.dmfs"]) / 365, surv.event=c(pData(mainz7g)[ ,"e.dmfs"], pData(transbig7g)[ ,"e.dmfs"], pData(unt7g)[ ,"e.dmfs"], pData(vdx7g)[ ,"e.dmfs"], pData(upp7g)[ ,"e.rfs"], pData(nki7g)[ ,"e.dmfs"]), time.cens=tc / 365) gg <- factor(c(rep("mainz", nrow(pData(mainz7g))), rep("transbig", nrow(pData(transbig7g))), rep("unt", nrow(pData(unt7g))), rep("vdx", nrow(pData(vdx7g))), rep("upp", nrow(pData(upp7g))), rep("nki", nrow(pData(nki7g)))), levels=c("mainz", "transbig", "unt", "vdx", "upp", "nki")) dd <- data.frame("time"=surv.data[[1]], "event"=surv.data[[2]], "group"=gg) km.coxph.plot(formula.s=formula(Surv(time, event) ~ group), data.s=dd, sub.s="all", x.label="Time (years)", y.label = "Probability of DMFS/RFS", main.title="", sub.title="", leg.pos="bottomright", leg.inset=0.05, o.text=FALSE, v.line=FALSE, h.line=FALSE, .lty=rep(1, length(levels(gg))), show.n.risk=TRUE, n.risk.step=1, n.risk.cex=0.85, .col=c("darkorange", "red", "darkblue", "darkgreen", "black", "brown"), leg.text=paste(levels(gg), myspace, sep=""), verbose=FALSE, ylim=c(0.1,1)) @ If you want do display the survival curve for a single gene using the data of all six datasets, we have to concatenate the survival and expression data of all datasets (see \Robject{surv.time.all}, \Robject{surv.event.all} and \Robject{aurka.exprs} below). After that we split the patients in each dataset into three parts acording to their gene expression. We use the function \Rfunction{quantile()} for that. In the end we have three groups, representing the low gene expression group (lowest 33\% of the gene expression), intermediate gene expression group (gene expression between 33\% and 66\%) and high gene expression group (over 66\%). <>== aurkaGs <- "AURKA" aurkaGid <- 6790 aurkaPaf <- "208079_s_at" aurkaPagi <- "NM_003600" surv.time.all <- c(pData(mainz7g)[ ,"t.dmfs"], pData(transbig7g)[ ,"t.dmfs"], pData(unt7g)[ ,"t.dmfs"], pData(upp7g)[ ,"t.rfs"], pData(vdx7g)[ ,"t.dmfs"], pData(nki7g)[ ,"t.dmfs"]) surv.event.all <- c(pData(mainz7g)[ ,"e.dmfs"], pData(transbig7g)[ ,"e.dmfs"], pData(unt7g)[ ,"e.dmfs"], pData(upp7g)[ ,"e.rfs"], pData(vdx7g)[ ,"e.dmfs"], pData(nki7g)[ ,"e.dmfs"]) aurka.exprs <- c(exprs(mainz7g)[aurkaPaf,], exprs(transbig7g)[aurkaPaf,], exprs(unt7g)[aurkaPaf,], exprs(upp7g)[aurkaPaf,], exprs(vdx7g)[aurkaPaf,], exprs(nki7g)[aurkaPagi,]) aurka.exprs.length <- c(length( exprs(mainz7g)[aurkaPaf,] ), length( exprs(transbig7g)[aurkaPaf,] ), length( exprs(unt7g)[aurkaPaf,] ), length( exprs(upp7g)[aurkaPaf,] ), length( exprs(vdx7g)[aurkaPaf,] ), length( exprs(nki7g)[aurkaPagi,] )) pos <- 0 mygroup <- NULL for(i in aurka.exprs.length){ qq <- aurka.exprs[(pos+1):(pos+i)] myq <- quantile(qq, probs=c(0.33, 0.66), na.rm=TRUE) qq[aurka.exprs[(pos+1):(pos+i)] < myq[1]] <- 1 qq[aurka.exprs[(pos+1):(pos+i)] >= myq[1] & aurka.exprs[(pos+1):(pos+i)] < myq[2]] <- 2 qq[aurka.exprs[(pos+1):(pos+i)] > myq[2]] <- 3 qq <- factor(x=qq, levels=1:3) mygroup <- c(mygroup,qq) pos <- pos + i } surv.data <- censor.time(surv.time=surv.time.all / 365, surv.event=surv.event.all, time.cens=tc / 365) dd <- data.frame("time"=surv.data[[1]], "event"=surv.data[[2]], "gg"=mygroup) gg <- factor(c(rep("mainz", nrow(pData(mainz7g))), rep("transbig", nrow(pData(transbig7g))), rep("unt", nrow(pData(unt7g))), rep("upp", nrow(pData(upp7g))), rep("vdx", nrow(pData(vdx7g))), rep("nki", nrow(pData(nki7g)))), levels=c("mainz", "transbig", "unt", "upp", "vdx", "nki")) km.coxph.plot(formula.s=formula(Surv(time, event) ~ gg), data.s=dd, sub.s="all", x.label="Time (years)", y.label="Probability of DMFS/RFS", main.title="", sub.title="", leg.text=c("Low ", "Intermediate ", "High "), leg.pos="bottomright", leg.inset=0.05, o.text=FALSE, v.line=FALSE, h.line=FALSE, .col=c("darkblue", "darkgreen", "darkred"), .lty=1, show.n.risk=TRUE, n.risk.step=1, n.risk.cex=0.85, verbose=FALSE, ylim=c(0.3,1)) @ %------------------------------------------------------------ \subsection{Meta analysis of estimation values} %------------------------------------------------------------ The \Rpackage{SurvComp} package integrates functions for meta-analysis of risk-prediction models, e.g. for the concordance index or the D index. The following example shows the \Rfunction{cindex.comp.meta()} function \cite{Haibe-Kains2008}. Table \ref{tab:cindexCompMeta} shows the p-values representing the difference between the cindices of two genes using the cindices of all six datasets. For example, the cindex of the gene AURKA is with a p-value of 0.00001 significantly different from the cindex of the gene VEGF using the six datasets. <>== cindexMetaMainz <- t(apply(X=exprs(mainz7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(mainz7g)[ ,"t.dmfs"], z=pData(mainz7g)[ ,"e.dmfs"])) cindexMetaTransbig <- t(apply(X=exprs(transbig7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(transbig7g)[ ,"t.dmfs"], z=pData(transbig7g)[ ,"e.dmfs"])) cindexMetaUpp <- t(apply(X=exprs(upp7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(upp7g)[ ,"t.rfs"], z=pData(upp7g)[ ,"e.rfs"])) cindexMetaUnt <- t(apply(X=exprs(unt7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(unt7g)[ ,"t.dmfs"], z=pData(unt7g)[ ,"e.dmfs"])) cindexMetaVdx <- t(apply(X=exprs(vdx7g), MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(vdx7g)[ ,"t.dmfs"], z=pData(vdx7g)[ ,"e.dmfs"])) ccNki <- complete.cases(exprs(nki7g)[1,], exprs(nki7g)[2,], exprs(nki7g)[3,], exprs(nki7g)[4,], exprs(nki7g)[5,], exprs(nki7g)[6,], exprs(nki7g)[7,], pData(nki7g)[,"e.dmfs"], pData(nki7g)[,"e.dmfs"]) cindexMetaNki <- t(apply(X=exprs(nki7g)[,ccNki], MARGIN=1, function(x, y, z) { tt <- concordance.index(x=x, surv.time=y, surv.event=z, method="noether", na.rm=TRUE); return(tt); }, y=pData(nki7g)[ccNki ,"t.dmfs"], z=pData(nki7g)[ccNki ,"e.dmfs"])) ccmData <- tt <- rr <- NULL for(i in 1:7){ tt <- NULL listOne <- list("mainz" = cindexMetaMainz[[i]], "transbig" = cindexMetaTransbig[[i]], "upp" = cindexMetaUpp[[i]], "unt" = cindexMetaUnt[[i]], "vdx" = cindexMetaVdx[[i]], "nki" = cindexMetaNki[[i]]) for(j in 1:7){ listTwo <- list("mainz" = cindexMetaMainz[[j]], "transbig" = cindexMetaTransbig[[j]], "upp" = cindexMetaUpp[[j]], "unt" = cindexMetaUnt[[j]], "vdx" = cindexMetaVdx[[j]], "nki" = cindexMetaNki[[j]]) rr <- cindex.comp.meta(list.cindex1=listOne, list.cindex2=listTwo) tt <- cbind(tt, rr$p.value) ##list(round(rr$p.value,5))) } ccmData <- rbind(ccmData, tt) } ccmData <- as.data.frame(ccmData) colnames(ccmData) <- gsList rownames(ccmData) <- gsList @ \begin{table}[!t] \begin{center} <>== print(xtable(ccmData, digits=5), floating=FALSE) @ \caption{\Rfunction{cindex.comp.meta()} results showing the significance of the difference between concordance indices.} \label{tab:cindexCompMeta} \end{center} \end{table} %\begin{figure}[htbp] %\begin{center} % %\caption{ } %\label{fig:1} %\end{center} %\end{figure} \newpage %------------------------------------------------------------ \section{Session Info} %------------------------------------------------------------ <>== toLatex(sessionInfo()) @ \newpage %------------------------------------------------------------ \section{Functions within \Rpackage{SurvComp}} %------------------------------------------------------------ For references to the following functions, please see \cite{Harrel1996}-\cite{Houwelingen2006}. \begin{table}[h] \begin{tabular}{p{3cm} p{13cm}} \noindent \textsc{Function} & \noindent \textsc{Description}\\ D.index & Function to compute the D index\\ breastCancerData & Sample data containing six datasets for gene expression, annotations and clinical data\\ censor.time & Function to artificially censor survival data\\ cindex.comp & Function to compare two concordance indices\\ cindex.comp.meta & Function to compare two concordance indices\\ combine.est & Function to combine estimates\\ combine.test & Function to combine probabilities\\ concordance.index & Function to compute the cindex for survival or binary class prediction\\ cvpl & Function to compute the CVPL\\ dindex.comp & Function to compare two D indices\\ dindex.comp.meta & Function to compare two D indices\\ fisherz & Function to compute Fisher z transformation\\ forestplot.surv & Forest plots enables to display performance estimates of survival models\\ getsurv2 & Function to retrieve the survival probabilities at a specific point in time\\ hazard.ratio & Function to estimate the hazard ratio through Cox regression\\ hr.comp & Function to statistically compare two hazard ratios\\ hr.comp.meta & Function to compare two concordance indices\\ hr.comp2 & Function to statistically compare two hazard ratios (alternative interface)\\ iauc.comp & Function to compare two IAUCs through time-dependent ROC curves\\ ibsc.comp & Function to compare two IBSCs\\ km.coxph.plot & Function to plot several Kaplan-Meier survival curves\\ logpl & Function to compute the log partial likelihood of a Cox model\\ metaplot.surv & Meta plots enables to display performance estimates of survival models\\ no.at.risk & Function to compute the number of individuals at risk\\ sbrier.score2proba & Function to compute the BSCs from a risk score, for all the times of event occurrence\\ score2proba & Function to compute the survival probabilities from a risk score\\ survcomp-package & Performance Assessment and Comparison for Survival Analysis\\ td.sens.spec & Function to compute sensitivity and specificity for a binary classification of survival data\\ tdrocc & Function to compute time-dependent ROC curves\\ test.hetero.est & Function to test the heterogeneity of set of probabilities\\ test.hetero.test & Function to test the heterogeneity of set of probabilities\\ \end{tabular} \end{table} \newpage %------------------------------------------------------------ % BIBLIO %------------------------------------------------------------ \begin{thebibliography}{10} \bibitem{Desmedt2008} Desmedt, C., Haibe-Kains, B., Wirapati, P., Buyse, M., Larsimont, D., Bontempi, G., Delorenzi, M., Piccart, M. and Sotiriou, C.: \newblock Biological Processes Associated with Breast Cancer Clinical Outcome Depend on the Molecular Subtypes. \newblock \textit{Clinical Cancer Research}, \textbf{16}, 5158-5165. 2008 \bibitem{Harrel1996} Harrel Jr, F. E. and Lee, K. L. and Mark, D. B.: \newblock Tutorial in biostatistics: multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. \newblock \textit{Statistics in Medicine}, \textbf{15}, 361-387. 1996. \bibitem{Pencina2004} Pencina, M. J. and D'Agostino, R. B.: \newblock Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. \newblock \textit{Statistics in Medicine}, \textbf{23}, 2109-2123. 2004. \bibitem{Royston2004} Royston, P. and Sauerbrei, W.: \newblock A new measure of prognostic separation in survival data. \newblock \textit{Statistics in Medicine}, \textbf{23}, 723-748. 2004. \bibitem{Cox1972} Cox, D. R.: \newblock Regression Models and Life Tables. \newblock \textit{Journal of the Royal Statistical Society Series B}, \textbf{34}, 187-220. 1972. \bibitem{Cochrane1954} Cochrane, W. G.: \newblock The combination of estimates from different experiments. \newblock \textit{Biometrics}, \textbf{10}, 101-129. 1954. \bibitem{Lewis2001} Lewis, Steff, and Mike Clarke: \newblock Forest plots: trying to see the wood and the trees. \newblock \textit{BMJ}, \textbf{322}, 1479-1480. 2001 \bibitem{Kaplan1958} Kaplan, E. L., and Paul Meier: \newblock Nonparametric Estimation from Incomplete Observations. \newblock \textit{Journal of the American Statistical Association}, \textbf{53}, 457-481. 1958 \bibitem{Haibe-Kains2008} Haibe-Kains, B. and Desmedt, C. and Sotiriou, C. and Bontempi, G.: \newblock A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all? \newblock \textit{Bioinformatics}, \textbf{24}:19, 2200-2208. 2008. \bibitem{Whitlock2005} Whitlock, M. C.: \newblock Combining probability from independent tests: the weighted Z-method is superior to Fisher's approach. \newblock \textit{J. Evol. Biol.}, \textbf{18}, 1368-1373. 2005. \bibitem{Heagerty2000} Heagerty, P. J. and Lumley, T. L. and Pepe, M. S.: \newblock Time-Dependent ROC Curves for Censored Survival Data and a Diagnostic Marker. \newblock \textit{Biometrics}, \textbf{56}, 337-344. 2000. \bibitem{Efron1986} Efron, B. and Tibshirani, R.: \newblock The Bootstrap Method for standard errors, confidence intervals, and other measures of statistical accuracy. \newblock \textit{Statistical Science}, \textbf{1}, 1-35. 1986. \bibitem{Becker1988} Becker, R. A., Chambers, J. M. and Wilks, A. R.: \newblock The New S Language. \newblock \textit{Wadsworth \& Brooks/Cole}, 1988. \bibitem{Andersen1993} Andersen, P. K. and Borgan, O. and Gill, R. D. and Keiding, N.: \newblock Statistical Models Based on Counting Processes \newblock \textit{Springer}, 1993. \bibitem{Brier1950} Brier, G. W.: \newblock Verification of forecasts expressed in terms of probabilities. \newblock \textit{Monthly Weather Review}, \textbf{78}, 1-3. 1950. \bibitem{Graf1999} Graf, E. and Schmoor, C. and Sauerbrei, W. and Schumacher, M.: \newblock Assessment and comparison of prognostic classification schemes for survival data. \newblock \textit{Statistics in Medicine}, \textbf{18}, 2529-2545. 1999. \bibitem{Wilcoxon1945} Wilcoxon, F.: \newblock Individual comparisons by ranking methods. \newblock \textit{Biometrics Bulletin}, \textbf{1}, 80-83. 1945. \bibitem{Student1908} Student: \newblock The Probable Error of a Mean. \newblock \textit{Biometrika}, \textbf{6}, 1-25. 1908. \bibitem{Fisher1915} R. A. Fisher: \newblock Frequency distribution of the values of the correlation coefficient in samples of an indefinitely large population. \newblock \textit{Biometrika}, \textbf{10}, 507-521. 1915. \bibitem{Verweij1993} Verweij PJM. and van Houwelingen H.: \newblock Cross-validation in survival analysis. \newblock \textit{Statistics in Medicine}, \textbf{12}, 2305-2314. 1993. \bibitem{Houwelingen2006} van Houwelingen H, Bruinsma T, Hart AA, van't Veer LJ and Wessels LFA: \newblock Cross-validated Cox regression on microarray gene expression data. \newblock \textit{tatistics in Medicine}, \textbf{25}, 3201-3216. 2006. %\bibitem{} % %\newblock %\newblock \textit{}, \textbf{}, . . \end{thebibliography} \end{document}