%\VignetteIndexEntry{tigre Quick Guide} %\VignetteKeywords{TimeCourse, GeneExpression, Transcription} %\VignettePackage{tigre} \documentclass[a4paper]{article} \usepackage{url} \usepackage[authoryear,round]{natbib} \title{tigre Quick Guide} \author{Antti Honkela, Pei Gao, Jonatan Ropponen, \\ Magnus Rattray, and Neil D. Lawrence} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\tigre}{\Rpackage{tigre}} \begin{document} \maketitle \SweaveOpts{keep.source=TRUE} \section{Abstract} The \tigre{} package implements our methodology of Gaussian process differential equation models for analysis of gene expression time series from single input motif networks. The package can be used for inferring unobserved transcription factor (TF) protein concentrations from expression measurements of known target genes, or for ranking candidate targets of a TF. The purpose of this quick guide is to present a small subset of examples from the User Guide that can be run very quickly. For a more comprehensive (although slower-running) presentation, please refer to the \tigre{} User Guide. \section{Citing \tigre{}} The \tigre{} package is based on a body of methodological research. Citing \tigre{} in publications will usually involve citing one or more of the methodology papers \citep{Honkela2010PNAS,Gao2008,Lawrence2007} that the software is based on as well as citing the software package itself \citep{Honkela2011}. <>= options(width = 60) @ \section{Introductory example analysis - Drosophila development} \label{section:Introductory example} In this section we introduce the main functions of the \Rpackage{puma} package by repeating some of the analysis from the PNAS paper~\citep{Honkela2010PNAS}\footnote{Note that the results reported in the paper were run using an earlier version of this package for MATLAB, so there can be minor differences.}. \subsection{Installing the \tigre{} package} The recommended way to install \tigre{} is to use the \Rfunction{biocLite} function available from the bioconductor website. Installing in this way should ensure that all appropriate dependencies are met. << eval=FALSE >>== source("http://www.bioconductor.org/biocLite.R") biocLite("tigre") @ % To install the tigre software, unpack the software and run % \begin{verbatim} % R CMD INSTALL tigre % \end{verbatim} To load the package start R and run <<>>= library(tigre) @ \subsection{Loading the data} To get started, you need some preprocessed time series expression data. If the data originates from Affymetrix arrays, we highly recommend processing it with \Rfunction{mmgmos} from the \Rpackage{puma} package. This processing extracts error bars on the expression measurements directly from the array data to allow judging the reliability of individual measurements. This information is directly utilised by all the models in this package. To start from scratch on Affymetrix data, the .CEL files from \url{ftp://ftp.fruitfly.org/pub/embryo_tc_array_data/} may be processed using: <>= # Names of CEL files expfiles <- c(paste("embryo_tc_4_", 1:12, ".CEL", sep=""), paste("embryo_tc_6_", 1:12, ".CEL", sep=""), paste("embryo_tc_8_", 1:12, ".CEL", sep="")) # Load the CEL files expdata <- ReadAffy(filenames=expfiles, celfile.path="embryo_tc_array_data") # Setup experimental data (observation times) pData(expdata) <- data.frame("time.h" = rep(1:12, 3), row.names=rownames(pData(expdata))) # Run mmgMOS processing (requires several minutes to complete) drosophila_mmgmos_exprs <- mmgmos(expdata) drosophila_mmgmos_fragment <- drosophila_mmgmos_exprs @ This data needs to be further processed to make it suitable for our models. This can be done using <>= drosophila_gpsim_fragment <- processData(drosophila_mmgmos_fragment, experiments=rep(1:3, each=12)) @ Here the last argument specifies that we have three independent time series of measurements. In order to save time with the demos, a part of the result of this is included in this package and can be loaded using <<>>= data(drosophila_gpsim_fragment) @ \subsection{Learning an individual model} Let us now learn a model for the TF twist and one of its potential targets <<>>= # The probe identifier for TF 'twi' twi <- "143396_at" # The probe identifier for the target gene target <- "152715_at" # Learn the model using only one of the 3 repeats in the data model <- GPLearn(drosophila_gpsim_fragment[,1:12], TF=twi, targets=target, useGpdisim=TRUE, quiet=TRUE) # Display the model parameters show(model) @ \subsection{Visualising the model} The model can be plotted using the command <>= GPPlot(model) @ the results of which can be seen in Fig.~\ref{fig:model}. \begin{figure} \begin{center} <>= GPPlot(model) @ \end{center} \caption{Single target models for the gene FBgn0003486. The models for each repeated time series are shown in different columns.\label{fig:model}} \end{figure} \subsection{Ranking the targets} Please refer to the User Guide for details on bulk ranking of candidate targets. \section{Session Info} <>= sessionInfo() @ \bibliographystyle{plainnat} \bibliography{gpsim} \end{document}