% pedigreeHandling.Rnw
% -------------------------------------------------------------------------
% What: Pedigree handling vignette
% $Id: pedigreeHandling.Rnw 1179 2007-04-03 14:13:02Z ggorjan $
% Time-stamp: <2007-04-01 01:09:08 ggorjan>
% -------------------------------------------------------------------------

% --- Vignette stuff ---
% -------------------------------------------------------------------------

%\VignetteIndexEntry{Pedigree handling}
%\VignettePackage{GeneticsPed}
%\VignetteDepends{gdata}
%\VignetteKeywords{genetics, pedigree, parents, children, progeny, ancestor, ascendant, descendant, generation, sex, birth date, family}

% --- Preamble ---
% -------------------------------------------------------------------------

% Document class and general packages
% -------------------------------------------------------------------------

\documentclass[fleqn,a4paper]{article} % fleqn - alignment of equations
                                       % a4paper appropriate page size
\usepackage{graphicx}                  % For inclusion of pictures, ...
\usepackage[authoryear]{natbib}        % Bibliography - Natbib
\newcommand{\SortNoop}[1]{}            % Sort command for bib entries
\usepackage{hyperref}
\hypersetup{%
  pdftitle={Pedigree handling},
  pdfauthor={Gregor Gorjanc},
  pdfkeywords={genetics, pedigree, parents, children, progeny, ancestor, ascendant, descendant, generation, sex, birth date, family}
}
\newcommand{\email}[1]{\href{mailto:#1}{#1}} % Email command
\makeatletter                          % Allow the use of @ in command names

% Paragraph and page
% -------------------------------------------------------------------------

\usepackage{setspace}                  % Line spacing
\onehalfspacing
\setlength\parskip{\medskipamount}
\setlength\parindent{0pt}

% --- Lyx Tips&Tricks: better formatting, less hyphenation problems ---
\tolerance 1414
\hbadness 1414
\emergencystretch 1.5em
\hfuzz 0.3pt
\widowpenalty=10000
\vfuzz \hfuzz
\raggedbottom

% Page size
\usepackage{geometry}
\geometry{verbose,a4paper,tmargin=3.0cm,bmargin=3.0cm,lmargin=2.5cm,
  rmargin=2.5cm,headheight=20pt,headsep=0.7cm,footskip=12pt}

\makeatother                           % Cancel the effect of \makeatletter

% R and friends
% -------------------------------------------------------------------------

\newcommand{\program}[1]{{\textit{\textbf{#1}}}}
\newcommand{\code}[1]{{\texttt{#1}}}
% http://www.bioconductor.org/develPage/guidelines/vignettes/vignetteGuidelines.pdf
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\textit{#1}}}
\newcommand{\Rfunarg}[1]{{\textit{#1}}}
\newcommand{\R}{\program{R}}

\usepackage{Sweave}                    % Sweave
\SweaveOpts{strip.white=all, keep.source=TRUE}

<<>>=
options(width=85)
@

% --- Document ---
% -------------------------------------------------------------------------

\begin{document}

\title{Pedigree handling}
\author{
  Gregor Gorjanc\\
  \email{gregor.gorjanc@bfro.uni-lj.si},\\
  David A.
  Henderson\\
  \email{dnadave@insightful.com}
}

\maketitle

\section*{Introduction}

Pedigrees are collections of related individuals. Often we represent these as
a linked list: a collection of trios, each consisting of an individual and its
two parents, that links everyone (or almost everyone) together. This simple
representation allows the use of graph theory in analysis. The GeneticsPed
package provides utilities for managing pedigrees: inputting, sorting, and
subsetting pedigrees, and computing on pedigrees by calculating relationship
coefficients and other similar quantities.

name some fields where pedigree is used

\cite{Falconer:1996}

\section*{?}

Pedigree class

\subsection*{}

describe individual subject ascendant

can be factor, character, numeric, but all must have the same class

\subsection*{Unknown individuals}

FIXME

Pedigrees are never complete because it is not possible to get data on all
ascendants. Therefore there are always some subjects with unknown ascendants.
As with the form of the pedigree, applications also differ in how they
represent unknown individuals, namely using 0, a blank field, a particular
string such as ``unknown'', etc. GeneticsPed uses \code{NA}, R's own
representation of unknown values. Other representations can be changed to
\code{NA} before a pedigree object is defined. To ease this process, we have
provided the argument \code{unknown} in \Rfunction{Pedigree}. Multiple values
can be passed to that argument in one call, e.g.
\code{unknown=c(0, "", "unknown")}. Internally, the change is done via the
\Rfunction{unknownToNA} generic function. In case one wants to use some other
representation, for example for a special application in R or for exporting to
an external application, the \Rfunction{NAToUnknown} function is provided.

How should we handle the case where one wants anything other than 0 in R:
should we allow for this, or just convert to \code{NA} each time in our
functions?
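
The recoding of unknown individuals described above can be sketched in base
\R{}. This is only an illustration under stated assumptions, not the
GeneticsPed or gdata implementation: the data frame \code{ped}, its column
names, and the helper names \code{unknownCodes} and \code{asc} are
hypothetical and chosen only for this example.

\begin{verbatim}
## Hypothetical raw pedigree in which unknown parents are coded as
## "0", "" or "unknown" (illustration only)
ped <- data.frame(id=c("A", "B", "C", "D"),
                  father=c("0", "0", "A", "unknown"),
                  mother=c("", "0", "B", "B"),
                  stringsAsFactors=FALSE)

unknownCodes <- c("0", "", "unknown")  # values to be treated as unknown
asc <- c("father", "mother")           # ascendant columns

## Replace every unknown code in the ascendant columns with NA
ped[asc] <- lapply(ped[asc], function(col) {
  col[col %in% unknownCodes] <- NA
  col
})

## Reverse direction, e.g. before exporting: recode NA as "0"
ped0 <- ped
ped0[asc] <- lapply(ped0[asc], function(col) {
  col[is.na(col)] <- "0"
  col
})
\end{verbatim}

Only the ascendant columns are recoded in this sketch, so a subject identifier
that happens to match one of the codes is left untouched.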
<<>>=
1+1
@

\section{Check consistency of data in pedigree}

check check.Pedigree checkId

\Rfunction{check} performs a series of checks on a pedigree object to ensure
consistency of data.

\code{check(x, $\ldots$)}

\code{checkId(x)}

\begin{itemize}
  \item[x] pedigree, object to be checked
  \item[$\ldots$] arguments to other methods, none for now
\end{itemize}

\Rfunction{checkId} performs various checks on subjects and their ascendants.
These checks are:

\begin{itemize}
  \item idClass: all ids must have the same class
  \item subjectIsNA: subject cannot be NA
  \item subjectNotUnique: subject must be unique
  \item subjectEqualAscendant: subject cannot be equal (in identification) to
    its ascendant
  \item ascendantEqualAscendant: ascendant cannot be equal to another
    ascendant
  \item ascendantInAscendant: ascendant cannot appear again as an ascendant of
    the other sex, i.e. a father cannot be a mother to someone else
  \item unusedLevels: when factors are used to represent ids, there might be
    unused levels for some ids; some functions rely on the number of levels,
    so a check is provided for this
\end{itemize}

\Rfunction{checkAttributes} is intended primarily for internal use and
performs a series of checks on attribute values needed in various functions.
It stops with error messages for all given attribute checks.

The checks report a list of more or less self-explanatory errors and
``pointers'' to these errors, to ease further work, i.e. removing the errors.

\begin{verbatim}
## EXAMPLES BELOW ARE ONLY FOR TESTING PURPOSES AND ARE NOT INTENDED
## FOR USERS, BUT THEY CAN NOT DO ANY HARM.
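
## --- base R sketch of some id checks (illustration only) ---
## A minimal sketch of three of the checks listed above, written in base R.
## It is NOT the GeneticsPed code; the data frame 'ped' and its columns
## id, father and mother are assumptions made only for this example.
ped <- data.frame(id=c("A", "B", "B", "D"),
                  father=c(NA, "A", "A", "D"),
                  mother=c(NA, NA, "C", "C"),
                  stringsAsFactors=FALSE)
## subject cannot be NA: rows with a missing id
naRows <- which(is.na(ped$id))
## subject must be unique: all rows that share an id with another row
dupRows <- which(duplicated(ped$id) | duplicated(ped$id, fromLast=TRUE))
## subject cannot equal its ascendant; which() drops the NA comparisons
## that arise from unknown ascendants
selfRows <- which(ped$id == ped$father | ped$id == ped$mother)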

## --- checkAttributes ---
tmp <- generatePedigree(5)
attr(tmp, "sorted") <- FALSE
attr(tmp, "coded") <- FALSE
GeneticsPed:::checkAttributes(tmp)
try(GeneticsPed:::checkAttributes(tmp, sorted=TRUE, coded=TRUE))

## --- idClass ---
tmp <- generatePedigree(5)
tmp$id <- factor(tmp$id)
class(tmp$id)
class(tmp$father)
try(GeneticsPed:::idClass(tmp))

## --- subjectIsNA ---
tmp <- generatePedigree(2)
tmp[1, 1] <- NA
GeneticsPed:::subjectIsNA(tmp)

## --- subjectNotUnique ---
tmp <- generatePedigree(2)
tmp[2, 1] <- 1
GeneticsPed:::subjectNotUnique(tmp)

## --- subjectEqualAscendant ---
tmp <- generatePedigree(2)
tmp[3, 2] <- tmp[3, 1]
GeneticsPed:::subjectEqualAscendant(tmp)

## --- ascendantEqualAscendant ---
tmp <- generatePedigree(2)
tmp[3, 2] <- tmp[3, 3]
GeneticsPed:::ascendantEqualAscendant(tmp)

## --- ascendantInAscendant ---
tmp <- generatePedigree(2)
tmp[3, 2] <- tmp[5, 3]
GeneticsPed:::ascendantInAscendant(tmp)
## Example with multiple parents
tmp <- data.frame(id=c("A", "B", "C", "D"),
                  father1=c("E", NA, "F", "H"),
                  father2=c("F", "E", "E", "I"),
                  mother=c("G", NA, "H", "E"))
tmp <- Pedigree(tmp, ascendant=c("father1", "father2", "mother"),
                ascendantSex=c(1, 1, 2),
                ascendantLevel=c(1, 1, 1))
GeneticsPed:::ascendantInAscendant(tmp)

## --- unusedLevels ---
tmp <- generatePedigree(2, colClass="factor")
tmp[3:4, 2] <- NA
GeneticsPed:::unusedLevels(tmp)
\end{verbatim}

\bibliographystyle{apalike}
\bibliography{library}

\end{document}

% -------------------------------------------------------------------------
% pedigreeHandling.Rnw ends here