\name{xbat}
\alias{xbat}
\docType{data}
\title{Simulated pedigree with genotypes and covariates}
\description{
  Simulated dataset for a pedigree of 1000 trios with 50 SNPs,
  8 quantitative traits, 2 binary traits, and 8 covariates.
}
\usage{data(xbat)}
\format{
  A \code{geneSet} object.
}
\details{
  This dataset is the 'xbat' example from Lange and Kraft's "Short Course:
  Genetics Association Analysis." It is described there as:

  "[This] simulated dataset comprises a pedigree file with genotype
  information for 1000 trios with 50 SNPs and a phenotype file that
  contains 8 quantitative traits, 2 binary traits, and 8 covariates.

  "Genotypes

  "The simulation generated complete genotype data for 1000 families with
  two parents and one offspring. The single nucleotide polymorphism (SNP)
  frequencies and haplotype blocks were estimated using real data. These
  estimates were fixed and used as parameters for the simulation of the
  parental genotypes. Offspring genotypes were generated by simulating
  random Mendelian transmission from their respective parents. In total,
  50 SNPs were simulated, 28 of which lie in 1 of 5 variable length
  haplotype blocks (range: 4 to 10 SNPs per block). The blocks were
  simulated as a function of haplotype block frequency, assuming no
  recombination, resulting in varying degrees of linkage disequilibrium
  within each block. The remaining 22 SNPs that are not in a haplotype
  block were simulated randomly as a function of SNP frequency. The SNPs
  are indicated in the header line of the pedigree file, and named SNP1,
  SNP2, \dots, and SNP50. Note that the affectation status variable in the
  pedigree file is coded as missing (0) for all individuals. All phenotype
  data comes from the phenotype file (see below).

  "Phenotypes

  "Overall, 10 phenotypes (\eqn{Y}) were simulated additively as a function
  of the genetic effect size \eqn{a}, marker score \eqn{X}, covariate
  effect size \eqn{b}, and covariate value \eqn{Z} as follows:
  \eqn{Y_i = a_i X_i + b_i Z_i \; (i = 1, 2, \ldots, 10)}{Y[i] = a[i] X[i] + b[i] Z[i] (i = 1, 2, ..., 10)}

  "Quantitative Traits

  "Eight quantitative phenotypes were simulated from a random sample from a
  normal distribution: \eqn{Y \sim N(aX + bZ, s^2)}{Y ~ N(aX + bZ, s^2)},
  where \eqn{a} is the additive effect for the phenotype and \eqn{s^2} is
  the variance. We measure the strength of the additive effect relative to
  the phenotypic variance by the heritability \eqn{h^2} [Falconer and
  Mackay, 1997], which is the proportion of phenotypic variation explained
  by genetic variation. We assume that the environment variance is 1. SNP23
  was simulated as the "disease SNP" which is the 5th SNP in a 10 SNP
  haplotype block. The heritabilities were simulated from random uniform
  distribution ranging from -0.1 to 0.1. In addition, the simulation
  produced two correlated quantitative traits (QTL9 and QTL10;
  \eqn{r^2 = 0.40}). The quantitative traits are indicated in the header
  line of the phenotype file and named QTL1, QTL2, \dots, and QTL10.

  "Binary Traits

  "Two binary traits were simulated simply by dichotomizing the first
  quantitative trait (QTL1). For the AFF1 trait, individuals were coded as
  affected (1) if their QTL1 value was above the sample mean and unaffected
  (0) if their QTL1 value was below the sample mean. For the AFF2 trait,
  individuals were coded as affected (1) if their QTL1 value was at least
  one standard deviation above the sample mean, and missing ("-") if their
  trait value did not reach that criterion.

  "Covariates

  "In addition to the additive genetic effect, each phenotype was simulated
  with one covariate effect. The quantitative covariates were sampled from
  a normal distribution (mu = random, \eqn{s^2 = 10}). The effect size for
  each covariate was sampled randomly from a uniform distribution (0, 1).
  The covariates are indicated in the header line of the phenotype file and
  named COV1, COV2, \dots, and COV10. Note that COV1 corresponds to QTL1,
  AFF1 and AFF2."
  (quoted from Lange and Kraft 2005)
}
\source{
  Lange, C. and Kraft, P. (2005). "Short Course: Genetics Association
  Analysis."
}
\references{
  Lange, C. and Kraft, P. (2005). "Short Course: Genetics Association
  Analysis."

  DeMeo, D. L., C. Lange, et al. (2002). "Univariate and multivariate
  family-based association analysis of the IL-13 ARG130GLN polymorphism in
  the Childhood Asthma Management Program." Genet Epidemiol 23(4): 335-48.
}
\examples{
# Attach the package that provides the dataset
library(GeneticsBase)
# Load the simulated xbat data and preview it
data(xbat)
head(xbat)
}
\keyword{datasets}