%\VignetteIndexEntry{ENCODE/hg18 track demo: DNaseI hypersensitivity} %\VignetteDepends{GGtools, GGdata} %\VignetteKeywords{Genetics} %\VignettePackage{encoDnaseI} % % NOTE -- ONLY EDIT THE .Rnw FILE!!! The .tex file is % likely to be overwritten. % \documentclass[12pt]{article} \usepackage{amsmath,pstricks} % With MikTeX 2.9, using pstricks also requires using auto-pst-pdf or running % pdflatex will fail. Note that using auto-pst-pdf requires to set environment % variable MIKTEX_ENABLEWRITE18=t on Windows, and to set shell_escape = t in % texmf.cnf for Tex Live on Unix/Linux/Mac. \usepackage{auto-pst-pdf} \usepackage[authoryear,round]{natbib} \usepackage{hyperref} \textwidth=6.2in \textheight=8.5in %\parskip=.3cm \oddsidemargin=.1in \evensidemargin=.1in \headheight=-.3in \newcommand{\scscst}{\scriptscriptstyle} \newcommand{\scst}{\scriptstyle} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textit{#1}}} \newcommand{\Rmethod}[1]{{\texttt{#1}}} \newcommand{\Rfunarg}[1]{{\texttt{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \textwidth=6.2in \bibliographystyle{plainnat} \begin{document} %\setkeys{Gin}{width=0.55\textwidth} \title{Using the DNaseI hypersensitivity data from encode in R} \author{VJ Carey} \maketitle \section{Introduction} Annotation tracks from UCSC hg18 can be used with Bioconductor to help establish genomic contexts of events or alterations. The CD4-based hypersensitivity assays are collected in the structure rawCD4 in package encoDnaseI: <>= library(encoDnaseI) data(rawCD4) rawCD4 @ At present, we can subset the data by casting a chromosome number: <>= c19g = rawCD4[ chrnum(19) ] c19g @ And we can get a trace of values along the chromosome: <>= c19gxy = getTrkXY(c19g) plot(c19gxy) @ \section{Coupling the DnaseI series to genetics of gene expression} We would like to subset a racExSet from GGdata and look at snps that are in regions of high DNaseI sensitivity. Some infrastructure to help with this is: <>= clipSnps = function( sms, chrn, lo, hi ) { allp = getSnpLocs(sms) allp = allp-allp[1] # relative ok = allp >= lo & allp <= hi thesm = smList(sms)[[1]] rsn = colnames(thesm) rid = rsn[which(ok)] thesm = thesm[, rid, drop=FALSE] nn = new.env() tmp = list(thesm) names(tmp) = as.character(chrn) assign("smList", tmp, nn) sms@smlEnv = nn sms@activeSnpInds=which(ok) sms } rangeX = function(htrk) { range(getTrkXY(htrk)$x) } @ So we get the information on expression and SNPs in chr19g and filter: <>= library(GGtools) library(GGdata) h19 = getSS("GGdata", "19") rs19g = rangeX(c19g) library(SNPlocs.Hsapiens.dbSNP.20090506) c19l = getSNPlocs("chr19") h19locs = rbind(rsid=as.numeric(c19l[,"RefSNP_id"]), loc=as.numeric(c19l[,"loc"])) #h19locs = getSnpLocs(hmceuB36[chrnum(19),])[[1]] goodlocs = which(h19locs[2,] >= rs19g[1] & h19locs[2,] <= rs19g[2]) h19rsn = paste("rs", h19locs[1,goodlocs], sep="") h19trim = h19[rsid(h19rsn),] #ok #c19gf = clipSnps( hmceuB36[chrnum(19),], chrnum(19), rs19g[1], rs19g[2] ) #c19gf @ A gene-specific screen can be computed as follows: <>= oo = options() # don't take warnings on multiple probes... caveat emptor options(warn=0) library(GGtools) showMethods("gwSnpTests") smxi1 = gwSnpTests(genesym("MXI1")~1-1, h19trim, chrnum(19)) smxi1 #plot(smxi1) options(oo) @ We'd like to look at the SNP screen results juxtaposed with the DnaseI results. <>= print(juxtaPlot( c19g, smxi1, h19locs )) @ Another example: <>= oo = options() options(warn=0) sOSR2 = gwSnpTests(genesym("OSR2")~1-1, h19trim, chrnum(19)) print(juxtaPlot( c19g, sOSR2, h19locs )) options(oo) @ With these scores, we can find gene-snp combinations for which association is at least partly synchronized with DHS. Algorithms for systematically assessing synchronicity are in development. \end{document}