\name{findChunks}
\alias{findChunks}
\title{Identifies `chunks' of data within a set of aligned reads.}
\description{
This function identifies chunks of data within a set of aligned reads by looking for gaps within the alignments, that is, regions where no reads align. If we assume that a locus cannot span a gap of sufficient length, then we can split the analysis of the data into chunks defined by these gaps, reducing the complexity of the segmentation problem.
}
\usage{
findChunks(alignments, gap, checkDuplication = TRUE)
}
\arguments{
  \item{alignments}{A \code{\linkS4class{GRanges}} object
    defining a set of aligned reads.}
  \item{gap}{The minimum length of a gap across which it is assumed that
    no locus can exist.}
  \item{checkDuplication}{Should we check whether or not reads are
    duplicated within a chunk? Defaults to \code{TRUE}.}
}

\details{This function is called by the \code{\link{readGeneric}} and
  \code{\link{readBAM}} functions but may usefully be called again if
  filtering of an \code{\linkS4class{alignmentData}} object has altered
  the data present, or to reduce the computational effort required for
  subsequent analysis. The lower the \code{gap} parameter used to define the
  chunks, the faster (though potentially less accurate) any subsequent
  analyses will be.
}
\value{
A modified \code{\linkS4class{GRanges}} object, now containing columns `chunk'
  and `chunkDup' (if \code{checkDuplication} is \code{TRUE}), identifying the chunk
  to which each alignment belongs and whether the alignment of the tag is
  duplicated within the chunk, respectively.
}
\author{
Thomas J. Hardcastle
}

\examples{
# Define the chromosome lengths for the genome of interest.

chrlens <- c(2e6, 1e6)

# Define the files containing sample information.

datadir <- system.file("extdata", package = "segmentSeq")
libfiles <- c("SL9.txt", "SL10.txt", "SL26.txt", "SL32.txt")

# Establish the library names and replicate structure.

libnames <- c("SL9", "SL10", "SL26", "SL32")
replicates <- c(1,1,2,2)

# Read the files to produce an `alignmentData' object.

alignData <- readGeneric(file = libfiles, dir = datadir,
                         replicates = replicates, libnames = libnames,
                         chrs = c(">Chr1", ">Chr2"), chrlens = chrlens,
                         gap = 100)

# Filter the data on the number of matches of each tag to the genome.

alignData <- alignData[values(alignData@alignments)$matches < 5,]

# Redefine the chunking structure of the data.

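# findChunks returns a GRanges object, so the result is stored back in the
# alignments slot rather than overwriting the alignmentData object itself.

alignData@alignments <- findChunks(alignData@alignments, gap = 100)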
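
# Illustrative sketch only, in base R: this is not the implementation used
# by findChunks. It shows the idea of gap-based chunking on one chromosome.
# The toy coordinates and the boundary convention (a new chunk starts when
# at least toyGap uncovered bases separate a read from all preceding reads)
# are assumptions made for illustration.

toyStart <- c(1, 50, 120, 400, 450, 900)
toyEnd <- c(60, 110, 150, 430, 470, 950)
toyGap <- 100

# Furthest base covered by the reads preceding each read (sorted by start).
coveredTo <- cummax(toyEnd)[-length(toyEnd)]

# Number of uncovered bases between each read and the reads before it.
uncovered <- toyStart[-1] - coveredTo - 1

# A new chunk begins wherever the uncovered stretch is at least toyGap long.
toyChunk <- cumsum(c(TRUE, uncovered >= toyGap))
toyChunk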

}
\keyword{manip}