To view a phylogenetic tree, we first need to parse the tree file into R. The ggtree package supports many file formats, including the output files of commonly used software packages in evolutionary biology. For more details, please refer to the Tree Data Import vignette.

library("ggtree")
nwk <- system.file("extdata", "sample.nwk", package="ggtree")
tree <- read.tree(nwk)
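
As a small complement to reading from a file, the sketch below parses a Newick string directly. It uses read.tree() from the ape package with its text argument; the four-tip tree and the variable names are made up for illustration and are not part of the ggtree sample data.

```r
library(ape)  # provides read.tree(), Ntip()

# A hypothetical four-tip tree written inline in Newick format
newick_text <- "((A:1,B:2):1,(C:3,D:1):2);"
toy_tree <- read.tree(text = newick_text)

print(toy_tree$tip.label)  # tip names, in the order they appear in the string
print(Ntip(toy_tree))      # number of tips
```

The resulting phylo object can be passed to the plotting functions in the same way as the tree read from sample.nwk.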

Viewing a phylogenetic tree with ggtree

The ggtree package extends the ggplot2 package to support viewing phylogenetic trees. It implements a geom_tree layer for displaying a phylogenetic tree, as shown below:

ggplot(tree, aes(x, y)) + geom_tree() + theme_tree()

The ggtree function was implemented as a shortcut for visualizing a tree, and it works exactly the same as shown above.

ggtree inherits the advantages of ggplot2. For example, we can change the color, size and type of the lines just as we do with ggplot2.

ggtree(tree, color="firebrick", size=1, linetype="dotted")

By default, the tree is displayed in ladderized form; set the parameter ladderize = FALSE to disable this.

ggtree(tree, ladderize=FALSE)
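
Ladderizing reorders the branches at each internal node so that clades are sorted by size, without changing the topology. The sketch below illustrates this with ladderize() from the ape package on a random tree. It is an assumption here that ggtree's ladderize option performs an equivalent reordering; the seed and tree size are arbitrary.

```r
library(ape)

set.seed(1)                      # arbitrary seed so the random tree is reproducible
raw_tree <- rtree(8)             # hypothetical random tree with 8 tips
lad_tree <- ladderize(raw_tree)  # reorder clades by size

# Ladderizing only reorders edges: the tips and the number of edges are unchanged
print(setequal(raw_tree$tip.label, lad_tree$tip.label))
print(nrow(raw_tree$edge) == nrow(lad_tree$edge))
```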

Branch lengths are used to scale the edges by default. Set branch.length = "none" to view only the tree topology (a cladogram), or set it to another numerical variable to scale the tree (e.g. dN/dS; see also the Tree Annotation vignette).

ggtree(tree, branch.length="none")

Layout

Currently, ggtree supports several layouts, including rectangular (the default), slanted and circular, for both phylograms (the default) and cladograms (when the user explicitly sets branch.length='none'). ggtree also supports an unrooted layout.

Phylogram

rectangular

ggtree(tree) + ggtitle("(Phylogram) rectangular layout")

slanted

ggtree(tree, layout="slanted") + ggtitle("(Phylogram) slanted layout")

circular

ggtree(tree, layout="circular") + ggtitle("(Phylogram) circular layout")

Cladogram

rectangular

ggtree(tree, branch.length='none') + ggtitle("(Cladogram) rectangular layout")

slanted

ggtree(tree, layout="slanted", branch.length='none') + ggtitle("(Cladogram) slanted layout")

circular

ggtree(tree, layout="circular", branch.length="none") + ggtitle("(Cladogram) circular layout")

Unrooted

The unrooted layout is implemented with the equal-angle algorithm described in Inferring Phylogenies [1].

ggtree(tree, layout="unrooted") + ggtitle("unrooted layout")

Time-scaled tree

A phylogenetic tree can be scaled by time (a time-scaled tree) by specifying the parameter mrsd (most recent sampling date).

tree2d <- read.beast(system.file("extdata", "twoD.tree", package="ggtree"))
ggtree(tree2d, mrsd = "2014-05-01") + theme_tree2()
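
To give a feel for what a most recent sampling date means on a continuous time axis, the sketch below converts the date string used above into a decimal year in base R. This is only an illustration under the assumption that the latest tip sits at that date and older nodes are placed further back in time; ggtree's own conversion may differ in detail, and all variable names are hypothetical.

```r
# Most recent sampling date, as in the example above
mrsd <- as.Date("2014-05-01")

# Express the date as a decimal year: year + fraction of the year elapsed
year <- as.integer(format(mrsd, "%Y"))
year_start <- as.Date(paste0(year, "-01-01"))
year_end <- as.Date(paste0(year + 1, "-01-01"))
decimal_year <- year + as.numeric(mrsd - year_start) / as.numeric(year_end - year_start)

print(decimal_year)
```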

Two dimensional tree

ggtree implements a two-dimensional tree. It accepts a yscale parameter to scale the y-axis based on a selected tree attribute. The attribute should be a numerical variable. If it is a character/categorical variable, the user should provide a named vector that maps the variable to numeric values by passing it to the yscale_mapping parameter.

ggtree(tree2d, mrsd = "2014-05-01",
       yscale="NGS", yscale_mapping=c(N2=2, N3=3, N4=4, N5=5, N6=6, N7=7)) +
    theme_classic() +
    theme(axis.line.x=element_line(), axis.line.y=element_line()) +
    theme(panel.grid.major.x=element_line(color="grey20", linetype="dotted", size=.3),
          panel.grid.major.y=element_blank()) +
    scale_y_continuous(labels=paste0("N", 2:7))

In this example, the figure shows the quantity of y increasing along the trunk. Users can highlight the trunk with a different line size or color using the functions described in the Tree Manipulation vignette.
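
The value passed to yscale_mapping in the two-dimensional example is an ordinary named numeric vector. The base R sketch below shows how such a vector translates category labels into numbers by name lookup; the sample attribute values are hypothetical, and how ggtree applies the mapping internally is not shown here.

```r
# Named numeric vector: names are category labels, values are y positions
ngs_map <- c(N2 = 2, N3 = 3, N4 = 4, N5 = 5, N6 = 6, N7 = 7)

# Hypothetical categorical attribute values, one per node
ngs_values <- c("N2", "N5", "N7", "N3")

# Indexing by name translates each category to its numeric value
y_positions <- unname(ngs_map[ngs_values])
print(y_positions)
```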

Displaying tree scale (evolutionary distance)

To show the tree scale, add a geom_treescale() layer.

ggtree(tree) + geom_treescale()

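geom_treescale() supports several parameters, including x and y (position of the scale), width (length of the scale), color, fontsize, linesize and offset, as used in the following examples: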

ggtree(tree) + geom_treescale(x=0, y=12, width=6, color='red')

ggtree(tree) + geom_treescale(fontsize=8, linesize=2, offset=-1)

We can also use theme_tree2() to display the tree scale by adding an x-axis.

ggtree(tree) + theme_tree2()

The tree scale is not restricted to evolutionary distance; ggtree can re-scale the tree with other numerical variables. More details can be found in the Tree Annotation vignette.

Displaying nodes/tips

Showing all the internal nodes and tips in the tree can be done by adding a layer of points using geom_nodepoint, geom_tippoint or geom_point.

ggtree(tree) + geom_point(aes(shape=isTip, color=isTip), size=3)

p <- ggtree(tree) + geom_nodepoint(color="#b5e521", alpha=1/4, size=10)
p + geom_tippoint(color="#FDAC4F", shape=8, size=3)

Displaying labels

Users can use geom_text to display node labels (if available) and tip labels simultaneously, or geom_tiplab to display only tip labels:

p + geom_tiplab(size=3, color="purple")

For circular and unrooted layouts, ggtree supports rotating node labels according to the angles of the branches.

ggtree(tree, layout="circular") + geom_tiplab(aes(angle=angle), color='blue')

By default, labels are positioned at the nodes; they can instead be positioned at the middle of the branch/edge.

p + geom_tiplab(aes(x=branch), size=3, color="purple", vjust=-0.3)

Positioning labels at the middle of the branch is very useful when annotating the transition from a parent node to a child node.

Update tree view with a new tree

In the previous example, the p object stores a tree view of 13 tips in which the internal nodes are highlighted with large colored dots. To apply this pattern (which could be far more complex) to a new tree, users do not need to rebuild the view step by step. ggtree provides an operator, %<%, for applying a visualization pattern to a new tree.

For example, the pattern in the p object will be applied to a new tree with 50 tips as shown below:

p %<% rtree(50)

Another example can be found in the Tree Data Import vignette.

Theme

theme_tree() defines a completely blank canvas, while theme_tree2() adds phylogenetic distance (via the x-axis). Both themes accept a bgcolor parameter that sets the background color.

multiplot(
    ggtree(rtree(30), color="red") + theme_tree("steelblue"),
    ggtree(rtree(20), color="white") + theme_tree("black"),
    ncol=2)

Visualize a list of trees

ggtree supports multiPhylo objects, so a list of trees can be viewed simultaneously.

trees <- lapply(c(10, 20, 40), rtree)
class(trees) <- "multiPhylo"
ggtree(trees) + facet_wrap(~.id, scales="free") + geom_tiplab()

One hundred bootstrap trees can also be viewed simultaneously.

btrees <- read.tree(system.file("extdata/RAxML", "RAxML_bootstrap.H3", package="ggtree"))
ggtree(btrees) + facet_wrap(~.id, ncol=10)

Another way to view the bootstrap trees is to merge them together to form a density tree. We can add a layer of the best tree on top of the density tree.

p <- ggtree(btrees, layout="rectangular", color="lightblue", alpha=.3)

best_tree <- read.tree(system.file("extdata/RAxML", "RAxML_bipartitionsBranchLabels.H3", package="ggtree"))
df <- fortify(best_tree, branch.length = 'none')
p + geom_tree(data=df, color='firebrick')

References

1. Felsenstein, J. Inferring Phylogenies. (Sinauer Associates, 2003).