Introduction to Phylogenies and the Comparative Method

Showing some neat features of R!
Author
Affiliation
Published

February 6, 2023

Note

In this lab, you will learn basic tools in R for visualizing phylogenies, optimizing ancestral states for a discrete and continuous characters, testing models of character evolution, and performing phylogenetic correction of a regression model. This lab is based in part on one designed by Luke Harmon for a workshop that he and others ran at Ilha Bela, Brazil; the original can be seen here There are many other useful labs in comparative analysis from that workshop that you can peruse at your leisure.

You will need two datasets, that will be provided for you:

  1. A data.frame with species traits – furnariides_traits.csv

  2. A phylogenetic tree – furnariides_tree.nex

The clade we will work on today is the Furnariides (Aves: Passeriformes), also known as the largest continental endemic radiation (Pinto-Ledezma et al., 2019). We will use the phylogenetic tree reconstructed by Jesús. The trait data correspond to several morphological measurements of birds from AVONET (Tobias et al., 2022).

1 Set up your data and your working directory

You will need to have a set of R packages to do this lab. Install the following packages:

Code
packages <- c("tidyverse", "ape", "geiger", "nlme", "phytools", "rr2") 
# Package vector names
Function install.packages()

You can use the function install.packages() to install the packages.

If you don’t want to install the packages one by one, you can use the next command.

Code
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())

if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages], dependencies = TRUE)
}

This command, will, first, check if you already the packages installed, then if a package is not installed in your computer, will install it.

Load installed packages:

Code
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.0      ✔ purrr   1.0.1 
✔ tibble  3.1.8      ✔ dplyr   1.0.10
✔ tidyr   1.2.1      ✔ stringr 1.5.0 
✔ readr   2.1.3      ✔ forcats 0.5.1 
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Code
library(ape)
library(geiger)
library(nlme)

Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse
Code
library(phytools)
Loading required package: maps

Attaching package: 'maps'

The following object is masked from 'package:purrr':

    map
Code
library(rr2)

Attaching package: 'rr2'

The following object is masked from 'package:ape':

    binaryPGLMM

Set up a working directory. Tell R that this is the directory you will be using, and read in your data:

Function getwd()

You can use the function getwd() to get the current working directory.

Code
setwd("path/for/your/directory")
Function dir.create()

You can use the function dir.create() to get create a series of folders within your working directory. For example, if you run dir.create(“Output”) it will create an empty folder–named Output–within your working directory. This folder then can be used to store the results from the lab.

Load the data. Instead of reading files from the computer we will pull the required data directly from the internet.

Code
## Trait data
furnaData <- read_csv("https://raw.githubusercontent.com/jesusNPL/BiodiversityScience/master/Spring2023/Lab_1/Data/furnariides_traits.csv") %>% 
  column_to_rownames("Sciname") %>%   # we are using the column "Sciname" as rownames
  drop_na(Range.Size)
Rows: 652 Columns: 38
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (15): Sciname, Species3, Family3, Order3, Mass.Source, Mass.Refs.Other, ...
dbl (23): Total.individuals, Female, Male, Unknown, Complete.measures, Beak....

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Code
## Phylogenetic data
furnaTree <- read.nexus("https://raw.githubusercontent.com/jesusNPL/BiodiversityScience/master/Spring2023/Lab_1/Data/furnariides_tree.nex")

Another option is downloading the data and storing it on your computer. You can use the following lines to do that. These lines will do: 1) check your working directory, 2) download the lab data in a zip file, and 3) unzip the downloaded data.

Code
main.dir <- getwd() # Will get the working directory

urls <- "https://www.dropbox.com/s/z80ba99eyzmo6p3/Lab_1.zip?dl=1" # Name of the file to download

download.file(url = urls, file.path(main.dir, "Data/Lab_1.zip"), mode = "wb") # download the file in a specific folder

unzip("Data/Lab_1.zip")

Previous lines of code will only work if you have set your working directory (WD) and only if you have the folder Data within the WD. You can check the Intro-R lab for more details.

OK. You should be ready to go.

Let’s inspect the data first, to do that we will use the function “glimpse()” of the R package {dplyr}

The pipe (%>%) operator

This operator is, maybe, the most used operator from the {dplyr} package and is used to perform a sequence of operations on a data frame. In other words, the pipe operator simply feeds the results of one operation into the next operation below it.

Code
glimpse(furnaData)
Rows: 651
Columns: 37
$ Species3           <chr> "Conopophaga ardesiaca", "Conopophaga aurita", "Con…
$ Family3            <chr> "Conopophagidae", "Conopophagidae", "Conopophagidae…
$ Order3             <chr> "Passeriformes", "Passeriformes", "Passeriformes", …
$ Total.individuals  <dbl> 13, 6, 6, 9, 9, 4, 18, 12, 6, 9, 12, 8, 15, 42, 19,…
$ Female             <dbl> 4, 2, 3, 4, 3, 1, 10, 3, 2, 1, 2, 2, 4, 8, 5, 3, 4,…
$ Male               <dbl> 9, 4, 3, 4, 6, 2, 8, 5, 4, 4, 9, 3, 8, 27, 11, 8, 5…
$ Unknown            <dbl> 0, 0, 0, 1, 0, 1, 0, 4, 0, 4, 1, 3, 3, 7, 3, 0, 1, …
$ Complete.measures  <dbl> 6, 4, 4, 4, 4, 3, 4, 3, 4, 4, 4, 3, 4, 10, 5, 4, 4,…
$ Beak.Length_Culmen <dbl> 17.5, 15.6, 20.9, 15.2, 16.4, 20.4, 16.0, 17.7, 16.…
$ Beak.Length_Nares  <dbl> 9.0, 9.3, 9.6, 9.6, 8.9, 11.4, 9.9, 9.9, 9.4, 63.2,…
$ Beak.Width         <dbl> 5.7, 5.8, 5.9, 5.2, 4.9, 6.2, 5.6, 6.2, 5.7, 4.3, 4…
$ Beak.Depth         <dbl> 4.3, 4.4, 4.4, 4.6, 4.0, 5.2, 4.9, 4.5, 4.2, 5.8, 5…
$ Tarsus.Length      <dbl> 30.3, 27.2, 29.1, 26.9, 28.1, 32.7, 25.5, 25.6, 26.…
$ Wing.Length        <dbl> 70.8, 67.7, 74.4, 69.6, 70.0, 79.2, 65.2, 65.6, 69.…
$ Kipps.Distance     <dbl> 9.8, 9.6, 8.8, 6.3, 7.8, 9.3, 8.6, 11.2, 7.1, 13.9,…
$ Secondary1         <dbl> 61.4, 58.0, 65.4, 65.5, 62.6, 70.4, 55.4, 54.7, 62.…
$ `Hand-Wing.Index`  <dbl> 13.9, 14.2, 11.8, 8.8, 11.0, 11.8, 13.4, 17.1, 10.2…
$ Tail.Length        <dbl> 45.2, 32.5, 42.1, 43.0, 45.3, 41.2, 28.3, 30.8, 35.…
$ Mass               <dbl> 26.30, 26.30, 27.00, 22.00, 25.21, 42.50, 20.10, 23…
$ Mass.Source        <chr> "Dunning", "Dunning", "Dunning", "EltonTraits_Model…
$ Mass.Refs.Other    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Inference          <chr> "NO", "NO", "NO", "NO", "NO", "NO", "NO", "NO", "NO…
$ Traits.inferred    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Reference.species  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Habitat            <chr> "Forest", "Forest", "Forest", "Forest", "Forest", "…
$ Habitat.Density    <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ Habitat_recode     <chr> "Forest", "Forest", "Forest", "Forest", "Forest", "…
$ Migration          <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ Trophic.Level      <chr> "Carnivore", "Carnivore", "Carnivore", "Carnivore",…
$ Trophic.Niche      <chr> "Invertivore", "Invertivore", "Invertivore", "Inver…
$ Primary.Lifestyle  <chr> "Insessorial", "Insessorial", "Insessorial", "Inses…
$ Min.Latitude       <dbl> -22.34, -11.46, -12.58, -9.55, -32.41, -15.14, -27.…
$ Max.Latitude       <dbl> -12.44, 7.43, 8.04, -4.00, -12.65, -2.02, -6.60, -0…
$ Centroid.Latitude  <dbl> -16.59, -3.08, -2.03, -7.65, -21.94, -7.84, -18.32,…
$ Centroid.Longitude <dbl> -66.99435, -62.26265, -76.24056, -35.89943, -48.840…
$ Range.Size         <dbl> 100334.39, 3387968.70, 160640.32, 58582.69, 1896972…
$ Species.Status     <chr> "Extant", "Extant", "Extant", "Extant", "Extant", "…

Let’s check if our trait data contain the same species as in the phylogeny

Code
tmp <- name.check(phy = furnaTree, data = furnaData)
tmp
$tree_not_data
[1] "Philydor_novaesi"

$data_not_tree
character(0)

It indicates that the species Philydor novaesi is not present in the trait data, so let’s drop this species from the phylogeny.

Code
furnaTree <- drop.tip(phy = furnaTree, tip = tmp$tree_not_data)

We can double check if our data match after dropping the missing species

Code
name.check(phy = furnaTree, data = furnaData)
[1] "OK"

Now it seems that we are ready to go!

2 Working with trees

Let’s start by looking at the phylogeny of these birds and learning a bit about how to work with trees in R.

What does your tree look like?

Code
plot(furnaTree)

Whoa. That’s ugly. Let’s clean it up.

Code
plot.phylo(furnaTree, no.margin = TRUE, cex = 0.5)

Better. You can mess around with tree plotting functions in plot.phylo() as much as you’d like. Try this for example:

Code
plot.phylo(furnaTree, type = "fan", no.margin = TRUE, cex = 0.3)

Pretty.

It may be useful to understand how trees are encoded in R. Typing in just the name of the tree file like this:

Code
furnaTree

will give you basic information about the phylogeny: the number of tips and nodes; what the tips are called; whether the tree is rooted; and if it has branch lengths.

Code
str(furnaTree)

will tell you more about tree structure. Trees consist of tips connected by edges (AKA branches)

Code
furnaTree$tip.label

gives you a list of all your terminal taxa, which are by default numbered 1-n, where n is the number of taxa.

Code
furnaTree$Nnode

gives you the number of nodes. This is a fully bifurcating rooted tree, so it has 1 fewer node than the number of taxa.

Code
furnaTree$edge

This tells you the beginning and ending node for all edges.

Put that all together with the following

Code
plot.phylo(furnaTree, type = "fan", no.margin = TRUE, cex = 0.7, label.offset = 0.1)
nodelabels(cex = 0.5)
tiplabels(cex = 0.5)

There are many ways to manipulate trees in R using {Ape}, {Phytools}, and other packages. This just gives you a bare-bones introduction.

3 Working with a data matrix and testing hypotheses in a phylogenetically informed way

Let’s ask some questions using the trait data that were measured for these birds. First, explore the data in the “furnaData” matrix. Here are some options for visualizing data matrices:

Code
furnaData %>% 
  head() # this will show you the first few rows of your data matrix and its header

furnaData %>% 
  dimnames() # this will show you the row and column headers for your matrix

furnaData %>% 
  View() # this will let you visualize the entire matrix

After looking at the data, please answer the next questions

What variables were measured for each of these species of Furnariides? How many species of Furnariides were used?

Is that the same number that were in your phylogeny?

Hand-wing index is one of your variables. Let’s isolate it so we can work with it easily:

Code
hwi <- furnaData[, "Hand-Wing.Index"]
names(hwi) <- rownames(furnaData) 
# data vectors have to be labelled with tip names for the associated tree. 
# This is how to do that.
Exploring the data

It is good practice to check the distribution of your data before doing downstream analysis.

Code
hist(hwi)

In lecture, Brie talked about one model of character evolution, called a Brownian Motion model. This model assumes that a trait evolves from a starting state (z0) according to a random walk with the variance specified by the rate parameter \(\sigma^{2}\) (sigma-squared). In short, Brownian motion describes a process in which tip states are modeled under the assumption of a multivariate normal distribution. On a phylogeny, the multivariate mean of tip states is equal to the root state estimate, and variance accumulates linearly through time.

What does Brownian Motion evolution of hand-wing index in Furnariides look like?

Code
brownianModel <- fitContinuous(furnaTree, hwi)
brownianModel # this will show you the fit statistics and parameter values

Here, you can see the estimates for ancestral state (z0), and the rate parameter (\(\sigma^{2}\)), as well as some measures of model fit. The fit of the model is determined using maximum likelihood, and expressed as a log likelihood (lnL). The higher the lnL, the more probable the data given the model. However, when comparing different models, we can’t use the lnL, because it does not account for the difference in the number of parameters among models. Models with more parameters will always fit better, but do they fit significantly better? For example an OU model has 4 parameters (alpha [\(\alpha\)], theta [\(\theta\)], z0, and sigma-squared [\(\sigma^{2}\)]), so it should fit better than a BM model, which includes only z0 and sigsq. To account for this, statisticians have developed another measure of fit called the AIC (Akaike Information Criterion): AIC = (2xN)-2xlnL, where N is the number of parameters. This penalizes the likelihood score for adding parameters. When selecting among a set of models, the one with the lowest AIC is preferred. We will use this information later on in this lab.

In addition to assessing model fit, we can use the Brownian Motion model to reconstruct ancestral states of a character on a tree. To visualize what BM evolution of this trait looks like on a tree. The contMap() command in phytools estimates ancestral states and plots them on a tree.

Code
contMap(furnaTree, hwi, fsize = 0.5, lwd = 3)

Describe the evolution of Hand-wing index on this tree. How many times have exteremely high and extremely low Hand-wing index evolved on this tree?

What does this say about our ability to test hypotheses about the evolution of Hand-wing index?

Let’s go ahead and test some hypotheses. Range Size is another trait in your data matrix. Let’s assess whether there is a correlation between HWI and body mass? We will extract the column “Range Size” from the datamatrix and assign it species names, just as we did for “HWI” above.

Code
rangeSize <- furnaData[, "Range.Size"]
names(rangeSize) <- rownames(furnaData)

Let’s look at a plot of range size as a function of Hand-wing index.

Code
furnaData %>% 
  drop_na(Range.Size) %>% 
  ggplot(aes(x = `Hand-Wing.Index`, y = log(Range.Size))) + 
  geom_point(alpha = 0.5, color = "darkgray") + 
  labs(x = "Hand-Wing Index", y = "log(Range Size)")

Hm. looks promising. How would you describe the relationship between these two variables?

Why did we log scale range size?

Let’s be more quantitative in describing that relationship with a linear model.

Code
lm_hwi_rangesize <- lm(log(rangeSize) ~ hwi)
summary(lm_hwi_rangesize)
Code
furnaData %>% 
  drop_na(Range.Size) %>% 
  ggplot(aes(x = `Hand-Wing.Index`, y = log(Range.Size))) + 
  geom_point(alpha = 0.5, color = "darkgray") + 
  labs(x = "Hand-Wing Index", y = "log(Range Size)") + 
  geom_smooth(method = "lm")
`geom_smooth()` using formula = 'y ~ x'

The coefficients table from the summary() command shows the slope and intercept for the linear model describing range size as a function of Hand-wing index. Each line shows the estimated coefficient (Estimate), the standard error (Std. Error) of that estimate, as well as a t-statistic and associated p-value, testing whether those parameters are equal to 0. The Multiple R-squared is an estimate of how much variance in the response variable can be explained by the predictor variable.

Write the linear model for this relationship. Are the parameters significantly different from 0?

What is the R^2 value for this data?

How do you feel about that?

3.1 Phylogenetic regression (PGLS)

Nice. But. We have not considered the fact that these birds are related to each other, in fact, all this birds are monophyletic–i.e., the clade includes an ancestral taxon and all of its descendants. As such, they may share their hand-wing index and range size simply due to the fact that their ancestors had large HWI and range size or the reverse. In other words, we need to account for non-independence of residuals due to phylogeny. One way to do that is to use phylogenetic-generalized-least-squares regression (PGLS).

Code
pglsModel <- gls(log(rangeSize) ~ hwi, 
                 correlation = corBrownian(phy = furnaTree, form=~names(rangeSize)), 
                 method = "ML")

Let’s break this command down. This command infers a linear model for Range Size as a function of HWI (gls(rangeSize ~ hwi), but it specifies existing correlation structure in the data (correlation =) as the covariance of these traits assuming a Brownian motion model (corBrowinan()) based on the Furnariides tree (phy = furnaTree) and a correctly ordered list of taxon names (form=~names(rangeSize)). The model is fit using maximum likelihood (method = “ML”). To see the results:

Code
summary(pglsModel)
coef(pglsModel)
R2(pglsModel)
Code
furnaData %>% 
  drop_na(Range.Size) %>% 
  ggplot(aes(x = `Hand-Wing.Index`, y = log(Range.Size))) + 
  geom_point(alpha = 0.5, color = "darkgray") + 
  labs(x = "Hand-Wing Index", y = "log(Range Size)") + 
  #geom_smooth(method = "lm") + 
  geom_abline(intercept = coef(pglsModel)[1], slope = coef(pglsModel)[2], 
              color = "red", linewidth = 1.5)

# will plot the pgls regression line on your biplot.

Write the linear model for this relationship. Are the parameters significantly different from 0?

What is the R^2 value for this data? (use the likelihood-based R^2 value)

How do you feel about that?

Compare results from the pgls analysis with those that you got from the regular linear model you ran earlier.

4 Model Fitting

Brownian Motion is only one model of evolution for a continuous variable. Another model is the Ornstein-Uhlenbeck (OU) model, which allows the trait mean to evolve towards a new state (theta), with a selective force (alpha). These two new parameters, plus the starting state (z0) and the rate of evolution (sigsq) parameters from the BM model, make for a 4-parameter model. The Early Burst (EB) model allows the rate of evolution to change across the tree, where the early rate of evolution is high and declines over time (presumably as niches are filled during an adaptive radiation. The rate of evolution changes exponentially over time and is specified under the model r[t] = r[0] x exp(a x t), where r[0] is the initial rate, a is the rate change parameter, and t is time. The maximum bound is set to -0.000001, representing a decelerating rate of evolution. The minimum bound is set to \(log(10^{-5})\)/depth of the tree.

Let’s evaluate the relative fit of these three models to the Hand-wing index trait.

4.1 Brownian Motion (BM)

Code
brownianModel <- fitContinuous(furnaTree, hwi)

4.2 Ornstein-Uhlenbeck (OU)

Code
OUModel <- fitContinuous(furnaTree, hwi, model = "OU")

4.3 Early Burst (EB)

Code
EBModel <- fitContinuous(furnaTree, hwi, model = "EB")

And recover the parameter values and fit estimates.

Code
brownianModel
OUModel
EBModel

Compare all models and select the best fitting model.

Code
aicw(c(brownianModel$opt$aicc, OUModel$opt$aicc, EBModel$opt$aicc))

Make a table with the AIC and lnL values for each model. Which model provides the best fit for Hand-wing index?

Now, add the results for a model fitting analysis of the range size trait to this table.

So, we were wrong. An OU model fits HWI better (and you should be able to explain how we know that). Unfortunately, a PGLS analysis with an OU model specified is currently computationally difficult. The best we can do is report the results from our model fitting analysis, and realize that the parameters from BM might not be the best fit.

However, we can still test our hypothesis that species with large HWI also present large range size, and account for phylogeny when we do. First, we should compare the uncorrected linear model of range size as a function of HWI vs the PGLS that uses the covariance structure of the residuals under a Brownian Motion model.

Code
furnaData %>% 
  drop_na(Range.Size) %>% 
  ggplot(aes(x = `Hand-Wing.Index`, y = log(Range.Size))) + 
  geom_point(alpha = 0.5, color = "darkgray") + 
  labs(x = "Hand-Wing Index", y = "log(Range Size)") + 
  geom_smooth(method = "lm") + 
  geom_abline(intercept = coef(pglsModel)[1], slope = coef(pglsModel)[2], 
              color = "red", linewidth = 1.5)

You might want to know if these regressions really differ in their ability to predict range size from HWI. Asked in another way, are the slopes from these two regressions significantly different from each other? You need to know that a 95% confidence interval for the slope parameter is b (the slope) plus/minus 1.96 standard errors (this is derived from a normal distribution). To calculate your 95% confidence intervals:

Code
rshwi.sum <- summary(lm_hwi_rangesize)

#for the uncorrected linear model
rshwi.sum$coef[2, 1] + c(-1.96, 1.96)*rshwi.sum$coef[2, 2]

#for Brownian Motion, the 95% CI
coef(pglsModel)[2] + c(-1.96, 1.96)*sqrt(pglsModel$varBeta[2, 2])

Did phylogenetic correction make a difference in this case?

What do you conclude about the evolution of range size as a function of hand-wing index?

5 Phylogenetic signal

Phylogenetic signal is the tendency of related species to resemble each other more than species drawn at random from the same tree.

5.1 Blomberg’s K

Blomberg’s K compares the variance of PICs to what we would expect under a Brownian motion (BM) model of evolution. K = 1 means that close relatives resemble each other as much as we should expect under BM. K < 1 that there is less phylogenetic signal than expected under BM and that K > 1 means that there is more. In addition, a significant p-value returned from a randomization test tells us that the phylogenetic signal is significant, in other words, close relatives are more similar than random pairs of taxa in the dataset.

Code
K_hwi <- phylosig(tree = furnaTree, # Phylogeny
                  x = hwi, # trait
                  method = "K", # method
                  test = TRUE)
print(K_hwi)
plot(K_hwi)

5.2 Pagel’s Lambda

Pagel’s \(\lambda\) is a tree transformation that stretches the tip branches relative to internal branches, making the tree more and more like a complete polytomy of a star phylogeny. If \(\lambda = 0\) there is no phylogenetic signal, while \(\lambda = 1\) correspond to BM and \(0 < \lambda < 1\) in between.

Code
LB_hwi <- phylosig(tree = furnaTree, 
                  x = hwi, 
                  method = "lambda", 
                  test = TRUE)
print(LB_hwi)
plot(LB_hwi)

Describe the results of phylogenetic signal. Does Hand-wing index present phylogenetic signal?

6 Discrete Character Mapping

So far, we have been dealing with continuous characters, those that take values along some continuum. Things like height, weight, length, temperature, humidity, etc. are continuous variables. There is another type of variable called a discrete variable, that takes, well, discrete values. Color (e.g. red, blue, green); Locomotory type (e.g. scansoial, terrestrial, fossorial) are examples of discrete variables.

Look at the data in the furnaData matrix. Which of these variables are discrete and which are continuous?

In your data matrix, Habitat has been coded for each of these species. Let’s examine some biogeography for these birds by reconstructing the ancestral habitat (i.e. habitat of origin) and dispersal history. First, isolate your variable.

Code
habitat <- furnaData$Habitat_recode
names(habitat) <- rownames(furnaData)

table(habitat)

We can simultaneously fit a model of discrete character evolution and create a set of plausible character histories using a method called stochastic character mapping or simply SIMMAP:

Code
habitat_anc <- make.simmap(furnaTree, habitat, model = "SYM", nsim = 100)

This analysis results in a “Q” matrix showing the relative probabilities of change from state to state. For this character, this would represent dispersal events between habitats. The higher the value, the higher the probability of that type of change.

Which pair of habitats shows the highest probability of interchange?

Which pair shows the least?

Does this make sense geographically?

Now, you can plot a random simulation of change in this character, that is based on the values inferred above.

Code
plotSimmap(habitat_anc[[1]], fsize = 0.5) # this plots the first of 100 simulations

How many transitions are there between black and red?

Are there any reversals?

Are there any branches with more than one change?

Using the command above, but changing the index from 1 to other values, look at a number of reconstructions. How much variation do you see?

We can summarize these simulations and estimate the relative probability of each habitat as an ancestor for each node on our tree:

Code
habitat_summary <- summary(habitat_anc)

habitat_summary # print the results

Plot the resulting reconstruction

Code
plot(habitat_summary, cex = c(0.5, 0.2), fsize = 0.5, offset = 90)
legend("bottomleft", fill = c("black", "red", "green"), 
       legend = c("Forest", "Savanna", "Transition"))

We can also use the plot function “plotSimmap” to change the colors of the habitats

Code
cols <- setNames(c("darkgreen", "khaki", "gray"), 
                 c("Forest", "Savanna", "Transition"))
Code
plotSimmap(habitat_anc[[1]], cols)
add.simmap.legend(colors = cols, prompt = FALSE, x = 0, y = -0.5,
    vertical = FALSE)

Where did Furnariides likely originate?

How many transitions from Forest to Savanna?

Are there any reversals?

Does a Savanna ancestor ever move back to Forest?

References

Pinto-Ledezma, J. N., Jahn, A. E., Cueto, V. R., Diniz-Filho, J. A. F., & Villalobos, F. (2019). Drivers of phylogenetic assemblage structure of the furnariides, a widespread clade of lowland neotropical birds. The American Naturalist, 193(2), E41–E56. https://doi.org/10.1086/700696
Tobias, J. A., Sheard, C., Pigot, A. L., Devenish, A. J. M., Yang, J., Sayol, F., Neate‐Clegg, M. H. C., Alioravainen, N., Weeks, T. L., Barber, R. A., Walkden, P. A., MacGregor, H. E. A., Jones, S. E. I., Vincent, C., Phillips, A. G., Marples, N. M., Montaño‐Centellas, F. A., Leandro‐Silva, V., Claramunt, S., … Schleuning, M. (2022). AVONET: Morphological, ecological and geographical data for all birds. Ecology Letters, 25(3), 581–597. https://doi.org/10.1111/ele.13898