In this class, we will consider an inventory of tree communities in a tropical forest in Panama and go through the first properties an ecologist/forester can look at.
Before starting
If you would like to see this tutorial in your viewer within RStudio (saves space on the screen), please execute the following chunk of code:
install.packages("rstudioapi") # installs a required R-package
dir <- tempfile()
dir.create(dir)
download.file("https://gift.uni-goettingen.de/community/index.html",
              destfile = file.path(dir, "index.html"))
download.file("https://gift.uni-goettingen.de/community/lesson2.html",
              destfile = file.path(dir, "lesson2.html"))
htmlFile <- file.path(dir, "lesson2.html")
rstudioapi::viewer(htmlFile)
In ecology and forestry, a biodiversity assessment usually consists of, first, designing plots or study sites and, second, identifying the species sampled in them.
The selected plots should represent environmentally homogeneous spatial units. These units are also called communities.
The local abundance of each species can be recorded. Other features, such as functional traits, can also be measured on the species present.
There are several ways to record abundances: counting the individuals of every species and estimating their spatial cover are the most common. Sometimes abundances are not measured and only presence/absence data are available.
Digitizing the field data then produces a so-called site-species matrix. This object usually has the sites in rows and the species in columns (it can also be the other way around). The matrix is filled with the local abundances of the species in the plots. If only presences and absences were recorded, the matrix contains only 1s and 0s.
For example, in the following figure, the blue cell indicates that Species 2 has 3 individuals in Plot 1.
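As a minimal sketch of such a matrix, the following R code builds a toy site-species matrix by hand. The plot names, species names and counts are invented for illustration (only the value 3 for Species 2 in Plot 1 mirrors the example above); they are not BCI data.

```r
# Toy site-species matrix: sites in rows, species in columns.
# All values are hypothetical and only serve as an illustration.
toy <- matrix(c(0, 3, 1,
                2, 0, 1,
                5, 0, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Plot1", "Plot2", "Plot3"),
                              c("Species1", "Species2", "Species3")))
toy

# Abundance of Species 2 in Plot 1
toy["Plot1", "Species2"]

# Presence/absence version: TRUE/FALSE converted to 1/0
toy_bin <- (toy > 0) * 1
toy_bin
```

Naming the rows and columns makes it possible to index cells by plot and species name rather than by position, which is less error-prone once the matrix gets large.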
Barro Colorado Island (BCI) is the largest island in the Panama Canal. Since 1923, the island has been a site of intensive scientific research on the ecology of lowland tropical forest.
From the floristic point of view, it may be the most explored tropical area in the world.
On this island, fifty 1-hectare permanent plots were established by the Smithsonian Tropical Research Institute and Princeton University to study the dynamics of tropical forest vegetation.
In these 50 plots, every tree with a diameter at breast height (DBH) of at least 10 cm was recorded.
If you have not downloaded the data yet, please download the file BCI_Data.csv from Stud-IP and copy it into the data directory of your project folder.
BCI <- read.table("data/BCI_Data.csv",
                  header = TRUE,  # the first line holds the column names
                  sep = ",",      # columns are separated by commas
                  row.names = 1)  # the first column holds the row names
Note that the BCI dataset is also provided in the vegan package. To load a dataset stored in a package, use the function data() with the name of the dataset as an argument.
# Installing vegan package on your computer
# install.packages("vegan")
# Loading vegan package in your current session
library(vegan)
# Loading the data
data(BCI)
# Check the objects loaded in the local environment
ls()
Now that the data is loaded, we first examine the structure of the data.
## int [1:225, 1:50] 0 0 0 0 0 0 2 0 0 0 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:225] "Abarema.macradenium" "Acacia.melanoceras" "Acalypha.diversifolia" "Acalypha.macrostachya" ...
## ..$ : chr [1:50] "Plot1" "Plot2" "Plot3" "Plot4" ...
## [1] 225 50
## Plot4 Plot5 Plot6
## Adelia.triloba 3 1 0
## Aegiphila.panamensis 0 1 0
## Alchornea.costaricensis 18 3 2
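The kind of output shown above can be produced with base R inspection functions. The sketch below applies str(), dim() and square-bracket indexing to a small hypothetical species-by-plot matrix (invented values, not the BCI data), so that it runs without the course file.

```r
# Hypothetical species-by-plot matrix (species in rows, plots in columns)
toy <- matrix(c(0, 2, 5,
                3, 0, 0,
                1, 1, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Species1", "Species2", "Species3"),
                              c("Plot1", "Plot2", "Plot3")))

str(toy)   # storage type, dimensions and dimnames
dim(toy)   # number of rows, then number of columns

# Subset: rows 2 to 3 (species), columns 1 to 2 (plots)
sub <- toy[2:3, 1:2]
sub
```

The first index always refers to rows and the second to columns, so the same call returns species or plots depending on how the matrix is oriented.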
The data has species in rows and plots in columns. To match the layout of the example figure above (plots in rows, species in columns), we can transpose the matrix with the dedicated function t().
Solution
## int [1:50, 1:225] 0 0 0 0 0 0 0 0 0 1 ...
## - attr(*, "dimnames")=List of 2
## ..$ : chr [1:50] "Plot1" "Plot2" "Plot3" "Plot4" ...
## ..$ : chr [1:225] "Abarema.macradenium" "Acacia.melanoceras" "Acalypha.diversifolia" "Acalypha.macrostachya" ...
## [1] "Plot1" "Plot2" "Plot3" "Plot4" "Plot5"
## [1] "Abarema.macradenium" "Acacia.melanoceras" "Acalypha.diversifolia"
## [4] "Acalypha.macrostachya" "Adelia.triloba"
Solution
## Adelia.triloba Aegiphila.panamensis Alchornea.costaricensis
## Plot4 3 0 18
## Plot5 1 1 3
## Plot6 0 0 2
Solution
## [1] 50 225
## [1] 50
## [1] 225
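To illustrate what transposition does to the dimensions, here is a short sketch on a hypothetical 2-species by 3-plot matrix (invented values). It uses only t(), dim(), nrow() and ncol() from base R.

```r
# Hypothetical matrix with species in rows and plots in columns
toy <- matrix(c(0, 2, 5,
                3, 0, 0),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Species1", "Species2"),
                              c("Plot1", "Plot2", "Plot3")))

# Transpose: plots now in rows, species in columns
toy_t <- t(toy)
toy_t

dim(toy_t)    # rows, then columns
nrow(toy_t)   # number of plots
ncol(toy_t)   # number of species
```

Row and column names are swapped together with the values, so a cell can still be retrieved by name after transposing, for example toy_t["Plot3", "Species1"].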
The site-species matrix is filled with abundance data. The next lines of code create a binary site-species matrix from it, filled only with presences and absences.
BCI_bin <- BCI
BCI_bin[BCI_bin > 0] <- 1 # replace strictly positive values by 1
BCI[4:6, 5:7]
## Adelia.triloba Aegiphila.panamensis Alchornea.costaricensis
## Plot4 3 0 18
## Plot5 1 1 3
## Plot6 0 0 2
BCI_bin[4:6, 5:7]
## Adelia.triloba Aegiphila.panamensis Alchornea.costaricensis
## Plot4 1 0 1
## Plot5 1 1 1
## Plot6 0 0 1
Each forest plot of the site-species matrix has a certain number of individuals and species. Let’s first see how many species and individuals were sampled in the first community.
## Abarema.macradenium Acacia.melanoceras Acalypha.diversifolia
## 0 0 0
## Acalypha.macrostachya Adelia.triloba Aegiphila.panamensis
## 0 0 0
## Alchornea.costaricensis Alseis.blackiana Annona.spraguei
## 2 25 1
## Apeiba.aspera Apeiba.tibourbou Astronium.graveolens
## 13 2 6
How many individual trees were sampled in the first plot?
What number of species does the first plot have?
Solution
## [1] 93
## [1] 93
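One way to answer both questions for a single plot is sketched below on a hypothetical site-species matrix (invented values, plots in rows). The number of individuals is the sum of the row; the number of species is the count of strictly positive cells in that row.

```r
# Hypothetical site-species matrix: plots in rows, species in columns
toy <- matrix(c(0, 3, 1,
                2, 0, 1,
                5, 0, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Plot1", "Plot2", "Plot3"),
                              c("Species1", "Species2", "Species3")))

first_plot <- toy[1, ]          # abundances in the first plot

n_ind <- sum(first_plot)        # number of individuals: 0 + 3 + 1
n_sp  <- sum(first_plot > 0)    # number of species with at least 1 individual

n_ind
n_sp

# Equivalent way to count species: length of the non-zero subset
length(first_plot[first_plot > 0])
```

Summing a logical vector counts its TRUE values, which is why sum(first_plot > 0) gives the richness directly.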
We can repeat these operations over the whole site-species matrix using the function rowSums().
## Plot1 Plot2 Plot3 Plot4 Plot5 Plot6
## 448 435 463 508 505 412
## Plot35 Plot4 Plot5 Plot40 Plot10 Plot30
## 601 508 505 489 483 475
## Plot23 Plot18 Plot29 Plot12 Plot17 Plot28
## 340 347 364 366 381 387
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 340.0 409.0 428.0 429.1 443.5 601.0
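The row totals, their ranking and their summary statistics can be obtained as in the following sketch. It uses a hypothetical matrix (invented values) in place of BCI, and the functions rowSums(), sort(), head() and summary() from base R.

```r
# Hypothetical site-species matrix: plots in rows, species in columns
toy <- matrix(c(0, 3, 1,
                2, 0, 1,
                5, 0, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Plot1", "Plot2", "Plot3"),
                              c("Species1", "Species2", "Species3")))

ind <- rowSums(toy)                  # individuals per plot
head(ind)                            # first values, in matrix order

head(sort(ind, decreasing = TRUE))   # plots with the most individuals
head(sort(ind))                      # plots with the fewest individuals

summary(ind)                         # min, quartiles, mean and max
```

Because rowSums() returns a named vector, the plot names are kept through sort(), which makes it easy to see which plot holds the extreme values.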
We can plot a histogram of the number of individuals per plot.
# Histogram
hist(rowSums(BCI),
     col = "grey",                    # color of the bars
     main = "Community sampling",     # plot title
     xlab = "Number of individuals")  # x-axis title
The previous histogram showed how many individuals were sampled in each plot, but it does not tell us how many different species were found in each plot.
Now plot the histogram of species richness across plots.
Solution
Same code, but applied to the presence/absence site-species matrix BCI_bin.
hist(rowSums(BCI_bin), col = "grey",
     main = "Species richness",   # plot title
     xlab = "Number of species")  # x-axis title
Intuitively, the species richness of a given plot should be positively related to the number of individuals sampled. Indeed, adding an individual to a sample can only leave the number of species unchanged or increase it by one. Thus, sampling effort can bias the estimation of species richness.
Methods exist to disentangle sampling bias from ecological processes in the assessment of species richness. We will see these methods in later classes. Here, we simply check whether there is such a positive relationship in the BCI data.
To do so, we first construct a table containing the number of individuals and tree species per community.
# Number of individuals per community
ind_com <- rowSums(BCI)
ind_com <- data.frame(com = names(ind_com),
                      nb_ind = as.numeric(ind_com))
head(ind_com)
## com nb_ind
## 1 Plot1 448
## 2 Plot2 435
## 3 Plot3 463
## 4 Plot4 508
## 5 Plot5 505
## 6 Plot6 412
Now build the equivalent data.frame for species richness.
Solution
# Number of species per community
sr_com <- rowSums(BCI_bin)
sr_com <- data.frame(com = names(sr_com),
                     rich = as.numeric(sr_com))
head(sr_com)
## com rich
## 1 Plot1 93
## 2 Plot2 84
## 3 Plot3 90
## 4 Plot4 94
## 5 Plot5 101
## 6 Plot6 85
We can now merge the two tables using merge(). Then, we plot the relationship between these two features.
# Merging the two tables
sr_ind <- merge(sr_com, ind_com, by = "com") # by indicates the common column
head(sr_ind); dim(sr_ind)
## com rich nb_ind
## 1 Plot1 93 448
## 2 Plot10 94 483
## 3 Plot11 87 401
## 4 Plot12 84 366
## 5 Plot13 93 409
## 6 Plot14 98 438
## [1] 50 3
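Note that the merged table above is ordered Plot1, Plot10, Plot11, and so on. By default merge() sorts the result on the key column, and plot names are compared as character strings. The sketch below reproduces this behaviour on two small hypothetical tables (invented values) and shows that rows are matched by key, not by position.

```r
# Two hypothetical tables sharing the key column "com",
# deliberately stored in different row orders
a <- data.frame(com = c("Plot1", "Plot2", "Plot10"),
                rich = c(5, 3, 8))
b <- data.frame(com = c("Plot10", "Plot1", "Plot2"),
                nb_ind = c(40, 20, 12))

# Rows are matched on "com"; the result is sorted on that column
m <- merge(a, b, by = "com")
m

# "Plot10" comes before "Plot2" because names are compared as text
as.character(m$com)
```

Since the matching is done on the key, the order of the rows in the two input tables does not affect which values end up side by side.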
# Scatterplot
plot(sr_ind$nb_ind, sr_ind$rich,
     pch = 16,
     xlab = "Number of individuals",
     ylab = "Species richness")
In the BCI data, the relationship between species richness and the number of individuals sampled is not obvious from this plot.
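One possible way to put a number on this visual impression is a rank correlation. The sketch below assumes a table with the same columns as sr_ind, here called sr_ind_toy and filled with hypothetical values. Spearman's coefficient is chosen as one reasonable option; it is not prescribed by this tutorial.

```r
# Hypothetical table with the same columns as sr_ind (invented values)
sr_ind_toy <- data.frame(com = paste0("Plot", 1:6),
                         rich = c(12, 9, 15, 11, 14, 10),
                         nb_ind = c(40, 35, 52, 47, 38, 44))

# Spearman rank correlation between individuals and richness:
# close to 1 for a monotonic increase, close to 0 for no trend
rho <- cor(sr_ind_toy$nb_ind, sr_ind_toy$rich, method = "spearman")
rho
```

Replacing sr_ind_toy with sr_ind would apply the same computation to the BCI plots.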
After characterizing the richness of each forest plot (the rows of the site-species matrix), we can study the frequency of each species. Here we focus on the columns of the site-species matrix.
Let’s start with one species.
In how many plots does it occur?
Solution
## Plot1 Plot2 Plot3 Plot4 Plot5 Plot6 Plot7 Plot8 Plot9 Plot10 Plot11
## 0 0 0 3 1 0 0 0 5 0 0
## Plot12 Plot13 Plot14 Plot15 Plot16 Plot17 Plot18 Plot19 Plot20 Plot21 Plot22
## 1 1 0 2 2 0 1 0 0 0 1
## Plot23 Plot24 Plot25 Plot26 Plot27 Plot28 Plot29 Plot30 Plot31 Plot32 Plot33
## 0 2 0 0 1 0 1 14 5 7 3
## Plot34 Plot35 Plot36 Plot37 Plot38 Plot39 Plot40 Plot41 Plot42 Plot43 Plot44
## 3 6 1 2 6 9 7 0 0 0 4
## Plot45 Plot46 Plot47 Plot48 Plot49 Plot50
## 0 0 2 1 0 1
## Plot4 Plot5 Plot9 Plot12 Plot13 Plot15 Plot16 Plot18 Plot22 Plot24 Plot27
## 3 1 5 1 1 2 2 1 1 2 1
## Plot29 Plot30 Plot31 Plot32 Plot33 Plot34 Plot35 Plot36 Plot37 Plot38 Plot39
## 1 14 5 7 3 3 6 1 2 6 9
## Plot40 Plot44 Plot47 Plot48 Plot50
## 7 4 2 1 1
## [1] 27
## [1] 27
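The steps behind this kind of output can be sketched as follows on a hypothetical site-species matrix (invented values): extract the column of one species, keep the plots where it is present, and count them.

```r
# Hypothetical site-species matrix: plots in rows, species in columns
toy <- matrix(c(0, 3, 1,
                2, 0, 1,
                5, 0, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Plot1", "Plot2", "Plot3"),
                              c("Species1", "Species2", "Species3")))

sp <- toy[, "Species1"]    # abundances of one species across all plots
sp

sp[sp > 0]                 # only the plots where the species occurs

n_plots <- sum(sp > 0)     # number of plots occupied
n_plots
length(which(sp > 0))      # same count, obtained another way
```

Both counting approaches give the same result; sum(sp > 0) is simply the shorter of the two.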
Now, for all species at once, we can sum the abundances or presences by column with the colSums() function to get information about species' occurrences.
## Abarema.macradenium Acacia.melanoceras Acalypha.diversifolia
## 1 3 2
## Acalypha.macrostachya Adelia.triloba Aegiphila.panamensis
## 1 92 23
## Faramea.occidentalis Trichilia.tuberculata Alseis.blackiana
## 1717 1681 983
## Oenocarpus.mapora Poulsenia.armata Quararibea.asterolepis
## 788 755 724
## Plot1 Plot2 Plot3 Plot4 Plot5 Plot6 Plot7 Plot8 Plot9 Plot10 Plot11
## 93 84 90 94 101 85 82 88 90 94 87
## Plot12 Plot13 Plot14 Plot15 Plot16 Plot17 Plot18 Plot19 Plot20 Plot21 Plot22
## 84 93 98 93 93 93 89 109 100 99 91
## Plot23 Plot24 Plot25 Plot26 Plot27 Plot28 Plot29 Plot30 Plot31 Plot32 Plot33
## 99 95 105 91 99 85 86 97 77 88 86
## Plot34 Plot35 Plot36 Plot37 Plot38 Plot39 Plot40 Plot41 Plot42 Plot43 Plot44
## 92 83 92 88 82 84 80 102 87 86 81
## Plot45 Plot46 Plot47 Plot48 Plot49 Plot50
## 81 86 102 91 91 93
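As a minimal sketch of the column-wise counterpart of rowSums(), the code below applies colSums() to a hypothetical abundance matrix and to its presence/absence version (invented values, not the BCI data).

```r
# Hypothetical site-species matrix: plots in rows, species in columns
toy <- matrix(c(0, 3, 1,
                2, 0, 1,
                5, 0, 4),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("Plot1", "Plot2", "Plot3"),
                              c("Species1", "Species2", "Species3")))
toy_bin <- (toy > 0) * 1           # presence/absence version

ind_sp  <- colSums(toy)            # total individuals per species
plot_sp <- colSums(toy_bin)        # number of plots occupied per species

head(ind_sp)
head(sort(ind_sp, decreasing = TRUE))   # most abundant species first
plot_sp
```

Applied to the abundance matrix, colSums() gives each species' total number of individuals; applied to the binary matrix, it gives the number of plots in which each species occurs.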
As for plot richness, we can characterize each species by its number of individuals and by the number of plots in which it occurs.
First, each species has a certain frequency in the site-species matrix. The frequency of the species can here be defined as its number of individuals in the site-species matrix.
# Number of individuals per species
ind_sp <- colSums(BCI)
ind_sp <- data.frame(sp = names(ind_sp),
                     nb_ind = as.numeric(ind_sp))
head(ind_sp)
## sp nb_ind
## 1 Abarema.macradenium 1
## 2 Acacia.melanoceras 3
## 3 Acalypha.diversifolia 2
## 4 Acalypha.macrostachya 1
## 5 Adelia.triloba 92
## 6 Aegiphila.panamensis 23
hist(ind_sp$nb_ind, col = "grey",
     main = "Species frequency",
     xlab = "Number of individuals") # x-axis title
Second, each species occurs in a certain number of plots. The number of plots in which a species occurs defines its occupancy.
# Number of plots per species
com_sp <- colSums(BCI_bin)
com_sp <- data.frame(sp = names(com_sp),
                     nb_plot = as.numeric(com_sp))
head(com_sp)
## sp nb_plot
## 1 Abarema.macradenium 1
## 2 Acacia.melanoceras 2
## 3 Acalypha.diversifolia 2
## 4 Acalypha.macrostachya 1
## 5 Adelia.triloba 27
## 6 Aegiphila.panamensis 18
hist(com_sp$nb_plot, col = "grey",
     main = "Species occupancy",
     xlab = "Number of plots per species") # x-axis title
Solution
# Merge the two tables
nbplot_ind <- merge(com_sp, ind_sp, by = "sp")
head(nbplot_ind); dim(nbplot_ind)
## sp nb_plot nb_ind
## 1 Abarema.macradenium 1 1
## 2 Acacia.melanoceras 2 3
## 3 Acalypha.diversifolia 2 2
## 4 Acalypha.macrostachya 1 1
## 5 Adelia.triloba 27 92
## 6 Aegiphila.panamensis 18 23
## [1] 225 3
Solution
# Scatterplot
plot(nbplot_ind$nb_ind, nbplot_ind$nb_plot,
     pch = 16,
     main = "Occupancy vs frequency",
     xlab = "Number of individuals",
     ylab = "Number of plots")
As expected, there is a strong positive relationship between the number of individuals of a species and the number of plots in which it occurs.