Using hyperspectral remote sensing data to predict biodiversity

Showing some neat features of R!
Published: April 24, 2023

Note

In this lab we will explore hyperspectral remote sensing data from the National Ecological Observatory Network (NEON), with the goal of predicting biodiversity from the sky. Specifically, we will use hyperspectral data collected by the NEON Airborne Observation Platform (AOP) (more information here and here). As you will see, working with hyperspectral data is similar to working with any other kind of data (e.g., species abundance, presence-absence), and consequently it can be used to calculate any metric of biodiversity. In this sense, spectral diversity can be considered a dimension of biodiversity.

Note. Part of the text used in this tutorial was extracted from here with several modifications.

1 Set up your data and your working directory

Set up a working directory and store the data files in that directory. Tell R that this is the directory you will be using, and read in your data:

Code
setwd("..your working directory")

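To do this laboratory you will need to install the following packages. Note that {maptools} and {rgdal} were retired from CRAN in October 2023, so on recent R installations they may have to be installed from the CRAN archive: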

Code
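# Vector with the names of the required packages
# {plyr} is listed before {tidyverse} so that, when the packages are loaded in this
# order, {dplyr} verbs such as mutate() and arrange() are not masked by {plyr}
packages <- c("maptools", "rgdal", "raster", "neonUtilities", "rasterdiv", 
              "BiocManager", "plyr", "reshape2", "tidyverse") 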
Function install.packages()

You can use the function install.packages() to install the packages.

If you don’t want to install the packages one by one, you can use the following code.

Code
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())

if (any(installed_packages == FALSE)) {
  install.packages(packages[!installed_packages], dependencies = TRUE)
}

The package {rhdf5} is not available on CRAN; it is distributed through Bioconductor.

When installing {rhdf5}, R will ask whether you want to “Update all/some/none? [a/s/n]:”. Please type n in your console.

Code
# compare against the installed package names and install {rhdf5} only if it is missing
if (!("rhdf5" %in% rownames(installed.packages()))) {BiocManager::install("rhdf5")}

#BiocManager::install("rhdf5")

Call or load all packages

Code
sapply(packages, require, character.only = TRUE)

library(rhdf5)

Double-check your working directory.

Function getwd()

You can use the function getwd() to get the current working directory.

2 Data preparation

Code
dir.create("Data")
dir.create("Data/NEON")

The first step is to prepare the information required to download the hyperspectral data from the NEON-AOP. To do that, we will first download the Terrestrial Observation System Sampling Locations dataset. You can download it from NEON Spatial Data & Maps, under the Terrestrial Observation System Sampling Locations tab, or by clicking HERE. Once you have downloaded the data, please store it in the folder NEON, which is located inside the folder Data.

Code
NEON_plots <- readOGR(dsn = "Data/NEON/All_NEON_TOS_Plots_v9", 
                      layer = "All_NEON_TOS_Plot_Polygons_V9")

From the spatial distribution of the NEON sites/plots we loaded into R, we will select the NEON site Harvard Forest & Quabbin Watershed (HARV).

Code
HARV_plots <- NEON_plots@data %>% 
  filter(siteID == "HARV" & plotType == "distributed" & subtype == "basePlot") %>% 
  arrange(plotID)

view(HARV_plots)

Now, from the NEON site HARV, we will select plot number one (plotID = HARV_001) and extract its coordinates in the UTM (Universal Transverse Mercator) system. These coordinates serve as the spatial reference for downloading the hyperspectral data from the NEON-AOP.

Code
east <- HARV_plots$easting
names(east) <- HARV_plots$plotID

north <- HARV_plots$northing
names(north) <- HARV_plots$plotID

east
north

coords_HARV_001 <- c(east[1], north[1])
coords_HARV_001 

2.1 Download hyperspectral imagery from NEON-AOP

Now that we have the UTM coordinates of plot HARV_001, we can download the hyperspectral remote sensing data in HDF5 format for that site.

Code
byTileAOP(dpID = "DP3.30006.001", # NEON-AOP product
          site = "HARV", # Site code
          year = "2019", # Year
          check.size = TRUE, 
          easting = coords_HARV_001[1], northing = coords_HARV_001[2], # Coordinates UTM
          savepath = "Data/NEON", # Path
          token = NA) 

When you download the hyperspectral imagery, R will ask whether you want to proceed with the download. Please type y in your console.
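The tile that the download returns can be anticipated from the coordinates. NEON-AOP mosaics are distributed as 1 km by 1 km tiles whose file names carry the easting and northing of the tile’s lower-left corner, so rounding the plot coordinates down to the nearest 1000 m gives the coordinate part of the file name. The sketch below uses made-up coordinates, and the helper tile_corner() is hypothetical (it is not part of {neonUtilities}):

```r
# Hypothetical UTM coordinates (metres) of a plot centre; replace with your own.
easting  <- 725431.2
northing <- 4700512.8

# Round down to the nearest kilometre to get the lower-left corner of the tile.
tile_corner <- function(x, tile_size = 1000) {
  floor(x / tile_size) * tile_size
}

tile_e <- tile_corner(easting)
tile_n <- tile_corner(northing)

# Build the coordinate part of the tile file name.
tile_id <- sprintf("%.0f_%.0f", tile_e, tile_n)
tile_id
```

A plot that sits close to a tile edge may overlap more than one tile, so it is worth checking which files were actually downloaded.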

3 Hyperspectral remote sensing data exploration

We are now ready to start using the hyperspectral data. Note that the downloaded data is in HDF5 format. This format natively compresses the data stored within it (i.e., makes it smaller) and supports data slicing; in other words, you can extract only the portions of the data that you need rather than reading the entire dataset into your computer’s memory.

The downloaded hyperspectral data is stored in a folder named DP3.30006.001, which in turn is located inside Data/NEON. Please pay attention to the full folder structure: it is composed of several nested folders that end with the file NEON_D01_HARV_DP3_725000_4700000_reflectance.h5 in HDF5 format. To load and work with this data you need to specify the full path to that file.

Code
f <- "Data/NEON/DP3.30006.001/neon-aop-products/2019/FullSite/D01/2019_HARV_6/L3/Spectrometer/Reflectance/NEON_D01_HARV_DP3_725000_4700000_reflectance.h5"

Note that we did not load the hyperspectral data into R, only the path to the file. Using that path we can read the metadata of the downloaded data.

3.0.1 HDF5 file structure

When exploring the data structure of an HDF5 file, we need to pay attention to the first two columns, which tell us the location of the data (group) and the name of the data (name) stored in the file. For example, the Map_info dataset is located in /HARV/Reflectance/Metadata/Coordinate_System and the Reflectance_Data dataset in /HARV/Reflectance.

The Wavelength dataset contains the center wavelength of each band, and the Reflectance_Data dataset contains the image data. These two datasets are going to be used for both data processing and visualization.

Code
View(h5ls(f, all = TRUE))

3.0.2 On bands and wavelengths

A band represents a group of wavelengths. For example, the wavelength values between 695 nm and 700 nm might be one band as captured by an imaging spectrometer. The imaging spectrometer collects reflected light energy in a pixel for light in that band. When we are working with a multispectral (e.g., Landsat, MODIS) or hyperspectral (e.g., NEON-AOP - NASA/JPL AVIRIS-NG) dataset, the band information is reported as the center wavelength value. This value represents the center point value of the wavelengths represented in that band. Thus in a band spanning 695-700 nm, the center would be 697.5 nm.

Code
# Get information about the wavelengths of HARV plot 001
wlInfo <- h5readAttributes(f, "/HARV/Reflectance/Metadata/Spectral_Data/Wavelength")

wlInfo
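The relationship between band edges and center wavelengths can be sketched with a few made-up bands. The band limits below are illustrative values, not read from the HDF5 file; the last lines show one way to look up the band closest to a wavelength of interest:

```r
# Hypothetical band edges in nm (not read from the HDF5 file).
band_lower <- c(690, 695, 700)
band_upper <- c(695, 700, 705)

# The center wavelength is the midpoint of each band.
band_center <- (band_lower + band_upper) / 2
band_center

# Find the band whose center is closest to a target wavelength.
target <- 698
nearest_band <- which.min(abs(band_center - target))
nearest_band
band_center[nearest_band]
```

The same which.min(abs(...)) idiom can be applied to the vector of center wavelengths read from the file to locate, for example, a red or near-infrared band.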

3.0.3 Read wavelengths from the HDF5 file

To read the wavelengths of our hyperspectral image we use the function h5read().

Code
WL <- h5read(f, "/HARV/Reflectance/Metadata/Spectral_Data/Wavelength")

head(WL)
tail(WL)

3.0.4 Extract reflectance metadata

To extract the full metadata of the hyperspectral image we use the function h5readAttributes(). Please read the content of the metadata carefully.

Code
reflInfo <- h5readAttributes(f, "/HARV/Reflectance/Reflectance_Data")

reflInfo

For example, the center wavelength associated with band 34 is 549.0242 nm.

Code
WL[34]

Importantly, the h5read() function reads the data in the order Bands, Cols, Rows. Let’s get this information from the reflectance metadata.

3.0.5 Read dimensions of the hyperspectral data

Code
nRows <- reflInfo$Dimensions[1]
nCols <- reflInfo$Dimensions[2]
nBands <- reflInfo$Dimensions[3]

nRows
nCols
nBands

Now, using the information obtained from the reflectance metadata, let’s extract band 34. You can try another band if you prefer.

3.0.6 Extract or “slice” data for band 34 from the HDF5 file

Code
b34 <- h5read(f, "/HARV/Reflectance/Reflectance_Data", 
              index = list(34, 1:nCols, 1:nRows)) # get band 34

# what type of object is b34?
class(b34)

## [1] "array"

The returned object is an array. Arrays generalize matrices to more than two dimensions, i.e., they are matrices stacked in a single object.

Code
# convert from array to matrix by dropping the first (band) dimension
b34 <- b34[1,, ]

# check it
class(b34)
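To see how slicing and dropping a dimension behave, here is a small self-contained sketch with a toy array. The dimensions (3 bands, 4 columns, 2 rows) are arbitrary and only mimic the Bands, Cols, Rows order described above:

```r
# A toy "image" with 3 bands, 4 columns and 2 rows (Bands, Cols, Rows).
toy <- array(1:24, dim = c(3, 4, 2))

# Slice one band but keep all three dimensions, as a sliced read does.
band2 <- toy[2, , , drop = FALSE]
dim(band2)    # 1 4 2

# Dropping the first dimension leaves a Cols x Rows matrix.
band2_mat <- band2[1, , ]
dim(band2_mat)  # 4 2

# Transposing gives a Rows x Cols matrix.
dim(t(band2_mat))  # 2 4
```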

Now we can plot the image of band 34.

Code
image(b34)

The previous image is hard to interpret visually. Let’s log-transform the data and see what happens.

Code
image(log(b34))

3.0.7 Data cleaning

Image data in raster format will often contain a data ignore value and a scale factor. The data ignore value marks pixels where there are no data. Typically, no-data values arise because the sensor did not collect data in that area of the image or because processing yielded null values.

From the reflectance metadata we can see that the data ignore value is -9999. Thus, let’s set all pixels with a value equal to -9999 to NA (no value).

Code
# there is a no-data value in our raster - let's define it
myNoDataValue <- as.numeric(reflInfo$Data_Ignore_Value)

myNoDataValue

# set all values equal to -9999 to NA
b34[b34 == myNoDataValue] <- NA

# plot the image now
image(b34)
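The two cleaning steps mentioned above (masking the data ignore value and applying the scale factor) can be tried on a toy matrix first. The no-data value and the scale factor below are assumed for illustration; in the lab they come from the reflectance metadata:

```r
# Assumed values for illustration only.
no_data <- -9999
scale_factor <- 10000

# Toy reflectance matrix; one pixel carries the no-data value.
refl <- matrix(c(250, -9999, 4100, 3900), nrow = 2)

# Replace the no-data value with NA so it is ignored in later calculations.
refl[refl == no_data] <- NA

# Convert the stored values to reflectance.
refl_scaled <- refl / scale_factor
refl_scaled

# Summary statistics now skip the missing pixel.
mean(refl_scaled, na.rm = TRUE)
```

Masking before scaling matters: if the order were reversed, the ignore value would be rescaled too and would no longer match -9999.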

4 Creating a georeferenced raster

In order to get a raster file suitable for further analysis, we first need to define the coordinate reference system (CRS) of the raster. Again, we obtain the necessary information from the HDF5 file.

Code
# Extract the EPSG from the h5 dataset
myEPSG <- h5read(f, "/HARV/Reflectance/Metadata/Coordinate_System/EPSG Code")

myEPSG

# convert the EPSG code to a CRS string
myCRS <- crs(paste0("+init=epsg:", myEPSG))

myCRS

Define final raster with projection info.

Code
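# transpose the matrix: h5read() returned the data as Cols x Rows,
# whereas raster() treats matrix rows as raster rows
b34 <- t(b34)

b34ras <- raster(b34, crs = myCRS)

# view the raster attributes
b34ras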

Let’s take a look at the georeferenced raster. Take note of the coordinates on the x and y axis.

Code
image(log(b34ras), 
      xlab = "UTM Easting", 
      ylab = "UTM Northing",
      main = "Properly Oriented Raster")

Next we define the extent of our raster. The extent will be used to calculate the raster’s resolution. We get this information from the reflectance metadata obtained in a previous step.

Code
# Grab the UTM coordinates of the spatial extent
xMin <- reflInfo$Spatial_Extent_meters[1]
xMax <- reflInfo$Spatial_Extent_meters[2]
yMin <- reflInfo$Spatial_Extent_meters[3]
yMax <- reflInfo$Spatial_Extent_meters[4]

Define the spatial extent of the raster image

Code
# define the extent (left, right, bottom, top)
rasExt <- extent(xMin, xMax, yMin, yMax)

# view the extent to make sure that it looks right
rasExt 

Assign the spatial extent to the raster

Code
extent(b34ras) <- rasExt

# look at raster attributes
b34ras
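The link between extent and resolution is simple arithmetic: the pixel size along each axis is the length of the extent divided by the number of pixels. A minimal sketch with assumed numbers (in the lab these values come from reflInfo$Spatial_Extent_meters and reflInfo$Dimensions):

```r
# Assumed extent (metres) and dimensions of a square tile.
x_min <- 725000;  x_max <- 726000
y_min <- 4700000; y_max <- 4701000
n_cols <- 1000
n_rows <- 1000

# Pixel size = extent length divided by the number of pixels along that axis.
res_x <- (x_max - x_min) / n_cols
res_y <- (y_max - y_min) / n_rows
c(res_x, res_y)
```

If the resolution printed in the raster attributes does not match the value expected from this calculation, the extent or the matrix orientation is probably wrong.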

Visualize the raster again, but this time let’s change the colors and set the zlim range.

Code
col <- terrain.colors(25)

image(b34ras,  
      xlab = "UTM Easting", 
      ylab = "UTM Northing",
      main = "Raster with Custom Colors",
      col = col, 
      zlim = c(0, 1000))

We can now save the created raster.

Code
# write out the raster as a geotiff
writeRaster(b34ras,
            filename = "Data/NEON/DP3.30006.001/HARV_plot_001_band_34.tif",
            format = "GTiff",
            overwrite = TRUE)

4.1 Creating a RGB raster

In the previous step we created a raster file from a single band, but each hyperspectral dataset from the NEON-AOP contains 426 bands. In this step we will construct a raster stack, i.e., a raster with N bands or, in other words, a raster of rasters. You could do it manually as we did for band 34, but let’s take advantage of R and write a function that does the work for us. Please take your time to read and understand the function.

Code
# file: path to the HDF5 file
# band: the band you want to process 
# noDataValue: value to be treated as missing (set to NA)
# extent: raster extent
# CRS: coordinate reference system
# returns: a raster containing the reflectance data for the specific band

band2Raster <- function(file, band, noDataValue, extent, CRS){
    # first, read in the selected band from the HDF5 file
    out <- h5read(file, "/HARV/Reflectance/Reflectance_Data", 
                  index = list(band, NULL, NULL))
    
    # convert from array to matrix
    out <- (out[1,, ])
    # transpose data to fix flipped row and column order 
    # depending upon how your data are formatted you might not have to perform this
    # step.
    out <- t(out)
    # assign data ignore values to NA (use the function argument, not a global object)
    # note, you might choose to assign values of 15000 to NA
    out[out == noDataValue] <- NA

    # turn the out object into a raster
    outr <- raster(out, crs = CRS)

    # assign the extents to the raster
    extent(outr) <- extent

    # return the raster object
    return(outr)
}

Now apply the function to create an RGB raster. To do this, we will use bands 58, 34 and 19 as the red, green and blue channels, respectively.

Code
# create a list of the bands we want in our stack
rgb <- list(58, 34, 19)

# lapply tells R to apply the function to each element in the list
rgb_harv <- lapply(rgb, FUN = band2Raster, file = f,
                   noDataValue = myNoDataValue, 
                   extent = rasExt,
                   CRS = myCRS)

# check out the properties of rgb_harv
# note that it displays properties of 3 rasters.
rgb_harv

Success!!! 🎉🎉🎉 We created a list of three rasters. Now, to get the raster stack, just apply the function stack() from the package {raster}.

Code
# Create a raster stack from our list of rasters
rgb_harv_stack <- stack(rgb_harv)

rgb_harv_stack

As a final step, let’s assign the names of the bands to the raster stack object.

Code
# Create a list of band names
bandNames <- paste("Band_", unlist(rgb), sep = "")

# set the rasterStack's names equal to the list of bandNames created above
names(rgb_harv_stack) <- bandNames

# check properties of the raster list - note the band names
rgb_harv_stack

# scale the data as specified in reflInfo$Scale_Factor
rgb_harv_stack <- rgb_harv_stack/as.integer(reflInfo$Scale_Factor)

# plot one raster in the stack to make sure things look OK.
plot(rgb_harv_stack$Band_58, main = "Band 58")

And plot the resulting RGB raster.

Code
# create a 3 band RGB image
plotRGB(rgb_harv_stack,
        r = 1, g = 2, b = 3,
        stretch = "lin")

Cool, right?

As we did with the band 34, let’s save the RGB raster in a GeoTIFF format.

Code
# write out final raster    
writeRaster(rgb_harv_stack, 
            filename = "Data/NEON/DP3.30006.001/HARV_plot_001_RGB.tif", 
            format = "GTiff", 
            overwrite = TRUE)

4.2 Vegetation indices calculation

Now, we will calculate vegetation indices, specifically, we will calculate the Normalized Difference Vegetation Index (NDVI).

The \(NDVI\) is computed as the difference between near-infrared (\(NIR\)) and red (\(RED\)) reflectance divided by their sum:

\[NDVI = \frac{NIR - RED}{NIR + RED}\]

To calculate the NDVI from the NEON-AOP hyperspectral data, first select the bands 58 (red) and 90 (NIR) and then create a raster stack as we did for the RGB raster.
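Before working with rasters, it may help to evaluate the index by hand for a few pixels. The reflectance values below are illustrative (they are not taken from the HARV tile), and ndvi() is a throwaway helper written for this example:

```r
# NDVI from near-infrared and red reflectance.
ndvi <- function(nir, red) {
  (nir - red) / (nir + red)
}

# Illustrative reflectance values.
ndvi(nir = 0.45, red = 0.05)  # strong NIR, weak red: 0.8
ndvi(nir = 0.25, red = 0.20)  # similar NIR and red: about 0.11
ndvi(nir = 0.30, red = 0.30)  # equal reflectance: 0
```

Because the index is a normalized difference, it is bounded between -1 and 1 for non-negative reflectance values.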

4.2.1 Calculate NDVI

Code
# select bands to use in calculation (red, NIR)
ndvi_bands <- c(58, 90) #bands c(58, 90) in full NEON hyperspectral dataset

# create raster list and then a stack using those two bands
ndvi_harv <- lapply(ndvi_bands, 
                    FUN = band2Raster, 
                    file = f, 
                    noDataValue = myNoDataValue, 
                    extent = rasExt, 
                    CRS = myCRS)

ndvi_harv <- stack(ndvi_harv)

# make the names pretty
bandNDVINames <- paste("Band_", unlist(ndvi_bands), sep = "")
names(ndvi_harv) <- bandNDVINames

# view the properties of the new raster stack
ndvi_harv

Write a function for NDVI calculation.

Code
#calculate NDVI
NDVI_func <- function(ras) {
      (ras[, 2] - ras[, 1])/(ras[, 2] + ras[, 1])
}

Apply the function and plot the result.

Code
ndvi_calc <- calc(ndvi_harv, fun = NDVI_func)

plot(ndvi_calc, main = "NDVI for the NEON HARV Field Site")

Now, play with breaks and colors to create a more meaningful map. Let’s add a color map with 4 colors.

Code
myCol <- rev(terrain.colors(4)) # use the 'rev()' function to put green as the highest NDVI value
# add breaks to the colormap, including lowest and highest values (5 breaks = 4 segments)
brk <- c(0, .25, .5, .75, 1)

# plot the image using breaks
plot(ndvi_calc, main = "NDVI for the NEON HARV Field Site", col = myCol, breaks = brk)
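The relationship between breaks and classes can be checked on a plain numeric vector: five break points delimit four classes, one per color. The NDVI values below are made up for illustration:

```r
# Illustrative NDVI values (not taken from the HARV raster).
ndvi_vals <- c(0.10, 0.30, 0.55, 0.62, 0.90)

# Five break points define four classes.
brk <- c(0, 0.25, 0.5, 0.75, 1)

# Assign each value to its class; include.lowest keeps a value of exactly 0.
ndvi_class <- cut(ndvi_vals, breaks = brk, include.lowest = TRUE)
table(ndvi_class)
```

Note that NDVI values below 0 fall outside these breaks and are returned as NA by cut().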

We can save the resulting NDVI raster.

Code
writeRaster(ndvi_calc, filename = "Data/NEON/DP3.30006.001/HARV_plot_001_NDVI.tif", 
            format = "GTiff", overwrite = TRUE)

5 Plot spectral signatures derived from hyperspectral remote sensing data

Now we will extract the reflectance values of all bands for a selected pixel in the hyperspectral data from HARV and use these values to plot its spectral signature. For practice purposes we will select the pixel at position (100, 35). Because h5read() indexes the data in the order Bands, Cols, Rows, this is the pixel at column 100 and row 35. In addition, to get the reflectance for all bands we leave the band index empty (as NULL).

Code
# extract all bands from a single pixel
aPixel <- h5read(f, "/HARV/Reflectance/Reflectance_Data", index = list(NULL, 100, 35))

class(aPixel)

# The line above returns an array with the reflectance values of all bands.
# Next, we reshape the data and turn them into a data frame
b <- adply(aPixel, c(1))

class(b)

# create clean data frame
aPixeldf <- b[2]

# add wavelength data to the data frame
aPixeldf$Wavelength <- WL

head(aPixeldf)

Now select the scale factor from the reflectance object and scale the reflectance values for all bands.

Code
scaleFact <- reflInfo$Scale_Factor

# add scaled data column to the data frame
aPixeldf$scaled <- (aPixeldf$V1/as.vector(scaleFact))

# make nice column names
names(aPixeldf) <- c('Reflectance', 'Wavelength', 'ScaledReflectance')

head(aPixeldf)
tail(aPixeldf)

As a last step let’s plot the scaled reflectance as a function of the wavelength.

Code
aPixeldf %>% 
  ggplot(aes(x = Wavelength, y = ScaledReflectance)) + 
  geom_line() + 
  xlab("Wavelength (nm)") + 
  ylab("Reflectance")

5.1 Select pixels and compare spectral signatures

In the previous step we selected an arbitrary pixel at (100, 35); however, when working with real data we might need the spatial position of objects on the ground (e.g., GPS points).

We will use the exact geographical position of tree species in the HARV site. To do that, we will need to install two more packages that are not on CRAN but on GitHub: {geoNEON} and {neonhs}.

First install and load the package {geoNEON}.

Code
# install the package {geoNEON}
remotes::install_github('NEONScience/NEON-geolocation/geoNEON', dependencies = TRUE)

library(geoNEON)

Now download the woody plant vegetation structure data from NEON.

Code
# Download Woody plant vegetation structure from NEON #####
zipsByProduct(
  dpID = "DP1.10098.001",
  site = "HARV",
  savepath = "Data/NEON",
  check.size = FALSE
)

## Combine the files
stackByTable("Data/NEON/filesToStack10098", folder = TRUE)

5.1.1 Prepare vegetation data

The function def.calc.geo.os() from {geoNEON} refines the geolocation data associated with NEON data products.

Code
# Calculate the more precise location for each NEON plot in the HARV site
vegmap <- "Data/NEON/filesToStack10098/stackedFiles/vst_mappingandtagging.csv" %>%
  read_csv() %>%
  mutate(year = substr(date, 1, 4)) %>% 
  filter(year == '2019') %>% 
  def.calc.geo.os("vst_mappingandtagging") # Calculate more precise geolocations for specific NEON data products

# Load individual tree coordinates
vegind <- read_csv("Data/NEON/filesToStack10098/stackedFiles/vst_apparentindividual.csv")

# Combine the coordinates with tree identification
veg <- right_join(vegind, vegmap,
  by = c("individualID", "namedLocation", "domainID", "siteID", "plotID")) %>%
  filter(!is.na(adjEasting), !is.na(adjNorthing), plantStatus == "Live")

Now select the individual trees available for plot HARV_001 and transform them into a spatial object.

Code
harv_01_trees <- veg %>% 
  select(adjNorthing, adjEasting, scientificName, plotID, 
         adjDecimalLatitude, adjDecimalLongitude) %>% 
  filter(plotID == "HARV_001")

harv_01_trees_spt <- SpatialPointsDataFrame(coords = harv_01_trees[, 2:1], 
                                      data = harv_01_trees, 
                                      proj4string = crs(ndvi_calc))

The plot HARV_001 contains four species and 38 individuals. You can verify that information.

Code
unique(harv_01_trees$scientificName)

length(harv_01_trees$scientificName)

Plot the results

Code
plot(ndvi_harv[[1]])
plot(harv_01_trees_spt, add = TRUE)

We are not done yet: now we will extract the spectral information for each of our trees in plot HARV_001. For this we will use the R package {neonhs}.

5.1.2 Extract spectral data

First install and call the package neonhs

Code
remotes::install_github('earthlab/neonhs')
library(neonhs)

Extract the spectral data associated with each tree.

Code
# Path to access the hyperspectral data
hs_path_2019 <- list.files(
  path = "Data/NEON/DP3.30006.001/neon-aop-products/2019/", 
  pattern = "reflectance.h5",
  recursive = TRUE, full.names = TRUE
)

# extract the spectra data
resHARV_001 <- neonhs::hs_extract_pts(filename = hs_path_2019, # path to the h5 file
                                      pts = harv_01_trees_spt, # spatial points
                                      bands = 1:426) # which bands

resHARV_001

The object with the spectral data is a SpatialPointsDataFrame, so we need to transform it into a data.frame to continue working with the spectra.

Code
resHARV_001_df <- as.data.frame(resHARV_001) %>%
  bind_rows() %>%
  as_tibble() %>% 
  dplyr::select(!c("band418_2472nm", "band419_2477nm", "band420_2482nm", "band421_2487nm", 
            "band422_2492nm", "band423_2497nm", "band424_2502nm", "band425_2507nm", 
            "band426_2512nm", "adjEasting.1", "adjNorthing.1")) %>% 
  dplyr::select(plotID, scientificName, adjNorthing, adjEasting, adjDecimalLongitude, adjDecimalLatitude, everything())

resHARV_001_df

Let’s perform a bit of cleaning…

Code
resHARV_001_df_long <- resHARV_001_df %>% 
  dplyr::select(!c(plotID, adjNorthing, adjEasting, adjDecimalLongitude, adjDecimalLatitude)) %>% 
  reshape2::melt(id.vars = "scientificName", 
       variable.name = "Wavelength", 
       value.name = "Reflectance")

resHARV_001_df_long <- resHARV_001_df_long %>% 
  mutate(Wavelength2 = Wavelength) %>% 
  separate(Wavelength2, into = c("bands", "wl"), sep = "_") %>% 
  mutate(WL = as.numeric(gsub("nm", "", wl))) %>% 
  mutate(scientificName = as.factor(scientificName))
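The string handling in this cleaning step can be illustrated on its own with base R. The column names below follow the band<number>_<wavelength>nm pattern seen in the extracted data, but the specific names are made up for the example:

```r
# Column names in the style produced by the extraction step (illustrative).
band_names <- c("band1_384nm", "band34_549nm", "band58_669nm")

# Band number: the digits between "band" and "_".
band_id <- as.integer(sub("^band([0-9]+)_.*$", "\\1", band_names))

# Wavelength: the digits between "_" and "nm".
wl <- as.numeric(sub("^.*_([0-9]+)nm$", "\\1", band_names))

data.frame(band_names, band_id, wl)
```

Anchoring the patterns with ^ and $ means that a name that does not follow the expected format is returned unchanged and becomes NA (with a warning) on conversion, which makes malformed columns easy to spot.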

Now let’s plot the results…

Code
# Aux function for visualization
theme_nice <- function() {
  theme_bw() + #base_family = "Noto Sans") +
    theme(panel.grid.minor = element_blank(),
          plot.background = element_rect(fill = "white", color = NA),
          #plot.title = element_text(face = "bold"),
          #strip.text = element_text(face = "bold"),
          strip.background = element_rect(fill = "grey80", color = NA),
          legend.title = element_text(face = "bold", size = 15), 
          legend.text = element_text(size = 12))
}

# plot the spectra by species 
resHARV_001_df_long %>% 
  drop_na() %>% 
  #filter(scientificName == "Acer rubrum L.") %>% 
  ggplot() + 
  geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
  scale_color_viridis_d(option = "A", 
                        labels = c("Acer rubrum", "Betula lenta", "Pinus strobus", "Quercus rubra")) + 
  xlab("Wavelength (nm)") + 
  ylab("Reflectance") + 
  theme_nice()

Cool! We just made a plot with the spectral signatures of four species at the HARV site, but some anomalies in the plot make it difficult to interpret. Those anomalies around 1400 nm and 1850 nm correspond to two major atmospheric absorption bands, i.e., regions of the spectrum where gases in the atmosphere (primarily carbon dioxide and water vapor) absorb radiation and therefore obscure the reflected radiation that the imaging spectrometer measures.

To eliminate those anomalies we would otherwise need to select and remove those bands manually. Happily, the reflectance metadata contains the lower and upper bounds of each atmospheric absorption band. Let’s read those bounds and plot rectangles where the reflectance measurements are obscured by atmospheric absorption.

Code
# grab Reflectance metadata (which contains absorption band limits)
reflMetadata <- h5readAttributes(f, "/HARV/Reflectance" )

ab1 <- reflMetadata$Band_Window_1_Nanometers
ab2 <- reflMetadata$Band_Window_2_Nanometers

ab1
ab2

Plot spectral signatures again with rectangles showing the absorption bands

Code
resHARV_001_df_long %>% 
  drop_na() %>% 
  ggplot() + 
  geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
  scale_color_viridis_d(option = "A", 
                        labels = c("Acer rubrum", "Betula lenta", "Pinus strobus", "Quercus rubra")) + 
  geom_rect(mapping = aes(ymin = min(Reflectance), 
                          ymax = max(Reflectance), 
                          xmin = ab1[1], xmax = ab1[2]), 
            color = "darkgray", fill = "gray", alpha = 0.7) +
  geom_rect(mapping = aes(ymin = min(Reflectance), 
                          ymax = max(Reflectance), 
                          xmin = ab2[1], xmax = ab2[2]), 
            color = "darkgray", fill = "gray", alpha = 0.7) + 
  xlab("Wavelength (nm)") + 
  ylab("Reflectance") + 
  theme_nice()

By inspecting the plot we can confirm that the sections of the spectra with anomalies are within the atmospheric absorption bands. Using the absorption band limits we can remove the sections with anomalies and plot masked spectral signatures for the four species.

Code
# Duplicate the spectral signatures into a new data.frame
resHARV_001_df_long_mask <- resHARV_001_df_long

# Mask out all values within each of the two atmospheric absorption bands
resHARV_001_df_long_mask[resHARV_001_df_long_mask$WL > 
                    ab1[1] & resHARV_001_df_long_mask$WL < ab1[2], ]$Reflectance <- NA 

resHARV_001_df_long_mask[resHARV_001_df_long_mask$WL > 
                    ab2[1] & resHARV_001_df_long_mask$WL < ab2[2], ]$Reflectance <- NA

head(resHARV_001_df_long_mask)
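The masking logic is easier to follow on a small example. The sketch below builds a toy spectrum and masks one window; the window limits are assumed values for illustration, whereas in the lab they are read from the reflectance metadata:

```r
# Toy spectrum: wavelengths (nm) and reflectance for one tree.
spec <- data.frame(WL = seq(1300, 1500, by = 50),
                   Reflectance = c(0.30, 0.28, 0.02, 0.05, 0.25))

# Assumed limits (nm) of one absorption window.
win <- c(1340, 1445)

# Flag wavelengths strictly inside the window and set their reflectance to NA.
inside <- spec$WL > win[1] & spec$WL < win[2]
spec$Reflectance[inside] <- NA
spec
```

Setting the values to NA rather than dropping the rows keeps the wavelength axis intact, so line plots show a gap instead of connecting the points on either side of the window.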

Plot the masked spectral signatures

Code
resHARV_001_df_long_mask %>% 
  ggplot() + 
  geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
  scale_color_viridis_d(option = "A", 
                        labels = c("Acer rubrum", "Betula lenta", "Pinus strobus", "Quercus rubra")) + 
  xlab("Wavelength (nm)") + 
  ylab("Reflectance") + 
  theme_nice()

It’s always good practice to close the H5 connection before moving on.

Code
# close the H5 file
H5close()

Clean your R environment.

Code
rm(list = ls())

6 Calculate metrics of biodiversity using remote sensing data

As a final exercise we will estimate some diversity metrics directly from the results obtained in the previous steps. Specifically we will use the NDVI raster we created before.

Code
harv_ndvi <- raster("Data/NEON/DP3.30006.001/HARV_plot_001_NDVI.tif")

plot(harv_ndvi)

Now, using the NDVI raster, we will estimate the Shannon and Hill diversity metrics for each pixel. To do that we will use the package {rasterdiv} (Thouverai et al., 2021). Given the spatial extent and resolution of the NDVI raster file, this process will take some time to finish (~2 minutes on Jesús’s computer).

6.1 Shannon’s diversity index (\(H'\))

Shannon’s diversity index is one of the most common metrics used to estimate diversity from remote sensing data. This index is calculated as:

\[H' = -\sum_{i=1}^{N} p_i \log_b(p_i)\]

where \(p_i\) is the proportional abundance of pixel value \(i\), \(N\) is the number of distinct pixel values, and \(b\) is the base of the logarithm. Natural logarithms are the most popular choice, but some argue for base \(b = 2\). Shannon’s \(H'\) is a dimensionless metric; it considers differences in the relative abundance among pixel values, but not their relative spectral distance, i.e., the distance among spectral values (Thouverai et al., 2021).
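As a worked example, the index can be computed by hand for a single moving window. The 3 x 3 window below holds made-up pixel values, and the calculation uses natural logarithms; it is a base R sketch of the formula, not the {rasterdiv} implementation:

```r
# A toy 3 x 3 moving window of pixel values.
win <- matrix(c(1, 1, 2,
                2, 2, 3,
                3, 3, 3), nrow = 3, byrow = TRUE)

# Proportional abundance of each distinct pixel value.
p <- as.vector(table(win)) / length(win)
p

# Shannon's index with natural logarithms.
H <- -sum(p * log(p))
H

# Upper bound: all distinct values equally abundant.
log(length(p))
```

The index is 0 when every pixel in the window has the same value and reaches its maximum, the logarithm of the number of distinct values, when all values are equally abundant.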

In this example we use parallel computation, which allows us to obtain results quickly. However, if you have issues running this part, please remove the argument np = 10 from the code below.

Code
library(doParallel)

# Computes Shannon's diversity index (H') on different classes of numeric matrices using a moving window algorithm.
HARV_shannon <- rasterdiv::Shannon(x = harv_ndvi, # NDVI raster
                                   window = 5, # window size
                                   np = 10 # Number of cores; if this doesn't work for you, just remove this line of code
                                   )

Plot the results for Shannon’s diversity.

Code
plot(HARV_shannon)

6.2 Hill’s generalized entropy diversity index

Hill’s generalized entropy diversity index is based on the effective number of species (or pixel values) \(H_\alpha\), i.e., the number of equally abundant species that would lead to the same diversity \(H\) (Cavender-Bares et al., 2020; Scheiner et al., 2017). An important component of Hill’s diversity is the order \(\alpha\): different orders of \(\alpha\) result in different diversity measures. For example, \(\alpha = 0\) is simply species richness, \(\alpha = 1\) gives the exponential of Shannon’s entropy index, and \(\alpha = 2\) gives the inverse of Simpson’s concentration index (Cavender-Bares et al., 2020).

\[H_\alpha = \left(\sum_{i=1}^{N} p_i^{\alpha}\right)^{\frac{1}{1-\alpha}}\]

The expression is undefined at \(\alpha = 1\), so in that case the index is taken as the limit as \(\alpha\) approaches 1. Let’s compute Hill’s diversity with \(\alpha = 1\) and compare the results with Shannon’s metric.

Code
# Computes Hill's index of diversity (Hill numbers) on different classes of numeric matrices using a moving window algorithm.
HARV_hill <- rasterdiv::Hill(harv_ndvi, 
                             alpha = 1, 
                             window = 5, 
                             np = 10, # Number of cores; if this doesn't work for you, just remove this line of code
                             rasterOut = TRUE)

Plot the results

Code
plot(HARV_hill[[1]])

Explore the correlation between the two rasters

Code
cor.test(values(HARV_shannon), values(HARV_hill[[1]]))

As indicated before, when \(\alpha = 1\) Hill’s index corresponds to the exponential of Shannon’s entropy index.
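This relationship can be verified numerically on a small vector of proportions. The function hill() below is a hypothetical helper written for this example (it is not the {rasterdiv} implementation), and the proportions are made up; the order \(\alpha = 1\) is handled as the limit, i.e., the exponential of Shannon’s index:

```r
# Hill number of order alpha for a vector of proportions p.
# alpha = 1 is handled as the limit, exp(Shannon's index).
hill <- function(p, alpha) {
  p <- p[p > 0]
  if (isTRUE(all.equal(alpha, 1))) {
    exp(-sum(p * log(p)))
  } else {
    sum(p^alpha)^(1 / (1 - alpha))
  }
}

# Illustrative proportions of three pixel values in a window.
p <- c(2, 3, 4) / 9

hill(p, 0)       # richness: 3
hill(p, 2)       # inverse of Simpson's concentration: 1 / sum(p^2)
hill(p, 1)       # exponential of Shannon's index
hill(p, 1.0001)  # close to hill(p, 1)
```

Since the Hill number of order 1 is a monotonic transformation of Shannon’s index, a strong positive correlation between the two rasters is what we would expect.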

That’s it!

7 The challenge

This was a very long lab, and it will take some time to digest all the information. Thus, the challenge for this lab is simple: prepare a document with all the figures generated in this tutorial.

References

Cavender-Bares, J., Schweiger, A. K., Pinto-Ledezma, J. N., & Meireles, J. E. (2020). Applying remote sensing to biodiversity science. In J. Cavender-Bares, J. A. Gamon, & P. A. Townsend (Eds.), Remote sensing of plant biodiversity (pp. 13–42). Springer International Publishing. https://doi.org/10.1007/978-3-030-33157-3_2
Scheiner, S. M., Kosman, E., Presley, S. J., & Willig, M. R. (2017). Decomposing functional diversity. Methods in Ecology and Evolution, 8(7), 809–820. https://doi.org/10.1111/2041-210X.12696
Thouverai, E., Marcantonio, M., Bacaro, G., Da Re, D., Iannacito, M., Marchetto, E., Ricotta, C., Tattoni, C., Vicario, S., & Rocchini, D. (2021). Measuring diversity from space: A global view of the free and open source rasterdiv R package under a coding perspective. Community Ecology, 22(1), 1–11. https://doi.org/10.1007/s42974-021-00042-x