Using raster data to calculate and then map aridity indices
Peter Kamerman
15 December 2020I was asked by my wife, Prof Andrea Fuller, head of the Wildlife Conservation Physiology Lab, University of the Witwatersrand, South Africa, to assist her and some colleagues who were writing a paper on how dryland mammals may respond to climate change 1 to update a figure (see below) on the global aridity index, which appears in the 2019 IPCC report on climate change 2. The original figure uses 2015 data, and they wanted to use 2019 data.
The aridity index is the ratio of the total amount of precipitation in an area and the amount of evapotranspiration . That is, what is the water gain (precipitation) relative to the amount of water loss (evapotranspiration) in a region. The smaller the ratio, the more arid a region is.
I had used the mapping functions in R before (for example, see here), but never to plot raster data. So, I took up the challenge to reproduce the plot.
This blog post details how I accomplished the task, and in the process provides me with a “note to self” on how I did it.
Global drylands figure
First I needed to get access to the raster data, which is freely available through the amazing TerraClimate project 3. Although there are no aridity index rasters, data on precipitation and evapotranspiration are available for the period 1958 to 2019, and from those data I could calculate the aridity index using raster maths.
Load packages
# Load packages
library(raster) # Raster importing and manipulation
library(ggplot2) # Plotting the raster image
library(dplyr) # General data processing
Download, import, and inspect data
For importing the data I used the raster::stack
function, which imports the raster data stored in the _*.nc_ files I had downloaded. Each _*.nc_ file contained raster data for the 12 months of each year, and the raster::stack
function imports the data and stacks them into a single ‘RasterStack’ object, with 12 layers (one for each month).
#--- Download the files from the TerraClimate website ---#
# Precipitation
download.file(url = 'http://thredds.northwestknowledge.net:8080/thredds/fileServer/TERRACLIMATE_ALL/data/TerraClimate_ppt_2019.nc',
destfile = 'ppt.nc')
# Evapotranspiration
download.file(url = 'http://thredds.northwestknowledge.net:8080/thredds/fileServer/TERRACLIMATE_ALL/data/TerraClimate_pet_2019.nc',
destfile = 'pet.nc')
#--- Import the downloaded files ---#
# Precipitation
<- stack(x = 'ppt.nc')
ppt
# Evapotranspiration
<- stack(x = 'pet.nc')
pet
#--- Inspect ---#
# Precipitation
plot(ppt)
# Evapotranspiration
plot(pet)
As can be seen from the plots, there are data for each month of 2019 (one for each layer in the ‘RasterStack’ object), but I want a single figure reflecting the average annual data. This is where raster maths came in handy.
Raster maths
I used the raster::calc
function to generate a new raster object consisting of a single layer by calculating the arithmetic mean across the layers of the RasterLayer object.
#--- Raster maths ---#
# Precipitation
<- calc(ppt, # RasterStack object
ppt_mean fun = mean, # Function to apply across the layers
na.rm = TRUE)
# Evapotranspiration
<- calc(pet,
pet_mean fun = mean,
na.rm = TRUE)
You can also see that the small multiples shown above have Antarctica plotted, but the original figure does not, and so I needed to crop the rasters to exclude Antarctica using the raster::extent
and raster::crop
functions.
#--- Set the extent ---#
# Cut off all values below 60 degrees South (removing Antarctica)
<- extent(c(xmin = -180, xmax = 180,
ext ymin = -60, ymax = 90))
#--- Crop ---#
# Precipitation
<- crop(x = ppt_mean,
ppt_mean y = ext)
# Evapotranspiration
<- crop(x = pet_mean,
pet_mean y = ext)
#--- Inspect ---#
# Precipitation
plot(main = 'Precipitation',
ppt_mean)
# Evapotranspiration
plot(main = 'Evapotranspiration',
pet_mean)
Now that I had combined the layers into a single layer and cropped them to the area I wanted, I needed to calculate the aridity index by overlaying the two rasters ppt_mean and pet_mean and specifying how the layers should be combined using the raster::overlay
function. In this case I wanted to take the ratio of precipitation to evapotranspiration.
#--- Calculate aridity index ---#
# Precipitation (ppt) / Evapotranspiration (pet)
<- overlay(x = ppt_mean, # Raster object 1
aridity_index y = pet_mean, # Raster object 2
fun = function(x, y){return(x / y)}) # Function to apply
You can plot rasters using the raster::plot
function, but I find it has limited functionality, so to give me more flexibility when plotting the aridity index, I converted the raster to a dataframe so that I could use ggplot2
. First I converted the raster, using the raster::rasterToPoints
function, to a matrix of x and y coordinates and a layer expressing the value at each coordinate.
#--- Convert raster to a matrix ---#
<- rasterToPoints(aridity_index) aridity_index_matrix
Then I converted the matrix to a dataframe using the base::as.data.frame
function, and recoded all the layer values into a categorical variable representing the aridity index (AI) categories. These categories are:
Humid: AI \(\geq\) 0.65,
Dry sub-humid: 0.50 \(<\) AI \(\leq\) 0.65
Semi-arid: 0.20 \(<\) AI \(\leq\) 0.50
Arid: 0.05 \(<\) AI \(\leq\) 0.20
Hyper-arid: AI \(<\) 0.05
#--- Convert to the matrix to a dataframe ---#
<- as.data.frame(aridity_index_matrix)
aridity_index_df
#--- Recode aridity index into categories --#
<- aridity_index_df %>%
aridity_index_df # Recode
mutate(category = case_when(
is.infinite(layer) ~ 'Humid',
>= 0.65 ~ 'Humid',
layer >= 0.5 & layer < 0.65 ~ 'Dry sub-humid',
layer >= 0.2 & layer < 0.5 ~ 'Semi-arid',
layer >= 0.05 & layer < 0.2 ~ 'Arid',
layer < 0.05 ~ 'Hyper-arid'
layer %>%
)) # Convert to ordered factor
mutate(category = factor(category,
levels = c('Hyper-arid', 'Arid', 'Semi-arid',
'Dry sub-humid', 'Humid'),
ordered = TRUE))
Plot the data
Lastly, I plotted the data.
#--- Set a colour palette ---#
<- c('#e31a1c', '#fd8d3c', '#fecc5c', '#ffffb2', '#666666')
colours
#--- Plot the data ---#
ggplot(data = aridity_index_df) +
aes(y = y,
x = x,
fill = category) +
geom_raster() +
scale_fill_manual(values = colours,
guide = guide_legend(reverse = TRUE)) +
scale_y_continuous(limits = c(-60, 90),
expand = c(0, 0),
breaks = c(-40, -20, 0, 20, 40, 60, 80),
labels = c(expression('40'*degree*'S'),
expression('20'*degree*'S'),
expression('0'*degree),
expression('20'*degree*'N'),
expression('40'*degree*'N'),
expression('60'*degree*'N'),
expression('80'*degree*'N'))) +
scale_x_continuous(limits = c(-180, 180),
expand = c(0, 0),
breaks = c(-180, -120, -60, 0, 60, 120, 180),
labels = c(expression('180'*degree*'W'),
expression('120'*degree*'W'),
expression('60'*degree*'W'),
expression('0'*degree),
expression('60'*degree*'E'),
expression('120'*degree*'E'),
expression('180'*degree*'E'))) +
theme_bw(base_size = 14) +
theme(legend.title = element_blank(),
legend.text = element_text(size = 10),
axis.title = element_blank(),
panel.grid.major = element_line(linetype = 2,
size = 0.5,
colour = '#666666'),
panel.grid.minor = element_blank())
I think that is a fairly good reproduction of the original figure from the IPCC report, updated to use 2019 data. As you can see large swaths of North Africa, the Arabian Peninsula, and central Australia are deemed hyper-arid regions. In fact, most of Australia is hyper-arid or arid. The same goes for the Southern African region, which I am from, where the West coast is hyper-arid, and most of the rest of the region is arid or semi-arid.
Seeing this gradient from hyper-arid to semi-arid across the Southern African region, I wanted to see if there has been any changes in this gradient over the past 25 years.
Aridity index across the southern parts of Southern Africa (1995 to 2019)
I wanted to use data from 1995 through to 2019, in 5-year periods. This meant downloading and processing multiple files, and to reduce the amount of repetition in my code, I turned to the various map
functions in the purrr
package.
Download and import data
First I downloaded the data from TerraClimate project locally.
#--- Load packages ---#
library(purrr)
#--- Precipitation ---#
# Construct a vector of URLs to download data
## Years to download
<- c(seq(from = 1995, to = 2015, by = 5), 2019)
years ## First chunk of the URL
<- 'http://thredds.northwestknowledge.net:8080/thredds/fileServer/TERRACLIMATE_ALL/data/TerraClimate_ppt_'
precipitation_lead ## Piece together the precipitation_lead and year and add the file extension (.nc)
<- paste0(precipitation_lead, years, '.nc')
precipitation_urls
# Construct a vector of destination files
<- paste0('TerraClimate_ppt_', years, '.nc')
precipitation_destfile
# Download using the purrr::map2 functions
map2(.x = precipitation_urls, # Input 1 to apply a function over
.y = precipitation_destfile, # Input 2 to apply a function over
~ download.file(url = .x, # The function to apply
destfile = .y,
method = 'curl'))
#--- Evapotranspiration
# Construct a vector of URLs to download data
## First chunk of the URL
<- 'http://thredds.northwestknowledge.net:8080/thredds/fileServer/TERRACLIMATE_ALL/data/TerraClimate_pet_'
evapotranspiration_lead ## Piece together the precipitation_lead and year and add the file extension (.nc)
<- paste0(evapotranspiration_lead, years, '.nc')
evapotranspiration_urls
# Construct a vector of destination files
<- paste0('TerraClimate_pet_', years, '.nc')
evapotranspiration_destfile
# Download using the purrr::map2 functions
map2(.x = evapotranspiration_urls,
.y = evapotranspiration_destfile,
~ download.file(url = .x,
destfile = .y,
method = 'curl'))
Then I imported the files. Again, I used the raster::stack
function I had used for importing the data for the global map.
#--- Precipitation ---#
# Get a list of precipitation file names
<- list.files(path = '.', pattern = '.ppt.')
precipitation_files # Import using the purrr::map function to apply the raster::stack function
<- map(.x = precipitation_files,
precipitation_rasters ~ stack(.x))
#--- Evapotranspiration ---#
# Get a list of evapotranspiration file names
<- list.files(path = '.', pattern = '.pet.')
evapotranspiration_files # Import using the purrr::map function to apply the raster::stack function
<- map(.x = evapotranspiration_files,
evapotranspiration_rasters ~stack(.x))
Once I had imported the data, I needed to crop the large raster data to only include parts of Southern Africa, with a focus on South Africa.
By only using a subset of the data in the raster file, this cropping step would speed-up the processing of the data in later steps.
# Set extent to cover Southern Africa
<- extent(10, 40, -37, -20)
ext2
# Crop rasters
## Precipitation
<- map(.x = precipitation_rasters,
precipitation_rasters ~ crop(.x, ext2))
## Evapotranspiration
<- map(.x = evapotranspiration_rasters,
evapotranspiration_rasters ~ crop(.x, ext2))
Raster maths
Then to calculate the annual mean for each year, I used the raster::calc
function.
#-- Calculate mean annual values --#
# Precipitation
<- map(.x = precipitation_rasters,
precipitation_rasters ~ calc(.x,
fun = mean,
na.rm = TRUE))
# Evapotranspiration
<- map(.x = evapotranspiration_rasters,
evapotranspiration_rasters ~ calc(.x,
fun = mean,
na.rm = TRUE))
Process the data as a dataframe
To try something different to the raster::overlay
method I used in the previous section to calculate the global aridity index, in this section, I converted the raster objects to dataframes using the combination of raster::rasterToPoints
and base::as.data.frame
, and then did the maths using the dplyr::mutate
function.
#--- Convert each raster to a dataframe ---#
# Precipitation
<- map(.x = precipitation_rasters,
precipitation_rasters ~ rasterToPoints(.x) %>%
as.data.frame(.) %>%
rename(precipitation = layer)) # For later join
# Evapotranspiration
<- map(.x = evapotranspiration_rasters,
evapotranspiration_rasters ~ rasterToPoints(.x) %>%
as.data.frame(.) %>%
rename(evapotranspiration = layer)) # For later join
#--- Join the precipitation and evapotranspiration dataframes ---#
<- map2(.x = precipitation_rasters,
ai_df .y = evapotranspiration_rasters,
~ left_join(.x, .y))
#-- Add a year identifier to each dataframe --#
<- c(seq(from = 1995, to = 2015, by = 5), 2019)
years2
<- map2(.x = ai_df,
ai_df .y = years2,
~ .x %>%
mutate(year = as.character(.y)))
#-- Bind list items to make a single large dataframe (1995 to 2019) --#
<- map_df(.x = ai_df,
ai_df ~ bind_rows(.x))
#--- Calculate aridity index ---#
<- ai_df %>%
ai_df mutate(aridity_index = precipitation / evapotranspiration)
#--- Recode aridity index into categories --#
<- ai_df %>%
ai_df # Recode
mutate(category = case_when(
is.infinite(aridity_index) ~ 'Humid',
>= 0.65 ~ 'Humid',
aridity_index >= 0.5 & aridity_index < 0.65 ~ 'Dry sub-humid',
aridity_index >= 0.2 & aridity_index < 0.5 ~ 'Semi-arid',
aridity_index >= 0.05 & aridity_index < 0.2 ~ 'Arid',
aridity_index < 0.05 ~ 'Hyper-arid'
aridity_index %>%
)) # Convert to ordered factor
mutate(category = factor(category,
levels = c('Hyper-arid', 'Arid', 'Semi-arid',
'Dry sub-humid', 'Humid'),
ordered = TRUE))
Get Natural Earth map
I wanted to plot the country outlines on the raster image, and to do this I needed the rnaturalearth
package, which makes it easy to global maps from the Natural Earth site.
# Load package
library(rnaturalearth)
library(rnaturalearthdata)
#--- Get medium scale (1:50) map of Southern African countries ---#
# Get a map of Africa
<- ne_countries(scale = 50,
africa continent = 'Africa')
# Extract Southern African countries
<- c('ZAF', 'NAM', 'BWA', 'ZWE', 'MOZ', 'MWI', 'ZMB', 'AGO')
country_filter
<- africa %>%
s_africa st_as_sf(.) %>% # Convert the large SpatialPolygonDataFrame into a sf object
filter(sov_a3 %in% country_filter) # Filter out the countries I want
Plot the data
#--- Plot the data ---#
ggplot() +
geom_raster(data = ai_df, # Plot the aridity index raster
aes(x = x,
y = y,
fill = category)) +
geom_sf(data = s_africa,
alpha = 0) + # Plot Southern Africa
scale_fill_manual(values = colours, # Already set for the global aridity index map
guide = guide_legend(reverse = TRUE)) +
coord_sf(ylim = c(-37, -21),
xlim = c(10, 40)) +
theme_void(base_size = 18) +
facet_wrap(~year, ncol = 2) +
theme(legend.title = element_blank())
It’s always difficult to interpret a series of snapshots in time, and as can be seen in the figure, there are spatial fluctuations in the aridity index across the ~25 years. For example, the humid zone increases massively in 2000, but is minimal in most other years. Indeed, it appears that there is a gradual encroachment of the semi-arid regions into the dry semi-humid and humid zones on the east coast. What is clear, is that 2019 was a bad year, with large expanses of Namibia and South Africa experiencing hyper-arid conditions. It will be interesting to see if this trend continues in 2020, especially since the El Nino-Southern Oscillation is in a La Nina state in 2020, which is associated with above average rainfall in Southern Africa.
Another approach to analysing the data could be to average the aridity index values over multiple years, thus exposing smoothed temporal trends. I hope you found this blog post useful, and that you enjoyed reading it.
Session information
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## Random number generation:
## RNG: Mersenne-Twister
## Normal: Inversion
## Sample: Rounding
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] rnaturalearthdata_0.1.0 rnaturalearth_0.1.0 raster_3.4-5
## [4] lspline_1.0-0 carrots_0.1.0 geofacet_0.2.0
## [7] survey_4.0 survival_3.2-9 Matrix_1.3-2
## [10] sf_0.9-7 fiftystater_1.0.1 skimr_2.1.3
## [13] magrittr_2.0.1 geojsonio_0.9.4 lubridate_1.7.10
## [16] xml2_1.3.2 sp_1.4-5 leaflet_2.0.4.1
## [19] highcharter_0.8.2 flexdashboard_0.5.2 gganimate_1.0.7
## [22] gdtools_0.2.3 ggiraph_0.7.8 svglite_2.0.0
## [25] pander_0.6.3 knitr_1.31 forcats_0.5.1
## [28] stringr_1.4.0 dplyr_1.0.5 purrr_0.3.4
## [31] readr_1.4.0 tidyr_1.1.3 tibble_3.1.0
## [34] ggplot2_3.3.3 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-0 ellipsis_0.3.1 class_7.3-18
## [4] rgdal_1.5-23 base64enc_0.1-3 fs_1.5.0
## [7] rstudioapi_0.13 httpcode_0.3.0 farver_2.1.0
## [10] ggrepel_0.9.1 fansi_0.4.2 codetools_0.2-18
## [13] splines_4.0.4 ncdf4_1.17 rlist_0.4.6.1
## [16] jsonlite_1.7.2 broom_0.7.5 dbplyr_2.1.0
## [19] png_0.1-7 rgeos_0.5-5 compiler_4.0.4
## [22] httr_1.4.2 backports_1.2.1 assertthat_0.2.1
## [25] lazyeval_0.2.2 cli_2.3.1 tweenr_1.0.1
## [28] htmltools_0.5.1.1 prettyunits_1.1.1 tools_4.0.4
## [31] igraph_1.2.6 gtable_0.3.0 glue_1.4.2
## [34] geojson_0.3.4 V8_3.4.0 Rcpp_1.0.6
## [37] imguR_1.0.3 cellranger_1.1.0 jquerylib_0.1.3
## [40] vctrs_0.3.6 crul_1.1.0 nlme_3.1-152
## [43] crosstalk_1.1.1 xfun_0.22 rvest_1.0.0
## [46] lifecycle_1.0.0 jqr_1.2.0 zoo_1.8-9
## [49] scales_1.1.1 hms_1.0.0 yaml_2.2.1
## [52] quantmod_0.4.18 curl_4.3 gridExtra_2.3
## [55] sass_0.3.1 stringi_1.5.3 highr_0.8
## [58] maptools_1.0-2 e1071_1.7-4 TTR_0.24.2
## [61] boot_1.3-27 repr_1.1.3 rlang_0.4.10
## [64] pkgconfig_2.0.3 systemfonts_1.0.1 geogrid_0.1.1
## [67] evaluate_0.14 lattice_0.20-41 htmlwidgets_1.5.3
## [70] labeling_0.4.2 tidyselect_1.1.0 geojsonsf_2.0.1
## [73] R6_2.5.0 magick_2.7.0 generics_0.1.0
## [76] DBI_1.1.1 mgcv_1.8-34 pillar_1.5.1
## [79] haven_2.3.1 foreign_0.8-81 withr_2.4.1
## [82] units_0.7-0 xts_0.12.1 modelr_0.1.8
## [85] crayon_1.4.1 uuid_0.1-4 KernSmooth_2.23-18
## [88] utf8_1.2.1 rmarkdown_2.7 jpeg_0.1-8.1
## [91] progress_1.2.2 readxl_1.3.1 data.table_1.14.0
## [94] reprex_1.0.0 digest_0.6.27 classInt_0.4-3
## [97] webshot_0.5.2 munsell_0.5.0 viridisLite_0.3.0
## [100] kableExtra_1.3.4 bslib_0.2.4 mitools_2.4
Fuller A, et al., How dryland mammals will respond to climate change: the effects of body size, heat load, and a lack of food and water. J Exp Biol IN PRESS. DOI: 10.1242/jeb.238113↩︎
Desertification. In: Climate Change and Land: an IPCC special report on climate change, desertification, land degradation, sustainable land management, food security, and greenhouse gas fluxes in terrestrial ecosystems. URL: ipcc.ch/srccl/chapter/chapter-3/. Accessed: 15 December 2019.↩︎
Abatzoglou JT, Dobrowski SZ, Parks SA, Hegewisch KC. TerraClimate, a high-resolution global dataset of monthly climate and climatic water balance from 1958-2015. Sci Data 5:170191, 2018. DOI: 10.1038/sdata.2017.191↩︎