The aim of this tutorial is to provide a short introduction to the prioritizr R package. It is also intended to help conservation planners familiar the Marxan decision support tool (Ball et al. 2009) start using the package for their work.
Let’s load the packages and data used in this tutorial. Since this tutorial uses data from the prioritizrdata R package, please ensure that it is installed. The data used in this tutorial were obtained from the Introduction to Marxan course and the Australian Government’s National Vegetation Information System.
# load packages
library(prioritizrdata)
library(prioritizr)
library(sf)
library(terra)
library(vegan)
library(cluster)
# set seed for reproducibility
set.seed(500)
# load planning unit data
tas_pu <- get_tas_pu()
# load feature data
tas_features <- get_tas_features()
Let’s have a look at the planning unit data. The tas_pu
object contains planning units represented as spatial polygons (i.e., a
sf::st_sf()
object). This object has three columns that
denote the following information for each planning unit: a unique
identifier (id
), unimproved land value (cost
),
and current conservation status (locked_in
). Planning units
that have at least half of their area overlapping with existing
protected areas are denoted with a locked in TRUE
value,
otherwise they are denoted with a value of FALSE
. We will
also set the costs for existing protected areas to zero, so that
existing protected areas aren’t included in the the cost of the
prioritization.
# print planning unit data
print(tas_pu)
# set costs for existing protected areas to zero
tas_pu$cost <- tas_pu$cost * !tas_pu$locked_in
# plot map of planning unit costs
plot(st_as_sf(tas_pu[, "cost"]), main = "Planning unit costs")
# plot map of planning unit coverage by protected areas
plot(st_as_sf(tas_pu[, "locked_in"]), main = "Protected area coverage")
Now, let’s look at the conservation feature data. The
tas_features
object describes the spatial distribution of
the features. Specifically, the feature data are a multi-layer raster
(i.e., a terra::rast()
object). Each layer corresponds to a
different vegetation community. Within each layer, cells values denote
the presence (using value of 1) or absence (using value of 0) of the
vegetation community across the study area.
Now we will formulate a conservation planing problem. To achieve
this, we first specify which objects contain the planning unit and
feature data (using the problem()
function). Next, we
specify that we want to use the minimum set objective function (using
the add_min_set_objective()
function). This objective
function indicates that we wish to minimize the total cost of planning
units selected by the prioritization. We then specify boundary penalties
to reduce spatial fragmentation in the resulting prioritization (using
the add_boundary_penalties()
function; see the Calibrating
trade-offs vignette for details on calibrating the penalty
value). We also specify representation targets to ensure the resulting
prioritization provides adequate coverage of each vegetation community
(using the add_relative_targets()
function). Specifically,
we specify targets to ensure at least 17% of the spatial extent of each
vegetation community (based on the Aichi Target 11).
Additionally, we set constraints to ensure that planning units
predominately covered by existing protected areas are selected by the
prioritization (using the add_locked_in_constraints()
function). Finally, we specify that the prioritization should either
select – or not select – planning units for prioritization (using the
add_binary_decisions()
function).
We can now solve the problem formulation (p1
) to
generate a prioritization (using the solve()
function). The
prioritizr R package supports a range of different exact
algorithm solvers, including Gurobi, IBM CPLEX,
CBC, HiGHS, Rsymphony, and
lpsymphony. Although there are benefits and limitations
associated with each of these different solvers, they should return
similar results. Note that you will need at least one solver installed
on your system to generate prioritizations. Since we did not specify a
solver when building the problem, the prioritizr R package will
automatically select the best available solver installed. We recommend
using the Gurobi solver if possible, and have used it for this
tutorial (see the Gurobi Installation Guide vignette for
installation instructions). After solving the problem, the
prioritization will be stored in the solution_1
column of
the s1
object.
Let’s examine how well the vegetation communities are represented by existing protected areas and the prioritization.
# create column with existing protected areas
tas_pu$pa <- round(tas_pu$locked_in)
# calculate feature representation statistics based on existing protected areas
tc_pa <- eval_target_coverage_summary(p1, tas_pu[, "pa"])
print(tc_pa)
# calculate feature representation statistics based on the prioritization
tc_s1 <- eval_target_coverage_summary(p1, s1[, "solution_1"])
print(tc_s1)
# explore representation by existing protected areas
## calculate number of features adequately represented by existing protected
## areas
sum(tc_pa$met)
## summarize representation (values show percent coverage)
summary(tc_pa$relative_held * 100)
## visualize representation (values show percent coverage)
hist(tc_pa$relative_held * 100,
main = "Feature representation by existing protected areas",
xlim = c(0, 100),
xlab = "Percent coverage of features (%)")
# explore representation by prioritization
## summarize representation (values show percent coverage)
summary(tc_s1$relative_held * 100)
## calculate number of features adequately represented by the prioritization
sum(tc_s1$met)
## visualize representation (values show percent coverage)
hist(
tc_s1$relative_held * 100,
main = "Feature representation by prioritization",
xlim = c(0, 100),
xlab = "Percent coverage of features (%)"
)
We can see that representation of the vegetation communities by existing protected areas is remarkably poor. For example, many of the vegetation communities have nearly zero coverage by existing protected areas. In other words, are almost entirely absent from existing protected areas. We can also see that all vegetation communities have at least 17% coverage by the prioritization – meaning that it meets the representation targets for all of the features.
After generating the prioritization, we can examine the relative
importance of planning units selected by the prioritization. This can be
useful to identify critically important planning units for conservation
– in other words, places that contain biodiversity features which cannot
be represented anywhere else – and schedule implementation of the
prioritization. To achieve this, we will use an incremental rank
approach (Jung et al. 2021).
Briefly, this approach involves generating incremental prioritizations
with increasing budgets, wherein planning units selected in a previous
increment are locked in to the following solution. Additionally, locked
out constraints are used to ensure that only planning units selected in
the original solution are available for selection. If you’re interested,
other approaches for examining importance are also available (see ?importance
).
Conservation planning exercises often involve generating multiple
different prioritizations. This can help decision makers consider
different options, and provide starting points for building consensus
among stakeholders. To generate a range of different prioritizations
given the same problem formulation, we can use portfolio functions. Here
we will use the gap portfolio to generate 1000 solutions that are within
20% of optimality. Please note that you will need to have the
Gurobi solver installed to use this specific portfolio. If you
don’t have access to Gurobi, you could try using the shuffle
portfolio instead (using the add_shuffle_portfolio()
function).
# create new problem with a portfolio added to it
p2 <-
p1 %>%
add_gap_portfolio(number_solutions = 1000, pool_gap = 0.2)
# print problem
print(p2)
# generate prioritizations
prt <- solve(p2)
print(prt)
After generating all these prioritizations, we now want some way to visualize them. Because it would be onerous to look at each and every prioritization individually, we will use statistical analyses to help us. We can visualize the differences between these different prioritizations – based on which planning units they selected – using a hierarchical cluster analysis (Harris et al. 2014).
# extract solutions
prt_results <- sf::st_drop_geometry(prt)
prt_results <- prt_results[, startsWith(names(prt_results), "solution_")]
# calculate pair-wise distances between different prioritizations for analysis
prt_dists <- vegan::vegdist(t(prt_results), method = "jaccard", binary = TRUE)
# run cluster analysis
prt_clust <- hclust(as.dist(prt_dists), method = "average")
# visualize clusters
opar <- par()
par(oma = c(0, 0, 0, 0), mar= c(0, 4.1, 1.5, 2.1))
plot(
prt_clust, labels = FALSE, sub = NA, xlab = "",
main = "Different prioritizations in portfolio"
)
suppressWarnings(par(opar))
We can see that there are approximately six main groups of prioritizations in the portfolio. To explore these different groups, let’s conduct another cluster analysis (i.e., a k-medoids analysis) to extract the most representative prioritization from each of these groups. In other words, we will run another statistical analysis to find the most central prioritization within each group.
# run k-medoids analysis
prt_med <- pam(prt_dists, k = 6)
# extract names of prioritizations that are most central for each group.
prt_med_names <- prt_med$medoids
print(prt_med_names)
# create a copy of prt and set values for locked in planning units to -1
# so we can easily visualize differences between prioritizations
prt2 <- prt[, prt_med_names]
prt2[which(tas_pu$locked_in > 0.5), prt_med_names] <- -1
# plot a map showing main different prioritizations
# dark grey: locked in planning units
# grey: planning units not selected
# green: selected planning units
plot(st_as_sf(prt2), pal = c("grey60", "grey90", "darkgreen"))
The prioritizr R package provides functionality to help Marxan users generate prioritizations. Specifically, it can import conservation planning data prepared for Marxan, and can generate prioritizations using a similar problem formulation as Marxan (based on Beyer et al. 2016). Indeed, the problem formulation presented earlier in this vignette is very similar to that used by Marxan. The key difference is that the problem formulation we specified earlier uses “hard constraints” for feature representation, and Marxan uses “soft constraints” for feature representation. This means that prioritization we generated earlier was mathematically guaranteed to reach the targets for all features. However, if we used Marxan to generate the prioritization, then we could have produced a prioritization that would fail to reach targets (depending the Species Penalty Factors used to generate the prioritization). In addition to these differences in terms problem formulation, the prioritizr R package uses exact algorithms – instead of the simulated annealing algorithm – which ensures that we obtain prioritizations that are near optimal.
Here we will show the prioritizr R package can import Marxan data and generate a prioritization. To begin with, let’s import a conservation planning data prepared for Marxan.
# import data
## planning unit data
pu_path <- system.file("extdata/marxan/input/pu.dat", package = "prioritizr")
pu_data <- read.csv(pu_path, header = TRUE, stringsAsFactors = FALSE)
print(head(pu_data))
## feature data
spec_path <- system.file(
"extdata/marxan/input/spec.dat", package = "prioritizr"
)
spec_data <- read.csv(spec_path, header = TRUE, stringsAsFactors = FALSE)
print(head(spec_data))
## amount of each feature within each planning unit data
puvspr_path <- system.file(
"extdata/marxan/input/puvspr.dat", package = "prioritizr"
)
puvspr_data <- read.csv(puvspr_path, header = TRUE, stringsAsFactors = FALSE)
print(head(puvspr_data))
## boundary data
bound_path <- system.file(
"extdata/marxan/input/bound.dat", package = "prioritizr"
)
bound_data <- read.table(bound_path, header = TRUE, stringsAsFactors = FALSE)
print(head(bound_data))
After importing the data, we can now generate a prioritization based
on the Marxan problem formulation (using the
marxan_problem()
function). Please note that this
function does not generate prioritizations using
Marxan. Instead, it uses the data to create an
optimization problem formulation similar to Marxan – using hard
constraints instead of soft constraints – and uses an exact algorithm
solver to generate a prioritization.
This tutorial shows how the prioritizr R package can be used to build a conservation problem, generate a prioritization, and evaluate it. Although we explored just a few functions, the package provides many different functions so that you can build and custom-tailor conservation planning problems to suit your needs. To learn more about the package, please see the package vignettes for an overview of the package, instructions for installing the Gurobi optimization suite, benchmarks comparing the performance of different solvers, and a record of publications that have cited the package. In addition to this tutorial, the package also provides tutorials on incorporating connectivity into prioritizations, calibrating trade-offs between different criteria (e.g., total cost and spatial fragmentation), and creating prioritizations that have multiple management zones or management actions.