The prioritizr R package uses mixed integer linear programming (MILP) techniques to provide a flexible interface for building and solving conservation planning problems (Rodrigues et al. 2000; Billionnet 2013). It supports a broad range of objectives, constraints, and penalties that can be used to custom-tailor conservation planning problems to the specific needs of a conservation planning exercise. Once built, conservation planning problems can be solved using a variety of commercial and open-source exact algorithm solvers. In contrast to the algorithms conventionally used to solve conservation problems, such as heuristics or simulated annealing (Ball et al. 2009), the exact algorithms used here are guaranteed to find optimal solutions. Furthermore, conservation problems can be constructed to optimize the spatial allocation of different management zones (or actions), meaning that conservation practitioners can identify solutions that benefit multiple stakeholders. Finally, this package has the functionality to read input data formatted for the Marxan conservation planning program (Ball et al. 2009), and find much cheaper solutions in a much shorter period of time than Marxan (Beyer et al. 2016).
Systematic conservation planning is a rigorous, repeatable, and structured approach to designing new protected areas that efficiently meet conservation objectives (Margules & Pressey 2000). Historically, conservation decision-making has often evaluated parcels opportunistically as they became available for purchase, donation, or under threat. Although purchasing such areas may improve the status quo, such decisions may not substantially enhance the long-term persistence of target species or communities. Faced with this realization, conservation planners began using decision support tools to help simulate alternative reserve designs over a range of different biodiversity and management goals and, ultimately, guide protected area acquisitions and management actions. Due to the systematic, evidence-based nature of these tools, conservation prioritization can help contribute to a transparent, inclusive, and more defensible decision making process.
A conservation planning exercise typically starts by defining a study area. This study area should encompass all the areas relevant to the decision maker or the hypothesis being tested. The extent of a study area could encompass a few important localities (e.g., Stigner et al. 2016), a single state (e.g., Kirkpatrick 1983), an entire country (Fuller et al. 2010), or the entire planet (Butchart et al. 2015). Next, the study area is divided into a set of discrete areas termed planning units. Each planning unit represents a discrete locality in the study area that can be managed independently of other areas. The general idea is that some combination of the planning units can be selected for conservation actions (e.g., protected area establishment, habitat restoration). Planning units are often created as square or hexagon cells that are sized according to the scale of the conservation actions, and the resolution of the data that underpin the planning exercise (but see Klein et al. 2009).
Cost data (or a surrogate thereof) are needed to inform the prioritization process. Specifically, these cost data describe the relative expenditure associated with managing each planning unit for conservation. For example, if the goal of the conservation planning exercise is to identify priority areas for expanding a local protected area system, then the cost data could represent the physical cost of purchasing the land. Alternatively, if such data are not available, then surrogate data are often used instead (e.g., human population density, opportunity cost of foregone commercial activities, or planning unit size).
Conservation planning exercises also require data on the biodiversity elements that are of conservation interest (termed conservation features). These features could be species (e.g., Neofelis nebulosa, the Clouded Leopard), populations, or habitats (e.g., mangroves or cloud forest). After identifying the set of relevant conservation features for a conservation planning exercise, spatially explicit data need to be obtained for each and every feature to describe their spatial distribution (e.g., habitat suitability data, probability of occurrence data, presence/absence data). This is important to ensure that conservation features are adequately covered (represented) by prioritizations. After assembling all the data, the next step is to define the conservation planning problem.
The prioritizr R package is designed to help you build and solve conservation planning problems. Specifically, prioritizations are generated by formulating a mathematical optimization problem and then solving it to generate a solution. These mathematical optimization problems are formulated using the planning unit data, cost data, and feature data, and with information related to the overarching aim of the prioritization process. In general, the goal of an optimization problem is to minimize (or maximize) an objective function that is calculated using a set of decision variables, subject to a series of constraints to ensure that solution exhibits specific properties. The objective function describes the quantity which we are trying to minimize (e.g., cost of the solution) or maximize (e.g., number of features conserved). The decision variables describe the entities that we can control, and indicate which planning units are selected for conservation management and which are not. The constraints can be thought of as rules that the decision variables need to follow. For example, a commonly used constraint is specifying that the solution must not exceed a certain budget.
A wide variety of approaches have been developed for solving optimization problems. Reserve design problems are frequently solved using simulated annealing (Kirkpatrick et al. 1983) or heuristics (Nicholls & Margules 1993; Moilanen 2007). These methods are conceptually simple and can be applied to a wide variety of optimization problems. However, they do not scale well for large or complex problems (Beyer et al. 2016). Additionally, these methods cannot tell you how close any given solution is to the optimal solution. The prioritizr package uses exact algorithms to efficiently solve conservation planning problems to within a pre-specified optimality gap. In other words, you can specify that you need the optimal solution (i.e., a gap of 0%) and the algorithms will, given enough time, find a solution that meets this criteria. In the past, exact algorithms have been too slow for conservation planning exercises (Pressey et al. 1996). However, improvements over the last decade mean that they are now much faster (Achterberg & Wunderling 2013; Beyer et al. 2016).
In this package, optimization problems are expressed using integer linear programming (ILP) so that they can be solved using (linear) exact algorithm solvers. The general form of an integer programming problem can be expressed in matrix notation using the following equation.
$$\text{Minimize} \space \boldsymbol{c}^\text{T} \boldsymbol{x} \space \space \text{subject to} \space A\boldsymbol{x} \space \Box \space \boldsymbol{b}$$
Where x is a vector of decision variables, c and b are vectors of known coefficients, and A is the constraint matrix. The final term specifies a series of structural constants and the □ symbol is used to indicate that the relational operators for the constraints can be either ≥, =, or ≤. In the context of a conservation planning problem, c could be used to represent the planning unit costs, A could be used to store the data showing the presence / absence (or amount) of each feature in each planning unit, b could be used to represent minimum amount of habitat required for each species in the solution, the □ could be set to ≥ symbols to indicate that the total amount of each feature in the solution must exceed the quantities in b. But there are many other ways of formulating the reserve selection problem (Rodrigues et al. 2000).
The prioritizr R package uses a grammar to describe elements
of conservation planning. This means that functions are organized into
verbs that relate to specific concepts. For example, all of the
functions used to specify the primary objective for optimization end
with the _objective
suffix (e.g.,
add_min_set_objective()
and
add_min_shortfall_objective()
). By combining multiple
functions together, they can be used to formulate a complete
conservation planning problem. Specifically, the verbs for formulating
problems are described below.
After building a conservation planning problem, it can be solved to
generate a prioritization (using the solve()
function).
There are also verbs available to help evaluate and interpret solutions.
These verbs are described below.
The general workflow when using the prioritizr R package
starts with creating a new conservation planning problem()
object using data. Specifically, the problem()
object
should be constructed using data that specify the planning units,
biodiversity features, management zones (if applicable), and costs.
After creating a new problem()
object, it can be
customized—by adding objectives, penalties, constraints, and other
information—to build a precise representation of the conservation
planning problem required, and then solved to obtain a solution.
All conservation planning problems require an objective. An objective
specifies the property which is used to compare different feasible
solutions. Simply put, the objective is the property of the solution
which should be maximized or minimized during the optimization process.
For instance, with the minimum set objective (specified using
add_min_set_objective()
), we are seeking to minimize the
cost of the solution (similar to Marxan). On the other hand,
with the minimum shortfall objective (specified using
add_min_shortfall_objective()
), we are seeking to minimize
the average target shortfall for all features represented in the
solution, subject to a budget.
Many objectives require representation targets (e.g., the minimum set
objective). These targets are a specialized set of constraints that
relate to the total quantity of each feature secured in the solution
(e.g., amount of suitable habitat or number of individuals). In the case
of the minimum set objective ( add_min_set_objective()
),
they are used to ensure that solutions secure a sufficient quantity of
each feature, and in other objectives, such as the maximum features
objective ( add_max_features_objective()
) they are used to
assess whether a feature has been adequately conserved by a candidate
solution. Targets can be expressed numerically as the total amount
required for a given feature (using
add_absolute_targets()
), or as a proportion of the total
amount found in the planning units (using
add_relative_targets()
). Note that not all objectives
require targets, and a warning will be thrown if an attempt is made to
add targets to a problem with an objective that does not use them.
Constraints and penalties can be added to a conservation planning
problem to ensure that solutions exhibit a specific property or penalize
solutions which don’t exhibit a specific property (respectively). The
difference between constraints and penalties, strictly speaking, is
constraints are used to rule out potential solutions that don’t exhibit
a specific property. For instance, constraints can be used to ensure
that specific planning units are selected in the solution for
prioritization (using add_locked_in_constraints()
) or not
selected in the solution for prioritization (using
add_locked_out_constraints()
). On the other hand, penalties
are combined with the objective of a problem, with a penalty factor, and
the overall objective of the problem then becomes to minimize (or
maximize) the primary objective function and the penalty function. For
example, penalties can be added to a problem to penalize solutions that
are excessively fragmented (using
add_boundary_penalties()
). These penalties have a
penalty
argument that specifies the relative importance of
having spatially clustered solutions. When the argument to
penalty
is high, then solutions which are less fragmented
are valued more highly – even if they cost more – and when the argument
to penalty
is low, then the solutions which are more
fragmented are valued less highly.
After building a conservation problem, it can then be solved to obtain a solution (or portfolio of solutions if desired). The solution is returned in the same format as the planning unit data used to construct the problem. For instance, this means that if raster data was used to initialize the problem, then the solution will also be output in raster format. This can be very helpful when it comes to interpreting and visualizing solutions because it means that the solution data does not first have to be merged with spatial data before they can be plotted on a map.
Here we will provide an introduction to using the prioritizr R package to build and solve a conservation planning problem. Please note that we will not discuss conservation planning with multiple zones in this vignette, for more information on working with multiple management zones please see the Management zones tutorial.
First, we will load the prioritizr package.
Now we will load some built-in data sets that are distributed with the prioritizr R package. This package contains several different planning unit data sets. To provide a comprehensive overview of the different ways that we can initialize a conservation planning problem, we will load each of them.
First, we will load the raster planning unit data
(sim_pu_raster
). Here, the planning units are represented
as a single-layer raster object (i.e., a terra::rast()
object) and each pixel corresponds to the spatial extent of each panning
unit. The pixel values correspond to the acquisition costs of each
planning unit.
# load raster planning unit data
sim_pu_raster <- get_sim_pu_raster()
# print data
print(sim_pu_raster)
# plot data
plot(
sim_pu_raster, main = "Planning unit costs",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
Secondly, we will load one of the spatial vector planning unit data
sets (sim_pu_polygons
). Here, each polygon (i.e., feature
using ArcGIS terminology) corresponds to a different planning unit. This
data set has an attribute table that contains additional information
about each polygon. Namely, the cost
field (column) in the
attribute table contains the acquisition cost for each planning
unit.
# load polygon planning unit data
sim_pu_polygons <- get_sim_pu_polygons()
# print data
print(sim_pu_polygons)
# plot the planning units, and color them according to cost values
plot(sim_pu_polygons[, "cost"])
Thirdly, we will load some planning unit data stored in tabular
format (i.e., data.frame
format). For those familiar with
Marxan or dealing with very large conservation planning
problems (> 10 million planning units), it may be useful to work with
data in this format because it does not contain any spatial information
which will reduce computational burden. When using tabular data to
initialize conservation planning problems, the data must follow the
conventions used by Marxan. Specifically, each row in the
planning unit table must correspond to a different planning unit. The
table must also have an “id” column to provide a unique integer
identifier for each planning unit, and it must also have a column that
indicates the cost of each planning unit. For more information, please
see the official Marxan
documentation.
# specify file path for planning unit data
pu_path <- system.file("extdata/marxan/input/pu.dat", package = "prioritizr")
# load in the tabular planning unit data
# note that we use the data.table::fread function, as opposed to the read.csv
# function, because it is much faster
pu_dat <- data.table::fread(pu_path, data.table = FALSE)
# preview first six rows of the tabular planning unit data
# note that it has some extra columns other than id and cost as per the
# Marxan format
head(pu_dat)
Finally, we will load data showing the spatial distribution of the
conservation features. Our conservation features
(sim_features
) are represented as a multi-layer raster
object (i.e., a terra::rast()
object), where each layer
corresponds to a different feature. The pixel values in each layer
correspond to the amount of suitable habitat available in a given
planning unit. Note that our planning unit and our feature data have
exactly the same spatial properties (i.e., resolution, extent,
coordinate reference system) so their pixels line up perfectly.
After having loaded our planning unit and feature data, we will now
try initializing some conservation planning problems. There are a lot of
different ways to initialize a conservation planning problem, so here we
will just showcase a few of the more commonly used methods. For an
exhaustive description of all the ways you can initialize a conservation
problem, see the help file for the problem()
function.
First off, we will initialize a conservation planning problem using the
raster data.
# create problem
p1 <- problem(sim_pu_raster, sim_features)
# print problem
print(p1)
# print number of planning units
number_of_planning_units(p1)
# print number of features
number_of_features(p1)
Generally, we recommend initializing problems using raster data where
possible. This is because the problem()
function needs to
calculate the amount of each feature in each planning unit, and by
providing both the planning unit and feature data in raster format with
the same spatial resolution, extents, and coordinate systems, this means
that the problem()
function does not need to do any
geoprocessing behind the scenes. But sometimes we can’t use raster
planning unit data, because our planning units aren’t equal-sized grid
cells. So, below is an example showing how we can initialize a
conservation planning problem using planning units that are formatted as
spatial vector data. Note that we could reduce run-time by pre-computing
the amount of each feature in each planning unit and storing the data in
the attribute table (e.g., by performing zonal statistics with
R or ESRI ArcGIS), and then passing in the names of
the columns as an argument to the problem()
function (see
Examples section for problem()
for details).
# create problem with spatial vector data
# note that we have to specify which column in the attribute table contains
# the cost data
p2 <- problem(sim_pu_polygons, sim_features, cost_column = "cost")
# print problem
print(p2)
We can also initialize a conservation planning problem using tabular
planning unit data (i.e., data.frame
format). Since the
tabular planning unit data does not contain any spatial information, we
also have to provide the feature data in tabular format (i.e.,
data.frame
format) and data showing the amount of each
feature in each planning unit in tabular format (i.e.,
data.frame
format). The feature data must have an “id”
column containing a unique integer identifier for each feature, and the
planning unit by feature data must contain the following three columns:
pu
corresponding to the planning unit identifiers,
species
corresponding to the feature identifiers, and
amount
showing the amount of a given feature in a given
planning unit.
# set file path for feature data
spec_path <- system.file(
"extdata/marxan/input/spec.dat", package = "prioritizr"
)
# load in feature data
spec_dat <- data.table::fread(spec_path, data.table = FALSE)
# print first six rows of the data
# note that it contains extra columns
head(spec_dat)
# set file path for planning unit vs. feature data
puvspr_path <- system.file(
"extdata/marxan/input/puvspr.dat", package = "prioritizr"
)
# load in planning unit vs feature data
puvspr_dat <- data.table::fread(puvspr_path, data.table = FALSE)
# print first six rows of the data
head(puvspr_dat)
# create problem
p3 <- problem(pu_dat, spec_dat, cost_column = "cost", rij = puvspr_dat)
# print problem
print(p3)
For more information on initializing problems, please see the help
page for the problem()
function (which you can open by
entering the code: ?problem
). Now that we have initialized
a conservation planning problem, we will show you how you can customize
it to suit the exact needs of your conservation planning scenario.
Although we initialized the conservation planning problems using several
different methods, moving forward, we will only use raster-based
planning unit data to keep things simple.
The next step is to add a primary objective to the problem. A problem objective is used to specify the primary goal of the problem (i.e., the quantity that is to be maximized or minimized). All conservation planning problems involve minimizing or maximizing some kind of objective. For instance, we might require a solution that conserves enough habitat for each species while minimizing the overall cost of the reserve network. Alternatively, we might require a solution that maximizes the number of conserved species while ensuring that the cost of the reserve network does not exceed the budget. Please note that objectives are added in the same way regardless of the type of data used to initialize the problem. The following objectives are available.
# create a new problem that has the minimum set objective
p3 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective()
# print problem
print(p3)
# create a new problem that has the maximum coverage objective and a budget
# of 5000
p4 <-
problem(sim_pu_raster, sim_features) %>%
add_max_cover_objective(5000)
# print problem
print(p4)
# create a new problem that has the maximum features objective and a budget
# of 5000
p5 <-
problem(sim_pu_raster, sim_features) %>%
add_max_features_objective(budget = 5000)
# print problem
print(p5)
# create a new problem that has the minimum shortfall objective and a budget
# of 5000
p6 <-
problem(sim_pu_raster, sim_features) %>%
add_min_shortfall_objective(budget = 5000)
# print problem
print(p6)
# create a new problem that has the minimum largest shortfall objective and a
# budget of 5000
p7 <-
problem(sim_pu_raster, sim_features) %>%
add_min_largest_shortfall_objective(budget = 5000)
# print problem
print(p7)
sim_phylogny
).# load simulated phylogeny data
sim_phylogeny <- get_sim_phylogeny()
# create a new problem that has the maximum phylogenetic diversity
# objective and a budget of 5000
p8 <-
problem(sim_pu_raster, sim_features) %>%
add_max_phylo_div_objective(budget = 5000, tree = sim_phylogeny)
# print problem
print(p8)
# load simulated phylogeny data
sim_phylogeny <- get_sim_phylogeny()
# create a new problem that has the maximum phylogenetic diversity
# objective and a budget of 5000
p9 <-
problem(sim_pu_raster, sim_features) %>%
add_max_phylo_end_objective(budget = 5000, tree = sim_phylogeny)
# print problem
print(p9)
Most conservation planning problems require targets. Targets are used to specify the minimum amount or proportion of a feature’s distribution that needs to be protected in the solution. For example, we may want to develop a reserve network that will secure 20% of the distribution for each feature for minimal cost. The following methods are available for specifying targets.
# create a problem with targets which specify that the solution must conserve
# a sum total of 3 units of suitable habitat for each feature
p11 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_absolute_targets(3)
# print problem
print(p11)
# create a problem with the minimum set objective and relative targets of 10 %
# for each feature
p12 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1)
# print problem
print(p12)
# create a problem with targets which specify that we need 10 % of the habitat
# for the first feature, 15 % for the second feature, 20 % for the third feature
# 25 % for the fourth feature and 30 % of the habitat for the fifth feature
targets <- c(0.1, 0.15, 0.2, 0.25, 0.3)
p13 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(targets)
# print problem
print(p13)
# create problem with added log-linear targets
p14 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_loglinear_targets(10, 0.9, 100, 0.2)
# print problem
print(p14)
As with the functions for specifying the objective of a problem, if we try adding multiple targets to a problem, only the most recently added set of targets are used.
A constraint can be added to a conservation planning problem to ensure that all solutions exhibit a specific property. For example, they can be used to make sure that all solutions select a specific planning unit or that all selected planning units in the solution follow a certain configuration. The following constraints are available.
# create problem with constraints which specify that the first planning unit
# must be selected in the solution
p15 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_locked_in_constraints(1)
# print problem
print(p15)
# create problem with constraints which specify that the second planning unit
# must not be selected in the solution
p16 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_locked_out_constraints(2)
# print problem
print(p16)
# create problem with constraints which specify that all selected planning units
# in the solution must have at least 1 neighbor
p17 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_neighbor_constraints(1)
# print problem
print(p17)
# create problem with constraints which specify that all selected planning units
# in the solution must form a single contiguous unit
p18 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_contiguity_constraints()
# print problem
print(p18)
add_contiguity_constraints
function, because they ensure that each feature is represented in a
contiguous unit and not that the entire solution should form a
contiguous unit.# create problem with constraints which specify that the planning units used
# to conserve each feature must form a contiguous unit
p19 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_feature_contiguity_constraints()
# print problem
print(p19)
# create problem with constraints which specify that the sum of
# values in sim_features[[1]] among selected planning units must not exceed a
# threshold value of 190.
p20 <-
problem(sim_pu_raster, sim_features) %>%
add_min_shortfall_objective(budget = 1800) %>%
add_relative_targets(0.1) %>%
add_linear_constraints(190, "<=", sim_features[[1]])
# print problem
print(p20)
In particular, The add_locked_in_constraints
and
add_locked_out_constraints
functions are incredibly useful
for real-world conservation planning exercises, so it’s worth pointing
out that there are several ways we can specify which planning units
should be locked in or out of the solutions. If we use raster planning
unit data, we can also use raster data to specify which planning units
should be locked in or locked out.
# load data to lock in or lock out planning units
sim_locked_in_raster <- get_sim_locked_in_raster()
sim_locked_out_raster <- get_sim_locked_out_raster()
# plot the locked data
plot(
c(sim_locked_in_raster, sim_locked_out_raster),
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1),
axes = FALSE, main = c("Locked in", "Locked out")
)
# create a problem using raster planning unit data and use the locked raster
# data to lock in some planning units and lock out some other planning units
p21 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_locked_in_constraints(sim_locked_in_raster) %>%
add_locked_out_constraints(sim_locked_out_raster)
# print problem
print(p21)
If our planning unit data are in a spatial vector format (similar to
the sim_pu_polygons
data) or a tabular format (similar to
pu_dat
), we can use the field names in the data to refer to
which planning units should be locked in and / or out. For example, the
sim_pu_polygons
object has TRUE
/
FALSE
values in the “locked_in” field which indicate which
planning units should be selected in the solution. We could use the data
in this field to specify that those planning units with
TRUE
values should be locked in using the following
methods.
We can also add penalties to a problem to favor or penalize solutions
according to a secondary objective. Unlike the constraint functions,
these functions will add extra information to the objective function of
the optimization function to penalize solutions that do not exhibit
specific characteristics. For example, penalties can be added to a
problem to avoid highly fragmented solutions at the expense of accepting
slightly more expensive solutions. All penalty functions have a
penalty
argument that controls the relative importance of
the secondary penalty function compared to the primary objective
function. It is worth noting that incredibly low or incredibly high
penalty
values – relative to the main objective function –
can cause problems to take a very long time to solve, so when trying out
a range of different penalty values it can be helpful to limit the
solver to run for a set period of time. The following penalties are
available.
# create problem with penalties that penalize fragmented solutions with a
# penalty factor of 0.01
p24 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_boundary_penalties(penalty = 0.01)
# print problem
print(p24)
# create problem with penalties for symmetric connectivity,
# here we will use only the first four layers in
# sim_features for the features and we will use the fifth layer in sim_features
# to represent the connectivity data, where the connectivity_matrix function
# will create a matrix showing the average strength of connectivity between
# adjacent planning units using the data in the fifth layer of sim_features
p25 <-
problem(sim_pu_raster, sim_features[[1:4]]) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_connectivity_penalties(
penalty = 5,
data = connectivity_matrix(sim_pu_raster, sim_features[[5]])
)
# print problem
print(p25)
# create a matrix containing asymmetric connectivity data,
asym_con_matrix <- matrix(
runif(ncell(sim_pu_raster)^2),
nrow = ncell(sim_pu_raster),
ncol = ncell(sim_pu_raster)
)
# create problem with penalties for asymmetric connectivity
p26 <-
problem(sim_pu_raster, sim_features[[1:4]]) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_asym_connectivity_penalties(penalty = 5, data = asym_con_matrix)
# print problem
print(p26)
# create data for penalizing planning units
pen_raster <- simulate_cost(sim_pu_raster)
# create problem with penalties that penalize solutions that select
# planning units with high values in the pen_raster object,
# here we will use a penalty value of 5 to indicate the trade-off (scaling)
# between the penalty values (in the sim_pu_raster) and the main objective
# (i.e., the cost of the solution)
p27 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_linear_penalties(penalty = 5, data = pen_raster)
# print problem
print(p27)
Conservation planning problems involve making decisions on how planning units will be managed. These decisions are then associated with management actions (e.g., turning a planning unit into a protected area). The type of decision describes how the action is applied to planning units. For instance, the default decision-type is a binary decision type, meaning that we are either selecting or not selecting planning units for management. The following decision types are available.
# add binary decisions to a problem
p28 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions()
# print problem
print(p28)
# add proportion decisions to a problem
p29 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_proportion_decisions()
# print problem
print(p29)
Next, after specifying the mathematical formulation that underpins your conservation planning problem, you can specify how the problem should be solved. If you do not specify this information, the prioritizr R package will automatically use the best solver currently installed on your system with some reasonable defaults. We strongly recommend installing the Gurobi software suite and the gurobi R package to solve problems, and for more information on this topic please refer to the Gurobi Installation Guide. The following solvers are available.
# create a problem and specify that Gurobi should be used to solve the problem
# and specify an optimality gap of zero to obtain the optimal solution
p31 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_gurobi_solver(gap = 0)
# print problem
print(p31)
# create a problem and specify that IBM CPLEX should be used to solve the
# problem and specify an optimality gap of zero to obtain the optimal solution
p32 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_cplex_solver(gap = 0)
# print problem
print(p32)
add_rsymphony_solver()
,
add_lpsymphony_solver()
) (see the Solver benchmarks vignette
for details). As such, it is strongly recommended to use this solver if
the Gurobi and IBM CPLEX solvers are not available.
This solver requires the rcbc R package, which is currently
only available on GitHub.# create a problem and specify that CBC should be used to solve the
# problem and specify an optimality gap of zero to obtain the optimal solution
p33 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_cbc_solver(gap = 0)
# print problem
print(p33)
add_cbc_solver()
) for particular problems and is generally
faster than the SYMPHONY based solvers (i.e.,
add_rsymphony_solver()
,
add_lpsymphony_solver()
), it can sometimes take much longer
than the CBC solver for particular problems. This solver is
recommended if the add_gurobi_solver()
,
add_cplex_solver()
, add_cbc_solver()
cannot be
used.# create a problem and specify that HiGHS should be used to solve the
# problem and specify an optimality gap of zero to obtain the optimal solution
p34 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_highs_solver(gap = 0)
# print problem
print(p34)
# create a problem and specify that lpsymphony should be used to solve the
# problem and specify an optimality gap of zero to obtain the optimal solution
p35 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_lpsymphony_solver(gap = 0)
# print problem
print(p35)
# create a problem and specify that Rsymphony should be used to solve the
# problem and specify an optimality gap of zero to obtain the optimal solution
p36 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_rsymphony_solver(gap = 0)
# print problem
print(p36)
Many conservation planning exercises require a portfolio of solutions. For example, real-world exercises can involve presenting decision makers with a range of near-optimal decisions. Additionally, the number of times that different planning units are selected in different solutions can provide insight into their relative importance. The following portfolio methods are available.
# create a problem and specify that a portfolio should be created by
# finding five solutions within 10% of optimality
p37 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_gap_portfolio(number_solutions = 5, pool_gap = 0.2)
# print problem
print(p37)
# create a problem and specify that a portfolio should be created using
# the top five solutions
p38 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_top_portfolio(number_solutions = 5)
# print problem
print(p38)
# create a problem and specify that a portfolio should be created using
# extra solutions found while solving the problem
p39 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_extra_portfolio()
# print problem
print(p39)
# create a problem and specify that a portfolio containing 10 solutions
# should be created using using Bender's cuts
p40 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_cuts_portfolio(number_solutions = 10)
# print problem
print(p40)
# create a problem and specify a portfolio should be created that contains
# 10 solutions and that any duplicate solutions should not be removed
p41 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions() %>%
add_shuffle_portfolio(number_solutions = 10, remove_duplicates = FALSE)
# print problem
print(p41)
After formulating our conservation planning problem and specifying
how the problem should be solved, we can use the solve()
function to obtain a solution. Note that the solver will typically print
out some information describing the size of the problem and report its
progress when searching for a suitable solution.
# formulate the problem
p42 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_boundary_penalties(penalty = 500, edge_factor = 0.5) %>%
add_binary_decisions()
# print problem
print(p42)
# solve the problem (using the default solver)
s42 <- solve(p42)
# plot solution
plot(
s42, col = c("grey90", "darkgreen"), main = "Solution",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
We can plot this solution because the planning unit input data are
spatially referenced in a raster format. The output format will always
match the planning unit data used to initialize the problem. For
example, the solution to a problem with planning units in a spatial
vector (shapefile) format would also be in a spatial vector format.
Similarly, if the planning units were in a tabular format (i.e.,
data.frame
), the solution would also be returned in a
tabular format.
We can also extract attributes from the solution that describe the quality of the solution and the optimization process.
Conservation planning involves making trade-offs between different criteria (e.g., overall cost, feature representation, connectivity). After obtaining a solution to a conservation planning problem, it is important to evaluate it to help understand the trade-offs made by the prioritization. This is also useful to compare different solutions with each other.
Summary statistics can be computed to evaluate the overall performance of a solution based on certain criteria. The following summaries can be computed.
The following functions are available to summarize a solution:
# create symmetric connectivity data
# here we parametrize connectivity based on which planning units are
# adjacent to each other
cm <- adjacency_matrix(sim_pu_raster)
# calculate statistic
eval_connectivity_summary(p42, s42, data = cm)
Conservation plans can take a long time to implement. Since funding availability and habitat quality can decline over time, it is critical that the most important places in a prioritization are scheduled for management as early as possible. For instance, some planning units in a solution might contain many rare species which do not occur in any other planning units. Alternatively, some planning units might offer an especially high return on investment that reduces costs considerably. As a consequence, conservation planners often need information on which planning units selected in a prioritization are most important to the overall success of the prioritization. To achieve this, conservation planners can use importance scores for each planning unit selected by a solution.
Let’s generate a prioritization so that can compare the different importance methods.
# formulate the problem
p43 <-
problem(sim_pu_raster, sim_features) %>%
add_min_set_objective() %>%
add_relative_targets(0.1) %>%
add_binary_decisions()
# solve the problem
s43 <- solve(p43)
# plot solution
plot(
s43, col = c("grey90", "darkgreen"), main = "Solution",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
The following methods are available for computing importance scores.
# calculate replacement cost scores and make the solver quiet
rc43 <-
p43 %>%
add_default_solver(gap = 0, verbose = FALSE) %>%
eval_replacement_importance(s43)
# plot replacement cost scores
plot(
rc43, main = "Replacement cost scores",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
# calculate Ferrier scores and extract total score
fs43 <- eval_ferrier_importance(p43, s43)[["total"]]
# plot Ferrier scores
plot(
fs43, main = "Ferrier scores",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
# calculate rarity weighted richness scores
rwr43 <- eval_rare_richness_importance(p43, s43)
# plot rarity weighted richness scores
plot(
rwr43, main = "Rarity weighted richness scores",
xlim = c(-0.1, 1.1), ylim = c(-0.1, 1.1), axes = FALSE
)
In general, we recommend using replacement cost scores for small and moderate sized problems (e.g., less than 30,000 planning units) when it is feasible to do so. It can take a very long time to compute replacement cost scores, and so it is simply not feasible to compute these scores for particularly large problems. For moderate and large sized problems (e.g., more than 30,000 planning units), we recommend using the Ferrier method. This is because it explicitly accounts for representation targets, unlike the rarity weighted richness scores. We almost never recommend using the rarity weighted richness scores. This is because they do not consider criteria needed to inform conservation decision making (Brown et al. 2015).
Although we encourage users to build and tailor conservation planning
problems to suit their own needs, sometimes it just simply easier to use
something you’re already familiar with. The
marxan_problem()
function is provided as a convenient
wrapper for building and solving Marxan-style conservation
problems. If users already have their conservation planning data
formatted for use with Marxan, this function can also read
Marxan data files and solve the Marxan-style problems
using exact algorithm solvers. Please note that problems built
using the marxan_problem()
function are still solved the
same way as a problem initialized using the problem()
function, and therefore still require the installation of one of the
solver packages.
Here is a short example showing how the marxan_problem()
function can be used to read Marxan input files and the
solve
function can be used to solve the problem.
# set file path for Marxan input file
minput <- system.file("extdata/marxan/input.dat", package = "prioritizr")
# read Marxan input file
mp <- marxan_problem(minput)
# print problem
print(mp)
# solve the problem
ms <- solve(mp)
# since the Marxan data was in a tabular format, the solution is also returned
# in a tabular format, so we will print the first six rows of the table
# containing the solution
head(ms)
Alternatively, rather then using a Marxan input file to
construct the problem, we can manually read in the Marxan data
files and input these to the marxan_problem()
function.
# load data
pu <-
system.file("extdata/marxan/input/pu.dat", package = "prioritizr") %>%
read.table(sep = ",", header = TRUE)
features <-
system.file("extdata/marxan/input/spec.dat", package = "prioritizr") %>%
read.table(sep = ",", header = TRUE)
bound <-
system.file("extdata/marxan/input/bound.dat", package = "prioritizr") %>%
read.table(sep = "\t", header = TRUE)
rij <-
system.file("extdata/marxan/input/puvspr.dat", package = "prioritizr") %>%
read.table(sep = ",", header = TRUE)
# build Marxan problem using data.frame objects
mp2 <- marxan_problem(
x = pu, spec = features, puvspr = rij, bound = bound, blm = 0
)
# print problem
print(mp2)
Conservation planning problems that are built using the
marxan_problem()
function can also be customized. For
example, we could change the decision type for mp2
to
involve selecting a proportion of each planing unit (using the
add_proportion_decisions()
function).
Hopefully, this vignette has provided an informative overview of the prioritizr R package. For more examples using the package, please see the other vignettes. Perhaps, one of the best ways to learn a new piece of software is to just try it out. Test it, try breaking it, make mistakes, and learn from them. We would recommend trying to build conservation planning problems that resemble those you face in your own work—but using the built-in example data sets. This way you can quickly verify that the problems you build actually mean what you think they mean. For instance, you can try playing around with the targets and see what effect they have on the solutions, or try playing around with penalties and see what effect they have on the solutions. Finally, if you have any questions about using the package or suggestions for it, please post an issue on the package’s online coding repository.