Joseph Holler's Open Source GIScience Resources at Middlebury College
: In this lesson, we will learn methods to quantitatively assess Choropleth map agreement and accuracy.
Friday, April 23
Prior to Friday’s class, you should have:
Today, we’ll investigate the question of how to compare choropleth maps for agreement between the maps or for assessment of accuracy.
Let’s connect back to concepts in the Longley et al chapter on Uncertainty as well as E Tate’s (2013) Uncertainty Analysis for a Social Vulnerability Index
What about ordinal data, which is every choropleth map?
Application Program Interface! Examples:
If an API is useful for data science, it’s common to find an R package allowing you to connect to that API
Additionally, if a data course is useful for data science, it’s common to find an R package specific to loading that data source. For example:
or_fig4 = # load original figure 4 data
read_sf(here("data", "derived", "public", "georeferencing.gpkg"),
layer="ta_resilience") %>%
# load ta_resilience layer from georeferencing geopackage
st_drop_geometry() %>%
# remove the geometry data because two geometries cannot be joined
select(c(ID_2,resilience)) %>%
# select only the ID_2 and resilience columns
na.omit()
# remove records with null values
rp_fig4 = ta_2010 %>% # prepare our reproduction of figure 4 data
select(c(ID_2,capacity_2010)) %>%
# select only the ID_2 and resilience columns
# note: geometry columns are 'sticky' -- only way to remove is st_drop_geometry()
na.omit() %>%
# remove records with null values
mutate(rp_res = case_when(
capacity_2010 <= ta_brks[2] ~ 1,
capacity_2010 <= ta_brks[3] ~ 2,
capacity_2010 <= ta_brks[4] ~ 3,
capacity_2010 > ta_brks[4] ~ 4
))
# code the capacity scores as integers, as we see them classified on the map.
#ta_brks was the result of a Jenks classification, as noted on Malcomb et al's maps
fig4compare = inner_join(rp_fig4,or_fig4,by="ID_2") %>%
#inner join on field ID_2 keeps only matching records
filter(rp_res>0 & rp_res<5 & resilience > 0 & resilience < 5)
# keep only records with valid resilience scores
table(fig4compare$resilience,fig4compare$rp_res)
# crosstabulation with frequencies
cor.test(fig4compare$resilience,fig4compare$rp_res,method="spearman")
# Spearman's Rho correlation test
fig4compare = mutate(fig4compare, difference = rp_res - resilience)
# Calculate difference between the maps so that you can create a difference map
orfig5vect =
read_sf(here("data", "derived", "public", "georeferencing.gpkg"),
layer="vulnerability_grid")
# load original figure 5 data
orfig5rast = st_rasterize(orfig5vect["bmean"], template=ta_final)
# convert mean of blue values into a raster using ta_final as a reference for raster
# extent, cell size, CRS, etc.
orfig5rast = orfig5rast %>%
mutate(or = 1-
(bmean - min(orfig5rast[[1]], na.rm= TRUE)) /
(max(orfig5rast[[1]], na.rm= TRUE) -
min(orfig5rast[[1]], na.rm= TRUE)
)
) # or is Re-scaled from 0 to 1 with (value - min)/(max - min)
# it is also inverted, because higher blue values are less red
ta_final = ta_final %>%
mutate(rp =
(capacity_2010 - min(ta_final[[1]], na.rm= TRUE)) /
(max(ta_final[[1]], na.rm= TRUE) -
min(ta_final[[1]], na.rm= TRUE)
)
) # rp is Re-scaled from 0 to 1 with (value - min)/(max - min)
fig5comp = c( select(ta_final,"rp"), select(orfig5rast,"or"))
# combine the original (or) fig 5 and reproduced (rp) fig 5
fig5comp = fig5comp %>% mutate( diff = rp - or )
# calculate difference between the original and reproduction,
# for purposes of mapping
fig5comppts = st_as_sf(fig5comp)
# convert raster to vector points to simplify plotting and correlation testing
plot(fig5comppts$or, fig5comppts$rp, xlab="Original Study", ylab="Reproduction")
# create scatterplot of original results and reproduction results
cor.test(fig5comppts$or, fig5comppts$rp, method="spearman")
# Spearman's Rho correlation test
# Hint for mapping raster results: refer to the diff raster attribute
# in the fig5comp stars object like this: fig5comp["diff"]
The code blocks above will help you to integrate your georeferenced results from QGIS with your final results from the study in R. Of course, you may have to modify the file names, layer names, and column names to match the data as you created it in QGIS and R.
To finalize the study, make the following figures:
You can save all of these figures as .png
files into your repository’s results
folder using R code.
One great thing about doing cartography in R is that you can repurpose code from one map to the next!