SDM lab 1 - Downloading species data with spocc

First, tell R where you are in the file structure. Before running this code, create the folder specified by dir_data if it does not exist already.

here::i_am("r-tutorials/SDM-lab-spocc.qmd")
here() starts at /Users/eli.holmes/Documents/GitHub/NOAAHackDays
dir_data <- here::here("r-tutorials", "data")
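here::here() only builds the path; it does not create the folder. A minimal base-R sketch of creating it. The tempdir() fallback is a stand-in (not part of the original lab) so that the snippet also runs outside the project:

```r
# Hypothetical fallback path, used only if dir_data was not defined above
if (!exists("dir_data")) dir_data <- file.path(tempdir(), "r-tutorials", "data")

# Create the folder (and any missing parent folders) only if it is not there yet
if (!dir.exists(dir_data)) {
  dir.create(dir_data, recursive = TRUE)
}
```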

Set the file paths.

# data as csv file
obs_csv <- file.path(dir_data, "obs.csv")
# data as geojson
obs_geo <- file.path(dir_data, "obs.geojson")

Load libraries.

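# library() stops with an error if a package is missing,
# whereas require() only warns and returns FALSE
library(sf)
library(spocc)
library(knitr)
library(dplyr)
library(readr)
library(mapview)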

spocc R package

The spocc R package allows us to query species occurrence data from a variety of sources. From the package description:

spocc: A programmatic interface to many species occurrence data sources, including GBIF, iNaturalist, Berkeley Ecoinformatics Engine, eBird, iDigBio, VertNet, OBIS, and ALA. Includes functionality for retrieving species occurrence data, and combining that data.

Brown-throated sloth

The brown-throated sloth (Bradypus variegatus) is a classic example species for species distribution modeling. It is found in Central and South America.

if (!file.exists(obs_csv)) {
  # get species occurrence data from GBIF, keeping only records with coordinates
  # (occ() returns at most 500 records per source by default)
  res <- spocc::occ(
    query = "Bradypus variegatus",
    from = "gbif", has_coords = TRUE)

  # extract data frame from result
  df <- res$gbif$data[[1]]
  # write data to a file
  readr::write_csv(df, obs_csv)
} else {
  # reuse the cached copy instead of querying GBIF again
  df <- readr::read_csv(obs_csv)
}
Rows: 500 Columns: 83
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (55): name, issues, prov, scientificName, datasetKey, publishingOrgKey,...
dbl  (21): longitude, latitude, key, crawlId, taxonKey, kingdomKey, phylumKe...
lgl   (1): isInCluster
dttm  (5): lastCrawled, lastParsed, dateIdentified, modified, lastInterpreted
date  (1): eventDate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Look at the objects

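Type names(res) and then names(res$gbif). Then type df to see what is in the data frame. Note that res exists only if the data were downloaded in this session; if obs.csv was read from disk instead, only df is available.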
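Printing a data frame with 83 columns can be hard to read. One option is to list the column names and then show only a few columns. A small sketch, assuming dplyr is installed. df_demo is a hypothetical stand-in built from two of the records in this lab so that the snippet runs on its own. With the real data, use df in place of df_demo:

```r
library(dplyr)

# Stand-in for the downloaded data frame (values copied from two records in this lab)
df_demo <- data.frame(
  name      = "Bradypus variegatus Schinz, 1825",
  longitude = c(-79.54650, -84.68933),
  latitude  = c(8.987403, 10.519427),
  prov      = "gbif",
  country   = c("Panama", "Costa Rica")
)

names(df_demo)  # column names
dim(df_demo)    # number of rows and columns

# show only a few columns of the first rows
df_demo %>%
  select(name, longitude, latitude, country) %>%
  head()
```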

Make a table of the first few lines of the dataframe.

knitr::kable(df %>% head)
name longitude latitude issues prov key scientificName datasetKey publishingOrgKey installationKey hostingOrganizationKey publishingCountry protocol lastCrawled lastParsed crawlId basisOfRecord occurrenceStatus taxonKey kingdomKey phylumKey classKey orderKey familyKey genusKey speciesKey acceptedTaxonKey acceptedScientificName kingdom phylum order family genus species genericName specificEpithet taxonRank taxonomicStatus iucnRedListCategory dateIdentified coordinateUncertaintyInMeters continent stateProvince year month day eventDate modified lastInterpreted references license isInCluster datasetName recordedBy identifiedBy geodeticDatum class countryCode country rightsHolder identifier http://unknown.org/nick verbatimEventDate collectionCode gbifID verbatimLocality occurrenceID taxonID catalogNumber institutionCode eventTime http://unknown.org/captive identificationID occurrenceRemarks sex lifeStage individualCount vernacularName locality higherClassification informationWithheld infraspecificEpithet identificationRemarks
Bradypus variegatus Schinz, 1825 -79.54650 8.987403 cdc,cdround gbif 4011797263 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:22:12 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-02 21:10:21 61 NORTH_AMERICA Panamá 2023 1 2 2023-01-02 2023-03-09 20:50:19 2023-08-18 13:22:12 https://www.inaturalist.org/observations/145694886 http://creativecommons.org/licenses/by/4.0/legalcode FALSE iNaturalist research-grade observations Kai Squires Kai Squires WGS84 Mammalia PA Panama Kai Squires 145694886 squiresk 2023/01/02 9:49 AM Observations 4011797263 Panama City, Panama https://www.inaturalist.org/observations/145694886 47067 145694886 iNaturalist 09:49:00-05:00 wild 324468453 NA NA NA NA NA NA NA NA NA NA
Bradypus variegatus Schinz, 1825 -84.68933 10.519427 cdc,cdround gbif 4091593493 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:41:19 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-31 13:22:05 4 NORTH_AMERICA Alajuela 2023 1 2 2023-01-02 2023-04-13 01:01:22 2023-08-18 13:41:19 https://www.inaturalist.org/observations/145696593 http://creativecommons.org/licenses/by-nc/4.0/legalcode FALSE iNaturalist research-grade observations Jessica Rae Wren WGS84 Mammalia CR Costa Rica Jessica Rae 145696593 altoidsboi 2023/01/02 3:54 PM Observations 4091593493 Alajuela Province, San Carlos, Costa Rica https://www.inaturalist.org/observations/145696593 47067 145696593 iNaturalist 15:54:00-06:00 wild 330896608 NA NA NA NA NA NA NA NA NA NA
Bradypus variegatus Schinz, 1825 -82.15604 9.316584 cdc,cdround gbif 4028722131 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:41:57 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-03 18:24:03 75 NORTH_AMERICA Bocas del Toro 2023 1 3 2023-01-03 2023-02-02 01:16:10 2023-08-18 13:41:57 https://www.inaturalist.org/observations/145767431 http://creativecommons.org/licenses/by-nc/4.0/legalcode FALSE iNaturalist research-grade observations lucybrown19 lucybrown19 WGS84 Mammalia PA Panama lucybrown19 145767431 lucybrown19 Tue Jan 03 2023 11:36:24 GMT -0500 (EST) Observations 4028722131 Bocas del Toro, PA-BC, PA https://www.inaturalist.org/observations/145767431 47067 145767431 iNaturalist 11:36:24-05:00 wild 324686542 NA NA NA NA NA NA NA NA NA NA
Bradypus variegatus Schinz, 1825 -84.06505 10.449010 cdc,cdround gbif 4011771409 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:41:03 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-03 19:44:25 5 NORTH_AMERICA Heredia 2023 1 3 2023-01-03 2023-01-16 15:12:42 2023-08-18 13:41:03 https://www.inaturalist.org/observations/145772860 http://creativecommons.org/licenses/by/4.0/legalcode FALSE iNaturalist research-grade observations Chris Harrison Chris Harrison WGS84 Mammalia CR Costa Rica Chris Harrison 145772860 sandboa 2023-01-03 11:57:43 Observations 4011771409 Sarapiquí, CR-HE, CR https://www.inaturalist.org/observations/145772860 47067 145772860 iNaturalist 11:57:43-06:00 wild 324705982 NA NA NA NA NA NA NA NA NA NA
Bradypus variegatus Schinz, 1825 -79.64967 9.061424 cdc,cdround gbif 4014910775 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:21:15 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-04 02:04:32 365 NORTH_AMERICA Panamá 2023 1 3 2023-01-03 2023-03-09 20:50:14 2023-08-18 13:21:15 https://www.inaturalist.org/observations/145797456 http://creativecommons.org/licenses/by-nc/4.0/legalcode FALSE iNaturalist research-grade observations Matt Cohen and Elizabeth Hargrave Matt Cohen and Elizabeth Hargrave WGS84 Mammalia PA Panama Matt Cohen and Elizabeth Hargrave 145797456 mattandeliz 2023/01/03 9:02 AM Observations 4014910775 Panama https://www.inaturalist.org/observations/145797456 47067 145797456 iNaturalist 09:02:00-05:00 wild 324778216 NA NA NA NA NA NA NA NA NA NA
Bradypus variegatus Schinz, 1825 -80.14370 8.602112 cdc,cdround gbif 4015257923 Bradypus variegatus Schinz, 1825 50c9509d-22c7-4a22-a47d-8c48425ef4a7 28eb1a3f-1c15-4a95-931a-4af90ecb574d 997448a8-f762-11e1-a439-00145eb45e9a 28eb1a3f-1c15-4a95-931a-4af90ecb574d US DWC_ARCHIVE 2023-08-17 14:21:14 2023-08-18 13:40:25 390 HUMAN_OBSERVATION PRESENT 2436353 1 44 359 1494 9418 2436350 2436353 2436353 Bradypus variegatus Schinz, 1825 Animalia Chordata Pilosa Bradypodidae Bradypus Bradypus variegatus Bradypus variegatus SPECIES ACCEPTED LC 2023-01-05 00:14:33 171 NORTH_AMERICA Coclé 2023 1 4 2023-01-04 2023-03-09 20:50:13 2023-08-18 13:40:25 https://www.inaturalist.org/observations/145862390 http://creativecommons.org/licenses/by-nc/4.0/legalcode FALSE iNaturalist research-grade observations Matt Cohen and Elizabeth Hargrave Matt Cohen and Elizabeth Hargrave WGS84 Mammalia PA Panama Matt Cohen and Elizabeth Hargrave 145862390 mattandeliz 2023/01/04 2:49 PM Observations 4015257923 Anton Valley, Panama https://www.inaturalist.org/observations/145862390 47067 145862390 iNaturalist 14:49:00-05:00 wild 324981107 NA NA NA NA NA NA NA NA NA NA

Convert the longitude and latitude columns of the data frame into observation points. This step uses the sf R package to turn the data frame into a geospatial object, so that the mapping functions can plot the points easily.

obs <- df %>% 
  sf::st_as_sf(
    coords = c("longitude", "latitude"), # columns holding x and y
    crs = sf::st_crs(4326)) %>% # coordinate reference system: EPSG:4326 (WGS84 lon/lat)
  dplyr::select(prov, key) # keep only two columns to save space (optional)
# save the file
sf::write_sf(obs, obs_geo, delete_dsn=TRUE)
nrow(obs) # number of rows
[1] 500
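
To confirm that a saved GeoJSON file can be read back, the round trip can be checked as sketched here. This assumes the sf package with GeoJSON support. The points and the temporary file name are stand-ins (not from the original lab) so that the snippet runs on its own. With the real data, read obs_geo instead:

```r
library(sf)

# Stand-in points (values copied from two records in this lab)
pts_demo <- sf::st_as_sf(
  data.frame(
    prov      = "gbif",
    longitude = c(-79.54650, -84.68933),
    latitude  = c(8.987403, 10.519427)
  ),
  coords = c("longitude", "latitude"),
  crs = 4326
)

# Write to a temporary GeoJSON file, then read it back
tmp_geo <- file.path(tempdir(), "obs_demo.geojson")
sf::write_sf(pts_demo, tmp_geo, delete_dsn = TRUE)
obs_check <- sf::read_sf(tmp_geo)

nrow(obs_check) == nrow(pts_demo)  # same number of points?
sf::st_crs(obs_check)              # coordinate reference system of the file
```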

The format of a bounding box is [min-longitude, min-latitude, max-longitude, max-latitude].
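
In sf, sf::st_bbox() reports the extent of a layer in this same order (xmin, ymin, xmax, ymax). A brief sketch, assuming the sf package. The three points are copied from records in this lab and stand in for obs:

```r
library(sf)

# Stand-in points (longitude, latitude) copied from three records in this lab
pts <- data.frame(
  longitude = c(-79.54650, -84.68933, -80.14370),
  latitude  = c(8.987403, 10.519427, 8.602112)
)
obs_demo <- sf::st_as_sf(pts, coords = c("longitude", "latitude"), crs = 4326)

# Bounding box: xmin (min longitude), ymin (min latitude), xmax, ymax
bb <- sf::st_bbox(obs_demo)
bb
```

With the real data, sf::st_bbox(obs) gives the extent of all downloaded points. As far as I know, spocc::occ() also accepts a bounding box in this format through its geometry argument to restrict a query to a region.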

Plot using mapview.

mapview::mapview(obs, col.regions = "gray")