library(earthdatalogin)
library(lubridate)
library(terra)
Subset and Plot
- How to crop a single data file
- How to create a data cube with
terra
- How to crop a data cube to a box
Summary
In this example, we will utilize the earthdatalogin
R package to retrieve, subset, and crop sea surface temperature data as a file and as a datacube from NASA Earthdata search. The earthdatalogin
R package simplifies the process of discovering and accessing NASA Earth science data. We manipulate with the terra package. terra is a very versatile package and we are just scratching the surface. Read more about terra here.
For more on earthdatalogin
visit the earthdatalogin
GitHub page and/or the earthdatalogin
documentation site. Be aware that earthdatalogin
is under active development and that we are using the development version on GitHub.
Prerequisites
The tutorial can be run with the guest Earthdata Login that is in earthdatalogin
. However, if you will be using the NASA Earthdata portal more regularly, please register for an Earthdata Login account. Please https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. This account is free to create and only takes a moment to set up.
See the 1-earthdatalogin tutorial for set-up if you are running the tutorial locally.
Load Required Packages
Authenticate to NASA Earthdata.
::edl_netrc() earthdatalogin
Get a vector of urls to our nc files
Get the urls. The results
object is a vector of urls pointing to our netCDF
files in the cloud. Each netCDF
file is circa 670Mb.
<- 'MUR-JPL-L4-GLOB-v4.1'
short_name <- c(xmin=-75.5, ymin=33.5, xmax=-73.5, ymax=35.5)
bbox <- c("2020-01-16", "2020-12-16")
tbox
<- earthdatalogin::edl_search(
results short_name = short_name,
version = "4.1",
temporal = tbox,
bounding_box = paste(bbox,collapse=",")
)length(results)
[1] 336
1:3] results[
[1] "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20200116090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
[2] "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20200117090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
[3] "https://archive.podaac.earthdata.nasa.gov/podaac-ops-cumulus-protected/MUR-JPL-L4-GLOB-v4.1/20200118090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc"
Crop and plot one netCDF file
Each MUR SST netCDF
file is large so I do not want to download. Instead I will use terra::rast()
to do subset the data on the server side. vsi = TRUE
is letting function know that these are files in the cloud and to use GDAL functionality for that type of resource.
<- terra::rast(results[1], vsi=TRUE) ras
Getting errors? Scroll below to the troubleshooting section.
Crop to a very small region. Note order of terms is different than in bbox!!
<- terra::ext(c(xmin=-75.5, xmax=-73.5, ymin=33.5, ymax=35.5 ))
e <- terra::crop(ras, e)
rc rc
class : SpatRaster
dimensions : 200, 200, 6 (nrow, ncol, nlyr)
resolution : 0.01, 0.01 (x, y)
extent : -75.495, -73.495, 33.505, 35.505 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source(s) : memory
names : analysed_sst, analy~error, mask, sea_i~ction, dt_1km_data, sst_anomaly
min values : 290.656, 0.37, 1, 1.28, 128, 0.410
max values : 298.954, 0.40, 2, 1.28, 128, 3.585
time : 2020-01-16 09:00:00 UTC
Plot:
plot(rc[[c(1, 2)]])
Crop and plot multiple netCDF files
We can send multiple urls to terra
.
<- terra::rast(results[c(1:4)], vsi = TRUE)
ras_all ras_all
class : SpatRaster
dimensions : 17999, 36000, 24 (nrow, ncol, nlyr)
resolution : 0.01, 0.01 (x, y)
extent : -179.995, 180.005, -89.995, 89.995 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
sources : 20200116090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc:analysed_sst
20200116090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc:analysis_error
20200116090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc:mask
... and 21 more source(s)
varnames : analysed_sst (analysed sea surface temperature)
analysis_error (estimated error standard deviation of analysed_sst)
mask (sea/land field composite mask)
...
names : analysed_sst, analy~error, mask, sea_i~ction, dt_1km_data, sst_anomaly, ...
unit : kelvin, kelvin, , , hours, kelvin, ...
time : 2020-01-16 09:00:00 to 2020-01-19 09:00:00 UTC
Crop to a small extent. Note order of terms is different than in bbox! Since we will only plot sst for this example, it is faster to first select our variable of interest.
<- terra::ext(c(xmin=-75.5, xmax=-73.5, ymin=33.5, ymax=35.5 ))
e <- ras_all["analysed_sst",]
ras_sst <- terra::crop(ras_sst, e)
rc_sst rc_sst
class : SpatRaster
dimensions : 200, 200, 4 (nrow, ncol, nlyr)
resolution : 0.01, 0.01 (x, y)
extent : -75.495, -73.495, 33.505, 35.505 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84
source(s) : memory
varname : analysed_sst (analysed sea surface temperature)
names : analysed_sst, analysed_sst, analysed_sst, analysed_sst
min values : 290.656, 290.316, 289.456, 289.216
max values : 298.954, 298.869, 298.662, 298.119
unit : kelvin, kelvin, kelvin, kelvin
time : 2020-01-16 09:00:00 to 2020-01-19 09:00:00 UTC
Convert Kelvin to Celsius.
<- rc_sst - 273.15 rc_sst
Now plot. We will set the range so it is the same across plots and clean up the titles to be just day without time.
<- terra::time(x = rc_sst) |> lubridate::date() |> as.character()
titles plot(rc_sst,
range = c(16, 26),
main = titles)
Conclusions
Some really cool things just happened here! You connected to multiple remote-sensing files (netCDF
) in the cloud and worked with them without directly downloading them.