{
"cells": [
{
"cell_type": "raw",
"id": "bcf23fc8-ae63-4efe-84bb-c67acdffb973",
"metadata": {},
"source": [
"---\n",
"title: \"Data subsetting and plotting with earthaccess and xarray\"\n",
"author: Luis Lopez (NASA) and adapted by Eli Holmes (NOAA)\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "7cf2da01-6c4d-49c6-8728-7939a5c982ce",
"metadata": {},
"source": [
"[][colab-link]\n",
"\n",
" \n",
" [][download-link]\n",
"\n",
"[download-link]: https://nmfs-opensci.github.io/NMFSHackDays-2025/topics-2025/2025-02-14-earthdata/2-subset-and-plot.ipynb\n",
"[colab-link]: https://colab.research.google.com/github/nmfs-opensci/nmfshackdays-2025/blob/main/topics-2025/2025-02-14-earthdata/2-subset-and-plot.ipynb\n",
"[jupyter-link]: https://nmfs-openscapes.2i2c.cloud/hub/user-redirect/lab?fromURL=https://raw.githubusercontent.com/nmfs-opensci/nmfshackdays-2025/main/topics-2025/2025-02-14-earthdata/2-subset-and-plot.ipynb"
]
},
{
"cell_type": "markdown",
"id": "45bbcecf-537b-40a7-92a3-e040f2c64c2f",
"metadata": {},
"source": [
">📘 Learning Objectives\n",
">\n",
 1. How to crop">
"> 1. How to crop a single data file\n",
"> 2. How to create a data cube (`Dataset`) with `xarray`\n",
"> 3. How to extract variables, temporal slices, and spatial slices from an `xarray` dataset\n",
">\n"
]
},
{
"cell_type": "markdown",
"id": "a28d9430-1a3e-480c-bf15-c35f938b4210",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"In this examples we will use the [xarray](https://xarray.dev/) and [earthaccess](https://nsidc.github.io/earthaccess/) to subset data and make figures.\n",
"\n",
"For this tutorial we will use the [GHRSST Level 4 MUR Global Foundation Sea Surface Temperature Analysis](https://cmr.earthdata.nasa.gov/search/concepts/C1996881146-POCLOUD.html) (v4.1) data. This is much higher resolution data than the AVHRR data and we will do spatially subsetting to a small area of interest.\n",
"\n",
"#### For those not working in the JupyterHub\n",
"\n",
"Create a code cell and run `pip install earthaccess`"
]
},
{
"cell_type": "markdown",
"id": "66d78efc-2d62-428a-a813-e58f949ee1bf",
"metadata": {},
"source": [
"### Import Required Packages"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f04a653d-b9e5-4cfe-a198-b5b612389742",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Suppress warnings\n",
"import warnings\n",
"warnings.simplefilter('ignore')\n",
"warnings.filterwarnings('ignore')\n",
"from pprint import pprint\n",
"\n",
"import earthaccess\n",
"import xarray as xr"
]
},
{
"cell_type": "markdown",
"id": "442bd92a-8f2d-4448-a59e-da4567710730",
"metadata": {},
"source": [
"## Authenticate to NASA Earthdata\n",
"\n",
"We will authenticate our Earthaccess session, and then open the results like we did in the Search & Discovery section."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0fe0002f-c759-4611-8dd7-861b8bd38971",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"auth = earthaccess.login()\n",
"# are we authenticated?\n",
"if not auth.authenticated:\n",
" # ask for credentials and persist them in a .netrc file\n",
" auth.login(strategy=\"interactive\", persist=True)"
]
},
{
"cell_type": "markdown",
"id": "a6a3cb10-6988-401e-a618-59e2f5ac3228",
"metadata": {},
"source": [
"## Get a vector of urls to our nc files"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "3dbe9828-37e9-4949-846f-297057e5b0d5",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"93"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"short_name = 'MUR-JPL-L4-GLOB-v4.1'\n",
"version = \"4.1\"\n",
"date_start = \"2020-01-01\"\n",
"date_end = \"2020-04-01\"\n",
"date_range = (date_start, date_end)\n",
"# min lon, min lat, max lon, max lat\n",
"bbox = (-75.5, 33.5, -73.5, 35.5) \n",
"\n",
"results = earthaccess.search_data(\n",
" short_name = short_name,\n",
" version = version,\n",
" cloud_hosted = True,\n",
" temporal = date_range,\n",
" bounding_box = bbox,\n",
")\n",
"len(results)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "70973325-f862-4ad2-932f-46a4b9c24217",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
Data: 20200101090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.nc
\n", "Size: 679.04 MB
\n", "Cloud Hosted: True
\n", "<xarray.Dataset> Size: 29GB\n", "Dimensions: (time: 1, lat: 17999, lon: 36000)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2020-01-01T09:00:00\n", " * lat (lat) float32 72kB -89.99 -89.98 -89.97 ... 89.98 89.99\n", " * lon (lon) float32 144kB -180.0 -180.0 -180.0 ... 180.0 180.0\n", "Data variables:\n", " analysed_sst (time, lat, lon) float64 5GB ...\n", " analysis_error (time, lat, lon) float64 5GB ...\n", " mask (time, lat, lon) float32 3GB ...\n", " sea_ice_fraction (time, lat, lon) float64 5GB ...\n", " dt_1km_data (time, lat, lon) timedelta64[ns] 5GB ...\n", " sst_anomaly (time, lat, lon) float64 5GB ...\n", "Attributes: (12/47)\n", " Conventions: CF-1.7\n", " title: Daily MUR SST, Final product\n", " summary: A merged, multi-sensor L4 Foundation SST anal...\n", " references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...\n", " institution: Jet Propulsion Laboratory\n", " history: created at nominal 4-day latency; replaced nr...\n", " ... ...\n", " project: NASA Making Earth Science Data Records for Us...\n", " publisher_name: GHRSST Project Office\n", " publisher_url: http://www.ghrsst.org\n", " publisher_email: ghrsst-po@nceo.ac.uk\n", " processing_level: L4\n", " cdm_data_type: grid
<xarray.Dataset> Size: 2MB\n", "Dimensions: (time: 1, lat: 201, lon: 201)\n", "Coordinates:\n", " * time (time) datetime64[ns] 8B 2020-01-01T09:00:00\n", " * lat (lat) float32 804B 33.5 33.51 33.52 ... 35.48 35.49 35.5\n", " * lon (lon) float32 804B -75.5 -75.49 -75.48 ... -73.51 -73.5\n", "Data variables:\n", " analysed_sst (time, lat, lon) float64 323kB ...\n", " analysis_error (time, lat, lon) float64 323kB ...\n", " mask (time, lat, lon) float32 162kB ...\n", " sea_ice_fraction (time, lat, lon) float64 323kB ...\n", " dt_1km_data (time, lat, lon) timedelta64[ns] 323kB ...\n", " sst_anomaly (time, lat, lon) float64 323kB ...\n", "Attributes: (12/47)\n", " Conventions: CF-1.7\n", " title: Daily MUR SST, Final product\n", " summary: A merged, multi-sensor L4 Foundation SST anal...\n", " references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...\n", " institution: Jet Propulsion Laboratory\n", " history: created at nominal 4-day latency; replaced nr...\n", " ... ...\n", " project: NASA Making Earth Science Data Records for Us...\n", " publisher_name: GHRSST Project Office\n", " publisher_url: http://www.ghrsst.org\n", " publisher_email: ghrsst-po@nceo.ac.uk\n", " processing_level: L4\n", " cdm_data_type: grid
<xarray.Dataset> Size: 143GB\n", "Dimensions: (time: 5, lat: 17999, lon: 36000)\n", "Coordinates:\n", " * time (time) datetime64[ns] 40B 2020-01-01T09:00:00 ... 2020-...\n", " * lat (lat) float32 72kB -89.99 -89.98 -89.97 ... 89.98 89.99\n", " * lon (lon) float32 144kB -180.0 -180.0 -180.0 ... 180.0 180.0\n", "Data variables:\n", " analysed_sst (time, lat, lon) float64 26GB dask.array<chunksize=(1, 1023, 2047), meta=np.ndarray>\n", " analysis_error (time, lat, lon) float64 26GB dask.array<chunksize=(1, 1023, 2047), meta=np.ndarray>\n", " mask (time, lat, lon) float32 13GB dask.array<chunksize=(1, 1447, 2895), meta=np.ndarray>\n", " sea_ice_fraction (time, lat, lon) float64 26GB dask.array<chunksize=(1, 1447, 2895), meta=np.ndarray>\n", " dt_1km_data (time, lat, lon) timedelta64[ns] 26GB dask.array<chunksize=(1, 1447, 2895), meta=np.ndarray>\n", " sst_anomaly (time, lat, lon) float64 26GB dask.array<chunksize=(1, 1023, 2047), meta=np.ndarray>\n", "Attributes: (12/47)\n", " Conventions: CF-1.7\n", " title: Daily MUR SST, Final product\n", " summary: A merged, multi-sensor L4 Foundation SST anal...\n", " references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...\n", " institution: Jet Propulsion Laboratory\n", " history: created at nominal 4-day latency; replaced nr...\n", " ... ...\n", " project: NASA Making Earth Science Data Records for Us...\n", " publisher_name: GHRSST Project Office\n", " publisher_url: http://www.ghrsst.org\n", " publisher_email: ghrsst-po@nceo.ac.uk\n", " processing_level: L4\n", " cdm_data_type: grid
<xarray.Dataset> Size: 9MB\n", "Dimensions: (time: 5, lat: 201, lon: 201)\n", "Coordinates:\n", " * time (time) datetime64[ns] 40B 2020-01-01T09:00:00 ... 2020-...\n", " * lat (lat) float32 804B 33.5 33.51 33.52 ... 35.48 35.49 35.5\n", " * lon (lon) float32 804B -75.5 -75.49 -75.48 ... -73.51 -73.5\n", "Data variables:\n", " analysed_sst (time, lat, lon) float64 2MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " analysis_error (time, lat, lon) float64 2MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " mask (time, lat, lon) float32 808kB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " sea_ice_fraction (time, lat, lon) float64 2MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " dt_1km_data (time, lat, lon) timedelta64[ns] 2MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " sst_anomaly (time, lat, lon) float64 2MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", "Attributes: (12/47)\n", " Conventions: CF-1.7\n", " title: Daily MUR SST, Final product\n", " summary: A merged, multi-sensor L4 Foundation SST anal...\n", " references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...\n", " institution: Jet Propulsion Laboratory\n", " history: created at nominal 4-day latency; replaced nr...\n", " ... ...\n", " project: NASA Making Earth Science Data Records for Us...\n", " publisher_name: GHRSST Project Office\n", " publisher_url: http://www.ghrsst.org\n", " publisher_email: ghrsst-po@nceo.ac.uk\n", " processing_level: L4\n", " cdm_data_type: grid
<xarray.Dataset> Size: 7MB\n", "Dimensions: (time: 4, lat: 201, lon: 201)\n", "Coordinates:\n", " * time (time) datetime64[ns] 32B 2020-01-01T09:00:00 ... 2020-...\n", " * lat (lat) float32 804B 33.5 33.51 33.52 ... 35.48 35.49 35.5\n", " * lon (lon) float32 804B -75.5 -75.49 -75.48 ... -73.51 -73.5\n", "Data variables:\n", " analysed_sst (time, lat, lon) float64 1MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " analysis_error (time, lat, lon) float64 1MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " mask (time, lat, lon) float32 646kB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " sea_ice_fraction (time, lat, lon) float64 1MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " dt_1km_data (time, lat, lon) timedelta64[ns] 1MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", " sst_anomaly (time, lat, lon) float64 1MB dask.array<chunksize=(1, 201, 201), meta=np.ndarray>\n", "Attributes: (12/47)\n", " Conventions: CF-1.7\n", " title: Daily MUR SST, Final product\n", " summary: A merged, multi-sensor L4 Foundation SST anal...\n", " references: http://podaac.jpl.nasa.gov/Multi-scale_Ultra-...\n", " institution: Jet Propulsion Laboratory\n", " history: created at nominal 4-day latency; replaced nr...\n", " ... ...\n", " project: NASA Making Earth Science Data Records for Us...\n", " publisher_name: GHRSST Project Office\n", " publisher_url: http://www.ghrsst.org\n", " publisher_email: ghrsst-po@nceo.ac.uk\n", " processing_level: L4\n", " cdm_data_type: grid