Apparent Visible Wavelength (AVW)

Author

Eli Holmes (NOAA)


📘 Learning Objectives

  1. Show how to work with the earthaccess package for PACE data
  2. Create a NASA EDL session for authentication
  3. Load single files with xarray.open_dataset
  4. Load multiple files with xarray.open_mfdataset

Overview

The PACE Level-3 (gridded) OCI (Ocean Color Instrument) data is available on NASA EarthData. Search using the instrument filter “OCI” and the processing level filter “Gridded Observations” (https://search.earthdata.nasa.gov/search?fi=OCI&fl=3%2B-%2BGridded%2BObservations) and you will see 45+ data collections. In this tutorial, we will look at the Apparent Visible Wavelength (AVW) product.

The collection information page is here: PACE OCI Level-3 Global Mapped Apparent Visible Wavelength (AVW) Data, version 3.0. The concept id for this dataset is “C3385050418-OB_CLOUD” and the short name is “PACE_OCI_L3M_AVW”.

Prerequisites

You need an EarthData Login username and password. If you do not have one, register at https://urs.earthdata.nasa.gov/

I assume you have a .netrc file in your home directory (~/.netrc) that holds your username and password in the format shown below. You don’t need to create this file by hand if you don’t have it: the earthaccess.login(persist=True) line will ask for your username and password and create the .netrc file for you.

machine urs.earthdata.nasa.gov
        login yourusername
        password yourpassword
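If you want to check that a .netrc file has an EarthData Login entry before running the rest of the notebook, here is a minimal sketch using Python's standard netrc module. The helper name read_edl_credentials is my own, and the demo reads a throwaway file with the placeholder values from above rather than your real ~/.netrc.

```python
import netrc
import os
import tempfile

EDL_HOST = "urs.earthdata.nasa.gov"


def read_edl_credentials(netrc_path):
    """Return (login, password) for the EDL host, or None if unavailable.

    Hypothetical helper for illustration; it is not part of earthaccess.
    """
    try:
        entry = netrc.netrc(netrc_path).authenticators(EDL_HOST)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demo against a temporary file that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc_demo")
    with open(demo_path, "w") as f:
        f.write(
            "machine urs.earthdata.nasa.gov\n"
            "        login yourusername\n"
            "        password yourpassword\n"
        )
    demo_credentials = read_edl_credentials(demo_path)
    missing_credentials = read_edl_credentials(os.path.join(tmp, "no_such_file"))

print("entry found:", demo_credentials is not None)
print("missing file handled:", missing_credentials is None)
```

To check your own file, you could call read_edl_credentials(os.path.expanduser("~/.netrc")).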

For those not working in the JupyterHub

Uncomment this and run the cell:

# pip install earthaccess

Create a NASA EDL authenticated session

Authenticate with earthaccess.login(). You will need your EarthData Login username and password for this step. Get one here https://urs.earthdata.nasa.gov/.

import earthaccess
auth = earthaccess.login()
# are we authenticated?
if not auth.authenticated:
    # ask for credentials and persist them in a .netrc file
    auth.login(strategy="interactive", persist=True)

Import Required Packages

import xarray as xr

Monthly data

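I poked around the files on search.earthdata.nasa.gov first, so I know what the file names look like: the monthly files at 0.1 degree resolution have “.MO.” and “0p1deg” in their names, which is what the granule_name filter below matches.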

import earthaccess
results_mo = earthaccess.search_data(
    short_name = "PACE_OCI_L3M_AVW",
    temporal = ("2024-03-01", "2024-10-31"),
    granule_name="*.MO.*.0p1deg.*"
)
len(results_mo)
8
results_mo[0]

Data: PACE_OCI.20240301_20240331.L3m.MO.AVW.V3_0.avw.0p1deg.nc

Size: 7.86 MB

Cloud Hosted: True
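To get a feel for what the granule_name wildcard keeps, here is a small offline sketch with Python's fnmatch module. It assumes the search wildcard behaves like shell-style matching for simple patterns such as these; I have not confirmed that the two are identical in every case. The first two file names come from the search results in this tutorial, and the third is a made-up name used only as a non-matching example.

```python
from fnmatch import fnmatchcase

file_names = [
    # monthly file returned by the search above
    "PACE_OCI.20240301_20240331.L3m.MO.AVW.V3_0.avw.0p1deg.nc",
    # daily file shown later in this tutorial
    "PACE_OCI.20240305.L3m.DAY.AVW.V3_0.avw.0p1deg.nc",
    # hypothetical name at another resolution, for illustration only
    "PACE_OCI.20240301_20240331.L3m.MO.AVW.V3_0.avw.4km.nc",
]

monthly_pattern = "*.MO.*.0p1deg.*"
daily_pattern = "*.DAY.*.0p1deg.*"

# Keep only the names that match each pattern (case-sensitive).
monthly_matches = [n for n in file_names if fnmatchcase(n, monthly_pattern)]
daily_matches = [n for n in file_names if fnmatchcase(n, daily_pattern)]

print(monthly_matches)
print(daily_matches)
```

Each pattern needs both the period token (MO or DAY) and the resolution token (0p1deg) to appear between dots, so only one name matches each pattern here.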

# Create a fileset
fileset = earthaccess.open(results_mo);
# let's load just one month
import xarray as xr
ds = xr.open_dataset(fileset[0])
ds
<xarray.Dataset> Size: 26MB
Dimensions:  (lat: 1800, lon: 3600, rgb: 3, eightbitcolor: 256)
Coordinates:
  * lat      (lat) float32 7kB 89.95 89.85 89.75 89.65 ... -89.75 -89.85 -89.95
  * lon      (lon) float32 14kB -179.9 -179.9 -179.8 ... 179.8 179.9 180.0
Dimensions without coordinates: rgb, eightbitcolor
Data variables:
    avw      (lat, lon) float32 26MB ...
    palette  (rgb, eightbitcolor) uint8 768B ...
Attributes: (12/62)
    product_name:                      PACE_OCI.20240301_20240331.L3m.MO.AVW....
    instrument:                        OCI
    title:                             OCI Level-3 Standard Mapped Image
    project:                           Ocean Biology Processing Group (NASA/G...
    platform:                          PACE
    source:                            satellite observations from OCI-PACE
    ...                                ...
    cdm_data_type:                     grid
    identifier_product_doi_authority:  http://dx.doi.org
    identifier_product_doi:            10.5067/PACE/OCI/L3M/AVW/3.0
    data_bins:                         3016790
    data_minimum:                      399.99997
    data_maximum:                      700.00006
ds["avw"].plot();

lat_mean = ds["avw"].sel(lat = slice(70, -70)).mean(dim=["lon"])
lat_mean.plot.line(x="lat");
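The lat coordinate in these files runs from north (89.95) to south (-89.95), which is why the slice above is written slice(70, -70) and not slice(-70, 70). The sketch below shows the difference on a small synthetic array. The 1-degree grid and the zero-filled values are stand-ins I made up so the example runs without downloading anything.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the real grid: latitude stored north to south.
lat = np.linspace(89.5, -89.5, 180)
lon = np.linspace(-179.5, 179.5, 360)
demo = xr.DataArray(
    np.zeros((lat.size, lon.size), dtype="float32"),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
    name="avw",
)

# Label-based slices follow the order of the coordinate values.
north_to_south = demo.sel(lat=slice(70, -70))   # same direction as lat
south_to_north = demo.sel(lat=slice(-70, 70))   # opposite direction

print("slice(70, -70) rows:", north_to_south.sizes["lat"])
print("slice(-70, 70) rows:", south_to_north.sizes["lat"])

# Sorting the coordinate first makes the ascending slice usable.
sorted_rows = demo.sortby("lat").sel(lat=slice(-70, 70)).sizes["lat"]
print("after sortby rows:", sorted_rows)
```

An empty selection raises no error, so a reversed slice can go unnoticed until a plot comes out blank.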

Multiple months

ds = xr.open_mfdataset(
    fileset,
    combine='nested', concat_dim="time"
)
ds
<xarray.Dataset> Size: 207MB
Dimensions:  (time: 8, lat: 1800, lon: 3600, rgb: 3, eightbitcolor: 256)
Coordinates:
  * lat      (lat) float32 7kB 89.95 89.85 89.75 89.65 ... -89.75 -89.85 -89.95
  * lon      (lon) float32 14kB -179.9 -179.9 -179.8 ... 179.8 179.9 180.0
Dimensions without coordinates: time, rgb, eightbitcolor
Data variables:
    avw      (time, lat, lon) float32 207MB dask.array<chunksize=(1, 512, 1024), meta=np.ndarray>
    palette  (time, rgb, eightbitcolor) uint8 6kB dask.array<chunksize=(1, 3, 256), meta=np.ndarray>
Attributes: (12/62)
    product_name:                      PACE_OCI.20240301_20240331.L3m.MO.AVW....
    instrument:                        OCI
    title:                             OCI Level-3 Standard Mapped Image
    project:                           Ocean Biology Processing Group (NASA/G...
    platform:                          PACE
    source:                            satellite observations from OCI-PACE
    ...                                ...
    cdm_data_type:                     grid
    identifier_product_doi_authority:  http://dx.doi.org
    identifier_product_doi:            10.5067/PACE/OCI/L3M/AVW/3.0
    data_bins:                         3016790
    data_minimum:                      399.99997
    data_maximum:                      700.00006
lat_mean = ds["avw"].sel(lat = slice(70, -70)).mean(dim=["lon"])
lat_mean.plot.line(x="lat");
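The combined dataset above lists time under “Dimensions without coordinates”, so the slices are only numbered by position. One way to recover dates is to parse them from the file names. Here is a minimal sketch, assuming the names follow the two patterns seen in this tutorial (a single YYYYMMDD stamp for daily files, a start_end pair for monthly files). The helper granule_start_date is my own name, not an earthaccess or xarray function.

```python
import re
from datetime import datetime


def granule_start_date(file_name):
    """Parse the first YYYYMMDD stamp from a PACE L3m file name.

    Hypothetical helper; assumes the stamp sits between dots, optionally
    followed by an underscore and an end date.
    """
    match = re.search(r"\.(\d{8})(?:_\d{8})?\.", file_name)
    if match is None:
        raise ValueError(f"no date stamp found in {file_name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d")


# Both names are taken from the search results shown in this tutorial.
example_names = [
    "PACE_OCI.20240301_20240331.L3m.MO.AVW.V3_0.avw.0p1deg.nc",
    "PACE_OCI.20240305.L3m.DAY.AVW.V3_0.avw.0p1deg.nc",
]
start_dates = [granule_start_date(name) for name in example_names]
print(start_dates)
```

With real search results, the names could be taken from each granule's data_links(), and a possible next step is ds = ds.assign_coords(time=start_dates), assuming the list is in the same order as the files passed to open_mfdataset.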

Daily data

We need the granules whose file names contain both “DAY” and “0p1deg” (0.1 degree resolution).

import earthaccess
results_day = earthaccess.search_data(
    short_name = "PACE_OCI_L3M_AVW",
    temporal = ("2024-03-01", "2024-03-31"),
    granule_name="*.DAY.*.0p1deg.*"
)
len(results_day)
23
results_day[0]

Data: PACE_OCI.20240305.L3m.DAY.AVW.V3_0.avw.0p1deg.nc

Size: 2.25 MB

Cloud Hosted: True

# let's load the data
fileset = earthaccess.open(results_day)
ds = xr.open_mfdataset(
    fileset,
    combine='nested', concat_dim="time"
)
ds
<xarray.Dataset> Size: 363MB
Dimensions:  (time: 14, lat: 1800, lon: 3600, rgb: 3, eightbitcolor: 256)
Coordinates:
  * lat      (lat) float32 7kB 89.95 89.85 89.75 89.65 ... -89.75 -89.85 -89.95
  * lon      (lon) float32 14kB -179.9 -179.9 -179.8 ... 179.8 179.9 180.0
Dimensions without coordinates: time, rgb, eightbitcolor
Data variables:
    avw      (time, lat, lon) float32 363MB dask.array<chunksize=(1, 512, 1024), meta=np.ndarray>
    palette  (time, rgb, eightbitcolor) uint8 11kB dask.array<chunksize=(1, 3, 256), meta=np.ndarray>
Attributes: (12/62)
    product_name:                      PACE_OCI.20240305.L3m.DAY.AVW.V3_0.avw...
    instrument:                        OCI
    title:                             OCI Level-3 Standard Mapped Image
    project:                           Ocean Biology Processing Group (NASA/G...
    platform:                          PACE
    source:                            satellite observations from OCI-PACE
    ...                                ...
    cdm_data_type:                     grid
    identifier_product_doi_authority:  http://dx.doi.org
    identifier_product_doi:            10.5067/PACE/OCI/L3M/AVW/3.0
    data_bins:                         571271
    data_minimum:                      399.99997
    data_maximum:                      700.0001
ds["avw"].isel(time=0).plot();

import matplotlib.pyplot as plt
import gc
plt.show()
plt.clf()       # Clear the current figure
plt.close()     # Close the figure window
gc.collect()    # Ask Python to free up memory
32562
# let's look at the west coast of USA
ds["avw"].isel(time=0).sel(lat = slice(50, 30), lon=slice(-140, -110)).plot();

ds_mean = ds["avw"].mean(dim="time");
ds_mean.sel(lat = slice(50, 30), lon=slice(-140, -110)).plot();

# We can plot over the days to see when it was cloudy
ds['avw'].sel(lat = slice(50, 30), lon=slice(-140, -110)).plot(x='lon', y='lat', col="time", col_wrap=3);
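The panel plot shows coverage by eye. To put a number on it, you can compute the fraction of pixels that have a value on each day. The sketch below does this on a tiny synthetic array that I made up, with NaN standing in for pixels without a retrieval. In the real data, missing pixels include land as well as clouds, so treat the fraction as a rough coverage measure and not a cloud mask.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in: 3 "days" on a 4 x 5 grid, filled with a constant value.
values = np.full((3, 4, 5), 500.0, dtype="float32")
values[0, :2, :] = np.nan   # day 0: top half of the grid is missing
values[1, :, :] = np.nan    # day 1: nothing retrieved
demo = xr.DataArray(values, dims=("time", "lat", "lon"), name="avw")

# notnull() gives True where a value exists; the mean of booleans over
# lat and lon is the fraction of valid pixels for each time step.
valid_fraction = demo.notnull().mean(dim=["lat", "lon"])
print(valid_fraction.values)
```

The same expression on the regional subset would be ds['avw'].sel(lat=slice(50, 30), lon=slice(-140, -110)).notnull().mean(dim=["lat", "lon"]), which returns one value per day.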

References