# Diffuse attenuation coefficients (Kd)
## 📘 Learning Objectives

- Show how to work with the `earthaccess` package for PACE data
- Create a NASA EDL session for authentication
- Load single files with `xarray.open_dataset`
- Load multiple files with `xarray.open_mfdataset`
## Overview

The PACE Level-3 (gridded) OCI (Ocean Color Instrument) data are available on NASA Earthdata. Search using the instrument filter “OCI” and the processing-level filter “3 - Gridded Observations” (https://search.earthdata.nasa.gov/search?fi=OCI&fl=3%2B-%2BGridded%2BObservations) and you will see 45+ data collections. In this tutorial, we will look at the Diffuse attenuation coefficients (Kd) product.
The data collection information page is here: PACE OCI Level-3 Global Binned Diffuse Attenuation Coefficient for Downwelling Irradiance (KD) Data, Version 3.0. The concept ID for this dataset is “C3385050161-OB_CLOUD” and the short name is “PACE_OCI_L3B_KD”.
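If you want to confirm the collection programmatically before searching for granules, here is a minimal sketch using `earthaccess.search_datasets`; the `concept_id()` accessor on the result object is my assumption about the earthaccess results API:

```python
import earthaccess

# Look the collection up by short name and confirm its concept ID
datasets = earthaccess.search_datasets(short_name="PACE_OCI_L3B_KD")
print(datasets[0].concept_id())  # expected: C3385050161-OB_CLOUD
```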
## Prerequisites

You need an Earthdata Login username and password. Go here to get one: https://urs.earthdata.nasa.gov/

I assume you have a `.netrc` file in your home directory (`~`). `~/.netrc` should look just like this, with your username and password filled in. If you don’t have this file, you don’t need to create it by hand: the `earthaccess.login(persist=True)` line will ask for your username and password and create the `.netrc` file for you.

```
machine urs.earthdata.nasa.gov
login yourusername
password yourpassword
```
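To check whether the credentials file is already in place, a quick sketch:

```python
from pathlib import Path

# True if ~/.netrc already exists (earthaccess can create it otherwise)
print((Path.home() / ".netrc").exists())
```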
### For those not working in the JupyterHub

Uncomment the line and run the cell:
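```python
# pip install earthaccess
```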
## Create a NASA EDL authenticated session

Authenticate with `earthaccess.login()`. You will need your Earthdata Login username and password for this step; get one here: https://urs.earthdata.nasa.gov/.
```python
import earthaccess

auth = earthaccess.login()
# are we authenticated?
if not auth.authenticated:
    # ask for credentials and persist them in a .netrc file
    auth.login(strategy="interactive", persist=True)
```
## Import Required Packages

```python
import xarray as xr
```
## Monthly data

I looked at the files on search.earthdata.nasa.gov, so I know what the file names look like. Here I will get the monthly files for March to December 2024.
```python
import earthaccess

results_mo = earthaccess.search_data(
    short_name="PACE_OCI_L3B_KD",
    temporal=("2024-03-01", "2024-12-31"),
    granule_name="*.MO.*",
)
len(results_mo)
```

```
10
```

```python
results_mo[0]
```
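The `granule_name="*.MO.*"` wildcard selects monthly composites because the compositing period is encoded in OB.DAAC Level-3 file names. As a hypothetical variation, daily composites should match a `*.DAY.*` pattern, assuming the same naming convention:

```python
# Hypothetical: daily composites for March 2024 instead of monthly
results_day = earthaccess.search_data(
    short_name="PACE_OCI_L3B_KD",
    temporal=("2024-03-01", "2024-03-31"),
    granule_name="*.DAY.*",
)
```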
```python
# Create a fileset
fileset = earthaccess.open(results_mo)
```
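`earthaccess.open` returns file-like objects that stream data over the network rather than saving files to disk. If you would rather work from local copies, a sketch using `earthaccess.download` (the `"data"` folder name is an arbitrary choice):

```python
# Alternative: download the granules to a local folder instead of streaming
paths = earthaccess.download(results_mo, "data")
```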
```python
import h5netcdf

# Level-3 binned files store their variables inside netCDF groups;
# list the groups so we know which one to open
with h5netcdf.File(fileset[0]) as file:
    groups = list(file)
groups
```

```
['level-3_binned_data', 'processing_control']
```
```python
# let's load just one month
import xarray as xr

ds = xr.open_dataset(fileset[0], group="level-3_binned_data")
ds
```

```
<xarray.Dataset> Size: 2GB
Dimensions:   (binListDim: 13959377, binDataDim: 13959377, binIndexDim: 4320)
Dimensions without coordinates: binListDim, binDataDim, binIndexDim
Data variables: (12/21)
    BinList   (binListDim)  [('bin_num', '<u4'), ('nobs', '<i2'), ('nscenes', '<i2'), ('weights', '<f4'), ('time_rec', '<f4')] 223MB ...
    Kd_351    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_361    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_385    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_413    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_425    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    ...        ...
    Kd_640    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_655    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_665    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_678    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    Kd_711    (binDataDim)  [('sum', '<f4'), ('sum_squared', '<f4')] 112MB ...
    BinIndex  (binIndexDim) [('start_num', '<u4'), ('begin', '<u4'), ('extent', '<u4'), ('max', '<u4')] 69kB ...
```
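The binned format stores running sums rather than means. Following the OB.DAAC binned-data convention, the per-bin mean of a variable is its `sum` field divided by the `weights` field of `BinList`. A minimal sketch for `Kd_413`, assuming the compound variables load as NumPy structured arrays:

```python
import numpy as np

# Per-bin weighted mean, per the OB.DAAC binned-data convention: sum / weights
weights = ds["BinList"].values["weights"]
kd_413_mean = ds["Kd_413"].values["sum"] / weights
print(np.nanmean(kd_413_mean))
```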
## References

- PACE Hackweek 2024 tutorials on working with grouped h5netcdf files