There are buoys in many locations around the world that provide data streams of oceanic and atmospheric parameters. The data are often available through data centers like the National Data Buoy Center (NDBC https://www.ndbc.noaa.gov) and the ARGO floats program (http://www.argo.ucsd.edu). In situ buoy data are widely used to monitor environmental conditions.
In-situ buoy data can be used to evaluate the accuracy of satellite data.
Objective
In this exercise, we will learn how to match up satellite data to in situ buoy data using rerddap and rxtracto R packages.
The exercise demonstrates the following techniques:
Downloading tabular data (buoy data) from CoastWatch ERDDAP data server
Retrieving information about a dataset from ERDDAP
Subsetting satellite data within a rectangular boundary
Matching satellite data with the buoy data
Running statistical analysis to compare buoy and satellite data
# Function to check if pkgs are installed, install missing pkgs, and loadpkgTest <-function(x){if (!require(x,character.only =TRUE)) {install.packages(x, dep=TRUE, repos ="https://cloud.r-project.org/")if(!require(x,character.only =TRUE)) stop("Package not found") }}list.of.packages <-c( "ncdf4", "rerddap", "plotdap", "httr","lubridate", "gridGraphics", "mapdata","ggplot2", "RColorBrewer", "grid", "PBSmapping", "rerddapXtracto","dplyr","viridis","cmocean")# create list of installed packagespkges =installed.packages()[,"Package"]for (pk in list.of.packages) {pkgTest(pk)}
The downloaded binary packages are in
/var/folders/w4/4s0hrwdd2gd8k4kp6s1z8zn40000gn/T//Rtmpp3xyQ6/downloaded_packages
The downloaded binary packages are in
/var/folders/w4/4s0hrwdd2gd8k4kp6s1z8zn40000gn/T//Rtmpp3xyQ6/downloaded_packages
The downloaded binary packages are in
/var/folders/w4/4s0hrwdd2gd8k4kp6s1z8zn40000gn/T//Rtmpp3xyQ6/downloaded_packages
The downloaded binary packages are in
/var/folders/w4/4s0hrwdd2gd8k4kp6s1z8zn40000gn/T//Rtmpp3xyQ6/downloaded_packages
Downloading buoy data from ERDDAP
Extract data using the rerddap::tabledap function
Using rerddap::tabledap function, we will request and download data with the following specifications:
Buoy dataset ID: cwwcNDBCMet
Region boundaries: 35 to 40 north latitude and -125 to -120 east longitude
Time span: 08/01/2023 to 08/10/2023
Variables: station, latitude, longitude, time, and water temperature parameters
# Subset and download tabular data from ERDDAP ERDDAP_Node ="https://coastwatch.pfeg.noaa.gov/erddap"NDBC_id ='cwwcNDBCMet'NDBC_info=info(datasetid = NDBC_id,url = ERDDAP_Node)buoy <- rerddap::tabledap( url = ERDDAP_Node, NDBC_id,fields=c('station', 'latitude', 'longitude', 'time', 'wtmp'), 'time>=2023-08-01', 'time<=2023-08-10', 'latitude>=35','latitude<=40', 'longitude>=-125','longitude<=-120','wtmp>0')#Create data frame with the downloaded databuoy.df <-data.frame(station=buoy$station,longitude=as.numeric(buoy$longitude),latitude=as.numeric(buoy$latitude),time=strptime(buoy$time, "%Y-%m-%d %H:%M:%S"),date=as.Date(buoy$time),temp=as.numeric(buoy$wtmp))# Check for unique stationsunique.sta <-unique(buoy$sta)n.sta <-length(unique.sta)n.sta
[1] 24
Plot the buoy data for the first 10 stations in buoy.df
Let’s see what the buoy data looks like for our time period.
plot(buoy.df$time,buoy.df$temp,type='n', xlab='Date', ylab='SST (ºC)',main='SST from the first 10 stations')for (i in1:10){ I=which(buoy.df$station==unique.sta[i])lines(buoy.df$time[I],buoy.df$temp[I])}
Select buoy data closest in time to satellite data
Since buoy data are hourly and the satellite data are daily, the buoy data needs to be averaged daily for each station.
url='https://coastwatch.pfeg.noaa.gov/erddap/'datasetid ='nesdisBLENDEDsstDNDaily'# Get Data Information given dataset ID and URLdataInfo <- rerddap::info(datasetid, url)# Show data InfodataInfo
We will extract satellite data for each buoy station.
1. get coordinates of each buoy station 2. use the rxtracto function and the buoy coordinates to download satellite data closest to each station
# Set the variable name of interest from the satellite dataparameter <-'analysed_sst'# Set x,y,t,z coordinates based on buoy dataxcoord <- buoy.df.day$lonycoord <- buoy.df.day$lattcoord <- buoy.df.day$date
# Set the variable name of interest from the satellite dataparameter <-'analysed_sst'# Set x,y,t,z coordinates based on buoy dataxcoord <- buoy.df.day$lonycoord <- buoy.df.day$lattcoord <- buoy.df.day$date# Extract satellite data extract <-rxtracto(dataInfo, parameter=parameter, tcoord=tcoord,xcoord=xcoord,ycoord=ycoord,xlen=.01,ylen=.01)buoy.df.day$sst<-extract$`mean analysed_sst`
Get subset of data where a satellite value was found
Our satellite product is gap-free (gaps due to clouds were filled using some interpolation) but it has a spatial resolution of 5km. Some buoy stations may be so close to shore that they end up in the landmask of the satellite data. So let’s find the stations that were matched up to an SST pixel.
# Get subset of data where there is a satellite value goodbuoy<-subset(buoy.df.day, sst >0)unique.sta<-unique(goodbuoy$station)nbuoy<-length(unique.sta)ndata<-length(goodbuoy$station)
Compare results for satellite and buoy
Plot the satellite SST verses the buoy temperature to visualize how well the two datasets match each other.
# Set up map title main="California coast, 8/1/23-8/10/23"p <-ggplot(goodbuoy, aes(temp.day, sst,color=lat)) +coord_fixed(xlim=c(8,25),ylim=c(8,25)) p +geom_point() +ylab('Satellite SST') +xlab('Buoy average daily SST') +scale_x_continuous(minor_breaks =seq(8, 25)) +scale_y_continuous(minor_breaks =seq(8, 25)) +#geom_abline(a=fit[1],b=fit[2]) +#annotation_custom(my_grob) + #scale_color_gradientn(colours = "viridis", name="Buoy\nLatitude") +scale_color_viridis(discrete =FALSE, name="Buoy\nLatitude") +labs(title=main) +theme(plot.title =element_text(size=20, face="bold", vjust=2))
Run a linear regression of Blended SST versus the buoy data. * The R squared is close to 1 (0.8733) * The slope is 0.7151
lmHeight =lm(sst~temp.day, data = goodbuoy)summary(lmHeight)
Call:
lm(formula = sst ~ temp.day, data = goodbuoy)
Residuals:
Min 1Q Median 3Q Max
-2.21936 -0.47676 0.02142 0.40927 2.19949
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.65951 0.27549 13.28 <2e-16 ***
temp.day 0.71469 0.01976 36.17 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7322 on 195 degrees of freedom
Multiple R-squared: 0.8703, Adjusted R-squared: 0.8696
F-statistic: 1309 on 1 and 195 DF, p-value: < 2.2e-16
Create a map of SST and overlay the buoy data
Extract blended SST data for August 1st, 2023
# First define the box and time limits of the requested data ylim<-c(32,42)xlim<-c(-127,-118)# Extract the monthly satellite dataSST <-rxtracto_3D(dataInfo,xcoord=xlim,ycoord=ylim,parameter=parameter, tcoord=c('2023-08-01','2023-08-01'))SST$sst <-drop(SST$analysed_sst)
Create the map frame for the satellite data and buoy SST overlay