OCEAN ICE’s ERDDAP querying: griddap#
This notebook will illustrate how to build queries and make requests to https://er1.s4oceanice.eu/erddap/index.html using Python.
For an interactive version of this page please open the notebook in Google Colab.
Alternatively this notebook can be opened with Binder.
Setup#
To begin we need to import the necessary libraries.
# !pip install requests pandas
# these packages should be installed with the command above if running the code locally
import requests
import pandas as pd
import io
Get a list of available datasets#
To check which griddap datasets are available in the ERDDAP and obtain their URLs, the first step is to make a request to https://er1.s4oceanice.eu/erddap/tabledap/allDatasets.html, using a URL that selects the datasetID and griddap columns. After receiving the data it will be loaded into a pandas DataFrame.
datasets_url = 'https://er1.s4oceanice.eu/erddap/tabledap/allDatasets.csv?datasetID%2Cgriddap'
# request and load into DataFrame
datasets_resp = requests.get(datasets_url)
datasets_df = pd.read_csv(io.StringIO(datasets_resp.text), sep=',')
# drop rows where griddap is NaN (datasets without a griddap endpoint)
datasets_df = datasets_df.dropna(subset=['griddap'])
# rename the griddap column to url
cleaned_df = datasets_df.rename(columns={'griddap': 'url'})
pd.set_option('display.max_colwidth', None)
cleaned_df = cleaned_df.reset_index(drop=True)
cleaned_df
 | datasetID | url
---|---|---
0 | INSITU_GLO_PHY_TS_OA_MY_013_052 | https://er1.s4oceanice.eu/erddap/griddap/INSITU_GLO_PHY_TS_OA_MY_013_052 |
1 | seanoe_slev_anomaly_geostrophic_currents | https://er1.s4oceanice.eu/erddap/griddap/seanoe_slev_anomaly_geostrophic_currents |
2 | RSMC_seaice | https://er1.s4oceanice.eu/erddap/griddap/RSMC_seaice |
3 | GLORYS12V1_sea_floor_potential_temp | https://er1.s4oceanice.eu/erddap/griddap/GLORYS12V1_sea_floor_potential_temp |
4 | GLODAPv2_2016b_MappedClimatologies | https://er1.s4oceanice.eu/erddap/griddap/GLODAPv2_2016b_MappedClimatologies |
5 | NOAA_OISST_v2 | https://er1.s4oceanice.eu/erddap/griddap/NOAA_OISST_v2 |
6 | SOCATv2024_tracks_gridded_monthly | https://er1.s4oceanice.eu/erddap/griddap/SOCATv2024_tracks_gridded_monthly |
7 | EU_circumpolar_seaice_prod_fluxes_1992_2023 | https://er1.s4oceanice.eu/erddap/griddap/EU_circumpolar_seaice_prod_fluxes_1992_2023 |
8 | SSP585_FWF_1990_2300_ZwallyBasins | https://er1.s4oceanice.eu/erddap/griddap/SSP585_FWF_1990_2300_ZwallyBasins |
9 | SSP126_FWF_1990_2300_ZwallyBasins | https://er1.s4oceanice.eu/erddap/griddap/SSP126_FWF_1990_2300_ZwallyBasins |
10 | SSP585_FWF_1990_2300_OceanSectors | https://er1.s4oceanice.eu/erddap/griddap/SSP585_FWF_1990_2300_OceanSectors |
11 | SSP126_FWF_1990_2300_OceanSectors | https://er1.s4oceanice.eu/erddap/griddap/SSP126_FWF_1990_2300_OceanSectors |
12 | SSP585_FWF_1990_2300_AIS | https://er1.s4oceanice.eu/erddap/griddap/SSP585_FWF_1990_2300_AIS |
13 | SSP126_FWF_1990_2300_AIS | https://er1.s4oceanice.eu/erddap/griddap/SSP126_FWF_1990_2300_AIS |
14 | PSMSL_Absolute_sea_level_trend | https://er1.s4oceanice.eu/erddap/griddap/PSMSL_Absolute_sea_level_trend |
15 | SCAR_RAATD | https://er1.s4oceanice.eu/erddap/griddap/SCAR_RAATD |
Using these URLs we will then be able to request their data.
In this example we will use the seanoe_slev_anomaly_geostrophic_currents dataset, with the URL:
https://er1.s4oceanice.eu/erddap/griddap/seanoe_slev_anomaly_geostrophic_currents
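For convenience, the listing above can also be turned into a dictionary keyed by datasetID. A minimal sketch using the cleaned_df built earlier (the dataset_urls name is introduced here purely for illustration):
# build a lookup from datasetID to griddap base URL
dataset_urls = dict(zip(cleaned_df['datasetID'], cleaned_df['url']))
print(dataset_urls['seanoe_slev_anomaly_geostrophic_currents'])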
Get a list of variables for the dataset#
Now we can request the dataset's metadata, which will give us a list of all the available variables and their respective data types. These variables can then be used in the following requests.
BASE_URL = 'https://er1.s4oceanice.eu/erddap/griddap/seanoe_slev_anomaly_geostrophic_currents'
# building the full url for the metadata and making the request
metadata_url = BASE_URL.replace('tabledap', 'info').replace('griddap', 'info') + '/index.csv'
metadata_resp = requests.get(metadata_url)
metadata_df = pd.read_csv(io.StringIO(metadata_resp.text), sep=',')
# extract the time coverage and the geospatial bounds from the global attributes
time_coverage_start = metadata_df.loc[metadata_df['Attribute Name'] == 'time_coverage_start', 'Value'].iloc[0]
time_coverage_end = metadata_df.loc[metadata_df['Attribute Name'] == 'time_coverage_end', 'Value'].iloc[0]
geospatial_lat_max = metadata_df.loc[metadata_df['Attribute Name'] == 'geospatial_lat_max', 'Value'].iloc[0]
geospatial_lat_min = metadata_df.loc[metadata_df['Attribute Name'] == 'geospatial_lat_min', 'Value'].iloc[0]
geospatial_lon_max = metadata_df.loc[metadata_df['Attribute Name'] == 'geospatial_lon_max', 'Value'].iloc[0]
geospatial_lon_min = metadata_df.loc[metadata_df['Attribute Name'] == 'geospatial_lon_min', 'Value'].iloc[0]
# keep only the variable and dimension rows; .copy() avoids a SettingWithCopyWarning
variables_df = metadata_df.loc[metadata_df['Row Type'].isin(['variable', 'dimension'])].copy()
variables_df.reset_index(drop=True, inplace=True)
variables_df.drop(columns=['Row Type', 'Attribute Name', 'Value'], inplace=True)
print(f"Time Coverage Start: {time_coverage_start}")
print(f"Time Coverage End: {time_coverage_end}")
print(f"Geospatial max Lat: {geospatial_lat_max}")
print(f"Geospatial min Lat: {geospatial_lat_min}")
print(f"Geospatial max Lon: {geospatial_lon_max}")
print(f"Geospatial min Lon: {geospatial_lon_min}")
variables_df
Time Coverage Start: 2013-04-01T00:00:00Z
Time Coverage End: 2019-07-31T00:00:00Z
Geospatial max Lat: 349.0
Geospatial min Lat: 0.0
Geospatial max Lon: 349.0
Geospatial min Lon: 0.0
 | Variable Name | Data Type
---|---|---
0 | time | double |
1 | longitude | short |
2 | latitude | short |
3 | sla | float |
4 | formal_error | float |
5 | U | float |
6 | V | float |
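The order of these dimensions matters: in a griddap query the range selectors must follow the order in which the dimensions are declared. That order can be read from the same metadata; a minimal sketch using the metadata_df loaded above:
# list the dimension names in declaration order
dimensions = metadata_df.loc[metadata_df['Row Type'] == 'dimension', 'Variable Name'].tolist()
print(dimensions)
# for this dataset the table above lists time, longitude and latitude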
Get the sla values#
We will then perform another request to retrieve the sla values within the time range and the bounding coordinates we want. In this case we will use the time_coverage_end value as both the start and the end of the time range, and the full ranges between geospatial_lat_min/geospatial_lat_max and geospatial_lon_min/geospatial_lon_max (see the output above).
N.B. The wider the ranges, the longer the loading time; the request could fail if the ranges are too wide.
Other datasets may not have time or coordinate dimensions. Whenever a variable is indexed by one or more dimension ranges, however, the query follows the same structure: .csv? + the variable we want to retrieve (in this case sla), followed by one (min value):stride:(max value) selector per dimension, each enclosed in square brackets and given in the order in which the dimensions are declared for the dataset (see the variables table above). In the URL the brackets [ and ] must be percent-encoded as %5B and %5D, so each selector appears as %5B(min value):1:(max value)%5D, a stride of 1 returning every grid point in the range.
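To make this structure concrete, the query used below can also be assembled programmatically. This is a minimal sketch under the assumptions above; build_griddap_query is a helper name introduced here, not part of any ERDDAP API:
def build_griddap_query(variable, ranges, stride=1, fmt='csv'):
    # one percent-encoded [(min):stride:(max)] selector per dimension,
    # given in the dataset's dimension order
    selectors = ''.join(f'%5B({lo}):{stride}:({hi})%5D' for lo, hi in ranges)
    return f'.{fmt}?{variable}{selectors}'

example_query = build_griddap_query('sla', [
    (time_coverage_end, time_coverage_end),
    (geospatial_lat_min, geospatial_lat_max),
    (geospatial_lon_min, geospatial_lon_max),
])
Increasing the stride (e.g. :5: instead of :1:) subsamples the grid and is a simple way to keep wide-area requests small.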
sla_query = f'.csv?sla%5B({time_coverage_end}):1:({time_coverage_end})%5D%5B({geospatial_lat_min}):1:({geospatial_lat_max})%5D%5B({geospatial_lon_min}):1:({geospatial_lon_max})%5D'
# The data format specified is 'csv', in which the first row contains the column names
# and the second row the units of measurement.
# Other options are 'csv0', which returns only the data rows, and 'csvp', which returns
# the column names (with their units of measurement) in the first row and the data from the second.
sla_resp = requests.get(BASE_URL + sla_query)
# note: with 'csv' the units row comes back as the first data row;
# it could be dropped by passing skiprows=[1] to read_csv
sla_df = pd.read_csv(io.StringIO(sla_resp.text), sep=',')
sla_df
 | time | longitude | latitude | sla
---|---|---|---|---
0 | UTC | degrees_east | degrees_north | m |
1 | 2019-07-31T00:00:00Z | 0 | 0 | 9.96921E36 |
2 | 2019-07-31T00:00:00Z | 0 | 1 | 9.96921E36 |
3 | 2019-07-31T00:00:00Z | 0 | 2 | 9.96921E36 |
4 | 2019-07-31T00:00:00Z | 0 | 3 | 9.96921E36 |
... | ... | ... | ... | ... |
122496 | 2019-07-31T00:00:00Z | 349 | 345 | 9.96921E36 |
122497 | 2019-07-31T00:00:00Z | 349 | 346 | 9.96921E36 |
122498 | 2019-07-31T00:00:00Z | 349 | 347 | 9.96921E36 |
122499 | 2019-07-31T00:00:00Z | 349 | 348 | 9.96921E36 |
122500 | 2019-07-31T00:00:00Z | 349 | 349 | 9.96921E36 |
122501 rows × 4 columns
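The rows above contain the value 9.96921E36, which is the fill value used where no observation is available (the canonical value is typically stored in the variable's _FillValue attribute in the metadata). A minimal sketch for masking it, assuming the sla_df built above:
import numpy as np

FILL_VALUE = 9.96921e36  # fill value as printed above; check the _FillValue attribute in the metadata
sla_clean = sla_df.iloc[1:].copy()  # drop the units row
sla_clean['sla'] = pd.to_numeric(sla_clean['sla'], errors='coerce')
sla_clean.loc[sla_clean['sla'] > FILL_VALUE / 2, 'sla'] = np.nan
print(f"{sla_clean['sla'].notna().sum()} valid sla values out of {len(sla_clean)}")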
Additional resources#
For additional information about ERDDAP please visit:
https://er1.s4oceanice.eu/erddap/information.html
The webpages for the Python libraries used in this notebook are:
requests: https://requests.readthedocs.io/
pandas: https://pandas.pydata.org/