{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "BgJs6pCvKM-W" }, "source": [ "# **OCEAN ICE Data Catalogue** #" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For an interactive version of this page please visit the Google Colab: \n", "[ Open in Google Colab ](https://colab.research.google.com/drive/1SxGbDLXVHGNMr5m-fgPJDEw_Vg9J6ZhB)
\n", "\n", "(To open the link in a new tab, press Ctrl + click.)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, this notebook can be opened with Binder by following the link:\n", "[OCEAN ICE Data Catalogue](https://mybinder.org/v2/gh/s4oceanice/literacy.s4oceanice/main?urlpath=%2Fdoc%2Ftree%2Fnotebooks_binder%2Foceanice_catalogue.ipynb)" ] },
{ "cell_type": "markdown", "metadata": { "id": "KpT-fwm6gKv2" }, "source": [ "**Purpose**" ] },
{ "cell_type": "markdown", "metadata": { "id": "wn20YT_xgODt" }, "source": [ "This notebook builds an interactive catalog of datasets related to the OCEAN ICE project.\n", "It allows users to:\n", "\n", "* Browse datasets available on the **OCEAN ICE ERDDAP server** and in **Zenodo repositories**.\n", "\n", "* Display key metadata (title, description, creators, funders, license, DOI, coverage).\n", "\n", "* Interactively select variables and temporal ranges.\n", "\n", "* Generate direct download links for the selected subsets in CSV format.\n", "\n", "This catalog provides a single access point for exploring OCEAN ICE observational and modeling datasets, bringing discovery, metadata inspection and data download together in one place." ] },
{ "cell_type": "markdown", "metadata": { "id": "Sd8ROkeDgdpm" }, "source": [ "**Data sources**" ] },
{ "cell_type": "markdown", "metadata": { "id": "3luXOGQEghIf" }, "source": [ "The sources are:\n", "* **Zenodo OCEAN ICE Community**:\n", "A curated list of Zenodo record IDs is queried via the Zenodo API to retrieve metadata (title, description, authors, funders, license, DOI and citation).\n", "\n", "* **OCEAN ICE ERDDAP server**:\n", "Metadata from the ERDDAP endpoint `allDatasets` is parsed to get dataset titles, structures and metadata links.\n", "These ERDDAP datasets include gridded (griddap) and tabular (tabledap) collections with rich metadata (spatial/temporal coverage, variables, licensing)."
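] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A minimal offline sketch of how a Zenodo record can be reduced to the fields listed above. The sample payload, its values (including the placeholder DOI) and the helper name `summarise_record` are illustrative assumptions, not real OCEAN ICE records; only the key names (`metadata`, `creators`, `grants`, `license`) mirror the ones this notebook reads from the Zenodo API.\n", "\n", "
```python
import re

# Hypothetical payload shaped like the parts of a Zenodo API response used in this notebook.
SAMPLE_RECORD = {
    'metadata': {
        'title': 'Example dataset',
        'description': '<p>Moored <b>temperature</b> series.</p>',
        'creators': [{'name': 'Doe, Jane'}, {'name': 'Roe, Richard'}],
        'grants': [{'funder': {'name': 'European Commission'}}],
        'doi': '10.5281/zenodo.0000000',  # placeholder, not a real DOI
        'license': None,  # a missing license must not crash the summary
    }
}


def summarise_record(record):
    '''Return a flat dict with cleaned description, funders and a citation string.'''
    meta = record.get('metadata', {})
    creators = [c['name'] for c in meta.get('creators', []) if c.get('name')]
    funders = [g['funder']['name'] for g in meta.get('grants', [])
               if g.get('funder', {}).get('name')]
    # Drop anything that looks like an HTML tag from the description.
    description = re.sub('<[^>]+>', '', meta.get('description') or '')
    # Citation = authors, title and DOI, skipping whichever parts are missing.
    parts = [', '.join(creators), meta.get('title'),
             'DOI: ' + meta['doi'] if meta.get('doi') else None]
    return {
        'title': meta.get('title'),
        'description': description,
        'funder': funders,
        'license': (meta.get('license') or {}).get('id'),
        'citation': '. '.join(p for p in parts if p) or None,
    }


summary = summarise_record(SAMPLE_RECORD)
print(summary['citation'])
```
"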
] }, { "cell_type": "markdown", "metadata": { "id": "BMALrzcJhRdo" }, "source": [ "**Instructions to use this Notebook**" ] },
{ "cell_type": "markdown", "metadata": { "id": "OV4_p5A5hTgr" }, "source": [ "To interact with the notebook, run each code cell sequentially. You can do this by clicking the **Play button** (▶️) on the left side of each grey code block. Executing the cells in order ensures that all features and visualizations work properly." ] },
{ "cell_type": "markdown", "metadata": { "id": "mFK3L0afhd4y" }, "source": [ "**Explaining the code**" ] },
{ "cell_type": "markdown", "metadata": { "id": "79QsK2VFhgFL" }, "source": [ "**1. Import required libraries & define data sources**" ] },
{ "cell_type": "markdown", "metadata": { "id": "UIq9hgjVhn0E" }, "source": [ "This section loads the Python libraries:\n", "\n", "* requests, re – HTTP requests & string cleaning.\n", "\n", "* pandas – handle metadata tables.\n", "\n", "* ipywidgets – build dropdowns, checkboxes, date pickers, and buttons.\n", "\n", "* IPython.display – render widgets, tables, and links inline.\n", "\n", "* datetime – manage time coverage metadata.\n", "\n", "and defines the sources:\n", "\n", "* a list of Zenodo IDs relevant to OCEAN ICE.\n", "\n", "* the Zenodo API URL and the ERDDAP URLs (the `allDatasets` listing, which carries each dataset's metadata link, and the base access URL).\n", "\n", "* the fields of interest to be displayed for each dataset."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "4o6l7XRLo3SZ" }, "outputs": [], "source": [
"# @title\n",
"import requests\n",
"import pandas as pd\n",
"import re\n",
"from ipywidgets import (\n",
"    Dropdown,\n",
"    Checkbox,\n",
"    VBox,\n",
"    GridspecLayout,\n",
"    Button,\n",
"    DatePicker,\n",
"    Label\n",
")\n",
"from IPython.display import (\n",
"    display,\n",
"    Javascript\n",
")\n",
"from datetime import (\n",
"    datetime,\n",
"    timedelta\n",
")\n",
"\n",
"# Zenodo record IDs of the OCEAN ICE datasets (each ID listed once)\n",
"zenodo_ids = [\n",
"    '15747365',\n",
"    '15590997',\n",
"    '15189061',\n",
"    '15267996',\n",
"    '15268272',\n",
"    '15268317',\n",
"    '15299425',\n",
"    '15299650',\n",
"    '15299705',\n",
"    '15280675',\n",
"    '15181349',\n",
"    '14162776',\n",
"    '14041098',\n",
"    '11652686',\n",
"    '11096059',\n",
"    '12581210',\n",
"    '11096232',\n",
"    '14193092'\n",
"]\n",
"\n",
"ZENODO_URL = 'https://zenodo.org/api/records/'\n",
"ALL_DATASET_URL = 'https://er1.s4oceanice.eu/erddap/tabledap/allDatasets.csv?datasetID%2Ctitle%2CdataStructure%2Cmetadata'\n",
"BASE_URL = 'https://er1.s4oceanice.eu/erddap/'\n",
"\n",
"# ERDDAP global attributes shown in the metadata table\n",
"fields_of_interest = [\n",
"    'title',\n",
"    'summary',\n",
"    'conventions',\n",
"    'creator_name',\n",
"    'creator_type',\n",
"    'creator_url',\n",
"    'institution',\n",
"    'project',\n",
"    'project_url',\n",
"    'infoUrl',\n",
"    'license',\n",
"    'citation',\n",
"    'funding',\n",
"    'doi',\n",
"    'time_coverage_end',\n",
"    'time_coverage_start',\n",
"    'geospatial_lat_min',\n",
"    'geospatial_lat_max',\n",
"    'geospatial_lon_min',\n",
"    'geospatial_lon_max'\n",
"]\n",
"\n",
"# Zenodo record fields shown in the metadata table\n",
"zenodo_fields = [\n",
"    'title',\n",
"    'description',\n",
"    'creators',\n",
"    'funder',\n",
"    'doi',\n",
"    'license',\n",
"    'citation',\n",
"]" ] }, { "cell_type": "markdown", "metadata": { "id": "tXDnyuXniDd_" }, "source": [ "**2. 
Retrieve and parse Zenodo metadata**" ] },
{ "cell_type": "markdown", "metadata": { "id": "aKgJpyzuiG-3" }, "source": [ "In this step, each **Zenodo ID** is processed to build a structured metadata record:\n", "\n", "* the **Zenodo API** is queried to fetch the dataset’s metadata.\n", "\n", "* from the response, key fields are extracted, including the dataset’s *title, description, list of creators, funding sources, DOI* and *license*.\n", "\n", "* since the descriptions often contain HTML tags, these are stripped to make the text more readable.\n", "\n", "* a **citation string** is then assembled by joining the author names, dataset title, and DOI.\n", "\n", "* finally, all the cleaned and structured information is stored in a **Pandas DataFrame**, which makes it easy to explore or use in later parts of the notebook." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "n-tAKn01ytfP" }, "outputs": [], "source": [
"# @title\n",
"zenodo_data = []\n",
"\n",
"for zenodo_id in zenodo_ids:\n",
"    url = f'{ZENODO_URL}{zenodo_id}'\n",
"    try:\n",
"        response = requests.get(url, timeout=30)\n",
"        response.raise_for_status()  # Raise an HTTPError for bad responses (4xx or 5xx)\n",
"        data = response.json()\n",
"        metadata = data.get('metadata', {})\n",
"        links = data.get('links', {})  # Get the links dictionary\n",
"\n",
"        creators = [creator.get('name') for creator in metadata.get('creators', []) if creator.get('name')]\n",
"        # Funder names are nested under each grant entry\n",
"        funder = [grant.get('funder', {}).get('name') for grant in metadata.get('grants', []) if grant.get('funder', {}).get('name')]\n",
"\n",
"        # Construct citation from whichever parts are present\n",
"        citation_parts = []\n",
"        if creators:\n",
"            citation_parts.append(\", \".join(creators))\n",
"        if metadata.get('title'):\n",
"            citation_parts.append(metadata.get('title'))\n",
"        if metadata.get('doi'):\n",
"            citation_parts.append(f\"DOI: {metadata.get('doi')}\")\n",
"\n",
"        citation = \". \".join(citation_parts) if citation_parts else None\n",
"\n",
"        # Clean HTML tags from description\n",
"        description = metadata.get('description')\n",
"        if description:\n",
"            description = re.sub('<.*?>', '', description)\n",
"\n",
"        record = {\n",
"            'title': metadata.get('title'),\n",
"            'description': description,  # Use the cleaned description\n",
"            'creators': creators,\n",
"            'funder': funder,\n",
"            'doi': metadata.get('doi'),\n",
"            # A record without license information yields None instead of raising\n",
"            'license': (metadata.get('license') or {}).get('id'),\n",
"            'citation': citation,\n",
"            'self_html': links.get('self_html')  # Landing page link\n",
"        }\n",
"        zenodo_data.append(record)\n",
"\n",
"    except requests.exceptions.RequestException as e:\n",
"        print(f\"Error fetching data for Zenodo ID {zenodo_id}: {e}\")\n",
"    except Exception as e:\n",
"        print(f\"An unexpected error occurred for Zenodo ID {zenodo_id}: {e}\")\n",
"\n",
"\n",
"zenodo_df = pd.DataFrame(zenodo_data)" ] },
{ "cell_type": "markdown", "metadata": { "id": "ZjY73aBsjM9R" }, "source": [ "**2. Fetching ERDDAP Dataset Catalog**" ] },
{ "cell_type": "markdown", "metadata": { "id": "RGn1fupyjRJh" }, "source": [ "Here, the notebook retrieves a list of all datasets available through the ERDDAP server:\n", "\n", "* the **allDatasets.csv** endpoint is queried from ERDDAP.\n", "\n", "* the dataset titles are extracted and combined with those retrieved from Zenodo.\n", "\n", "* a dropdown menu is created, allowing the user to select from the combined list of Zenodo and ERDDAP datasets.\n", "\n", "* this selection serves as the entry point for exploring dataset metadata in the next steps."
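] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The title-merging step can be sketched offline as follows. The two-row CSV excerpt, the `example.org` links and the helper name `combined_titles` are hypothetical; the column names match the ones requested from `allDatasets.csv`. The order-preserving de-duplication is an optional extra shown here for illustration, not something the notebook cell itself does.\n", "\n", "
```python
import csv
import io

# Hypothetical two-dataset excerpt in the column layout requested from allDatasets.csv.
SAMPLE_ALL_DATASETS = '''datasetID,title,dataStructure,metadata
example_table,Example mooring series,table,https://example.org/erddap/info/example_table/index
example_grid,Example sea ice grid,grid,https://example.org/erddap/info/example_grid/index
'''


def combined_titles(all_datasets_csv, zenodo_titles):
    '''Return ERDDAP titles followed by Zenodo titles, without blanks or repeats.'''
    rows = csv.DictReader(io.StringIO(all_datasets_csv))
    erddap_titles = [row['title'] for row in rows if row['title']]
    # dict.fromkeys keeps the first occurrence of each title and preserves order.
    return list(dict.fromkeys(erddap_titles + list(zenodo_titles)))


options = combined_titles(SAMPLE_ALL_DATASETS, ['Example model run', 'Example model run'])
print(options)
```
"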
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "colab": { "base_uri": "https://localhost:8080/", "height": 49, "referenced_widgets": [ "6d221a8a50444b3bba4485d8cacff85c", "724c285085aa479faf60ff84f983fe6f", "3f30fb46cf1842bc8a13de2acc8a07cd" ] }, "id": "c487ee6f", "outputId": "8baa023f-327c-4f03-82d0-154426f9653e" }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6d221a8a50444b3bba4485d8cacff85c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='Dataset:', options=('* The List of All Active Datasets in this ERDDAP *', 'AAD - ASPeCt-…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [
"# @title\n",
"try:\n",
"    df = pd.read_csv(ALL_DATASET_URL)\n",
"    #display(df)\n",
"except Exception as e:\n",
"    print('ERROR: ', e)\n",
"\n",
"# Check if both df and zenodo_df exist and have a 'title' column\n",
"if 'df' in globals() and df is not None and 'title' in df.columns and 'zenodo_df' in globals() and zenodo_df is not None and 'title' in zenodo_df.columns:\n",
"    # Combine titles from both dataframes\n",
"    erddap_titles = df['title'].dropna().tolist()\n",
"    zenodo_titles = zenodo_df['title'].dropna().tolist()\n",
"    all_titles = erddap_titles + zenodo_titles\n",
"\n",
"    options = all_titles\n",
"\n",
"    # Set the initial value of the dropdown only if options is not empty\n",
"    initial_value = options[0] if options else None\n",
"\n",
"    dropdown = Dropdown(\n",
"        options=options,\n",
"        description='Dataset:',\n",
"        value=initial_value\n",
"    )\n",
"\n",
"    # The metadata is loaded and displayed by the next cells when a selection is made\n",
"\n",
"else:\n",
"    print(\"DataFrames or the 'title' column are not available.\")\n",
"    # Fall back to a placeholder dropdown when the dataframes are not available\n",
"    options = [\"No datasets available\"]\n",
"    initial_value = options[0]\n",
"    dropdown = Dropdown(\n",
"        options=options,\n",
"        description='Dataset:',\n",
"        value=initial_value\n",
"    )\n",
"\n",
"\n",
"display(dropdown)" ] },
{ "cell_type": "markdown", "metadata": { "id": "xVqJ7EaYnGl6" }, "source": [ "**3. Loading metadata for the selected dataset**" ] },
{ "cell_type": "markdown", "metadata": { "id": "jjYUX1Meku-5" }, "source": [ "Once a dataset is chosen from the dropdown, its metadata is loaded and structured:\n", "\n", "* if the dataset comes from **Zenodo**, the notebook extracts the relevant fields directly from the Zenodo metadata table.\n", "\n", "* if the dataset comes from **ERDDAP**, the corresponding metadata `.csv` is downloaded and parsed.\n", "\n", "* key attributes (e.g., title, project, institution, geospatial and temporal coverage) are extracted and stored in a DataFrame.\n", "\n", "This ensures that both Zenodo and ERDDAP datasets can be handled in a unified way, even though their metadata formats differ.\n" ] },
{ "cell_type": "markdown", "metadata": { "id": "f3VoJ4n6m03R" }, "source": [ "**Note**: Run this code every time the dataset selection changes."
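] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an offline illustration of the ERDDAP branch of this step, the sketch below reads a small metadata table and pulls out the requested global attributes and the variable names. The sample rows and the helper name `read_info` are hypothetical; the columns `Row Type`, `Variable Name`, `Attribute Name` and `Value` are the ones used by the notebook code, while the `Data Type` column and the `NC_GLOBAL` marker for global attributes are assumptions about ERDDAP's layout.\n", "\n", "
```python
import csv
import io

# Hypothetical excerpt; the column names follow the ones the notebook code reads.
SAMPLE_INFO_CSV = '''Row Type,Variable Name,Attribute Name,Data Type,Value
attribute,NC_GLOBAL,title,String,Example mooring series
attribute,NC_GLOBAL,time_coverage_start,String,2019-01-01T00:00:00Z
variable,temperature,,float,
attribute,temperature,units,String,degree_C
'''


def read_info(csv_text, fields):
    '''Return (global attributes restricted to fields, list of variable names).'''
    attributes, variables = {}, []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row['Attribute Name']
        if row['Row Type'] == 'variable':
            variables.append(row['Variable Name'])
        elif row['Variable Name'] == 'NC_GLOBAL' and name in fields:
            # Keep the first value seen for each requested attribute.
            attributes.setdefault(name, row['Value'])
    return attributes, variables


attributes, variables = read_info(SAMPLE_INFO_CSV, ['title', 'doi', 'time_coverage_start'])
print(attributes, variables)
```
"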
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "id": "7e115aa3" }, "outputs": [], "source": [
"# @title\n",
"def load_selected_dataset(change):\n",
"    global new_df\n",
"    global metadata_df\n",
"    global zenodo_html_link  # Make the link available globally\n",
"    # The dropdown value is the dataset title\n",
"    selected_dataset_id = change['new']\n",
"    zenodo_html_link = None  # Reset the link\n",
"\n",
"    # Check if the selected dataset is in the zenodo_df\n",
"    is_zenodo_dataset = selected_dataset_id in zenodo_df['title'].values\n",
"\n",
"    if is_zenodo_dataset:\n",
"        # If it's a Zenodo dataset, extract information from zenodo_df\n",
"        zenodo_record = zenodo_df[zenodo_df['title'] == selected_dataset_id].iloc[0]\n",
"        metadata_to_display = {}\n",
"        for field in zenodo_fields:\n",
"            if field in zenodo_record.index:\n",
"                metadata_to_display[field] = zenodo_record[field]\n",
"\n",
"        # Store the self_html link globally for the button\n",
"        if 'self_html' in zenodo_record.index:\n",
"            zenodo_html_link = zenodo_record['self_html']\n",
"\n",
"        if metadata_to_display:\n",
"            metadata_df = pd.DataFrame.from_dict(metadata_to_display, orient='index', columns=['Value'])\n",
"            metadata_df.index.name = 'Attribute'\n",
"            # Zenodo data is already loaded and there is no separate CSV to read,\n",
"            # so new_df is set to an empty DataFrame.\n",
"            new_df = pd.DataFrame()\n",
"        else:\n",
"            print(f\"Could not find metadata for Zenodo dataset: {selected_dataset_id}\")\n",
"\n",
"    else:\n",
"        # If not a Zenodo dataset, assume it's an ERDDAP dataset\n",
"        try:\n",
"            metadata_url = df[df['title'] == selected_dataset_id]['metadata'].iloc[0]\n",
"            csv_url = metadata_url + '.csv'\n",
"\n",
"            new_df = pd.read_csv(csv_url)\n",
"\n",
"            # Extract the attributes of interest\n",
"            metadata_to_display = {}\n",
"\n",
"            for field in fields_of_interest:\n",
"                if field in new_df['Attribute Name'].values:\n",
"                    metadata_to_display[field] = new_df[new_df['Attribute Name'] == field]['Value'].iloc[0]\n",
"\n",
"            if metadata_to_display:\n",
"                metadata_df = pd.DataFrame.from_dict(metadata_to_display, orient='index', columns=['Value'])\n",
"                metadata_df.index.name = 'Attribute'\n",
"                #print(f\"Successfully loaded data and metadata for ERDDAP dataset: {selected_dataset_id}\")\n",
"\n",
"        except Exception as e:\n",
"            print(f\"ERROR loading data for dataset: {selected_dataset_id}\")\n",
"            print(e)\n",
"\n",
"\n",
"dropdown.observe(load_selected_dataset, names='value')\n",
"\n",
"if dropdown.value:\n",
"    load_selected_dataset({'new': dropdown.value})" ] },
{ "cell_type": "markdown", "metadata": { "id": "BArZv4AqnQGx" }, "source": [ "**4. Building interactive widgets for exploration**" ] },
{ "cell_type": "markdown", "metadata": { "id": "Uxc2hiRKnUyJ" }, "source": [ "Here, the notebook creates interactive tools to refine what part of the dataset to explore:\n", "\n", "* **variable checkboxes** are generated, listing all available variables for the selected dataset.\n", "\n", "* **date pickers** are created, based on the dataset’s reported start and end dates, so the user can filter by time range.\n", "\n", "* if the dataset is from **Zenodo**, a button is shown instead that links directly to the Zenodo landing page.\n", "\n", "Together, these widgets let the user choose variables, restrict time periods and explore metadata interactively." ] },
{ "cell_type": "markdown", "metadata": { "id": "SdvHHoUqnj7S" }, "source": [ "**Note**: Run this code every time the dataset selection changes."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "cellView": "form", "colab": { "base_uri": "https://localhost:8080/", "height": 349, "referenced_widgets": [ "f108cecb95e4483ba16bdfbe16b9ad37", "d0aed5db18694a18b3bbc48e9e44ecd5", "90a355920c764adba471b35c54c767fb" ] }, "id": "1cf425d8", "outputId": "b1161b3e-0d17-47c8-a6a1-e3d86a3dc926" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Metadata:\n" ] }, { "data": { "application/vnd.google.colaboratory.intrinsic+json": { "summary": "{\n \"name\": \"metadata_df\",\n \"rows\": 7,\n \"fields\": [\n {\n \"column\": \"Attribute\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 7,\n \"samples\": [\n \"title\",\n \"description\",\n \"license\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"Value\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}", "type": "dataframe", "variable_name": "metadata_df" }, "text/html": [ "\n", "
\n" ], "text/plain": [ " Value\n", "Attribute \n", "title FESOM 2. Bellingshausen and Amundsen Seas Exp...\n", "description The reference run was plublished in DOI:\\nIn t...\n", "creators [van Caspel, Mathias, Janout, Markus, Timmerma...\n", "funder [European Commission]\n", "doi 10.5281/zenodo.15299650\n", "license cc-by-4.0\n", "citation van Caspel, Mathias, Janout, Markus, Timmerman..." ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f108cecb95e4483ba16bdfbe16b9ad37", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Button(description='View on Zenodo', style=ButtonStyle())" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [
"# @title\n",
"def create_variable_checkboxes(df):\n",
"    \"\"\"Creates checkboxes for variables in the DataFrame and arranges them in a grid.\"\"\"\n",
"    if df is None or df.empty or 'Row Type' not in df.columns or 'Variable Name' not in df.columns:\n",
"        print(\"DataFrame is not valid or missing required columns.\")\n",
"        return None\n",
"\n",
"    variable_names = df[df['Row Type'] == 'variable']['Variable Name'].dropna().unique().tolist()\n",
"\n",
"    if not variable_names:\n",
"        print(\"No variables found in the DataFrame.\")\n",
"        return None\n",
"\n",
"    num_variables = len(variable_names)\n",
"    num_cols = 4\n",
"    num_rows = (num_variables + num_cols - 1) // num_cols  # ceiling division\n",
"\n",
"    grid = GridspecLayout(num_rows, num_cols)\n",
"\n",
"    for i, var_name in enumerate(variable_names):\n",
"        row = i // num_cols\n",
"        col = i % num_cols\n",
"        grid[row, col] = Checkbox(description=var_name, value=False)\n",
"\n",
"    return grid\n",
"\n",
"def create_time_select(df):\n",
"    \"\"\"Creates date pickers based on time_coverage_start and time_coverage_end.\"\"\"\n",
"    start_date_str = df[df['Attribute Name'] == 'time_coverage_start']['Value'].iloc[0] if 'time_coverage_start' in df['Attribute Name'].values else None\n",
"    end_date_str = df[df['Attribute Name'] == 'time_coverage_end']['Value'].iloc[0] if 'time_coverage_end' in df['Attribute Name'].values else None\n",
"\n",
"    if start_date_str and end_date_str:\n",
"        try:\n",
"            # Replace a trailing 'Z' so that datetime.fromisoformat accepts the timestamp\n",
"            start_date = datetime.fromisoformat(start_date_str.replace('Z', '+00:00')).date()\n",
"            end_date = datetime.fromisoformat(end_date_str.replace('Z', '+00:00')).date()\n",
"\n",
"            start_date_picker = DatePicker(\n",
"                description='Start Date:',\n",
"                value=start_date,\n",
"                min=start_date,\n",
"                max=end_date,\n",
"                disabled=False\n",
"            )\n",
"            end_date_picker = DatePicker(\n",
"                description='End Date:',\n",
"                value=end_date,\n",
"                min=start_date,\n",
"                max=end_date,\n",
"                disabled=False\n",
"            )\n",
"\n",
"            return VBox([start_date_picker, end_date_picker])\n",
"\n",
"        except Exception as e:\n",
"            print(f\"Error creating time select: {e}\")\n",
"            return None\n",
"    else:\n",
"        print(\"time_coverage_start or time_coverage_end not found in metadata.\")\n",
"        return None\n",
"\n",
"def create_zenodo_link_button(url):\n",
"    \"\"\"Creates a button that opens the given URL in a new tab when clicked.\"\"\"\n",
"    button = Button(description=\"View on Zenodo\")\n",
"\n",
"    def on_button_click(b):\n",
"        display(Javascript(f'window.open(\"{url}\");'))\n",
"\n",
"    button.on_click(on_button_click)\n",
"    return button\n",
"\n",
"\n",
"if 'new_df' in globals() and new_df is not None and not new_df.empty:\n",
"    # ERDDAP dataset: show metadata, variable checkboxes and date pickers\n",
"    checkbox_grid = create_variable_checkboxes(new_df)\n",
"    time_select_widget = create_time_select(new_df)\n",
"\n",
"    if 'metadata_df' in globals() and metadata_df is not None:\n",
"        print(\"Metadata:\")\n",
"        display(metadata_df)\n",
"\n",
"    if checkbox_grid is not None and time_select_widget is not None:\n",
"        display(VBox([Label(\"\"), checkbox_grid, Label(\"\"), time_select_widget]))\n",
"    elif checkbox_grid is not None:\n",
"        display(VBox([Label(\"\"), checkbox_grid]))\n",
"    elif time_select_widget is not None:\n",
"        display(VBox([Label(\"\"), time_select_widget]))\n",
"    else:\n",
"        print(\"No widgets to display.\")\n",
"elif 'metadata_df' in globals() and metadata_df is not None and not metadata_df.empty:\n",
"    # Zenodo dataset: show metadata and a link to the landing page\n",
"    print(\"Metadata:\")\n",
"    display(metadata_df)\n",
"    # Add the Zenodo link button if the link is available\n",
"    if 'zenodo_html_link' in globals() and zenodo_html_link:\n",
"        zenodo_button = create_zenodo_link_button(zenodo_html_link)\n",
"        display(zenodo_button)\n",
"else:\n",
"    print(\"Please select a Dataset from the dropdown menu\")" ] },
{ "cell_type": "markdown", "metadata": { "id": "6NLhQAGUnt7J" }, "source": [ "**5. Generating download links for ERDDAP datasets**" ] },
{ "cell_type": "markdown", "metadata": { "id": "Y_-7LVH-g3ke" }, "source": [ "**Note**: Run this cell only after selecting an ERDDAP dataset if you want to download the corresponding data in `.csv` format.\n", "If you change your dataset, variables, or time range, make sure to re-run this cell to update the download link." ] },
{ "cell_type": "markdown", "metadata": { "id": "_0H8R1GFn00S" }, "source": [ "Finally, the notebook enables downloading filtered data directly from ERDDAP:\n", "\n", "* based on the dataset selection, the chosen variables and any date filters, a query URL is constructed.\n", "\n", "* the query is checked against the server to confirm whether valid data is available.\n", "\n", "* if the data exists, a **Download button** is displayed, opening the dataset in .csv format.\n", "\n", "* if no data is available for the given variables or dates, the user receives a clear error message."
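] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The URL construction described above can be sketched as a small pure function. The helper name `build_erddap_csv_url` and the dataset ID `example_dataset` are hypothetical. Percent-encoding the comparison operators is a choice made in this sketch rather than what the notebook cell does, and the `&time>=...` constraint form shown here is the tabledap style; griddap requests normally use bracketed index ranges instead.\n", "\n", "
```python
from datetime import date
from urllib.parse import quote


def build_erddap_csv_url(base_url, dap_type, dataset_id, variables, start=None, end=None):
    '''Return a CSV query URL for an ERDDAP dataset using tabledap-style constraints.'''
    if not variables:
        raise ValueError('select at least one variable')
    # Variables are joined with an encoded comma, as in the notebook.
    query = '%2C'.join(quote(name, safe='') for name in variables)
    if start and end:
        # Encode >= and <= so the URL also works with strict HTTP clients.
        query += '&time' + quote('>=') + start.isoformat()
        query += '&time' + quote('<=') + end.isoformat()
    return f'{base_url}{dap_type}/{dataset_id}.csv?{query}'


url = build_erddap_csv_url(
    'https://er1.s4oceanice.eu/erddap/', 'tabledap', 'example_dataset',
    ['time', 'temperature'], date(2020, 1, 1), date(2020, 1, 31))
print(url)
```
"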
] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "b3e0fd48", "outputId": "e5cc64b2-7747-40ef-8bb6-487f04ec3ddb" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Please select at least one variable.\n" ] } ], "source": [
"# @title\n",
"def generate_download_url(dropdown_widget, checkbox_grid, df, base_url, time_select_widget=None):\n",
"    \"\"\"Generates the download URL based on dropdown and checkbox selections.\"\"\"\n",
"    selected_dataset_title = dropdown_widget.value\n",
"    if not selected_dataset_title:\n",
"        return \"Please select a dataset.\"\n",
"\n",
"    dataset_info = df[df['title'] == selected_dataset_title]\n",
"    if dataset_info.empty:\n",
"        return f\"Could not find information for dataset: {selected_dataset_title}\"\n",
"\n",
"    selected_dataset_id = dataset_info['datasetID'].iloc[0]  # Get the datasetID\n",
"\n",
"    data_structure = dataset_info['dataStructure'].iloc[0]\n",
"    if data_structure == 'table':\n",
"        dap_type = 'tabledap'\n",
"    elif data_structure == 'grid':\n",
"        dap_type = 'griddap'\n",
"    else:\n",
"        return f\"Unknown data structure: {data_structure}\"\n",
"\n",
"    selected_variables = []\n",
"    if checkbox_grid:\n",
"        # checkbox_grid is either a GridspecLayout of checkboxes or a single Checkbox\n",
"        if isinstance(checkbox_grid, GridspecLayout):\n",
"            for child in checkbox_grid.children:\n",
"                if isinstance(child, Checkbox) and child.value:\n",
"                    selected_variables.append(child.description)\n",
"        elif isinstance(checkbox_grid, Checkbox) and checkbox_grid.value:\n",
"            selected_variables.append(checkbox_grid.description)\n",
"\n",
"    if not selected_variables:\n",
"        return \"Please select at least one variable.\"\n",
"\n",
"    variables_string = \"%2C\".join(selected_variables)\n",
"\n",
"    url = f\"{base_url}{dap_type}/{selected_dataset_id}.csv?{variables_string}\"\n",
"\n",
"    # Add time constraints if time_select_widget is available\n",
"    # (the &time>=...&time<=... form is the tabledap constraint syntax)\n",
"    if time_select_widget and isinstance(time_select_widget, VBox):\n",
"        start_date_picker = time_select_widget.children[0]\n",
"        end_date_picker = time_select_widget.children[1]\n",
"        start_date = start_date_picker.value\n",
"        end_date = end_date_picker.value\n",
"\n",
"        if start_date and end_date:\n",
"            # Format dates as required by ERDDAP (usually ISO 8601) without the 'Z'\n",
"            start_date_str = start_date.isoformat()\n",
"            end_date_str = end_date.isoformat()\n",
"            url += f\"&time>={start_date_str}&time<={end_date_str}\"\n",
"\n",
"    return url\n",
"\n",
"# Build the download URL (or an explanatory message) from the current selections\n",
"if 'checkbox_grid' in globals() and checkbox_grid is not None:\n",
"    download_url = generate_download_url(dropdown, checkbox_grid, df, BASE_URL, time_select_widget)\n",
"\n",
"    def create_download_button(url):\n",
"        \"\"\"Creates a button that opens the given URL in a new tab when clicked.\"\"\"\n",
"        button = Button(description=\"Download Data\")\n",
"\n",
"        def on_button_click(b):\n",
"            display(Javascript(f'window.open(\"{url}\");'))\n",
"\n",
"        button.on_click(on_button_click)\n",
"        return button\n",
"\n",
"    if download_url:\n",
"        # Check if the generated download_url is an error message or a URL\n",
"        if download_url.startswith(\"http://\") or download_url.startswith(\"https://\"):\n",
"            # It's a URL, now check if it returns a 404\n",
"            try:\n",
"                response = requests.head(download_url, timeout=30)\n",
"                if response.status_code == 404:\n",
"                    print(\"Error: Data not found for the selected variables and time range. Please select other or more variables.\")\n",
"                else:\n",
"                    download_button = create_download_button(download_url)\n",
"                    display(download_button)\n",
"            except requests.exceptions.RequestException as e:\n",
"                print(f\"Error checking URL: {e}\")\n",
"        else:\n",
"            # If it's not a URL, it's a message such as \"Please select at least one variable.\"\n",
"            print(download_url)\n",
"else:\n",
"    print(\"Program is waiting for dataset selection from the dropdown menu.\")" ] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "3f30fb46cf1842bc8a13de2acc8a07cd": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "6d221a8a50444b3bba4485d8cacff85c": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DropdownModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DropdownModel", "_options_labels": [ "* The List of All Active Datasets in this ERDDAP *", "AAD - ASPeCt-Bio: Chlorophyll a in Antarctic sea ice from historical ice core dataset (1983 - 2008)", "AADC (2019) Extract of data from the sea ice measurements database - 1985-2007", "AMUNDSEN CRUISES DB", "Antarctic Tide Gauge Database", "AntAWS Dataset: A compilation of Antarctic automatic weather station observations - 3 Hours", "AntAWS Dataset: A compilation of Antarctic automatic weather station observations - Daily - 25% threshold", "AntAWS Dataset: A compilation of Antarctic automatic weather 
station observations - Daily - 75% threshold", "AntAWS Dataset: A compilation of Antarctic automatic weather station observations - Monthly - 25% threshold", "AntAWS Dataset: A compilation of Antarctic automatic weather station observations - Monthly - 75% threshold", "ARCTICNET CRUISES DB", "Australian Antarctic Program Survey Webcams", "British Antartica Survey Webcams", "CCHDO Bottle Data", "CCHDO CTD Data", "CMEMS - CORA: Coriolis Ocean database for ReAnalysis", "Collection of XBT performed on the high density line IX28 (Dumont d Urville-Hobart) - 1990-2020", "CTD (data from NISKIN Bottles) LB21 ARCTIC Cruise Italian Arctic project CASSANDRA", "CTD (DOWNCAST) LB21 ARCTIC Cruise Italian Arctic project CASSANDRA", "CTD data set from mooring S1 @ 1000 m", "Daily Southern Ocean Sea Level Anomaly And Geostrophic Currents from multimission altimetry, 2013-2019", "Data from a local source.", "ISAR L2 SST product", "MEOP - Animal-borne Profiles", "Monthly climatology fields for product GLOBAL_REANALYSIS_PHY_001_030", "Network for the Collection of Knowledge on meLt of Antarctic iCe shElves (NECKLACE)", "NOAA - GLobal Ocean Data Analysis Project (GLODAP)", "NOAA - Optimum Interpolation (OI) Sea Surface Temperature (SST) V2 High Resolution Dataset", "NOAA - Surface Ocean CO2 Atlas Database Version 2024 (SOCAT v2024)", "OCEAN ICE - Argo DE - Deployment of 4 Argo float profilers", "OCEAN ICE - Argo Profiling Floats Compilation", "OCEAN ICE - Argo UK - Deployment of 4 Argo float profilers", "OCEAN ICE - European circumpolar sea ice production fluxes", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - 27 drainage basins - high emission scenario", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - 27 drainage basins - low emission scenario", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - 5 Ocenan sectors - high emission scenario", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - 5 Ocenan sectors - 
low emission scenario", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - high emission scenario", "OCEAN ICE - Future freshwater fluxes from the Antarctic ice sheet - low emission scenario", "OCEAN ICE - Pan-Antarctic (90S-45S) temperature and salinity profiles compilation", "OCEAN ICE - Pan-Antarctic (90\\u00b0S-60\\u00b0S) moored time series compilation", "OCEAN ICE - Trace gas measurements, basal glacial meltwater fractions, and water ages during RV POLARSTERN cruise PS124, Southern Weddell Sea", "Physical and biogeochemical oceanography data from Conductivity, Temperature, Depth (Bottle) rosette deployments during the Antarctic Circumnavigation Expedition (ACE)", "Physical and biogeochemical oceanography data from Conductivity, Temperature, Depth (CTD) rosette deployments during the Antarctic Circumnavigation Expedition (ACE)", "POLARSTERN CRUISES DB", "PSMSL - Absolute Sea Level Trend", "SCAR - International Iceberg Database", "SO-CHIC - Continuous meteorological surface measurement conducted during Cruise 2022 - AgulhasII", "SOOS - Tracking of marine predators to protect Southern Ocean ecosystems", "Olivé Abelló et al., 2025 dataset - Iceberg grounding enhances the release of freshwater on the Antarctic continental shelf", "Datasets of the deliverable D3.1 - EO: Contribution of continental scale ice dynamics processes to freshwater fluxes", "FESOM 2. Circumpolar Simulation (1991-2020) /TS", "FESOM 2. Bellingshausen and Amundsen Seas Experiment (1991-2020) / TS", "FESOM 2. West and Shackleton Ice Shelf Experiment (1991-2020) / TS", "FESOM 2. Eastern Weddell Sea Experiment (1991-2020) / TS", "FESOM 2. West and Shackleton Ice Shelf Experiment (1991-2020) / UV", "FESOM 2. Bellingshausen and Amundsen Seas Experiment (1991-2020) / UV", "FESOM 2. Eastern Weddell Sea Experiment (1991-2020) / UV", "FESOM 2. 
Circumpolar Simulation (1991-2020) /UV", "Optimized Physical Tracer (OPT) reconstructions for past global ocean surface temperature, salinity and stable oxygen isotopes.", "Future freshwater fluxes from the Antarctic ice sheet", "NEMO4.2 eORCA1 configuration files for stable millennial ocean simulations", "EU project OCEAN:ICE Deliverable: D1.4 Gridded European circumpolar sea ice production fluxes", "Standardisation of pan-Antarctic mooring and profiles (D1.1)", "EU project OCEAN:ICE Deliverable: D1.4 Gridded European circumpolar sea ice production fluxes", "Record of water mass age and meltwater fractions, Weddell Sea (D1.12)", "Report on freshwater fluxes from surface mass budget and sub-shelf melt in Antarctica (D3.2)", "Temporal and spatial length scales in δ18O observations (D5.5)" ], "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "DropdownView", "description": "Dataset:", "description_tooltip": null, "disabled": false, "index": 56, "layout": "IPY_MODEL_724c285085aa479faf60ff84f983fe6f", "style": "IPY_MODEL_3f30fb46cf1842bc8a13de2acc8a07cd" } }, "724c285085aa479faf60ff84f983fe6f": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, 
"max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "90a355920c764adba471b35c54c767fb": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "ButtonStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ButtonStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "button_color": null, "font_weight": "" } }, "d0aed5db18694a18b3bbc48e9e44ecd5": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "f108cecb95e4483ba16bdfbe16b9ad37": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": 
"ButtonModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "ButtonModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "ButtonView", "button_style": "", "description": "View on Zenodo", "disabled": false, "icon": "", "layout": "IPY_MODEL_d0aed5db18694a18b3bbc48e9e44ecd5", "style": "IPY_MODEL_90a355920c764adba471b35c54c767fb", "tooltip": "" } } } } }, "nbformat": 4, "nbformat_minor": 0 }