Southern Ocean Mixed Layer Depth Estimation from ARGO Floats — Regression Method of Courtois et al. (2017)
For an interactive version of this page, please open it in Google Colab:
Open in Google Colab
(Ctrl + click to open the link in a new tab)
Alternatively, this notebook can be opened with Binder by following the link: Southern Ocean Mixed Layer Depth Estimation from ARGO Floats — Regression Method of Courtois et al. (2017)
Purpose
The Mixed Layer Depth (MLD) marks the upper ocean layer that is stirred and blended by winds, waves, and currents. It is a key property for many reasons:
Climate & Heat Storage: Controls heat and gas exchange between ocean and atmosphere.
Marine Life: Influences nutrient supply and light availability for phytoplankton growth.
Carbon Cycle: Regulates CO₂ uptake and long-term storage in the ocean interior.
Ocean Circulation: Contributes to water mass formation and global current systems.
In the Southern Ocean, MLD variability is central to understanding climate change impacts and ecosystem dynamics.
This notebook provides interactive tools to visualize and analyze MLD estimates from ARGO profiling floats. Users can:
Select specific float platforms and time periods.
View temperature–depth profiles and identify the MLD using a regression-based method.
Map monthly average MLDs across multiple floats.
Data sources
ARGO floats are autonomous, free-drifting instruments used for large-scale ocean monitoring. Each float:
Cycles vertically from the surface to depths of up to ~2,000 m.
Measures temperature, salinity, and sometimes biogeochemical parameters.
Transmits data via satellite when at the surface.
Operates for 4–5 years, collecting hundreds of profiles during its lifetime.
The global ARGO program maintains a network of ~4,000 floats worldwide. In the Southern Ocean, these floats provide year-round coverage in otherwise inaccessible regions, making them essential for climate and oceanographic research.
The dataset used in this notebook comes from https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.html. It includes time, latitude, longitude, pressure (converted to depth in meters), and temperature profiles. The analysis here focuses on the period December 2023 – March 2024, but users can adjust the query to other intervals.
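ERDDAP tabledap queries pack the requested variable list and constraints into the URL itself, with special characters percent-encoded (commas as %2C, the >= in time bounds as %3E=, colons as %3A). The dataset URL above can be built programmatically; a minimal sketch, where `build_query` is a hypothetical helper and not part of the notebook:

```python
from urllib.parse import quote

BASE = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv'

def build_query(variables, start, end):
    """Build an ERDDAP tabledap CSV URL with a time-range constraint."""
    cols = quote(','.join(variables), safe='')          # commas -> %2C
    constraints = quote(f'&time>={start}&time<={end}',  # > < : -> %3E %3C %3A
                        safe='&=')
    return f'{BASE}?{cols}{constraints}'

url = build_query(['PLATFORMCODE', 'time'],
                  '2023-12-19T22:25:00Z', '2024-03-07T19:23:20Z')
print(url)
```

The resulting string matches the form of the hard-coded URLs used in the cells below, so other variables or time windows can be requested by changing the arguments.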
Instructions to use this Notebook
Run each code cell in order by clicking the Play button (▶️) on the left of each grey code block. This ensures all features execute properly.
Explaining the code
Method Note
The MLD estimation algorithm used here follows Courtois et al. 2017, who proposed a simplified regression-based method inspired by Holte & Talley 2009.
Two linear regressions are fit: one in the mixed layer (≤100 m) and one in the thermocline (150–500 m).
The intersection of these regressions defines the MLD.
Compared to Holte & Talley’s original multi-criterion approach, this version is computationally lighter and well-suited to analyzing large ARGO datasets in regions with deep convection, such as the Southern Ocean.
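The intersection step can be written out directly: with a mixed-layer fit theta = m_ml*z + b_ml and a thermocline fit theta = m_tc*z + b_tc, the MLD is z = (b_tc - b_ml) / (m_ml - m_tc). A minimal sketch on a synthetic profile (the values are illustrative, not taken from the dataset):

```python
import numpy as np
from scipy.stats import linregress

# Synthetic profile: a uniform mixed layer above a linearly stratified thermocline
ml_depth = np.arange(0.0, 101.0, 10.0)    # mixed-layer samples, 0-100 m
ml_theta = np.full_like(ml_depth, 2.0)    # constant 2.0 degC
tc_depth = np.arange(150.0, 501.0, 50.0)  # thermocline samples, 150-500 m
tc_theta = 3.5 - 0.01 * tc_depth          # temperature decreasing with depth

# Fit theta as a linear function of depth in each layer
m_ml, b_ml, *_ = linregress(ml_depth, ml_theta)
m_tc, b_tc, *_ = linregress(tc_depth, tc_theta)

# The MLD is the depth where the two lines cross:
#   m_ml*z + b_ml = m_tc*z + b_tc  =>  z = (b_tc - b_ml) / (m_ml - m_tc)
mld = (b_tc - b_ml) / (m_ml - m_tc)
print(f"Estimated MLD: {mld:.1f} m")  # 150.0 m for this synthetic profile
```

The `estimate_mld` function defined later in the notebook implements exactly this calculation, plus guards for short profiles and intersections outside the fitted depth range.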
1. Notebook Setup and ARGO Float Platform Data Source Definition
This section imports all the necessary Python libraries for data handling, statistical analysis, mapping, and interactive widget creation. It also sets the URLs for accessing ARGO float platform information and associated time records from the OCEAN ICE ERDDAP server.
The following libraries are used in this notebook:
Data Acquisition & Processing: pandas, numpy, datetime.datetime, os
Visualization & Mapping: matplotlib.pyplot, scipy.stats.linregress, folium, folium.plugins.MarkerCluster
Interactive Data Exploration: ipywidgets
Output & Presentation: warnings, IPython.display
# @title
# Note: folium is not preinstalled in every Jupyter environment; if the import
# below fails, install it first (e.g. pip install folium).
import os
import warnings
from datetime import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
import folium
from folium.plugins import MarkerCluster
from ipywidgets import (
    FloatSlider,
    Text,
    HBox,
    Layout,
    Output,
    VBox,
    HTML,
    Label,
    Dropdown,
    SelectionSlider,
    Button,
)
# Note: IPython's HTML (imported below) shadows ipywidgets' HTML above.
from IPython.display import display, FileLink, HTML
platform_url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE'
time_plat_url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE%2Ctime&time%3E=2023-12-19T22%3A25%3A00Z&time%3C=2024-03-07T19%3A23%3A20Z'
2. Interactive ARGO Float Profile Viewer and MLD Estimator
This tool lets the user browse temperature–depth profiles by platform and date.
Data are retrieved from ERDDAP and pressure is converted to depth (1 dbar ≈ 1.0047 m).
MLD is then estimated following Courtois et al. (2017)
The resulting profile is plotted with regression lines for the mixed layer and thermocline, and a red horizontal line marking the estimated MLD.
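The pressure-to-depth step is a single scaling factor (the notebook's approximation of 1.0047 m of seawater per decibar; exact conversions also depend on latitude and water density). A quick sketch using the first few pressure values from the example profile displayed later in the notebook:

```python
import pandas as pd

DBAR_TO_M = 1.0047  # approx. metres of seawater per decibar

df = pd.DataFrame({'PRESS (decibar)': [14.1, 23.9, 34.0]})
df['DEPTH (m)'] = df['PRESS (decibar)'] * DBAR_TO_M
print(df['DEPTH (m)'].round(5).tolist())  # [14.16627, 24.01233, 34.1598]
```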
# @title
# Read the platform codes, skipping the ERDDAP units row
platforms_df = pd.read_csv(platform_url, skiprows=[1])
# Get unique platform codes and sort them
unique_platforms = sorted(platforms_df['PLATFORMCODE'].unique())
# Create and display the platform dropdown
platform_dropdown = Dropdown(
    options=unique_platforms,
    disabled=False,
)
# Read the data from the time_plat_url, skipping the first row
time_df = pd.read_csv(time_plat_url, skiprows=[1])
# Convert 'time' column to datetime objects
time_df['time'] = pd.to_datetime(time_df['time'])
# Create the date dropdown (options will be updated dynamically)
date_dropdown = Dropdown(
    options=[datetime.now()],  # Start with a placeholder option
    disabled=False,
)
# Output widget to display the date dropdown and plot
date_output_box = Output()
# Output widget for the plot
plot_output = Output()
# Estimate the MLD as the intersection of two linear fits (Courtois et al., 2017)
def estimate_mld(depth, theta, ml_limit=100, tc_start=150, tc_end=500):
    # Ensure depth and theta are sorted by depth for correct slicing
    sorted_indices = np.argsort(depth)
    depth = depth[sorted_indices]
    theta = theta[sorted_indices]
    # Linear fit in the mixed layer (depth <= ml_limit)
    ml_indices = depth <= ml_limit
    ml_depth = depth[ml_indices]
    ml_theta = theta[ml_indices]
    # Check if there are enough data points for linear regression
    if len(ml_depth) < 2:
        slope_ml, intercept_ml = np.nan, np.nan
    else:
        slope_ml, intercept_ml, *_ = linregress(ml_depth, ml_theta)
    # Linear fit in the thermocline (tc_start <= depth <= tc_end)
    tc_indices = (depth >= tc_start) & (depth <= tc_end)
    tc_depth = depth[tc_indices]
    tc_theta = theta[tc_indices]
    # Check if there are enough data points for linear regression
    if len(tc_depth) < 2:
        slope_tc, intercept_tc = np.nan, np.nan
    else:
        slope_tc, intercept_tc, *_ = linregress(tc_depth, tc_theta)
    # Intersection of the two regression lines gives the MLD
    mld = np.nan  # Initialize MLD as NaN
    if not np.isnan(slope_ml) and not np.isnan(slope_tc) and slope_ml != slope_tc:
        mld = (intercept_tc - intercept_ml) / (slope_ml - slope_tc)
        # Ensure the MLD lies within the depth range used for fitting
        valid_depths = np.concatenate([ml_depth, tc_depth])
        if len(valid_depths) > 0 and (mld < np.min(valid_depths) or mld > np.max(valid_depths)):
            mld = np.nan  # Invalidate MLD if it falls outside the fitting range
    return mld, (slope_ml, intercept_ml), (slope_tc, intercept_tc)
# Update date dropdown options and plot based on the selected platform
def update_widgets_and_plot(*args):
    selected_platform = platform_dropdown.value
    if selected_platform is not None:
        # Filter time_df for the selected platform and get unique dates
        platform_dates_df = time_df[time_df['PLATFORMCODE'] == selected_platform]
        unique_times = sorted(platform_dates_df['time'].unique())
        # Update dropdown options
        with date_output_box:
            date_output_box.clear_output()
            if unique_times:
                date_dropdown.options = unique_times
                date_dropdown.value = unique_times[0]  # Default to the first available date
            else:
                date_dropdown.options = [datetime.now()]  # Reset to placeholder if no dates
                date_dropdown.value = datetime.now()
            # Display the label and dropdown in an HBox
            display(HBox([Label('Select a date'), date_dropdown]))
        # Trigger a plot update based on the new dropdown value
        update_plot()
# Update the plot when a dropdown value changes
def update_plot(*args):
    global api_df  # Keep the latest profile DataFrame available for later inspection
    selected_platform = platform_dropdown.value
    selected_time = date_dropdown.value
    if selected_platform is not None and selected_time is not None:
        # Format the selected time for the URL (colons percent-encoded as %3A)
        formatted_time = selected_time.strftime('%Y-%m-%dT%H%%3A%M%%3A%SZ')
        # Construct the URL with the selected values
        api_url = f'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?time%2Clatitude%2Clongitude%2CPRESS%2CTEMP&PLATFORMCODE=%22{selected_platform}%22&time%3E={formatted_time}&time%3C={formatted_time}'
        try:
            # Read the data from the URL into a DataFrame
            api_df = pd.read_csv(api_url)
            # Skip the units row, then convert 'PRESS' from decibars to depth
            # in metres (approx. 1 dbar = 1.0047 m of seawater)
            api_df = api_df.iloc[1:].copy()
            api_df['PRESS (decibar)'] = pd.to_numeric(api_df['PRESS'])
            api_df['DEPTH (m)'] = api_df['PRESS (decibar)'] * 1.0047
            api_df['TEMP (Degree_C)'] = pd.to_numeric(api_df['TEMP'])
            # Drop the original columns
            api_df = api_df.drop(columns=['PRESS', 'TEMP'])
            # Ensure data is numeric
            real_depth = api_df['DEPTH (m)'].values.astype(float)
            real_theta = api_df['TEMP (Degree_C)'].values.astype(float)
            # MLD calculation
            mld, (slope_ml, intercept_ml), (slope_tc, intercept_tc) = estimate_mld(real_depth, real_theta)
            # Regression lines for plotting
            theta_ml_fit = slope_ml * real_depth + intercept_ml
            theta_tc_fit = slope_tc * real_depth + intercept_tc
            # Plot rendering
            with plot_output:
                plot_output.clear_output(wait=True)
                plt.figure(figsize=(6, 10))
                plt.plot(real_theta, real_depth, label='Profile θ')
                # Plot the fitted lines only where the fits succeeded
                if not np.isnan(slope_ml) and not np.isnan(intercept_ml):
                    plt.plot(theta_ml_fit, real_depth, '--', label='Mixed-layer fit')
                if not np.isnan(slope_tc) and not np.isnan(intercept_tc):
                    plt.plot(theta_tc_fit, real_depth, '--', label='Thermocline fit')
                # Plot the MLD line only if the estimate is valid
                if not np.isnan(mld):
                    plt.axhline(mld, color='red', linestyle='-', label=f'Estimated MLD ≈ {mld:.1f} m')
                plt.gca().invert_yaxis()
                plt.xlabel('Potential temperature (°C)')
                plt.ylabel('Depth (m)')
                plt.title(f'Estimated MLD for platform {selected_platform} on {selected_time.strftime("%Y-%m-%d %H:%M:%S")}')
                plt.legend()
                plt.grid(True)
                plt.tight_layout()
                plt.show()
        except Exception as e:
            with plot_output:
                plot_output.clear_output(wait=True)
                print(f"Error fetching data or generating plot: {e}")
# Observe changes in the platform dropdown and update the date dropdown and plot
platform_dropdown.observe(update_widgets_and_plot, names='value')
# Observe changes in the date dropdown and update the plot
date_dropdown.observe(update_plot, names='value')
# Display the dropdowns and the output widget boxes
display(VBox([HBox([Label('Select a platform'), platform_dropdown]),
              date_output_box]))
# Initial update of the widgets and plot
update_widgets_and_plot()
3. Display the Interactive Profile Plot
This cell renders the plot output widget created in the previous section. After selecting a platform and a date above, the corresponding temperature–depth profile, the two regression lines, and the estimated MLD appear here.
# @title
display(plot_output)
4. Display Retrieved Profile Data
This section allows direct inspection of raw values, calculated depths, and converted temperatures before further analysis or visualization.
# @title
display(api_df)
|   | time | latitude | longitude | PRESS (decibar) | DEPTH (m) | TEMP (Degree_C) |
|---|---|---|---|---|---|---|
| 1 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 14.1 | 14.16627 | 0.094 |
| 2 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 23.9 | 24.01233 | 0.087 |
| 3 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 34.0 | 34.15980 | 0.028 |
| 4 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 44.1 | 44.30727 | -0.173 |
| 5 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 54.6 | 54.85662 | -0.783 |
| ... | ... | ... | ... | ... | ... | ... |
| 90 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 904.3 | 908.55021 | 1.133 |
| 91 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 914.2 | 918.49674 | 1.134 |
| 92 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 924.2 | 928.54374 | 1.134 |
| 93 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 934.2 | 938.59074 | 1.134 |
| 94 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 941.7 | 946.12599 | 1.135 |

94 rows × 6 columns
5. Computation of Monthly Mean MLD from ARGO Profiles
This block fetches the full dataset for December 19, 2023 – March 7, 2024 and prepares it for spatio-temporal analysis of MLD variability:
Data are cleaned and pressure is converted to depth.
The estimate_mld function computes the MLD for each profile.
Each result is paired with its geographic coordinates and timestamp.
Monthly average MLDs are calculated for each platform and location.
Marker sizes for the later map are scaled according to the MLD range.
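The monthly aggregation at the end of this block is a plain pandas groupby over month, position, and platform; a minimal sketch with toy MLD values (illustrative only, not from the dataset):

```python
import pandas as pd

# Toy per-profile MLD results: two January profiles from one platform,
# one February profile from another
mld_df = pd.DataFrame({
    'PLATFORMCODE': ['A', 'A', 'B'],
    'year_month':   ['2024-01', '2024-01', '2024-02'],
    'latitude':     [-74.85, -74.85, -66.10],
    'longitude':    [-102.43, -102.43, -57.20],
    'MLD':          [40.0, 60.0, 120.0],
})

# Average the MLD within each (month, location, platform) group
monthly = (mld_df
           .groupby(['year_month', 'latitude', 'longitude', 'PLATFORMCODE'])
           .agg({'MLD': 'mean'})
           .reset_index())
print(monthly['MLD'].tolist())  # [50.0, 120.0]
```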
# @title
warnings.filterwarnings("ignore", message="Converting to PeriodArray/Index representation will drop timezone information.")
url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE%2Ctime%2Clatitude%2Clongitude%2CPRESS%2CTEMP&time%3E=2023-12-19T22%3A25%3A00Z&time%3C=2024-03-07T19%3A23%3A20Z'
all_data_df = pd.read_csv(url, skiprows=[1])
# Convert 'time' to datetime objects
all_data_df['time'] = pd.to_datetime(all_data_df['time'])
# Add 'year_month' column
all_data_df['year_month'] = all_data_df['time'].dt.to_period('M').astype(str)
# Ensure data is numeric for MLD calculation
all_data_df['PRESS'] = pd.to_numeric(all_data_df['PRESS'], errors='coerce')
all_data_df['TEMP'] = pd.to_numeric(all_data_df['TEMP'], errors='coerce')
all_data_df.dropna(subset=['PRESS', 'TEMP'], inplace=True)
# Convert 'PRESS' to depth in meters (approx. 1 dbar = 1.0047 m seawater)
all_data_df['DEPTH'] = all_data_df['PRESS'] * 1.0047
# Calculate MLD for each unique profile (platform and time)
unique_profiles = all_data_df[['PLATFORMCODE', 'time']].drop_duplicates()
mld_data = []
for index, row in unique_profiles.iterrows():
    platform_code = row['PLATFORMCODE']
    profile_time = row['time']
    # Filter data for the current profile
    profile_data = all_data_df[(all_data_df['PLATFORMCODE'] == platform_code) & (all_data_df['time'] == profile_time)].copy()
    # Sort profile data by depth
    profile_data_sorted = profile_data.sort_values(by='DEPTH').copy()
    real_depth = profile_data_sorted['DEPTH'].values
    real_theta = profile_data_sorted['TEMP'].values
    mld, _, _ = estimate_mld(real_depth, real_theta)  # Reuse the estimate_mld function
    # Keep the MLD and location data only when the estimate is valid
    if not np.isnan(mld):
        mld_data.append({
            'PLATFORMCODE': platform_code,
            'time': profile_time,
            'latitude': profile_data_sorted['latitude'].iloc[0],
            'longitude': profile_data_sorted['longitude'].iloc[0],
            'MLD': mld,
            'year_month': profile_data_sorted['year_month'].iloc[0]
        })
mld_df = pd.DataFrame(mld_data)
# Calculate the monthly average MLD for each location, including PLATFORMCODE in the groupby
mld_monthly_location = mld_df.groupby(['year_month', 'latitude', 'longitude', 'PLATFORMCODE']).agg({'MLD': 'mean'}).reset_index()
# Calculate min and max MLD for scaling marker size
min_mld = mld_monthly_location['MLD'].min()
max_mld = mld_monthly_location['MLD'].max()
# Scaling factor for marker radius (adjust as needed for better visualization)
radius_scale = 20 / (max_mld - min_mld) if (max_mld - min_mld) > 0 else 1
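The radius_scale factor above implements a simple min-max scaling so that marker radii span a fixed range (a base size of 5 up to 25) regardless of the MLD spread; a quick sketch with toy values:

```python
min_mld, max_mld = 40.0, 140.0  # toy MLD range in metres
radius_scale = 20 / (max_mld - min_mld) if (max_mld - min_mld) > 0 else 1

# Radius grows linearly from the base size (5) at min_mld to 25 at max_mld
radii = [(mld - min_mld) * radius_scale + 5 for mld in (40.0, 90.0, 140.0)]
print(radii)  # [5.0, 15.0, 25.0]
```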
6. Monthly MLD Map — Clustered ARGO Profiles (Folium)
Builds an interactive map of monthly MLD values:
Uses Folium with MarkerCluster to group profile locations by month.
Each marker popup shows the month, platform code, MLD, and coordinates.
A layer control allows toggling months on and off.
The map is embedded at a reduced size for easy navigation.
# @title
m = folium.Map(location=[-50, -30], zoom_start=1)  # Wide view centred on the southern high latitudes
month_layers = {}
for month in mld_monthly_location['year_month'].unique():
    # Create a MarkerCluster for each month
    month_layers[month] = MarkerCluster(name=month)
    month_layers[month].add_to(m)
# Add markers to the clusters
for index, row in mld_monthly_location.iterrows():
    month = row['year_month']
    latitude = row['latitude']
    longitude = row['longitude']
    mld = row['MLD']
    platform_code = row['PLATFORMCODE']
    # Scale the marker radius based on MLD (optional with MarkerCluster, but can be used in the popup)
    scaled_radius = (mld - min_mld) * radius_scale + 5  # Add a base size
    # Create a marker with a popup showing month, platform, MLD, and coordinates
    folium.Marker(
        location=[latitude, longitude],
        popup=folium.Popup(f"Month: {month}<br>Platform: {platform_code}<br>MLD: {mld:.2f} m<br>Latitude: {latitude:.4f}<br>Longitude: {longitude:.4f}", max_width=300)
    ).add_to(month_layers[month])
# Add the layer control
folium.LayerControl().add_to(m)
# Display the map at a reduced size
# Convert the Folium map to HTML and wrap it in a styled div
map_html = m._repr_html_()
styled_map = HTML(f'<div style="width: 50%; height: 50%;">{map_html}</div>')
display(styled_map)