Southern Ocean Mixed Layer Depth Estimation from ARGO Floats — Regression Method of Courtois et al. (2017)
For an interactive version of this page, please open it in Google Colab:
Open in Google Colab
(Ctrl + click to open the link in a new tab)
Alternatively, this notebook can be opened with Binder by following the link: Southern Ocean Mixed Layer Depth Estimation from ARGO Floats — Regression Method of Courtois et al. (2017)
Purpose
The Mixed Layer Depth (MLD) marks the upper ocean layer that is stirred and blended by winds, waves, and currents. It is a key property for many reasons:
Climate & Heat Storage: Controls heat and gas exchange between ocean and atmosphere.
Marine Life: Influences nutrient supply and light availability for phytoplankton growth.
Carbon Cycle: Regulates CO₂ uptake and long-term storage in the ocean interior.
Ocean Circulation: Contributes to water mass formation and global current systems.
In the Southern Ocean, MLD variability is central to understanding climate change impacts and ecosystem dynamics.
This notebook provides interactive tools to visualize and analyze MLD estimates from ARGO profiling floats. Users can:
Select specific float platforms and time periods.
View temperature–depth profiles and identify the MLD using a regression-based method.
Map monthly average MLDs across multiple floats.
Data sources
ARGO floats are autonomous, free-drifting instruments used for large-scale ocean monitoring. Each float:
Cycles vertically from the surface to depths of up to ~2,000 m.
Measures temperature, salinity, and sometimes biogeochemical parameters.
Transmits data via satellite when at the surface.
Operates for 4–5 years, collecting hundreds of profiles during its lifetime.
The global ARGO program maintains a network of ~4,000 floats worldwide. In the Southern Ocean, these floats provide year-round coverage in otherwise inaccessible regions, making them essential for climate and oceanographic research.
The dataset used in this notebook comes from https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.html. It includes time, latitude, longitude, pressure (converted to depth in meters), and temperature profiles. The analysis here focuses on the period December 2023 – March 2024, but users can adjust the query to other intervals.
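ERDDAP tabledap queries pack the requested variable list and constraints into the URL itself, with special characters percent-encoded (commas as %2C, the >= in time bounds as %3E=, colons as %3A). The dataset URL above can be built programmatically; a minimal sketch, where `build_query` is a hypothetical helper and not part of the notebook:

```python
from urllib.parse import quote

BASE = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv'

def build_query(variables, start, end):
    """Build an ERDDAP tabledap CSV URL with a time-range constraint."""
    cols = quote(','.join(variables), safe='')          # commas -> %2C
    constraints = quote(f'&time>={start}&time<={end}',  # > < : -> %3E %3C %3A
                        safe='&=')
    return f'{BASE}?{cols}{constraints}'

url = build_query(['PLATFORMCODE', 'time'],
                  '2023-12-19T22:25:00Z', '2024-03-07T19:23:20Z')
print(url)
```

The resulting string matches the form of the hard-coded URLs used in the cells below, so other variables or time windows can be requested by changing the arguments.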
Instructions to use this Notebook
Run each code cell in order by clicking the Play button (▶️) on the left of each grey code block. This ensures all features execute properly.
Explaining the code
Method Note
The MLD estimation algorithm used here follows Courtois et al. 2017, who proposed a simplified regression-based method inspired by Holte & Talley 2009.
Two linear regressions are fit: one in the mixed layer (≤100 m) and one in the thermocline (150–500 m).
The intersection of these regressions defines the MLD.
Compared to Holte & Talley’s original multi-criterion approach, this version is computationally lighter and well-suited to analyzing large ARGO datasets in regions with deep convection, such as the Southern Ocean.
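The intersection step can be written out directly: with a mixed-layer fit theta = m_ml*z + b_ml and a thermocline fit theta = m_tc*z + b_tc, the MLD is z = (b_tc - b_ml) / (m_ml - m_tc). A minimal sketch on a synthetic profile (the values are illustrative, not taken from the dataset):

```python
import numpy as np
from scipy.stats import linregress

# Synthetic profile: a uniform mixed layer above a linearly stratified thermocline
ml_depth = np.arange(0.0, 101.0, 10.0)    # mixed-layer samples, 0-100 m
ml_theta = np.full_like(ml_depth, 2.0)    # constant 2.0 degC
tc_depth = np.arange(150.0, 501.0, 50.0)  # thermocline samples, 150-500 m
tc_theta = 3.5 - 0.01 * tc_depth          # temperature decreasing with depth

# Fit theta as a linear function of depth in each layer
m_ml, b_ml, *_ = linregress(ml_depth, ml_theta)
m_tc, b_tc, *_ = linregress(tc_depth, tc_theta)

# The MLD is the depth where the two lines cross:
#   m_ml*z + b_ml = m_tc*z + b_tc  =>  z = (b_tc - b_ml) / (m_ml - m_tc)
mld = (b_tc - b_ml) / (m_ml - m_tc)
print(f"Estimated MLD: {mld:.1f} m")  # 150.0 m for this synthetic profile
```

The `estimate_mld` function defined later in the notebook implements exactly this calculation, plus guards for short profiles and intersections outside the fitted depth range.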
1. Notebook Setup and ARGO Float Platform Data Source Definition
This section imports all the necessary Python libraries for data handling, statistical analysis, mapping, and interactive widget creation. It also sets the URLs for accessing ARGO float platform information and associated time records from the OCEAN ICE ERDDAP server.
The following libraries are used in this notebook:
Data Acquisition & Processing: pandas, numpy, datetime.datetime, os
Visualization & Mapping: matplotlib.pyplot, scipy.stats.linregress, folium, folium.plugins.MarkerCluster
Interactive Data Exploration: ipywidgets
Output & Presentation: warnings, IPython.display
# @title
# Note: folium is not preinstalled in every Jupyter environment; if the import
# below fails, install it first (e.g. pip install folium).
import os
import warnings
from datetime import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import linregress
import folium
from folium.plugins import MarkerCluster
from ipywidgets import (
    FloatSlider,
    Text,
    HBox,
    Layout,
    Output,
    VBox,
    HTML,
    Label,
    Dropdown,
    SelectionSlider,
    Button,
)
# Note: IPython's HTML (imported below) shadows ipywidgets' HTML above.
from IPython.display import display, FileLink, HTML
platform_url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE'
time_plat_url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE%2Ctime&time%3E=2023-12-19T22%3A25%3A00Z&time%3C=2024-03-07T19%3A23%3A20Z'
2. Interactive ARGO Float Profile Viewer and MLD Estimator
This tool lets the user browse temperature–depth profiles by platform and date.
Data are retrieved from ERDDAP and pressure is converted to depth (1 dbar ≈ 1.0047 m).
MLD is then estimated following Courtois et al. (2017)
The resulting profile is plotted with regression lines for the mixed layer and thermocline, and a red horizontal line marking the estimated MLD.
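The pressure-to-depth step is a single scaling factor (the notebook's approximation of 1.0047 m of seawater per decibar; exact conversions also depend on latitude and water density). A quick sketch using the first few pressure values from the example profile displayed later in the notebook:

```python
import pandas as pd

DBAR_TO_M = 1.0047  # approx. metres of seawater per decibar

df = pd.DataFrame({'PRESS (decibar)': [14.1, 23.9, 34.0]})
df['DEPTH (m)'] = df['PRESS (decibar)'] * DBAR_TO_M
print(df['DEPTH (m)'].round(5).tolist())  # [14.16627, 24.01233, 34.1598]
```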
# @title
# Read the platform codes, skipping the ERDDAP units row
platforms_df = pd.read_csv(platform_url, skiprows=[1])
# Get unique platform codes and sort them
unique_platforms = sorted(platforms_df['PLATFORMCODE'].unique())
# Create and display the platform dropdown
platform_dropdown = Dropdown(
    options=unique_platforms,
    disabled=False,
)
# Read the data from the time_plat_url, skipping the first row
time_df = pd.read_csv(time_plat_url, skiprows=[1])
# Convert 'time' column to datetime objects
time_df['time'] = pd.to_datetime(time_df['time'])
# Create the date dropdown (options will be updated dynamically)
date_dropdown = Dropdown(
    options=[datetime.now()],  # Start with a placeholder option
    disabled=False,
)
# Output widget to display the date dropdown and plot
date_output_box = Output()
# Output widget for the plot
plot_output = Output()
# Estimate the MLD as the intersection of two linear fits (Courtois et al., 2017)
def estimate_mld(depth, theta, ml_limit=100, tc_start=150, tc_end=500):
    # Ensure depth and theta are sorted by depth for correct slicing
    sorted_indices = np.argsort(depth)
    depth = depth[sorted_indices]
    theta = theta[sorted_indices]
    # Linear fit in the mixed layer (depth <= ml_limit)
    ml_indices = depth <= ml_limit
    ml_depth = depth[ml_indices]
    ml_theta = theta[ml_indices]
    # Check if there are enough data points for linear regression
    if len(ml_depth) < 2:
        slope_ml, intercept_ml = np.nan, np.nan
    else:
        slope_ml, intercept_ml, *_ = linregress(ml_depth, ml_theta)
    # Linear fit in the thermocline (tc_start <= depth <= tc_end)
    tc_indices = (depth >= tc_start) & (depth <= tc_end)
    tc_depth = depth[tc_indices]
    tc_theta = theta[tc_indices]
    # Check if there are enough data points for linear regression
    if len(tc_depth) < 2:
        slope_tc, intercept_tc = np.nan, np.nan
    else:
        slope_tc, intercept_tc, *_ = linregress(tc_depth, tc_theta)
    # Intersection of the two regression lines gives the MLD
    mld = np.nan  # Initialize MLD as NaN
    if not np.isnan(slope_ml) and not np.isnan(slope_tc) and slope_ml != slope_tc:
        mld = (intercept_tc - intercept_ml) / (slope_ml - slope_tc)
        # Ensure the MLD lies within the depth range used for fitting
        valid_depths = np.concatenate([ml_depth, tc_depth])
        if len(valid_depths) > 0 and (mld < np.min(valid_depths) or mld > np.max(valid_depths)):
            mld = np.nan  # Invalidate MLD if it falls outside the fitting range
    return mld, (slope_ml, intercept_ml), (slope_tc, intercept_tc)
# Update date dropdown options and plot based on the selected platform
def update_widgets_and_plot(*args):
    selected_platform = platform_dropdown.value
    if selected_platform is not None:
        # Filter time_df for the selected platform and get unique dates
        platform_dates_df = time_df[time_df['PLATFORMCODE'] == selected_platform]
        unique_times = sorted(platform_dates_df['time'].unique())
        # Update dropdown options
        with date_output_box:
            date_output_box.clear_output()
            if unique_times:
                date_dropdown.options = unique_times
                date_dropdown.value = unique_times[0]  # Default to the first available date
            else:
                date_dropdown.options = [datetime.now()]  # Reset to placeholder if no dates
                date_dropdown.value = datetime.now()
            # Display the label and dropdown in an HBox
            display(HBox([Label('Select a date'), date_dropdown]))
        # Trigger a plot update based on the new dropdown value
        update_plot()
# Update the plot when a dropdown value changes
def update_plot(*args):
    global api_df  # Keep the latest profile DataFrame available for later inspection
    selected_platform = platform_dropdown.value
    selected_time = date_dropdown.value
    if selected_platform is not None and selected_time is not None:
        # Format the selected time for the URL (colons percent-encoded as %3A)
        formatted_time = selected_time.strftime('%Y-%m-%dT%H%%3A%M%%3A%SZ')
        # Construct the URL with the selected values
        api_url = f'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?time%2Clatitude%2Clongitude%2CPRESS%2CTEMP&PLATFORMCODE=%22{selected_platform}%22&time%3E={formatted_time}&time%3C={formatted_time}'
        try:
            # Read the data from the URL into a DataFrame
            api_df = pd.read_csv(api_url)
            # Skip the units row, then convert 'PRESS' from decibars to depth
            # in metres (approx. 1 dbar = 1.0047 m of seawater)
            api_df = api_df.iloc[1:].copy()
            api_df['PRESS (decibar)'] = pd.to_numeric(api_df['PRESS'])
            api_df['DEPTH (m)'] = api_df['PRESS (decibar)'] * 1.0047
            api_df['TEMP (Degree_C)'] = pd.to_numeric(api_df['TEMP'])
            # Drop the original columns
            api_df = api_df.drop(columns=['PRESS', 'TEMP'])
            # Ensure data is numeric
            real_depth = api_df['DEPTH (m)'].values.astype(float)
            real_theta = api_df['TEMP (Degree_C)'].values.astype(float)
            # MLD calculation
            mld, (slope_ml, intercept_ml), (slope_tc, intercept_tc) = estimate_mld(real_depth, real_theta)
            # Regression lines for plotting
            theta_ml_fit = slope_ml * real_depth + intercept_ml
            theta_tc_fit = slope_tc * real_depth + intercept_tc
            # Plot rendering
            with plot_output:
                plot_output.clear_output(wait=True)
                plt.figure(figsize=(6, 10))
                plt.plot(real_theta, real_depth, label='Profile θ')
                # Plot the fitted lines only where the fits succeeded
                if not np.isnan(slope_ml) and not np.isnan(intercept_ml):
                    plt.plot(theta_ml_fit, real_depth, '--', label='Mixed-layer fit')
                if not np.isnan(slope_tc) and not np.isnan(intercept_tc):
                    plt.plot(theta_tc_fit, real_depth, '--', label='Thermocline fit')
                # Plot the MLD line only if the estimate is valid
                if not np.isnan(mld):
                    plt.axhline(mld, color='red', linestyle='-', label=f'Estimated MLD ≈ {mld:.1f} m')
                plt.gca().invert_yaxis()
                plt.xlabel('Potential temperature (°C)')
                plt.ylabel('Depth (m)')
                plt.title(f'Estimated MLD for platform {selected_platform} on {selected_time.strftime("%Y-%m-%d %H:%M:%S")}')
                plt.legend()
                plt.grid(True)
                plt.tight_layout()
                plt.show()
        except Exception as e:
            with plot_output:
                plot_output.clear_output(wait=True)
                print(f"Error fetching data or generating plot: {e}")
# Observe changes in the platform dropdown and update the date dropdown and plot
platform_dropdown.observe(update_widgets_and_plot, names='value')
# Observe changes in the date dropdown and update the plot
date_dropdown.observe(update_plot, names='value')
# Display the dropdowns and the output widget boxes
display(VBox([HBox([Label('Select a platform'), platform_dropdown]),
              date_output_box]))
# Initial update of the widgets and plot
update_widgets_and_plot()
3. Display the Interactive Profile Plot
This cell renders the plot output widget created in the previous section. After selecting a platform and a date above, the corresponding temperature–depth profile, the two regression lines, and the estimated MLD appear here.
# @title
display(plot_output)
4. Display Retrieved Profile Data
This section allows direct inspection of raw values, calculated depths, and converted temperatures before further analysis or visualization.
# @title
display(api_df)
|   | time | latitude | longitude | PRESS (decibar) | DEPTH (m) | TEMP (Degree_C) |
|---|---|---|---|---|---|---|
| 1 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 14.1 | 14.16627 | 0.094 |
| 2 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 23.9 | 24.01233 | 0.087 |
| 3 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 34.0 | 34.15980 | 0.028 |
| 4 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 44.1 | 44.30727 | -0.173 |
| 5 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 54.6 | 54.85662 | -0.783 |
| ... | ... | ... | ... | ... | ... | ... |
| 90 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 904.3 | 908.55021 | 1.133 |
| 91 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 914.2 | 918.49674 | 1.134 |
| 92 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 924.2 | 928.54374 | 1.134 |
| 93 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 934.2 | 938.59074 | 1.134 |
| 94 | 2024-01-12T00:33:00Z | -74.85373 | -102.42796666666666 | 941.7 | 946.12599 | 1.135 |

94 rows × 6 columns
5. Computation of Monthly Mean MLD from ARGO Profiles
This block fetches the full dataset for December 19, 2023 – March 7, 2024 and prepares it for spatio-temporal analysis of MLD variability:
Data are cleaned and pressure is converted to depth.
The estimate_mld function computes the MLD for each profile.
Each result is paired with its geographic coordinates and timestamp.
Monthly average MLDs are calculated for each platform and location.
Marker sizes for the later map are scaled according to the MLD range.
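The monthly aggregation at the end of this block is a plain pandas groupby over month, position, and platform; a minimal sketch with toy MLD values (illustrative only, not from the dataset):

```python
import pandas as pd

# Toy per-profile MLD results: two January profiles from one platform,
# one February profile from another
mld_df = pd.DataFrame({
    'PLATFORMCODE': ['A', 'A', 'B'],
    'year_month':   ['2024-01', '2024-01', '2024-02'],
    'latitude':     [-74.85, -74.85, -66.10],
    'longitude':    [-102.43, -102.43, -57.20],
    'MLD':          [40.0, 60.0, 120.0],
})

# Average the MLD within each (month, location, platform) group
monthly = (mld_df
           .groupby(['year_month', 'latitude', 'longitude', 'PLATFORMCODE'])
           .agg({'MLD': 'mean'})
           .reset_index())
print(monthly['MLD'].tolist())  # [50.0, 120.0]
```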
# @title
warnings.filterwarnings("ignore", message="Converting to PeriodArray/Index representation will drop timezone information.")
url = 'https://er1.s4oceanice.eu/erddap/tabledap/ARGO_FLOATS_OCEANICE.csv?PLATFORMCODE%2Ctime%2Clatitude%2Clongitude%2CPRESS%2CTEMP&time%3E=2023-12-19T22%3A25%3A00Z&time%3C=2024-03-07T19%3A23%3A20Z'
all_data_df = pd.read_csv(url, skiprows=[1])
# Convert 'time' to datetime objects
all_data_df['time'] = pd.to_datetime(all_data_df['time'])
# Add 'year_month' column
all_data_df['year_month'] = all_data_df['time'].dt.to_period('M').astype(str)
# Ensure data is numeric for MLD calculation
all_data_df['PRESS'] = pd.to_numeric(all_data_df['PRESS'], errors='coerce')
all_data_df['TEMP'] = pd.to_numeric(all_data_df['TEMP'], errors='coerce')
all_data_df.dropna(subset=['PRESS', 'TEMP'], inplace=True)
# Convert 'PRESS' to depth in meters (approx. 1 dbar = 1.0047 m seawater)
all_data_df['DEPTH'] = all_data_df['PRESS'] * 1.0047
# Calculate MLD for each unique profile (platform and time)
unique_profiles = all_data_df[['PLATFORMCODE', 'time']].drop_duplicates()
mld_data = []
for index, row in unique_profiles.iterrows():
    platform_code = row['PLATFORMCODE']
    profile_time = row['time']
    # Filter data for the current profile
    profile_data = all_data_df[(all_data_df['PLATFORMCODE'] == platform_code) & (all_data_df['time'] == profile_time)].copy()
    # Sort profile data by depth
    profile_data_sorted = profile_data.sort_values(by='DEPTH').copy()
    real_depth = profile_data_sorted['DEPTH'].values
    real_theta = profile_data_sorted['TEMP'].values
    mld, _, _ = estimate_mld(real_depth, real_theta)  # Reuse the estimate_mld function
    # Keep the MLD and location data only when the estimate is valid
    if not np.isnan(mld):
        mld_data.append({
            'PLATFORMCODE': platform_code,
            'time': profile_time,
            'latitude': profile_data_sorted['latitude'].iloc[0],
            'longitude': profile_data_sorted['longitude'].iloc[0],
            'MLD': mld,
            'year_month': profile_data_sorted['year_month'].iloc[0]
        })
mld_df = pd.DataFrame(mld_data)
# Calculate the monthly average MLD for each location, including PLATFORMCODE in the groupby
mld_monthly_location = mld_df.groupby(['year_month', 'latitude', 'longitude', 'PLATFORMCODE']).agg({'MLD': 'mean'}).reset_index()
# Calculate min and max MLD for scaling marker size
min_mld = mld_monthly_location['MLD'].min()
max_mld = mld_monthly_location['MLD'].max()
# Scaling factor for marker radius (adjust as needed for better visualization)
radius_scale = 20 / (max_mld - min_mld) if (max_mld - min_mld) > 0 else 1
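The radius_scale factor above implements a simple min-max scaling so that marker radii span a fixed range (a base size of 5 up to 25) regardless of the MLD spread; a quick sketch with toy values:

```python
min_mld, max_mld = 40.0, 140.0  # toy MLD range in metres
radius_scale = 20 / (max_mld - min_mld) if (max_mld - min_mld) > 0 else 1

# Radius grows linearly from the base size (5) at min_mld to 25 at max_mld
radii = [(mld - min_mld) * radius_scale + 5 for mld in (40.0, 90.0, 140.0)]
print(radii)  # [5.0, 15.0, 25.0]
```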
6. Monthly MLD Map — Clustered ARGO Profiles (Folium)
Builds an interactive map of monthly MLD values:
Uses Folium with MarkerCluster to group profile locations by month.
Each marker popup shows the month, platform code, MLD, and coordinates.
A layer control allows toggling months on and off.
The map is embedded at a reduced size for easy navigation.
# @title
m = folium.Map(location=[-50, -30], zoom_start=1)  # Wide view centred on the southern high latitudes
month_layers = {}
for month in mld_monthly_location['year_month'].unique():
    # Create a MarkerCluster for each month
    month_layers[month] = MarkerCluster(name=month)
    month_layers[month].add_to(m)
# Add markers to the clusters
for index, row in mld_monthly_location.iterrows():
    month = row['year_month']
    latitude = row['latitude']
    longitude = row['longitude']
    mld = row['MLD']
    platform_code = row['PLATFORMCODE']
    # Scale the marker radius based on MLD (optional with MarkerCluster, but can be used in the popup)
    scaled_radius = (mld - min_mld) * radius_scale + 5  # Add a base size
    # Create a marker with a popup showing month, platform, MLD, and coordinates
    folium.Marker(
        location=[latitude, longitude],
        popup=folium.Popup(f"Month: {month}<br>Platform: {platform_code}<br>MLD: {mld:.2f} m<br>Latitude: {latitude:.4f}<br>Longitude: {longitude:.4f}", max_width=300)
    ).add_to(month_layers[month])
# Add the layer control
folium.LayerControl().add_to(m)
# Display the map at a reduced size
# Convert the Folium map to HTML and wrap it in a styled div
map_html = m._repr_html_()
styled_map = HTML(f'<div style="width: 50%; height: 50%;">{map_html}</div>')
display(styled_map)