-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit 922afb4
Showing
11 changed files
with
1,453 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
*.pyc | ||
*.pyo | ||
__pycache__ | ||
cluster-location | ||
salinity-centroids | ||
temperature-centroids | ||
argo-profiles |
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# TH clusters | ||
<a href="https://github.com/romeroqe/th-clusters"><img src="https://shields.io/github/v/release/romeroqe/th-clusters" alt="Release"></a> | ||
<a href="http://creativecommons.org/licenses/by/4.0/"><img src="https://shields.io/github/license/romeroqe/th-clusters" alt="License"></a> | ||
<a href="https://doi.org/10.5281/zenodo.10038645"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.10038645.svg" alt="DOI"></a> | ||
|
||
This repository has the data sets containing temperature and salinity centroids, in addition to the location of the thermohaline, temperature and salinity clusters described in a scientific publication that is currently under review. The data set has the position of centroids of 2 to 50 clusters of Conservative Temperature, Absolute Salinity and Thermohaline, which allows delimiting coherent thermohaline structures in the global ocean at different spatial scales. The repository also contains demo notebooks of the use of these clusters and the procedures described in the manuscript. | ||
|
||
## Datasets | ||
In the `cluster-location.zip` there are the CSV files that contain the location of the salinity, temperature and thermohaline clusters, with the headers: | ||
- `longitude`: Contains the longitude where the profile was measured. | ||
- `latitude`: Contains the latitude where the profile was measured. | ||
- `month`: Contains the month of the year when the profile was measured. | ||
- `2_{param}`: Contains clusters from 0 to _k_-1 for 2 centroids (_k_=2). | ||
- `3_{param}`: Contains clusters from 0 to _k_-1 for 2 centroids (_k_=3). | ||
- ... | ||
- `50_{param}`: Contains clusters from 0 to _k_-1 for 2 centroids (_k_=50). | ||
|
||
In the `salinity-centroids.zip` and `temperature-centroids.zip` there are the CSV files that contain the data of the _k_ centroids (one centroid per column) from 10 m to 1500 m depth (rows). The centroid files are named `{k}_centroids.csv`, where _k_ is a string with the number of centroids (clusters) of interest to which a zero (0) is added at the beginning, until it reaches 2 digits. | ||
|
||
Finally, `argo-profiles.zip` contains 5 profiles from the Argo snapshot of June 2023 (https://www.seanoe.org/data/00311/42182/#103075), used in the demo notebooks. | ||
|
||
## Demo | ||
To view the demos, go to each notebook and run it: | ||
|
||
- `demo1.ipynb`: How to read the CSV files that contain the location of the salinity, temperature and thermohaline clusters, how to plot them and how to obtain thermohaline clusters from temperature and salinity clusters. | ||
- `demo2.ipynb`: How to read the CSV files containing _k_ centroid data, how to implement Euclidean distance to classify new profiles, and how to plot your classified profiles on a T-S diagram. | ||
- `demo3.ipynb`: How to use the K-means algorithm to group profiles of any parameter, how to obtain the centroids, how to assign the resulting clusters to each profile and how to plot the profiles by group in a p-T diagram. | ||
|
||
## How to cite | ||
|
||
The manuscript is currently under review. | ||
|
||
## Argo data acknowledgment | ||
Argo data were collected and made freely available by the International Argo Program and the national programs that contribute to it. ([http://www.argo.ucsd.edu](http://www.argo.ucsd.edu), [http://argo.jcommops.org](http://argo.jcommops.org)). The Argo Program is part of the Global Ocean Observing System. | ||
|
||
## License | ||
|
||
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>. |
Binary file not shown.
Binary file not shown.
Large diffs are not rendered by default.
Oops, something went wrong.
Large diffs are not rendered by default.
Oops, something went wrong.
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
################################################################################# | ||
# Implementation of the Euclidean distance from a list of centroids and profiles | ||
# - Emmanuel Romero (https://github.com/romeroqe) | ||
# | ||
# This file contains code used to develop the methodology | ||
# of a publication that is currently under review. | ||
# | ||
|
||
import math, operator | ||
|
||
############################### | ||
# Calculate Euclidean distance | ||
############################### | ||
def distance(X, y, N): | ||
distance = 0 | ||
for x in range(N): | ||
# Euclidean distance | ||
distance += (X[x] - y[x])**2 | ||
return math.sqrt(distance) | ||
|
||
############################### | ||
# Get nearest centroid | ||
############################### | ||
def getNearestCentroid(Xi, centroids): | ||
N = len(Xi) | ||
labels = list(range(len(centroids))) | ||
distances = [] | ||
for key in labels: | ||
distances.append((key, distance(Xi, centroids[key], N))) | ||
distances.sort(key=operator.itemgetter(1)) | ||
return distances[0][0] | ||
|
||
############################### | ||
# Iterate profiles | ||
############################### | ||
def NearestCentroid(X, centroids): | ||
############################################################# | ||
# parameters: | ||
# X: 2-D list, shape: [num_profiles, 1451 (depth)] | ||
# centroids: 2-D list, shape: [num_clusters, 1451 (depth)] | ||
|
||
predictions = [] | ||
for i in range(len(X)): | ||
predictions.append(getNearestCentroid(X[i], centroids)) | ||
print(f"{i}/{len(X)}", end="\r") | ||
print(f"{i}/{len(X)}") | ||
return predictions |
Binary file not shown.
Binary file not shown.