
Jlorincz/common testing #119

Merged
merged 66 commits into from
May 17, 2024
Changes from 62 commits
Commits
66 commits
7d422d9
New generalized testing structure
lajz Apr 8, 2024
cff8aae
Improved generalization across tokamaks
lajz Apr 9, 2024
96c9e64
Test against sql running
lajz Apr 17, 2024
483df9a
use NFS folders to get tokamak
gtrevisan Apr 17, 2024
3a464ab
Clean up test against sql when run without pytest
lajz Apr 21, 2024
76e3d3d
Fix to add more clarity around expected failures
lajz Apr 21, 2024
24f9106
Better support for missing data and cli
lajz Apr 22, 2024
47099fd
Improve cli experience
lajz Apr 22, 2024
dce1a5f
allow tweaking tokamak from env vars
gtrevisan Apr 22, 2024
982ec0b
Avoid errors for missing data
lajz Apr 24, 2024
b2f0ed4
Merge branch 'jlorincz/common_testing' of github.com:MIT-PSFC/disrupt…
lajz Apr 24, 2024
4ca499d
New generalized testing structure
lajz Apr 8, 2024
dcf0159
Improved generalization across tokamaks
lajz Apr 9, 2024
1cd6025
Test against sql running
lajz Apr 17, 2024
ea02632
use NFS folders to get tokamak
gtrevisan Apr 17, 2024
c034568
Clean up test against sql when run without pytest
lajz Apr 21, 2024
80b2d16
Fix to add more clarity around expected failures
lajz Apr 21, 2024
d354c9d
Better support for missing data and cli
lajz Apr 22, 2024
6e94244
Improve cli experience
lajz Apr 22, 2024
aec6e35
Avoid errors for missing data
lajz Apr 24, 2024
cd7f0eb
allow tweaking tokamak from env vars
gtrevisan Apr 22, 2024
14cefc6
Merge branch 'jlorincz/common_testing' of github.com:MIT-PSFC/disrupt…
lajz Apr 24, 2024
9034c16
New generalized testing structure
lajz Apr 8, 2024
f3c4c75
Improved generalization across tokamaks
lajz Apr 9, 2024
0913642
Test against sql running
lajz Apr 17, 2024
a9759f7
use NFS folders to get tokamak
gtrevisan Apr 17, 2024
2a1e45c
Clean up test against sql when run without pytest
lajz Apr 21, 2024
c0089ec
Fix to add more clarity around expected failures
lajz Apr 21, 2024
bf3c772
Better support for missing data and cli
lajz Apr 22, 2024
510246b
Improve cli experience
lajz Apr 22, 2024
ce5d9f5
Avoid errors for missing data
lajz Apr 24, 2024
6461563
allow tweaking tokamak from env vars
gtrevisan Apr 22, 2024
4f2523c
New generalized testing structure
lajz Apr 8, 2024
6799f76
Improved generalization across tokamaks
lajz Apr 9, 2024
d5f0b09
Merge branch 'jlorincz/common_testing' of github.com:MIT-PSFC/disrupt…
lajz Apr 24, 2024
2eef615
Remove single tokamak testing as replaced by generalized testing
lajz Apr 24, 2024
af9ff85
Clean up report string
lajz Apr 24, 2024
79652ad
Fix common testing for d3d
lajz May 2, 2024
a621f5d
add z parameters to xfailures
gtrevisan May 3, 2024
156a9be
add Te width to xfailures
gtrevisan May 3, 2024
8bed3e4
Fix pytest issues
lajz May 3, 2024
df2c6f6
Restructure for eval files for improved clarity
lajz May 3, 2024
72196df
Fix test naming
lajz May 8, 2024
28bc286
Change default python test running to fail slow
lajz May 8, 2024
bc03937
Improve interoperability of feature testing
lajz May 10, 2024
db8c17d
d3d fixes for interoperable feature testing
lajz May 11, 2024
9ac663b
Better system for shot id selection
lajz May 13, 2024
abf2914
Fix shot id issue
lajz May 13, 2024
1f11d8c
Add shot info to expected failures
lajz May 14, 2024
4316f93
Fix xfail mark to continue execution and better monkey patch
lajz May 14, 2024
ceac5fd
Fix cast runtime warning bug and add xfail shot granularity
lajz May 14, 2024
6775998
Fix incorrect constants
lajz May 14, 2024
cdd5d4f
remove quick example, superseded by efit example
gtrevisan May 14, 2024
48d8d2c
check env var before folders
gtrevisan May 14, 2024
0a869ac
use common routines for tokamak and handler
gtrevisan May 14, 2024
550b132
Value rather than Runtime errors
gtrevisan May 14, 2024
05d38e1
remove old testing script
gtrevisan May 14, 2024
d3a1e52
run all tests in workflow
gtrevisan May 14, 2024
cc20a14
Widen usage of tokamak mapping infrastructure
lajz May 14, 2024
1f7fcfb
Improve mapping method naming
lajz May 14, 2024
de8a358
Resolve merge conflicts
lajz May 14, 2024
81bea4b
fix simple tests
gtrevisan May 15, 2024
42a3c2b
Remove expected shot failures
lajz May 15, 2024
d70c7ae
Single dictionaries for test constants
lajz May 15, 2024
6582012
fix logic for fast tests
gtrevisan May 17, 2024
825be59
remove arg set to default
gtrevisan May 17, 2024
12 changes: 5 additions & 7 deletions .github/workflows/tests.yml
@@ -86,6 +86,8 @@ jobs:
echo "${{ secrets.CMOD_LOGIN }}" \
| tee ~/logbook.sybase_login \
| sha256sum
echo DISPY_TOKAMAK=CMOD \
| tee -a "$GITHUB_ENV"

- name: Setup DIII-D
if: ${{ matrix.tokamak == 'DIII-D' }}
@@ -107,7 +109,7 @@
| sha256sum
echo -e "[FreeTDS]\nDescription = FreeTDS\nDriver = $TDS" \
| sudo tee -a /etc/odbcinst.ini
echo DIIID_TEST=1 \
echo DISPY_TOKAMAK=D3D \
| tee -a "$GITHUB_ENV"

- name: Setup Python
@@ -139,12 +141,8 @@
- name: Test EFIT
run: python examples/efit.py

- name: Test quick
run: pytest -v tests/test_quick.py

- name: Test features
if: ${{ matrix.tokamak == 'C-MOD' }}
run: pytest -v --durations=0 tests/test_cmod_features.py
- name: Run all tests
run: pytest -v --durations=0 tests

- name: Close tunnel
run: xargs -a ssh.pid kill -TERM
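The workflow changes above route tokamak selection through a single `DISPY_TOKAMAK` environment variable (replacing the old per-machine `DIIID_TEST=1` flag). The diff imports `get_tokamak_from_environment` but never shows its body, so the following is only a sketch of how such a lookup could behave; the enum values are illustrative assumptions:

```python
import os
from enum import Enum

class Tokamak(Enum):
    # Illustrative values; the real enum lives in
    # disruption_py.utils.mappings.tokamak.
    CMOD = "cmod"
    D3D = "d3d"

def get_tokamak_from_environment():
    """Map DISPY_TOKAMAK (set in the CI steps above) onto the enum.

    Returns None when the variable is unset or unrecognized, so a caller
    can fall back to other detection (e.g. the NFS-folder check).
    """
    name = os.environ.get("DISPY_TOKAMAK", "").strip().upper()
    try:
        return Tokamak[name]
    except KeyError:
        return None
```

Per the `check env var before folders` commit, the environment variable takes precedence over folder-based detection; that ordering, not this exact implementation, is what the diff guarantees.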
156 changes: 25 additions & 131 deletions disruption_py/cli/evaluate_methods.py
@@ -1,141 +1,33 @@
import argparse
from contextlib import contextmanager
from typing import Dict, List
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype
import logging
from disruption_py.handlers.cmod_handler import CModHandler
from disruption_py.settings.log_settings import LogSettings
from disruption_py.settings.shot_ids_request import ShotIdsRequestParams, shot_ids_request_runner
from disruption_py.settings.shot_settings import ShotSettings
from disruption_py.utils.constants import TIME_CONST
from disruption_py.utils.eval.eval_against_sql import eval_against_sql, get_failure_statistics_string
from disruption_py.utils.mappings.mappings_helpers import map_string_to_enum
from disruption_py.utils.mappings.tokamak import Tokamak
from disruption_py.utils.mappings.tokamak_helpers import get_tokamak_from_environment
from disruption_py.utils.math_utils import matlab_gradient_1d_vectorized
from disruption_py.utils.mappings.tokamak_helpers import get_tokamak_from_environment, get_tokamak_test_expected_failure_columns, get_tokamak_handler, get_tokamak_test_shot_ids

CMOD_TEST_SHOTS = [
1150805012, # Flattop Disruption
1150805013, # No Disruption
1150805014, # No Disruption
1150805015, # Rampdown Disruption
1150805016, # Rampdown Disruption
1150805017, # Rampdown Disruption
1150805019, # Rampdown Disruption
1150805020, # Rampdown Disruption
1150805021, # Rampdown Disruption
1150805022 # Flattop Disruption
]

TIME_EPSILON = 0.05 # Tolerance for taking the difference between two times [s]
IP_EPSILON = 1e5 # Tolerance for taking the difference between two ip values [A]

VAL_TOLERANCE = 0.01 # Tolerance for comparing values between MDSplus and SQL
MATCH_FRACTION = 0.95 # Fraction of signals that must match between MDSplus and SQL

def get_mdsplus_data(handler, shot_list):
shot_settings = ShotSettings(
efit_tree_name="efit18",
set_times_request="efit",
log_settings=LogSettings(
console_log_level=logging.ERROR
)
)
return handler.get_shots_data(
shot_ids_request=shot_list,
shot_settings=shot_settings,
output_type_request="dict",
)

def get_sql_data(handler, mdsplus_data : Dict, shot_list):
shot_data = {}
for shot_id in shot_list:
times = mdsplus_data[shot_id]['time']
sql_data = handler.database.get_shots_data([shot_id])
shot_data[shot_id] = pd.merge_asof(times.to_frame(), sql_data, on='time', direction='nearest', tolerance=TIME_CONST)
return shot_data
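The deleted `get_sql_data` above aligns SQL rows onto each shot's MDSplus timebase with `pd.merge_asof`. A self-contained illustration of that alignment, using synthetic data and a placeholder tolerance (the real `TIME_CONST` comes from `disruption_py.utils.constants` and is not shown in this diff):

```python
import pandas as pd

TIME_CONST = 0.005  # placeholder tolerance in seconds (assumption)

# An MDSplus timebase and SQL rows on a slightly shifted timebase.
times = pd.DataFrame({"time": [0.00, 0.10, 0.20, 0.30]})
sql_data = pd.DataFrame({"time": [0.001, 0.102, 0.500],
                         "ip": [1.0, 2.0, 3.0]})

# Nearest-neighbour join: SQL rows farther than TIME_CONST from every
# MDSplus time are left as NaN rather than matched.
aligned = pd.merge_asof(times, sql_data, on="time",
                        direction="nearest", tolerance=TIME_CONST)
```

In the original, `times` is a Series, hence the `times.to_frame()` call; `merge_asof` also requires both frames to be sorted on the key column.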


def test_data_match(sql_shot_df : pd.DataFrame, mdsplus_shot_df : pd.DataFrame, data_column : str):

sql_column_data = sql_shot_df[data_column].astype(np.float64)
mds_column_data = mdsplus_shot_df[data_column].astype(np.float64)

# compare data for numeric differences
relative_difference = np.where(
sql_column_data != 0,
np.abs((mds_column_data - sql_column_data) / sql_column_data),
np.where(mds_column_data != 0, np.inf, np.nan)
)
numeric_anomalies_mask = np.greater(relative_difference, VAL_TOLERANCE)

# compare data for nan differences
sql_is_nan_ = pd.isnull(sql_column_data)
mdsplus_is_nan = pd.isnull(mds_column_data)
nan_anomalies_mask = (sql_is_nan_ != mdsplus_is_nan)

anomalies = np.argwhere(numeric_anomalies_mask | nan_anomalies_mask)

return not (len(anomalies) / len(relative_difference) > 1 - MATCH_FRACTION)
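The match criterion in `test_data_match` above (at least `MATCH_FRACTION` of samples within `VAL_TOLERANCE` relative difference, with mismatched NaN patterns counted as anomalies) can be exercised on its own. This sketch restates the same logic on plain NumPy arrays:

```python
import numpy as np

VAL_TOLERANCE = 0.01   # 1% relative tolerance, as above
MATCH_FRACTION = 0.95  # fraction of samples that must agree

def column_matches(sql, mds):
    sql = np.asarray(sql, dtype=np.float64)
    mds = np.asarray(mds, dtype=np.float64)
    # Relative difference; a zero reference matches only a zero value.
    rel = np.where(sql != 0,
                   np.abs((mds - sql) / sql),
                   np.where(mds != 0, np.inf, np.nan))
    numeric_bad = rel > VAL_TOLERANCE          # NaN compares as False
    nan_bad = np.isnan(sql) != np.isnan(mds)   # mismatched NaN pattern
    anomalies = np.count_nonzero(numeric_bad | nan_bad)
    return anomalies / len(rel) <= 1 - MATCH_FRACTION
```

For example, two identical columns pass, while a column where 10% of samples are off by 50% exceeds the 5% anomaly budget and fails.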

def evaluate_cmod_accuracy(shot_list : List = None):
"""
Evaluate the accuracy of CMod methods.

Prints a short report on the methods that have succeeded and failed.
Success criterion is having more than 95% of results within 1% of known results.
"""
print("Evaluating accuracy of CMod methods...")
if shot_list is None or len(shot_list) == 0:
shot_list = CMOD_TEST_SHOTS
def evaluate_accuracy(tokamak : Tokamak, shot_ids : list[int], fail_quick : bool = False, data_columns : list[str] = None):
handler = get_tokamak_handler(tokamak)
if shot_ids is None or len(shot_ids) == 0:
shot_ids = get_tokamak_test_shot_ids(tokamak)
else:
shot_list = [int(shot_id) for shot_id in shot_list]

@contextmanager
def monkey_patch_numpy_gradient():
original_function = np.gradient
np.gradient = matlab_gradient_1d_vectorized
try:
yield
finally:
np.gradient = original_function

with monkey_patch_numpy_gradient():
return _evaluate_cmod_accuracy(shot_list)

def _evaluate_cmod_accuracy(shot_list : List = None):
cmod_handler = CModHandler()
print("Getting data from MDSplus")
mdsplus_data = get_mdsplus_data(cmod_handler, shot_list)
print("Getting data from sql table")
sql_data = get_sql_data(cmod_handler, mdsplus_data, shot_list)

success_columns = set()
unknown_columns = set()
failure_columns = set()

for shot_id in shot_list:

mdsplus_shot_df : pd.DataFrame = mdsplus_data[shot_id]
sql_shot_df : pd.DataFrame = sql_data[shot_id]
shot_ids = [int(shot_id) for shot_id in shot_ids]

for data_column in mdsplus_shot_df.columns:
if data_column not in sql_shot_df.columns or not is_numeric_dtype(mdsplus_shot_df[data_column]):
unknown_columns.add(data_column)
continue
if test_data_match(sql_shot_df, mdsplus_shot_df, data_column):
success_columns.add(data_column)
else:
failure_columns.add(data_column)
expected_failure_columns = get_tokamak_test_expected_failure_columns(tokamak)

data_differences = eval_against_sql(
handler=handler,
shot_ids=shot_ids,
expected_failure_columns=expected_failure_columns,
fail_quick=fail_quick,
test_columns=data_columns
)

success_columns = success_columns.difference(failure_columns)
unknown_columns = unknown_columns.difference(success_columns).difference(failure_columns)
print(f"Successful Columns (failure criteria not met for any shot): {success_columns}")
print(f"Columns with a failure: {failure_columns}")
print(f"Columns that lacked testing data: {unknown_columns}")
return success_columns, unknown_columns, failure_columns
print(get_failure_statistics_string(data_differences))


def main(args):
"""
@@ -183,13 +75,15 @@ def main(args):
)
all_shot_ids = shot_ids_request_runner(args.shotlist, shot_ids_request_params)

if tokamak == Tokamak.CMOD:
evaluate_cmod_accuracy(all_shot_ids)
else:
print("Sorry, this tokamak is not currently supported.")

data_columns = [args.data_column] if args.data_column else None

print("Running evaluation...")
evaluate_accuracy(tokamak=tokamak, shot_ids=all_shot_ids, fail_quick=args.fail_quick, data_columns=data_columns)


def get_parser():
parser = argparse.ArgumentParser(description='Evaluate the accuracy of DisruptionPy methods on a Tokamak.')
parser.add_argument('--shotlist', type=str, help='Path to file specifying a shotlist, leave blank for interactive mode', default=None)
parser.add_argument('--fail_quick', action='store_true', help='Fail quickly', default=False)
parser.add_argument('--data_column', type=str, help='Data column to test', default=None)
return parser
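The rewritten CLI drops the CMod-only branch: `main` resolves the tokamak and calls `evaluate_accuracy` directly. The parser's behaviour can be checked in isolation; this mirrors the `get_parser` in the diff above rather than importing it:

```python
import argparse

def get_parser():
    # Same flags and defaults as the get_parser() shown in the diff.
    parser = argparse.ArgumentParser(
        description="Evaluate the accuracy of DisruptionPy methods on a Tokamak.")
    parser.add_argument("--shotlist", type=str, default=None,
                        help="Path to file specifying a shotlist, "
                             "leave blank for interactive mode")
    parser.add_argument("--fail_quick", action="store_true", default=False,
                        help="Fail quickly")
    parser.add_argument("--data_column", type=str, default=None,
                        help="Data column to test")
    return parser

args = get_parser().parse_args(["--fail_quick", "--data_column", "ip"])
```

Note that `main` wraps `args.data_column` in a one-element list, so each CLI invocation tests at most one column.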
2 changes: 1 addition & 1 deletion disruption_py/handlers/__init__.py
@@ -1 +1 @@
from .cmod_handler import CModHandler
from .handler import Handler
10 changes: 6 additions & 4 deletions disruption_py/mdsplus_integration/mds_connection.py
@@ -3,6 +3,8 @@
import numpy as np
import MDSplus

from disruption_py.utils.utils import safe_cast

class ProcessMDSConnection():
"""
Abstract class for connecting to MDSplus.
@@ -142,7 +144,7 @@ def get_data(

data = self.conn.get("_sig=" + path, arguments).data()
if astype:
data = data.astype(astype, copy=False)
data = safe_cast(data, astype)

return data

@@ -185,9 +187,9 @@ def get_data_with_dims(
dims = [self.conn.get(f"dim_of(_sig,{dim_num})").data() for dim_num in dim_nums]

if astype:
data = data.astype(astype, copy=False)
data = safe_cast(data, astype)
if cast_all:
dims = [dim.astype(astype, copy=False) for dim in dims]
dims = [safe_cast(dim, astype) for dim in dims]

return data, *dims

@@ -226,7 +228,7 @@
dims = [self.conn.get(f"dim_of({path},{d})").data() for d in dim_nums]

if astype:
dims = [dim.astype(astype, copy=False) for dim in dims]
dims = [safe_cast(dim, astype) for dim in dims]

return dims

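Every `data.astype(astype, copy=False)` call above becomes `safe_cast(data, astype)`, matching the `Fix cast runtime warning bug` commit. `safe_cast`'s body is not in this diff; a plausible sketch (an assumption, not the actual implementation) simply suppresses the warning NumPy can emit for lossy casts:

```python
import warnings
import numpy as np

def safe_cast(array, astype, copy=False):
    """Hypothetical stand-in for disruption_py.utils.utils.safe_cast.

    Casts like the old .astype(..., copy=False) call, but silences the
    RuntimeWarning NumPy may raise for lossy casts (e.g. inf -> int).
    """
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.asarray(array).astype(astype, copy=copy)
```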
5 changes: 3 additions & 2 deletions disruption_py/settings/output_type_request.py
@@ -203,7 +203,7 @@ def get_results(self, params: FinishOutputTypeRequestParams):
return self.results

def stream_output_cleanup(self, params: FinishOutputTypeRequestParams):
self.results = []
self.results = {}

class DataFrameOutputRequest(OutputTypeRequest):
"""
@@ -213,7 +213,8 @@ def __init__(self):
self.results : pd.DataFrame = pd.DataFrame()

def _output_shot(self, params : ResultOutputTypeRequestParams):
self.results = pd.concat([self.results, params.result], ignore_index=True)
if not params.result.empty and not params.result.isna().all().all():
self.results = pd.concat([self.results, params.result], ignore_index=True)

def get_results(self, params: FinishOutputTypeRequestParams):
return self.results
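The new `_output_shot` guard above skips shots whose result frame is empty or entirely NaN before concatenating. Standalone, the same pattern looks like this (synthetic frames; `ip` is just an example column):

```python
import pandas as pd

results = pd.DataFrame()
chunks = [
    pd.DataFrame(),                        # empty shot result
    pd.DataFrame({"ip": [float("nan")]}),  # all-NaN shot result
    pd.DataFrame({"ip": [1.0, 2.0]}),      # real data
]
for chunk in chunks:
    # Same guard as the diff: only concatenate frames that carry data,
    # which also sidesteps pandas' warnings about empty/all-NA concat.
    if not chunk.empty and not chunk.isna().all().all():
        results = pd.concat([results, chunk], ignore_index=True)
```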
1 change: 1 addition & 0 deletions disruption_py/settings/set_times_request.py
@@ -272,6 +272,7 @@ def _get_times(self, params : SetTimesRequestParams) -> np.ndarray:
_set_times_request_mappings: Dict[str, SetTimesRequest] = {
"efit" : EfitSetTimesRequest(),
"disruption" : DisruptionSetTimesRequest(),
"disruption_warning": {Tokamak.CMOD: EfitSetTimesRequest(), Tokamak.D3D: DisruptionSetTimesRequest()},
"ip" : IpSetTimesRequest(),
}
# --8<-- [end:set_times_request_dict]
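The new `disruption_warning` entry is the first mapping value that is itself a per-tokamak dict rather than a single `SetTimesRequest`, so resolution has to branch on the value type. The real resolver is not shown in this diff; a sketch of the idea, with strings standing in for the request objects:

```python
from enum import Enum

class Tokamak(Enum):
    # Illustrative values; the real enum lives elsewhere in disruption_py.
    CMOD = "cmod"
    D3D = "d3d"

# Values may be a single request or a per-tokamak dict, as with the
# "disruption_warning" entry above.
_set_times_request_mappings = {
    "efit": "EfitSetTimesRequest",
    "disruption_warning": {Tokamak.CMOD: "EfitSetTimesRequest",
                           Tokamak.D3D: "DisruptionSetTimesRequest"},
}

def resolve_set_times_request(name, tokamak):
    """Sketch of resolving a possibly per-tokamak mapping entry."""
    request = _set_times_request_mappings[name]
    if isinstance(request, dict):
        request = request[tokamak]
    return request
```

With this dispatch, the new `"disruption_warning"` default in `ShotSettings` picks the EFIT timebase on C-Mod but the disruption timebase on DIII-D.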
4 changes: 2 additions & 2 deletions disruption_py/settings/shot_settings.py
@@ -53,7 +53,7 @@ class or in an included shot_data_request. All methods with at least one include
set_times_request : SetTimesRequest
The set times request to be used when setting the timebase for the shot. The retrieved data will
be interpolated to this timebase. Can pass any SetTimesRequestType that resolves to a SetTimesRequest.
See SetTimesRequest for more details. Defaults to "efit".
See SetTimesRequest for more details. Defaults to "disruption_warning".
signal_domain : SignalDomain
The domain of the timebase that should be used when retrieving data for the shot. Either "full",
"flattop", or "rampup_and_flattop". Can pass either a SignalDomain or the associated string. Defaults
@@ -91,7 +91,7 @@ class or in an included shot_data_request. All methods with at least one include
shot_data_requests : List[ShotDataRequest] = field(default_factory=list)

# Timebase setting
set_times_request : SetTimesRequest = "efit"
set_times_request : SetTimesRequest = "disruption_warning"
signal_domain : SignalDomain = "full"
use_existing_data_timebase : bool = False
interpolation_method : InterpolationMethod = "linear"
@@ -6,7 +6,7 @@
from disruption_py.settings.shot_data_request import ShotDataRequest, ShotDataRequestParams
from disruption_py.utils.mappings.tokamak import Tokamak
from disruption_py.utils.math_utils import gaussian_fit, interp1, smooth
from disruption_py.utils.utils import without_duplicates
from disruption_py.utils.utils import safe_cast, without_duplicates
from disruption_py.shots.helpers.method_caching import cached_method, parameter_cached_method
try:
from MDSplus import mdsExceptions
@@ -1434,7 +1434,7 @@ def efit_rz2psi(params : ShotDataRequestParams, r, z, t, tree='analysis'):
r = r.flatten()
z = z.flatten()
psi = np.full((len(r), len(t)), np.nan)
z = z.astype('float32') # TODO: Ask if this change is necessary
z = safe_cast(z, 'float32') # TODO: Ask if this change is necessary
psirz, rgrid, zgrid, times = params.mds_conn.get_data_with_dims(r'\efit_geqdsk:psirz', tree_name=tree, dim_nums=[0, 1, 2])
rgrid, zgrid = np.meshgrid(rgrid, zgrid) #, indexing='ij')

@@ -988,8 +988,8 @@ def _get_ne_te(params : ShotDataRequestParams, data_source="blessed", ts_systems
# Place NaNs for broken channels
lasers[laser]['te'][lasers[laser]['te'] == 0] = np.nan
lasers[laser]['ne'][np.where(lasers[laser]['ne'] == 0)] = np.nan
params.logger.debug("_get_ne_te: Core bins", lasers['core']['te'].shape)
params.logger.debug("_get_ne_te: Tangential bins", lasers['tangential']['te'].shape)
params.logger.debug("_get_ne_te: Core bins {}".format(lasers['core']['te'].shape))
params.logger.debug("_get_ne_te: Tangential bins {}".format(lasers['tangential']['te'].shape))
# If both systems/lasers available, combine them and interpolate the data
# from the tangential system onto the finer (core) timebase
if 'tangential' in lasers and lasers['tangential'] is not None:
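The two-line logging fix above matters because `logger.debug(msg, arg)` treats `arg` as a %-format value; with no placeholder in the message, formatting the record fails at emit time. Pre-formatting, as the diff does, or lazy %-style placeholders both work:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("_get_ne_te")
shape = (40, 128)  # stand-in for lasers['core']['te'].shape

# Fixed form from the diff: the message is fully formatted up front.
logger.debug("_get_ne_te: Core bins {}".format(shape))
# Equivalent lazy alternative: logging defers the %-substitution until
# the record is actually emitted.
logger.debug("_get_ne_te: Core bins %s", shape)
```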