Improve readUCI & readWDM for a broader range of valid files #40
Comments
Copying from LimnoTech#9 (comment) for our records. From May 19 "HSP2 test files from LimnoTech" email from @steveskrip to @rheaphy, and my response, for our records:
My emailed response:
Bob merged PR #34 into their develop on May 22. See the PR conversation for some additional details. In Bob's June 2 "HSP2 Status" email, he writes:
From June 2-13, Bob made three commits that refactored @steveskrip, let's confirm that these fixes work for us. I merged all these updates into https://github.com/LimnoTech/HSPsquared. |
@rheaphy, it looks like @steveskrip discovered some additional issues when trying to run the standard HSPF tests that @PaulDudaRESPEC suggested in his comment to respec#31: Expand & automate testing system! @steveskrip provides detailed information in LimnoTech#16. You'll see that the issue also includes problems with |
Added GLWA HSPF WQ Model Files 1 UCI, 4 WDM Issues with HSP2tools.readUCI and HSP2tools.readWDM #16 Use Jupyter notebook. respec#40 @aufdenkampe @steveskrip
I'm getting an error when reading in this UCI file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.UCI
Here's a copy of the error message:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-7-e0f78d821958> in <module>
----> 1 HSP2tools.readUCI(uciname, HDFname)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in readUCI(uciname, hdfname)
122 if line[0:3] == 'EXT': ext(info, getlines(f))
123 if line[0:6] == 'PERLND': operation(info, getlines(f),'PERLND')
--> 124 if line[0:6] == 'IMPLND': operation(info, getlines(f),'IMPLND')
125 if line[0:6] == 'RCHRES': operation(info, getlines(f),'RCHRES')
126
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in operation(info, llines, op)
374 history[dpath[op,table],dcat[op,table]].append((table,df))
375
--> 376 (_,df) = history['GENERAL','INFO'][0]
377 valid = set(df.index)
378 for path,cat in history:
IndexError: list index out of range
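The traceback above fails at `history['GENERAL','INFO'][0]`, which raises a bare `IndexError` when nothing was collected under that key for the operation. A minimal sketch of a more informative guard follows. It assumes `history` behaves like a `defaultdict(list)` keyed by `(path, category)`, which is inferred from the `.append` call in the traceback; `first_table` is a hypothetical helper and not part of HSP2tools.

```python
from collections import defaultdict


def first_table(history, path, cat, op):
    """Return the first (table, df) pair stored under (path, cat).

    Raises a descriptive ValueError instead of a bare IndexError when the
    UCI block for operation `op` never supplied that table.
    """
    entries = history[path, cat]
    if not entries:
        raise ValueError(
            f"{op}: no table found for {path}/{cat}; "
            "the UCI block may be missing it or use an unrecognized name"
        )
    return entries[0]


# Hypothetical stand-in for the parser state: nothing was appended for
# ('GENERAL', 'INFO'), which is the situation the IndexError points to.
history = defaultdict(list)
history['GENERAL', 'ACTIVITY'].append(('ACTIVITY', {'P301': 1}))

try:
    first_table(history, 'GENERAL', 'INFO', 'IMPLND')
except ValueError as err:
    message = str(err)
    print(message)
```

A guard like this does not fix the parse, but it would name the operation and the missing table, which makes reports like the one above easier to act on.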
I'm getting errors when reading in this WDM file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/KDTWMet-06272019-KOS_w_Mon17Filled_CHLA_ComDO.wdm
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 134
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 135
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 136
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 137
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 138
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 139
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 147
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
PROGRAM ERROR: ATTRIBUTE INDEX not found 286 Attribute pointer 140
PROGRAM ERROR: ATTRIBUTE INDEX not found 287 Attribute pointer 141
PROGRAM ERROR: ATTRIBUTE INDEX not found 13 Attribute pointer 142
PROGRAM ERROR: ATTRIBUTE INDEX not found 12 Attribute pointer 143
PROGRAM ERROR: ATTRIBUTE INDEX not found 14 Attribute pointer 144
PROGRAM ERROR: ATTRIBUTE INDEX not found 15 Attribute pointer 145
PROGRAM ERROR: ATTRIBUTE INDEX not found 16 Attribute pointer 146
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _list_to_arrays(data, columns, coerce_float, dtype)
563 try:
--> 564 columns = _validate_or_indexify_columns(content, columns)
565 result = _convert_object_array(content, dtype=dtype, coerce_float=coerce_float)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _validate_or_indexify_columns(content, columns)
688 raise AssertionError(
--> 689 f"{len(columns)} columns passed, passed data had "
690 f"{len(content)} columns"
AssertionError: 9 columns passed, passed data had 10 columns
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
<ipython-input-11-a67c96be9d33> in <module>
----> 1 HSP2tools.readWDM('KDTWMet-06272019-KOS_w_Mon17Filled_CHLA_ComDO.wdm', HDFname)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readWDM.py in readWDM(wdmfile, hdffile)
118
119
--> 120 dfsummary = pd.DataFrame(summary, index=summaryindx, columns=columns)
121 store.put('TIMESERIES/SUMMARY',dfsummary, format='t', data_columns=True)
122 return dfsummary
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
507 if is_named_tuple(data[0]) and columns is None:
508 columns = data[0]._fields
--> 509 arrays, columns = to_arrays(data, columns, dtype=dtype)
510 columns = ensure_index(columns)
511
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in to_arrays(data, columns, coerce_float, dtype)
522 return [], [] # columns if columns is not None else []
523 if isinstance(data[0], (list, tuple)):
--> 524 return _list_to_arrays(data, columns, coerce_float=coerce_float, dtype=dtype)
525 elif isinstance(data[0], abc.Mapping):
526 return _list_of_dict_to_arrays(
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\construction.py in _list_to_arrays(data, columns, coerce_float, dtype)
565 result = _convert_object_array(content, dtype=dtype, coerce_float=coerce_float)
566 except AssertionError as e:
--> 567 raise ValueError(e) from e
568 return result, columns
569
ValueError: 9 columns passed, passed data had 10 columns
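The final `ValueError` says the summary rows carry 10 fields while only 9 column names were passed to the DataFrame constructor. As a rough diagnostic sketch (plain Python, with placeholder column names and rows; `check_row_widths` is a hypothetical helper, not something in readWDM.py), the mismatched rows could be located before the DataFrame is built:

```python
def check_row_widths(rows, columns):
    """Return (row_index, width) for every row whose length differs from len(columns)."""
    expected = len(columns)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Placeholder names and data, for illustration only.
columns = [f'col{i}' for i in range(9)]
rows = [
    tuple(range(9)),   # matches the 9 column names
    tuple(range(10)),  # one extra field, as in the reported failure
]

mismatches = check_row_widths(rows, columns)
for index, width in mismatches:
    print(f"row {index}: {width} fields, expected {len(columns)}")
```

Knowing which data sets produce the extra field would help tell whether the column list or the attribute handling needs to change.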
@aufdenkampe @steveskrip @bcous |
@PaulDudaRESPEC, thank you! @steveskrip & @bcous, I merged all this into LimnoTech's Unfortunately, I had a merge conflict when I tried to cherry-pick the individual commit into our @PaulDudaRESPEC, since we all just decided to focus on Water Quality modules, I'm wondering if it's time we merge all WaterQuality into |
@aufdenkampe , I'm on board with having only one development branch during this current effort. |
I have tested this with the same files as yesterday, and it appears to be working better. readUCI completed with no problem on the file I linked yesterday. There may still be an issue with the readWDM. It reads in 3 of the files correctly, but appears to hang up on this WDM: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/RPO_SWMM48LINKS2017_wCBOD_June2019.wdm When running in Jupyter notebooks, it never completes. It appears to add timeseries to the .h5 file (file is larger after it starts running), but it never updates the summary table. Let me know if you want to see the .h5 files and I can find a way to transfer them to you. |
Thanks @bcous for testing again! It's good to know that the Tomorrow morning, I'll work with @PaulDudaRESPEC to merge water quality into |
Hi @PaulDudaRESPEC -- I was chatting with @aufdenkampe about this issue. He suggested that it might be related to the 15-minute data in the WDM file. I checked and in at least 2 of the other WDMs that were read in there were timeseries with 15-minute flow data included as well. Let me know if you want to chat about specifics further. |
Thanks, @bcous , that's good to know. I've asked Jack, the WDM guru, to take a look. |
Circling back to this one... Jack took a look and noted that at least one of the problematic data sets, DSN 772, appears to have been compiled at various time steps -- daily, 15min, and annual, all in the same timeseries. Looks like the old WDM Fortran code knows how to deal with that, but not the Python code. Until we have a fix, I suggest a work-around might be to build the data set from scratch at a 15min time step throughout.
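To make the suggested work-around concrete, here is a small sketch of expanding blocks recorded at different time steps onto a uniform 15-minute step. It is only an illustration under stated assumptions: the `(start, step, values)` block layout and the helper `expand_block` are hypothetical, the sample values are made up, and this is not how the WDM file or readWDM represents data.

```python
from datetime import datetime, timedelta

TARGET_STEP = timedelta(minutes=15)


def expand_block(start, step, values, target=TARGET_STEP):
    """Repeat each value of a coarser block so it covers its interval at `target` resolution."""
    if step % target != timedelta(0):
        raise ValueError(f"step {step} is not a multiple of {target}")
    repeats = step // target
    out = []
    t = start
    for value in values:
        for _ in range(repeats):
            out.append((t, value))
            t += target
    return out


# Hypothetical blocks: one daily value followed by four 15-minute values.
blocks = [
    (datetime(2017, 5, 1), timedelta(days=1), [2.0]),
    (datetime(2017, 5, 2), timedelta(minutes=15), [1.0, 1.5, 1.25, 1.0]),
]

series = []
for start, step, values in blocks:
    series.extend(expand_block(start, step, values))

print(len(series), series[0], series[-1])
```

Repeating a value suits rates or states; accumulated totals would need to be divided across the sub-steps instead. Annual blocks are not covered here because a year is not a fixed `timedelta` and would need calendar-aware handling.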
@PaulDudaRESPEC, that is very helpful to know. Thank you! |
@PaulDudaRESPEC, any updates on whether you or Jack might be able to fix |
Jack is looking at it. I think he's on the trail, but we haven't solved it yet. My thought about rebuilding the files from scratch is that you could list the problematic timeseries in something like the SARA Timeseries Utility, save the list to a text file, and then re-import the data from the text file. But I'm not sure if you'd lose anything critical in the process.
@aufdenkampe and @bcous |
@PaulDudaRESPEC, thanks for the update, and thanks to you and @jlkittle for your first round of fixes with dddd759 and f190fd8! It's really interesting to hear that it's connected to different compression routines in the Fortran WDM code. We noticed with @bcous's project that those WDM files created massively bigger HDF5 files. I've been thinking that we might be able to do better with the HDF5 compression. In fact, the last work by @rheaphy included exploring better HSP2 performance by using BLOSC compression with the HDF5 files, as he described here: #36 (comment). It might be useful to pick up where he left off.
I was trying to use readUCI on this file: https://github.com/LimnoTech/HSPsquared/blob/develop-WaterQuality-BC/tests/GLWACSO/model_files/GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.UCI
The following error messages came up when I tried to run it.
Thanks, Brendan
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-b83acb602a13> in <module>
----> 1 get_ipython().run_line_magic('timeit', 'HSP2tools.readUCI(uciname, HDFname)')
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2325 kwargs['local_ns'] = self.get_local_scope(stack_depth)
2326 with self.builtin_trap:
-> 2327 result = fn(*args, **kwargs)
2328 return result
2329
<decorator-gen-54> in timeit(self, line, cell, local_ns)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magics\execution.py in timeit(self, line, cell, local_ns)
1167 for index in range(0, 10):
1168 number = 10 ** index
-> 1169 time_number = timer.timeit(number)
1170 if time_number >= 0.2:
1171 break
~\AppData\Local\Continuum\anaconda3\lib\site-packages\IPython\core\magics\execution.py in timeit(self, number)
167 gc.disable()
168 try:
--> 169 timing = self.inner(it, self.timer)
170 finally:
171 if gcold:
<magic-timeit> in inner(_it, _timer)
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in readUCI(uciname, hdfname)
143 if line[0:6] == 'PERLND': operation(info, getlines(f),'PERLND')
144 if line[0:6] == 'IMPLND': operation(info, getlines(f),'IMPLND')
--> 145 if line[0:6] == 'RCHRES': operation(info, getlines(f),'RCHRES')
146
147 colnames = ('AFACTR', 'MFACTOR', 'MLNO', 'SGRPN', 'SMEMN', 'SMEMSB',
~\Documents\GitHub\limno_HSPsquared\HSP2tools\readUCI.py in operation(info, llines, op)
566 df = concat([temp[1] for temp in history[path, cat]], axis='columns')
567 df = fix_df(df, op, path, ddfaults, valid)
--> 568 df.to_hdf(store, f'{op}/{path}/{cat}{count}', data_columns=True)
569 else:
570 print('UCI TABLE is not understood (yet) by readUCI', op, cat)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in to_hdf(self, path_or_buf, key, mode, complevel, complib, append, format, index, min_itemsize, nan_rep, dropna, data_columns, errors, encoding)
2447 data_columns=data_columns,
2448 errors=errors,
-> 2449 encoding=encoding,
2450 )
2451
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in to_hdf(path_or_buf, key, value, mode, complevel, complib, append, format, index, min_itemsize, nan_rep, dropna, data_columns, errors, encoding)
268 path_or_buf, mode=mode, complevel=complevel, complib=complib
269 ) as store:
--> 270 f(store)
271 else:
272 f(path_or_buf)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in <lambda>(store)
260 data_columns=data_columns,
261 errors=errors,
--> 262 encoding=encoding,
263 )
264
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in put(self, key, value, format, index, append, complib, complevel, min_itemsize, nan_rep, data_columns, encoding, errors, track_times)
1127 encoding=encoding,
1128 errors=errors,
-> 1129 track_times=track_times,
1130 )
1131
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _write_to_group(self, key, value, format, axes, index, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, nan_rep, data_columns, encoding, errors, track_times)
1799 nan_rep=nan_rep,
1800 data_columns=data_columns,
-> 1801 track_times=track_times,
1802 )
1803
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in write(self, obj, axes, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, nan_rep, data_columns, track_times)
4236 min_itemsize=min_itemsize,
4237 nan_rep=nan_rep,
-> 4238 data_columns=data_columns,
4239 )
4240
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _create_axes(self, axes, obj, validate, nan_rep, data_columns, min_itemsize)
3863
3864 blocks, blk_items = self._get_blocks_and_items(
-> 3865 block_obj, table_exists, new_non_index_axes, self.values_axes, data_columns
3866 )
3867
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\io\pytables.py in _get_blocks_and_items(block_obj, table_exists, new_non_index_axes, values_axes, data_columns)
3986 blk_items = get_blk_items(mgr, blocks)
3987 for c in data_columns:
-> 3988 mgr = block_obj.reindex([c], axis=axis)._mgr
3989 blocks.extend(mgr.blocks)
3990 blk_items.extend(get_blk_items(mgr, mgr.blocks))
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
307 @wraps(func)
308 def wrapper(*args, **kwargs) -> Callable[..., Any]:
--> 309 return func(*args, **kwargs)
310
311 kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in reindex(self, *args, **kwargs)
4030 kwargs.pop("axis", None)
4031 kwargs.pop("labels", None)
-> 4032 return super().reindex(**kwargs)
4033
4034 def drop(
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in reindex(self, *args, **kwargs)
4460 # perform the reindex on the axes
4461 return self._reindex_axes(
-> 4462 axes, level, limit, tolerance, method, fill_value, copy
4463 ).__finalize__(self, method="reindex")
4464
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
3871 if columns is not None:
3872 frame = frame._reindex_columns(
-> 3873 columns, method, copy, level, fill_value, limit, tolerance
3874 )
3875
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _reindex_columns(self, new_columns, method, copy, level, fill_value, limit, tolerance)
3919 copy=copy,
3920 fill_value=fill_value,
-> 3921 allow_dups=False,
3922 )
3923
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4528 fill_value=fill_value,
4529 allow_dups=allow_dups,
-> 4530 copy=copy,
4531 )
4532 # If we've made a copy once, no need to make another one
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy, consolidate)
1274 # some axes don't allow reindexing with dups
1275 if not allow_dups:
-> 1276 self.axes[axis]._can_reindex(indexer)
1277
1278 if axis >= self.ndim:
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in _can_reindex(self, indexer)
3287 # trying to reindex on an axis with duplicates
3288 if not self.is_unique and len(indexer):
-> 3289 raise ValueError("cannot reindex from a duplicate axis")
3290
3291 def reindex(self, target, method=None, level=None, limit=None, tolerance=None):
ValueError: cannot reindex from a duplicate axis
@bcous, I just posted a fix for UCIs with multiple GQUALs -- it fixes the problem you reported yesterday.
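For context on the "cannot reindex from a duplicate axis" error: pandas raises it when the axis being reindexed contains repeated labels, which is one way side-by-side tables for several GQUALs could end up if they share column names. The sketch below shows one generic way to make repeated labels unique. It is an assumption-laden illustration, not the fix that was posted; `make_unique` and the sample labels are hypothetical.

```python
from collections import Counter


def make_unique(names):
    """Append a running suffix to repeated names so every label is unique."""
    totals = Counter(names)
    seen = Counter()
    unique = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            unique.append(f"{name}{seen[name]}")
        else:
            unique.append(name)
    return unique


# Hypothetical column labels from two GQUAL blocks concatenated side by side.
columns = ['GQID', 'DQAL', 'GQID', 'DQAL', 'CONCID']
unique_columns = make_unique(columns)
print(unique_columns)
```

This simple scheme does not check whether a suffixed label collides with an existing name, so a real implementation would need to handle that case.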
Doing additional testing, I ran into an error when running HSP2.main. The output is listed below:
2021-03-08 11:16:00.66 Processing started for file GLWA_HSPF_June2019_Mon8MileDataFilled_WT_RW_v4.h5; saveall=True
2021-03-08 11:16:02.67 Simulation Start: 2017-05-01 00:00:00, Stop: 2017-11-01 00:00:00
2021-03-08 11:16:02.67 PERLND P301 DELT(minutes): 15
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-13-63be5facfe88> in <module>
----> 1 HSP2.main(hdfname,saveall=True)
~\Documents\GitHub\limno_HSPsquared\HSP2\main.py in main(hdfname, saveall, jupyterlab)
49
50 # now conditionally execute all activity modules for the op, segment
---> 51 ts = get_timeseries(store,ddext_sources[(operation,segment)],siminfo)
52 flags = uci[(operation, 'GENERAL', segment)]['ACTIVITY']
53 if operation == 'RCHRES':
~\Documents\GitHub\limno_HSPsquared\HSP2\main.py in get_timeseries(store, ext_sourcesdd, siminfo)
204 if row.MFACTOR != 1.0:
205 temp1 *= row.MFACTOR
--> 206 t = transform(temp1, row.TMEMN, row.TRAN, siminfo)
207
208 tname = f'{row.TMEMN}{row.TMEMSB}'
~\Documents\GitHub\limno_HSPsquared\HSP2\utilities.py in transform(ts, name, how, siminfo)
78 pass
79 elif tsfreq == None: # Sparse time base, frequency not defined
---> 80 ts = ts.reindex(siminfo['tbase']).ffill().bfill()
81 elif how == 'SAME':
82 ts = ts.resample(freq).ffill() # tsfreq >= freq assumed, or bad user choice
KeyError: 'tbase'
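The failing line reindexes a sparse series onto `siminfo['tbase']` and then forward- and back-fills it; the `KeyError` means `siminfo` had no `'tbase'` entry at that point. To illustrate what that fill step is meant to produce, here is a plain-Python sketch of the same reindex, forward-fill, back-fill sequence. The helper `fill_onto_base` and the sample data are hypothetical, and this is not HSP2's implementation.

```python
from datetime import datetime, timedelta


def fill_onto_base(sparse, tbase):
    """Align sparse (time, value) pairs to tbase, then forward-fill and back-fill gaps.

    Only observations that fall exactly on a tbase timestamp are kept,
    mirroring a reindex followed by ffill and bfill.
    """
    lookup = dict(sparse)
    aligned = [lookup.get(t) for t in tbase]  # None where there is no observation

    last = None
    for i, value in enumerate(aligned):  # forward fill
        if value is None:
            aligned[i] = last
        else:
            last = value

    following = None
    for i in range(len(aligned) - 1, -1, -1):  # back-fill any leading gap
        if aligned[i] is None:
            aligned[i] = following
        else:
            following = aligned[i]
    return aligned


# Hypothetical 15-minute simulation time base with two sparse observations.
start = datetime(2017, 5, 1)
tbase = [start + i * timedelta(minutes=15) for i in range(5)]
sparse = [(tbase[1], 3.0), (tbase[3], 5.0)]

filled = fill_onto_base(sparse, tbase)
print(filled)
```

Observations that do not land on a time-base timestamp are dropped by this approach, which is worth keeping in mind for irregular input.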
@PaulDudaRESPEC and @bcous, the The issue has since been fixed, but we found yet another issue in that branch that we are presently working on fixing. |
With the recent successful Rewrite readWDM.py to read by data group & block #21, we can properly read all WDM files that we've tested, including those with irregular time series. All other readUCI issues have been addressed, to our knowledge. Getting HSP2 to Handle irregular time series input #51 is a separate issue. Closing this issue as we will merge PR #35 (Merge develop_readWDM into develop to read time series by block & group #35) as soon as we resolve a merge conflict.
Fetch upstream commits into `wq-updates-tmr` to squash merge conflicts
Hi,
The reason I didn't implement compression originally was that HDFView
and other third party tools required "registration" of compression
algorithms which was so poorly documented that I thought this would be hard
for most hydrologists. I expected that the improvements to HDFView would
make this either easy or automatic. I didn't want people frustrated that
they couldn't view their HDF5 files with standard tools. I have been
tracking the HDF tools created for JupyterLab but their progress has been
slow. Compression is easy using Pandas/pytables.
Bob
(Sent by email in reply to Anthony Aufdenkampe's comment of Fri, Jan 22, 2021 at 2:11 PM about the dddd759 and f190fd8 fixes and BLOSC compression of the HDF5 files.)
This spring @steveskrip noticed that many UCI files successfully used by LimnoTech with HSPF (and created by LimnoTech's WinModel package) would not import with readUCI. @rheaphy also noted that there might be time issues in UCI files, because HSPF doesn't really manage time correctly and, for HSP2, we're using ISO time standards that track leap seconds and time zones.
Let's use this issue thread to track @rheaphy's work to improve readUCI, and our results with testing it.