Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix read and write of NaN to hdf5 #1180

Merged
merged 9 commits into from
Jul 9, 2018

Conversation

peendebak
Copy link
Contributor

Fixes #1175

This solves writing datasets containing NaN elements to hdf5 format.

@dpfranke Can you check the PR and clean up where needed?

@jenshnielsen @WilliamHPNielsen

Copy link
Contributor

@astafan8 astafan8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this bug, but the implementation itself needs some improvements.

@@ -79,6 +79,7 @@ def read(self, data_set, location=None):
data_set._h5_base_group['Data Arrays'].keys()):
# Decoding string is needed because of h5py/issues/379
name = array_id # will be overwritten if not in file
print('read array_id %s' % array_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this print statement needed for?

@@ -99,6 +100,9 @@ def read(self, data_set, location=None):
# set_arrays = ()
vals = dat_arr.value[:, 0]
if 'shape' in dat_arr.attrs.keys():
# extend with NaN if needed
esize = np.prod(dat_arr.attrs['shape'])
vals=np.append(vals, [np.nan]*(esize-vals.size))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add spaces around = for readability (around * and - as well?)

# get latest NaN element
new_dlen=(~np.isnan(x)).nonzero()[0][-1]+1
except:
new_dlen=old_dlen
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add spaces around = signs

new_dlen = len(x[~np.isnan(x)])
try:
# get latest NaN element
new_dlen=(~np.isnan(x)).nonzero()[0][-1]+1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you "unwrap" this line of code so that it is clear what is going on? actually, what is going on? :) also, is it possible to avoid the use of try-except (if you are not "excepting" a very specific exception, this pattern can easily hide possible bugs in the code inside the try block)?

@@ -99,6 +100,9 @@ def read(self, data_set, location=None):
# set_arrays = ()
vals = dat_arr.value[:, 0]
if 'shape' in dat_arr.attrs.keys():
# extend with NaN if needed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test for this.

@@ -207,7 +211,12 @@ def write(self, data_set, io_manager=None, location=None,
datasetshape = dset.shape
old_dlen = datasetshape[0]
x = data_set.arrays[array_id]
new_dlen = len(x[~np.isnan(x)])
try:
# get latest NaN element
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add a test for this.

@peendebak
Copy link
Contributor Author

@astafan8 Fixed the issues. I will not write tests (formatter is for the 'old' dataset)

Copy link
Contributor

@astafan8 astafan8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peendebak ok, I understand. But could you at least please:

@codecov
Copy link

codecov bot commented Jul 8, 2018

Codecov Report

Merging #1180 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1180      +/-   ##
==========================================
+ Coverage   79.66%   79.67%   +0.01%     
==========================================
  Files          47       47              
  Lines        6662     6667       +5     
==========================================
+ Hits         5307     5312       +5     
  Misses       1355     1355

@peendebak
Copy link
Contributor Author

@astafan8 Fixed the tests.

@astafan8 astafan8 merged commit fb574ef into microsoft:master Jul 9, 2018
giulioungaretti pushed a commit that referenced this pull request Jul 9, 2018
Merge: 1291672 a46fed2
Author: Mikhail Astafev <[email protected]>

    Merge pull request #1180 from VandersypenQutech/fix/hdf5nan
@eendebakpt eendebakpt deleted the fix/hdf5nan branch January 11, 2019 11:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants