updating data.py for simplicity and z-mass and running 3particle update #2

Merged: 9 commits merged into master on Jul 28, 2020

Conversation

@ggantos (Collaborator) commented Jul 10, 2020

No description provided.

@djgagne (Collaborator) left a comment:

Thank you for your patience with the review process. I will try to get it turned around faster next time. I do have a few minor code comments and adjustments. Otherwise it looks great. I really appreciate you adding more docstrings to the code.

@@ -2,3 +2,5 @@
*.DS_Store
**/.ipynb_checkpoints
*.out
batch**.sh
Collaborator:
What is the reason for the batch**.sh scripts to be ignored by git? Are they autogenerated by another program?

Collaborator Author:
Not autogenerated, but since they're specific to our Casper environment, I figured we don't necessarily want them in the git repo. What do you think?

model_name: "cnn"
num_particles: 1
random_seed: 328942
output_cols: ["x", "y", "z", "d"]
input_scaler: "MinMaxScaler"
scaler_out: "MinMaxScaler"
Collaborator:
Does scaler_out handle the scaling of both the inputs and outputs or just the outputs?

Collaborator:
Based on further review of the code, it looks like it handles the outputs, which is fine.

Collaborator Author:
Sorry, yes, just the outputs. I renamed scaler_vals to scaler_in to mimic the naming convention in my latest commit.
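For reference, a minimal sketch (not the code in this PR) of how separate input and output scalers could be wired up from the input_scaler and scaler_out config keys shown above; the scale_data helper name is hypothetical:

```python
from sklearn.preprocessing import MinMaxScaler

# Map config strings to scaler classes
SCALERS = {"MinMaxScaler": MinMaxScaler}

def scale_data(config, train_inputs, train_outputs):
    """Fit one scaler on the inputs and a separate one on the outputs."""
    scaler_in = SCALERS[config["input_scaler"]]()   # scales inputs only
    scaler_out = SCALERS[config["scaler_out"]]()    # scales outputs only
    x_scaled = scaler_in.fit_transform(train_inputs)
    y_scaled = scaler_out.fit_transform(train_outputs)
    return x_scaled, y_scaled, scaler_in, scaler_out
```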

self.model.compile(optimizer=self.optimizer, loss=self.loss)
self.model.summary()

def fit(self, x, y):
def fit(self, x, y, xv, yv):
Collaborator:
Can you change xv and yv to keyword arguments with the default equal to None (xv=None, yv=None) in case someone wants to fit data without using a validation set?

Collaborator Author:
done
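For reference, a minimal sketch of the requested signature, assuming the class wraps a Keras model and has batch_size, epochs, and verbose attributes (those attribute names are assumptions, not taken from this PR):

```python
def fit(self, x, y, xv=None, yv=None):
    # Validation data is optional, so fit() still works without a validation set.
    validation_data = (xv, yv) if xv is not None and yv is not None else None
    return self.model.fit(x, y,
                          batch_size=self.batch_size,
                          epochs=self.epochs,
                          validation_data=validation_data,
                          verbose=self.verbose)
```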

print("Saving results and config file..")
mod.model.save(join(path_save, config["model_name"]+".h5"))
name_save = open(join(path_save, "train_outputs_pred.pkl"), 'wb')
pickle.dump(train_outputs_pred, name_save)
Collaborator:
For the prediction and error metric files, I would prefer they be saved in csv or parquet format instead of pickle. Pickle is not completely portable between different Python environments since it relies on environment code and libraries being the same to read the file.

Collaborator Author:
I didn't realize that. I'll revert back.
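A hedged sketch of what the pickle dump could become, assuming train_outputs_pred is a 2-D array whose columns line up with config["output_cols"] (that assumption is not checked here):

```python
import pandas as pd
from os.path import join

# Write predictions as CSV (or parquet) instead of pickle so they can be
# read from any Python environment without matching library versions.
pred_df = pd.DataFrame(train_outputs_pred, columns=config["output_cols"])
pred_df.to_csv(join(path_save, "train_outputs_pred.csv"), index=False)
# pred_df.to_parquet(join(path_save, "train_outputs_pred.parquet"))  # alternative
```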

print("Saving results and config file..")
mod.model.save(join(path_save, config["model_name"]+".h5"))
name_save = open(join(path_save, "train_outputs_pred.pkl"), 'wb')
pickle.dump(train_outputs_pred, name_save)
Collaborator:
Same issue with pickle files here.

Collaborator Author:
So the reason I pickled was to save a dict. I've fixed it in a subsequent commit. Thanks!
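A dict of scalar error metrics can also be written without pickle; a minimal sketch, with hypothetical metric names and placeholder values:

```python
import pandas as pd
from os.path import join

error_metrics = {"mae_x": 0.12, "mae_y": 0.10, "mae_z": 0.15}  # placeholder values
# A flat dict of scalars converts cleanly to a one-row CSV.
pd.DataFrame([error_metrics]).to_csv(join(path_save, "error_metrics.csv"), index=False)
# Reading it back needs no unpickling:
# metrics = pd.read_csv(join(path_save, "error_metrics.csv"))
```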

@ggantos (Collaborator Author) commented Jul 24, 2020

I also added a few other things while working on this between opening the PR and now. Sorry, I hope that doesn't make things too complicated.

@djgagne (Collaborator) left a comment:

The new changes look good to me. Merging now.

@djgagne merged commit 0d9e150 into master on Jul 28, 2020