-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dp sf pap #544
Dp sf pap #544
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daniellepace This is tremendous! thank you for the excellent additions! All my comments are small aesthetic things mostly around documentation.
ml4h/arguments.py
Outdated
@@ -269,6 +272,11 @@ def parse_args(): | |||
help='If true saves the model weights from the last training epoch, otherwise the model with best validation loss is saved.', | |||
) | |||
|
|||
# 2D image data augmentation parameters | |||
parser.add_argument('--rotation_factor', default=0., type=float, help='a float represented as fraction of 2 Pi, e.g., rotation_factor = 0.014 results in an output rotated by a random amount in the range [-5 degrees, 5 degrees]') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add that these are data augmentation specific to the help strings, eg "For data augmentation, a float....'
ml4h/arguments.py
Outdated
@@ -376,6 +384,16 @@ def parse_args(): | |||
default='3M', | |||
) | |||
|
|||
# Arguments for explorations/infer_medians | |||
parser.add_argument('--analyze_ground_truth', default=True, help='Whether or not to filter by images with ground truth segmentations, for comparison') | |||
parser.add_argument('--dates_file', help='File containing dates for each sample_id') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we re-use the more general existing arg app_csv for this? Just trying to mitigate argument explosion.
ml4h/arguments.py
Outdated
parser.add_argument('--intensity_thresh', type=float, help='Threshold value for preprocessing') | ||
parser.add_argument('--intensity_thresh_in_structures', nargs='*', default=[], help='Structure names whose pixels should be replaced if the images has intensity above the threshold') | ||
parser.add_argument('--intensity_thresh_out_structure', help='Replacement structure name') | ||
parser.add_argument('--results_to_plot', nargs='*', default=[], help='Structure names to make scatter plots') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we re-use structures_to_analyze for this?
ml4h/explorations.py
Outdated
return y | ||
|
||
def infer_medians(args): | ||
assert(args.batch_size == 1) # no support here for iterating over larger batches |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this is "public" add a Docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe rename to make it a little more specific maybe infer_medians_from_segmented_regions or something...
ml4h/explorations.py
Outdated
|
||
def infer_medians(args): | ||
assert(args.batch_size == 1) # no support here for iterating over larger batches | ||
assert(len(args.tensor_maps_in) == 1) # no support here for multiple input maps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can bake the comment in to the assert with assert(len(args.tensor_maps_in) == 1, 'no support here for multiple input maps')
and then it is a bit easier to debug.
@@ -666,13 +666,6 @@ def get_train_valid_test_paths( | |||
|
|||
logging.info(f'Found {len(train_paths)} train, {len(valid_paths)} validation, and {len(test_paths)} testing tensors at: {tensors}') | |||
logging.debug(f'Discarded {len(discard_paths)} tensors due to given ratios') | |||
if len(train_paths) == 0 or len(valid_paths) == 0 or len(test_paths) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just replace the or
with and
if there is no data we still want fail fast right?
@@ -732,6 +725,51 @@ def get_train_valid_test_paths_split_by_csvs( | |||
logging.info(f"CSV:{balance_csvs[i-1]}\nhas: {len(train_paths[i])} train, {len(valid_paths[i])} valid, {len(test_paths[i])} test tensors.") | |||
return train_paths, valid_paths, test_paths | |||
|
|||
# https://stackoverflow.com/questions/65475057/keras-data-augmentation-pipeline-for-image-segmentation-dataset-image-and-mask | |||
def augment_using_layers(images, mask, in_shapes, out_shapes, rotation_factor, zoom_factor, translation_factor): | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add Docstring and typehints
ml4h/tensor_generators.py
Outdated
if wrap_with_tf_dataset: | ||
|
||
do_augmentation = bool(rotation_factor or zoom_factor or translation_factor) | ||
logging.info(f'doing_augmentation {do_augmentation}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's already in the log at the top with all of the arguments so this not needed, but if you want to keep the info statement here I would add the actual values of the augmentation arguments.
ml4h/tensor_generators.py
Outdated
logging.info(f'doing_augmentation {do_augmentation}') | ||
|
||
if do_augmentation: | ||
assert(len(tensor_maps_in) == 1) # no support for multiple input tensors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bake comments into asserts
@@ -15,6 +15,7 @@ DOCKER_COMMAND="docker" | |||
PORT="8888" | |||
SCRIPT_NAME=$( echo $0 | sed 's#.*/##g' ) | |||
GPU_DEVICE="--gpus all" | |||
ENV="" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what you use this for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use this if I want to use a notebook that uses ml4h code that I am actively developing. I can set my PYTHONPATH to my ml4h directory so that I can edit the ml4h code and have it accessible to the notebook. I use something similar when I run docker so that I can edit the ml4h code and have it show up in the docker without having to restart it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work!
Improvements for 2D segmentation and pap project.
Includes:
Questions: