Skip to content
This repository has been archived by the owner on Dec 1, 2021. It is now read-only.

create div2k dataset class #547

Merged
merged 6 commits into from
Oct 29, 2019
Merged

create div2k dataset class #547

merged 6 commits into from
Oct 29, 2019

Conversation

suttang
Copy link
Contributor

@suttang suttang commented Oct 21, 2019

Motivation and Context

I want to implement a super-resolution network named DCSCN.
DNSCN uses the DIV2K dataset for train.

Description

  • Create a dataset class for DIV2K.
  • Create the DIV2K dataset test.

How has this been tested?

cd lmnet && PYTHONPATH=../:. pytest tests/lmnet_tests/datasets_tests/test_div2k.py

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature / Optimization (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@suttang suttang requested a review from iizukak October 21, 2019 06:54
@suttang suttang self-assigned this Oct 21, 2019
@blueoil-butler blueoil-butler bot added the CI: auto-run Run CI automatically label Oct 21, 2019
Copy link
Member

@iizukak iizukak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except few comments.


class Div2k(Base):
classes = []
num_classes = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess setting num_classes = 0 is for base class restriction.
It's not clear. Please add comment.

target_file = self.files[i]
image = load_image(target_file)

return image, None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. None label looks tricky.

@iizukak iizukak requested a review from tsawada October 25, 2019 00:48
@iizukak
Copy link
Member

iizukak commented Oct 25, 2019

@tsawada We need readability reviewer.

Copy link
Member

@iizukak iizukak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ownership Approval.

@@ -127,7 +131,6 @@ def _blueoil_to_lmnet(blueoil_config):
}
dataset = {}


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we remove a line here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My autopep8 removes unnecessary line break such as double empty line in functions.
Is this line necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line does not seem necessary but this diff is not related to the PR and thus should go into another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

absolutely!

files = sorted(dataset.files)

dataset_files = os.path.join(environment.DATA_DIR, "DIV2K/{}/*.png".format(type["dir"]))
expected = sorted(glob(dataset_files))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expected should be hard coded

def test_length(set_test_environment, subset):
dataset = Div2k(subset)

assert len(dataset.files) == len(dataset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be hard coded

def test_num_per_epoch(set_test_environment, subset):
dataset = Div2k(subset)

assert len(dataset.files) == dataset.num_per_epoch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be hardcoded. Probably we should not use parametrize.

def test_get_item(set_test_environment, subset):
dataset = Div2k(subset)

for i in range(len(dataset)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either

for image, _ in dataset:
    assert ...

or

assert all(isinstance(image, np.ndarray) for image, _ in dataset)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsawada oops, thank you!!!!

Copy link
Contributor

@tsawada tsawada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! some minor comments...

@@ -127,7 +131,6 @@ def _blueoil_to_lmnet(blueoil_config):
}
dataset = {}


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line does not seem necessary but this diff is not related to the PR and thus should go into another PR.

def test_get_item(set_test_environment):
dataset = Div2k("train")

assert all([isinstance(image, np.ndarray) for image, _ in dataset])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to use list ([]) here. Generator comprehension is better as it'll fail early.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not know that!
thank you!!



def test_train_files(set_test_environment):
dataset = Div2k("train")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to bind everything into local variables.

assert sorted(Div2k("train")) == expected is short and easy to understand.

Same for other tests.

Copy link
Contributor

@tsawada tsawada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python readability approval
🐍Thank you!

@suttang
Copy link
Contributor Author

suttang commented Oct 28, 2019

ARIGATO GOZAMASU!

@suttang suttang merged commit 335cffb into blue-oil:master Oct 29, 2019
@suttang suttang deleted the dataset/div2k branch October 29, 2019 01:19
@iizukak iizukak added this to the v0.14.0 milestone Nov 7, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CI: auto-run Run CI automatically
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants