Add MLCube support for Image Segmentation Benchmark #494

Open: wants to merge 7 commits into base: master

davidjurado (Contributor) commented Jul 2, 2021

Used PRs #465 and #491 as references.

Current implementation

We'll be updating this section as we merge MLCube PRs and make new MLCube releases.

Benchmark execution with MLCube

Project setup

# Create Python environment and install MLCube Docker runner 
virtualenv -p python3 ./env && source ./env/bin/activate && pip install mlcube-docker

# Fetch the image segmentation workload
git clone https://github.com/mlcommons/training && cd ./training
git fetch origin pull/494/head:feature/mlcube_image_segmentation && git checkout feature/mlcube_image_segmentation
cd ./image_segmentation/mlcube

Dataset

The KiTS19 dataset will be downloaded and preprocessed. Dataset size at each step:

| Dataset Step | MLCube Task | Format | Size |
| --- | --- | --- | --- |
| Download (raw dataset) | download_data | nii.gz | ~29 GB |
| Preprocess (processed dataset) | preprocess_data | npy | ~31 GB |
| Total (after all tasks) | All | | ~60 GB |
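
Since the two data tasks together need roughly 60 GB, it may be worth checking free disk space before starting the download. A minimal sketch, assuming a POSIX df and awk are available; has_free_gb is a hypothetical helper, not part of this PR or of MLCube:

```shell
# Hypothetical helper (not part of this PR).
# Succeeds if the filesystem holding directory $1 has at least $2 GB available.
has_free_gb() {
    local dir="$1" need_gb="$2" avail_kb
    # df -Pk: POSIX single-line output in 1024-byte blocks; field 4 is "Available".
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Example: warn before running download_data from the current directory.
# has_free_gb . 60 || echo "Less than ~60 GB free here" >&2
```

The check looks at the filesystem that holds the given directory, so point it at whatever location the data will actually be written to.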

Tasks execution

# Download KiTS19 dataset. Default path = mlcube/workspace/data
# To override it, use data_dir=DATA_DIR
mlcube run --task download_data

# Preprocess KiTS19 dataset
# It will use a subdirectory from the DATA_DIR path defined in the previous step
mlcube run --task preprocess_data

# Run benchmark. Default path: input_dir = mlcube/workspace/processed_data
# Parameters to override: input_dir=DATA_DIR, output_dir=OUTPUT_DIR, parameters_file=PATH_TO_TRAINING_PARAMS
mlcube run --task train
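
The three tasks above have to run in this order, so a small wrapper that stops at the first failure can save a wasted run. This is only a sketch: run_tasks and the MLCUBE_BIN variable are hypothetical names, not part of MLCube or this PR. Any extra arguments are passed through unchanged to every mlcube run call.

```shell
# Hypothetical wrapper (not part of this PR): runs the three tasks in order
# and stops at the first one that fails.
# MLCUBE_BIN is an assumed variable that lets you swap in another command,
# for example "echo" for a dry run. It defaults to "mlcube".
run_tasks() {
    for task in download_data preprocess_data train; do
        echo "==> ${task}"
        "${MLCUBE_BIN:-mlcube}" run --task "$task" "$@" || return 1
    done
}

# Example (dry run, prints the commands instead of executing them):
# MLCUBE_BIN=echo run_tasks -Pdocker.build_strategy=always
```

Setting MLCUBE_BIN=echo is a cheap way to see exactly what would be executed before committing to the ~29 GB download.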

We are targeting pull-type installation, so MLCube images should be available on Docker Hub. If they are not, try forcing a build:

mlcube run ... -Pdocker.build_strategy=always

github-actions bot commented Jul 2, 2021

MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

davidjurado (Contributor, Author) commented Jul 2, 2021

One thing I noticed: running the command mlcube describe does not update the generated files with the new instructions from the mlcube/workspace/.mlcube.yaml file.

@davidjurado davidjurado force-pushed the feature/mlcube_image_segmentation branch from 0d90b94 to 5a40603 Compare September 9, 2021 14:12
nv-rborkar (Contributor) commented

@davidjurado is the issue you observed resolved now?
