[MetaSchedule] Improve the script for TorchBench model tuning & benchmarking #13255

Merged
merged 2 commits into from
Nov 3, 2022

Conversation

yelite
Contributor

@yelite yelite commented Nov 1, 2022

This PR adds the following features to python/tvm/meta_schedule/testing/torchbench/run.py:

  • Use the TVM PyTorch integration to handle boolean tensors and unaligned memory.
  • Deduplicate collected tuning tasks so that hundreds of subgraphs with similar structure do not produce thousands of tasks.
  • Add an option to cast the model to float32, which is numerically more stable than float16 and avoids inaccurate results from many models.
  • Add an option to choose the MetaSchedule search strategy.
  • Inspect the output error when the actual output doesn't match the expected output, and save both outputs for further analysis if needed.
  • Save subgraphs and their example inputs for debugging.
  • Print MetaSchedule profiling information at the end of execution.
  • Detach PyTorch tensors before exporting them to DLPack.
  • Fix sys.path to avoid a conflict with the benchmarks package installed by a TorchBench dependency.
  • Trim all command-line args passed in to avoid breaking TorchBench models that depend on args.
  • Empty the CUDA cache before starting the actual benchmark.
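
For readers unfamiliar with the task deduplication item, here is a minimal sketch of the idea in plain Python. It is not the code from this PR: the `Task` class, its `structure_key` field, and the `deduplicate` helper are hypothetical names used only for illustration, and the sketch assumes each extracted task can be reduced to a canonical key describing its workload. Tasks with the same key are merged and their weights are summed, so a workload that appears in many subgraphs is tuned only once.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    """Hypothetical stand-in for a tuning task extracted from one subgraph."""

    name: str
    structure_key: str  # assumption: a canonical description of the workload
    weight: int = 1  # how many times this workload occurs


def deduplicate(tasks: List[Task]) -> List[Task]:
    """Merge tasks that share a structure key, summing their weights.

    The first task seen for each key is kept, so the result preserves
    the order in which distinct workloads were first collected.
    """
    merged: Dict[str, Task] = {}
    for task in tasks:
        seen = merged.get(task.structure_key)
        if seen is None:
            # Copy the task so the caller's list is not mutated.
            merged[task.structure_key] = Task(task.name, task.structure_key, task.weight)
        else:
            seen.weight += task.weight
    return list(merged.values())


# Three tasks collected from two subgraphs, two of which share a workload.
collected = [
    Task("subgraph0_dense", "dense[128x128]"),
    Task("subgraph1_dense", "dense[128x128]"),
    Task("subgraph1_relu", "relu[128]"),
]
unique = deduplicate(collected)
```

Here `unique` holds two tasks instead of three, and the merged dense task carries a weight of 2 so its share of the end-to-end latency is still accounted for.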

cc: @junrushao @zxybazh

@tvm-bot
Collaborator

tvm-bot commented Nov 1, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

Member

@junrushao junrushao left a comment

LGTM!

@junrushao junrushao merged commit b98b9f9 into apache:main Nov 3, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 10, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022
