General performance testing tooling improvements #1577

Merged
merged 4 commits into from
Sep 11, 2019

skottmckay
Contributor

Description:
Various changes to aid performance testing

  • Add access to parent node from a subgraph so details from that node can be used
  • Default to level 3 optimization for onnxruntime_perf_test
  • Add code to create Concurrency Visualizer markers so it's easy for anyone to use that tool.
    • Disabled by #define.
  • Add support for parallel requests to onnxruntime_perf_test
    • use threadpool with one thread per concurrent request
    • keep all threads running until all requests are done
      • previously the tool issued n concurrent requests, waited for all threads to finish, and repeated, so ORT always drained to zero in-flight requests before new ones were issued
    • issue the specified number of requests (-r parameter value) without multiplying by the number of concurrent requests.
  • When statistics are requested write them to std::cout as well as the output file so they're easily viewed
  • Add overall time value
    • when issuing concurrent requests this is meaningful
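
The parallel-request scheme described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual onnxruntime_perf_test code: `RunConcurrently` and `run_once` are hypothetical names, and plain `std::thread` workers stand in for the thread pool. Each worker keeps claiming work from a shared atomic counter until the requested total is reached, so a finished request is replaced immediately instead of waiting for the whole batch, and the function returns the overall wall-clock time.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not the onnxruntime_perf_test implementation).
// Issues exactly `total_requests` calls to `run_once`, spread across
// `concurrent_requests` worker threads, and returns the overall time in seconds.
double RunConcurrently(std::size_t total_requests,
                       std::size_t concurrent_requests,
                       const std::function<void()>& run_once) {
  std::atomic<std::size_t> next{0};  // index of the next request to claim
  const auto start = std::chrono::steady_clock::now();

  std::vector<std::thread> workers;
  workers.reserve(concurrent_requests);
  for (std::size_t i = 0; i != concurrent_requests; ++i) {
    workers.emplace_back([&] {
      // Keep claiming requests until the shared counter passes the total,
      // so every thread stays busy until all requests are done.
      while (next.fetch_add(1) < total_requests) {
        run_once();
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }

  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count();
}
```

Calling `RunConcurrently(1000, 4, fn)` would invoke `fn` 1000 times in total with up to four calls in flight, which matches the intent that the -r value is not multiplied by the number of concurrent requests.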

Motivation and Context
General changes based on recent performance testing needs

skottmckay requested a review from a team as a code owner on August 7, 2019 05:40
counter.load(std::memory_order_seq_cst);
requests.load(std::memory_order_seq_cst);

for (size_t i = 0; i != run_config.concurrent_session_runs; ++i) {
Member

Why did you remove one level of the loop?

Contributor Author

This way we do the number of requests that the '-r' parameter specified. It's confusing to run with something like '-r 1000 -c 2' and have that send 2000 requests vs '-r 1000 -c 4' and have that send 4000 requests. Maybe 'concurrent_session_runs' should be renamed to 'concurrent_requests'.
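
To make the counting difference concrete, here is a small sketch of the two schemes as described in this thread. Both function names are hypothetical, and the nested-loop shape of the old behavior is inferred from the discussion rather than copied from the original source.

```cpp
#include <cstddef>

// Old scheme (as described): an outer loop of r iterations, each issuing a
// batch of c concurrent requests, so the total scales with c.
std::size_t RequestsIssuedOld(std::size_t r, std::size_t c) {
  std::size_t issued = 0;
  for (std::size_t iteration = 0; iteration != r; ++iteration) {
    for (std::size_t i = 0; i != c; ++i) {
      ++issued;  // one request per concurrent slot in this batch
    }
  }
  return issued;
}

// New scheme: r is the total number of requests; c only controls how many
// of them may be in flight at once.
std::size_t RequestsIssuedNew(std::size_t r, std::size_t c) {
  (void)c;  // concurrency does not change the total
  return r;
}
```

With '-r 1000 -c 2' the old scheme sends 2000 requests and with '-r 1000 -c 4' it sends 4000, while the new scheme sends 1000 in both cases.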
