
Performance Metrics Tracking #8931

Closed
wants to merge 36 commits

Conversation

@arindam1993 (Contributor) commented Nov 1, 2019

What?

Add tooling and a benchmark suite that can be used to automatically gather performance metrics across multiple browsers and devices.

Why?

Make it easier to detect and catch performance regressions automatically, and to measure the overall health of our SDK.

Which metrics?

Load Time metrics:

Measures "How soon till I see something on screen?"

  • loadTime (ms): Time elapsed from new Map() to the map.on('load') event; this is the point at which all the vector geometry has filled the screen.
  • fullLoadTime (ms): Time elapsed from new Map() to map.on('content.load') (this event is new, so the name is subject to change); this is the point at which all content defined by the style has been loaded, including text label placement finishing, raster tiles finishing loading for satellite and/or hillshade layers, etc. (A minimal measurement sketch follows this list.)
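
A minimal sketch of how a test page could capture these two timestamps. Here styleUrl is a placeholder, and 'content.load' is the proposed event described above, not a final gl-js API:

```js
// Sketch: capture loadTime and fullLoadTime from a test page.
const start = performance.now();
const map = new mapboxgl.Map({container: 'map', style: styleUrl});

map.once('load', () => {
    // All vector geometry has filled the screen.
    const loadTime = performance.now() - start;
    console.log(`loadTime: ${loadTime.toFixed(1)} ms`);
});

map.once('content.load', () => {
    // All content defined by the style has finished loading
    // (label placement, raster tiles, etc.).
    const fullLoadTime = performance.now() - start;
    console.log(`fullLoadTime: ${fullLoadTime.toFixed(1)} ms`);
});
```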

Rendering Performance Metrics:

Measures "How fluid does it feel to interact with the map?"

This is slightly trickier to quantify. Average frame-rate is a good measure, but it doesn't completely capture all sources of jank/hitch in an interactive application. For that we'd have to track frame-time, which is the amount of time required to render each frame. This is not a single number; we need to track it for every frame we render so we can detect spikes.
Here's a rather interesting article going into the details of why: https://cgvr.cs.ut.ee/wp/index.php/frame-rate-vs-frame-time/
To quantify jank from our frame-time data, we can take the slowest 1% of our frames and measure their average frame-rate. The closer our 1% Low FPS is to our Average FPS, the less janky our rendering engine is. (A sketch of how these numbers could be computed follows the list below.)

  • frameTimes (Array): an array of millisecond values that track how long it took to render each frame; this can be used to plot a frame-time chart.
  • fps (frames per second): the average frames per second for the entire run.
  • onePercentLowFps (frames per second): the frame-rate of our 1% slowest frames.
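
A minimal sketch of how fps and onePercentLowFps could be derived from the collected frameTimes array (illustrative only, not the exact instrumentation code):

```js
// Summarize a run from per-frame render durations (in milliseconds).
function summarizeFrameTimes(frameTimes) {
    const totalMs = frameTimes.reduce((sum, t) => sum + t, 0);
    const fps = 1000 / (totalMs / frameTimes.length); // average FPS over the run

    // Average frame-rate of the slowest 1% of frames.
    const sorted = frameTimes.slice().sort((a, b) => b - a);
    const count = Math.max(1, Math.floor(sorted.length * 0.01));
    const slowestAvgMs = sorted.slice(0, count).reduce((sum, t) => sum + t, 0) / count;
    const onePercentLowFps = 1000 / slowestAvgMs;

    return {frameTimes, fps, onePercentLowFps};
}
```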

How?

  • Add instrumentation to gl-js that tracks these metrics and can output them (a rough sketch of what this could look like follows this list).
  • Add build tooling that can strip the instrumentation out of release builds.
  • Add a benchmark suite that defines certain styles and Map operations on those styles.
  • Similar to the query test runner, build a test page that starts running the benchmark suite when loaded in a browser; this will make it easy to run the suite on multiple browsers and devices.

After some discussion with @ryanhamley today:

  • Check consistency of metrics on CI runs.
  • Integrate with memory profiling (Add some memory stats to gl-stats #8949).
  • Potentially integrate with gl-stats?
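
As a rough illustration of the first two bullets, the instrumentation could live behind a build-time flag so a bundler can drop it from release builds via dead-code elimination. All names here are hypothetical, not the final gl-js API:

```js
// Hypothetical instrumentation module; ENABLE_PERF_METRICS would be replaced
// with `false` at build time so release bundles drop these code paths.
const ENABLE_PERF_METRICS = true;

export const metrics = {
    loadStart: 0,
    loadTime: 0,
    fullLoadTime: 0,
    frameTimes: []
};

export function markLoadStart() {
    if (!ENABLE_PERF_METRICS) return;
    metrics.loadStart = performance.now();
}

export function markLoad() {
    if (!ENABLE_PERF_METRICS) return;
    metrics.loadTime = performance.now() - metrics.loadStart;
}

export function markFullLoad() {
    if (!ENABLE_PERF_METRICS) return;
    metrics.fullLoadTime = performance.now() - metrics.loadStart;
}

export function recordFrame(frameTimeMs) {
    if (!ENABLE_PERF_METRICS) return;
    metrics.frameTimes.push(frameTimeMs);
}
```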

How to run?

@ryanhamley changed the title from Performance Mertics Tracking to Performance Metrics Tracking on Nov 1, 2019
@ryanhamley (Contributor)

Add a benchmark suite that defines certain styles, and Map operations on those styles.

We do have style benchmarking already which runs the benchmark suite against various styles. You could probably leverage that to add operations to run against the styles.

Similar to the query test runner, build a test-page that when loaded in a browser starts running the benchmark suite, this will make it easy to run the suite on multiple browsers on various devices.

The benchmark suite already runs in the browser. Or are you talking about a totally new benchmark suite? If so, we might want to come up with some different terminology to disambiguate them

@arindam1993 (Contributor, Author)

Add a benchmark suite that defines certain styles, and Map operations on those styles.

We do have style benchmarking already which runs the benchmark suite against various styles. You could probably leverage that to add operations to run against the styles.

The benchmark suite is built for gathering granular performance data for very specific sections of the pipeline. The idea with this is to gather higher-level performance metrics for the entire system.
I think the benchmark suite acts kind of like unit tests for performance, whereas this acts more like an integration test for performance. The data gathered will not be as granular, but it should be a better summary that helps us catch performance regressions. So I'm trying to go with something that has a similar, if not inter-compatible, format with our render/query test fixtures.

Similar to the query test runner, build a test-page that when loaded in a browser starts running the benchmark suite, this will make it easy to run the suite on multiple browsers on various devices.

The benchmark suite already runs in the browser. Or are you talking about a totally new benchmark suite? If so, we might want to come up with some different terminology to disambiguate them

I agree, I think calling it the performance-metrics suite makes more sense?

@arindam1993 requested a review from mourner November 7, 2019 02:49
@arindam1993 (Contributor, Author)

After speaking with @ClareTrainor, we realized it would be helpful to allow the style to be dynamically overridden, so the map design team can generate these metrics while keeping the SDK constant and varying the style, whereas we can keep the style constant while varying the SDK. (A rough sketch of one way to do this is below.)
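
For example (just a sketch of one possible approach, not a settled design), the test page could accept a style override via a query parameter:

```js
// Sketch: let the benchmark page swap the style without rebuilding the SDK.
const params = new URLSearchParams(window.location.search);
const styleUrl = params.get('style') || 'mapbox://styles/mapbox/streets-v11';
const map = new mapboxgl.Map({container: 'map', style: styleUrl});
```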

ansis and others added 6 commits November 11, 2019 15:22
move featureMap from ProgramConfiguration to ProgramConfigurationSet.
All configurations within a set have the same vertex layout (because
they go together with the same vertex buffers).

This doesn't matter too much because the only time a set has more than
one ProgramConfiguration is when multiple layers have identical layout
properties. The main goal here is to make the relationship a tiny bit
clearer.
- Add circle job to run metrics
- Implement messaging system between browser<->puppeteer
- Calculate mean and std-deviation of metrics for final output
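
A rough sketch of what the browser<->puppeteer driver and the mean/std-deviation step could look like. This is an assumed shape only, not the exact code in these commits; the benchmark page is assumed to call a window.reportMetrics callback when a run finishes:

```js
const puppeteer = require('puppeteer');

// Aggregate a list of numbers into mean and standard deviation.
function meanAndStdDev(values) {
    const mean = values.reduce((s, v) => s + v, 0) / values.length;
    const variance = values.reduce((s, v) => s + (v - mean) ** 2, 0) / values.length;
    return {mean, stdDev: Math.sqrt(variance)};
}

// Load the benchmark page once and wait for it to report its metrics.
async function collectRun(browser, url) {
    const page = await browser.newPage();
    let resolveMetrics;
    const metricsPromise = new Promise(resolve => { resolveMetrics = resolve; });
    await page.exposeFunction('reportMetrics', metrics => resolveMetrics(metrics));
    await page.goto(url);
    const metrics = await metricsPromise;
    await page.close();
    return metrics;
}

async function main(url, runs = 5) {
    const browser = await puppeteer.launch();
    const results = [];
    for (let i = 0; i < runs; i++) {
        results.push(await collectRun(browser, url));
    }
    console.log('loadTime', meanAndStdDev(results.map(r => r.loadTime)));
    console.log('fps', meanAndStdDev(results.map(r => r.fps)));
    await browser.close();
}
```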
@arindam1993 (Contributor, Author)

Closing this, and moving the discussion to a separate issue. Most of the driver code should be split out into a separate repo.
