Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

report: add fractional category scores #13009

Merged
merged 20 commits into from
Sep 9, 2021
Merged

Conversation

adamraine
Copy link
Member

@adamraine adamraine commented Sep 3, 2021

This moves the category ratio indicator from the flow report down to the legacy report renderer. There are still some style details to work out, but I wanted to get some initial impressions on the structure.

Screen Shot 2021-09-08 at 3 34 07 PM

@google-cla google-cla bot added the cla: yes label Sep 3, 2021
@adamraine
Copy link
Member Author

I'm starting to think that we should use a score gauge for timespan performance and snapshot accessibility. Quick mock:

Screen Shot 2021-09-07 at 2 27 42 PM

Snapshot accessibility makes the most sense because it's the same 54 audits that are found in navigation mode, so it makes sense to represent it the same way.

For timespan performance, it could be confusing when the category is graded as "average" even though the metrics are all passing (example). The unscored diagnostics and opportunities heavily impact how the category is scored, even though the metrics are the only audits scored. I think a score gauge makes more sense than a ratio in this case.

@patrickhulce
Copy link
Collaborator

Back in April, we had this to say as a group.

We can’t give 0-100 Performance in these, we can for other categories (a11y) but should we?
We like consistency, if we can’t do it for performance, we probably shouldn’t for the others.
Perhaps we just do green/yellow/red for the categories.

I think that largely holds true for me today, details below.

Snapshot accessibility makes the most sense because it's the same 54 audits that are found in navigation mode, so it makes sense to represent it the same way.

I agree, and this is the one I'm closest to using a gauge for. In addition to the above, I see two more arguments against this.

  1. Reinforcing "this isn't the same navigation environment" is more important than conveying the a11y score out of 0-100.
  2. There's been a lot of pushback on a11y in particular for mistakenly conveying "100" when there's a lot not covered. Switching to a fraction helps alleviate that and remind folks it's not comprehensive.

I think with the consistency argument plus these two points, I still lean toward the fraction.

I think a score gauge makes more sense than a ratio in this [timespan] case.

I'd feel really uncomfortable with a timespan score given it's just TBT and CLS. Long-term I have hope for more metric-driven scores on timespans but we really need to let the space mature and establish some benchmarks for success before we can confidently provide a holistic score in this area, IMO. A suite of metrics that doesn't consider network latency or feedback at all can't reasonably be considered a 0-100 quality, balanced performance score.

it could be confusing when the category is graded as "average" even though the metrics are all passing

If more people knew that the metrics alone are what matter for the score in navigation, I might agree here, but with a switch to the fraction I think it's reasonable that the number of red/yellow items you see matches the number of failures reported.

@adamraine
Copy link
Member Author

It would be a simple change if we do decide to make it, so I'm willing to defer to our previous position and finish this PR as is. I had a couple more thoughts if we decide to revisit:

Long-term I have hope for more metric-driven scores on timespans but we really need to let the space mature and establish some benchmarks for success before we can confidently provide a holistic score in this area, IMO.

I agree that a 0-100 score for just TBT and CLS isn't ideal, I just thought it might be better than the fraction. I guess I'm hoping for a more metric-driven score as well :)

  1. Reinforcing "this isn't the same navigation environment" is more important than conveying the a11y score out of 0-100.
  2. There's been a lot of pushback on a11y in particular for mistakenly conveying "100" when there's a lot not covered. Switching to a fraction helps alleviate that and remind folks it's not comprehensive.

I think with the consistency argument plus these two points, I still lean toward the fraction.

The problem is that a11y is the same between the two environments, so I'm still worried about the consistency within the category.

@@ -562,6 +562,18 @@
</a>
</template>

<!-- Lighthouse category ratio -->
<template id="ratio">
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been calling this "ratio", but I've seen others refer to it as "fraction". Any strong opinions about the name?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I typically refer to a representation like this as a "fraction" when the first number is strictly less than or equal to the second number, and a "ratio" when the two numbers have a more arbitrary relationship.

By that lens, I would call these a fraction.

@adamraine adamraine marked this pull request as ready for review September 8, 2021 19:35
@adamraine adamraine requested a review from a team as a code owner September 8, 2021 19:35
@adamraine adamraine requested review from patrickhulce and removed request for a team September 8, 2021 19:35
@@ -317,6 +317,10 @@ export class CategoryRenderer {
* @return {DocumentFragment}
*/
renderScoreGauge(category, groupDefinitions) { // eslint-disable-line no-unused-vars
if (category.displayMode === 'ratio') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like a bit of a misleading path where we've asked for a gauge but receive a ratio

new method renderCategoryScore that does this delegation like you've done in the flow report?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started doing this but backed out once I reached

} else if (renderer.renderScoreGauge === categoryRenderer.renderScoreGauge) {

We would just keep that line the same right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to replace this line with renderCategoryScore === renderCategoryScore and rename the PWA one to renderCategoryScore. In fact that name makes more sense anyway because it doesn't have a gauge :)

We can alias renderScoreGauge to renderCategoryScore in PWA category for backcompat until v9 in case any partners are using it.

@adamraine adamraine changed the title report: category ratio report: category fraction Sep 8, 2021
@@ -17,7 +17,8 @@ interface ReportResult extends LHResult {

declare module ReportResult {
interface Category extends LHResult.Category {
auditRefs: Array<AuditRef>
auditRefs: Array<AuditRef>;
displayMode: 'gauge'|'fraction';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking of the API proposal in #12772 (where prepareReportResult won't necessarily be used) and the possibility there might be more changes needed to category rendering besides just gauge vs fraction, what do you think about an options object passed into renderCategoryScore?

It seems like that's ultimately what we're going to want from renderReport()/renderReportPartials() and it would be nice to not alter ReportResult further from the LHR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about an options object passed into renderCategoryScore?

That would mean passing the options through multiple layers of function calls, in multiple different locations. We could do it, but it seems like ReportResult was designed so we don't have to make these kinds of structural changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReportResult wasn't so much designed :)

* TODO(team): we all agree the LHR shape change is technical debt we should fix

It is a structural change and it would have to go through a few layers (though not so many). I think it may be worth it, though:

  • I want to call renderGaugeForCategory(category, groups, options), I don't want to have to call prepareReportResult first
  • if there are other per-mode changes (e.g. if we go with Use fraction score for non-performance categories #13022, perf is going to be different than the other categories in navigation reports but not in timespan/snapshot), we're going to have to expand displayMode or generalize it anyways

basically if we're going to parameterize on gatherMode, let's just do it outright rather than roundabout through ReportResult :)

@brendankenny brendankenny changed the title report: category fraction report: add fractional category scores Sep 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants