Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(font-size): speed up gatherer #6391

Closed
wants to merge 18 commits into from
Closed
Show file tree
Hide file tree
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 4 additions & 13 deletions lighthouse-core/audits/seo/font-size.js
Original file line number Diff line number Diff line change
Expand Up @@ -208,7 +208,6 @@ class FontSize extends Audit {
analyzedFailingNodesData,
analyzedFailingTextLength,
failingTextLength,
visitedTextLength,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed visitedTextLength because we can now easily visit every node (it's one protocol call).

totalTextLength,
} = artifacts.FontSize;

Expand All @@ -220,7 +219,7 @@ class FontSize extends Audit {

const failingRules = getUniqueFailingRules(analyzedFailingNodesData);
const percentageOfPassingText =
(visitedTextLength - failingTextLength) / visitedTextLength * 100;
(totalTextLength - failingTextLength) / totalTextLength * 100;
const pageUrl = artifacts.URL.finalUrl;

const headings = [
Expand All @@ -232,7 +231,7 @@ class FontSize extends Audit {

const tableData = failingRules.sort((a, b) => b.textLength - a.textLength)
.map(({cssRule, textLength, fontSize, node}) => {
const percentageOfAffectedText = textLength / visitedTextLength * 100;
const percentageOfAffectedText = textLength / totalTextLength * 100;
const origin = findStyleRuleSource(pageUrl, cssRule, node);

return {
Expand All @@ -246,7 +245,7 @@ class FontSize extends Audit {
// all failing nodes that were not fully analyzed will be displayed in a single row
if (analyzedFailingTextLength < failingTextLength) {
const percentageOfUnanalyzedFailingText =
(failingTextLength - analyzedFailingTextLength) / visitedTextLength * 100;
(failingTextLength - analyzedFailingTextLength) / totalTextLength * 100;

tableData.push({
source: 'Add\'l illegible text',
Expand All @@ -273,15 +272,7 @@ class FontSize extends Audit {
let explanation;
if (!passed) {
const percentageOfFailingText = parseFloat((100 - percentageOfPassingText).toFixed(2));
let disclaimer = '';

// if we were unable to visit all text nodes we should disclose that information
if (visitedTextLength < totalTextLength) {
const percentageOfVisitedText = visitedTextLength / totalTextLength * 100;
disclaimer = ` (based on ${percentageOfVisitedText.toFixed()}% sample)`;
}

explanation = `${percentageOfFailingText}% of text is too small${disclaimer}.`;
explanation = `${percentageOfFailingText}% of text is too small.`;
}

return {
Expand Down
187 changes: 111 additions & 76 deletions lighthouse-core/gather/gatherers/seo/font-size.js
Original file line number Diff line number Diff line change
Expand Up @@ -17,12 +17,10 @@
*/

const Gatherer = require('../gatherer');
const Sentry = require('../../../lib/sentry.js');
const FONT_SIZE_PROPERTY_NAME = 'font-size';
const TEXT_NODE_BLOCK_LIST = new Set(['SCRIPT', 'STYLE', 'NOSCRIPT']);
const MINIMAL_LEGIBLE_FONT_SIZE_PX = 12;
// limit number of protocol calls to make sure that gatherer doesn't take more than 1-2s
const MAX_NODES_VISITED = 500; // number of nodes to get the text length and compute font-size
const MAX_NODES_SOURCE_RULE_FETCHED = 50; // number of nodes to fetch the source font-size rule

/** DevTools uses a numeric enum for nodeType */
Expand All @@ -31,6 +29,7 @@ const TEXT_NODE_TYPE = 3;
/** @typedef {import('../../driver.js')} Driver */
/** @typedef {LH.Artifacts.FontSize['analyzedFailingNodesData'][0]} NodeFontData */
/** @typedef {LH.Artifacts.FontSize.DomNodeMaybeWithParent} DomNodeMaybeWithParent*/
/** @typedef {Map<number, {fontSize: number, textLength: number}>} BackendIdsToFontData */

/**
* @param {LH.Artifacts.FontSize.DomNodeMaybeWithParent=} node
Expand Down Expand Up @@ -178,11 +177,11 @@ function getEffectiveFontRule({inlineStyle, matchedCSSRules, inherited}) {
}

/**
* @param {LH.Crdp.DOM.Node} node
* @param {string} nodeValue
* @returns {number}
*/
function getNodeTextLength(node) {
return !node.nodeValue ? 0 : node.nodeValue.trim().length;
function getNodeTextLength(nodeValue) {
return nodeValue.trim().length;
}

/**
Expand All @@ -209,78 +208,14 @@ async function fetchSourceRule(driver, node) {
}

/**
* @param {Driver} driver
* @param {LH.Artifacts.FontSize.DomNodeWithParent} textNode text node
* @returns {Promise<?NodeFontData>}
*/
async function fetchComputedFontSize(driver, textNode) {
try {
const {computedStyle} = await driver.sendCommand('CSS.getComputedStyleForNode', {
nodeId: textNode.parentId,
});

const fontSizeProperty = computedStyle.find(({name}) => name === FONT_SIZE_PROPERTY_NAME);

return {
// @ts-ignore - font size property guaranteed to be returned in getComputedStyle
fontSize: parseInt(fontSizeProperty.value, 10),
textLength: getNodeTextLength(textNode),
node: textNode.parentNode,
};
} catch (err) {
Sentry.captureException(err, {tags: {gatherer: 'FontSize'}});
return null;
}
}

/**
* @param {LH.Artifacts.FontSize.DomNodeWithParent} node
* @param {{nodeType: number, parentNodeName: string}} node
* @returns {boolean}
*/
function isNonEmptyTextNode(node) {
return (
node.nodeType === TEXT_NODE_TYPE &&
!TEXT_NODE_BLOCK_LIST.has(node.parentNode.nodeName) &&
getNodeTextLength(node) > 0
);
function isTextNode(node) {
return node.nodeType === TEXT_NODE_TYPE && !TEXT_NODE_BLOCK_LIST.has(node.parentNodeName);
}

class FontSize extends Gatherer {
/**
* @param {LH.Gatherer.PassContext['driver']} driver
*/
static async fetchNodesToAnalyze(driver) {
let failingTextLength = 0;
let visitedTextLength = 0;
let totalTextLength = 0;

const nodes = await getAllNodesFromBody(driver);

const textNodes = nodes.filter(isNonEmptyTextNode);
totalTextLength = textNodes.reduce((sum, node) => (sum += getNodeTextLength(node)), 0);

const nodesToVisit = textNodes
.sort((a, b) => getNodeTextLength(b) - getNodeTextLength(a))
.slice(0, MAX_NODES_VISITED);

const nodePromises = nodesToVisit.map(node => fetchComputedFontSize(driver, node));
const visitedNodes = await Promise.all(nodePromises);

/** @type {Array<NodeFontData>} */
const failingNodes = [];
for (const visitedNode of visitedNodes) {
if (!visitedNode) continue;
visitedTextLength += visitedNode.textLength;

if (visitedNode.fontSize < MINIMAL_LEGIBLE_FONT_SIZE_PX) {
failingNodes.push(visitedNode);
failingTextLength += visitedNode.textLength;
}
}

return {totalTextLength, visitedTextLength, failingTextLength, failingNodes};
}

/**
* @param {LH.Gatherer.PassContext['driver']} driver
* @param {Array<NodeFontData>} failingNodes
Expand All @@ -304,6 +239,99 @@ class FontSize extends Gatherer {
return {analyzedFailingNodesData, analyzedFailingTextLength};
}

/**
* @param {LH.Crdp.DOMSnapshot.CaptureSnapshotResponse} snapshot
* @return {BackendIdsToFontData}
*/
calculateBackendIdsToFontData(snapshot) {
const strings = snapshot.strings;

// The document under analysis is the root document.
const doc = snapshot.documents[0];

// doc is a flattened property list describing all the Nodes in a document, with all string values
// deduped in a strings array.
// Implementation:
// https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/inspector/inspector_dom_snapshot_agent.cc?sq=package:chromium&g=0&l=534

// satisfy the type checker that all expected values exist
if (!doc || !doc.nodes.nodeType || !doc.nodes.nodeName || !doc.nodes.backendNodeId
|| !doc.nodes.nodeValue || !doc.nodes.parentIndex) {
throw new Error('Unexpected response from DOMSnapshot.captureSnapshot.');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it looks like these are meant to satisfy the type checker? Maybe that's worth commenting, e.g. // satisfy the type checker that all expected values exist


// Not all nodes have computed styles (ex: TextNodes), so doc.layout.* is smaller than doc.nodes.*
// doc.layout.nodeIndex maps the index into doc.layout.styles to an index into doc.nodes.*
// nodeIndexToStyleIndex inverses that mapping.
/** @type {Map<number, number>} */
const nodeIndexToStyleIndex = new Map();
doc.layout.nodeIndex.forEach((nodeIndex, styleIndex) => {
nodeIndexToStyleIndex.set(nodeIndex, styleIndex);
});

/** @type {BackendIdsToFontData} */
const backendIdsToFontData = new Map();
for (let i = 0; i < doc.nodes.nodeType.length; i++) {
const nodeType = doc.nodes.nodeType[i];
const nodeValue = strings[doc.nodes.nodeValue[i]];
const parentIndex = doc.nodes.parentIndex[i];
const parentNodeName = strings[doc.nodes.nodeName[parentIndex]];
if (!isTextNode({
nodeType,
parentNodeName,
})) continue;

const styleIndex = nodeIndexToStyleIndex.get(parentIndex);
if (!styleIndex) continue;
const textLength = getNodeTextLength(nodeValue);
if (!textLength) continue; // ignore empty TextNodes
const parentStyles = doc.layout.styles[styleIndex];
// each styles is an array of string indices, one for each property given
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each styles? doc.layout.styles is an array of.... ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc.layout.styles is an array of an array of string indices ...

// to DOMSnapshot.captureSnapshot. We only requested font-size, so there's just
// the one string index here.
const [fontSizeStringId] = parentStyles;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is arbitrary enough that this probably deserves a comment. Is it the only one in that array because font-size is all we requested? Do we want to assert it has length 1 to make sure of that fact?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment here would still be good too. The type is just Array<string>, so it and the protocol docs don't give much help on why this is the way it is

Copy link
Collaborator Author

@connorjclark connorjclark Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment explaining why there's just the one value here.

I'm unsure about adding an assertion for .length === 1 here. This property is inherent to the API for this protocol method - do we assert post conditions like this often for other protocol methods?

// expects values like '11.5px' here
const fontSize = parseFloat(strings[fontSizeStringId]);
backendIdsToFontData.set(doc.nodes.backendNodeId[i], {
fontSize,
textLength,
});
}

return backendIdsToFontData;
}

/**
* The only connection between a snapshot Node and an actual Protocol Node is backendId,
* so that is used to join the two data structures. DOMSnapshot.captureSnapshot doesn't
* give the entire Node object, so DOM.getFlattenedDocument is used.
* @param {BackendIdsToFontData} backendIdsToFontData
* @param {LH.Artifacts.FontSize.DomNodeWithParent[]} crdpNodes
*/
findFailingNodes(backendIdsToFontData, crdpNodes) {
/** @type {NodeFontData[]} */
const failingNodes = [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think we could move all this into some functions for the different steps you're adding so afterPass will be pretty short like before? I think it might help human-parsers figure out what's going on :)

it looks like there's roughly the following stages

  1. get the flattened document + snapshot and make sure its all valid
  2. create our mappings to be able to tie the two together
  3. go through all the nodes and identify the failing ones

is that roughly accurate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that sounds good. I mostly wanted to wait for PR feedback before chopping up afterPass, just incase the whole thing was moot or something :P

let totalTextLength = 0;
let failingTextLength = 0;
for (const crdpNode of crdpNodes) {
const partialFontData = backendIdsToFontData.get(crdpNode.backendNodeId);
if (!partialFontData) continue; // wasn't a non-empty TextNode
const {fontSize, textLength} = partialFontData;
totalTextLength += textLength;
if (fontSize < MINIMAL_LEGIBLE_FONT_SIZE_PX) {
// Once a bad TextNode is identified, its parent Node is needed.
failingTextLength += textLength;
failingNodes.push({
node: crdpNode.parentNode,
textLength,
fontSize,
});
}
}

return {totalTextLength, failingTextLength, failingNodes};
}

/**
* @param {LH.Gatherer.PassContext} passContext
* @return {Promise<LH.Artifacts.FontSize>} font-size analysis
Expand All @@ -316,17 +344,24 @@ class FontSize extends Gatherer {
passContext.driver.on('CSS.styleSheetAdded', onStylesheetAdd);

await Promise.all([
passContext.driver.sendCommand('DOMSnapshot.enable'),
passContext.driver.sendCommand('DOM.enable'),
passContext.driver.sendCommand('CSS.enable'),
]);

// Use DOMSnapshot.captureSnapshot to get the computed font-size style of every node.
const snapshot = await passContext.driver.sendCommand('DOMSnapshot.captureSnapshot', {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very small speedup if you run these two in parallel:

 lighthouse-core/gather/gatherers/seo/font-size.js | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lighthouse-core/gather/gatherers/seo/font-size.js b/lighthouse-core/gather/gatherers/seo/font-size.js
index e6d0dbe0..3ceac9bd 100644
--- a/lighthouse-core/gather/gatherers/seo/font-size.js
+++ b/lighthouse-core/gather/gatherers/seo/font-size.js
@@ -350,13 +350,15 @@ class FontSize extends Gatherer {
     ]);
 
     // Use DOMSnapshot.captureSnapshot to get the computed font-size style of every node.
-    const snapshot = await passContext.driver.sendCommand('DOMSnapshot.captureSnapshot', {
+    const snapshotPromise = passContext.driver.sendCommand('DOMSnapshot.captureSnapshot', {
       computedStyles: ['font-size'],
     });
+    // but nodes will only contain the Body node and its descendants.
+    const allNodesPromise  = getAllNodesFromBody(passContext.driver);
+    const [snapshot, crdpNodes] = await Promise.all([snapshotPromise, allNodesPromise]);
+
     // backendIdsToFontData will include all non-empty TextNodes,
     const backendIdsToFontData = this.calculateBackendIdsToFontData(snapshot);
-    // but nodes will only contain the Body node and its descendants.
-    const crdpNodes = await getAllNodesFromBody(passContext.driver);
     const {
       totalTextLength,
       failingTextLength,

computedStyles: ['font-size'],
});
// backendIdsToFontData will include all non-empty TextNodes,
const backendIdsToFontData = this.calculateBackendIdsToFontData(snapshot);
// but nodes will only contain the Body node and its descendants.
const crdpNodes = await getAllNodesFromBody(passContext.driver);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getAllNodesFromBody relies on DOM.getFlattenedDocument but I think all the information we need from that is exposed in the snapshot.

Using #6410 it looks like getFlattenedDocument is slower, so it seems nice to drop it.
image

driver.getNodesInDocument is only used by font-size, so we can remove it from driver and rely on the capturedSnapshot entirely. wdyt?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FontSize artifact contains entire CRDP Nodes. The snapshot data doesn't have all the data to reconstruct a CRDP Node.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need everything that's in CRDP Node.
The audit only uses node.parentNode and node.attributes. It looks like all that data is already in the snapshot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need everything that's in CRDP Node.

we need CSS.getMatchedStylesForNode to get the current styles for a node, which needs the nodeId to query for it

Copy link
Collaborator Author

@connorjclark connorjclark Oct 31, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@paulirish , given the above (we need the snapshot + the flattened document protocol calls, to correlate failing nodes with their matched styles), should we still reduce the scope of artifact?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah looks like the snapshot doesn't have nodeId. only backendNodeId. Bummer.

K. We'll keep the current protocol calls. :)

const {
totalTextLength,
visitedTextLength,
failingTextLength,
failingNodes,
} = await FontSize.fetchNodesToAnalyze(passContext.driver);

} = this.findFailingNodes(backendIdsToFontData, crdpNodes);
const {
analyzedFailingNodesData,
analyzedFailingTextLength,
Expand All @@ -340,6 +375,7 @@ class FontSize extends Gatherer {
.forEach(data => (data.cssRule.stylesheet = stylesheets.get(data.cssRule.styleSheetId)));

await Promise.all([
passContext.driver.sendCommand('DOMSnapshot.disable'),
passContext.driver.sendCommand('DOM.disable'),
passContext.driver.sendCommand('CSS.disable'),
]);
Expand All @@ -348,7 +384,6 @@ class FontSize extends Gatherer {
analyzedFailingNodesData,
analyzedFailingTextLength,
failingTextLength,
visitedTextLength,
totalTextLength,
};
}
Expand Down
Loading