Lighthouse/Puppeteer integration #3837
Thanks for filing @Khady! In short, it's not possible today. Lighthouse always controls the navigation to the page. There are some settings (user agent, setting a cookie, logging in) you'd be able to set using puppeteer before providing the same port to LH, but you'll need to make sure LH isn't overriding those by disabling mobile emulation/storage reset where applicable. We've got it on our roadmap to enable auditing without a navigation, though, in which case this sort of thing becomes possible :)
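A minimal sketch of the "same port" approach described above: the helper derives the debugging port from a Puppeteer WebSocket endpoint and builds Lighthouse flags that leave storage alone. `flagsForSharedBrowser` is a hypothetical name. `port` and `disableStorageReset` are Lighthouse flags; the flag that turns off mobile emulation has changed between Lighthouse versions, so it is left to the caller via `extraFlags`.

```javascript
// Hypothetical helper: build Lighthouse flags for a browser that Puppeteer already controls.
function flagsForSharedBrowser(wsEndpoint, extraFlags = {}) {
  // A Puppeteer endpoint looks like ws://127.0.0.1:9222/devtools/browser/<id>.
  const port = Number(new URL(wsEndpoint).port);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`No debugging port found in endpoint: ${wsEndpoint}`);
  }
  return {
    port,                      // point Lighthouse at the same Chrome instance
    disableStorageReset: true, // keep cookies/logins that were set up beforehand
    ...extraFlags,             // e.g. the version-specific emulation setting
  };
}

// Usage (assumes `browser` came from puppeteer and `lighthouse` is the node module):
//   const flags = flagsForSharedBrowser(browser.wsEndpoint());
//   const {lhr} = await lighthouse(url, flags);
```

Lighthouse still performs its own navigation here, so this only helps with state that survives a page load.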
We're not ready to open up the full multiclient story where Lighthouse and Puppeteer/CRI talk to the same page. There are some dragons within here that we're not ready to fight yet. There's another approach we're discussing and we currently favor:
You can do all this today without any code changes (although #3864 should help quite a bit). Your custom gatherer won't actually return a useful artifact, but that's OK. We're just abusing its lifecycle hooks. @patrickhulce does this match what you were thinking?
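A rough sketch of what "abusing the lifecycle hooks" could look like. `createHookGatherer` and the `hooks` object are hypothetical names; `BaseGatherer` stands for Lighthouse's `Gatherer` base class, and `beforePass`/`afterPass` are gatherer lifecycle methods. What the pass context exposes depends on the Lighthouse version, so the hooks simply receive it untouched.

```javascript
// Hypothetical factory: wraps user callbacks in a gatherer that exists only for its hooks.
function createHookGatherer(BaseGatherer, hooks) {
  return class PuppeteerHooks extends BaseGatherer {
    // Runs before Lighthouse navigates to the page.
    async beforePass(passContext) {
      if (hooks.beforeNavigation) await hooks.beforeNavigation(passContext);
    }

    // Runs after the page has loaded; the returned artifact is a placeholder.
    async afterPass(passContext) {
      if (hooks.afterLoad) await hooks.afterLoad(passContext);
      return {};
    }
  };
}

// Usage (assumption: the legacy node module, where the base class is exported as `Gatherer`):
//   const Gatherer = require('lighthouse').Gatherer;
//   const MyHooks = createHookGatherer(Gatherer, {
//     beforeNavigation: async () => { /* e.g. connect puppeteer and set a cookie */ },
//     afterLoad: async () => { /* e.g. read the page HTML */ },
//   });
```

The class would then be listed as a gatherer in a custom config pass like any other gatherer.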
Yeah this seems like the quickest way to achieve as much of the goals as possible today. Long-term vision, reusing existing tab and making LH more flexible in analyzing existing pages is definitely the way we should be moving to play better with DevTools and puppeteer 👍 |
how do we feel about creating an example: |
I'm working on a new version of my tool following your advice.
aslushnikov says in puppeteer/puppeteer#1398 (comment) that pptr could move from ws connection to pipe connection. If the pageId/targetId system is not portable over the pipe connection then I guess there is not much choice but to keep the connection system as it is currently. No point adding #3857 if it is going to be deprecated soon. ps: I think I found a possible bug in |
Great feedback @Khady you're somewhat of a pioneer in this area, so it's great to be aware of the pain points :) A few responses to your comments below
Ah, you're having trouble finding the target to use in puppeteer once the page has been loaded, correct? Yeah, we should expand #3864 to communicate the target/page ID as well.
Yes, we had a plan for this and haven't gotten around since there wasn't an immediate need, but we want to implement audit and gatherer options to pass in dynamic runtime information that can control audit/gatherer behavior separately from the gatherer/audit code itself.
You should be able to mark an error as fatal before throwing it (see lighthouse/lighthouse-core/gather/gather-runner.js, lines 126 to 138 in 407b1af):
pass(/** stuff */) {
  const error = new Error("Uh-oh something went wrong!");
  // Flag the error as fatal before throwing it.
  error.fatal = true;
  throw error;
}
Ah, good find! You're right we've never really run into this, especially since we discourage using headless for its lack of throttling, but we should update that to throw loudly at this point if we can't create a tab :) |
Thank you for your help!
Correct.
Good news. I can manage to do what I want in the current situation. But it's great to have visibility on the future plans.
Awesome. I should have read the whole code related to the gatherers and not only some parts. Nothing is blocking me for now, thanks to your advice. I just exploit a few undocumented pieces of information (
We will sort this out in the next 2 quarters. Thanks! |
I'm also interested in the request interception side of things. We're running Lighthouse as part of our CI/CD pipeline. However, our API endpoints have really erratic behavior, and requests can take anything from 500ms to 2s. That's forcing us to make our TTI checks much laxer than what we'd want. If we could intercept requests to those endpoints and immediately respond with a fixture, we'd have much more deterministic numbers, and we could tighten our TTI checks. |
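To make the fixture idea concrete, here is a minimal sketch of a Puppeteer `request` handler that answers matching API calls from fixtures and lets everything else through. `createFixtureHandler`, `urlPrefix`, and the example URL are hypothetical; `request.url()`, `request.respond()`, and `request.continue()` are Puppeteer's interception API. Whether interception set up this way stays active during a Lighthouse run is not established in this thread.

```javascript
// Hypothetical helper: returns a Puppeteer 'request' handler that serves fixtures.
function createFixtureHandler(fixtures) {
  return function handleRequest(request) {
    const match = fixtures.find((fixture) => request.url().startsWith(fixture.urlPrefix));
    if (!match) {
      // Not a stubbed endpoint: let the request go to the network.
      return request.continue();
    }
    // Stubbed endpoint: answer immediately with the fixture body.
    return request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(match.body),
    });
  };
}

// Usage with a Puppeteer page:
//   await page.setRequestInterception(true);
//   page.on('request', createFixtureHandler([
//     {urlPrefix: 'https://api.example.com/users', body: [{id: 1}]},
//   ]));
```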
See #5472 for another use case |
I'm running Chrome with chrome-launcher, then connecting to it with puppeteer. The only thing I'm setting up is this:
// add HTTP BasicAuth credentials on new tab creation:
browser.on('targetcreated', async (target) => {
  const page = await target.page()
  if (page) await page.authenticate(basicAuth)
})
It works, but once Puppeteer is connected, Lighthouse (and Chrome's devtools, for that matter) stops gathering the size of requests. Anybody know why, or how to mitigate this (size 0 everywhere)?
Going to add my perspective because I did not see mention of this after reading through the thread: My use-case: I'm looking to run multiple concurrent Lighthouse audits using a single instance of Chrome using a new Incognito Browser Context for each audit so that no data storage/state is shared between concurrent audits. Ideally each audit could be preceded by a series of actions (e.g. a log in), and state would be maintained per incognito context (tab). However, following the Puppeteer recipes in this repo, it seems like Lighthouse always opens the URL in the default (shared) browser context. |
Have you seen this? https://github.com/GoogleChrome/lighthouse/blob/master/docs/recipes/auth/example-lh-auth.js Puppeteer, by default, uses a fresh Chrome profile, so if you launch it like the above script does you shouldn't see any state persist.
FYI we recommend against this. If you rely on the performance category, the results will be skewed. Even if you don't, you risk protocol timeouts by asking Chrome to do too much at once. |
Thanks @connorjclark. Yep, I've seen that recipe. My specific problem with the fresh Chrome profile on launch approach is that I'm exposing Puppeteer as a micro-service (so it only launches when the service re/starts). Multiple clients can hit this service, but their requests are sandboxed from each other via Incognito contexts; I was hoping to borrow the same sandboxing approach for Lighthouse performance audits as well.
This is good to know (I'm looking at just the performance category for right now). Is this something that can be mitigated by throwing additional resources at Chrome (e.g. CPU cores / Memory)? Any documentation you have on this would be very much appreciated. I'd also be curious to hear how Google approaches scaling the PageSpeed Insights API, given the recommendation against concurrent audits in a single Chrome instance. |
I'd suggest this is a micro-optimization. Also, LH directs the browser to clear the cache on each run (by default), so you're also at risk of runs stomping on the cache of other runs.
We have many machines, a load balancer, and queue things up in the worst case. You could probably get away with a few concurrent runs, but I'd measure to be safe. 3 is probably fine on any non-network constrained machine. In any case, you certainly should queue up LH runs if you get more than 3 req/minute. |
In addition to connor's advice, if you're going to run LH concurrently (again we recommend you don't or your performance variability will be quite high), run each Lighthouse in its own child process and dedicate at least 2 cores to its execution. Scaling horizontally has been shown to yield more consistent results than scaling vertically, i.e. using 8 smaller 2-core machines as opposed to running 8 runs on a 16-core machine. Just avoid any burstable instance types.
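The queueing advice above can be sketched as a small concurrency limiter. `createRunQueue` is a hypothetical helper, not part of Lighthouse; each task is any function that returns a promise, which in practice would spawn one Lighthouse run in its own child process.

```javascript
// Hypothetical helper: run at most `maxConcurrent` tasks at once, queueing the rest.
function createRunQueue(maxConcurrent) {
  let active = 0;
  const pending = [];

  function next() {
    if (active >= maxConcurrent || pending.length === 0) return;
    const {task, resolve, reject} = pending.shift();
    active += 1;
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        // A slot is free again: start the next queued task, if any.
        active -= 1;
        next();
      });
  }

  // Returns a promise that settles with the task's own result or error.
  return function enqueue(task) {
    return new Promise((resolve, reject) => {
      pending.push({task, resolve, reject});
      next();
    });
  };
}

// Usage (assumption: `runLighthouseInChildProcess` is your own wrapper around the CLI):
//   const enqueue = createRunQueue(3);
//   const report = await enqueue(() => runLighthouseInChildProcess(url));
```

The limit of 3 in the usage comment simply mirrors the number suggested earlier in the thread; measure before settling on a value.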
@iamEAP In our case, we run Lighthouse in a serverless compute service (e.g. AWS Lambda). We do this to run 60 tests simultaneously and then extract median performance data to see whether a given code change causes a performance regression (or is an improvement). This makes it easy to run LH concurrently (and scalably) and you get meaningful results as soon as the longest run completes. |
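For the "extract median performance data" step, a minimal sketch of comparing two batches of runs by their median. The function names, the sample numbers, and the tolerance are all hypothetical; the values would be whatever metric you pull from each run, e.g. TTI in milliseconds.

```javascript
// Median of a list of numbers (the input array is not modified).
function median(values) {
  if (values.length === 0) throw new Error('median of an empty list is undefined');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags a regression when the candidate median is worse by more than `toleranceMs`.
function isRegression(baselineRuns, candidateRuns, toleranceMs) {
  return median(candidateRuns) - median(baselineRuns) > toleranceMs;
}

// Usage with made-up TTI values in milliseconds:
//   isRegression([1000, 1100, 1050], [1300, 1250, 1400], 100);
```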
Sorry if this was already obvious, but as far as I understand, most use cases above could be solved by adding a way to connect to a Chrome instance by providing a browser websocket URL, right? Just like puppeteer.connect accepts a
Can I please get an example of the client calling Lighthouse and passing the browser in the parameter userConnect? I am trying to run Lighthouse on a URL after navigating in puppeteer, but a new tab is launched every time Lighthouse is called. I don't want a new tab to be launched; I want the existing tab to be reused. Thanks in advance! puppeteer version 7.6.0
We have puppeteer examples here: https://github.com/GoogleChrome/lighthouse/blob/master/docs/puppeteer.md |
From the look of the doc it only partially solves the original issue. For example it doesn't offer a way to force lighthouse to use a specific tab. |
That is true. I went through all these examples but couldn't find a way to run the Lighthouse analysis on an existing tab opened in puppeteer. I can partially achieve the result using lighthouse.snapshot, as in lighthouse/lighthouse-core/test/fraggle-rock/api-test-pptr.js (lines 115 to 118 in 6b95928), but the resulting report is in JSON, not the desired HTML format.
It would be good to know whether running in an existing tab, instead of opening a new one, will come with an official Lighthouse release. Thanks.
@Khady forcing Lighthouse on a particular tab will be solved by #11313. The issues @praveenralla ran into are unrelated to whether it can be used on a particular tab or not (just about consuming the output). |
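On consuming the output: a run result carries the report data as `lhr`, and Lighthouse's report generator can render that to HTML. A minimal sketch, assuming `reportGenerator` is Lighthouse's ReportGenerator module (its require path differs between versions, so it is passed in rather than imported here); `toHtmlReport` is a hypothetical name.

```javascript
// Hypothetical helper: turn a Lighthouse run result into an HTML report string.
function toHtmlReport(reportGenerator, result) {
  if (!result || !result.lhr) {
    throw new Error('Expected a Lighthouse result with an `lhr` property');
  }
  // generateReport(lhr, outputMode) returns the rendered report as a string.
  return reportGenerator.generateReport(result.lhr, 'html');
}

// Usage (assumption: `result` came from a snapshot or navigation run):
//   const html = toHtmlReport(ReportGenerator, result);
//   require('fs').writeFileSync('report.html', html);
```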
@patrickhulce I was actually reacting because this issue (which I opened and might be different from the ones of @praveenralla) is being closed without being solved. But thanks for the link to 11313! I'll follow the progress there |
@Khady the issue is indeed solved by Fraggle Rock, as @patrickhulce pointed out. The new API is solid enough that we've started using Fraggle Rock in production. Example usage:
import {navigation} from 'lighthouse/lighthouse-core/fraggle-rock/api'
import puppeteer from 'puppeteer'
const browser = await puppeteer.launch()
const [thisIsYourCustomTab] = await browser.pages()
const result = await navigation({
  page: thisIsYourCustomTab,
  url: 'https://google.com',
  config: {
    // add your navigation config / settings here
  }
})
Will Lighthouse support SPAs (single-page apps)?
It is a planned feature as part of #11313 |
I am using the lighthouse node module along with puppeteer to record perf metrics of pages behind auth. I would also like to achieve below
Not sure if the above can be achieved purely with the lighthouse node module or needs assistance from puppeteer. I am new to both Lighthouse and puppeteer, so any pointers will be helpful.
Lighthouse has an audit under "Best practices" that checks for errors in the console
I forgot Lighthouse will create its own page and close it automatically when running from the node module. The easiest way to check for elements is to open a separate page and test for the elements using Puppeteer without Lighthouse:
const page = await browser.newPage();
await page.goto('https://example.com');
const check1 = await page.$('button.class');
Thanks @adamraine!! Will try the above out.
I am using Lighthouse from JavaScript to check a few pages of a website. I would like to be able to tell Lighthouse to use a specific tab that is already open in my Chrome, maybe by giving it a link to a ws endpoint. That's because I create the browser using puppeteer and I want to do some operations before running Lighthouse (like setting the user agent or some request interception configuration) and once the Lighthouse check is done (like getting the HTML of the page or interacting with the page).
Is it possible to tell lighthouse to use a tab already existing?