Timeout errors #156
Comments
Thanks for the detailed repro! Hm, I actually have seen this happen on other sites as well, I think the default …
@ikreymer you're right, it is running a lot smoother without screencasting! I can't quite understand why it is picking up URLs like …
Can you try this branch: https://github.com/webrecorder/browsertrix-crawler/tree/window-context-tweaks? Hopefully this fixes it with the current browser and screencasting enabled.
@ikreymer This branch is working a lot better!
…ements (#157)
* new window: use cdp instead of window.open
* new window tweaks: add reuseCount, use browser.target() instead of opening a new blank page
* rename NewWindowPage -> ReuseWindowConcurrency, move to windowconcur.js; potential fix for #156
* browser repair:
  - when using window-concurrency, attempt to repair / relaunch browser if cdp errors occur
  - mark pages as failed and don't reuse if page error or cdp errors occur
  - screencaster: clear previous targets if screencasting when repairing browser
* bump version to 0.7.0-beta.3
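For illustration, here is a minimal sketch (not the actual PR #157 code) of the approach the commit message describes: opening a new top-level window through the DevTools Protocol's Target.createTarget instead of window.open, then tracking a reuse count for that window. The function name, the launch options, and the reliance on puppeteer's internal _targetId field are assumptions.

```js
// Sketch only -- not the actual browsertrix-crawler / PR #157 code.
const puppeteer = require("puppeteer");

// Open a fresh top-level window over CDP rather than with window.open(),
// and return it together with a reuse counter (the "reuseCount" idea above).
async function createReusableWindow(browser) {
  // Attach a CDP session to the browser-level target (browser.target()).
  const cdp = await browser.target().createCDPSession();

  // Target.createTarget with newWindow: true creates a new window/target.
  const { targetId } = await cdp.send("Target.createTarget", {
    url: "about:blank",
    newWindow: true,
  });

  // Resolve the puppeteer Page for the new target. Matching on the internal
  // _targetId field is an assumption made here for brevity.
  const target = await browser.waitForTarget((t) => t._targetId === targetId);
  const page = await target.page();

  return { page, reuseCount: 0 };
}

// Usage sketch: hand the same window back out until it has been reused too
// often or has errored, then close it and create a replacement.
(async () => {
  const browser = await puppeteer.launch();
  const win = await createReusableWindow(browser);
  await win.page.goto("http://stephenratcliffe.blogspot.com");
  win.reuseCount += 1;
  await browser.close();
})();
```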
Should be fixed as of the 0.7.0-beta.3 release.
This appears to be a site-specific issue when attempting to crawl http://stephenratcliffe.blogspot.com with two workers. After a short period of time, a series of net::ERR_ABORTED errors appear, eventually followed by a series of long-running timeout errors. Once the timeout errors start, the screencast window appears blank.
In /crawls/collections/stephenratcliffe/logs/pywb.log I noticed a series of posts being routed through pywb which seemed to generate the aborted connections and sometimes resulted in a timeout. I was using main at 827c153 with the following command:
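Since the exact command is not shown above, here is a representative invocation following the documented Docker usage; the image name and the --config / --screencastPort options are real, but the specific mounts, port, and config file name are assumptions:

```sh
# Hypothetical reconstruction -- not the reporter's exact command.
# Mounts the local ./crawls directory and runs the crawler with a YAML config,
# with screencasting enabled (the issue refers to the screencast window).
docker run -it -p 9037:9037 -v $PWD/crawls:/crawls/ \
  webrecorder/browsertrix-crawler crawl \
  --config /crawls/crawl-config.yaml \
  --screencastPort 9037
```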
and the following config file placed in the crawls directory:
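The config itself is likewise not shown above; a minimal config consistent with the details in the issue (two workers, the blogspot seed, a collection named stephenratcliffe) might look like the following, where the file name and the scopeType are assumptions:

```yaml
# Hypothetical reconstruction -- not the reporter's exact config.
collection: stephenratcliffe       # matches the log path /crawls/collections/stephenratcliffe
workers: 2                         # the issue states the crawl used two workers
seeds:
  - url: http://stephenratcliffe.blogspot.com
    scopeType: prefix              # assumed; prefix is the documented default scope
```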