How can I reuse the same chromium instance? #100

Closed
Richacinas opened this issue Jan 20, 2021 · 7 comments
Comments

@Richacinas
Contributor

I would like to use the same Chromium instance every time, so only a new tab/page would be open for a new request.

Is that possible with Grover?

Thanks a lot

@Richacinas
Contributor Author

I have the feeling that my process is taking so long because this library creates a new Chromium instance over and over. Does that make sense?

@abrom
Contributor

abrom commented Jan 20, 2021

Not really practical given how a Grover instance is initialised (i.e. the content to be rendered is passed through the initialiser), and it somewhat goes against the direction the project has been moving.

In #53 the schmooze gem was removed because of the way it leaked Chromium processes (it relies on the GC to clean up workers). I can imagine it would be possible to use something like schmooze to achieve what you're after; I just don't see it happening within this project. Schmooze boots up a single NodeJS instance and would allow re-entrant calls, although TBH I'm not sure whether it would support persisting the Puppeteer instances across calls. Another option would be to consider something like the Ruby Puppeteer port, but I found it wasn't as performant as calling Chromium through NodeJS.

I do like to see Grover support as many use-cases as possible, but this feels like quite a significant side-step and not really something I see being in Grover.

@drnic

drnic commented Jul 6, 2021

#115 allows you to connect to an existing remote Chromium. Helpful here?

@feliperaul

@abrom Andrew, first of all, huge props for this amazing project.

I would greatly appreciate a way for Grover to use a long-running Puppeteer instance. The gem doesn't have to do all the work by itself, we could even use a systemd service or Ubuntu or something (and just provide the instructions in the readme for the ones that want to use it). I think that it would greatly reduce the time to generate PDFs since it would avoid the Chrome startup time every time someone clicks the "download PDF" link on our application.

@abrom
Contributor

abrom commented Nov 2, 2021

Per my previous comments, I'm not convinced this is the path forward for Grover. However, some relatively simple tweaks would allow you to create a single Grover instance with a reusable NodeJS/Puppeteer/Chromium setup.

See https://github.com/Studiosity/grover/compare/reuse-processor

The Grover instance method definitions need to change of course because you're going to want to pass different URL/options in per conversion!

grover = Grover.new
pdf1 = grover.to_pdf 'https://github.com'
pdf2 = grover.to_pdf 'https://www.google.com'

N.B.

  • It uses a finaliser to clean up the NodeJS process, so be aware that you're at the mercy of the Ruby garbage collector when it comes to process/memory cleanup.
  • I've wrapped the processor call in a Mutex to prevent any threading issues (given the interface can only be accessed by one thread at a time!).
  • I haven't tested this beyond the cursory example above. Use at your own risk!
  • This will not work with the middleware, although you're welcome to tweak that if it's still something you need. You'd likely want to initialise the instance in the middleware initialiser, although I wouldn't be able to guarantee the behaviour with multi-threaded or multi-process web servers. You'd need to make sure that any initialisation happens AFTER the thread/forked process is created!
  • I definitely have no intention to merge this and deploy as a part of the gem!
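
For readers who want a feel for the Mutex point above, here is a minimal sketch of the general pattern: one lazily created, long-lived worker shared by several threads, with every call serialised through a Mutex. This is not the code on the reuse-processor branch. `FakeWorker` and `SharedConverter` are hypothetical names invented for illustration, and the worker only pretends to render a PDF.

```ruby
# Hypothetical stand-in for a long-lived NodeJS/Puppeteer worker.
# It is NOT Grover's processor; it only simulates a non-thread-safe interface.
class FakeWorker
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Deliberately non-atomic read-modify-write, so unsynchronised
  # concurrent callers could lose updates.
  def convert(url)
    current = @calls
    sleep 0.001
    @calls = current + 1
    "PDF for #{url}"
  end
end

# Hypothetical wrapper: boots the worker once and lets one thread in at a time.
class SharedConverter
  attr_reader :boots, :worker

  def initialize(&factory)
    @factory = factory
    @mutex = Mutex.new
    @worker = nil
    @boots = 0
  end

  def to_pdf(url)
    @mutex.synchronize do
      # Lazy start inside the lock, so only the first caller boots the worker.
      if @worker.nil?
        @boots += 1
        @worker = @factory.call
      end
      @worker.convert(url)
    end
  end
end

converter = SharedConverter.new { FakeWorker.new }
threads = Array.new(8) do |i|
  Thread.new { converter.to_pdf("https://example.com/#{i}") }
end
results = threads.map(&:value)
puts "#{results.size} conversions, #{converter.boots} worker boot(s)"
```

The lock keeps the interface safe but also means conversions run one at a time, so a shared instance trades start-up cost for lost parallelism.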

In terms of how much of an improvement it makes, the short answer is: some, but not as much as you'd think (if anything):

> grover = Grover.new
> start_time = Time.now; grover.to_pdf 'https://www.google.com.au', path: '/tmp/foo.pdf'; end_time = Time.now; end_time - start_time
 => 3.06107
> start_time = Time.now; grover.to_pdf 'https://www.google.com.au', path: '/tmp/foo.pdf'; end_time = Time.now; end_time - start_time
 => 2.565773

Compared to the current code:

> start_time = Time.now; Grover.new('https://www.google.com.au').to_pdf('/tmp/foo.pdf'); end_time = Time.now; end_time - start_time
 => 3.108409 
> start_time = Time.now; Grover.new('https://www.google.com.au').to_pdf('/tmp/foo.pdf'); end_time = Time.now; end_time - start_time
 => 2.761781 

... practically identical 😉

@feliperaul

@abrom Thanks so much for the detailed answer; it would have taken me a long time to benchmark this.

@abrom
Contributor

abrom commented Nov 6, 2021

Well, the comparison was pretty simplistic. You'd really need to run it hundreds of times and compare not just the runtime but the memory usage etc. too.
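
A rough sketch of what a repeated-run timing comparison could look like, assuming you only care about wall-clock time. `time_runs` and `summarize` are hypothetical helper names, and the workload is a placeholder `sleep`; in practice the block would wrap the actual conversion call. Memory usage is not measured here and would need separate tooling.

```ruby
# Runs the given block `runs` times and returns each run's
# wall-clock duration in seconds, using a monotonic clock.
def time_runs(runs)
  Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
end

# Reduces raw samples to a few summary figures.
# For an even number of samples, :median is the upper of the two middle values.
def summarize(samples)
  sorted = samples.sort
  {
    runs: sorted.size,
    mean: sorted.sum / sorted.size,
    median: sorted[sorted.size / 2],
    max: sorted.last
  }
end

# Placeholder workload (an assumption); swap in the real conversion call.
workload = -> { sleep 0.001 }

stats = summarize(time_runs(50) { workload.call })
puts stats
```

Running both variants through the same harness and comparing the medians rather than two single runs should give a steadier picture than the figures above.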

@abrom abrom closed this as completed Jan 29, 2022