AllencxWang/nightmare-crawler

nightmare-crawler


It is easy to scrape a web page when its content is rendered on the server side, but it gets difficult when you want to do the same with a SPA (single-page application) website. The content of a SPA website is generated dynamically through multiple AJAX requests, and some of those requests are only issued after a script has been evaluated. It's relatively hard to achieve the same goal on such a website with tools like the Node.js request module, which only fetch the raw response and never run the page's scripts. Some kind of browser automation is needed to ease the pain, and a brief search turned up these options:

  • Phantom.js
  • Nightmare.js
  • WebdriverIO

At first I chose Phantom.js as a starting point, but its API is just too cumbersome, so after a short test I moved on to Nightmare.js and found it very easy to use. This repo will show you how to use Nightmare.js to scrape a website that requires authentication and contains dynamic content.
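
To give a rough idea of the flow described above, here is a minimal sketch of a log-in-then-scrape chain. It is not the code from this repo: the function name `scrapeAfterLogin`, the option names, the URL, and every selector are hypothetical placeholders. It only relies on the Nightmare.js calls `goto`, `type`, `click`, `wait`, `evaluate`, and `end`, and it takes the browser instance as an argument so the chain can be exercised without launching Electron.

```javascript
// Sketch only: URL, selectors, and credentials are hypothetical placeholders.
// `browser` is expected to be a Nightmare instance, for example:
//   const Nightmare = require('nightmare');
//   const browser = Nightmare({ show: false });
async function scrapeAfterLogin(browser, { loginUrl, username, password, selectors }) {
  return browser
    .goto(loginUrl)
    // Fill in and submit the login form.
    .type(selectors.username, username)
    .type(selectors.password, password)
    .click(selectors.submit)
    // Wait until an AJAX-rendered element shows up before reading the DOM.
    .wait(selectors.item)
    // This callback runs inside the page, so it only sees the arguments
    // passed after it, not variables from the Node.js side.
    .evaluate((itemSelector) => {
      return Array.from(document.querySelectorAll(itemSelector))
        .map((node) => node.textContent.trim());
    }, selectors.item)
    // Close the browser; the chain then resolves with the evaluate() result.
    .end();
}

// Example call (all values are made up):
//   scrapeAfterLogin(Nightmare({ show: false }), {
//     loginUrl: 'https://example.com/login',
//     username: 'me',
//     password: 'secret',
//     selectors: {
//       username: '#username',
//       password: '#password',
//       submit: 'button[type=submit]',
//       item: '.item-title',
//     },
//   }).then(console.log).catch(console.error);
```

The important part is the `wait` on a selector: it is what lets the crawler deal with content that only appears after the page's own scripts have fired their requests.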

(I haven't really tried WebdriverIO, but since Nightmare.js isn't that fast, maybe I'll give it a shot some time later.)
