Selenate is a cloud-based web-scraping framework built on top of Selenium WebDriver. It aims to simplify complex scraping jobs and extend basic Selenium functionality.
Selenate is based on a client-server architecture. Servers host scraping sessions and run the browsers, while clients connect to the sessions and issue commands.
In this example, we have three clients and two servers: Client 1 has two active sessions, one on Server 1 and the other on Server 2, while Client 2 and Developer are both connected to session 9f2749af.
List of Selenate's main features:
Each scraping process has a session associated with it. Sessions can be saved and resumed, and they can be shared between clients.
Actions sometimes fail for no apparent reason, so Selenate supports automatic action retrying: an action is retried until it succeeds or until a specified maximum number of retries is reached.
Client code doesn't need to worry about switching between frames and windows; Selenate can manage them automatically.
For sessions that require constant activity to remain active (for example, sites with automatic log-out), Selenate can periodically execute a specified set of commands to prevent the session from expiring.
Selenate supports browser pre-loading for faster session initialization.
Selenate can record a video of the entire scraping session for debugging purposes.
Selenate supports native mouse and keyboard actions, as well as Sikuli integration.
Selenate supports working with Java applets embedded in web pages.
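
To make the automatic-retry feature above more concrete, here is a minimal sketch of the general pattern in Python. It is not Selenate's actual API: the names retry_action, RetriesExhausted, max_retries, delay_seconds, and flaky_click are all hypothetical and chosen only for illustration. The sketch assumes that "a maximum of N retries" means one initial attempt followed by up to N further attempts.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RetriesExhausted(Exception):
    """Raised when an action still fails after the allowed number of retries.

    Hypothetical name; not part of Selenate or Selenium.
    """


def retry_action(
    action: Callable[[], T],
    max_retries: int = 3,
    delay_seconds: float = 0.0,
) -> T:
    """Run `action`; if it raises, retry it up to `max_retries` more times."""
    last_error: Optional[Exception] = None
    # One initial attempt plus at most `max_retries` retries.
    for attempt in range(max_retries + 1):
        try:
            return action()
        except Exception as error:
            last_error = error
            # Optionally pause before the next attempt, but not after the last one.
            if attempt < max_retries and delay_seconds > 0:
                time.sleep(delay_seconds)
    raise RetriesExhausted(
        f"action failed after {max_retries} retries"
    ) from last_error


# Stand-in for a browser action that fails twice before succeeding.
attempts = {"count": 0}


def flaky_click() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("element not clickable yet")
    return "clicked"


result = retry_action(flaky_click, max_retries=5)
```

In this sketch the caller only sees the final outcome: either the action's return value or a single RetriesExhausted error that carries the last underlying failure as its cause. A server-side implementation could apply the same wrapper around every command it receives, so client code would not need its own retry loops.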