Extracts features from web pages to determine whether the domain is parked

yurafff/Domain-Parking-Sensors

DISCLAIMER: This code was forked from someone who then deleted the repo. It is not my code.

Domain Parking Sensors

Introduction

These scripts can be used to extract features from web pages to build a classifier that can detect parked domains. The code is based on the research paper "Parking Sensors: Analyzing and Detecting Parked Domains" [PDF] by Thomas Vissers, Nick Nikiforakis and Wouter Joosen. If you use, extend or build upon this project, we kindly ask you to cite the original NDSS paper. The relevant BibTeX is provided below.

@inproceedings{vissers2015parking,
title={Parking Sensors: Analyzing and Detecting Parked Domains},
author={Vissers, Thomas and Joosen, Wouter and Nikiforakis, Nick},
booktitle={Proceedings of the ISOC Network and Distributed System Security Symposium (NDSS’15)},
year={2015}
}

Usage

  1. Retrieve the necessary data from a sample of domains (HAR, HTML, redirections, frames, ...)

$ casperjs --folder=[output folder] --domain=[somedomain.com] retrieve_page_data.js

  2. Extract 20+ features from this data (e.g. link location lengths, amount of text, third-party request ratio, ...)

$ python feature_extractor.py [folder] [class label]
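
To give a feel for what one of these features looks like, here is a minimal Python 3 sketch of a third-party request ratio computed from a HAR file. It is not taken from feature_extractor.py: the function name `third_party_request_ratio` is hypothetical, and the definition of "third-party" (any request whose host is neither the domain itself nor one of its subdomains) is an assumption made for illustration. The only structure relied on is the standard HAR layout (`log.entries[].request.url`).

```python
from urllib.parse import urlparse


def third_party_request_ratio(har, domain):
    """Return the fraction of HAR requests sent to hosts outside `domain`.

    Assumption: a request is first-party if its host equals `domain`
    or is a subdomain of it; everything else counts as third-party.
    """
    entries = har.get("log", {}).get("entries", [])
    # urlparse(...).hostname is already lowercased; it is None for odd URLs.
    hosts = [urlparse(e["request"]["url"]).hostname or "" for e in entries]
    if not hosts:
        return 0.0  # no requests recorded, so no third-party traffic

    domain = domain.lower()
    third_party = [
        h for h in hosts
        if h != domain and not h.endswith("." + domain)
    ]
    return len(third_party) / len(hosts)
```

A HAR file saved in the output folder could be passed in with `json.load(open(path))`; the exact file names written by retrieve_page_data.js are not documented here, so that path is left to the reader.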

Example scenario
$ casperjs --folder=benign_samples --domain=github.com retrieve_page_data.js
$ casperjs --folder=benign_samples --domain=stackoverflow.com retrieve_page_data.js
...
$ casperjs --folder=parked_samples --domain=giyhub.com retrieve_page_data.js 
$ casperjs --folder=parked_samples --domain=stackovreflow.com retrieve_page_data.js 
...
$ python feature_extractor.py benign_samples benign
$ python feature_extractor.py parked_samples parked
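
When the sample contains many domains, the retrieval commands above can be generated in a loop. The following Python 3 sketch is one possible way to do that; the helpers `build_command` and `collect` are hypothetical and not part of this repository. They only assemble the same `casperjs` command line shown in the scenario, with the `--ssl-protocol=any` parameter from the Troubleshooting section as an option. Placing that parameter before the script name, next to the other options, is an assumption.

```python
import subprocess


def build_command(folder, domain, ssl_any=False):
    """Build the casperjs argument list for one domain (hypothetical helper)."""
    cmd = ["casperjs", "--folder=" + folder, "--domain=" + domain]
    if ssl_any:
        # Optional parameter described in the Troubleshooting section.
        cmd.append("--ssl-protocol=any")
    cmd.append("retrieve_page_data.js")
    return cmd


def collect(folder, domains, ssl_any=False, run=subprocess.call):
    """Run the retrieval script once per domain and return the exit codes.

    `run` defaults to subprocess.call; it can be replaced to dry-run the loop.
    """
    return [run(build_command(folder, d, ssl_any)) for d in domains]
```

For example, `collect("benign_samples", ["github.com", "stackoverflow.com"])` would issue the first two commands of the scenario, provided `casperjs` is on the PATH and the call is made from the directory containing retrieve_page_data.js.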

Requirements

Troubleshooting

Some versions of PhantomJS use SSLv3 by default. Since the POODLE vulnerability was disclosed, many servers have disabled SSLv3, so HTTPS pages may fail to load. To resolve this, add the following parameter when executing CasperJS:

--ssl-protocol=any 

More information: http://stackoverflow.com/questions/26415188/casperjs-phantomjs-doesnt-load-https-page
