MalwareClassifier

Requirements

  • this project uses Jupyter notebooks; for a tutorial on their usage, see jupyter-notebook
  • requirements.txt
    • lists the modules needed to run the code that are not part of the standard Python distribution (Jupyter Notebook itself is not listed and must be installed separately)
    • to install them, run: pip install --requirement requirements.txt

Project structure

  • folder Scripts contains:
    • TestDataExtraction and TrainDataExtraction subfolders.
      • Scripts within these folders extract the training and test data
    • TrainFeaturesScaling
      • Scales all training samples into the range [0, 255]
    • SingleFeatureProcessing
      • Contains an overview of each feature, together with a classifier based on that feature alone
    • WholeFEatureSetClassification
      • Contains a LeNet-5 classifier and a random forest classifier for the whole feature set
    • utilities
      • contains helper classes and functions
  • folder Data.zip:
    • contains only the final version of training and test features in feature_merged_scaled and Data/test_data/test_features_merged_scaled
    • the whole dataset is not provided because it is several GB in size; it is available as the Ember dataset
    • after downloading the ember_dataset_2018_2.tar.bz2 version, both training and test data can be extracted using the corresponding extract_to_csv.ipynb Jupyter notebook
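
To illustrate the kind of scaling that TrainFeaturesScaling describes (mapping training samples into the range [0, 255]), here is a minimal sketch. It is not the code from the repository: the function name `scale_to_byte_range` is hypothetical, and it assumes column-wise (per-feature) min-max scaling, which the README does not confirm. It uses only the Python standard library so it runs without the packages in requirements.txt.

```python
def scale_to_byte_range(rows, low=0.0, high=255.0):
    """Min-max scale each column of `rows` into [low, high].

    `rows` is a list of equal-length lists of numbers (one row per sample).
    Assumption: scaling is done per feature (column), not globally.
    """
    if not rows:
        return []

    # Transpose to get per-column minima and maxima.
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]

    scaled = []
    for row in rows:
        scaled_row = []
        for value, col_min, col_max in zip(row, mins, maxs):
            if col_max == col_min:
                # Constant column: avoid division by zero, map to lower bound.
                scaled_row.append(low)
            else:
                ratio = (value - col_min) / (col_max - col_min)
                scaled_row.append(low + ratio * (high - low))
        scaled.append(scaled_row)
    return scaled


# Three toy samples with three features each (made-up values).
samples = [
    [1.0, 10.0, 5.0],
    [2.0, 30.0, 5.0],
    [3.0, 20.0, 5.0],
]
scaled = scale_to_byte_range(samples)
print(scaled)
```

If test data is scaled as well, the minima and maxima computed on the training set would normally be reused so that both sets share the same mapping; whether the project does this is not stated here.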