This project reviews a number of machine learning models and architectures on the EMBER malware dataset to compare how each performs on the included features.

HammyLA/MalwareClassification

Malware Detection using the EMBER Dataset

NOTE: This project is unfinished and will continue to expand over the coming months.

This project compares the performance of several machine learning models on the EMBER dataset, focusing on the dataset's vectorized features. To use the project, open the malware.ipynb file and walk through it from the top. The notebook also includes some training results to help compare the quality of each model. Some of the included models are:

  • LightGBM (Baseline EMBER model)

  • CatBoost

  • XGBoost

  • Neural Network

The project also includes model architectures taken from:

  • Puranik, Piyush Aniruddha, "Static Malware Detection using Deep Neural Networks on Portable Executables" (2019). UNLV Theses, Dissertations, Professional Papers, and Capstones. 3744. http://dx.doi.org/10.34917/16076285

  • Lad, Sumit and Adamuthe, Amol, "Improved Deep Learning Model for Static PE Files Malware Detection and Classification" (2022). International Journal of Computer Network and Information Security, 14, pp. 14-26. http://dx.doi.org/10.5815/ijcnis.2022.02.02
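
The comparison described above can be pictured as fitting every model on the same labeled split and scoring each on the same held-out rows. The following is only a minimal sketch of that idea and is not taken from malware.ipynb. All names in it (drop_unlabeled, compare_models, MajorityClass, ThresholdOnFeature) are hypothetical, the two classifiers are toy stand-ins for the real models, and the data is made up. It also assumes that unlabeled EMBER samples carry the label -1 and should be dropped before training.

```python
from typing import Dict, List, Sequence, Tuple

Row = Sequence[float]


def drop_unlabeled(X: List[Row], y: List[int], unlabeled: int = -1) -> Tuple[List[Row], List[int]]:
    """Keep only rows whose label differs from the 'unlabeled' marker (assumed to be -1)."""
    kept = [(row, label) for row, label in zip(X, y) if label != unlabeled]
    return [row for row, _ in kept], [label for _, label in kept]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(int(t == p) for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


class MajorityClass:
    """Toy stand-in model: always predicts the most frequent training label."""

    def fit(self, X: List[Row], y: List[int]) -> "MajorityClass":
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X: List[Row]) -> List[int]:
        return [self.label_ for _ in X]


class ThresholdOnFeature:
    """Toy stand-in model: thresholds feature 0 at the midpoint of the two class means."""

    def fit(self, X: List[Row], y: List[int]) -> "ThresholdOnFeature":
        benign = [row[0] for row, label in zip(X, y) if label == 0]
        malicious = [row[0] for row, label in zip(X, y) if label == 1]
        mean_benign = sum(benign) / len(benign)
        mean_malicious = sum(malicious) / len(malicious)
        self.threshold_ = (mean_benign + mean_malicious) / 2
        return self

    def predict(self, X: List[Row]) -> List[int]:
        return [1 if row[0] > self.threshold_ else 0 for row in X]


def compare_models(models: Dict[str, object], X_train, y_train, X_test, y_test) -> Dict[str, float]:
    """Fit each model on the same training data and score it on the same test data."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy(y_test, model.predict(X_test))
    return scores


# Made-up one-feature data; real EMBER vectors are far wider.
X_train = [[0.1], [0.2], [0.9], [0.8], [0.5], [0.15]]
y_train = [0, 0, 1, 1, -1, 0]  # the -1 row is treated as unlabeled
X_test = [[0.05], [0.95], [0.7], [0.3]]
y_test = [0, 1, 1, 0]

X_train, y_train = drop_unlabeled(X_train, y_train)
results = compare_models(
    {"majority": MajorityClass(), "threshold": ThresholdOnFeature()},
    X_train, y_train, X_test, y_test,
)
print(results)  # {'majority': 0.5, 'threshold': 1.0}
```

Any model that exposes fit and predict in this style could take the place of the toy classifiers, which is what keeps the scores comparable across models.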
