This project detects obfuscated malware using machine learning. It trains several classifiers and compares their performance using F1 score, precision, recall, accuracy, and confusion matrices.
The dataset used for this project is Obfuscated-MalMem2022.
For more detailed information, refer to the research paper: AI-Based Malware Detection.
The project depends on the following Python libraries: pandas, numpy, seaborn, matplotlib, and scikit-learn.
The workflow covers the following stages:
- Data Preprocessing: Cleans and prepares the dataset for modeling.
- Exploratory Data Analysis (EDA): Analyzes malware distribution.
- Model Training: Implements and evaluates various classifiers:
  - Random Forest
  - Naive Bayes
  - Decision Tree
  - SVM
  - Logistic Regression
  - KNN
- Ensemble Models: Combines multiple classifiers using stacking techniques.
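
The training and stacking stages listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses synthetic data from `make_classification` as a stand-in for the real dataset, and the hyperparameters, the short estimator names, the scaling pipelines, and the logistic-regression meta-learner are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real feature table (assumption: binary labels).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# The six base classifiers named in the list above. Scale-sensitive models
# are wrapped in a pipeline so scaling is fit on training data only.
base_models = {
    "rf": RandomForestClassifier(n_estimators=50, random_state=42),
    "nb": GaussianNB(),
    "dt": DecisionTreeClassifier(random_state=42),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "lr": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit each base model on its own and record its test accuracy.
test_accuracy = {}
for name, model in base_models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

# Stacking: cross-validated predictions of the base models become the
# inputs of a meta-learner (here, logistic regression as an assumption).
stack = StackingClassifier(
    estimators=list(base_models.items()),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
test_accuracy["stacking"] = stack.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name:>8}: {acc:.3f}")
```

`StackingClassifier` clones the base estimators and refits them internally, so passing the already fitted models is harmless. To run this on the real data, replace the `make_classification` call with the loaded feature matrix and labels.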
Models are evaluated using:
- F1 Score
- Precision
- Recall
- Accuracy
- Confusion Matrix
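
As an illustration of the metrics listed above, the following sketch computes all five with scikit-learn on a small hand-made example. The labels are invented for demonstration (assumption: 1 marks malware, 0 marks benign), and the helper name `evaluate` is hypothetical.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)


def evaluate(y_true, y_pred):
    """Return the five evaluation metrics for binary predictions."""
    return {
        "f1": f1_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": confusion_matrix(y_true, y_pred),
    }


# Toy labels: 4 benign samples followed by 6 malware samples.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

report = evaluate(y_true, y_pred)
for name, value in report.items():
    print(name, value, sep=": ")
```

In this toy case one benign sample is flagged as malware and two malware samples are missed, so precision (0.8) is higher than recall (about 0.67). For malware detection, missed samples are usually the costlier error, which is why recall and F1 are reported alongside accuracy.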