This project is a study of generating proofs of concept and exploits for the Metasploit Framework using LLMs.
Assumptions of the project:
- The Metasploit Framework defines a well-defined outcome for each module.
- Each module carries required metadata, which makes labeling easier and more consistent.
- Modules can be broken down into reusable utilities (assumed to behave like well-defined classes).
- Most modules are associated with CVE research, which can support robust prompt generation.
Success criteria: use the command-line chat prompt, given a previously unseen CVE, to generate an install-and-usage guide and a module that can be saved directly into the Metasploit Framework.
For a quick demo, run the following commands (assuming you have the prerequisites installed).
Setup
pip install poetry
git clone https://github.com/roostercoopllc/metAIsploit-assistant
cd metAIsploit-assistant
poetry install
# (Optional) This downloads the snoozy model binary by default
poetry run init
# If you don't have the snoozy model downloaded yet
poetry run demo
# If you already have the snoozy model downloaded
poetry run prompt-demo
To run the script interactively:
export METASPLOIT_ROOT=<your metasploit root>
# Update the .env with your MSF root
poetry run chat
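The chat script needs METASPLOIT_ROOT to know where generated modules should be written. A minimal sketch of how a script could resolve and validate that setting — the helper names and the save-path layout are assumptions, not the project's actual code (the project may load the value from .env instead):

```python
import os
from pathlib import Path

def resolve_msf_root() -> Path:
    """Resolve the Metasploit root from the environment (hypothetical helper).

    Mirrors the METASPLOIT_ROOT variable exported above.
    """
    root = os.environ.get("METASPLOIT_ROOT")
    if not root:
        raise RuntimeError("METASPLOIT_ROOT is not set; export it or add it to .env")
    path = Path(root).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"METASPLOIT_ROOT does not exist: {path}")
    return path

def module_save_path(msf_root: Path, module_name: str) -> Path:
    """Where a generated exploit module could land inside the framework tree
    (illustrative location)."""
    return msf_root / "modules" / "exploits" / f"{module_name}.rb"
```

Validating the path up front avoids silently writing modules somewhere msfconsole will never load them from.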
Note: depending on the hardware you run this on, it may take a while to return a response.
This project uses Poetry to manage dependencies and attempts to keep the project clean (we will see for how long).
You can use this module through the Poetry commands outlined in pyproject.toml. However, it is intended to eventually be available through msfconsole, so you can use the digital assistant without starting a separate terminal and keep the same session alive.
Prerequisites
- Python 3.11
- Metasploit-Framework
- git-lfs
- The pip packages below, managed by Poetry
Development Setup
poetry install
To run the chat interactively
poetry run chat
You can then chat with the model and generate responses. When a response contains code snippets, the script will ask whether you want to save each one.
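The save-each-snippet flow above could be sketched as follows: find every fenced code block in the model's reply, then prompt the user per snippet. This is an illustrative sketch, not the project's actual implementation; the function names and output filenames are assumptions:

```python
import re

FENCE = "`" * 3  # a markdown code fence
FENCE_RE = re.compile(FENCE + r"(\w*)\n(.*?)" + FENCE, re.DOTALL)

def find_snippets(response: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs for each fenced block in a model response."""
    return [(lang or "text", body.rstrip()) for lang, body in FENCE_RE.findall(response)]

def offer_to_save(response: str) -> None:
    """Ask the user about each snippet, one at a time."""
    for i, (lang, body) in enumerate(find_snippets(response), start=1):
        answer = input(f"Save {lang} snippet #{i} ({len(body)} bytes)? [y/N] ")
        if answer.strip().lower() == "y":
            with open(f"snippet_{i}.{lang}", "w") as fh:
                fh.write(body)
```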
Create Initial Datasets
Labels
Attempted Automated Labeling:
There are two scripts that attempt to build the prompt dataset. The prompts are based on a collection of CVE write-ups from the MITRE CVE collection. The scripts associate each Metasploit module with every complete write-up housed in the MITRE data store.
The training prompts consist of the entire write-up plus the additional instruction: write a metasploit module for cve-xxxx-yyyyy.
- Automated labeling takes the CVE identifier and searches for it in the MITRE CVE database. It then fetches the URLs listed as references for the CVE and creates prompts that associate each write-up with the Metasploit module the CVE belongs to. Note: hopefully this will create more variance in the kinds of CVE descriptions that can generate a valid module.
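The lookup-and-prompt flow described above could look roughly like this. It is a sketch under assumptions: the MITRE CVE Services endpoint is real, but the exact record layout (CVE JSON 5.x `containers.cna.references`) and the prompt schema are assumptions to verify against the actual scripts:

```python
import json
import urllib.request

# MITRE CVE Services record endpoint
CVE_API = "https://cveawg.mitre.org/api/cve/{cve_id}"

def fetch_reference_urls(cve_id: str) -> list[str]:
    """Fetch the MITRE record for a CVE and return its reference URLs.

    Assumes the CVE JSON 5.x layout (containers.cna.references); adjust
    if the record format differs.
    """
    with urllib.request.urlopen(CVE_API.format(cve_id=cve_id)) as resp:
        record = json.load(resp)
    refs = record.get("containers", {}).get("cna", {}).get("references", [])
    return [r["url"] for r in refs if "url" in r]

def build_prompts(cve_id: str, writeups: list[str]) -> list[dict]:
    """Pair each write-up with the fixed training instruction."""
    instruction = f"write a metasploit module for {cve_id.lower()}"
    return [{"prompt": f"{text}\n\n{instruction}"} for text in writeups]
```

Each reference URL would then be fetched and its write-up text passed through `build_prompts`, yielding one training example per write-up for the associated module.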
Manual Labeling:
Training
- Transfer Learning:
- Scoring / Performance:
Saving Model
- Saving Models:
- Label Data
- Add quality-of-life improvements to the code
- Write wiki documents