ai-se / Mozilla_Firefox_Vulnerability_Data

Dataset of known vulnerabilities in the Mozilla Firefox project.

Cite as:

@article{yu2018improving,
  title={Improving Vulnerability Inspection Efficiency Using Active Learning},
  author={Yu, Zhe and Theisen, Christopher and Williams, Laurie and Menzies, Tim},
  journal={arXiv preprint arXiv:1803.06545},
  year={2018}
}

Dependent Variable:

Each row in vulnerabilities.csv corresponds to a bug report that human reviewers classified as related to a security vulnerability.

Mapping from the vulnerability types in vulnerabilities.csv to the categories in the paper:

{
  'arbitrary-code': 'Protection Mechanism Failure',
  'injection': 'Protection Mechanism Failure',
  'Code - Security Features - Protection Mechanism Failure': 'Protection Mechanism Failure',
  'cross-site-scripting': 'Protection Mechanism Failure',
  'Code - Resource Management Error - Improper Resource Shutdown or Release': 'Resource Management Errors',
  'data-leakage': 'Resource Management Errors',
  'use-after-free': 'Resource Management Errors',
  'Code - Resource Management Error - Uncontrolled Resource Consumption': 'Resource Management Errors',
  'Code - Resource Management Error': 'Resource Management Errors',
  'spoofing': 'Resource Management Errors',
  'Code - Resource Management Error - Use After Free': 'Resource Management Errors',
  'denial-of-service': 'Resource Management Errors',
  'Code - Data Processing': 'Data Processing Errors',
  'memory-corruption': 'Data Processing Errors',
  'buffer-overflow': 'Data Processing Errors',
  'exploitable-crash': 'Data Processing Errors',
  'Code - Code Quality': 'Code Quality',
  'Configuration': 'Other',
  'Environment': 'Other',
  'Code - Traversal - Link Following': 'Other',
  'Code - Time and State - Race Conditions': 'Other',
  'privilege-escalation': 'Other',
  'Code - Traversal': 'Other',
  '?': 'Other'
}
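As a rough illustration of how this mapping might be applied, the sketch below translates raw type labels into paper categories and tallies them. The dictionary is abbreviated to a few entries from the mapping above; the function names and the choice to return None for unmapped labels are assumptions made for this example, not part of the dataset or the paper.

```python
from collections import Counter

# Abbreviated copy of the mapping above; paste in the full dictionary
# for real use. Only these entries are needed for the example.
TYPE_TO_CATEGORY = {
    "arbitrary-code": "Protection Mechanism Failure",
    "use-after-free": "Resource Management Errors",
    "spoofing": "Resource Management Errors",
    "buffer-overflow": "Data Processing Errors",
    "Code - Code Quality": "Code Quality",
    "?": "Other",
}


def to_paper_category(vuln_type):
    """Return the paper category for a type label, or None if unmapped.

    Returning None (rather than 'Other') is a choice made for this sketch,
    so that unexpected labels stay visible instead of being silently merged.
    """
    return TYPE_TO_CATEGORY.get(vuln_type.strip())


def count_categories(vuln_types):
    """Tally paper categories over an iterable of raw type labels."""
    return Counter(to_paper_category(t) for t in vuln_types)


# Example labels; in practice these would be read from vulnerabilities.csv.
sample = ["use-after-free", "buffer-overflow", "?", "spoofing"]
counts = count_categories(sample)
```

Which column of vulnerabilities.csv holds the type label is not stated here, so reading the file is left out of the sketch.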

Independent Variables:

Source code files

The snapshot was taken from the main branch of the Mercurial repository on November 21st, 2017.

Software metrics

software_metrics.csv

Crash counts

crashes.csv

Combined Data:

Each row in the combined data has the crash counts, software metrics, and source code of a file as independent variables, and the categories of vulnerabilities the file contains as the dependent variable. This data alone is sufficient to reproduce the results of the paper.
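To make the row layout concrete, here is a minimal sketch of how per-file crash counts, metrics, and vulnerability categories could be joined into one record. It is not the authors' build script: the column names ("file", "crashes", "loc"), the toy CSV contents, and the helper names are all hypothetical, and the source-code field is omitted for brevity.

```python
import csv
import io

# Toy stand-ins for crashes.csv and software_metrics.csv. The headers
# ("file", "crashes", "loc") are hypothetical, not the dataset's real ones.
CRASHES_CSV = "file,crashes\na.cpp,3\nb.cpp,0\n"
METRICS_CSV = "file,loc\na.cpp,120\nb.cpp,45\n"

# Hypothetical per-file vulnerability categories (the dependent variable).
CATEGORIES = {"a.cpp": {"Resource Management Errors"}}


def index_by_file(csv_text):
    """Parse CSV text and index its rows by the (assumed) 'file' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["file"]: row for row in reader}


def combine(crashes_text, metrics_text, categories):
    """Join crash counts and metrics per file and attach category labels."""
    crashes = index_by_file(crashes_text)
    metrics = index_by_file(metrics_text)
    combined = []
    # Keep only files present in both inputs, in a stable order.
    for path in sorted(crashes.keys() & metrics.keys()):
        row = {"file": path}
        row.update({k: v for k, v in crashes[path].items() if k != "file"})
        row.update({k: v for k, v in metrics[path].items() if k != "file"})
        # A file with no known vulnerability gets an empty label list.
        row["categories"] = sorted(categories.get(path, set()))
        combined.append(row)
    return combined


rows = combine(CRASHES_CSV, METRICS_CSV, CATEGORIES)
```

Values stay as strings because csv.DictReader does not convert types; a real pipeline would cast the numeric columns before modeling.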
