- Clone the repository
- For example: mkdir /home/mansour/code
- cd /home/mansour/code
git clone --recurse-submodules https://github.com/RiS3-Lab/FICS.git
- cd FICS
sh install.sh
- Create a directory to serve as the root of your data (source code, bitcode, graphs, etc.)
- For example: mkdir /home/mansour/data
- cd /home/mansour/data
- Create a directory named 'projects' inside it: mkdir projects
- cd /home/mansour/data/projects
- Modify settings.py and update DATA_DIR to the root of your data
- For example: DATA_DIR = '/home/mansour/data'
- In the "projects" directory, clone the source code of the codebase you want to target:
- For example: git clone https://gitlab.com/libtiff/libtiff.git libtiff-19f6b70
- cd libtiff-19f6b70
- git checkout 19f6b70 .
- Compile the project with clang-3.8 and generate a compilation database (FICS supports only clang 3.8 and LLVM 3.8)
- For example: cmake -D CMAKE_C_COMPILER="/usr/bin/clang-3.8" -D CMAKE_CXX_COMPILER="/usr/bin/clang++-3.8" .
- Generate the compilation database: bear make
- Run FICS on the target codebase:
- For example:
sh scripts/get_inconsistencies_real_programs_NN_G2v.sh libtiff-19f6b70 p ns
- To run FICS on larger projects like QEMU, change 'ns' to 's' so that FICS splits the codebase into submodules
- The inconsistencies are saved in MongoDB
- To query the saved inconsistencies, run:
python __init__.py -a=QI -p=libtiff-19f6b70 -it=check -f
- The "-it" argument sets the inconsistency type and can be: check | call | type | store | order | all
- To disable filtering, remove -f
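The steps above can be sanity-checked and scripted with a small helper. The following is a minimal sketch, not part of FICS: the function names `missing_paths` and `build_query_command` are hypothetical, and the check assumes that `bear make` was run in the project root, where bear writes `compile_commands.json` by default. The query flags are copied from the command shown above.

```python
import os

# Inconsistency types listed for the "-it" argument above.
INCONSISTENCY_TYPES = ("check", "call", "type", "store", "order", "all")


def missing_paths(data_dir, project):
    """Return the expected paths that do not exist yet (hypothetical helper).

    Assumes the layout described above: DATA_DIR/projects/<project>, with the
    compilation database written by bear into the project root.
    """
    project_dir = os.path.join(data_dir, "projects", project)
    expected = [
        project_dir,
        os.path.join(project_dir, "compile_commands.json"),
    ]
    return [path for path in expected if not os.path.exists(path)]


def build_query_command(project, inconsistency_type="check", filtering=True):
    """Assemble the query command shown above as an argument list."""
    if inconsistency_type not in INCONSISTENCY_TYPES:
        raise ValueError("unknown inconsistency type: %s" % inconsistency_type)
    command = [
        "python",
        "__init__.py",
        "-a=QI",
        "-p=" + project,
        "-it=" + inconsistency_type,
    ]
    if filtering:
        command.append("-f")  # drop this flag to disable filtering
    return command


if __name__ == "__main__":
    # Paths and project name reuse the examples from the steps above.
    for path in missing_paths("/home/mansour/data", "libtiff-19f6b70"):
        print("missing: " + path)
    print(" ".join(build_query_command("libtiff-19f6b70")))
```

The argument list returned by `build_query_command` can be passed to `subprocess.run` from the FICS directory; validating the type up front avoids launching a query with a value outside the list above.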
Bug | Link |
---|---|
Codebase | OpenSSL |
Missing check | Report/Patch |
Missing check | Patch |
Wrong use of clear_free | Report/Patch |
Null dereference | Report/Patch |
Null dereference | Report/Patch |
Inconsistent Check | Report/Patch |
Memory Leak | Report/Patch |
Missing clear_free | Report/Patch |
Codebase | QEMU |
2 Missing checks | Report/Patch |
Undefined Behaviour | Report/Patch |
Uninitialized variable | Report/Patch |
Codebase | LibTIFF |
Missing checks | Patch |
Mislocated check - Bad casting | Report/Patch |
Missing TIFFClose | Report/Patch |
Codebase | wolfSSL |
Missing check | Report/Patch |
Missing check | Report/Patch |
Memory exhaustion | Report/Patch |
Codebase | OpenSSH |
Missing bzero | Patch |
Codebase | libredwg |
Bad casting (Overflow) | Report/Patch |
Null dereference | Report/Patch |
Null dereference | Report/Patch |
Codebase | TCPdump |
Missing initialization | Report |
If you found FICS useful for your research, please cite the following paper:
@inproceedings{fics,
abstract = {
Probabilistic classification has shown success in detecting known types of software bugs. However, the works following this approach tend to require a large amount of specimens to train their models. We present a new machine learning-based bug detection technique that does not require any external code or samples for training. Instead, our technique learns from the very codebase on which the bug detection is performed, and therefore, obviates the need for the cumbersome task of gathering and cleansing training samples (e.g., buggy code of certain kinds). The key idea behind our technique is a novel two-step clustering process applied on a given codebase. This clustering process identifies code snippets in a project that are functionally-similar yet appear in inconsistent forms. Such inconsistencies are found to cause a wide range of bugs, anything from missing checks to unsafe type conversions. Unlike previous works, our technique is generic and not specific to one type of inconsistency or bug. We prototyped our technique and evaluated it using 5 popular open source software, including QEMU and OpenSSL. With a minimal amount of manual analysis on the inconsistencies detected by our tool, we discovered 22 new unique bugs, despite the fact that many of these programs are constantly undergoing bug scans and new bugs in them are believed to be rare.
},
author = {Ahmadi, Mansour and Mirzazade farkhani, Reza and Williams, Ryan and Lu, Long},
booktitle = {Proceedings of the 30th USENIX Security Symposium},
month = {August},
series = {USENIX Security'21},
title = {Finding Bugs Using Your Own Code: Detecting Functionally-similar yet Inconsistent Code},
year = {2021}
}