teambi0s / phpil

Bytecode based Fuzzer for the PHP language

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

phpil

A Fuzzer for the PHP language. The core concept is based on Fuzzili. Basically, its a port of Fuzzili, re-written in python for PHP :).

Currently only code generation is present. A python module for collecting clang’s sanitizer-coverage is present in coverage.

We did this as a course project for the college, but the course got completed before we completed the entire fuzzer :). So it currently its a random php program generator. Might work more on this if we find time later.

To run simply do

python main.py

That will print out a syntactically and semantically correct PHP code sample.

How it works

It works in a similar manner as Fuzzili. The fuzzing is done on a IR (PHPIL). A lifter lifts the PhpIL code to the PHP code when all the processing for the sample is done. The intermediate language (PHPIL) has opcodes which should on a high level cover most of the PHP functionalities (we did not do it for all the functionalities yet). Here is a high level overview of the working -

  • Code Generators - The code generators use the opcodes to generate blocks of semantically and syntactically correct code. This code is still in PhpIL. Once a block of code is generated, it is passed on to the analyzers for further analysis.

  • Analyzers - Currently supports 3 types of analysis -

    • Scope Analyzer - keeps track of the scope of each of the variables that is currently in use.
    • Context Analyzer - This is used to keep track of the program context. For example, we don’t want to emit a ‘return‘ instruction when we are not even within a function.
    • Type Analyzer - Its used to infer the types of the variables used in the input program.
  • lifter - Finally we need to convert the PHPIL code to PHP code. This job is taken care of by the lifter.

Other notes

  • Coverage - We can use clang sanitizer coverage to track the code coverage dynamically. For this we just need to compile php with the right options. phpcov.c is a python module to track the code coverage in python. This is very similar to Fuzzili. We use a shared memory region to share the coverage data between the fuzzer and the binary. It has not been integrated yet though.

  • program_builder - used to keep track of the current program that is being built/modified. Basically, its an instance of a PhpIL program.

  • settings - Governs the probability of a code generator being selected. The higher the value of a code generator, the more likely it is to appear in the sample. If zero, it will never appear.

TODOs

Of course there are many things left. To name a few -

  • Mutation
  • Coverage
  • Sane selection of integers/strings
  • Execution and crash tracking

Authors

About

Bytecode based Fuzzer for the PHP language


Languages

Language:Python 97.9%Language:C 2.1%