secureweb / symphp

Holistic Concolic Execution for Dynamic Web Applications via Symbolic Interpreter Analysis (IEEE S&P 2024)

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

SymPHP

SymPHP is a concolic execution engine for PHP-based web applications, leveraging the idea of Symbolic Interpreter Analysis (SIA). It symbolically analyzes the language interpreter to (indirectly) analyze the web application code. To perceive web application execution state, we modified PHP interpreter to expose WebPC. We also enhanced S2E to symbolically analyze the language interpreter with WebPC-based state scheduling. SymPHP has been tested on Ubuntu 22.04. More technical details can be found in our SP24 paper.

@inproceedings{li2024symphp,
    title       = {Holistic Concolic Execution for Dynamic Web Applications via Symbolic Interpreter Analysis},
    author      = {Li, Penghui and Meng, Wei and Zhang, Mingxue and Wang, Chenlin and Luo, Changhua},
    booktitle   = {Proceedings of the 45th IEEE Symposium on Security and Privacy (S&P)},
    month       = may,
    year        = 2024
}

Installation

Build S2E

SymPHP generally follows the standard instructions for building S2E. Below are some key steps.

# install the dependencies in https://github.com/secureweb/s2e/blob/master/Dockerfile

# use s2e-env
cd $SYMPHP/s2e-env
python3 -m venv venv
. venv/bin/activate
pip install --upgrade pip
pip install .

# build S2E
cd $SYMPHP/s2e-env/s2e
source s2e_activate 
s2e build

# build an ubuntu-22.04-x86_64 VM image for running PHP interpreter in
s2e image_build ubuntu-22.04-x86_64

Build PHP Interpreter

cd $SYMPHP/php-src
./buildconf --force 

# this builds the PHP binaries sapi/cli/php and sapi/cgi/php-cgi
# one can enable other optional configurations, e.g., --with-mysql
# https://www.php.net/manual/en/configure.about.php
EXTRA_INCLUDES="-I ../s2e-env/s2e/source/s2e/guest/common/include" ./configure
make

Build Bug Checker

S2E can capture several types of errors during the analysis. Following the idea of Fault Escalation (Trickel et al. 2023), SymPHP issues a segfault for syntax parsing errors in sinks. It is worth noting that directly signaling SIGSEGV would not work due to the ways S2E captures errors. Therefore, we insert the following code snippet to intentionally trigger segfaults on syntax parsing errors. We modified the dash for command injection vulnerabilities.

// https://stackoverflow.com/a/2045478/10498659
*(int*)0 = 0;

Build dash:

cd $SYMPHP/dash
sh autogen.sh
./configure
make

Use

We illustrate the use cases of SymPHP using an example. The following PHP code snippet would trigger a segfault on the modified dash at the else branch.

# $SYMPHP/s2e-env/s2e/projects/test_files/test.php
<?php
if(isset($_GET['var1'])) {
    $var1 = $_GET['var1'];
	if($var1 == '2022'){
		echo "branch one \n";
	}
	elseif($var1 . "o" == "hello") {
		echo "branch two \n";
		
		// trigger segfault according to Fault Escalation
		system('ls ('); 
	}
}

Start a S2E Project

Create a new project for the binary php-cgi.

cd $SYMPHP/s2e-env 
. venv/bin/activate
cd $SYMPHP/s2e-env/s2e
source s2e_activate

s2e new_project -i ubuntu-22.04-x86_64 $SYMPHP/php-src/sapi/cgi/php-cgi
cd projects/php-cgi

Config S2E Project

In s2e-config.lua, InterpreterMonitor plugin should be enabled; WEBPC should be enabled for CUPASearcher. A reference configuration file (with hardcoded paths) can be found in test_files/s2e-config.lua. TestCaseGenerator plugin should be enabled to generate test cases on segfaults.

add_plugin("InterpreterMonitor")

add_plugin("CUPASearcher")
pluginsConfig.CUPASearcher = {
    -- The order of classes is important, please refer to the plugin
    -- source code and documentation for details on how CUPA works.
    classes = {
        -- This ensures that states run for a certain amount of time.
        -- Otherwise too frequent state switching may decrease performance.
        "batch",
	    
        -- include WEBPC as CUPA Searcher
        "WEBPC", 

        "pc",
    },
}

add_plugin("TestCaseGenerator")
pluginsConfig.TestCaseGenerator = {
    generateOnStateKill = true,
    generateOnSegfault = true
}

Run

Copy other required files, including the modified dash.

# copy necessary binaries
cp $SYMPHP/dash/src/dash dash
cp $SYMPHP/s2e-env/s2e/projects/test_files/invokephp.sh .
cp $SYMPHP/s2e-env/s2e/projects/test_files/test.php .
cp $SYMPHP/s2e-env/s2e/projects/test_files/bootstrap.sh .

Start the analysis.

GUI=1 ./launch-s2e.sh

More details can be found in test_files/bootstrap.sh

${S2ECMD} get "invokephp.sh"
${S2ECMD} get "test.php"
${S2ECMD} get "dash"
sudo chmod a+x dash php-cgi

#sudo systemctl stop mysql
sudo cp dash /usr/bin/dash
sh invokephp.sh

Check Results

When a segmentation fault is triggered, the TestCaseGenerator plugin of S2E would generate a test case automatically. For example, in s2e-last/debug.txt.

15 [State 2] TestCaseGenerator: generating test case at address 0xffffffff8104bd6b
15 [State 2] LinuxMonitor: Terminating state: received segfault
15 [State 2] Terminating state: Segfault
15 [State 2] TestCaseGenerator: generating test case at address 0xffffffff8104bd6b
15 [State 2] TestCaseGenerator:       v0__SYM_var1_0 = {0x68, 0x65, 0x6c, 0x6c}; (int32_t) 1819043176, (string) "hell"

How to Enhance Interpreter?

We explain how we modified the PHP interpreter here. We hope this could help interested readers to extend the idea of SIA to other languages/interpreters.

Report a Bug?

Please open an issue if you have questions or encounter bugs.

License

SymPHP is under Apache-2.0 license. SymPHP uses various libraries/tools that may have their own licenses. Please refer to the LICENSE file in each directory or to the header of each source file.

About

Holistic Concolic Execution for Dynamic Web Applications via Symbolic Interpreter Analysis (IEEE S&P 2024)

License:Apache License 2.0


Languages

Language:C 68.1%Language:PHP 13.5%Language:C++ 5.8%Language:POV-Ray SDL 4.8%Language:Assembly 1.7%Language:Python 1.6%Language:LLVM 1.5%Language:Shell 0.7%Language:C# 0.5%Language:M4 0.2%Language:Rich Text Format 0.2%Language:Makefile 0.2%Language:Java 0.2%Language:Haxe 0.1%Language:OCaml 0.1%Language:JavaScript 0.1%Language:Perl 0.1%Language:CMake 0.1%Language:Roff 0.1%Language:Visual Basic 6.0 0.1%Language:Lex 0.1%Language:Lua 0.1%Language:Yacc 0.0%Language:PowerShell 0.0%Language:Objective-C 0.0%Language:Batchfile 0.0%Language:UnrealScript 0.0%Language:GAP 0.0%Language:SWIG 0.0%Language:GDB 0.0%Language:VBA 0.0%Language:Smarty 0.0%Language:SmPL 0.0%Language:XSLT 0.0%Language:NSIS 0.0%Language:Gherkin 0.0%Language:Dockerfile 0.0%Language:Awk 0.0%Language:DTrace 0.0%Language:Max 0.0%Language:R 0.0%Language:Cython 0.0%Language:Tcl 0.0%Language:Ruby 0.0%Language:GLSL 0.0%Language:HTML 0.0%Language:Vim Script 0.0%Language:Smalltalk 0.0%Language:Emacs Lisp 0.0%