toor-de-force / Ghidra-to-LLVM

A binary-to-LLVM IR lifter that leverages Ghidra's IR and analysis

Ghidra-to-LLVM

This tool lifts a compiled binary to LLVM IR.

Special thanks to my advisor Arie Gurfinkel and the CMU Pharos team (https://github.com/cmu-sei/pharos). The tests are taken from their repository.

Required packages for Python 3

  • llvmlite
  • graphviz
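
Before running the tool, it can help to confirm that both packages are importable. The following is a minimal sketch, not part of the repository; the helper name missing_packages is hypothetical. It only checks that the Python packages can be found, and it assumes the import names match the package names listed above.

```python
import importlib.util

# Package names taken from the list above.
REQUIRED = ["llvmlite", "graphviz"]


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Note that the graphviz Python package is a wrapper and needs the Graphviz executables (such as dot) installed on the system to render images.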

Installation Instructions (Linux Only)

1. Install Ghidra

https://ghidra-sre.org/ghidra_9.1.1_PUBLIC_20191218.zip

  • Extract the JDK: tar xvf <JDK distribution .tar.gz>
  • Open ~/.bashrc with an editor of your choice. For example: vi ~/.bashrc
  • At the very end of the file, add the JDK bin directory to the PATH variable: export PATH=<path of extracted JDK dir>/bin:$PATH
  • Save file
  • Restart any open terminal windows for changes to take effect

2. Edit g2llvm.py

The script requires you to provide two absolute paths, one to Ghidra's headless analyzer and one to your Ghidra projects directory:

  • ghidra_headless_loc = "/PATH/TO/ghidra_9.1.1_PUBLIC/support/analyzeHeadless"
  • prj_dir = "/PATH/TO/GhidraProjects/"
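
A wrong path here usually only shows up once Ghidra fails to start, so a quick sanity check can save time. This is a sketch under stated assumptions: check_paths is a hypothetical helper, not something g2llvm.py provides, and the two variable names are the ones shown above.

```python
import os


def check_paths(ghidra_headless_loc, prj_dir):
    """Return a list of problems; an empty list means both paths look usable."""
    problems = []
    # Both settings must be absolute paths.
    for name, path in (("ghidra_headless_loc", ghidra_headless_loc),
                       ("prj_dir", prj_dir)):
        if not os.path.isabs(path):
            problems.append(f"{name} is not an absolute path: {path}")
    # analyzeHeadless is a launcher script, so it should be an executable file.
    if not os.path.isfile(ghidra_headless_loc):
        problems.append(f"ghidra_headless_loc is not a file: {ghidra_headless_loc}")
    elif not os.access(ghidra_headless_loc, os.X_OK):
        problems.append(f"ghidra_headless_loc is not executable: {ghidra_headless_loc}")
    # The project directory must already exist.
    if not os.path.isdir(prj_dir):
        problems.append(f"prj_dir is not a directory: {prj_dir}")
    return problems


if __name__ == "__main__":
    # Placeholder values; substitute the paths from your own installation.
    for problem in check_paths(
        "/PATH/TO/ghidra_9.1.1_PUBLIC/support/analyzeHeadless",
        "/PATH/TO/GhidraProjects/",
    ):
        print(problem)
```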

Usage

To run the tool, run the g2llvm.py script. It takes a single mandatory argument, the target executable.

Optional arguments:

  • '-out' emits intermediate files
  • '-opt X' attempts to optimize the output at level X. Valid levels are 0-3 (currently only 0 works).
  • '-cfg' saves a .PNG of the whole module CFG.
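
The options above can be combined into a single invocation. The sketch below builds such a command line from Python; build_command is a hypothetical helper, and placing the target before the flags is an assumption, since the text only states that the target is mandatory.

```python
import shlex
import sys


def build_command(target, emit_intermediate=False, opt_level=None,
                  save_cfg=False, script="g2llvm.py"):
    """Assemble an argument list for g2llvm.py from the documented flags."""
    if opt_level is not None and opt_level not in range(4):
        raise ValueError("opt_level must be an integer from 0 to 3")
    # Assumption: the target executable comes first, followed by the flags.
    cmd = [sys.executable, script, target]
    if emit_intermediate:
        cmd.append("-out")                  # emit intermediate files
    if opt_level is not None:
        cmd += ["-opt", str(opt_level)]     # only level 0 currently works
    if save_cfg:
        cmd.append("-cfg")                  # save a .PNG of the module CFG
    return cmd


if __name__ == "__main__":
    cmd = build_command("a.out", emit_intermediate=True, opt_level=0, save_cfg=True)
    # Print the command instead of running it; pass cmd to subprocess.run to execute.
    print(shlex.join(cmd))
```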

Extra Scripts

  • HighFunction_Analysis.java: Prints a readable version of the high function representation
  • HighFunction2LLVM.java: Makes an XML file of the high function representation

TODO

  • Implement lifting using Ghidra's HighFunction (will eventually be the default)

License: MIT License

Languages

  • Python 41.3%
  • Java 26.6%
  • C++ 22.0%
  • LLVM 9.9%
  • Makefile 0.2%