lifting-bits / rellic

Rellic produces goto-free C output from LLVM bitcode

Rellic

Rellic is an implementation of the pattern-independent structuring algorithm that produces goto-free C output from LLVM bitcode.

The design philosophy behind the project is to provide a relatively small and easily hackable codebase with great interoperability with other LLVM and Remill projects.

Examples

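Each example pairs an original program with the output obtained by compiling it with -emit-llvm -O0 and decompiling the resulting bitcode.

Original program (example 1):
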
int main() {
  for(int i = 0; i < 30; ++i) {
    if(i % 3 == 0 && i % 5 == 0) {
      printf("fizzbuzz\n");
    } else if(i % 3 == 0) {
      printf("fizz\n");
    } else if(i % 5 == 0) {
      printf("buzz\n");
    } else {
      printf("%d\n", i);
    }
  }
}

Compiled with -emit-llvm -O0 and decompiled (example 1):

int main() {
  unsigned int var0;
  unsigned int i;
  var0 = 0U;
  i = 0U;
  while ((int)i < 30) {
    if ((int)i % 3 != 0U || !((int)i % 5 == 0U || (int)i % 3 != 0U)) {
      if ((int)i % 3 != 0U) {
        if ((int)i % 5 != 0U) {
          printf("%d\n", i);
        } else {
          printf("buzz\n");
        }
      } else {
        printf("fizz\n");
      }
    } else {
      printf("fizzbuzz\n");
    }
    i = i + 1U;
  }
  return var0;
}
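
The nested conditions in the decompiled output look quite different from the original if/else chain, so it can help to check exhaustively that both pick the same branch. The following is a minimal sketch and not part of Rellic: the function names and the enum are made up for illustration, and the casts and U suffixes from the decompiled output are dropped for readability.

```c
#include <assert.h>

/* Hypothetical labels for the four printf branches. */
enum { FIZZBUZZ, FIZZ, BUZZ, NUMBER };

/* Branch selected by the original if/else chain. */
int fizz_original_branch(int i) {
  if (i % 3 == 0 && i % 5 == 0) {
    return FIZZBUZZ;
  } else if (i % 3 == 0) {
    return FIZZ;
  } else if (i % 5 == 0) {
    return BUZZ;
  } else {
    return NUMBER;
  }
}

/* Branch selected by the decompiled conditions, copied structurally. */
int fizz_decompiled_branch(int i) {
  if (i % 3 != 0 || !(i % 5 == 0 || i % 3 != 0)) {
    if (i % 3 != 0) {
      if (i % 5 != 0) {
        return NUMBER;
      } else {
        return BUZZ;
      }
    } else {
      return FIZZ;
    }
  } else {
    return FIZZBUZZ;
  }
}

/* Compare both versions on every loop index; returns how many were checked. */
int fizz_check_equivalence(void) {
  int checked = 0;
  for (int i = 0; i < 30; ++i) {
    assert(fizz_original_branch(i) == fizz_decompiled_branch(i));
    ++checked;
  }
  return checked;
}
```

The outer condition simplifies to !(i % 3 == 0 && i % 5 == 0), which is why the "fizzbuzz" branch ends up in the final else.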

Original program (example 2):

int main() {
  int i = 0;
  start:
  i++;
  switch(i) {
    case 1: printf("%d\n", i); goto start; break;
    case 2: printf("%d\n", i); goto start; break;
    case 3: printf("%d\n", i); break;
  }
}

Compiled with -emit-llvm -O0 and decompiled (example 2):

int main() {
  unsigned int var0;
  unsigned int i;
  var0 = 0U;
  i = 0U;
  do {
    i = i + 1U;
    if (!(i != 3U && i != 2U && i != 1U))
      if (i == 3U) {
        printf("%d\n", i);
        break;
      } else if (i == 2U) {
        printf("%d\n", i);
      } else {
        printf("%d\n", i);
      }
  } while (!(i != 3U && i != 2U && i != 1U));
  return var0;
}
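
In this example the goto-based loop becomes a do/while whose condition repeats the switch cases. A small sketch, assuming we record the printed values in an array instead of calling printf, can confirm that both control-flow shapes produce the same trace. The function names are hypothetical and the unsigned types are replaced by int for brevity.

```c
#include <assert.h>

/* Original control flow with goto; stores each value it would print. */
int switch_original(int *out) {
  int n = 0;
  int i = 0;
start:
  i++;
  switch (i) {
    case 1: out[n++] = i; goto start;
    case 2: out[n++] = i; goto start;
    case 3: out[n++] = i; break;
  }
  return n;
}

/* Goto-free control flow, following the decompiled output. */
int switch_decompiled(int *out) {
  int n = 0;
  int i = 0;
  do {
    i = i + 1;
    if (!(i != 3 && i != 2 && i != 1)) {
      if (i == 3) {
        out[n++] = i;
        break;
      } else if (i == 2) {
        out[n++] = i;
      } else {
        out[n++] = i;
      }
    }
  } while (!(i != 3 && i != 2 && i != 1));
  return n;
}

/* Run both versions and compare their traces; returns the trace length. */
int switch_check_traces(void) {
  int a[8];
  int b[8];
  int na = switch_original(a);
  int nb = switch_decompiled(b);
  assert(na == nb);
  for (int k = 0; k < na; ++k) {
    assert(a[k] == b[k]);
  }
  return na;
}
```

Both versions record 1, 2, 3 and then stop: the original leaves through the break in case 3, the decompiled one through the break inside the do/while.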

Original program (example 3):

int main() {
  int x = atoi("5");
  if(x > 10) {
    while(x < 20) {
      x = x + 1;
      printf("loop1 x: %d\n", x);
    }
  }
  while(x < 20) {
    x = x + 1;
    printf("loop2 x: %d\n", x);
  }
}

Compiled with -emit-llvm -O0 and decompiled (example 3):

int main() {
  unsigned int var0;
  unsigned int x;
  unsigned int call2;
  var0 = 0U;
  call2 = atoi("5");
  x = call2;
  if ((int)x > 10) {
    while ((int)x < 20) {
      x = x + 1U;
      printf("loop1 x: %d\n", x);
    }
  }
  if ((int)x <= 10 || (int)x >= 20) {
    while ((int)x < 20) {
      x = x + 1U;
      printf("loop2 x: %d\n", x);
    }
  }
  if ((int)x >= 20 && ((int)x <= 10 || (int)x >= 20)) {
    return var0;
  }
}
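
The extra guard around the second loop, x <= 10 || x >= 20, always holds at that point: either the first if was skipped (x <= 10) or its loop ran until x >= 20. The sketch below checks this over a range of starting values. It is an illustration only; the Trace struct and function names are hypothetical, and the starting value is a parameter rather than atoi("5").

```c
#include <assert.h>

/* Iteration counts of both loops plus the final value of x. */
typedef struct {
  int loop1;
  int loop2;
  int final_x;
} Trace;

/* Control flow of the original program. */
Trace loops_original(int x) {
  Trace t = {0, 0, 0};
  if (x > 10) {
    while (x < 20) {
      x = x + 1;
      t.loop1++;
    }
  }
  while (x < 20) {
    x = x + 1;
    t.loop2++;
  }
  t.final_x = x;
  return t;
}

/* Control flow of the decompiled program, including the redundant guard. */
Trace loops_decompiled(int x) {
  Trace t = {0, 0, 0};
  if (x > 10) {
    while (x < 20) {
      x = x + 1;
      t.loop1++;
    }
  }
  if (x <= 10 || x >= 20) {
    while (x < 20) {
      x = x + 1;
      t.loop2++;
    }
  }
  t.final_x = x;
  return t;
}

/* Compare both versions for every start value in [lo, hi]. */
int loops_check_range(int lo, int hi) {
  int checked = 0;
  for (int x = lo; x <= hi; ++x) {
    Trace a = loops_original(x);
    Trace b = loops_decompiled(x);
    assert(a.loop1 == b.loop1);
    assert(a.loop2 == b.loop2);
    assert(a.final_x == b.final_x);
    ++checked;
  }
  return checked;
}
```

With the README's input of 5, only the second loop runs, 15 times, and x ends at 20 in both versions.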

In the press

C your data structures with rellic-headergen

Interactive decompilation with rellic-xref

Magnifier: an experiment with interactive decompilation

Getting Help

If you are experiencing undocumented problems with Rellic then ask for help in the #binary-lifting channel of the Empire Hacking Slack.

Supported Platforms

Rellic is supported on Linux platforms and has been tested on Ubuntu 22.04.

Dependencies

Most of Rellic's dependencies can be provided by the cxx-common repository. Trail of Bits hosts downloadable, pre-built versions of cxx-common, which makes it substantially easier to get up and running with Rellic. Nonetheless, the following table lists most of Rellic's dependencies.

Name          Version
Git           Latest
CMake         3.21+
Google Flags  Latest
Google Log    Latest
LLVM          16
Clang         16
Z3            4.7.1+

Pre-made Docker Images

Pre-built Docker images are available on Docker Hub and the GitHub Package Registry.

Getting and Building the Code

On Linux

First, update the apt package lists and install the baseline dependencies.

sudo apt update
sudo apt upgrade

sudo apt install \
     git \
     python3 \
     wget \
     unzip \
     pixz \
     xz-utils \
     cmake \
     curl \
     build-essential \
     lsb-release \
     zlib1g-dev \
     libomp-dev \
     doctest-dev

If your distribution doesn't include a recent release of CMake (3.21 or later), you'll need to install one yourself. For Ubuntu, see https://apt.kitware.com/.

The next step is to clone the Rellic repository.

git clone --recurse-submodules https://github.com/lifting-bits/rellic.git

Finally, build and package Rellic with scripts/build.sh. The script creates another directory, rellic-build, in the current working directory. All remaining dependencies needed by Rellic are downloaded and placed in the parent directory, alongside the repo checkout, in lifting-bits-downloads (see the script's -h option for more details). The script also creates installable deb, rpm, and tgz packages.

cd rellic
./scripts/build.sh --llvm-version 16
# To install the resulting deb package, run:
sudo dpkg -i rellic-build/*.deb

To try out Rellic, you can do the following, given an LLVM bitcode file of your choice.

# Create some sample bitcode, or use your own
clang-16 -emit-llvm -c ./tests/tools/decomp/issue_4.c -o ./tests/tools/decomp/issue_4.bc

./rellic-build/tools/rellic-decomp --input ./tests/tools/decomp/issue_4.bc --output /dev/stdout

On macOS

Make sure to have the latest release of cxx-common for LLVM 16. Then, build with

cmake \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake \
  -DVCPKG_TARGET_TRIPLET=x64-osx-rel \
  -DRELLIC_ENABLE_TESTING=OFF \
  -DCMAKE_C_COMPILER=`which clang` \
  -DCMAKE_CXX_COMPILER=`which clang++` \
  /path/to/rellic

make -j8

Docker image

The Docker image should provide an environment that can set up, build, and run Rellic. The Docker images are parameterized by Ubuntu version, LLVM version, and architecture.

To build the docker image using LLVM 16 for Ubuntu 22.04 you can run the following command:

UBUNTU=22.04; LLVM=16; docker build . \
  -t rellic:llvm${LLVM}-ubuntu${UBUNTU} \
  -f Dockerfile \
  --build-arg UBUNTU_VERSION=${UBUNTU} \
  --build-arg LLVM_VERSION=${LLVM}

The entrypoint of the image is already set to the decompiler. Make sure the bitcode you are decompiling was produced by the same LLVM version as the decompiler, then run:

# Get the bc file
clang-16 -emit-llvm -c ./tests/tools/decomp/issue_4.c -o ./tests/tools/decomp/issue_4.bc

# Decompile
docker run --rm -t -i \
  -v $(pwd):/test -w /test \
  -u $(id -u):$(id -g) \
  rellic:llvm16-ubuntu22.04 --input ./tests/tools/decomp/issue_4.bc --output /dev/stdout

To explain the above command more:

# Mount current directory and change working directory
-v $(pwd):/test -w /test

and

# Set the user to current user to ensure correct permissions
-u $(id -u):$(id -g)

Testing

We use several integration and unit tests to test rellic.

Roundtrip tests will take C code, build it to LLVM IR, and then translate that IR back to C. The test then sees if the resulting C can be built and if the translated code does (roughly) the same thing as the original. To run these, use:

cd rellic-build # or your rellic build directory
CTEST_OUTPUT_ON_FAILURE=1 cmake --build . --verbose --target test
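
The behavioural half of a roundtrip check can be pictured as running two programs and comparing what they print. The sketch below only illustrates that idea and is not Rellic's actual test harness: capture_stdout and same_output are hypothetical helper names, a POSIX system with popen is assumed, and only the first 4095 bytes of output are compared.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Run cmd through the shell and copy up to cap - 1 bytes of its stdout
   into buf. Returns the number of bytes stored, or -1 if popen fails. */
long capture_stdout(const char *cmd, char *buf, size_t cap) {
  assert(cap > 0);
  FILE *pipe = popen(cmd, "r");
  if (pipe == NULL) {
    return -1;
  }
  size_t n = fread(buf, 1, cap - 1, pipe);
  buf[n] = '\0';
  pclose(pipe);
  return (long)n;
}

/* Returns 1 if both commands produced identical (possibly truncated)
   output, 0 otherwise or if either command could not be started. */
int same_output(const char *cmd_a, const char *cmd_b) {
  char out_a[4096];
  char out_b[4096];
  long len_a = capture_stdout(cmd_a, out_a, sizeof out_a);
  long len_b = capture_stdout(cmd_b, out_b, sizeof out_b);
  if (len_a < 0 || len_b < 0) {
    return 0;
  }
  return len_a == len_b && memcmp(out_a, out_b, (size_t)len_a) == 0;
}
```

For instance, same_output("./original", "./roundtrip") would compare two hypothetical binaries, one built from the original source and one built from the decompiled C.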

AnghaBench 1000 is a sample of 1000 files (x 4 architectures, so a total of 4000 tests) from the full million programs that come with AnghaBench. This test only checks whether the bitcode for these programs translates to C, not the prettiness or functionality of the resulting translation. To run this test, first install the required Python dependencies found in scripts/requirements.txt and then run:

scripts/test-angha-1k.sh --rellic-cmd <path_to_rellic_decompiler_exe>

About

License: Apache License 2.0

