Wrong variable name observed in `.ktest` file
YikeZhou opened this issue
Bug description
Two symbolic variables, `array_0` and `array_1`, were defined. KLEE completed and exited successfully. However, the variable names in the `.ktest` files seemed to be broken.
- `.kquery`: both `array_0` and `array_1` appeared (GOOD)
- `.ktest`: 2 objects shared the same name `array_1` (BAD)
Example code
#include <alloca.h>
#include <stdio.h>
#include <string.h>
#include <klee/klee.h>
int main() {
  int array[2];
  for (int i = 0; i < 2; i++) {
    char *s = (char *)alloca(50);
    sprintf(s, "%s", "array");
    sprintf(s + strlen(s), "_%d", i);
    int temp;
    klee_make_symbolic(&temp, sizeof(temp), s);
    array[i] = temp;
  }
  int identical = 0;
  if (array[0] == array[1])
    identical++;
  return 0;
}
Compiled with clang:
clang-11 -emit-llvm -c -g -O0 -Xclang -disable-O0-optnone main.c
KLEE cmdline:
klee --libc=uclibc --posix-runtime --write-kqueries main.bc
Output of KLEE:
KLEE: NOTE: Using POSIX model: /usr/local/lib/klee/runtime/libkleeRuntimePOSIX64_Debug+Asserts.bca
KLEE: NOTE: Using klee-uclibc : /usr/local/lib/klee/runtime/klee-uclibc.bca
KLEE: output directory is "/home/zyk/Projects/C/klee_examples/for_loop/klee-out-0"
KLEE: Using Z3 solver backend
KLEE: WARNING: executable has module level assembly (ignoring)
KLEE: WARNING ONCE: calling external: syscall(16, 0, 21505, 94780079623312) at klee/runtime/POSIX/fd.c:1012 10
KLEE: WARNING ONCE: Alignment of memory from call "malloc" is not modelled. Using alignment of 8.
KLEE: WARNING ONCE: calling __klee_posix_wrapped_main with extra arguments.
KLEE: done: total instructions = 39295
KLEE: done: completed paths = 2
KLEE: done: partially completed paths = 0
KLEE: done: generated tests = 2
Inspecting `klee-last/test000001.ktest` with `ktest-tool`:
$ ktest-tool klee-last/test000001.ktest
ktest file : 'klee-last/test000001.ktest'
args : ['main.bc']
num objects: 3
object 0: name: 'model_version'
object 0: size: 4
object 0: data: b'\x01\x00\x00\x00'
object 0: hex : 0x01000000
object 0: int : 1
object 0: uint: 1
object 0: text: ....
object 1: name: 'array_1'
object 1: size: 4
object 1: data: b'\xff\x00\x00\x00'
object 1: hex : 0xff000000
object 1: int : 255
object 1: uint: 255
object 1: text: ....
object 2: name: 'array_1'
object 2: size: 4
object 2: data: b'\x00\x00\x00\x00'
object 2: hex : 0x00000000
object 2: int : 0
object 2: uint: 0
object 2: text: ....
Content of `klee-last/test000001.kquery`:
array array_0[4] : w32 -> w8 = symbolic
array array_1[4] : w32 -> w8 = symbolic
array model_version[4] : w32 -> w8 = symbolic
(query [(Eq 1
(ReadLSB w32 0 model_version))
(Eq false
(Eq (ReadLSB w32 0 array_0)
(ReadLSB w32 0 array_1)))]
false)
This archive file contains the C source file along with bitcode and KLEE's output:
for_loop.tar.gz
Platform information
OS version: Ubuntu 22.04.1 LTS
Output of `klee --version`:
KLEE 3.0-pre (https://klee.github.io)
Build mode: RelWithDebInfo (Asserts: ON)
Build revision: 667ce0f1ef33c32fbe2d1836fc1b334066e244ca
LLVM (http://llvm.org/):
LLVM version 11.1.0
Optimized build.
Default target: x86_64-pc-linux-gnu
Host CPU: znver1
Just a quick guess: I think it's the name handling here:
klee/lib/Core/SpecialFunctionHandler.cpp
Line 830 in 667ce0f
`temp` is hoisted, hence the same `mo` is renamed.

Yes, this is the issue: the code is essentially making the same variable symbolic twice, due to the way the LLVM code is generated.
A workaround is to allocate space for `temp` on each iteration on the heap.
Thank you for your advice. It’s been very helpful!
> A workaround is to allocate space for `temp` on each iteration on the heap.
According to this, I've modified the for-loop in the example and it worked!
/* Note: malloc/free require #include <stdlib.h> at the top of the file. */
for (int i = 0; i < 2; i++) {
  char s[50];
  sprintf(s, "%s_%d", "array", i);
  int *temp = (int *)malloc(sizeof(int)); // <-- heap space allocated here
  klee_make_symbolic(temp, sizeof(*temp), s);
  array[i] = *temp;
  free(temp); // <-- and freed here
}
@MartinNowack I tried `--klee-call-optimisation=false`, but this problem still exists.
After investigating the related discussions you mentioned, I compared:
- `main()` in `main.bc` (produced by clang directly)
- `__klee_posix_wrapped_main()` found in `assembly.ll` (generated by KLEE)
The only difference (excluding debug info) I could find was this:
(TL;DR) One `br` instruction was removed.
; Function Attrs: noinline nounwind uwtable
define dso_local i32 @main() #0 !dbg !7 {
%1 = alloca i32, align 4
%2 = alloca [2 x i32], align 4
%3 = alloca i32, align 4
%4 = alloca [50 x i8], align 16
%5 = alloca i32, align 4 ; ■■■ The variable "temp" ■■■
%6 = alloca i32, align 4
; ... omitted ...
10: ; preds = %7
call void @llvm.dbg.declare(metadata [50 x i8]* %4, metadata !24, metadata !DIExpression()), !dbg !30
%11 = getelementptr inbounds [50 x i8], [50 x i8]* %4, i64 0, i64 0, !dbg !31
%12 = load i32, i32* %3, align 4, !dbg !32
%13 = call i32 (i8*, i8*, ...) @sprintf(i8* %11, i8* getelementptr inbounds ([6 x i8], [6 x i8]* @.str, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8], [6 x i8]* @.str.1, i64 0, i64 0), i32 %12) #4, !dbg !33
call void @llvm.dbg.declare(metadata i32* %5, metadata !34, metadata !DIExpression()), !dbg !35
%14 = bitcast i32* %5 to i8*, !dbg !36
%15 = getelementptr inbounds [50 x i8], [50 x i8]* %4, i64 0, i64 0, !dbg !37
call void @klee_make_symbolic(i8* %14, i64 4, i8* %15), !dbg !38 ; ■■■ Make "temp" symbolic here ■■■
%16 = load i32, i32* %5, align 4, !dbg !39
%17 = load i32, i32* %3, align 4, !dbg !40
%18 = sext i32 %17 to i64, !dbg !41
%19 = getelementptr inbounds [2 x i32], [2 x i32]* %2, i64 0, i64 %18, !dbg !41
store i32 %16, i32* %19, align 4, !dbg !42
br label %20, !dbg !43 ; ■■■ This was optimized out in assembly.ll ■■■
20: ; preds = %10 ■■■ This was optimized out in assembly.ll ■■■
%21 = load i32, i32* %3, align 4, !dbg !44
%22 = add nsw i32 %21, 1, !dbg !44
store i32 %22, i32* %3, align 4, !dbg !44
br label %7, !dbg !45, !llvm.loop !46
; ... omitted ...
}
Then I looked into the line pointed out by @251. Here is my guess:
After `klee_make_symbolic` had been executed twice, two `Array`s (named `array_0` and `array_1` respectively) were bound to the same `MemoryObject`, which was named `array_1`.
Lines 4308 to 4317 in 667ce0f
klee/lib/Core/ExecutionState.cpp
Lines 135 to 137 in 667ce0f
However, `KTest` objects took the `MemoryObject`'s name returned by `Executor::getSymbolicSolution`. This led to the problem.
Line 4576 in 667ce0f
Would it be ok to simply replace `first` by `second` here?
Looking forward to your response and thanks in advance!