Wrong handling of i1 in visitCastInst
ercoppa opened this issue
Consider this example (inspired by real-world code):
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

int bar(unsigned char a) {
    if (a == 0xCA) return -1;
    else return 0;
}

int main() {
    unsigned char input = 0;
    read(0, &input, sizeof(input));
    int r = bar(input);
    if (r == -1) printf("Bingo!\n");
    else printf("Ok\n");
    return r;
}
With -O1, Clang emits the following IR for bar (with -O2, bar is inlined, which hides the bug):
define dso_local i32 @bar(i8 zeroext %0) local_unnamed_addr #0 {
  %2 = icmp eq i8 %0, -54
  %3 = sext i1 %2 to i32
  ret i32 %3
}
Note the sext instruction. After instrumenting with SymCC, we get:
define dso_local i32 @bar(i8 zeroext %0) local_unnamed_addr #0 {
  call void @_sym_notify_basic_block(i64 18040285541467748) #5
  %2 = call i8* @_sym_get_parameter_expression(i8 0) #5
  %3 = icmp eq i8* %2, null
  br i1 %3, label %7, label %4

4:                                                ; preds = %1
  %5 = call i8* @_sym_build_integer(i64 202, i8 8) #5
  %6 = call i8* @_sym_build_equal(i8* nonnull %2, i8* nonnull %5) #5
  br label %7

7:                                                ; preds = %1, %4
  %8 = phi i8* [ null, %1 ], [ %6, %4 ]
  %9 = icmp eq i8 %0, -54
  %10 = icmp eq i8* %8, null
  br i1 %10, label %13, label %11

11:                                               ; preds = %7
  %12 = call i8* @_sym_build_bool_to_bits(i8* nonnull %8, i8 32) #5
  br label %13

13:                                               ; preds = %7, %11
  %14 = phi i8* [ null, %7 ], [ %12, %11 ]
  %15 = sext i1 %9 to i32
  call void @_sym_set_return_expression(i8* %14) #5
  ret i32 %15
}
The problem is that _sym_build_bool_to_bits builds an if-then-else expression of the form if (cond, 0x0...01, 0x0...0), which is correct for a zext operation but not for a sext operation, where a true condition must extend to all ones (-1). As a result, SymCC cannot generate an alternative input for the example:
SYMCC_OUTPUT_DIR=`pwd`/out ./main < input.txt
This is SymCC running with the QSYM backend
Reading program input until EOF (use Ctrl+D in a terminal)...
[STAT] SMT: { "solving_time": 0, "total_time": 531 }
[STAT] SMT: { "solving_time": 285 }
[STAT] SMT: { "solving_time": 285, "total_time": 1115 }
[STAT] SMT: { "solving_time": 498 }
Ok
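The mismatch can be reproduced at the value level without SymCC. The following is a minimal sketch, not SymCC code: the helper names (sext_i1_to_i32, zext_i1_to_i32, count_solutions, self_check) are made up for illustration. It models the return value of bar under both extensions and counts how many of the 256 possible input bytes make the result equal to -1, which is the condition the solver would need to satisfy to reach the "Bingo!" branch.

```c
#include <assert.h>
#include <stdint.h>

/* Concrete semantics of `sext i1 %c to i32`: true becomes all ones (-1). */
static int32_t sext_i1_to_i32(int cond) { return cond ? -1 : 0; }

/* Value produced by an if (cond, 1, 0) expression: true becomes 1,
 * which matches zext semantics only. */
static int32_t zext_i1_to_i32(int cond) { return cond ? 1 : 0; }

/* Count the input bytes for which the modeled return value of bar is -1. */
static int count_solutions(int32_t (*extend)(int)) {
    int n = 0;
    for (int a = 0; a <= 0xFF; a++) {
        if (extend(a == 0xCA) == -1) n++;
    }
    return n;
}

/* Self-check: only the sext model admits an input that returns -1. */
static void self_check(void) {
    assert(count_solutions(sext_i1_to_i32) == 1); /* a == 0xCA */
    assert(count_solutions(zext_i1_to_i32) == 0); /* r == -1 is unsatisfiable */
}
```

Calling self_check() from a main function exercises both models. Under the zext-style model no input satisfies r == -1, which is consistent with SymCC producing no alternative input above.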
One possible fix would be to provide, e.g., _sym_build_bool_to_sign_bits and use it in visitCastInst for the i1 case if and only if the instruction is Instruction::SExt.
Let me know if you want a PR along this direction or if we should design a slightly different fix.
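To make the intended behavior of such a helper concrete, here is a value-level sketch in C. It is an assumption about what the proposed function would compute, not SymCC's runtime API: bool_to_bits_model and bool_to_sign_bits_model are hypothetical names that operate on plain integers rather than on symbolic expressions, and widths above 64 bits are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (1 <= bits <= 64 assumed). */
static uint64_t all_ones(unsigned bits) {
    return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Model of the existing behavior: if (cond, 0x0...01, 0x0...0). */
static uint64_t bool_to_bits_model(int cond, unsigned bits) {
    (void)bits; /* the width only affects padding with zeros */
    return cond ? 1 : 0;
}

/* Model of the proposed helper: if (cond, 0xF...F, 0x0...0). */
static uint64_t bool_to_sign_bits_model(int cond, unsigned bits) {
    return cond ? all_ones(bits) : 0;
}

/* Self-check for the 32-bit case from the example. */
static void self_check(void) {
    assert(bool_to_bits_model(1, 32) == 1);
    assert(bool_to_sign_bits_model(1, 32) == UINT64_C(0xFFFFFFFF));
    assert(bool_to_sign_bits_model(0, 32) == 0);
}
```

In this model, visitCastInst would pick the second function for Instruction::SExt and keep the first for every other cast of an i1.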
Nice find! And thanks for all the debug information 😊
I'm wondering if _sym_build_bool_to_bits is doing more than necessary 🤔 How about we make it return an expression for i1 unconditionally, which we then feed to either _sym_build_sext or _sym_build_zext? The downside would be an additional call into the runtime, but since there's no branching the CPU should be able to handle it rather well. And the code would fit nicely into visitCastInst... What do you think?
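The two-step idea (first obtain a 1-bit value, then apply an ordinary extension) can be sketched on plain integers as follows. This is only an illustration under stated assumptions: zext_model, sext_model, and cast_i1_model are hypothetical names, widths are limited to 1 through 64 bits, and none of this is the actual runtime code.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set (1 <= bits <= 64 assumed). */
static uint64_t low_mask(unsigned bits) {
    return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* Zero-extension: keep the low from_bits bits, upper bits are zero. */
static uint64_t zext_model(uint64_t v, unsigned from_bits) {
    return v & low_mask(from_bits);
}

/* Sign-extension: copy the top bit of the from_bits-wide value upward,
 * then truncate the result to to_bits. */
static uint64_t sext_model(uint64_t v, unsigned from_bits, unsigned to_bits) {
    uint64_t x = v & low_mask(from_bits);
    uint64_t sign = UINT64_C(1) << (from_bits - 1);
    if (x & sign) x |= ~low_mask(from_bits);
    return x & low_mask(to_bits);
}

/* Cast of an i1: step 1 yields a 1-bit value, step 2 extends it. */
static uint64_t cast_i1_model(int cond, int is_sext, unsigned to_bits) {
    uint64_t bit = cond ? 1 : 0;
    return is_sext ? sext_model(bit, 1, to_bits) : zext_model(bit, 1);
}

/* Self-check: the same 1-bit value gives -1 under sext and 1 under zext. */
static void self_check(void) {
    assert(cast_i1_model(1, 1, 32) == UINT64_C(0xFFFFFFFF));
    assert(cast_i1_model(1, 0, 32) == 1);
    assert(cast_i1_model(0, 1, 32) == 0);
}
```

The appeal of this shape is that the i1 case needs no special extension logic: once the condition is a 1-bit value, the generic sext and zext paths already produce the right result.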
Your example is a really nice candidate for the test suite too. I can add it with the fix.