visitSelectInst does not propagate the symbolic expression
ercoppa opened this issue · comments
The current implementation of visitSelectInst is:
void Symbolizer::visitSelectInst(SelectInst &I) {
  // Select is like the ternary operator ("?:") in C. We push the (potentially
  // negated) condition to the path constraints and copy the symbolic
  // expression over from the chosen argument.
  IRBuilder<> IRB(&I);
  auto runtimeCall = buildRuntimeCall(IRB, runtime.pushPathConstraint,
                                      {{I.getCondition(), true},
                                       {I.getCondition(), false},
                                       {getTargetPreferredInt(&I), false}});
  registerSymbolicComputation(runtimeCall);
}
The effect is that SymCC may (e.g., depending on the branch bitmap) generate an alternative input for the select condition. However, the data flow is not propagated: the result of the SelectInst is never given a symbolic expression, so it is treated as concrete. Hence, the code does not do what the comment says.
This can be seen in this example (inspired by real-world code):
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

char bar(char a, char b, char c) {
  return (a != 0xA) ? b : c;
}

int main() {
  char input[2] = { 0 };
  read(0, &input, sizeof(input));
  char r = bar(input[0], input[1], 0);
  if (r == 0xB) printf("Bingo!\n");
  else printf("Ok\n");
  return 0;
}
where r will not be symbolic due to the bug. SymCC cannot solve the branch in main because r is concrete.
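To illustrate the difference, here is a minimal, self-contained sketch of the two behaviors. It is a toy model and not SymCC code: the Value struct, the string-based expressions, and all function names are hypothetical and exist only for this illustration. A null expression stands for "concrete".

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical toy model, not SymCC code: a value carries its concrete
// content plus an optional symbolic expression (nullptr means "concrete").
struct Value {
    std::int8_t concrete;
    std::shared_ptr<std::string> expr;  // e.g. "input[1]", or nullptr
};

// Models the behavior described in this issue: the condition may be pushed
// as a path constraint elsewhere, but the result gets no expression.
Value selectWithoutPropagation(bool cond, const Value &t, const Value &f) {
    return Value{cond ? t.concrete : f.concrete, nullptr};
}

// Models the behavior the source comment promises: the expression of the
// chosen operand is copied over to the result.
Value selectWithPropagation(bool cond, const Value &t, const Value &f) {
    const Value &chosen = cond ? t : f;
    return Value{chosen.concrete, chosen.expr};
}

// True if a later comparison on v could be handed to the solver.
bool isSymbolic(const Value &v) { return v.expr != nullptr; }
```

In terms of the example, b corresponds to input[1] (symbolic) and c to the constant 0 (concrete). Without propagation, r is never symbolic, so the comparison r == 0xB cannot be negated by the solver.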
A possible fix could be to add at the bottom of visitSelectInst something like:
auto *data = IRB.CreateSelect(
    I.getCondition(),
    getSymbolicExpressionOrNull(I.getTrueValue()),
    getSymbolicExpressionOrNull(I.getFalseValue()));
symbolicExpressions[&I] = data;
If this suggested fix is fine, then I can send a PR. Otherwise, let me know how you prefer to revise the fix.
So we're actually lying in that comment in visitSelectInst 😜 I agree with your suggested fix. We could additionally check if the expressions for both values are null (i.e., the values are compile-time known, as in foo ? "yes" : "no") and skip the creation of the select in that case, which will prevent us from emitting code for subsequent use of the selected value. (It would be short-circuited at runtime, but code that we don't generate is even better than code that we jump over 😉 )
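The "skip when both are null" idea can be sketched with a small stand-alone model of the pass. This is an assumption-laden illustration, not the actual SymCC implementation: ToySymbolizer, getExprOrNull, and the string-keyed map are hypothetical stand-ins for the real Symbolizer, its expression lookup, and symbolicExpressions.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical toy model of the instrumentation pass, not SymCC code.
// IR values are identified by name; a value without an entry in the map
// has no symbolic expression (it is known at compile time).
struct ToySymbolizer {
    std::unordered_map<std::string, std::string> symbolicExpressions;
    int emittedSelects = 0;  // counts the selects we would generate

    // Returns nullptr when the value has no symbolic expression.
    const std::string *getExprOrNull(const std::string &value) const {
        auto it = symbolicExpressions.find(value);
        return it == symbolicExpressions.end() ? nullptr : &it->second;
    }

    void visitSelect(const std::string &result, const std::string &cond,
                     const std::string &trueValue,
                     const std::string &falseValue) {
        const std::string *t = getExprOrNull(trueValue);
        const std::string *f = getExprOrNull(falseValue);
        // Both operands are compile-time known: emit nothing, so later
        // uses of `result` see no expression and are skipped as well.
        if (t == nullptr && f == nullptr)
            return;
        // Build the text before inserting into the map.
        std::string selected = "select(" + cond + ", " +
                               (t ? *t : std::string("null")) + ", " +
                               (f ? *f : std::string("null")) + ")";
        ++emittedSelects;
        symbolicExpressions[result] = selected;
    }
};
```

With this check, a select such as foo ? "yes" : "no" leaves no entry for its result, so any instruction that consumes the result finds no symbolic operand and needs no instrumentation either.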