qmonnet / rbpf

Rust virtual machine and JIT compiler for eBPF programs

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Incorrect shift implementation in the interpreter

pcy190 opened this issue · comments

According to the kernel eBPF standard, Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31) for 32-bit operations.

https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/Documentation/bpf/standardization/instruction-set.rst?plain=1#L300-L301

In both ubpf and kernel eBPF implementation, it masks the operation of LSH/RSH:

https://github.com/iovisor/ubpf/blob/4b9a1cc747cbb9a9aad025a68aaedff7211f7fc2/vm/ubpf_vm.c#L495-L506
https://github.com/torvalds/linux/blob/610a9b8f49fbcf1100716370d3b5f6f884a2835a/kernel/bpf/core.c#L1712-L1724

In the rbpf interpreter, it lacks the masking operation.

rbpf/src/interpreter.rs

Lines 225 to 228 in 4812c52

ebpf::LSH32_IMM => reg[_dst] = (reg[_dst] as u32).wrapping_shl(insn.imm as u32) as u64,
ebpf::LSH32_REG => reg[_dst] = (reg[_dst] as u32).wrapping_shl(reg[_src] as u32) as u64,
ebpf::RSH32_IMM => reg[_dst] = (reg[_dst] as u32).wrapping_shr(insn.imm as u32) as u64,
ebpf::RSH32_REG => reg[_dst] = (reg[_dst] as u32).wrapping_shr(reg[_src] as u32) as u64,

Moreover, in LSH64/RSH64, it doesn't check the range of src value, which could result in the 'attempt to shift left with overflow' panic as the following PoC program shows.

rbpf/src/interpreter.rs

Lines 272 to 275 in 4812c52

ebpf::LSH64_IMM => reg[_dst] <<= insn.imm as u64,
ebpf::LSH64_REG => reg[_dst] <<= reg[_src],
ebpf::RSH64_IMM => reg[_dst] >>= insn.imm as u64,
ebpf::RSH64_REG => reg[_dst] >>= reg[_src],

This program would trigger the 'attempt to shift left with overflow' panic in interpreter

mov64 r8, 0x054545ff
lsh64 r8, r8
exit

Regarding the use of a mask for left/right shifts in the interpreter, my understanding is that it's 1) because the behaviour is otherwise undefined in C, and 2) to address reports from KUBSAN. I'm not sure either of those apply in our case, do you have a reproducer program for which the output would be different than uBPF or kernel's eBPF, for example? If not, we probably don't need to add useless masks in the interpreter for the sake of copying the C implementations.

Ignore the above, I misunderstood the comments about the undefined behaviour. If I understand correctly, in our case the behaviour is defined: it panics if we overflow (hence a different behaviour from kernel/uBPF). OK, so we probably need the masks as well.