seL4 / seL4

The seL4 microkernel

Home Page:https://sel4.systems

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Should we add `-Wconversion` to compiler flags?

lsf37 opened this issue · comments

We just finished tracking down a bug in a PR where ticks_t was implicitly converted by the compiler to word_t, changing the value (overflowing 32 bit) and causing the kernel to crash.

The compiler currently doesn't warn for implicit C conversions that my change value or sign. Verification will catch these, but only when the code is actually in the verified subset (the example was for MCS, where the C proofs are still in progress and currently only for 64 bit, where word_t and ticks_t is equal).

@Xaphiosis points out that manually adding -Wconversion to the compiler flags would catch these much earlier. And arguably we should not have any implicit conversions in the kernel at all: either we have thought about it and want a conversion, then we should make it explicit, or we should avoid it, because it either dangerous (not verified) or increases verification cost (needs reasoning why the conversion is safe).

The downside is that there are currently quite a few warnings if we switch this on. On first sampling they look fine, i.e. are safe, but there are a few specifically on MCS that might not be.

So it'd be a bit of work and code churn to remove all warnings (including verification update), but I think it might be worth it.

What do people think?

I think it's a good idea

Depends on how many of those "quite a few" warnings are duplicates, and if it are dozens or hundreds of unique warnings. If it's too many it's just a lot of work with not much gain: If people become too used fixing them without thinking, it may not prevent new bugs.

There are currently 85 such warnings on the master branch, many of which are versions of unsigned conversion from 'int' to 'syscall_t' at different points in syscall.c (all safe).

Some of the other ones are not obvious to me if they are correct or not. For instance is this one always guaranteed to work? It converts everything to unsigned for both the + and the *, so either we lose the sign or we lose half of the value space. (Or it might just work out because the result is converted back to a signed long.) It's just a print utility function, but it would be very hard to spot.

Depends on how many of those "quite a few" warnings are duplicates, and if it are dozens or hundreds of unique warnings. If it's too many it's just a lot of work with not much gain: If people become too used fixing them without thinking, it may not prevent new bugs.

My idea would be to take the opportunity to sort these out now and make sure that they're safe, so that for a new contributor the warning will pop up in their new code only. Unfortunately you are right, and there is no centralised cure for laziness.

There are currently 85 such warnings on the master branch, many of which are versions of unsigned conversion from 'int' to 'syscall_t' at different points in syscall.c (all safe).

That doesn't sound too bad.

Some of the other ones are not obvious to me if they are correct or not. For instance is this one always guaranteed to work?

No, it's broken, as it doesn't support negative numbers at all. But that has nothing to do with the sign conversion.

It converts everything to unsigned for both the + and the *, so either we lose the sign or we lose half of the value space.

Only on 32-bit, on 64-bit it's converted to long. But input overflow is not handled correctly. That is, the function can't deal with numbers that don't fit in a long anyway. For positive numbers that fit, it doesn't matter whether the intermediate values are stored signed or unsigned.