Rework ipi_send_target

Question

andybui01 opened this issue 7 months ago · comments

Currently, the semantics of ipi_send_target() allow it to handle multiple IPI targets at once. However, this is not actually used in practice. generic_ipi_send_mask(), which calls ipi_send_target(), essentially breaks down the target mask and then dispatches one IPI at a time anyway. This has led to very weird workarounds on all interrupt controllers (namely the APICs, RISC-V (PLIC?), and now GICv3).

This is a follow-on from the discussion in #1135.

Gerwin Klein · Answer 1 · Thu Feb 08 2024 13:07:29 GMT+0800 (China Standard Time)

I'd like to understand this better. Is the issue that the multiple IPI targets are not supported by the hardware or is the issue that the implementation in seL4 does something suboptimal?

Indan Zupancic · Answer 2 · Thu Feb 08 2024 20:25:59 GMT+0800 (China Standard Time)

Both. The only one capable of sending multiple IPIs at once is GICv2, but even there only one IPI is sent at once because that's what generic_ipi_send_mask() does.

But the x86 and RISC-V ipi_send_target() implementations are written with the assumption that they are called with only one cpu id instead of a mask.

x86 xapic: apic_send_ipi_core(irq_t vector, cpu_id_t cpu_id)
RISC-V: ipi_send_target(irq_t irq, word_t hart_id)
ARM GICv2: ipi_send_target(irq_t irq, word_t cpuTargetList) and HW supports masks.
ARM GICv3: ipi_send_target(irq_t irq, word_t cpuTargetList) and HW does not support masks.

So the proposal is to change the semantics so they match reality as implemented:
ipi_send_target(irq_t irq, word_t cpu).

Then x86 and RISC-V are correct. GICv3 can be simplified because it doesn't need to loop any more. GICv2 needs to use BIT(cpu) instead of the mask to keep current behaviour, or it can implement its own optimised ipi_send_mask().
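
To make the proposal concrete, here is a minimal sketch of the single-target semantics with a GICv2-style backend. It is not the seL4 source: gicv2_write_sgir(), ipi_count and sgi_writes are hypothetical stand-ins for the hardware, MAX_CPUS is an assumed limit, and the real generic_ipi_send_mask() may take further parameters.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word_t;
typedef uint32_t irq_t;

#define BIT(n)   ((word_t)1 << (n))
#define MAX_CPUS 8u /* assumed limit; GICv2 addresses at most 8 CPU interfaces */

/* Stand-ins for the hardware so the sketch can be exercised off-target. */
static unsigned ipi_count[MAX_CPUS]; /* IPIs delivered per CPU */
static unsigned sgi_writes;          /* number of SGI register writes */

/* Hypothetical GICv2 register write: the hardware takes a target list. */
static void gicv2_write_sgir(irq_t irq, word_t cpuTargetList)
{
    (void)irq;
    sgi_writes++;
    for (word_t cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (cpuTargetList & BIT(cpu)) {
            ipi_count[cpu]++;
        }
    }
}

/* Proposed semantics: `cpu` is a single CPU index, not a mask. */
static void ipi_send_target(irq_t irq, word_t cpu)
{
    assert(cpu < MAX_CPUS);
    gicv2_write_sgir(irq, BIT(cpu)); /* GICv2 converts the index to a mask */
}

/* Generic layer: break the mask down and send one IPI per set bit. */
static void generic_ipi_send_mask(irq_t irq, word_t mask)
{
    for (word_t cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (mask & BIT(cpu)) {
            ipi_send_target(irq, cpu);
        }
    }
}
```

With this split, a mask with two bits set results in two register writes. A GICv2-specific ipi_send_mask() could instead pass the whole mask to the hardware in a single write.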

Long-term you want to get rid of masks altogether because there will be too many cpus to fit in one word.

Gerwin Klein · Answer 3 · Fri Feb 09 2024 07:00:18 GMT+0800 (China Standard Time)

Thanks for that summary, that makes sense to me. @kent-mcleod this means it probably does not make sense to have a mask for SGI targets in the upcoming RFC for the IPI/SGI API, correct?

Kent McLeod · Answer 4 · Fri Feb 09 2024 07:35:39 GMT+0800 (China Standard Time)

The GICv3 hw does support sending to up to 16 cores at a time using a 16bit target mask.
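
For illustration, a sketch of how such a 16-bit target mask might be packed into the value written to the GICv3 SGI generation register (ICC_SGI1R_EL1). The field positions below follow my reading of the GIC architecture and should be checked against the specification. sgi1r_encode() is a hypothetical helper, not an seL4 function, and it leaves the IRM and RS fields at zero.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ICC_SGI1R_EL1 field positions (verify against the GIC spec). */
#define SGI1R_AFF1_SHIFT  16 /* Aff1:       bits [23:16] */
#define SGI1R_INTID_SHIFT 24 /* INTID:      bits [27:24] */
#define SGI1R_AFF2_SHIFT  32 /* Aff2:       bits [39:32] */
#define SGI1R_AFF3_SHIFT  48 /* Aff3:       bits [55:48] */
                             /* TargetList: bits [15:0]  */

/*
 * Hypothetical helper: build the register value for sending SGI `intid`
 * to every core whose bit is set in `target_list`. All targets must share
 * the same Aff3.Aff2.Aff1 cluster; bit n selects the core with Aff0 == n.
 */
static uint64_t sgi1r_encode(uint32_t intid, uint8_t aff3, uint8_t aff2,
                             uint8_t aff1, uint16_t target_list)
{
    assert(intid < 16); /* SGIs are interrupt IDs 0..15 */
    return ((uint64_t)aff3 << SGI1R_AFF3_SHIFT)
         | ((uint64_t)aff2 << SGI1R_AFF2_SHIFT)
         | ((uint64_t)intid << SGI1R_INTID_SHIFT)
         | ((uint64_t)aff1 << SGI1R_AFF1_SHIFT)
         | (uint64_t)target_list;
}
```

Note that under this encoding the 16 targets of one write all sit in the same affinity cluster, so a mask spanning clusters would still need one write per cluster.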

Kent McLeod · Answer 5 · Fri Feb 09 2024 07:40:33 GMT+0800 (China Standard Time)

Edit: A difference between the GICv2 and GICv3 is that the GICv2 distinguishes multiple senders on the receiving side. E.g., if core B and core C both send SGI 2 to core A, core A will have to acknowledge 2B and 2C separately. On the GICv3, by contrast, there is only one IRQ state for each SGI number, so if the SGI from B is still pending when the SGI from C is delivered, SGI 2 simply stays pending and can only be acknowledged once.
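
The receive-side difference can be modelled with a toy sketch. This is not driver code: the types and functions below are hypothetical and only mimic the pending-state bookkeeping described above for one SGI number on one receiving core.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 8u /* assumed: GICv2 tracks up to 8 source CPUs */

/* GICv2-style state: one pending bit per source CPU. */
typedef struct { uint8_t pending_sources; } gicv2_sgi_state_t;

/* GICv3-style state: a single pending bit, regardless of the sender. */
typedef struct { int pending; } gicv3_sgi_state_t;

static void gicv2_raise(gicv2_sgi_state_t *s, unsigned src)
{
    assert(src < NUM_CPUS);
    s->pending_sources |= (uint8_t)(1u << src);
}

/* Number of acknowledgements the receiver has to perform. */
static unsigned gicv2_acks_needed(const gicv2_sgi_state_t *s)
{
    unsigned n = 0;
    for (unsigned src = 0; src < NUM_CPUS; src++) {
        n += (s->pending_sources >> src) & 1u;
    }
    return n;
}

static void gicv3_raise(gicv3_sgi_state_t *s)
{
    s->pending = 1; /* a second SGI while pending is merged with the first */
}

static unsigned gicv3_acks_needed(const gicv3_sgi_state_t *s)
{
    return s->pending ? 1u : 0u;
}
```

In this model, two senders raising the same SGI before the receiver runs produce two acknowledgements on the GICv2-style state and one on the GICv3-style state.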

Indan Zupancic · Answer 6 · Fri Feb 09 2024 18:41:16 GMT+0800 (China Standard Time)

> The GICv3 hw does support sending to up to 16 cores at a time using a 16bit target mask.

This is only any good if we do something similar at higher levels and also pack multiple targets into one word_t or something and propagate that throughout. That seems a lot of complexity for a probably rare corner case. Sending an IPI to all other cores may be worth optimising though.

Gerwin Klein · Answer 7 · Sun Feb 11 2024 07:56:18 GMT+0800 (China Standard Time)

> The GICv3 hw does support sending to up to 16 cores at a time using a 16bit target mask.

> This is only any good if we do something similar at higher levels and also pack multiple targets into one word_t or something and propagate that throughout.

That will be exactly the proposal for the new SGI/IPI API -- that the user can specify a CPU mask to send to multiple targets at once.

> That seems a lot of complexity for a probably rare corner case. Sending an IPI to all other cores may be worth optimising though.

This makes sense for the current state of the code, but if we do add it as an API (which admittedly is all new and not even an RFC yet) then I don't think we should artificially limit for the user what the hardware supports. I initially thought this discussion meant that GICv3 does not support the feature, and then having a different API between GICv2 and GICv3 would be awkward, but it sounds like we should clean up the sub-optimal internal implementation, not limit the main interface function.