Reduce branching (or batch reorganization) overheads
achimnol opened this issue · comments
In the EuroSys 2015 paper, we demonstrated a simple branch prediction technique.
However, after the Packet/PacketBatch refactoring (#1), the current branch prediction technique no longer performs well.
In this issue, we investigate why this happens and look for another way to improve branching performance.
Anticipated overhead sources:
- Allocation of new batch objects from the memory pool
- Copying packet pointers to the new batches
- Handling packet masks in the original batch (if reused)
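
To make the three overhead sources concrete, here is a minimal sketch that counts them for two ways of splitting a batch at a two-way branch. It is an illustration only, not the framework's actual code: `Packet`, `Batch`, `BatchPool`, `Stats`, `classify`, `split_by_copy`, and `split_by_mask` are all hypothetical names, the pool is a toy, and the predicate (port 80 goes to output 0) is an arbitrary assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-ins; not the real Packet/PacketBatch types.
struct Packet { std::uint32_t dst_port; };

constexpr std::size_t kMaxBatch = 64;

struct Batch {
    std::array<Packet*, kMaxBatch> pkts{};
    std::array<bool, kMaxBatch> excluded{};  // packet mask: true = skip this slot
    std::size_t count = 0;                   // number of used slots (masked ones included)
};

// Counters for the three anticipated overhead sources.
struct Stats {
    std::size_t allocs = 0;      // new batch objects taken from the pool
    std::size_t ptr_copies = 0;  // packet pointers copied into new batches
    std::size_t masked = 0;      // slots masked out in the reused batch
};

// Toy pool; std::deque keeps earlier element addresses valid on emplace_back.
class BatchPool {
public:
    Batch* alloc(Stats& st) {
        ++st.allocs;
        store_.emplace_back();
        return &store_.back();
    }
private:
    std::deque<Batch> store_;
};

// Assumed branch predicate: returns output index 0 or 1.
inline int classify(const Packet& p) { return p.dst_port == 80 ? 0 : 1; }

struct SplitResult { Batch* out[2] = {nullptr, nullptr}; };

// Strategy A: allocate a new batch per used output and copy every pointer.
SplitResult split_by_copy(const Batch& in, BatchPool& pool, Stats& st) {
    SplitResult r;
    for (std::size_t i = 0; i < in.count; ++i) {
        if (in.excluded[i]) continue;
        const int o = classify(*in.pkts[i]);
        if (!r.out[o]) r.out[o] = pool.alloc(st);
        Batch* b = r.out[o];
        b->pkts[b->count++] = in.pkts[i];
        ++st.ptr_copies;
    }
    return r;
}

// Strategy B: reuse the input batch for the predicted-majority output;
// copy only minority packets to a new batch and mask them in the original.
SplitResult split_by_mask(Batch& in, int majority, BatchPool& pool, Stats& st) {
    SplitResult r;
    r.out[majority] = &in;
    const int minority = 1 - majority;
    for (std::size_t i = 0; i < in.count; ++i) {
        if (in.excluded[i]) continue;
        if (classify(*in.pkts[i]) == majority) continue;
        if (!r.out[minority]) r.out[minority] = pool.alloc(st);
        Batch* b = r.out[minority];
        b->pkts[b->count++] = in.pkts[i];
        ++st.ptr_copies;
        in.excluded[i] = true;  // downstream elements must now check the mask
        ++st.masked;
    }
    return r;
}
```

In this sketch, strategy B trades allocations and pointer copies for mask checks: the reused batch keeps its original `count`, so every element downstream of the majority output has to test `excluded[i]` for each slot. Whether that trade pays off in practice depends on the skew of the traffic and the length of the downstream path, which is what the test scenarios are meant to measure.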
Test scenarios:
- single branch
- balanced tree (both majority/minority outputs have the same next branches)
- majority-skewed tree (next branch connected to the majority output of the current branch; minority outputs are short-circuited to L2Forward)
- minority-skewed tree (next branch connected to the minority output of the current branch; majority outputs are short-circuited to L2Forward)
- some processing elements after the branch, to highlight the third overhead (handling packet masks in a reused batch)