jonhoo / rust-ibverbs

Bindings for RDMA ibverbs through rdma-core

RFC: Integration testing

daniel-noland opened this issue

I really appreciate this library but there is one thing which concerns me: integration testing.

I'm looking for feedback on the desired approach here (or alternatives if you think of any).

To be clear, I don't really feel that unit testing is especially problematic in this case: bindgen takes care of
validating the bindings and upstream rdma-core is already well tested in its own right. That said, there are a number of really handy convenience methods and data structures which I would like to help test.

I iterated through several options for integration testing here and I have basically concluded "it is going to be hard."
Still, that doesn't mean I shouldn't try.

Options

  1. Vagrant virtual machines + SoftRoCE: workable but a bit heavy.
    This will require significant tooling work and is likely to be really problematic if you want to continue to use a
    tool like Travis CI.

  2. Docker container + SoftRoCE: workable, not super portable, challenging for CI.
    In the name of experimentation, I cooked up a Debian bullseye container which builds the project, sets up a
    SoftRoCE interface, and runs a few integration tests. The problem is that RDMA is, at its heart, a hardware offload.
    Even with SoftRoCE you are just emulating a hardware offload. As long as you are using the host's kernel (as would
    be the case for most container technologies), the container is really only buying you a consistent development
    environment, not a consistent deployment environment. For example, if the CI system hasn't loaded the SoftRoCE
    module (rxe) then the container won't work anyway. Packaging the SoftRoCE kernel module with the container is
    useless due to ABI compatibility problems between differing kernel versions. You might be able to somehow overlay
    mount /lib/modules/ from the host and use DKMS to build and load the SoftRoCE module on a per-kernel basis, but this
    seems like a huge pain. At the very least this will be very difficult to make reliable and portable.

    In my experience the best way to ensure tests never get written is to make them a huge pain to write and maintain.

    Oh, one more problem: RDMA devices are not (by default) network namespaced. Even when I enable RDMA namespacing, I
    can't seem to get SoftRoCE devices to move between namespaces. Thus my dream of setting up something like Mininet
    for RoCE doesn't seem viable currently.

  3. Bash script sets up a SoftRoCE device and assumes the developer has taken care of the kernel config: not ideal,
    but I think it is the best available choice at the moment.

    I have already composed a simple bash script which sets up a single SoftRoCE device. This isn't really portable
    either, and it requires that the developer already have rdma-core installed (the script itself also needs
    iproute2, which provides the rdma command). I don't know how CI would react to this configuration.
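
To make option 3 a bit more concrete, here is a minimal sketch of the same idea driven from Rust, so that a test harness can skip itself when SoftRoCE is unavailable instead of failing. This is not the bash script mentioned above. The function names, the device name `rxe0`, and the netdev `eth0` are all hypothetical, and the sketch assumes the driver shows up as `rdma_rxe` in /proc/modules and that the `rdma` tool from iproute2 is on PATH.

```rust
use std::fs;
use std::io;
use std::process::Command;

/// Name under which the SoftRoCE driver is assumed to appear in /proc/modules.
const RXE_MODULE: &str = "rdma_rxe";

/// Returns true if `proc_modules` (the text of /proc/modules) lists the
/// SoftRoCE module. A driver built into the kernel is not listed there,
/// so `false` is only a hint, not proof that rxe is unavailable.
fn rxe_module_loaded(proc_modules: &str) -> bool {
    proc_modules
        .lines()
        // The module name is the first whitespace-separated field of each line.
        .filter_map(|line| line.split_whitespace().next())
        .any(|name| name == RXE_MODULE)
}

/// Builds the arguments for `rdma link add <name> type rxe netdev <netdev>`.
fn rxe_link_add_args(name: &str, netdev: &str) -> Vec<String> {
    ["link", "add", name, "type", "rxe", "netdev", netdev]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Tries to create a SoftRoCE device (hypothetical helper).
/// Returns Ok(false) when the module is not loaded or the command fails,
/// so callers can skip integration tests rather than report a failure.
/// Creating the link typically needs root privileges.
fn setup_rxe(name: &str, netdev: &str) -> io::Result<bool> {
    let modules = fs::read_to_string("/proc/modules")?;
    if !rxe_module_loaded(&modules) {
        return Ok(false);
    }
    let status = Command::new("rdma")
        .args(rxe_link_add_args(name, netdev))
        .status()?;
    Ok(status.success())
}
```

An integration test could call `setup_rxe("rxe0", "eth0")` first and return early when it yields `Ok(false)`. Keeping the parsing and argument building in separate pure functions means those parts can be unit tested on any machine, even one without RDMA support.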

Thoughts?

I like the approach you're taking in #13 if we can make it work! This is definitely a tricky challenge without dedicated testing hardware.

Hi, I would really recommend testing in a VM, because SoftRoCE is not very stable and can easily cause a kernel panic.