Unable to checkpoint container with `-nvproxy` after the introduction of `driverABI`
luiscape opened this issue · comments
Description
Similar to #9363, the `driverABI` struct doesn't implement `SaverLoader`.
I applied a patch similar to #9385 and am able to checkpoint containers with `-nvproxy` successfully (still testing restore; patch below). I'm happy to submit a PR, but I'm wondering whether this makes sense and what the implications are of not saving this state.
The patch would be made here.
// +stateify savable
type driverABI struct {
	frontendIoctl   map[uint32]frontendIoctlHandler   `state:"nosave"`
	uvmIoctl        map[uint32]uvmIoctlHandler        `state:"nosave"`
	controlCmd      map[uint32]controlCmdHandler      `state:"nosave"`
	allocationClass map[uint32]allocationClassHandler `state:"nosave"`

	useRmAllocParamsV535 bool
}
Does this make sense?
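For anyone following along, here is a rough illustration of what skipping these fields implies. As I understand it, a field tagged `state:"nosave"` is left out of the checkpoint and comes back as its zero value, so the handler maps would be nil after restore unless something repopulates them. The sketch below does not use gVisor's state package; it imitates the behavior with `encoding/json`, and `toyABI`, `handler`, `rebuild`, and `saveAndRestore` are hypothetical names made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// handler stands in for the ioctl handler function types. Functions cannot
// be serialized, which is one reason to skip the maps when saving.
type handler func(arg uint32) uint32

// toyABI is a hypothetical stand-in for driverABI.
type toyABI struct {
	// Skipped on save, analogous to `state:"nosave"`.
	FrontendIoctl map[uint32]handler `json:"-"`
	// Plain data, saved and restored as-is.
	UseRmAllocParamsV535 bool `json:"useRmAllocParamsV535"`
}

// rebuild repopulates the handler table, e.g. after a restore.
func (a *toyABI) rebuild() {
	a.FrontendIoctl = map[uint32]handler{
		0x2a: func(arg uint32) uint32 { return arg + 1 },
	}
}

// saveAndRestore round-trips the struct through JSON, standing in for
// checkpoint followed by restore.
func saveAndRestore(a *toyABI) (*toyABI, error) {
	data, err := json.Marshal(a)
	if err != nil {
		return nil, err
	}
	restored := &toyABI{}
	if err := json.Unmarshal(data, restored); err != nil {
		return nil, err
	}
	return restored, nil
}

func main() {
	orig := &toyABI{UseRmAllocParamsV535: true}
	orig.rebuild()

	restored, err := saveAndRestore(orig)
	if err != nil {
		panic(err)
	}
	// The skipped map is nil after restore; the bool survives.
	fmt.Println(restored.FrontendIoctl == nil, restored.UseRmAllocParamsV535)

	// The handlers must be rebuilt before the ABI is usable again.
	restored.rebuild()
	fmt.Println(len(restored.FrontendIoctl))
}
```

In this toy version the rebuild is an explicit call; in the real code it would have to happen somewhere on the restore path before the first ioctl is dispatched.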
This makes sense. The driver ABI should be savable. Happy to review your PR.
Although this would imply that the container must be restored on a host with the same nvidia driver version. If the driver version can change, then the ABI would need to be rebuilt (which requires extra work).
> Although this would imply that the container must be restored on a host with the same nvidia driver version.
Gotcha. This is true in our case (for the most part :) ).
Submitted the patch here. Thanks a lot for the review.