veeam / veeamsnap

Veeam Agent for Linux kernel module

Syscall to `IOCTL_TRACKING_ADD` fails, raising `function not implemented`

Dany9966 opened this issue · comments

This call runs fine on most of the distros we have tested. The only ones with issues are Red Hat Enterprise Linux 8 and distros based on it (CentOS 8, Oracle Linux 8) when running their latest kernel version, 4.18.0-348.7.1.el8_5.x86_64.
The call succeeds on older kernel versions, such as the ones that come preinstalled with the OS ISOs (for releases up to 8.3).
Here is the exact call we're trying to run: https://github.com/cloudbase/coriolis-snapshot-agent/blob/main/internal/ioctl/ioctl.go#L175

Kernel versions that this successfully runs on:

  • 4.18.0-147.el8.x86_64
  • 4.18.0-193.el8.x86_64
  • 4.18.0-240.el8.x86_64

Kernel versions on which this call fails:

  • 4.18.0-305
  • 4.18.0-348.el8.x86_64
  • 4.18.0-372.26.1.el8_6.x86_64

The veeamsnap module is always built from source, from the master branch.

Thanks for the feedback!
We will check.
For details, please contact veeam support.

Ah, I get it.
You wrote your agent in Go, and it calls the veeamsnap module directly.

	r1, _, err := syscall.Syscall(syscall.SYS_IOCTL, dev.Fd(), IOCTL_TRACKING_ADD, uintptr(unsafe.Pointer(&deviceParams)))
	if r1 != 0 {
		return errors.Wrap(err, "running ioctl")
	}

I'm not a fan of Go. At first glance, everything looks correct.
But Go's memory manager may have some peculiarities, so check the lifetime of the variable 'deviceParams'.
If I were you, I would write a unit test in C and debug with that.

We have checked these kernels and found no problems,
so I am closing the issue.

Also, check this code:

	// Tracking
	IOCTL_TRACKING_ADD               uintptr = 1074288130
	IOCTL_TRACKING_REMOVE            uintptr = 1074288131

Is it equal to `#define IOCTL_TRACKING_ADD _IOW(VEEAM_SNAP, 2, struct ioctl_dev_id_s)`?
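
One way to answer that question without a C compiler is to take the constant apart. The sketch below assumes the standard x86-64 ioctl number layout from asm-generic/ioctl.h (8 bits for nr, 8 bits for type, 14 bits for size, 2 bits for direction). The helper names `iow` and `decode` are made up for this illustration and are not part of the agent or the module.

```go
package main

import "fmt"

// Field layout of a Linux ioctl request number on x86-64 (asm-generic/ioctl.h):
// bits 0-7 nr, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
const (
	nrShift   = 0
	typeShift = 8
	sizeShift = 16
	dirShift  = 30

	dirWrite = 1 // _IOC_WRITE: userspace passes data to the kernel
)

// iow mirrors the C macro _IOW(type, nr, argtype) for a given argument size.
// (Hypothetical helper, named after the macro.)
func iow(typ, nr, size uintptr) uintptr {
	return dirWrite<<dirShift | size<<sizeShift | typ<<typeShift | nr<<nrShift
}

// decode splits a request number back into its four fields.
func decode(req uintptr) (dir, size, typ, nr uintptr) {
	return req >> dirShift & 0x3, req >> sizeShift & 0x3fff, req >> typeShift & 0xff, req & 0xff
}

func main() {
	// The constant used by the Go agent for IOCTL_TRACKING_ADD.
	dir, size, typ, nr := decode(1074288130)
	fmt.Printf("dir=%d size=%d type=%q nr=%d\n", dir, size, rune(typ), nr)

	// Rebuild the number from the decoded fields as a round-trip check.
	fmt.Println(iow(typ, nr, size) == 1074288130)
}
```

The constant is 0x40085602, which decodes to a write-direction ioctl with type byte 'V', command number 2 and an 8-byte argument. It therefore matches the macro only if the header defines VEEAM_SNAP as 'V' and `struct ioctl_dev_id_s` is 8 bytes long. Both are assumptions here and should be checked against the module's headers.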

I didn't consider the problem to be on our side, since this works on virtually every other system except RHEL 8 with those specific kernel versions. It could also be an edge case. Will investigate further.

Alas, I cannot join your research.

Check that the module is built specifically for this version of the kernel, as compatibility problems are possible.
See kmod-veeamsnap-*.rpm

Please write back with the results of your research into this problem. I'm very interested.

Hi @SergeiShtepa ,

Thanks for the leads. We'll test a bit more on our side. We will attempt to run that ioctl from C and see if we get the same result.

The C package in Go (cgo) allows us to reference C types and functions. This bit here:

https://github.com/cloudbase/coriolis-snapshot-agent/blob/main/internal/ioctl/ioctl.go#L170-L173

is essentially an instance of the struct_ioctl_dev_id_s structure from the C library imported above. We did this because of how Go aligns struct fields of different types: Go pads fields to satisfy alignment, so a smaller field that is followed by a larger one can take up more space than its type alone would need.

But given that we don't use a Go struct here, that should not be the case in this instance. However, we will investigate this further as it's turning out to be a puzzle 😄 .

Thanks for the tips, for your continued work on this module and the efforts to upstream it! Will ping back with results.

Hi!
Were you able to find the cause of the problem? I'm interested.
Using out-of-tree kernel modules creates some difficulties.
That is why I am working on submitting the blksnap module upstream. See issue and latest patch.

Any feedback is welcome.

Hi Sergei,

Not yet. I've been swamped lately, but I haven't forgotten and will definitely ping back when I manage to find the cause.

Hello!

A sample C program and some additional logging have been added to the veeamsnap module, for debugging purposes, here: master...aznashwan:veeamsnap:debug-ioctl-cmd

We found that when running the sample C program, the same IOCTL_TRACKING_ADD syscall fails with the same error. dmesg shows:
[ +0,003004] veeamsnap:tracking | ERR | Failed to create tracker. errno=-38

Error code 38 is ENOSYS: Function not implemented.

This is the same error thrown by our Go agent. The extra logging also shows the IOCTL command code each program is trying to run. Both the C program and the Go agent issue the same IOCTL command, 1074288130 (IOCTL_TRACKING_ADD), so we can rule out our agent as the cause.

Yep.
See:

  • tracker_disk_ref().
  • ioctl_set_kernel_entries()
  • ioctl_get_unresolved_kernel_entries()

Get Veeam Agent for Linux (VAL) 5.* and trace IOCTL_SET_KERNEL_ENTRIES/IOCTL_GET_UNRESOLVED_KERNEL_ENTRIES.
See /proc/kallsyms.

These are crutches: they allow the module to call functions that the kernel does not export.
I'm sick of how ugly such a crutch is, but I have no choice.

That is why this work is important.