ossobv / proxmove

Migrate virtual machines between different Proxmox VE clusters with minimal downtime

ERROR: Failed to create VM (at target cluster)

megabert opened this issue:

Hi,

I'm having issues when migrating:

# ./proxmove --no-verify-ssl px04 pxc01 px03 localdisk kvm09-xxx

2022-02-02 14:24:20,067: INFO: Attempt moving px04<6ddebafe> => pxc01<6ddebafe> (node 'px03'): kvm09-xxx
2022-02-02 14:24:20,067: INFO: - source VM kvm09-xxx@px04<qemu/100/running>
2022-02-02 14:24:20,067: INFO: - storage 'ide2': None,media=cdrom (host=<unknown>, guest=<unknown>)
2022-02-02 14:24:20,067: INFO: - storage 'scsi0': localdisk:100/vm-100-disk-0.qcow2,cache=unsafe,discard=on,size=10G (host=10.0GiB, guest=10.0GiB)
2022-02-02 14:24:20,070: INFO: Creating new VM 'kvm09-xxx--CREATING' on 'pxc01', node 'px03'
2022-02-02 14:24:20,100: ERROR: Failed to create VM with parameters:

  # https://px03.xxx:8006/api2/json
  api.nodes("px03/qemu").create(**{'smbios1': 'uuid=9bdc528d-3a9a-424c-8cb7-5944832d741e', 'tags': 'prod', 'ostype': 'l26', 'boot': 'cd', 'sockets': 1, 'numa': 0, 'name': 'kvm09-xxx--CREATING', 'bootdisk': 'scsi0', 'net0': 'virtio=12:05:48:BF:7E:91,bridge=vmbr1,firewall=1', 'onboot': 1, 'meta': 'creation-qemu=6.1.0,ctime=1643742755', 'cores': 1, 'memory': 2048, 'agent': '1', 'vmid': 128})

Traceback (most recent call last):
  File "/root/proxmove/./proxmove", line 1983, in _start_moving_vm
    dst_vm = self.dst_pve.get_vm(
  File "/root/proxmove/./proxmove", line 572, in get_vm
    raise ProxmoxVm.DoesNotExist(
ProxmoxVm.DoesNotExist: VM named 'kvm09-xxx' not found in cluster 'pxc01'; do you have the PVEVMAdmin role?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/proxmove/./proxmove", line 477, in create_vm
    vmhash = getattr(api_node, 'qemu').create(**mutable_config)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 135, in create
    return self.post(*args, **data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 126, in post
    return self(args)._request("POST", data=data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 105, in _request
    raise ResourceException(
proxmoxer.core.ResourceException: 400 Bad Request: Parameter verification failed.
Traceback (most recent call last):
  File "/root/proxmove/./proxmove", line 1983, in _start_moving_vm
    dst_vm = self.dst_pve.get_vm(
  File "/root/proxmove/./proxmove", line 572, in get_vm
    raise ProxmoxVm.DoesNotExist(
__main__.DoesNotExist: VM named 'kvm09-xxx' not found in cluster 'pxc01'; do you have the PVEVMAdmin role?

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/proxmove/./proxmove", line 2299, in <module>
    main()
  File "/root/proxmove/./proxmove", line 2295, in main
    vmmover.run(options.dry_run)
  File "/root/proxmove/./proxmove", line 1935, in run
    self.move_vm(vm, translator, dry_run)
  File "/root/proxmove/./proxmove", line 1975, in move_vm
    dst_vm = self._start_moving_vm(src_vm, translator)
  File "/root/proxmove/./proxmove", line 1989, in _start_moving_vm
    dst_vm = self.dst_pve.create_vm(
  File "/root/proxmove/./proxmove", line 477, in create_vm
    vmhash = getattr(api_node, 'qemu').create(**mutable_config)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 135, in create
    return self.post(*args, **data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 126, in post
    return self(args)._request("POST", data=data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 105, in _request
    raise ResourceException(
proxmoxer.core.ResourceException: 400 Bad Request: Parameter verification failed.
root@px04:~/proxmove# 

This is my .proxmoverc on both nodes:

# pxc01 is a cluster with 4 nodes
[pve:pxc01]
        api=https://adminrobot@pve:secret@px03.xxx:8006

        [storage:pxc01:localdisk@px01]
                ssh=root@px01.xxx
                path=/proxmox/images
                temp=/proxmox/temp

        [storage:pxc01:localdisk@px03]
                ssh=root@px03.xxx
                path=/proxmox/images
                temp=/proxmox/temp

# px04 is a single node
[pve:px04]
        api=https://adminrobot@pve:secret@px04.xxx:8006

        [storage:px04:localdisk@px04]
                ssh=root@px04.xxx
                path=/proxmox/images
                temp=/proxmox/temp

I logged into both clusters with the adminrobot account and the given password, and verified that it has full "Administrator" role privileges. Moving from pxc01 to px04 works only some of the time (when it fails, it fails with the same error shown here), but moving in the other direction never works and always ends with the error above.

All Proxmox nodes are up to date with the no-subscription repo:

pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-3-pve)
pve-manager: 7.1-10 (running version: 7.1-10/6ddebafe)
pve-kernel-helper: 7.1-8
pve-kernel-5.13: 7.1-6
pve-kernel-5.4: 6.4-12
pve-kernel-5.13.19-3-pve: 5.13.19-7
pve-kernel-5.4.162-1-pve: 5.4.162-2
pve-kernel-5.4.140-1-pve: 5.4.140-1
ceph-fuse: 14.2.21-1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: 0.8.36+pve1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.1
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-6
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.1-2
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.1-1
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.3.0-1
proxmox-backup-client: 2.1.4-1
proxmox-backup-file-restore: 2.1.4-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-5
pve-cluster: 7.1-3
pve-container: 4.1-3
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-4
pve-ha-manager: 3.3-3
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-pve2
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.2-pve1

I think there are two exceptions and you're only showing us the first one. Am I correct?

That first Traceback is not a problem, as that is caught on L1985:

proxmove/proxmove

Lines 1981 to 1991 in be56486

        try:
            # Check for existing target VM. Happens if we're resuming.
            dst_vm = self.dst_pve.get_vm(
                src_vm.basename, suffix=SUFFIX_CREATING)
        except ProxmoxVm.DoesNotExist:
            # Pristine VMs: start, by translating config and
            # creating a new VM.
            dst_config = translator.config(src_vm.get_config())
            dst_vm = self.dst_pve.create_vm(
                dst_config, nodeid=self.dst_node,
                poolid=src_vm.poolid)
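
For readers wondering why the harmless DoesNotExist traceback still shows up in the output: Python chains an exception raised inside an except block to the one being handled, and prints both. Below is a minimal sketch of that behaviour. The classes and functions are stand-ins written for illustration only; they are not the real proxmove or proxmoxer code.

```python
import traceback


class DoesNotExist(Exception):
    """Stand-in for ProxmoxVm.DoesNotExist (illustration only)."""


class ResourceException(Exception):
    """Stand-in for proxmoxer.core.ResourceException (illustration only)."""


def get_vm(name):
    # Simulate "no half-created VM found on the target cluster".
    raise DoesNotExist(f"VM named {name!r} not found")


def create_vm(config):
    # Simulate the API rejecting the creation request.
    raise ResourceException("400 Bad Request: Parameter verification failed.")


def start_moving_vm(name):
    try:
        return get_vm(name)
    except DoesNotExist:
        # The lookup failure is handled here. If creation fails as well,
        # the new exception records the handled one as its __context__.
        return create_vm({"name": name})


try:
    start_moving_vm("kvm09-xxx")
except ResourceException as exc:
    chained = exc.__context__
    report = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))
    print(report)
```

The printed report contains both tracebacks, joined by the "During handling of the above exception, another exception occurred" line, even though only the second exception actually escaped.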

Ah, indeed, your edits add this one:

Traceback (most recent call last):
  File "/root/proxmove/./proxmove", line 2299, in <module>
    main()
  File "/root/proxmove/./proxmove", line 2295, in main
    vmmover.run(options.dry_run)
  File "/root/proxmove/./proxmove", line 1935, in run
    self.move_vm(vm, translator, dry_run)
  File "/root/proxmove/./proxmove", line 1975, in move_vm
    dst_vm = self._start_moving_vm(src_vm, translator)
  File "/root/proxmove/./proxmove", line 1989, in _start_moving_vm
    dst_vm = self.dst_pve.create_vm(
  File "/root/proxmove/./proxmove", line 477, in create_vm
    vmhash = getattr(api_node, 'qemu').create(**mutable_config)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 135, in create
    return self.post(*args, **data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 126, in post
    return self(args)._request("POST", data=data)
  File "/usr/local/lib/python3.9/dist-packages/proxmoxer/core.py", line 105, in _request
    raise ResourceException(
proxmoxer.core.ResourceException: 400 Bad Request: Parameter verification failed.

Apparently it dislikes one of these:

api.nodes("px03/qemu").create(**{
  'smbios1': 'uuid=9bdc528d-3a9a-424c-8cb7-5944832d741e',
  'tags': 'prod', 'ostype': 'l26', 'boot': 'cd', 'sockets': 1, 'numa': 0,
  'name': 'kvm09-xxx--CREATING', 'bootdisk': 'scsi0',
  'net0': 'virtio=12:05:48:BF:7E:91,bridge=vmbr1,firewall=1',
  'onboot': 1, 'meta': 'creation-qemu=6.1.0,ctime=1643742755',
  'cores': 1, 'memory': 2048, 'agent': '1', 'vmid': 128
})

What you can do is add a breakpoint before L477:

        try:
            import pdb; pdb.set_trace()  # <-- add this line
            vmhash = getattr(api_node, 'qemu').create(**mutable_config)
        except ResourceException:
            log.exception(

Then you can try running the code above manually, whilst leaving out one or more arguments.

E.g.

(Pdb) mydict = {'smbios1': 'uuid=9bdc528d-3a9a-424c-8cb7-5944832d741e', 'tags': 'prod', 'ostype': 'l26', 'boot': 'cd', 'sockets': 1, 'numa': 0, 'name': 'kvm09-xxx--CREATING', 'bootdisk': 'scsi0', 'net0': 'virtio=12:05:48:BF:7E:91,bridge=vmbr1,firewall=1', 'onboot': 1, 'meta': 'creation-qemu=6.1.0,ctime=1643742755', 'cores': 1, 'memory': 2048, 'agent': '1', 'vmid': 128}
(Pdb) tmpdict = mydict.copy()
(Pdb) del tmpdict['tags']
(Pdb) self.api.nodes("px03/qemu").create(**tmpdict)

If that works, then the new Proxmox does not accept the tags parameter.
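
The leave-one-out procedure described above can also be scripted. The following is only a sketch: find_first_rejected_key and fake_create are hypothetical names, and fake_create rejects 'tags' purely as an example. Against a real API you would pass the bound create method instead, and keep in mind that the first successful call really does create the VM, which is why the helper stops there.

```python
def find_first_rejected_key(create, params, required=("vmid", "name"),
                            error_type=Exception):
    """Retry create() with one optional key left out at a time.

    Returns the first key whose removal makes create() succeed, or None
    if no single removal helps. Hypothetical helper, not part of proxmove.
    """
    for key in params:
        if key in required:
            continue  # never drop keys the request cannot do without
        trial = {k: v for k, v in params.items() if k != key}
        try:
            create(**trial)
        except error_type:
            continue  # still rejected, so this key was not the cause
        return key
    return None


def fake_create(**kwargs):
    # Stand-in for the API call; rejects 'tags' for illustration only.
    if "tags" in kwargs:
        raise ValueError("400 Bad Request: Parameter verification failed.")
    return "UPID:fake"


params = {"ostype": "l26", "tags": "prod", "memory": 2048,
          "name": "kvm09-xxx--CREATING", "vmid": 128}
culprit = find_first_rejected_key(fake_create, params)
print(culprit)
```

If two parameters are rejected at the same time, no single removal succeeds and the helper returns None; in that case you would have to leave out several arguments at once by hand.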

Thanks for the support. The patch you hinted at for additional debug info is already applied to my proxmoxer version.

I created a new test-VM.

The issue is caused by the "meta" key:

(Pdb) mydict = {'ostype': 'l26', 'smbios1': 'uuid=5142e2cd-6f73-4ddc-abce-eb75805c3cbb', 'boot': 'cd', 'net0': 'virtio=CA:50:BA:45:1E:88,bridge=vmbr0,firewall=1', 'name': 'testvm--CREATING', 'numa': 0, 'scsihw': 'virtio-scsi-pci', 'sockets': 1, 'cores': 1, 'meta': 'creation-qemu=6.1.0,ctime=1643811510', 'memory': 2048, 'vmid': 128}
(Pdb) tmpdict = mydict.copy()
(Pdb) self.api.nodes("px03/qemu").create(**tmpdict)
*** proxmoxer.core.ResourceException: 400 Bad Request: Parameter verification failed.
(Pdb) del tmpdict['meta']
(Pdb) self.api.nodes("px03/qemu").create(**tmpdict)
'UPID:px03:000B75E2:097D2D73:61FA9776:qmcreate:128:adminrobot@pve:'

A "meta" key is not listed in the Proxmox VE API viewer:

https://pve.proxmox.com/pve-docs/api-viewer/index.html#/nodes/{node}/qemu

This may serve as a first workaround:

+++ proxmove    2022-02-02 15:53:26.243581105 +0100
@@ -466,6 +466,7 @@
         # Guess new VMID, set id and name.
         vmid = self.get_free_vmid()
         mutable_config['vmid'] = vmid
+        mutable_config.pop('meta', None)
         mutable_config['name'] = name_with_suffix
         assert 'hostname' not in mutable_config, mutable_config  # lxc??
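
A slightly more general form of the same workaround, as a sketch: drop unwanted keys into a new dict, so that configs that have no 'meta' key at all do not raise a KeyError and the caller's dict is left untouched. The names UNSUPPORTED_ON_TARGET and strip_unsupported are made up for this example, and the set contains only the one key found in this thread.

```python
# Assumption: only 'meta' is known to be rejected, per the pdb session above.
UNSUPPORTED_ON_TARGET = frozenset({"meta"})


def strip_unsupported(config, unsupported=UNSUPPORTED_ON_TARGET):
    """Return a copy of config without the keys the target is known to reject."""
    return {k: v for k, v in config.items() if k not in unsupported}


src_config = {"ostype": "l26", "memory": 2048,
              "meta": "creation-qemu=6.1.0,ctime=1643811510"}
dst_config = strip_unsupported(src_config)
print(sorted(dst_config))
```

Called on a config without 'meta', the function simply returns an equal copy.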

Nice going. This commit should fix it for you.