Add generic boot disk resource
achetronic opened this issue
Hello there. When creating VMs for Talos on a NAT network created with `libvirt_network`, everything works fine.
The problem comes with macvtap. When creating the VMs in macvtap mode, you usually need cloud-init to instruct the OS, from the inside, to obtain an IP or enable DHCP so that networking works properly (as I do in github.com/achetronic/metal-cloud).
Talos does not work well in this setup: it starts and everything seems OK, but it never connects. The net0 device gets an IP inside the proper range, but when you look at the ARP table, this IP is like a ghost. One way to fix it is to mount a machineconfig.yaml to configure some initial settings before configuring the rest through the API, BUT this is not possible with this provider: it does not include a resource to mount disks that are not for cloud-init.
Is there a possibility to implement this? :)
> One way to fix it is to mount a machineconfig.yaml to configure some initial stuff before configuring the rest of the things through the API, BUT this is not possible with this provider: it does not include a resource to mount disks that are not for cloud-init
What exactly do you mean by "mount a machineconfig.yaml" and "mount disks that are not for cloud-init"?
I personally use this provider to create VMs for Talos. I create an ISO containing the machine config, label it with `metal-iso` and configure it on the domain as a `disk { file = ... }`. The `talos.config=metal-iso` kernel arg allows booting with this config, which I configure using factory.talos.dev ever since its recent release.
I haven't run into any problems with `cloud-init` support in this provider.
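For readers following along, the approach described in this comment might look roughly like the sketch below. It is not the commenter's actual configuration: the resource name, the sizes and the ISO path are placeholders, and it assumes the boot media (built with `talos.config=metal-iso` through factory.talos.dev) is attached separately.

```hcl
# Sketch only: names, sizes and paths below are hypothetical placeholders.
resource "libvirt_domain" "talos_node" {
  name   = "talos-node"
  memory = 2048
  vcpu   = 2

  # ISO built beforehand, containing the machine config and labelled "metal-iso".
  # Talos looks for this label when booted with talos.config=metal-iso.
  disk {
    file = "/path/to/machine-config.iso"
  }

  # Boot media and system disk omitted for brevity.
}
```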
Hello @michaelbeaumont, I am trying your approach, using the kernel, the initrd and some kernel args directly (no config introduced yet), but once the VM is launched it seems to reboot constantly.
I'm doing it this way because, as I said in another issue, there is no way to provide kernel args to the `talos.iso` image other than using the Talos vmlinuz directly and setting the args on it through the `libvirt_domain.kernel` and `libvirt_domain.cmdline` fields.
I assume that when you use the kernel and the initrd, no config is applied for the network or the rest of the initial setup.
Could you provide your YAML (redacted, of course, or by DM to @achetronic on Telegram) so I can test it from the metal-iso and give more feedback?
Thank you in advance
Just in case this is useful for someone, with this Terraform code libvirt is able to start VMs for Talos (networking included):
```hcl
# Create a dir where all the volumes will be created
resource "libvirt_pool" "volume_pool" {
  name = "vms-volume-pool"
  type = "dir"
  path = "/opt/libvirt/vms-volume-pool"
}

resource "libvirt_volume" "kernel" {
  source = "https://github.com/siderolabs/talos/releases/download/${var.globals.talos.version}/vmlinuz-amd64"
  name   = "kernel-${var.globals.talos.version}"
  pool   = libvirt_pool.volume_pool.name
  format = "raw"
}

resource "libvirt_volume" "initrd" {
  source = "https://github.com/siderolabs/talos/releases/download/${var.globals.talos.version}/initramfs-amd64.xz"
  name   = "initrd-${var.globals.talos.version}"
  pool   = libvirt_pool.volume_pool.name
  format = "raw"
}

# General purpose volumes for all the instances
resource "libvirt_volume" "instance_disk" {
  for_each = var.instances

  name   = join("", [each.key, ".qcow2"])
  pool   = libvirt_pool.volume_pool.name
  format = "qcow2"

  # 10GB (as bytes) as default
  size = try(each.value.disk, 10 * 1000 * 1000 * 1000)
}

resource "libvirt_domain" "instance" {
  for_each = var.instances

  cpu {
    mode = "host-passthrough"
  }

  xml {
    xslt = file("${path.module}/templates/xsl/cdrom-fixes.xsl")
  }

  # Set config related directly to the VM
  name   = each.key
  memory = each.value.memory
  vcpu   = each.value.vcpu

  # Use UEFI capable machine
  machine  = "q35"
  firmware = "/usr/share/OVMF/OVMF_CODE.fd"

  # You may be wondering why I'm using these params directly instead of the released metal ISO image.
  # Well, hard to say, but you cannot set kernel params on a prebuilt image...
  # and I wanted to set some initial things through the machine config YAML at this stage
  initrd = libvirt_volume.initrd.id
  kernel = libvirt_volume.kernel.id

  # Ref: https://www.talos.dev/v1.6/reference/kernel/
  cmdline = [{
    # Args retrieved directly from the ISO image
    console                = "ttyS0"    # Serial console for kernel output.
    consoleblank           = 0          # Control auto-blanking of the console after inactivity (0 to disable).
    "nvme_core.io_timeout" = 4294967295 # Set the I/O timeout for NVMe devices, in seconds (max value).
    "printk.devkmsg"       = "on"       # Allow unrestricted logging to /dev/kmsg.
    ima_template           = "ima-ng"   # Specify the Integrity Measurement Architecture (IMA) template to use.
    ima_appraise           = "fix"      # Configure IMA file appraisal mode (e.g., "fix" to repair).
    ima_hash               = "sha512"   # Set the hash algorithm used by IMA to verify file integrity.

    # Required (and recommended) args by Talos Team
    "talos.platform" = "metal" # Platform for running Talos (e.g., "metal" for physical hardware).
    pti              = "on"    # Enable Page Table Isolation (PTI) vulnerability mitigation.
    init_on_alloc    = 1       # Zero-initialize allocated memory pages (1 to enable, 0 to disable).

    #"talos.config" = "metal-iso" # Load the machine config from an attached ISO labelled "metal-iso".
    #"talos.hostname" = each.key
    #"talos.experimental.wipe" = "system"
  }, {
    # A key can only appear once per map (a repeated key overwrites the earlier one),
    # so the second console goes in a separate map.
    console = "tty0"         # Virtual terminal console for kernel output.
    _       = "slab_nomerge" # "_" passes a bare argument without a key; slab_nomerge disables merging of slab caches.
  }]

  # Attach MACVTAP networks
  dynamic "network_interface" {
    for_each = each.value.networks
    iterator = network

    content {
      macvtap        = network.value.interface
      hostname       = each.key
      mac            = network.value.mac
      addresses      = network.value.addresses
      wait_for_lease = false
      # The guest's virtual network interface is connected directly to a physical device on the host.
      # As a result, the requested IP address can only be claimed by the guest OS
      # (on Linux cloud images, cloud-init configures it in static mode).
    }
  }

  disk {
    volume_id = libvirt_volume.instance_disk[each.key].id
    scsi      = true
  }

  # IMPORTANT: this is a known bug on cloud images, since they expect a console
  # we need to pass it
  # https://bugs.launchpad.net/cloud-images/+bug/1573095
  console {
    type        = "pty"
    target_port = "0"
    target_type = "serial"
  }

  console {
    type        = "pty"
    target_port = "1"
    target_type = "virtio"
  }

  video {
    type = "qxl"
  }

  graphics {
    # Not using 'spice' to keep using cockpit GUI with ease :)
    type        = "vnc"
    listen_type = "address"
    autoport    = true
  }

  qemu_agent = false
  autostart  = true

  lifecycle {
    ignore_changes = [
      nvram,
      disk[0],
      network_interface[0],
    ]
  }
}
```
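The code above reads `var.globals` and `var.instances` without showing their declarations. A possible shape, inferred only from how the attributes are used, is sketched below. The types are assumptions rather than the author's actual definitions, and `optional()` with a default needs Terraform 1.3 or later.

```hcl
# Hypothetical declarations, inferred from the attributes referenced above.
variable "globals" {
  type = object({
    talos = object({
      version = string # Release tag used in the download URLs
    })
  })
}

variable "instances" {
  # Map key = VM name (used as each.key for the domain name and hostname)
  type = map(object({
    memory = number                        # RAM in MiB
    vcpu   = number                        # Number of virtual CPUs
    disk   = optional(number, 10000000000) # Disk size in bytes (10 GB default)
    networks = list(object({
      interface = string       # Host device the macvtap attaches to
      mac       = string       # MAC address for the guest NIC
      addresses = list(string) # IP addresses expected on this NIC
    }))
  }))
}
```

With a typed default on `disk`, the `try(...)` fallback in `libvirt_volume.instance_disk` becomes redundant but harmless.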
The file named `cdrom-fixes.xsl`:
```xml
<?xml version="1.0" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Fix: Connect a cdrom device on SATA instead of IDE bus -->
  <xsl:template match="/domain/devices/disk[@device='cdrom']/target/@bus">
    <xsl:attribute name="bus">
      <xsl:value-of select="'sata'"/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
```
In theory, there is a simpler VM definition able to work with Talos, as shown in this example:
https://github.com/siderolabs/contrib/blob/main/examples/terraform/advanced/main.tf#L182-L217