grosser / kennel

Datadog monitors/dashboards/slos as code, avoid chaotic management via UI

validate changed items using datadog api

grosser opened this issue

See https://docs.datadoghq.com/workflows/actions_catalog/monitor_validatemonitor/
That only works for monitors ... for other resources, one option could be to "apply + revert" and then collect the errors from that
... or to validate with a deliberate error, so the response contains the "expected failure + unknown failures", and then sort out the unknown ones.

POC

desc "Validate resources against datadog api using generated/ (atm monitor only) [PROJECT|TRACKING_ID|FILE=]"
task validate: "kennel:environment" do
  files =
    if (project = ENV["PROJECT"])
      Dir["generated/#{project}/*.json"]
    elsif (id = ENV["TRACKING_ID"])
      Dir["generated/#{id.split(":").join("/")}.json"]
    elsif (file = ENV["FILE"])
      [file] # wrap in an array so files.map below works
    else
      raise "Need PROJECT or TRACKING_ID or FILE"
    end

  monitors = files
    .map { |f| [f, JSON.parse(File.read(f))] }
    .select { |_, r| r["api_resource"] == "monitor" }
  raise "No monitors in selected files" if monitors.empty?

  monitors.each do |file, monitor|
    begin
      Kennel::Api.new.create("monitor/validate", monitor)
      puts "#{file}: Valid"
    rescue StandardError => e
      puts "#{file}:"
      body = e.message.split("\n").last # parsing api output from what lib/kennel/api.rb adds
      puts JSON.parse(body).fetch("errors")
    end
  end
end
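
The rescue branch in the POC assumes the last line of the exception message is always a JSON body with an "errors" key. A minimal sketch of a more forgiving version, in case the request fails for another reason (timeout, non-JSON response); `extract_api_errors` is a hypothetical helper name, not something kennel provides:

```ruby
require "json"

# Hypothetical helper: pull the "errors" array out of an exception message
# whose last line is expected to be the raw JSON body of the API response.
# Falls back to the whole message when that assumption does not hold.
def extract_api_errors(message)
  text = message.to_s
  body = text.split("\n").last.to_s
  parsed = JSON.parse(body)
  errors = parsed.is_a?(Hash) ? parsed["errors"] : nil
  errors.is_a?(Array) ? errors : [text]
rescue JSON::ParserError
  [text]
end
```

In the task above this would be used as `puts extract_api_errors(e.message)` inside the rescue, so an unexpected failure prints its message instead of raising a second error.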

To make this work for dashboards, we could add a fake broken widget and then check whether the error coming back is for that widget; if that is the only error, we know the rest of the dashboard is valid
(this way we don't do any real updates ... but it's still risky).
For SLOs we'd need a similar scheme, but it might be even harder / less reliable.
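
The "expected failure + unknown failure" sorting could look roughly like this. It is only a sketch of the classification step, with no API calls: `SENTINEL` and `classify_errors` are hypothetical names, and it assumes the errors come back as an array of strings in which the broken widget is identifiable by a marker we put into it. It also assumes the API reports all errors and not only the first one, which would need to be checked.

```ruby
# Hypothetical marker embedded in the deliberately broken widget.
SENTINEL = "kennel_validation_sentinel"

# Split the returned errors into the failure we provoked and everything else.
def classify_errors(errors, sentinel: SENTINEL)
  expected, unknown = errors.partition { |e| e.to_s.include?(sentinel) }
  status =
    if unknown.any?
      :invalid      # something other than our fake widget was rejected
    elsif expected.any?
      :valid        # only the provoked failure came back
    else
      :inconclusive # sentinel was not rejected, the change may have been applied
    end
  { status: status, unknown: unknown }
end
```

The `:inconclusive` case is the risky one mentioned above: if the API does not reject the fake widget, the request was probably a real update and would need to be reverted.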