canonical / hotsos

Software analysis toolkit. Define checks in high-level language and leverage library to perform analysis of common Cloud applications.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Overview

Hotsos is a framework for software-defined analysis. It provides a library of plugins written in Python along with a high-level language in which to implement checks and analysis and report problems. Supported plugins include:

  • Openstack
  • Kubernetes
  • Ceph
  • Open vSwitch
  • Juju
  • MAAS
  • Vault
  • MySQL
  • RabbitMQ
  • and more

Each plugin has a set of associated checks or "scenarios" that run in the context of that plugin and seek to identify issues. The output of running hotsos is a summary produced by each plugin including key information about the runtime of that application along with any issues detected. This summary also aims to contain as much information needed to aid manual analysis beyond the automated checks and is easily extensible.

The default summary format is yaml and a number of other options and formats are provided.

Hotsos is either run directly against a host or a sosreport.

The code has the following structure:

  • core library (includes plugins)
  • yaml-defined checks (see documentation at defs)
  • plugin extensions e.g. summary output
  • tests

Usage

Let's say for example that you are running an Openstack Cloud and one of your hypervisor nodes that also happens to be running part of a Ceph storage cluster is experiencing a problem with network connectivity to workloads. You can simply run hotsos either against a sosreport generated from that node or on that node directly as follows:

ubuntu@ncpu1$ hotsos -s
INFO: analysing localhost /
INFO: output saved to hotsos-output-1673868979

Now you will find a folder called hotsos-output-1673868979 containing a summary of information in a number of different formats. Taking the most common yaml format we can see:

ubuntu@ncpu1$ cat hotsos-output-1673868979/ncpu1.summary.yaml

This file will contain a per-plugin summary of information found along with any issues detected. By default hotsos will only look at the last 24 hours of logs. You can increase this with --all-logs which will by default give you 7 days worth and if you want more you can use --max-logrotate-depth <days>.

Our folder also contains other formats of the same information and one of those is json which can easily be queried using a tool called jq. Using this useful tool we can easily query for specific information e.g.

ubuntu@ncpu1$ jq -r '.storage."potential-issues"' hotsos-output-1673868979/ncpu1.summary.json
{
  "BcacheWarnings": [
    "One or more of the following bcache bdev config assertions failed: sequential_cutoff eq \"0.0k\"/actual=\"4.0M\", cache_mode eq \"writethrough [writeback] writearound none\"/actual=\"writethrough [writeback] writearound none\", writeback_percent ge 10/actual=\"10\" (origin=storage.auto_scenario_check)",
    "One or more of the following bcache cacheset config assertions failed: congested_read_threshold_us eq 0/actual=\"2000\", congested_write_threshold_us eq 0/actual=\"20000\" (origin=storage.auto_scenario_check)"
  ]
}

Examples

An example full (yaml) summary can be found here

An example short (yaml) summary can be found here

Install

HotSOS is distributed using the following methods:

pypi

You can install using Python pip e.g.

$ sudo apt install python3-pip
$ pip install hotsos

NOTE: currently requires Python >= 3.8

snap

You can install as a snap e.g.

$ sudo apt install snapd
$ sudo snap install hotsos --classic

See https://snapcraft.io/hotsos for more info on usage.

or run from source e.g.

$ git clone https://github.com/canonical/hotsos
$ pip install -r hotsos/requirements.txt
$ ./hotsos/scripts/hotsos

About

Software analysis toolkit. Define checks in high-level language and leverage library to perform analysis of common Cloud applications.

License:Apache License 2.0