stvoutsin / yarncleaner

Python lib for running a Yarn monitor, and clearing up temporary space if above a given threshold

Yarn Cleaner

This module can be used to monitor and clean the temporary directories that are populated during Yarn jobs.

Features

  • Checks the disk usage of the Yarn temporary directories and kills the application that is using more than a given threshold
  • Connects to a remote node via SSH and runs commands
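The two features above can be pictured with a minimal sketch. This is not the module's own implementation: the function names `parse_df_percent`, `is_over_threshold` and `remote_disk_percent` are hypothetical, and the use of `df -P` to measure usage is an assumption. Only the paramiko calls (`SSHClient`, `connect`, `exec_command`) are real library API. Killing the offending Yarn application is not covered here.

```python
import shlex


def parse_df_percent(df_output: str) -> int:
    """Return the usage percentage from the output of `df -P <path>`."""
    lines = df_output.strip().splitlines()
    # The last line is the data row; its fifth column is "Capacity", e.g. "90%".
    fields = lines[-1].split()
    return int(fields[4].rstrip("%"))


def is_over_threshold(percent_used: int, threshold_percent: int) -> bool:
    """True when usage is strictly above the threshold."""
    return percent_used > threshold_percent


def remote_disk_percent(host: str, username: str, key_file: str, path: str) -> int:
    """Run `df -P` on a remote node over SSH and return the usage percentage."""
    # paramiko is third-party; it is imported here so that the helpers above
    # can be used without it installed.
    import paramiko

    client = paramiko.SSHClient()
    # Assumption for the sketch: unknown host keys are accepted automatically.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(host, username=username, key_filename=key_file)
        _stdin, stdout, _stderr = client.exec_command(f"df -P {shlex.quote(path)}")
        return parse_df_percent(stdout.read().decode())
    finally:
        client.close()
```

The parsing step assumes the filesystem name contains no spaces. Accepting unknown host keys automatically is convenient for a sketch but may not be appropriate on a shared cluster.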

Prerequisites

  • python3
  • paramiko
  • argparse (part of the Python 3 standard library)

Installation

Clone this repository:

git clone https://github.com/stvoutsin/yarncleaner.git

cd into the repository:

cd yarncleaner

Install dependencies:

pip install -r requirements.txt

Usage

Using the Command Line

This module can be run from the command line with the following command:

python yarncleaner.py [--workers [WORKERS [WORKERS ...]]] [--ssh-username SSH_USERNAME] [--ssh-key-file SSH_KEY_FILE] [--usercache-dir USERCACHE_DIR] [--threshold-percent THRESHOLD_PERCENT]

Arguments

Argument             Description
--workers            Space-separated list of worker nodes to check
--ssh-username       The username to use for SSH
--ssh-key-file       The path to the private key file to use for SSH
--usercache-dir      The directory where the usercache is located
--threshold-percent  The disk usage percentage above which a clean is triggered
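To show how these arguments fit together, here is a minimal sketch of an `argparse` parser that accepts the same flags. It is an illustration, not the module's actual parser: the function name `build_parser`, the empty default for `--workers`, and the `int` type for `--threshold-percent` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags listed in the table above."""
    parser = argparse.ArgumentParser(prog="yarncleaner.py")
    # nargs="*" accepts zero or more space-separated worker names.
    parser.add_argument("--workers", nargs="*", default=[],
                        help="A list of workers")
    parser.add_argument("--ssh-username",
                        help="The username to use for SSH")
    parser.add_argument("--ssh-key-file",
                        help="The path to the private key file to use for SSH")
    parser.add_argument("--usercache-dir",
                        help="The directory where the usercache is located")
    # Assumption: the threshold is a whole-number percentage.
    parser.add_argument("--threshold-percent", type=int,
                        help="The percentage of disk usage to trigger a clean")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(
    ["--workers", "worker01", "worker02", "--threshold-percent", "80"]
)
print(args.workers, args.threshold_percent)
```

Note that argparse turns the dashes in flag names into underscores, so `--ssh-key-file` is read back as `args.ssh_key_file`.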

Using the API

This module can also be used as an API:

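from yarncleaner import YarnCleaner

# Worker nodes to check, and the SSH credentials used to reach them
workers = ["worker01", "worker02", "worker03"]
ssh_username = "username"
ssh_key_file = "/path/to/key/file.pem"

cleaner = YarnCleaner(workers=workers, ssh_username=ssh_username, ssh_key_file=ssh_key_file)
# Check the workers and clean up where usage is above the threshold
cleaner.clean()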

License

This module is licensed under the GNU General Public License v3.0.