mehrdadpfg / hadoop_exporter

A Hadoop exporter for Prometheus that scrapes Hadoop metrics (including HDFS, YARN, MapReduce, HBase, etc.) from the Hadoop components' JMX URLs.

Hadoop Exporter for Prometheus

Exports Hadoop metrics via HTTP for Prometheus consumption.

How to run

python hadoop_exporter.py

Help on flags of hadoop_exporter:

$ python hadoop_exporter.py -h
usage: hadoop_exporter.py [-h] [-c cluster_name] [-hdfs namenode_jmx_url]
                          [-rm resourcemanager_jmx_url] [-dn datanode_jmx_url]
                          [-jn journalnode_jmx_url] [-mr mapreduce2_jmx_url]
                          [-hbase hbase_jmx_url] [-hive hive_jmx_url]
                          [-p metrics_path] [-host ip_or_hostname] [-P port]

hadoop node exporter args, including url, metrics_path, address, port and
cluster.

optional arguments:
  -h, --help            show this help message and exit
  -c cluster_name, --cluster cluster_name
                        Hadoop cluster labels. (default "cluster_indata")
  -hdfs namenode_jmx_url, --namenode-url namenode_jmx_url
                        Hadoop hdfs metrics URL. (default
                        "http://indata-10-110-13-165.indata.com:50070/jmx")
  -rm resourcemanager_jmx_url, --resourcemanager-url resourcemanager_jmx_url
                        Hadoop resourcemanager metrics URL. (default
                        "http://indata-10-110-13-164.indata.com:8088/jmx")
  -dn datanode_jmx_url, --datanode-url datanode_jmx_url
                        Hadoop datanode metrics URL. (default
                        "http://indata-10-110-13-163.indata.com:1022/jmx")
  -jn journalnode_jmx_url, --journalnode-url journalnode_jmx_url
                        Hadoop journalnode metrics URL. (default
                        "http://indata-10-110-13-163.indata.com:8480/jmx")
  -mr mapreduce2_jmx_url, --mapreduce2-url mapreduce2_jmx_url
                        Hadoop mapreduce2 metrics URL. (default
                        "http://indata-10-110-13-165.indata.com:19888/jmx")
  -hbase hbase_jmx_url, --hbase-url hbase_jmx_url
                        Hadoop hbase metrics URL. (default
                        "http://indata-10-110-13-164.indata.com:16010/jmx")
  -hive hive_jmx_url, --hive-url hive_jmx_url
                        Hadoop hive metrics URL. (default
                        "http://ip:port/jmx")
  -p metrics_path, --path metrics_path
                        Path under which to expose metrics. (default
                        "/metrics")
  -host ip_or_hostname, -ip ip_or_hostname, --address ip_or_hostname, --addr ip_or_hostname
                        Polling server on this address. (default "127.0.0.1")
  -P port, --port port  Listen to this port. (default "9130")

Tested on Apache Hadoop 2.7.3
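At its core, the exporter fetches JSON from a component's /jmx servlet and flattens the numeric bean attributes into Prometheus exposition lines. The following is a minimal sketch of that idea, not the project's actual code; the sample payload, metric naming, and the `jmx_to_metrics` helper are illustrative:

```python
import json

# A trimmed sample of what a NameNode's /jmx endpoint returns (illustrative values).
SAMPLE_JMX = json.dumps({
    "beans": [
        {"name": "Hadoop:service=NameNode,name=FSNamesystem",
         "CapacityTotal": 1000, "CapacityUsed": 250}
    ]
})

def jmx_to_metrics(payload, cluster):
    """Flatten numeric JMX bean attributes into Prometheus exposition lines."""
    lines = []
    for bean in json.loads(payload)["beans"]:
        bean_name = bean.get("name", "")
        for key, value in bean.items():
            # Only numeric attributes become metrics; strings like "name" are skipped.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                lines.append(
                    'hadoop_%s{cluster="%s",bean="%s"} %s'
                    % (key.lower(), cluster, bean_name, value)
                )
    return lines

for line in jmx_to_metrics(SAMPLE_JMX, "cluster_indata"):
    print(line)
```

In the real exporter the payload comes from an HTTP GET against the JMX URLs passed on the command line, and the metrics are served under the configured `--path`.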

hadoop_exporter

Usage

You can run each collector under the cmd/ directory individually, like so:

cd hadoop_exporter/cmd
python hdfs_namenode.py -h
# enter the parameters the script asks for.

Alternatively, if you want to run the entire project, you need a webhook/API at url = http://<rest_api_host_and_port>/alert/getservicesbyhost that provides the JMX URLs. The content returned by the webhook or API should look like this:

{
    "cluster_name1": [
        {
            "node1.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node1.fqdn.com:1022/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node1.fqdn.com:60030/jmx"
                },
                "HISTORYSERVER": {
                    "jmx": "http://node1.fqdn.com:19888/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node1.fqdn.com:8480/jmx"
                },
                "NAMENODE": {
                    "jmx": "http://node1.fqdn.com:50070/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "node1.fqdn.com:8042/jmx"
                }
            }
        },
        {
            "node2.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node2.fqdn.com:1022/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node2.fqdn.com:60030/jmx"
                },
                "HIVE_LLAP": {
                    "jmx": "http://node2.fqdn.com:15002/jmx"
                },
                "HIVE_SERVER_INTERACTIVE": {
                    "jmx": "http://node2.fqdn.com:10502/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node2.fqdn.com:8480/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "http://node2.fqdn.com:8042/jmx"
                }
            }
        },
        {
            "node3.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node3.fqdn.com:1022/jmx"
                },
                "HBASE_MASTER": {
                    "jmx": "http://node3.fqdn.com:16010/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node3.fqdn.com:60030/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node3.fqdn.com:8480/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "http://node3.fqdn.com:8042/jmx"
                },
                "RESOURCEMANAGER": {
                    "jmx": "http://node3.fqdn.com:8088/jmx"
                }
            }
        }
    ],
    "cluster_name2": [
        {
            "node4.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node4.fqdn.com:1022/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node4.fqdn.com:60030/jmx"
                },
                "HISTORYSERVER": {
                    "jmx": "http://node4.fqdn.com:19888/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node4.fqdn.com:8480/jmx"
                },
                "NAMENODE": {
                    "jmx": "http://node4.fqdn.com:50070/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "node4.fqdn.com:8042/jmx"
                }
            }
        },
        {
            "node5.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node5.fqdn.com:1022/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node5.fqdn.com:60030/jmx"
                },
                "HIVE_LLAP": {
                    "jmx": "http://node5.fqdn.com:15002/jmx"
                },
                "HIVE_SERVER_INTERACTIVE": {
                    "jmx": "http://node5.fqdn.com:10502/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node5.fqdn.com:8480/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "http://node5.fqdn.com:8042/jmx"
                }
            }
        },
        {
            "node6.fqdn.com": {
                "DATANODE": {
                    "jmx": "http://node6.fqdn.com:1022/jmx"
                },
                "HBASE_MASTER": {
                    "jmx": "http://node6.fqdn.com:16010/jmx"
                },
                "HBASE_REGIONSERVER": {
                    "jmx": "http://node6.fqdn.com:60030/jmx"
                },
                "JOURNALNODE": {
                    "jmx": "http://node6.fqdn.com:8480/jmx"
                },
                "NODEMANAGER": {
                    "jmx": "http://node6.fqdn.com:8042/jmx"
                },
                "RESOURCEMANAGER": {
                    "jmx": "http://node6.fqdn.com:8088/jmx"
                }
            }
        }
    ]
}
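The structure above maps each cluster to a list of nodes, and each node to its components' JMX URLs. A small sketch of walking that response to find the JMX URLs for one host might look like the following (the payload and the `jmx_urls_for_host` helper are illustrative, not the project's actual code):

```python
import json

def jmx_urls_for_host(payload, hostname):
    """Walk the webhook response and collect (cluster, component, jmx_url)
    tuples for the given host."""
    found = []
    for cluster, nodes in json.loads(payload).items():
        for node in nodes:
            for fqdn, components in node.items():
                if fqdn != hostname:
                    continue
                for component, info in components.items():
                    found.append((cluster, component, info["jmx"]))
    return found

# Illustrative payload in the same shape as the example above.
sample = json.dumps({
    "cluster_name1": [
        {"node1.fqdn.com": {
            "DATANODE": {"jmx": "http://node1.fqdn.com:1022/jmx"},
            "NAMENODE": {"jmx": "http://node1.fqdn.com:50070/jmx"}}}
    ]
})
for entry in jmx_urls_for_host(sample, "node1.fqdn.com"):
    print(entry)
```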

Then you can run:

# -s is the REST API or webhook URL mentioned above; it must be in <host:port> format, with no scheme or path (I know it's ugly).
# -P (uppercase) is the port on which hadoop_exporter exposes metrics; you can then get metrics from http://<hostname>:9131/metrics
python hadoop_exporter.py -s "<rest_api_host_and_port>" -P 9131

One more thing: you should run all of these steps on every Hadoop node.
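With the exporter running on every node (on port 9131 in the example above), Prometheus can scrape them all with a static config like the following; the hostnames are placeholders:

```yaml
scrape_configs:
  - job_name: 'hadoop'
    static_configs:
      - targets:
          - 'node1.fqdn.com:9131'
          - 'node2.fqdn.com:9131'
          - 'node3.fqdn.com:9131'
```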

MAYBE I'll improve this project for general use.
