mozilla-services / hindsight

Hindsight - light weight data processing skeleton

A "Hello World" example

deric opened this issue

Are there any simple examples of how to use hindsight, e.g. parsing syslog, or just a simple tutorial on how to parse a log file and output to stdout with debugging info?

Please check this post on the hindsight mailing list => https://mail.mozilla.org/pipermail/hindsight/2016-October/000008.html
If it answers your question, please close the issue.

I am open to suggestions, but it is unclear how useful a simple example would be. For instance, I could show a simple polling input that injects a message once a second, followed by an analysis counter and an output sandbox. However, this is no different from the basic documentation provided for all sandboxes. The only thing you could actually reuse when applying it to your own problem would be the main hindsight.cfg described here: https://github.com/mozilla-services/hindsight/blob/master/docs/configuration.md

The quick start is really just to set up a directory structure like this (with the hindsight.cfg example from the docs and the selection of sandboxes/plugins you desire):

.
├── hindsight.cfg
├── load
│   ├── analysis
│   ├── input
│   └── output
└── run
    ├── analysis
    │   ├── counter.cfg
    │   └── counter.lua
    ├── input
    │   ├── once_a_second.cfg
    │   └── once_a_second.lua
    └── output
        ├── stats.cfg
        └── stats.lua

FYI: The load directory structure is only necessary if you have dynamic loading configured.
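
For anyone who wants to script it, the layout above can be sketched with a few shell commands. This is only an illustration: the ROOT location is a hypothetical choice, and the .cfg and .lua files are created as empty placeholders that still have to be filled in from the configuration docs and the sandboxes you pick.

```shell
#!/bin/sh
# Sketch: create the quick-start directory layout shown in the tree above.
# ROOT is a hypothetical location (not from the hindsight docs); adjust as needed.
set -eu
ROOT="${ROOT:-./hindsight-quickstart}"

for kind in analysis input output; do
    # run/ holds the plugins hindsight starts with
    mkdir -p "$ROOT/run/$kind"
    # load/ is only needed when dynamic loading is configured
    mkdir -p "$ROOT/load/$kind"
done

# Empty placeholders named after the files in the tree; they contain no
# working configuration or plugin code yet.
touch "$ROOT/hindsight.cfg"
touch "$ROOT/run/input/once_a_second.cfg" "$ROOT/run/input/once_a_second.lua"
touch "$ROOT/run/analysis/counter.cfg" "$ROOT/run/analysis/counter.lua"
touch "$ROOT/run/output/stats.cfg" "$ROOT/run/output/stats.lua"

# List what was created so it can be compared against the tree.
find "$ROOT" | sort
```

If dynamic loading is not configured, the load/ line can simply be dropped.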

Then run hindsight hindsight.cfg 7 (the trailing 7 sets the log level to debug).

@simonpasquier Thanks, that's indeed very helpful. Maybe add information like that to a getting started file? Or something like a "migrating from Heka" guide.

@trink It took me a while to wrap my head around the whole hindsight concept. For example, there's no mention of the lua_sandbox_extensions project on the first README page. I know that Kafka and Elasticsearch are not part of this project, but it would be useful for newcomers to understand the possibilities (maybe just list some of the extensions that are already available).

The configuration page is OK, but it goes into too much detail if you're just starting with the project.

The systemd module is not really documented. Is there a way to process the output of journalctl -f?

@deric: The systemd module was initially pulled in for sd_listen_fds(). It is required in order to listen directly on syslog.socket (see https://www.freedesktop.org/wiki/Software/systemd/syslog/).

In my current test setup, I have replaced rsyslog with hindsight. It works well (with some pull requests not yet merged). I should document this in a blog post.

Plugging into the journal can be done by writing a new input that uses the journal API. See sd_journal_* at https://github.com/daurnimator/lua-systemd/blob/master/README.md.

You'll gain extra fields, but you'll probably get worse throughput (the journal API is sloooow).

See also my TODO list at https://gist.github.com/sathieu/5a7e83d514638f396e17d462f13adee0

Thanks a lot for all the hints!