NYPL / engineering-general

Standards, values, and other information relevant to the NYPL Engineering Team.

Log Message Standard?

kfriedman opened this issue

Hello (especially @NYPL/recap-data-search, @NYPL/recap-request, @NYPL/recap-ui),

There has been some discussion about logging and standardizing our log messages. Currently, we're using a few different libraries (winston, bunyan, and monolog). Instead of standardizing on a library, maybe we can agree on a log message standard with a "minimum" set of key/values. I'll kick off the discussion with the following proposal:

Log Message Standard

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

The message format MUST be JSON.

The message MUST contain the following top-level keys: level, message:

  • level MUST be a string with one of the following values (case-sensitive), listed in order of severity from least to greatest: DEBUG, INFO, NOTICE, WARNING, ERROR, CRITICAL, ALERT, or EMERGENCY.
  • message MUST be a string and SHOULD contain a message useful for debugging/error reporting.

Additional key/values of any type MAY be included and MUST NOT break functionality.
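
To make the proposal concrete, here is a minimal sketch of a check for a single log line against the rules above. The helper name validateLogLine is hypothetical and not part of any of the libraries mentioned; it only covers the MUST rules for level and message.

```javascript
// Severity order from the proposal, least to greatest.
const LEVELS = ['DEBUG', 'INFO', 'NOTICE', 'WARNING', 'ERROR', 'CRITICAL', 'ALERT', 'EMERGENCY'];

// Hypothetical helper: returns a list of problems found in one log line.
// An empty list means the line satisfies the MUST rules of the proposal.
function validateLogLine(line) {
  let record;
  try {
    record = JSON.parse(line);
  } catch (err) {
    return ['not valid JSON'];
  }
  // Top-level keys only make sense on a JSON object.
  if (record === null || typeof record !== 'object' || Array.isArray(record)) {
    return ['top-level value is not a JSON object'];
  }
  const problems = [];
  // The comparison is case-sensitive, so "info" is rejected.
  if (!LEVELS.includes(record.level)) {
    problems.push('level must be one of: ' + LEVELS.join(', '));
  }
  if (typeof record.message !== 'string') {
    problems.push('message must be a string');
  }
  // Additional keys are allowed and deliberately not inspected.
  return problems;
}
```

Extra keys such as pid or timestamp pass through untouched, which matches the "MAY be included" rule.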


Any thoughts? Should we RECOMMEND or REQUIRE additional key/values?

Winston, out of the box, does not capitalize the values in the level key.
E.g. it will say 'info', not 'INFO', and 'debug', not 'DEBUG'.

I'm fine not capitalizing the level. Some libraries capitalize and some don't. So some people are going to have to do extra work. What do other people think?

For reference. Here are some sample log messages:

winston:
{"level":"info","message":"Closed out remaining connections."}

bunyan:
{"name":"myapp","hostname":"banana.local","pid":40161,"level":30,"msg":"hi","time":"2013-01-04T18:46:23.851Z","v":0}

monolog:
{"message": "Expired token", "level": 200, "level_name": "INFO","channel": "API","datetime": "2017-06-19 12:24:45.337162"}

I'm not sure one can override "level" in bunyan natively (see trentm/node-bunyan#194). You have to intercept the record when it's being written to the stream and use Object.assign.
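
A rough sketch of that interception, assuming Bunyan's "raw" stream type, which hands the record object to write() instead of a string. The function name standardStream and the way Bunyan's six levels fold into the proposed names are assumptions for illustration, not anything agreed in this thread.

```javascript
// Assumption: one possible folding of Bunyan's numeric levels
// (10 trace, 20 debug, 30 info, 40 warn, 50 error, 60 fatal)
// into the level names from the proposal.
const BUNYAN_TO_STANDARD = {
  10: 'DEBUG',
  20: 'DEBUG',
  30: 'INFO',
  40: 'WARNING',
  50: 'ERROR',
  60: 'CRITICAL',
};

// Hypothetical helper: wraps any object with a write(string) method
// (for example process.stdout) in a stream that rewrites each record.
function standardStream(out) {
  return {
    write(rec) {
      // Copy the record, replacing the numeric level and adding "message".
      const rewritten = Object.assign({}, rec, {
        // Assumption: unknown custom levels fall back to INFO.
        level: BUNYAN_TO_STANDARD[rec.level] || 'INFO',
        message: rec.msg,
      });
      delete rewritten.msg;
      out.write(JSON.stringify(rewritten) + '\n');
    },
  };
}

// Wiring, left as a comment so the sketch runs without Bunyan installed:
// const log = bunyan.createLogger({
//   name: 'myapp',
//   streams: [{type: 'raw', stream: standardStream(process.stdout)}],
// });
```

All other Bunyan fields (name, hostname, pid, time, v) are kept as additional key/values.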

You're probably right @nonword. Most of the libraries I've seen have some mechanism to either "format" or rewrite the output. Some devs will likely need to do this to match the standard.
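
As an illustration of what such a rewrite step might look like for the winston and monolog samples above, here is a small sketch. rewriteRecord is a hypothetical name; it assumes the level name is either in level_name (monolog) or in level as a string (winston).

```javascript
const STANDARD_LEVELS = ['DEBUG', 'INFO', 'NOTICE', 'WARNING', 'ERROR', 'CRITICAL', 'ALERT', 'EMERGENCY'];

// Hypothetical helper: returns a copy of the record whose "level"
// is one of the uppercase names from the proposal.
function rewriteRecord(record) {
  // Monolog keeps the name in level_name; winston keeps a lowercase name in level.
  const raw = typeof record.level_name === 'string' ? record.level_name : record.level;
  const level = String(raw).toUpperCase();
  if (!STANDARD_LEVELS.includes(level)) {
    throw new Error('cannot map level: ' + raw);
  }
  // Every other key is carried over unchanged.
  return Object.assign({}, record, {level});
}

const fromWinston = rewriteRecord({level: 'info', message: 'Closed out remaining connections.'});
const fromMonolog = rewriteRecord({message: 'Expired token', level: 200, level_name: 'INFO', channel: 'API'});
console.log(JSON.stringify(fromWinston));
console.log(JSON.stringify(fromMonolog));
```

Names that are not in the list after uppercasing, such as winston's default 'warn', would need an explicit alias table on top of this.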

If we want to have the flexibility to configure levels, we should look at Winston -- https://github.com/winstonjs/winston#logging-levels

It is also very popular in the Node community, and v0.2.0 introduced the ability to customize levels. The implementation is very straightforward, and we could also build a wrapper around it for NYPL services.

As in the example above, the minimal logging output does not include a timestamp, pid, etc.

However, you can easily add it if desired:

winston.info('Hello world!', {timestamp: Date.now(), pid: process.pid});

Output:

info: Hello world! timestamp=1402286804314, pid=80481

One thing to note from the Bunyan example above is that the level is exposed as a numeric value:

{"name":"myapp","hostname":"pwony-2","pid":12616,"level":30,"msg":"hello","time":"2014-05-26T17:58:32.835Z","v":0}

NODE output:

[2014-05-26T18:03:40.820Z] INFO: myapp/13372 on pwony-2: hello

Ideally numeric level values are mapped to string levels. See https://github.com/trentm/node-bunyan#levels

Either one works for me, Winston is a bit more flexible and minimal.

I would propose that we log as much as possible. That's a benefit of using Bunyan, which takes a more machine-readable approach to logging.

Based on your recommendation above to use syslog levels -- Winston can be configured to use the RFC5424 SPEC -- https://en.wikipedia.org/wiki/Syslog#Severity_level

Also: https://tools.ietf.org/html/rfc5424

Note that Winston uses the keyword values to express the level as a string:

See: https://github.com/winstonjs/winston#logging-levels

I think your recommendation uses the 'severity' values, except for 'INFO', which per the docs is 'INFORMATIONAL'.
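
For reference, a small sketch of how the RFC 5424 numeric severities (0 = Emergency through 7 = Debug) line up with the names in the proposal. The function name levelFromSyslogSeverity is hypothetical, and using 'INFO' for severity 6 follows the proposal rather than the RFC's 'Informational'.

```javascript
// Index is the RFC 5424 numeric severity; 0 is the most severe.
const SYSLOG_SEVERITY_TO_LEVEL = [
  'EMERGENCY', // 0 Emergency
  'ALERT',     // 1 Alert
  'CRITICAL',  // 2 Critical
  'ERROR',     // 3 Error
  'WARNING',   // 4 Warning
  'NOTICE',    // 5 Notice
  'INFO',      // 6 Informational (shortened, as in the proposal)
  'DEBUG',     // 7 Debug
];

// Hypothetical helper: converts a syslog severity number to a proposal level name.
function levelFromSyslogSeverity(severity) {
  if (!Number.isInteger(severity) || severity < 0 || severity > 7) {
    throw new RangeError('syslog severity must be an integer from 0 to 7');
  }
  return SYSLOG_SEVERITY_TO_LEVEL[severity];
}
```

Note that syslog counts down as severity increases, while the proposal lists levels from least to greatest.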

@nodanaonlyzuul -- You can configure Winston to format your error output via the formatter callback function: https://github.com/winstonjs/winston#custom-log-format

Notice they uppercase the severity level.

Regarding numeric log levels...my very strong instinct is:

  • You won't need it.
  • People haven't internalized it as much as debug => info => warning => error.
  • You won't need it still
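
One way to see that string levels are enough: filtering by a minimum severity only needs the ordered list from the proposal. This is a sketch, and isAtLeast is a hypothetical name.

```javascript
// Order from the proposal, least to greatest severity.
const SEVERITY_ORDER = ['DEBUG', 'INFO', 'NOTICE', 'WARNING', 'ERROR', 'CRITICAL', 'ALERT', 'EMERGENCY'];

// Hypothetical helper: true when `level` is at least as severe as `threshold`.
function isAtLeast(level, threshold) {
  const levelRank = SEVERITY_ORDER.indexOf(level);
  const thresholdRank = SEVERITY_ORDER.indexOf(threshold);
  if (levelRank === -1 || thresholdRank === -1) {
    throw new Error('unknown level: ' + (levelRank === -1 ? level : threshold));
  }
  return levelRank >= thresholdRank;
}

// Keep only records at WARNING or above.
const records = [
  {level: 'INFO', message: 'started'},
  {level: 'ERROR', message: 'request failed'},
];
const important = records.filter((r) => isAtLeast(r.level, 'WARNING'));
console.log(important.map((r) => r.message));
```

The position in the array plays the role a numeric level would, without exposing numbers in the log output.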

Yeah, @nodanaonlyzuul, you're probably right about numeric log levels.

So, it seems like no one has raised any significant issues with the original standard. I'm going to post it as a file to this repo and then we can further iterate on it with pull requests:

https://github.com/NYPL/engineering-general/blob/master/standards/log_message.md

Thanks everyone for the comments!