cloudfoundry / lager

An opinionated logger for Go.

"Fatal" log level is not json compliant

tscolari opened this issue

While trying to parse log files created with lager, we found a situation where the output is not valid JSON.

The Fatal log level exits with a panic, which leaves non-JSON content in the log file (and possibly duplicated content, since the same message has just been logged).
As a suggestion: instead of panicking, could it simply exit with an error status and append the stack trace to the log line's metadata?

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: https://www.pivotaltracker.com/story/show/99690506.

Hi, @tscolari,

Thanks for asking about this. First, Fatal doesn't exit with a panic; it merely panics with the provided error. If you want to stop that panic and do something else, you can defer a recover call farther up the call stack to trap it.

If that panic does propagate out to the top of a goroutine, the Go runtime will exit the program and print the panic payload and a full goroutine dump to stderr. If you have a sink registered on your lager.Logger instance that also writes to stderr, then it is true that this goroutine dump will pollute the lager JSON output, but this is also the case for any panic that crashes your program. The components in Cloud Foundry's diego-release typically emit their lager logs to stdout to prevent mixing these streams of data.

There's also potentially more information in that panic output than in the fatal lager output: that lager output contains the state and stack trace for only the calling goroutine, but the panic output contains it for all the running goroutines in the program.

One benefit of panicking instead of calling os.Exit() is to give any deferred functions in that goroutine a chance to run as the panic propagates up the call stack. Exiting simply halts the program immediately without any opportunity to run those deferred functions, which may do desired resource cleanup. This also makes it possible to test the functionality of the Fatal method in-process using ginkgo, whereas that seems impossible with an os.Exit() call, as presumably the ginkgo test process itself would exit.
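To illustrate the point about deferred functions, here is a small standard-library sketch. The names work, runWork, and cleanupRan are made up for this example; the panic inside work stands in for a Fatal-style call.

```go
package main

import (
	"errors"
	"fmt"
)

// cleanupRan records whether the deferred cleanup executed.
var cleanupRan bool

// work defers a cleanup step and then panics, as a Fatal-style call would.
func work() {
	defer func() { cleanupRan = true }() // runs while the panic unwinds the stack
	panic(errors.New("boom"))
}

// runWork calls work and recovers the panic so the outcome can be inspected
// in-process, which is roughly what a test would do.
func runWork() (recovered interface{}) {
	defer func() { recovered = recover() }()
	work()
	return nil
}

func main() {
	r := runWork()
	fmt.Println("recovered:", r, "cleanup ran:", cleanupRan)
	// prints "recovered: boom cleanup ran: true"
}
```

Had work called os.Exit(1) instead of panicking, the deferred cleanup would not have run and the process, including any test runner hosting it, would have ended on the spot.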

As a minimal example, the following program in main.go, exercised via go run main.go > out 2> err,

package main

import (
    "errors"
    "os"

    "github.com/pivotal-golang/lager"
)

func main() {
    logger := lager.NewLogger("yoikes")
    logger.RegisterSink(lager.NewWriterSink(os.Stdout, lager.FATAL))

    logger.Fatal("kaboom", errors.New("boom"))
}

produces output files of the form

$ cat out
{"timestamp":"1437696732.172485352","source":"yoikes","message":"yoikes.kaboom","log_level":3,"data":{"error":"boom","trace":"goroutine 1 [running]:\ngithub.com/pivotal-golang/lager.(*logger).Fatal(0x208212000, 0x1079f0, 0x6, 0x22081e8740, 0x2081e42a0, 0x0, 0x0, 0x0)\n\t/Users/pivotal/go/src/github.com/pivotal-golang/lager/logger.go:131 +0xc8\nmain.main()\n\t/Users/pivotal/workspace/scratch/lager-fatal/main.go:14 +0x43e\n"}}

$ cat err
panic: boom

goroutine 1 [running]:
github.com/pivotal-golang/lager.(*logger).Fatal(0x208212000, 0x1079f0, 0x6, 0x22081e8740, 0x2081e42a0, 0x0, 0x0, 0x0)
    /Users/pivotal/go/src/github.com/pivotal-golang/lager/logger.go:152 +0x5d0
main.main()
    /Users/pivotal/workspace/scratch/lager-fatal/main.go:14 +0x43e

goroutine 2 [runnable]:
runtime.forcegchelper()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/proc.go:90
runtime.goexit()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/asm_amd64.s:2232 +0x1

goroutine 3 [runnable]:
runtime.bgsweep()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/mgc0.go:82
runtime.goexit()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/asm_amd64.s:2232 +0x1

goroutine 4 [runnable]:
runtime.runfinq()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/malloc.go:712
runtime.goexit()
    /usr/local/Cellar/go/1.4.2/libexec/src/runtime/asm_amd64.s:2232 +0x1
exit status 2
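With the streams separated like this, the out file holds one JSON object per line and can be parsed directly. The following is only a sketch of such a parser: the logLine struct and parseLines helper are hypothetical names, the field set is taken from the sample line above, and the sample in main is that line with its trace field left out for brevity.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// logLine mirrors the fields visible in the sample output above.
type logLine struct {
	Timestamp string                 `json:"timestamp"`
	Source    string                 `json:"source"`
	Message   string                 `json:"message"`
	LogLevel  int                    `json:"log_level"`
	Data      map[string]interface{} `json:"data"`
}

// parseLines decodes one JSON object per non-empty line and returns an error
// on the first line that is not valid JSON.
func parseLines(input string) ([]logLine, error) {
	var out []logLine
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		text := strings.TrimSpace(scanner.Text())
		if text == "" {
			continue
		}
		var l logLine
		if err := json.Unmarshal([]byte(text), &l); err != nil {
			return nil, fmt.Errorf("not JSON: %q: %v", text, err)
		}
		out = append(out, l)
	}
	return out, scanner.Err()
}

func main() {
	// The sample line from above, with the "trace" field omitted for brevity.
	clean := `{"timestamp":"1437696732.172485352","source":"yoikes","message":"yoikes.kaboom","log_level":3,"data":{"error":"boom"}}`

	lines, err := parseLines(clean)
	fmt.Println(len(lines), err)

	// If the panic output shared the same stream, parsing would fail here.
	_, err = parseLines(clean + "\npanic: boom\n")
	fmt.Println(err != nil)
}
```

Note that bufio.Scanner has a default per-line size limit, so very long log lines would need a larger buffer set via its Buffer method.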

@tedsuo was primarily responsible for the development of the lager package last year, so if there are additional reasons for this design he may be able to explain them.

Thanks again,
Eric, CF Runtime Diego PM

Hi Eric,

Thanks for your detailed response, it was very enlightening.
It makes perfect sense to me; I believe we were misusing the tool :)

Many thanks again!

Thanks, @tscolari!