Explain why Logger.log() returns error

Question

Explain why Logger.log() returns error

cep21 opened this issue 9 years ago · comments

Hi,

I think go-kit/kit/log reflects my personal thoughts on logging, but I am hesitant to use it directly due to Logger.log() returning an error. In my opinion errors are intended to be checked, not ignored. And it's unreasonable to ask users to check log messages. I feel a better philosophy is for the formatters/writers that eventually output messages to disk/tcp/stdout/etc to allow registering handlers for error types. Then, when I setup logging I can say "Here is a logger that logs JSON to a file. If there is an error, call this function (maybe try stdout or maybe trigger some application "debug me" state).

Maybe something like

type jsonLogger struct {
  io.Writer
  onError func(err error, keyvals ...interface{})
}

Jack Lindamood · Answer 1 · Wed Dec 09 2015 06:04:57 GMT+0800 (China Standard Time)

Another option is to turtles it all the way down

type jsonLogger struct {
  io.Writer
  onError func(err error) log.Logger
}

then you would call

if err := json.Marshall(obj); err != nil && onError != nil {
  onError(err).Log(keyvals)
}

This isn't so bad since at some point failures have to just be ignored (In my case, the last failure would set some debug bit that a monitoring system would check for).

Jack Lindamood · Answer 2 · Wed Dec 09 2015 06:12:51 GMT+0800 (China Standard Time)

As an example, there exist loggers that simply can't return errors. One such logger would be like this

type bitSetter struct {
  isSet int
}
func (b *bitSetter) Log(...keyvals interface{}) {
  if contains(keyvals, "state", "critical") {
    atomic.StoreInt32(isSet, 1)
  }
}

func (b *bitSetter) getPrometheusMetrics() prometheus.Metric {
  return prometheus.Metric("critical_state", atomic.LoadInt32(isSet))
}

For the above logger, my monitoring system would be able to tell me if it entered a critical state. Maybe this is good as a check of last resort if all the loggers are FUBAR (The disk mount is RO for example). It's impossible for this logger to error out, which implies to me errors are a consequence of individual loggers and not of logging as a concept.

Chris Hines · Answer 3 · Wed Dec 09 2015 10:12:09 GMT+0800 (China Standard Time)

There is a lot to unpack here. In this post I will do my best to explain why the log.Logger.Log method returns an error. I will follow up in another post to present what I believe are more elegant solutions to the challenges you present.

In my opinion errors are intended to be checked, not ignored. And it's unreasonable to ask users to check log messages.

I agree, and yet we don't often see Go code like the following.

if _, err := fmt.Println("Hello, world!"); err != nil {
    // ???
}

Consider the case of the bytes.Buffer.Write method. It is declared like this:

// Write appends the contents of p to the buffer, growing the buffer as
// needed. The return value n is the length of p; err is always nil. If the
// buffer becomes too large, Write will panic with ErrTooLarge.
func (b *Buffer) Write(p []byte) (n int, err error)

Why does it return an error that is always nil? The answer is obvious, because implementing the io.Writer interface makes bytes.Buffer more useful. As a result of implementing io.Writer we can pass a *bytes.Buffer to any of the many functions that accept an io.Writer. On the other hand, it is clear that many io.Writer implementations need to return errors, so the interface is declared accordingly and bytes.Buffer simply does not utilize that part of the interface.

The io.Writer interface defines the fundamental interface for writing streams of bytes in Go. Similarly, our log.Logger interface defines the fundamental interface for recording streams of structured log data in Go kit.

Some loggers might return errors, others may never return errors, but the central interface must allow it. The value of implementing the central interface is greater than the loss of precision for the implementations that never return errors. We can follow the example of the standard library and simply document those cases that will always return nil errors.

More later...

Jack Lindamood · Answer 4 · Wed Dec 09 2015 15:46:31 GMT+0800 (China Standard Time)

I think go's built in library takes both approaches. This mismatch is probably related to the special function logging has vs other APIs. For example, the built in logging library has func (l *Logger) Print(v ...interface{}) and not func (l *Logger) Print(v ...interface{}) error, where func (l *Logger) Output(calldepth int, s string) error is used for people that really need an error. I've never actually seen it used, however.

I'm trying to think about what makes logging different than fmt.Println and bytes.Buffer.Write and I think it comes down to the following statement:

If I write an API that accepts an interface who's method returns an error, there is an implicit contract that I will or need to check the return code of that error.

If an object contained a bytes.Buffer or accepted one, then nobody expects me to check return codes of writing to that object. If however,it accepts io.Writer, then it's reasonable that it's expected I check all io.Write() calls and do some kind of logging or tracking if it returns an error.

There's a danger someone could write a logger that returns an error and expect people to check that error.

An example of the danger of ignoring error codes is your currently implemented json_logger.go If someone adds a log value that is unconvertable to JSON then they would end up with empty log messages because json.Encode() would return an error and the entire log message would be lost. This could at least be notifiable if people registered an error callback for the JSON logger.

As other evidence, most (all?) the other written logging libraries (including the log package) don't return errors on log methods and it's very unexpected if any of the Java logging libraries threw an exception (I don't know any that do).

Jack Lindamood · Answer 5 · Wed Dec 09 2015 17:41:32 GMT+0800 (China Standard Time)

Another perspective is to consider that Go is an implicitly implemented interface language. So forget go-kit, forget the idea of logging, forget JSON and files and think of it entirely from the API owner's perspective.

I'm writing a zookeeper or kafka library and I need to do something with random types of events that are probably important for people to know about but I can't really handle. Or maybe I do handle them, but I should still tell someone about the event. TCP reconnects, etc. Perfect use case for logging. Now I need to decide what API I want to expose to the world for how they can pass me a logger.

From my perspective, does this API have a log method that returns an error or not? I think it makes the most sense for this API to not have a method that returns an error. If I communicate to the world via my exposed API that I want a logger that returns an error, then I'm implicitly communicating that I will handle errors returned by that function call. But since there's nothing I can do with those errors, it makes sense for me (again think about implicit interfaces and forget go-kit/logging/JSON/etc) to want a logger that does not return an error. That communicates via my API that whoever is passing me this logger should make it handle its own errors somehow.

I think this perspective holds true for most (all?) use cases of logging from an API writer's perspective.

I think if one removes themselves from trying to write a logging library and instead thinks of it from the consumers of the API's perspective that is trying to expose a logging mechanism, it makes more sense to want a logging interface that does not return an error.

Peter Bourgon · Answer 6 · Wed Dec 09 2015 20:57:37 GMT+0800 (China Standard Time)

@cep21 Package log is designed to be used for more than simple application info/warning/error logging; it's suitable for log-structured data in an e.g. Lambda architecture, where each invocation is important. I agree with you that if we were doing only application logging the error would be more noise than signal. But the scope of the package is larger than that.

Jack Lindamood · Answer 7 · Thu Dec 10 2015 02:42:46 GMT+0800 (China Standard Time)

Lambda architecture

That makes a lot of sense to me. I can totally see how lambda may want a method that sends things that requires an error check on each invoation. However, I'm focusing here on a logging interface. People taking an io.Writer are expected to check error codes, but people expecting a logger interface do not. I think the quote* where each invocation is important* has two important assumptions

The only way to verify a call is to return an error.
This doesn't have to be true. I gave two alternatives above. The current implementation is actually worse than this. By returning an error in the interface, you've guaranteed practically that errors will be ignored (given how logging libraries are used in practice). If instead logging did not return an error, then you've negotiated via API handling of the error in the logger implementation.
The caller knows it is logging to Lambda
The caller doesn't know this. The caller is taking an interface.

Since logging is an interface and go is an implicit interface language, we should define the interface we want API writers to accept.

Can you give an example of a library written in go that has wide usage, does logging, and would be improved by taking a logger who's Logger.Log() method is checked on each invocation?

Chris Hines · Answer 8 · Thu Dec 10 2015 15:05:17 GMT+0800 (China Standard Time)

@cep21 I believe we agree that in many cases call sites that originate log events do not want to handle errors and indeed probably don't have a good way to handle them anyway. I believe we also agree that errors will occur when serializing and transmitting or persisting log events.

Let's take a look at how other logging packages transmit errors (or not).

stdlib log

type Logger struct {
    ...
    out    io.Writer  // destination for output
    ...
}

func (l *Logger) Print(v ...interface{}) {
    l.Output(2, fmt.Sprint(v...))
}

func (l *Logger) Output(calldepth int, s string) error {
    ...
    _, err := l.out.Write(l.buf)
    return err
}

We can see from the above code excerpts that the standard log package simply discards the errors if you call Logger.Print or its siblings. If you care about handling the error returned from the io.Writer you either need to do your own formatting and call Logger.Output directly or perhaps give Logger.SetOutput an io.Writer that handles errors the way you want.

log15

// A Logger writes key/value pairs to a Handler
type Logger interface {
    ...
    // SetHandler updates the logger to write records to the specified handler.
    SetHandler(h Handler)

    // Log a message at the given level with context key/value pairs
    Debug(msg string, ctx ...interface{})
    ... etc ...
}

// A Logger prints its log records by writing to a Handler.
// The Handler interface defines where and how log records are written.
// Handlers are composable, providing you great flexibility in combining
// them to achieve the logging structure that suits your applications.
type Handler interface {
    Log(r *Record) error
}

type logger struct {
    ...
    // swapHandler wraps another handler that may be swapped out
    // dynamically at runtime in a thread-safe fashion.
    h   *swapHandler
}

func (l *logger) Debug(msg string, ctx ...interface{}) {
    l.write(msg, LvlDebug, ctx)
}

func (l *logger) write(msg string, lvl Lvl, ctx []interface{}) {
    l.h.Log(&Record{ ... })
}

We can see that the log15 package defines two interfaces. The higher level interface, Logger, does not return errors, while the lower level Handler interface does. Like the standard library, log15 ignores errors at this level (in the logger.write method when calling l.h.Log). But that's not the end of the story for log15.

Log15 includes a FailoverHandler (follow the link for full docs). FailoverHandler wraps a slice of handlers that it tries in order until one succeeds. If the last handler in the slice were to always return a nil error then the FailoverHandler will also always return nil.

Go kit log

The Go kit log package takes quite a bit of inspiration from log15 and tries to simplify the API as much as possible. Logger is the sole central interface of the package. It provides the same degree of composability as log15's Handler but is more general.

Go kit leans more heavily on the practice of wrapping or decorating loggers with other loggers. This choice dramatically simplifies the API without loss of expression, but it also means that Logger.Log must return an error. I don't consider that a problem.

If, when using Go kit, you want to ensure all logging errors are handled in some way then I believe the correct approach is to write a Logger decorator that handles errors returned by a wrapped logger as desired.

I believe that when doing application logging, calls to Logger.Log should ignore the errors and rely on main.main to provide a Logger implementation that does the right thing for that application. Perhaps it installs a MustLogger that panics on all errors, or perhaps it installs a FailoverLogger configured with your bitSetter as the logger of last resort.

I started working on some "meta" loggers about three months ago, but I haven't decided if it is worth adding them to the main repo. Here is my take on FailoverLogger copied from the branch where I was playing with the ideas. (This code is not fully tested.)

// NewFailoverLogger returns a Logger that logs all log events to the first
// provided logger. If the first logger returns an error the log event is
// logged to the second logger, and so on until one of the loggers succeeds.
// For each logging error the log event is augmented with an additional key of
// the form "failover_err_{idx}" and the error as its value before the log
// event is sent to the next logger in the chain.
func NewFailoverLogger(loggers ...Logger) Logger {
    return &failoverLogger{
        loggers: append([]Logger{}, loggers...),
    }
}

func (l *failoverLogger) Log(keyvals ...interface{}) error {
    kvs := keyvals[:len(keyvals):len(keyvals)]
    if len(kvs)%2 != 0 {
        kvs = append(kvs, ErrMissingValue)
    }

    var err error
    for i, l := range l.loggers {
        err = l.Log(kvs...)
        if err == nil {
            return nil
        }
        kvs = append(kvs, fmt.Sprintf("failover_err_%d", i), err)
    }
    return err
}

Jack Lindamood · Answer 9 · Fri Dec 11 2015 05:31:48 GMT+0800 (China Standard Time)

Thanks for entertaining this discussion :)

at how other logging packages transmit errors

I think we both agree that other log packages handle errors in a suboptimal way. My primary point is that their main Log method does not return an error.

I believe that when doing application logging, calls to Logger.Log should ignore the errors and rely on main.main to provide a Logger implementation that does the right thing for that application.

I think this is the strongest disagreement. There are two philosophies here. The first is Define an API that can be abused and document how to use it correctly vs Define an API that limits abuse. I strongly prefer the later case, which is why I think Log() without returning an error is the correct approach.

As an example, if I'm writing a ZooKeeper library and I need a place to log things and I don't want to check error codes, then the best API for me to expose in an implicitly typed language is Log() without an error. In this case, I have limited a vector to abuse the API.

Ignoring errors is really dangerous. If I were to write a top 5 Things to know as a new Go programmer to write fault tolerant systems that list would include Never ignore errors returned by interface function calls

Chris Hines · Answer 10 · Fri Dec 11 2015 06:40:00 GMT+0800 (China Standard Time)

Perhaps the disconnect here is the difference between the Go kit Logger interface, and what a ZooKeeper library should expose. I would be interested in what you think of the discussion we had in #42.

A ZooKeeper package that wants to provide logging output and enforce a no errors from logging policy can export its own Logger interface. It would then be main.main's responsibility to provide an implementation of that interface. If main.main is using Go kit logging, it would need to provide an adapter that filtered out any errors that bubble up from the lower levels. The code might look something like the following.

package zookeeper

type Logger interface {
    Log(keyvals ...interface{})
}

func (z *Client) SetLogger(logger Logger) {
    z.logger = logger
}

func (z *Client) Get(key string) string {
    ...
    z.logger.Log("event", "sending request")
    ...
    z.logger.Log("event", "received response")
    ...
}

package main

import "os"
import "zookeeper"
import "github.com/go-kit/kit/log"

func main() {
    logFormat := log.LogfmtLogger(os.Stdout)
    logger = log.LoggerFunc(func (keyvals ...interface{}) error {
        if err := logFormat.Log(keyvals...); err != nil {
            // handle error
        }
        return nil
    })

    zk := zookeeper.Open()
    zk.SetLogger(&zkLogger{logger})

    ...
}

type zkLogger struct {
    logger log.Logger
}

func (l *zkLogger) Log(keyvals ...interface{}) {
    l.logger.Log(keyvals...)
}

Jack Lindamood · Answer 11 · Fri Dec 11 2015 07:09:56 GMT+0800 (China Standard Time)

can export its own Logger interface

I totally agree. And I think now we're getting somewhere. Since this is the perspective of go-kit's other libraries itself, then why doesn't go-kit dogfood this very advice?

None of the following libraries ever check log errors. If they aren't checking log errors, then why aren't they designed with an API that accepts a non error logging Logger? Why aren't they requiring this same wrapper?

https://github.com/go-kit/kit/blob/master/loadbalancer/dnssrv/publisher.go
https://github.com/go-kit/kit/blob/master/transport/http/server.go
... (many others as examples)

Peter Bourgon · Answer 12 · Sat Dec 12 2015 02:45:24 GMT+0800 (China Standard Time)

why doesn't go-kit dogfood this very advice?

It's OK for client code to ignore errors, if the error doesn't introduce invalid state into the program, and the client code has no meaningful way to act on that error. Ignoring the error from an application log invocation fits both of these criteria.

Introducing and using this

type ApplicationLogger interface {
    Log(keyvals ...interface{}) // no error
}

to wrap this

type Logger interface {
    Log(keyvals ...interface{}) error
}

is largely unnecessary noise.

Chris Hines · Answer 13 · Sat Dec 12 2015 02:52:46 GMT+0800 (China Standard Time)

A ZooKeeper package that wants to provide logging output and enforce a no errors from logging policy can export its own Logger interface.

why doesn't go-kit dogfood this very advice?

I wasn't presenting that idea as advice, but as a choice that a package author could make if they held strong opinions on this topic. I stated my personal opinion above which align with @peterbourgon's comments.

Jack Lindamood · Answer 14 · Sat Dec 12 2015 03:00:50 GMT+0800 (China Standard Time)

Thanks for entertaining this conversation :)

I think I understand your perspective, I just disagree with it lots. But that's ok, right!? Plus, it's really difficult to disagree with Benedict Cumberbatch's profile photo. I think you've fallen on something pretty useful with the Log vararg syntax and the context, and totally agree with the lack of levels. I'll probably replicate something similar. Thanks for the MIT license!

Chris Hines · Answer 15 · Sat Dec 12 2015 03:11:23 GMT+0800 (China Standard Time)

@cep21 Thanks for raising the question and challenging assumptions. :)

Jack Lindamood · Answer 16 · Sat Dec 12 2015 03:36:50 GMT+0800 (China Standard Time)

:)

As a personal anecdote, here's something that actually happened to me about a month ago. I was using logrus and had some if condition that rarely triggered, but if it did I wanted to know.

I had some unit test coverage with a mock item to check that hit my code coverage. The condition that triggered the log never happened in production. But then it did. And I never knew! (Actually I eventually did because logrus falls back to stderr on errors and I saw it on stderr). Because my non mocked item context wasn't JSON serializable in production so the json.Marshal() threw the entire thing on the floor. I really had no idea looking at the code that such a thing was possible. After all, all my other log statements worked just fine.

It made me realize how important checking and surfacing error codes were. Even if I think I can't handle something and even if I think an error is almost impossible, it's important to do something with those errors that potentially gets them in front of someone. Especially the errors I don't think should happen. Your current JSON serializer could potentially suffer the same fate. People using it should understand this potential.

Chris Hines · Answer 17 · Sat Dec 12 2015 03:52:32 GMT+0800 (China Standard Time)

I agree that our current JSON serializer could use some hardening. Please file issues for these sort of things.

Peter Bourgon · Answer 18 · Thu Jan 14 2016 05:16:43 GMT+0800 (China Standard Time)

@cep21 Are you OK with the conclusions here & if I close the issue?

Jack Lindamood · Answer 19 · Thu Jan 14 2016 05:45:00 GMT+0800 (China Standard Time)

I think the deliverable here may be to put some of this reasoning in the documentation of the project. However, I'm fine closing the ticket.

Will Faught · Answer 20 · Thu Jan 19 2017 09:45:16 GMT+0800 (China Standard Time)

In my opinion errors are intended to be checked, not ignored. And it's unreasonable to ask users to check log messages.

Agreed.

I agree, and yet we don't often see Go code like the following.

I think fmt.Println being the way to print to stdout is a bad design. Most don't care about stdout errors because if they happen then there's no point in continuing to run. If you can't print to stdout then something is really wrong in most cases. The standard way to print to stdout should just panic instead of returning an error, and if you wanted to actually check for an error, then you could still fmt.Fprintf(os.Stdout...).

Why does it return an error that is always nil? The answer is obvious, because implementing the io.Writer interface makes bytes.Buffer more useful.

But the Logger interface doesn't make things more useful outside use of the kit log package. So a Log method that doesn't return an error would be just as useful. Everybody deals with bytes in all kinds of situations and for all kinds of purposes; is Logger really used that broadly?

On the other hand, it is clear that many io.Writer implementations need to return errors, so the interface is declared accordingly and bytes.Buffer simply does not utilize that part of the interface.

Isn't there a difference, though, in that I/O errors are frequently actionable, whereas log errors are not? What do you even do with a log error? You can't log it again. Panic with it? That's no different than Log not having an error result and it just panicking with the error.

This question is actually what brought me here. Gometalinter reports unchecked uses of Logger.Log. I have no idea what I should do with Log errors. I worked around it for now in one case by doing something equivalent to this:

stdlog.SetOutput(...kitlogger...)
...
if err != nil {
    err = kitlogger.Log("error", err)

    if err != nil {
        stdlog.Printf("error: %v\n", err)
    }
}

The horror! I'm pretty sure this is the wrong thing to do, but I don't have a better idea. (Feedback appreciated!)

Package log is designed to be used for more than simple application info/warning/error logging; it's suitable for log-structured data in an e.g. Lambda architecture, where each invocation is important.

Doesn't this "go" :) against the Go tenet that code shouldn't be magical? For example, the for-range statement doesn't work against generic iterators because they might trigger some computationally-expensive code. Likewise, logging has a well-understood, universal, useful purpose, and when it's usually done, it's expected to be simple, quick, and robust, not something complex, slow, and fragile.

Wouldn't the Lambda scenario be better suited by something like:

type Reporter interface {
    Report(keyvals ...interface{}) error
}

whose documented meaning explicitly acknowledges stuff might blow up?

I think it's better to require someone who wants to shoehorn something like Lambda into Logger to adapt to Logger (or go elsewhere entirely), rather than Logger accommodate edge cases like Lambda and make it worse for its most common use case.

Will Faught · Answer 21 · Fri Jan 20 2017 04:29:08 GMT+0800 (China Standard Time)

Instead of having the Log caller deal with the error, why not have the Logger implementor deal with it? If you truly want to do something fancy with Lambda (or whatever) that might fail, deal with it in the Log impl, not at every call site. All the error handling logic for each call just discourages logging to begin with.

Chris Hines · Answer 22 · Fri Jan 20 2017 04:34:54 GMT+0800 (China Standard Time)

Instead of having the Log caller deal with the error, why not have the Logger implementor deal with it?

Yes, I believe that is the correct approach, as I explained in #164 (comment). This technique was also recently added to the package documentation in 1b5215d.

Will Faught · Answer 23 · Fri Jan 20 2017 05:16:33 GMT+0800 (China Standard Time)

So if the implementation is dealing with the error, Log doesn't need to return it, right?

Chris Hines · Answer 24 · Fri Jan 20 2017 23:01:50 GMT+0800 (China Standard Time)

So if the implementation is dealing with the error, Log doesn't need to return it, right?

If you are asking about the declaration of the Logger interface, then no I believe it must declare that the Log method has a single return value of type error.

If you are asking about the behavior of a particular implementation of the Logger interface, then yes it is perfectly valid for it to handle errors it receives and return a nil error to its caller.

If you are asking something else, then please clarify.

Will Faught · Answer 25 · Sat Jan 21 2017 00:28:43 GMT+0800 (China Standard Time)

Did you miss my first post, the one just above the one you replied to?

…

On Fri, Jan 20, 2017 at 7:01 AM Chris Hines ***@***.***> wrote: So if the implementation is dealing with the error, Log doesn't need to return it, right? If you are asking about the declaration of the Logger interface, then no I believe it must declare that the Log method has a single return value of type error. If you are asking about the behavior of a particular implementation of the Logger interface, then yes it is perfectly valid for it to handle errors it receives and return a nil error to its caller. If you are asking something else, then please clarify. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#164 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAD5VsauU0Yz3nM7Z2iOxpP0rDWLBf77ks5rUMxfgaJpZM4GxZSR> .

Chris Hines · Answer 26 · Sat Jan 21 2017 00:32:05 GMT+0800 (China Standard Time)

I don't think so. I have replied to both of your two most recent comments on this issue.

Will Faught · Answer 27 · Sat Jan 21 2017 04:38:30 GMT+0800 (China Standard Time)

OK. It seemed so because I don't think you've addressed the points made or questions asked there. Sorry if I misunderstood you. :)

I'd at least like to know how it's recommended we properly handle Logger.Log errors in a typical "application logging" context as @peterbourgon described above — that is, there's no reasonable failover logging or reporting mechanism (for example, liberal debug messages). I've no idea.

Peter Bourgon · Answer 28 · Sat Jan 21 2017 04:42:19 GMT+0800 (China Standard Time)

In a typical application logging context, the failure of a log invocation has no meaningful effect on the state of the program, and requires no compensatory follow-up action. Therefore, the right thing to do is to ignore the errors. If your linter complains about those ignored errors, then you should reconfigure the linter.

Jack Lindamood · Answer 29 · Sat Jan 21 2017 04:48:37 GMT+0800 (China Standard Time)

@willfaught I think a lot has already been said on this topic. It's just a difference of perspective and it's good to have this public so people can see both sides, but probably not worth it to keep talking till we get consensus :)

Therefore, the right thing to do is to ignore the errors

This is dangerous advice. You 100% in all cases want to know if your log messages aren't working. I have encountered real issues because of this in the past. See the below quote where I never knew I had a production issue happening because of JSON serialization issues.

I had some unit test coverage with a mock item to check that hit my code coverage. The condition that triggered the log never happened in production. But then it did. And I never knew! (Actually I eventually did because logrus falls back to stderr on errors and I saw it on stderr). Because my non mocked item context wasn't JSON serializable in production so the json.Marshal() threw the entire thing on the floor. I really had no idea looking at the code that such a thing was possible. After all, all my other log statements worked just fine.

It made me realize how important checking and surfacing error codes were. Even if I think I can't handle something and even if I think an error is almost impossible, it's important to do something with those errors that potentially gets them in front of someone. Especially the errors I don't think should happen. Your current JSON serializer could potentially suffer the same fate. People using it should understand this potential.

Peter Bourgon · Answer 30 · Sat Jan 21 2017 04:50:13 GMT+0800 (China Standard Time)

You 100% in all cases want to know if your log messages aren't working.

Only the Sith deal in absolutes ;)

Chris Hines · Answer 31 · Sat Jan 21 2017 05:32:17 GMT+0800 (China Standard Time)

@willfaught I thought that had been answered, but apparently there is still a disconnect here, so I apologize for that. Obviously I haven't understood the question you are asking, so bear with me.

We have recommended that when you are logging at the application level you ignore the return value of Log. That might look something like this:

func AppLogic(logger log.Logger, a int, b string) {
    // ...
    logger.Log("msg", "foo bar happened", "a", a, "b", b)
    // ...
}

Now it seems to me like the question is, "How is that safe? Don't we want to know when logging isn't working?" I hope I've got that right.

I'm on board with that sentiment, but in my experience trying to handle the error inside AppLogic will drive you crazy. I believe dealing with errors during logging is more of an application wide policy decision. Possible choices for this policy are:

Logging is very important, if it isn't working the application must abort.
Logging is nice but not critical.
Logging is important and there are failover options, but abort if they all fail.
Logging is a nuisance, just turn it all off.
... and many more I'm sure.

I recommend that you choose the policy that is right for the application and enforce it application wide by implementing the policy as a log.Logger that you compose with your choice of formatting logger, concurrency policy, output destination, base log.Context and so forth in main.main.

An error policy logger might look something like the following. (I doubt that this is a good policy, it is solely to demonstrate that you can do whatever you want here.)

type errorLogger struct {
    logger log.Logger
    mcount metrics.Counter
    count  int
    thresh int
}

// Log implements the log.Logger interface. It delegates logging to another Logger
// and tracks the number of errors encountered. If the error count exceeds the configured
// threshold it will panic. It will always return nil or panic.
func (l *errorLogger) Log(keyvals ...interface{}) error {
    if err := l.logger.Log(keyvals...); err != nil {
        l.count++
        l.mcount.Add(1)
        if l.count > thresh {
            panic(err)
        }
    }
    return nil
}

You can use a Logger like the above to guarantee that no call to logger.Log in AppLogic will ever return an error. You also now have a single central place within your application to change how logging errors are handled. Furthermore AppLogic can easily be put in a package that you can share between applications with different error handling policies. To me it seems like a perfect separation of concerns and a nice way to manage the cross cutting concern of logging and handling errors.

Finally, here is what main might look like.

func main() {
    // ...
    var logger log.Logger
    logger = log.NewLogfmtLogger(log.NewSyncWriter(os.Stdout))
    logger = errorLogger{
        logger: logger,
        thresh: 10,
        mcounter: logErrorCounter,
    }
    logger = log.NewContext(logger).With("time", log.DefaultTimestampUTC)
    // ...
    app.AppLogic(logger, 5, "hello, world")
}

I hope that helps.

Jack Lindamood · Answer 32 · Sat Jan 21 2017 06:00:15 GMT+0800 (China Standard Time)

The alternative implementation is at https://github.com/signalfx/golib/tree/master/log where the logger accepts a fallback func NewJSONLogger(w io.Writer, ErrHandler ErrorHandler) Logger and people configure a fallback to do things like increment counters, panic, print to stderr, or a fallback of another logger itself.

In this case, the base interface is

type Logger interface {
	Log(keyvals ...interface{})
}

Will Faught · Answer 33 · Sat Jan 21 2017 06:05:09 GMT+0800 (China Standard Time)

Therefore, the right thing to do is to ignore the errors.
If your linter complains about those ignored errors, then you should reconfigure the linter.

I think these are the fundamental disagreement. Like @cep21, I don't think it's acceptable to ignore errors. If an interface method returns an error, you always need to check the error, because there can be multiple implementations, possibly ones that haven't even been written yet or that you'll ever know about. There are at least two common linters I know of that flag unchecked errors (errcheck and gas, both run by gometalinter by default). It's even emphasized in https://blog.golang.org/errors-are-values: "Whatever you do, always check your errors!"

I'll come at this from another angle too. There seems to have been an evolution of the design of the logging API that sought the most simple and robust interface. The final result was Log(keyvals ...interface{}) error. I would argue it's actually Log(keyvals ...interface{}). It's even simpler, having no result, yet still allows for error handling if you really care to by implementing Logger yourself and handling the errors there. This of course assumes that Log callers never want to handle errors themselves. I think that's true in most cases.

Thanks for picking up the discussion again. No worries if you still disagree. :)

Will Faught · Answer 34 · Tue Jan 24 2017 03:53:02 GMT+0800 (China Standard Time)

@peterbourgon @ChrisHines Any thoughts on my last post? Just want to confirm whether you still disagree.

Chris Hines · Answer 35 · Tue Jan 24 2017 04:26:57 GMT+0800 (China Standard Time)

@willfaught I understand your point but, I still disagree (at least for now). I feel like I've shown multiple examples that demonstrate how a program may be structured to completely mitigate the risk of unhandled logging errors, but I haven't gotten much feedback as to whether I've made my point. I really don't know what else to tell you other than that I'm glad to have had the discussion and I am fine with agreeing to disagree.

I have taken a brief look at the package that @cep21 posted above and found it is interesting to see an alternative API. If I find myself with a block of time I might, as an exercise, see what the minimum impact of dropping the error return from Log is, but from what I could tell looking at that implementation it will make the API fatter and more tedious. I don't think it will add any capabilities or make logging any less error prone.

Errcheck by default excludes stdlib functions that are documented to never return an error. It also accepts a list of additional functions to exclude. I echo @peterbourgon that if main.main arranges it so that all errors returned from the logging chain are handled it is appropriate to configure errcheck to exclude calls to Logger.Log from its analysis.

Will Faught · Answer 36 · Thu Jan 26 2017 10:14:01 GMT+0800 (China Standard Time)

@ChrisHines Sorry for the delay replying.

I feel like I've shown multiple examples that demonstrate how a program may be structured to completely mitigate the risk of unhandled logging errors, but I haven't gotten much feedback as to whether I've made my point.

Your examples showed how those non-kit libs provide a second, alternative method that discards the error (IIUC). While I suppose that would solve the unchecked error lint problem for Log callers, adding a second method that doesn't return an error would complicate the Logger interface. I think it'd be simpler to have Log be errorless and handle errors in the impl behind the interface if you choose to.

How would it make the API fatter and more tedious? This is what it would be:

type Logger interface {
    Log(keyvals ...interface{})
}

It seems to me that for 99% of cases you'd want this:

logger = ...kitlog.NewLogfmtLogger(os.Stderr)...
...
logger.Log("foo", "bar") // passes lint, everybody's happy

and for 1% of cases you could do this:

type FailoverLogger struct {
    w io.Writer
}

func (f *FailoverLogger) Log(keyvals ...interface{}) {
    var s = stringify(keyvals...)

    if _, err := f.w.Write(s); err != nil {
        // handle error, everybody's happy
    }
}

...
logger = NewFailoverLogger(...)
...
logger.Log("foo", "bar") // passes lint, everybody's happy

I don't think it will add any capabilities or make logging any less error prone.

Removing a method result definitely doesn't add capabilities; that's not what I'm arguing. I'm arguing that Logger is just as capable without the error result, and is better suited for the way it's used 99% of the time: application logging, as @peterbourgon put it.

It'd have no effect on error proneness, and I'm not arguing that it would. Having no error result reflects the fact that in 99% of cases you're doing application logging, and are in no position to meaningfully handle a log error; and also that application logging errors are rare, so in the general case there's no need for Logger to accommodate handling them, although there's still a way to do so with your own Logger impl if you need to.

Maybe it would be easiest to understand your position if you could explain why you think it's bad that stdlog.Logger.Print* methods don't return errors. (I'm assuming you see both cases as the same, but if not, then I'd like to understand that too.) kitlog.Logger is meant to be a general-purpose logger, correct? One that could be sufficient for the standard library log package?

Peter Bourgon · Answer 37 · Thu Jan 26 2017 16:17:07 GMT+0800 (China Standard Time)

I think it'd be simpler to have Log be errorless and handle errors in the impl behind the interface if you choose to.

That would be simpler for the application logging use case, where you don't care about errors at the callsite. It would defeat the purpose of the event logging use case, where a failure to emit a log entry needs to be dealt with at the callsite: some type of retry, a failure of the originating transaction, etc.

I understand you believe it's an error to use the same interface for both of these use cases, and I've heard your arguments for why that's the case. Do you think we can agree to disagree?

Chris Hines · Answer 38 · Fri Jan 27 2017 00:20:14 GMT+0800 (China Standard Time)

@willfaught wrote:

Your examples showed how those non-kit libs provide a second, alternative method that discards the error (IIUC). While I suppose that would solve the unchecked error lint problem for Log callers, adding a second method that doesn't return an error would complicate the Logger interface.

It sounds like you are referring to this #164 (comment); and to be clear I don't believe I argued to discard the errors.

I was more curious what you thought of my response to you here #164 (comment).

In that comment I demonstrated how a decorator logger—an implementation of the existing Logger interface—can be written that catches all errors from another logger and handle them as needed. The decorator always returns a nil error after handling the original error. I showed how main.main could use this tool to construct a chain of loggers that centrally handle all errors and also never return an error to application code that uses the logger.

As an aside, reviewing ancient history, in my first proposal for this package Log did not return an error. During the design discussion on that proposal @peterbourgon made this incisive remark:

My intuition is that handling should be composed from, rather than injected into, Loggers.

He then provided a counter proposal in #21 that we iterated on and arrived at the current definition for the Logger interface. We settled on the decorator approach to composing loggers during that discussion and it is that design decision that drives the choice for the Log method to have an error return.

How would it make the API fatter and more tedious?

To be clear, by API I mean that of the API of the entire log package (and sub packages). Not allowing Log methods to return errors would require all Loggers that might encounter errors as part of their implementation to handle errors locally. For reusable loggers that aren't part of main.main they would have to provide some means for main.main to inject error handling logic. But consider Peter's comment above. The overall API is simplified by allowing Log to return an error and use decorator style composition.

Will Faught · Answer 39 · Mon Jan 30 2017 18:29:40 GMT+0800 (China Standard Time)

It would defeat the purpose of the event logging use case, where a failure to emit a log entry needs to be dealt with at the callsite: some type of retry, a failure of the originating transaction, etc.

@peterbourgon Let me make sure I'm on the same page. What do you mean by event logging? How is it different than application logging? Why encompass both in one interface if each is used differently?

Do you think we can agree to disagree?

Absolutely. I just want to understand your position, and I don't think I do quite yet. (See my above questions.) If you'd rather end the discussion, just say the word. No hard feelings. :) I'm guessing this will boil down to a difference of goals/priorities or a misunderstanding on my part, and once I understand, that's enough for me.

I was more curious what you thought of my response to you here #164 (comment).

@ChrisHines Sorry I didn't address that before. See below.

I recommend that you choose the policy that is right for the application and enforce it application wide by implementing the policy as a log.Logger that you compose with your choice of formatting logger, concurrency policy, output destination, base log.Context and so forth in main.main.

If you already know the implementation at every call site, then why have an interface at all? You can just use the type of your Logger implementation directly all over your app. I think the point of the interface is that you don't know what's going on under the hood. The code might not be a program at all, but a library, say, a library of middleware that take a Logger as a dependency. What you're suggesting doesn't make sense for the author of that library. As I said above, you might not ever know about the underlying implementation, or it might not have been written yet. If all you have is a black box with an interface, I don't see how you can safely ignore the error.

You mentioned the decorator pattern above. What value actually is there in including the error in a logger decorator? For endpoints, the value of including the error in the decorator is clear enough: the endpoint behavior depends on whether there was an error, whether that's logging values or panic recovery or rolling back transactions or whatever. What is it that Loggers need error propagation for? Undoing a logging if there's an error? Rolling back a logging? I've never had to do any thing like that, so I'm seriously asking: what is the utility of signaling and handling errors along a chain of Loggers? The one thing I can think of is retrying or failover in the case of failure, but we've already been over how you don't need Log to return errors to do that: you just do it at the place where the error occurs.

My intuition is that handling should be composed from, rather than injected into, Loggers.

I think my question above about the utility of error propagation addresses this.

Not allowing Log methods to return errors would require all Loggers that might encounter errors as part of their implementation to handle errors locally.

But that would be the implementation, not the API, right?

For reusable loggers that aren't part of main.main they would have to provide some means for main.main to inject error handling logic.

Agreed, there's a tradeoff here between handling errors in a callback (or channel or whatever) configured in main and handling them at every call site, which I think comes back to the application-vs-event logging discussion.

But consider Peter's comment above. The overall API is simplified by allowing Log to return an error and use decorator style composition.

I don't see the connection between @peterbourgon's comment and API simplification, unless you're referring to just the event logging side of things (again, I'm not clear on what that means yet, and that might be my disconnect)?

Peter Bourgon · Answer 40 · Tue Jan 31 2017 11:36:17 GMT+0800 (China Standard Time)

What do you mean by event logging? How is it different than application logging?

Quoting myself in my OK Log introduction,

The principal use case would be application logs from typical microservices: think debug, info, warn, etc. These are typically high-volume, low-QOS logs, where lower latency (time-to-query) is preferred. But, we also want to serve so-called event logs: audit trails, click tracking, and so on. These are typically lower volume, high-QOS logs, where higher latency is acceptable.

—

Why encompass both in one interface if each is used differently?

They are vastly more alike than different. They both benefit from structure at the callsite; they both are modeled as streams; they both wind up in aggregation systems. The only significant difference is the QOS and, thus, how you tend to handle errors.

Will Faught · Answer 41 · Tue Jan 31 2017 16:34:46 GMT+0800 (China Standard Time)

But, we also want to serve so-called event logs: audit trails, click tracking, and so on. These are typically lower volume, high-QOS logs, where higher latency is acceptable.

OK, now I understand! Thanks. :) All right, here are my final thoughts. No worries if you're still unconvinced. I appreciate you all taking the time.

These sound like different things to me: one needs to propagate errors (apparently? see below), and one doesn't. I've argued that if an interface method returns an error, you have to check it in the general/library case, which everyone has tacitly accepted by not arguing against it—please do so now if you missed that point. So combining application and event logging into one thing dooms application logging callers to have to check those errors. I've also asked what value there is in propagating logging errors along a chain of Loggers, which hasn't been answered—from which I'm forced to conclude there is none. For example, I don't understand why an audit trail Logger would need to propagate its error (to what other Logger? and why a Logger? what would it do with it?) or handle one from another Logger (how would it? and does it even make sense for an audit trail logger to decorate another Logger?). For these reasons, I still think Log shouldn't return an error.

Audit trails and tracking don't seem like logging to me; they seem like endpoints. I don't follow why something so general has to be supported by logging.

Why not have a separate log/event package with a Logger that returns errors? That separates the concerns. If you want application Loggers to play nicely with event Loggers, you can just adapt them:

type ApplicationAdaptorLogger struct {
    log.Logger
}

func (l *ApplicationAdaptorLogger) Log(keyvals ...interface{}) error {
    l.Logger.Log(keyvals...)

    return nil
}

Peter Bourgon · Answer 42 · Tue Jan 31 2017 22:09:10 GMT+0800 (China Standard Time)

I've argued that if an interface method returns an error, you have to check it in the general/library case, which everyone has tacitly accepted by not arguing against it—please do so now if you missed that point.

I don't know what "the general/library case" is. I don't agree with your conclusion without qualifications. If you want to provide some value-add wrapper, router, decorator, etc. to the Go kit logger interface in an intermediating library, and you want your intermediating library to be fully general i.e. accommodate both application and event logging, then yes, you must deal with the errors somehow, likely by bubbling them through to the callsite. But, I think this is not a common use case. I think it is much more common that an intermediating library would specialize to e.g. application logging, and e.g. squelch the errors. @ChrisHines has provided several examples in this thread of ways to do that. Indeed, your own ApplicationAdapterLogger is a great example of what an intermediating library or framework could provide. This is not problematic in any way. This is precisely how the Go kit logger is intended to be used.

I've also asked what value there is in propagating logging errors along a chain of Loggers, which hasn't been answered—from which I'm forced to conclude there is none. For example, I don't understand why an audit trail Logger would need to propagate its error (to what other Logger? and why a Logger? what would it do with it?) or handle one from another Logger (how would it? and does it even make sense for an audit trail logger to decorate another Logger?). For these reasons, I still think Log shouldn't return an error.

I'm sorry, I feel like both Chris and I have exhaustively responded to all of these points.

Audit trails and tracking don't seem like logging to me; they seem like endpoints. I don't follow why something so general has to be supported by logging.

To be clear, what the log package does is allow programmers to write structured data to some destination. From this general form, and using the defined Logger interface, we are able to specialize both to application logging and event logging, depending on the underlying implementation (the concrete objects constructed to satisfy the Logger interface) and its physical destination (a file on disk, a network socket, an OS pipe, or anything else).

So,

Why not have a separate log/event package with a Logger that returns errors? That separates the concerns. If you want application Loggers to play nicely with event Loggers, you can just adapt them

The answer is that we have decided to combine these two use cases into a single, more general-purpose interface, and have decided that the inevitable and concomitant loss in specificity that comes with such generalizations is a cost we are willing to bear. I completely understand that you disagree with this conclusion. I completely understand that you want the Logger interface to specialize to the application logging use case. But that is not what it is, and for what I believe are good reasons :)