Clean Up Logging
EricSoroos opened this issue
CKAN version
2.9, 2.10, Master
Describe the bug
Following the recent CVE for log injection I was reviewing what we're doing with logs, and I think we should go through and change a bunch of logging calls.
For context:
- https://dev.arie.bovenberg.net/blog/is-your-python-code-vulnerable-to-log-injection/
- https://bugs.python.org/issue46200
- https://discuss.python.org/t/safer-logging-methods-for-f-strings-and-new-style-formatting/13802/20
The logging API is generally:
log.[severity]("% format string", repl1, repl2, ...)
The message is later run through a log formatter, which wraps it in the fields from the log format string (timestamp, module, etc.). The % substitution is evaluated at that point. Also, for performance reasons, the formatting is skipped entirely if the record is not going to be emitted.
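A minimal sketch of that deferred formatting, assuming a logger whose level is above DEBUG. The `Expensive` class and the `demo.lazy` logger name are made up for illustration; they only count how often the argument is actually turned into a string.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is converted to a string."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive value"


log = logging.getLogger("demo.lazy")
log.setLevel(logging.INFO)             # DEBUG records will not be emitted
log.propagate = False
log.addHandler(logging.NullHandler())

obj = Expensive()

# Lazy: the logger sees that DEBUG is disabled and never formats the message.
log.debug("value: %s", obj)
lazy_calls = Expensive.calls

# Eager: the % operator runs before log.debug() is even called.
log.debug("value: %s" % obj)
eager_calls = Expensive.calls - lazy_calls
```

With the arguments passed separately, `__str__` is never called for the dropped record; with the pre-formatted string it is called once even though nothing is logged.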
However, in a lot of places, we're doing one of:
log.debug("string %s" % variable)
log.debug("string {0}".format(variable))
log.debug(f"string {variable}")
e.g.:
./ckan/lib/search/query.py:281: log.debug('Package query: %r' % query)
./ckan/logic/action/create.py:1108: log.debug('Created user {name}'.format(name=user.name))
This is probably not directly dangerous as long as nothing further down the stack performs another substitution into the already-formatted message. Performance-wise, in the debug case all of this string formatting is wasted by default, because debug records are usually not emitted.
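To illustrate when pre-formatting does become dangerous, here is a small sketch along the lines of the linked blog post: untrusted text is concatenated into the message, and the same call also passes arguments. The `context` dict, its `secret` value, and the `demo.injection` logger name are all hypothetical.

```python
import io
import logging

stream = io.StringIO()
log = logging.getLogger("demo.injection")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

context = {"user": "alice", "secret": "hunter2"}  # hypothetical request context
untrusted = "%(secret)s"                          # attacker-controlled text

# Unsafe: the untrusted text becomes part of the format string, so the later
# % substitution resolves the attacker's placeholder against `context`.
log.info("Login failed for " + untrusted, context)

# Safe: the untrusted text is only ever an argument, never a format string.
log.info("Login failed for %s", untrusted)

lines = stream.getvalue().splitlines()
```

The first call writes the secret into the log; the second logs the attacker's text literally.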
The recommendations from the links are:
- Don’t log untrusted text. Python’s logging library doesn’t protect you from newlines or other unicode characters which allow attackers to mess up — or even forge — logs.
- Don’t format logs yourself (with f-strings or otherwise). In certain situations this could leave you vulnerable to denial-of-service attacks or even sensitive data exposure.
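However, a logging.Filter lets us sanitize parameters in one place, so that CR/LF and other interesting characters passed in as arguments cannot end up in the logs, wherever the logging call is made. This only works for values that are passed as arguments rather than pre-formatted into the message.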
e.g.
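def filter(record):
    # clean() is a placeholder for a function that escapes CR/LF.
    if isinstance(record.args, tuple):
        record.args = tuple(clean(str(a)) for a in record.args)
    elif isinstance(record.args, dict):
        record.args = {k: clean(str(v)) for k, v in record.args.items()}
    return True  # keep the record; a falsy return would drop it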
Todo
- Update all the logging calls to use
log.level(format, variable, ...)
- Create a log filter, and add it to the root logger
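A rough, runnable sketch of the second item, under a few assumptions that are mine rather than anything decided here: `clean` and `NewlineFilter` are hypothetical names, only `str` arguments are rewritten (so `%d` and `%r` keep working), and the filter is attached to a handler on the root logger rather than to the root logger itself. The last point matters because, per the Python logging docs, a filter attached to a logger is not applied to records propagated from descendant loggers, while a handler's filter sees every record the handler receives.

```python
import io
import logging


def clean(value):
    """Hypothetical sanitizer: escape CR and LF so an event stays on one line."""
    return str(value).replace("\r", "\\r").replace("\n", "\\n")


class NewlineFilter(logging.Filter):
    """Escape line breaks in string arguments before they are formatted."""

    def filter(self, record):
        if isinstance(record.args, tuple):
            record.args = tuple(
                clean(a) if isinstance(a, str) else a for a in record.args
            )
        elif isinstance(record.args, dict):
            record.args = {
                k: clean(v) if isinstance(v, str) else v
                for k, v in record.args.items()
            }
        return True  # never drop the record, only rewrite its arguments


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(NewlineFilter())

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.INFO)

# A child logger: its records propagate to the root handler and get filtered.
log = logging.getLogger("demo.filter")
log.info("Created user %s (id %d)", "eve\nINFO forged entry", 7)

output = stream.getvalue()
```

The forged second line ends up escaped inside a single log entry. Objects whose `__str__` returns line breaks are not covered by this variant, and neither is anything already formatted into the message, which is why the first Todo item has to come with it.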