Best practices for using Markers in SLF4J/Logback

We are using SLF4J+Logback combination at our project for a while now and are quite happy with it, but our logging strategy is fairly simple, using straightforward class based loggers and no fancy stuff like MDC or Markers.

What I want to know is if anybody in the community actually uses these features and how they are used to improve logging/filtering.

I am specially interested in where, why and how would one use[1] Markers for logging. They strike me as a pretty neat feature for adding semantic context into the logging - e.g. while a class may be handling multiple concerns, one may use task/concern specific markers to discriminate log statements.

What may be the best practices, conventions or strategies for creating and using markers in logging.

Update: I guess, what I am really after is not so much why to use markers, but rather the how part — is there some good practices of naming markers (e.g. using plain text with spaces or dash/underscore/punctuation delimited keyword style names), should there be some sort of pool of "standard names", naming stuff based on the business functions. The questions I can probably figure out for myself, but if I want to use these features systematically and introduce them to a team of developers, it makes sense to have some formalizeable set of guidelines around...


[1] - By asking how to use markers I am not really asking how to use API (it is really quite straight forward) - I am rather referring to the more general level of how would one set up logging around using markers consistently


Solution 1:

First, as @darioo said:

  • MDC is used for associating multiple events with few "entities"
  • [Markers] are used for "special" events that you want to have filtered from usual ones

So your assertion that You want to use MDC for this. Markers are for highlighting "special" events--filtering, if you will--rather than "slicing". For example, you might slice based on a particular user, but filter based on any unexpected exceptions. In this case, you would create a User MDC dimension and an UnexpectedException Marker.


But this apparently doesn't address the question you had in mind. You are "rather referring to the more general level of how would one set up logging around using markers consistently." So let's address that:

MDC is for slicing and dicing, and Markers are for filtering. These activities are carried out during testing and in production. As such, you need to decide which dimensions you expect may be useful to slice the log data by, and which cases it might be useful to filter it against, when testing/production comes around. Each dimension gets an MDC dimension. Each case gets a Marker. It's as simple as that.

The developers don't need to make any decisions here. A single person or team should decide, at design time, what sort of slicing, dicing and filtering needs to be supported. This should be informed by imagining what sort of analysis tasks one expects they may be asked to perform.

This same person or team should decide on the naming convention. It's entirely arbitrary. Choose something that's aesthetically pleasing, self-descriptive (most important), and specific enough to be unlikely to conflict with later additions. Hyphens vs. underscores is exceedingly nitpicky and alarmingly beside the point, but note it may be less confusing for ESL employees to read underscores (at least compared to CamelCase); at the same time, this reportedly annoys some developers due to the awkwardness of reaching the requisite keys.

As far as deciding on a policy, this just means defining in which cases a given Marker or MDC dimension needs to be employed. Keep this tight (centralized, deliberate), but allow for feedback from developers if they feel the set of dimensions and markers are insufficient for the task at hand. Revise/add dimensions and/or attributes as appropriate.

Understand this policy will almost necessarily be project-specific. Not every project needs the same sort of logging analysis. Picture some nightmare scenarios. Then imagine how you would like to be able to analyze the logs in that scenario. You probably don't want to have to write a complicated script to try and track which message belongs to which context, and which state is which at which time, right? Encode whatever such information is necessary as dimensions and Markers, and save yourself some of the hassle if something does go wrong.

Solution 2:

First, MDC.

MDC is really useful in an environment where you have one "entity" that is associated with some behaviour. A typical example: user interacting with a web application. So, let's say you have many users messing around with your web app. Using MDC, you can easily track them without too much hassle. Simplified example:

...[Sandy][abcd] clicked on "change profile"
...[Joe][1234] clicked on "weather reports"
...[Joe][1234] clicked on "Europe"
...[Sandy][abcd] clicked on "logout"
...[Joe][1234] clicked on "logout"
...[Sandy][efgh] logged in

Here, you're using MDC in two places: for username and for session ID. This way, you can easily grep one user's session to see everything they've been doing.

Second, markers.

Markers are usually used for "special" circumstances, such as sending an email to an administrator for some seriously critical errors. Not all errors always fall in the same category; some have to be dealt in an appropriate way.

Or, when a user quits from your service, it usually goes to an INFO log, but you can also use a marker for such instances, if you want events such as this one to go in a separate log file, so you can monitor it more easily for statistical gathering of users quitting.

Rule of thumb:

  • MDC is used for associating multiple events with few "entities"
  • markers are used for "special" events that you want to have filtered from usual ones

Solution 3:

Markers can be used to color or mark a single log statement. What you do with these colors, i.e. markers, is entirely up to you. However, two patterns seem to be common (the first more common than the second) for marker usage.

  1. Triggering: Some appender could be instructed to take an action in the presence of a certain marker. For example, SMTPAppender can be configured to send an email whenever a logging event is marked with the NOTIFY_ADMIN marker regardless of the log level. See marker-based triggering in the logback documentation. You may also combine log levels and markers for triggering.

  2. Filtering: You could for example color/mark all your persistence related logs (in various and multiple class files) with the color "DB". You could then filter for "DB": disable logging except for log statements marked with DB. See the chapter on filters in the logback documentation for more information (search for MarkerFilter).

Solution 4:

Just as an addendum, if you are using logstash and have json logging enabled, there's another potential use of Marker - for logging variables to associate with a specific log message. This is more consistent and easier to parse than including it in the message body. Very useful, if it suits your use-case.

See details here:

https://github.com/logstash/logstash-logback-encoder#loggingevent_custom_event