CEE prototype and a show-case for the new 3.4 features
You may remember the Lumberjack project I wrote about earlier. It is an attempt to improve system logging by creating conventions and standards to cover structured logs in a general way.
Since its inception, a lot of discussion has taken place on the lumberjack mailing list, a preliminary list of fields has been defined, and umberlog, a library that makes it easy to start generating structured logs on Linux systems, has been published.
As things stand, we now have a CEE-style message source (in the form of a JSON payload embedded in log messages), so the next logical question arises: how do we handle this JSON? Since this is a wonderful use case for the new features of the upcoming syslog-ng 3.4, I’ve decided to create a prototype that shows some of the possibilities.
Please note that this is not yet “the implementation” for CEE; it just shows how syslog-ng can play a role in structured logging. It still lacks a way to store messages so that they can easily be queried, but MongoDB (and the mongodb destination in syslog-ng) could play a role there in the future.
The syslog-ng feature I’m writing about was described in this blog post; in essence, it allows you to combine the various processing capabilities of syslog-ng in an arbitrary manner. Previously, parsers, rewrite rules and filters were completely independent objects. Their work could only be combined in the user's configuration file, and it was difficult to package (or share) syslog-ng configuration snippets that delivered some kind of complex processing.
Now that this has changed, it is possible to create a cee-parser() that looks like this:
block parser cee-parser() {
  channel {
    junction {
      channel {
        parser { json-parser(marker("@cee:") prefix(".cee.")); };
        rewrite {
          set("${.cee.msg}" value("MESSAGE"));
          set-tag(".cee");
        };
        flags(final);
      };
      channel {
        # non-CEE log, convert raw syslog fields as best as we can, umberlog style
        rewrite {
          set("$MSG" value(".cee.msg"));
          set("$PID" value(".cee.pid"));
          set("$PROGRAM" value(".cee.program"));
          set("$HOST" value(".cee.host"));
          set("$S_ISODATE" value(".cee.timestamp"));
        };
        flags(final);
      };
    };
  };
};

As you can see, this block combines a json-parser() with rewrite rules inside a junction and names the whole thing a “cee-parser()” object, which can later be used in the user's configuration file. The point of the parser is to turn all incoming events into CEE-style structured messages. If the incoming message is already in that format, it simply parses the JSON portion. If it’s not, it turns the usual syslog headers into CEE fields.
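To make the two branches of the junction easier to follow, here is a minimal Python sketch of the same decision logic. It is not how syslog-ng implements it: the function name `cee_parse`, the use of a plain dict for the message fields, and the handling of the marker (a simple prefix check) are assumptions made for illustration, and the real json-parser() may treat nested objects and value types differently.

```python
import json

CEE_MARKER = "@cee:"   # marker("@cee:") in the block above
PREFIX = ".cee."       # prefix(".cee.") in the block above


def cee_parse(record):
    """Hypothetical stand-in for cee-parser(); returns (fields, tags).

    `record` is a dict keyed by macro-like names (MESSAGE, PID, PROGRAM,
    HOST, S_ISODATE). These names mirror the config but are only an
    illustration of the data flow.
    """
    fields = dict(record)
    tags = set()
    msg = record.get("MESSAGE", "")

    # First channel: the message carries a CEE JSON payload.
    if msg.startswith(CEE_MARKER):
        try:
            payload = json.loads(msg[len(CEE_MARKER):])
        except ValueError:
            payload = None
        if isinstance(payload, dict):
            for key, value in payload.items():
                fields[PREFIX + key] = value
            # set("${.cee.msg}" value("MESSAGE"))
            fields["MESSAGE"] = str(payload.get("msg", ""))
            # set-tag(".cee")
            tags.add(".cee")
            return fields, tags

    # Second channel: plain syslog, so copy the headers into CEE fields.
    # As in the config shown above, this branch does not set the tag.
    fields[PREFIX + "msg"] = msg
    fields[PREFIX + "pid"] = record.get("PID", "")
    fields[PREFIX + "program"] = record.get("PROGRAM", "")
    fields[PREFIX + "host"] = record.get("HOST", "")
    fields[PREFIX + "timestamp"] = record.get("S_ISODATE", "")
    return fields, tags
```

The early return in the first branch plays the role of flags(final): once a message has been handled as CEE, the fallback branch is not applied to it.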
Here’s the rest of the config that shows how the above can be used:
block source cee-system() {
  channel {
    source { system(); };
    parser { cee-parser(); };
  };
};

source s_local {
  cee-system();
};

destination d_cee {
  channel {
    filter { tags(".cee"); };
    destination {
      file("cee.log" template("$(format-json --key '.cee.*' --replace .cee.=)\n"));
    };
  };
};

destination d_raw {
  file("raw.log");
};

log {
  source(s_local);
  destination(d_cee);
  destination(d_raw);
};

That’s about all: the file cee.log receives all structured events having CEE fields, while raw.log is a regular plain-text log file. Even if the input is JSON, the text file will contain a usual log line. Everything needed to run the config above is in the current git HEAD of syslog-ng 3.4.
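The template in d_cee selects the fields whose names match '.cee.*' and strips the '.cee.' prefix before emitting JSON. As a rough illustration only, the Python sketch below mimics that selection and renaming on a plain dict. The function name `format_cee_json` and its parameters are hypothetical, and details such as key order or how nested names are rendered are assumptions that may differ from what $(format-json) actually produces.

```python
import fnmatch
import json


def format_cee_json(fields, key_glob=".cee.*", strip_prefix=".cee."):
    """Hypothetical sketch of: format-json --key '.cee.*' --replace .cee.=

    Keeps only the fields whose names match `key_glob`, removes
    `strip_prefix` from the matching names, and serializes the result.
    """
    selected = {}
    for name, value in fields.items():
        if fnmatch.fnmatchcase(name, key_glob):
            if name.startswith(strip_prefix):
                name = name[len(strip_prefix):]
            selected[name] = value
    # Keys are sorted here only to make the output deterministic.
    return json.dumps(selected, sort_keys=True)


if __name__ == "__main__":
    example = {".cee.msg": "hello", ".cee.pid": "42", "MESSAGE": "hello"}
    print(format_cee_json(example))
```

Fields outside the '.cee.' namespace, such as MESSAGE in the example, are left out of the JSON output, which is why the structured file and the plain text file can be fed from the same message.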
