Bazsi's blog

Guarding Your Business

Archive for the ‘Logging’ Category

CEE prototype and a show-case for the new 3.4 features

Sunday, May 6, 2012 @ 09:05 PM Author: Balázs Scheidler

You may remember the Lumberjack project I wrote about earlier. It is an attempt to improve system logging by creating conventions and standards to cover structured logs in a general way.

Since its inception, a lot of discussion has taken place on the lumberjack mailing list, a preliminary list of fields has been defined, and umberlog, a library that lets applications seamlessly start generating structured logs on Linux systems, has been published.

As things stand we now have a CEE-style message source (in the form of a JSON payload embedded in log messages), but the next logical question arises: How to handle this JSON stuff? And since this is a wonderful use-case for the new features of the upcoming syslog-ng 3.4 I’ve decided to create a prototype that shows some possibilities.

Please note that this is not yet “the implementation” for CEE; it just shows how syslog-ng can play a role in structured logging. It still lacks a way to store messages so that they can easily be queried, but MongoDB (and the mongodb destination in syslog-ng) can play a role there in the future.

The syslog-ng feature I’m writing about was described in this blog post. In essence, it allows you to combine the various processing capabilities of syslog-ng in an arbitrary manner. Previously, parsers, rewrite rules and filters were completely independent objects: their work could only be combined in the user’s configuration file, and it was difficult to package (or share) syslog-ng configuration snippets that would deliver some kind of complex processing.

Now that this has changed, it is possible to create a cee-parser() that looks like this:

block parser cee-parser() {
  channel {
    junction {
      channel {
        parser { json-parser(marker("@cee:") prefix(".cee.")); };
        rewrite {
          set("${.cee.msg}" value("MESSAGE"));
          set-tag(".cee");
        };
        flags(final);
      };
      channel {
        # non-CEE log, convert raw syslog fields as best as we can, umberlog style
        rewrite {
          set("$MSG" value(".cee.msg"));
          set("$PID" value(".cee.pid"));
          set("$PROGRAM" value(".cee.program"));
          set("$HOST" value(".cee.host"));
          set("$S_ISODATE" value(".cee.timestamp"));
        };
        flags(final);
      };
    };
  };
};

As you can see, this block combines a json-parser() with rewrite rules inside a junction and names the whole thing “cee-parser()”, an object that can later be used in the user’s configuration file. The point of the parser is to turn all incoming events into CEE-style structured messages. If the incoming message is already in that format, it simply parses the JSON portion. If it’s not, it turns the usual syslog headers into CEE fields.

Here’s the rest of the config that shows how the above can be used:

block source cee-system() {
  channel {
    source { system(); };
    parser { cee-parser(); };
  };
};

source s_local {
  cee-system();
};

destination d_cee {
  channel {
    filter { tags(".cee"); };
    destination {
      file("cee.log" template("$(format-json --key '.cee.*' --replace .cee.=)\n"));
    };
  };
};

destination d_raw {
  file("raw.log");
};

log {
   source(s_local);
   destination(d_cee);
   destination(d_raw);
};

That’s about all. The file cee.log receives all structured events having CEE fields; the other is a regular plain text log file. Even if the input is JSON, the text file will be a usual log. All that’s needed to run the config above is in the current git HEAD of syslog-ng 3.4.

 

First alpha release of syslog-ng 3.4 published

Sunday, March 11, 2012 @ 02:03 PM Author: Balázs Scheidler

I’ve just uploaded the first release in the upcoming 3.4.x series. This is an incremental step over 3.3.x, continuing to enhance syslog-ng with features that allow more in-depth processing of messages.

I consider the most important one to be the ability to freely combine different kinds of processing elements (parsers, rewrite rules and filters) with sources and/or destinations and handle the combination as a single object. This is listed as “junctions & channels” below, but you can also read more details in this blog post.

This release is certainly not meant to be used in production; however, it helps a lot if you try your production configuration with it and report back on the results. The syslog-ng configuration parser was heavily modified in this release, so even as little as this helps to improve syslog-ng. I hope that binaries for the experimental repositories of various distributions will show up shortly; until then you can always clone the git tree.

Here’s an excerpt of the NEWS entry that describes the changes compared to 3.3.x:

Features:

  • Support for junctions & channels was added, which improves the flexibility of the syslog-ng configuration language. This allows combining sources with their closely tied processing functionality (like parser, rewrite and filter statements). Read this blog post for more information: http://bazsi.blogs.balabit.com/2012/01/syslog-ng-flexibility-improvements/

    In the final form of the functionality the “log” keyword as described in the blog post above was replaced with “channel”.

  • The functionality to query and manipulate sets of name-value pairs (often referred to as value-pairs and used in the mongodb() destination driver and the $(format-json) template function) got significantly improved. It is now possible to change the name of the keys when creating the output. See this commit for more information: https://github.com/bazsi/syslog-ng-3.4/commit/ddc7c2539bd66fa35e8df441e4baf58e87b6708d
  • Plugins & modules are now demand-loaded automatically if the “autoload-compiled-modules” global variable is set to 1, which is the default. Any shared library found on the module search path is considered for loading if the configuration file contains a reference to a functionality it provides. To disable this functionality, simply set the referenced variable to 0 with a “@define” statement and load modules explicitly via “@module” statements.

    To list the available plugins & modules, use the --module-registry command line option for syslog-ng, which results in a detailed listing.

  • Added a new parser named json-parser() to parse incoming JSON formatted messages. See this commit for more information: https://github.com/bazsi/syslog-ng-3.4/commit/e5569687bba2551c89a78faee55bcf8b4944066f

     

  • Added a number of template functions:

    $(length ARG)                - length of a template expression
    $(substr ARG START [LEN])    - substring of a string
    $(strip ARG)                 - remove white space from the start and end
    $(sanitize ARG1 ARG2)        - join args to form a filename while removing special characters like ‘/’
    $(+), $(-), $(*), $(/), $(%) - perform numeric operations

  • Reload of the configuration can now be triggered using “syslog-ng-ctl reload”.
  • A new macro named $LOGHOST was added, which expands to the local hostname running syslog-ng.
  • A set of time macros prefixed with “C_” was added. These use the current time instead of the reception time (prefixed R_) or the time that was included in the message (prefixed S_). This means that C_DATE expands to the current date, whereas R_DATE would expand to the date the current message was received at. https://github.com/bazsi/syslog-ng-3.4/commit/c2d17009e2ce14960acb519750fe2537b05e6f46
  • Improved error reporting by including the configuration-file location of the object associated with the error. This makes it easier to diagnose errors even in the case of otherwise unnamed objects.
  • This release also includes all fixes of the 3.3 branch, which are not listed here for brevity’s sake. The merged commit ID is: bf742b0, which is a couple of patches ahead of “3.3.4”.
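
To make a few of the items above more concrete, here is a minimal configuration sketch. It is not part of the NEWS file: the module names, the file path and the template layout are assumptions made for illustration, so check the output of --module-registry to see what your build actually provides.

```
@version: 3.4

# Turn off demand-loading and load modules explicitly instead.
# The module names below are assumptions, verify them with --module-registry.
@define autoload-compiled-modules 0
@module syslogformat
@module basicfuncs
@module affile
@module afsocket

source s_local {
  unix-dgram("/dev/log");
};

destination d_demo {
  # $C_DATE: current time, $R_DATE: reception time, $LOGHOST: local hostname
  # $(length) and $(substr) are two of the new template functions
  file("/var/log/demo.log"
       template("$C_DATE $R_DATE $LOGHOST len=$(length $MSG) $(substr $MSG 0 32)\n"));
};

log {
  source(s_local);
  destination(d_demo);
};
```

With the default setting (autoload-compiled-modules set to 1) the @define and @module lines can simply be left out.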

Credits:

syslog-ng is developed as a community project, and as such it relies on volunteers to do the work necessary to produce syslog-ng.

Reporting bugs, testing changes, writing code or simply providing feedback are all important contributions, so if you are a user of syslog-ng, please contribute.

These people have helped in this release:

  • Andreas Piesk
  • Balazs Scheidler (BalaBit)
  • Balint Kovacs (BalaBit)
  • Evan Rempel (University of Victoria)
  • Gergely Nagy (BalaBit)
  • Heiko Gerstung
  • Hendrik Völker (Verizon)
  • Jakub Jankowski (superhost.pl)
  • Martin Grauel (BalaBit)
  • Matthias Runge (Fedora)
  • Patrick Hemmer
  • Russ Milne (Seccuris)

syslog-ng git repo moved to github

Monday, January 16, 2012 @ 08:01 PM Author: Balázs Scheidler

I’ve been playing with github the last couple of months for git hosting, and since I like what I see and users seem to have found this out on their own, I figured this should be official.

The git repositories are being moved from their old git.balabit.hu location to:

syslog-ng 3.3
http://github.com/bazsi/syslog-ng-3.3.git

syslog-ng 3.4
http://github.com/bazsi/syslog-ng-3.4.git

Older releases (like 3.2) will remain on git.balabit.hu for now. Please update your git clones with the new URL.

The syslog-ng webpages will be updated shortly.

syslog-ng flexibility improvements

Sunday, January 15, 2012 @ 06:01 PM Author: Balázs Scheidler

Update: The syntax of this feature has slightly changed due to discussions on the mailing list (e.g. change the log keyword to channel), and I’ve updated it to use the current syntax.

syslog-ng is often referred to as a very flexible application when it comes to processing logs. Over the years, however, I began to feel that some things are a bit more difficult to achieve in the configuration language than they should be. For instance, it is sometimes too rigid when you need a combination of parsers (patterndb with db-parser) and rewrite rules to achieve your goal. Parsers and rewrite rules are distinct parts of the configuration, and it is not possible to combine them into a single piece of functionality. Also, declaring objects first and then referencing them later makes the configuration easy to read, but it is sometimes quite cumbersome, for example when you only need to invert the result of an already existing filter.

To solve this situation, I’ve set out to implement an idea I have had in mind for some time now. It is quite difficult to describe the feature in clear and concise words, as it is a combination of various changes that together make the syslog-ng configuration more flexible and easier to use, without sacrificing readability. Curious? Please read on.

In-line objects

Perhaps the simplest of all features is that you can now define the contents of a given object right on the spot, without having to use a separate statement. For example, earlier you had to write:

log {
  source(s_local);
  filter(f_postfix);
  destination(d_postfix);
};

Sometimes the f_postfix filter is only used once and is trivial. In that case, the same thing can now be written as:

log {
 source(s_local);
 filter { program("^postfix/"); };
 destination(d_postfix);
};

Furthermore, both the source() and destination() references can be written in-line: you simply use braces instead of parentheses. The same functionality applies to everything: sources, destinations, filters, parsers and rewrite rules.
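
As a rough sketch of what that looks like when every object is in-line (the file paths here are made up for illustration):

```
log {
  # in-line source: braces with a driver inside, instead of source(s_name)
  source { file("/var/log/mail.log"); };
  filter { program("^postfix/"); };
  # in-line destination, written the same way
  destination { file("/var/log/postfix.log"); };
};
```

Whether this reads better than named objects depends on how often the pieces are reused; for one-off rules it keeps everything in one place.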

Junctions

A limited form of junctions has been supported since syslog-ng 3.0 in the form of “embedded log statements”, which has been generalized now. Within syslog-ng, when a message is received it is dispatched to a log processing path or pipeline, which carries out the task at hand. A junction is a point in the log processing path where the processing is performed on multiple independent branches, each doing its own specific thing with the message.

The limited functionality in 3.0 only allowed the processing tree to split (or fork) into independent branches, and each of the branches was a “sink” where processing also ended. Configuration example:

log {
  source(s_all); filter(f);
  log { filter(f1); destination(d1); };
  log { filter(f2); destination(d2); };
};

This sample forks the processing path into two branches starting with the “log” keyword within the top-level log statement. The first branch evaluates the filter f1 and then writes matching messages to the d1 destination, effectively sending all messages that match (f AND f1) to d1. Likewise, d2 receives all messages that match (f AND f2).

The limitation of the embedded log statement concept was simple: it could only be listed at the very end of a log statement, and the end-result of the branches couldn’t be processed further. Effectively the message at the end of each branch “fell off”. Junctions, on the other hand, make it possible to do things to messages once the branches converge to the same point again. Repeating the sample above, it is now possible to write:

log {
  source(s_all); filter(f);
  junction {
    channel { filter(f1); destination(d1); };
    channel { filter(f2); destination(d2); };
  };
  destination(d_all);
};

The new thing is that you can now add processing after the branches finish their processing. A somewhat more useful example would be:

log {
  source(s_apache_files);
  source(s_syslog);
  junction {
    channel { filter(f_apache_files); rewrite(r_apache_remove_file_header); parser(p_apache); flags(final); };
    channel { filter(f_apache_syslog); parser(p_apache); flags(final); };
  };
  destination(d_files);
};

This example processes incoming logs differently depending on where the message came from.

Everything is a log expression

This feature is probably the most complicated, but it gives the configuration very nice properties and expressiveness. From now on, not just the well-known log statement allows the specification of log processing rules, but all the objects in the syslog-ng configuration file can use the same expressive power.

It is now possible to use embedded log statements, junctions and in-line object definitions within source, destination, filter, rewrite and parser definitions. You could ask: what benefit does this bring me? Well, until now, objects of different types were separate entities, connected using log statements. With this change, a source can also specify a rewrite rule, and that combination can be used as a log source in a log statement.

For instance, a usual source definition looked like this:

source s_apache {
  file("/var/log/apache/error.log");
};

If you wanted to process this log file in a specific way, you needed to define the accompanying processing rules (parsers and rewrite expressions) and combine them in a log statement. But how about this:

source s_apache {
  channel {
    source { file("/var/log/apache/error.log"); };
    parser(p_apache_parser);
  };
};

log { source(s_apache); ... };

Can you see? The s_apache source uses a file source together with a reference to a specific parser, so all messages read from the apache error log file are processed by that parser. The log statement is just as simple as if s_apache were a “normal” source definition. This feature allows pairing the essential log preprocessing functionality very closely with the source itself, making it very easy to write and read the log statements. As an added bonus, it becomes very easy to distribute application specific source & parser definitions as an SCL configuration snippet.
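
A distributable snippet of that kind could look roughly like the sketch below. The block name apache-error() is hypothetical, and both p_apache_parser and d_files are assumed to be defined elsewhere in the configuration.

```
# hypothetical SCL-style snippet: a source bundled with its parser
block source apache-error() {
  channel {
    source { file("/var/log/apache/error.log"); };
    parser(p_apache_parser);
  };
};

# the user's configuration only needs to reference the block
source s_apache {
  apache-error();
};

log { source(s_apache); destination(d_files); };
```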

Where?

This stuff is available in the syslog-ng 3.4 git tree, on master. It passes the included regression test, so it is at least dogfoodable. The nice thing about the implementation is that it only slightly increased the code size, but brought a lot of new features. If you have trouble getting the code from git, let me know, I’m willing to create an alpha release, so that it becomes easier to play with it.

Feedback

I see a lot of potential in this functionality, though my examples may not have been the best ones. I would really appreciate any kind of feedback; please send it to the syslog-ng mailing list or to me in a private email.

syslog-ng and the journal

Tuesday, December 6, 2011 @ 12:12 PM Author: Balázs Scheidler

There’s an ongoing project to create a new logging subsystem for Linux, called the journal, by Lennart Poettering of PulseAudio & systemd fame. It is implemented as a core component of systemd, thus has a good chance to be integrated to all distributions that carry systemd: Fedora, openSUSE, and probably others.

The vision and design is described in a paper here.

The reactions to the idea were mixed: there are some good features behind the idea, however it changes a couple of fundamental UNIX traditions. See article and comments here.

Since syslog-ng is also in the logging sphere, the logical question arises: how does this new project affect syslog-ng in the long run?

The short answer to that question is that it’ll probably help syslog-ng, but please read on.

Journald now & future

Right now journald is a very limited syslog implementation that only focuses on local logging: it collects local syslog messages, converts them into name-value pairs, then adds some trusted ones (like the pid, uid and gid) and writes these into journal files for storage.

The idea of working with structured messages in journald is currently limited because of the required application changes: only traditional syslog fields are available, the application message is stored within a single field. The vision here is to add structured logging to applications.

Other sources of local logs are to be integrated too: stuff like login/logout records (wtmp), audit logs and firmware (ACPI) logs.

The file format is interesting, although it is the source of most of the negative feelings: it is a binary format. It is undocumented (for now), with a library in the works to read & write these files. The problem most people see is that in emergency situations the file may become corrupt, and information crucial for diagnosing the problem that caused the corruption in the first place can be lost.

As far as I understand, if applications were to support journal, they’d have to write records to the journal through the API without involving journald itself. This means that the file structure must support inter process synchronization, or that each application would log to different files to avoid that (using UUIDs for instance).

Network transport of the journal doesn’t exist yet. The vision seems to be that the journal is structured on the disk so that journals from multiple hosts can be merged simply by copying them to the same host. Ideas such as rsync or NFS mounts were floated as solutions to the transport problem. This is modeled a bit like a git repository, which is great for storing source code, but may not be as great for log storage.

Further processing of logs is not in the cards, i.e. journald would only take and store logs. In order to support higher level processing, applications would need to be modified and external tools to process logs would be needed.

As it seems, journald leans towards using a distributed model: each host has its own journals, and whenever you want to look up records, you go back to the source. It tries to address security concerns via cryptographic means (like using a chained HMAC for stored log messages).

Comparing to syslog-ng

syslog-ng has become much more than a syslogd in recent times. It supports structured messages similarly to journald (name-value pairs), can even extract such values from unstructured messages (db-parser), and can store them in much more than text files: MongoDB keeps the structure and provides indexed access, but output as JSON objects into simple text files is possible too.
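
As a minimal sketch of those two storage options (the file path and the chosen value-pairs scope are assumptions for illustration, and mongodb() is used with its default settings):

```
# keep the name-value pairs as one JSON object per line in a text file
destination d_json {
  file("/var/log/messages.json" template("$(format-json --scope rfc5424)\n"));
};

# or store each message as a MongoDB document
destination d_mongo {
  mongodb();
};
```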

With syslog-ng (unlike journald), the user is in control: she can influence the storage policy and use whatever files or databases she pleases to store data. This is flexibility on one hand, but can be a problem if one wants to create tools that universally work with logs out-of-the-box.

Some users store messages in /var/log, others in /logs, yet others use MongoDB for their log storage. Writing a GUI application that works with log data out-of-the-box without having to specify where these are stored is next to impossible. The syslog model is to use specialized tools for each of these storage mechanisms and home-grown scripts to do site specific processing. The reason is simple: use-cases are so different (from mail logs to financial transactions) and log data so voluminous (in the 100TB scale) that one size doesn’t fit all.

syslog-ng is more like an infrastructure for the actual, potentially site-specific log processing needs, while journald is a complete system for a more limited use-case, which may just be enough for a number of users.

The security model of traditional syslog is to get logs off the potentially vulnerable system, leaving as small a window for potential modification as possible. Certainly sending out a hash value periodically is possible with the journal too, however the actual log messages could be lost if the source system is compromised. This way syslog leans toward a centralized system, which itself can be distributed by using trusted intermediaries that store log data when needed, but _independently_ of the source host.

Currently the only feature that journald has over syslog-ng is ‘trusted fields’, i.e. the fact that journald can determine the actual uid, gid, pid… values, making it more difficult to forge messages on the local host. Although these are handy, I didn’t see the need for them in practice in the past 13 years I’ve been maintaining syslog-ng, even though I knew about the possibility of getting this information from the kernel. Anyway, adding this to syslog-ng is not difficult, probably an hour or two of work, and I may just do that to demonstrate. rsyslog has rushed to implement these, probably for the very same reason: http://blog.gerhards.net/2011/11/trusted-properties-in-rsyslog.html

The other planned log sources are partially supported by syslog-ng too. For instance, BSD process accounting logs are supported directly (http://bazsi.blogspot.com/2010/07/syslog-ng-and-process-accounting.html), and similar support can also be added for the others, if not already integrated with syslog.

What Next?

I think that journald can become great for computers where logging is not the primary function. If the user has never changed her syslog.conf file, journald would provide much more for the user out-of-the-box, than does the current syslog. These are:

  • proper logging during the boot process
  • integration with the service manager, easing troubleshooting for failed services (saving stdout and stderr)
  • GUI application for ad-hoc checking of logs
  • the ability to programmatically query logs without having to care about site-specific policies (how log files are organized for instance)

For those cases where logging is important, mandated by regulations or by operations for a heterogeneous enterprise system, journald will probably not be enough, not even if the whole vision is accomplished. As I see it, journald will not be adequate in these areas:

  • off-system storage for a long period (1-5 years is mandated by various regulations)
  • on-line log collection for getting the message off the potentially vulnerable system as fast as possible (being late at most a minute is acceptable, but hourly syncs are not)
  • performance for on-line collection and storage (I’ve seen requirements for handling up-to 250k msg/sec)
  • interoperability with syslog: not just receiving and storing but also preprocessing, normalization & classification.
  • existing standards
  • open for on-line integration with home-grown processing tools and SIEMs

These are the ways syslog-ng can benefit from journald:

  • structured logging in applications: if these would actually emerge, syslog-ng would be there to support them too
  • syslog-ng as an application: currently syslog-ng is used as a system component, replacing syslogd, which drives some features that may not match the primary vision of syslog-ng itself. If local logging were taken care of by journald, syslog-ng could focus on what it does best: collecting, preprocessing, normalizing & classifying logs, including the ones in journald.

Journald can mean that Linux boxes would probably be installed without a full-blown syslogd by default.

As long as interoperability with a syslog application is a goal for journald (and I fail to see why it wouldn’t be), syslog-ng can happily coexist with the journal and can itself leverage all the benefits that journald brings. Journald will only replace syslog and syslog-ng in a limited use-case, which is not a primary focus for syslog-ng.

Currently, syslog-ng is not a default choice for the majority of distributions, which means that right now one needs to explicitly install syslog-ng over the default. This will not change much by the introduction of the journal, except the fact that the current default is a more direct competition for syslog-ng. If journald replaces that role, the playing field would be leveled somewhat.

syslog-ng multithreaded performance

Saturday, July 30, 2011 @ 08:07 PM Author: Balázs Scheidler

It seems that the BalaBit syslog-ng team that produces the Premium Edition of syslog-ng has beaten the community project this time, at least in terms of release date.

syslog-ng Premium Edition 4F1 (i.e. the first feature release past 4.0) has been released this week. It is the first release of PE in a long time that is based on an actual OSE core, namely 3.3.

I still have about 100 patches to review and integrate into OSE, hopefully with community involvement. But more about that in an upcoming post.

It is also interesting that some performance testing was done, and the new core does pretty well: it scales nicely on an 8-core machine, up to 800k msg/sec in some configurations. Here’s the post that has some more details.

Now, if only the fixes they made were properly integrated into the OSE repository... but hey, life would be too easy without challenges.

On CVE-2011-1951: bug or security issue?

Sunday, July 10, 2011 @ 09:07 AM Author: Balázs Scheidler

There’s an ongoing debate on the Linux Kernel Mailing List, whether security issues need separate attention. While I agree that distributing available information on security relevance is a good thing, I can also understand the concerns about the “security circus”. Being the finder of a security bug has value in one’s reputation, and building that reputation is a priority for some. Positioning mere bugs as security problems is quite simple: any crash-bug can easily become ‘Denial of Service’.

Certainly, crashes in a piece of critical infrastructure are a severe problem, and addressing those bugs is important, but singling out one case over the others diverts energy from other, perhaps more important efforts.

syslog-ng had a bug in its PCRE support code: whenever the PCRE engine returned an error code when performing a match on a global (/g) pattern, it entered into an infinite loop, eating up memory and CPU cycles until memory runs out or it gets killed. It is important to note that at this point the regular expression is already compiled, so triggering this condition is not as easy as specifying an incorrect regular expression in the syslog-ng configuration file.

But anyway, mishandling error returns is bad, especially as, starting with version 8.12, PCRE always returns an error here because of the bug in the glue code. This new version is popping up in various distros, and was the first hint that led to finding this bug.

This means that you always get the infinite loop with a recent PCRE version if you have a subst() rewrite rule in global mode and a message arrives that does not match. I’d describe this as the most obvious case to run into when testing one’s deployment.

And quite possibly this bug is never triggered in any other cases.
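
For reference, the kind of rule in question looks roughly like the sketch below. The rule name, the pattern and the replacement string are made up for illustration; the relevant parts are type("pcre") together with flags("global").

```
rewrite r_mask_secret {
  # PCRE substitution applied to every match in $MESSAGE (global mode);
  # a message that does not contain "secret" is the non-matching case
  subst("secret", "******", value("MESSAGE") type("pcre") flags("global"));
};
```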

Is this bug serious? Well, for those affected this can certainly cause some trouble. Is it worth fixing? Of course. Is it a security problem on a production system, with a stable software and configuration environment? Well, doubtful.

But still, syslog-ng is being patched at a feverish pace in distros, and probably by users with custom compiled packages. This effort could perhaps be used to test and provide feedback on the syslog-ng version in development: 3.3.

Log messages in 3d

Sunday, June 26, 2011 @ 09:06 AM Author: Balázs Scheidler

Algernon has created a nice 3d visualization for log messages, 12 days compressed into one and a half minutes of video. Very nice music too :) The post describing how it works is here.

 

Behind the scenes: syslog-ng 3.3

Monday, June 20, 2011 @ 08:06 PM Author: Balázs Scheidler

I just wanted to let you know that fixes are nicely coming into the 3.3 beta tree, although it might not be very visible from the outside.

So if you are considering trying out 3.3, I’d suggest trying a git snapshot instead of the 3.3beta1 tarball.

I’m trying to release a beta2 or rc1 in the near future. The version number depends on how much feedback we get until then :)

Repository for syslog-ng 3rd party modules

Saturday, June 4, 2011 @ 01:06 PM Author: Balázs Scheidler

I have long wanted to create a repository to hold things that are not yet integrated into the syslog-ng codebase. Things can end up in this phase either because of unaddressed technical reasons or because of lack of time. Surely, if I were maintaining the repository, the time issue wouldn’t be solved, but fortunately Algernon has stepped up and started it, which means that whenever I’m distracted, modules still have a place to live.

The intent is to have these migrated to the official source code if time permits.

Right now, it has the SMTP destination, but probably will receive other goodies in the future.

I’m happy :)

 

syslog-ng in Kindle

Friday, May 20, 2011 @ 06:05 AM Author: Balázs Scheidler

CzP has found out that syslog-ng is used in the Amazon Kindle. Seems like our userbase is in the millions. :)

 

syslog-ng 3.3 feature freeze, 3.4 branch opened

Sunday, May 1, 2011 @ 10:05 PM Author: Balázs Scheidler

With the recent maintenance policy updates in my last post, I plan to quickly release a maintenance version for 3.2 (with version number 3.2.3) and then to concentrate on getting 3.3 into a stable form, starting with a beta release.

As a reminder, here are the new features of syslog-ng 3.3:

  • performance improvements:
    • new multi-threaded core that allows syslog-ng to scale into the hundred thousand message/sec range by using all the CPU cores available in the system
    • use epoll() system call instead of traditional poll() (where available)
    • transaction support in the SQL destination driver, resulting in significant performance improvements (not LOAD DATA though)
    • buffered output for destination files at the cost of some latency
    • other miscellaneous changes that improve performance
  • MongoDB destination driver with support for creating documents based on the dynamic syslog-ng message structure
  • $(format-json) template function that converts messages into a JSON representation
  • systemd support (which was backported to the 3.2 release as well to support distributions in their integration work on systemd)

As you can see, this release is clearly performance oriented; hopefully 3.4 will also come with new and exciting features. For now, I’ve opened the 3.4 branch in order to have a place where new stuff can go, instead of languishing as patches on the mailing list. I’m quite excited about the new threaded core and I see further opportunities, although I can hardly imagine someone having the several hundred megabytes/sec of logs that the current core can deliver.

Also, the non-performance related items on the list above were contributed by members of the community, so by all means this release contains much more community work than previous ones. Thanks guys.

Maintenance Policy update for syslog-ng Open Source Editions

Sunday, May 1, 2011 @ 10:05 PM Author: Balázs Scheidler

Dear syslog-ng users,

As discussed on the syslog-ng mailing list, the current versioning policies regarding syslog-ng Open Source Edition are confusing, and with the proliferation of syslog-ng versions, their maintenance is an increasing burden on the syslog-ng project. Currently three major versions are supported (3.0, 3.1 and 3.2) and a fourth one (3.3) is in active development.

A decision was made that the distinction between “feature” and “stable” releases of syslog-ng OSE will cease to exist: all releases will have the same status support-wise:

  • they will be supported for a year, or
  • until the next stable release is made
  • whichever is longer

Also, the versioning of Open Source and Premium editions will become completely independent, and it is not possible to compare their functionality on the version number alone. The Premium Edition will always be based on a specific Open Source release, and provide additional functionality compared to the base version. OSE releases published after the PE release may provide additional functionality, not yet present in a PE release.

The changes above cause some of our currently supported versions to be deprecated. In order to provide a time window for migration to a newer release, the following EOL dates were set:

  • syslog-ng 3.0: 30th, June 2011.
  • syslog-ng 3.1: 30th, June 2011.

Everyone running these or earlier releases should upgrade to the latest 3.2.x release, which is currently at 3.2.2 (with 3.2.3 being prepared).

For more information:

The syslog-ng project on its own doesn’t provide syslog-ng binaries, except for a limited number of Linux distributions. It is expected that users compile syslog-ng on their own, or use the binary provided by the OS supplier.

This change doesn’t affect you if

  • you run a Premium Edition of syslog-ng, or
  • you have a support contract in place that says otherwise.

Happy Logging,

Bazsi

Intrusion Prevention with syslog-ng

Wednesday, February 23, 2011 @ 03:02 PM Author: Balázs Scheidler

Valentijn has published (blog post, mailing list archive) a nice hack using syslog-ng to actively react to intrusion attempts with patterndb and iptables. The blocking part is implemented using the iptables recent match, which is capable of closing an open port for a certain amount of time. This is controlled by syslog-ng: whenever a login failure is received, syslog-ng informs the recent module about it.

And please note that it doesn’t matter which application the intruder is trying to use: by feeding new rules into patterndb, you can have the same functionality for any of your applications, with the syslog-ng configuration unchanged.
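For those who want a feel for the moving parts, here is a minimal sketch of the idea. It is my own reconstruction under assumptions, not Valentijn’s actual configuration: the name-value pairs (usracct.type, secevt.verdict, usracct.device) are assumed to be set by the patterndb rules, “blocked” is a hypothetical recent list name, and the proc path may differ between kernel versions. On the iptables side, something like a rule matching “-m recent --name blocked --rcheck --seconds 60” with a DROP target would be needed to act on the list.

```
@version: 3.2

source s_local { unix-stream("/dev/log"); };

# Classify messages with patterndb; the rules are expected to recognize
# failed logins and to extract the client address.
parser p_patterndb {
  db-parser(file("/var/lib/syslog-ng/patterndb.xml"));
};

# Assumption: the patterndb rules set these name-value pairs.
filter f_login_failure {
  match("login" value("usracct.type")) and
  match("REJECT" value("secevt.verdict"));
};

# Writing "+<address>" into the proc file of the recent module adds the
# address to the list. "blocked" is a hypothetical list name.
destination d_recent {
  file("/proc/net/xt_recent/blocked"
       template("+${usracct.device}\n"));
};

log {
  source(s_local);
  parser(p_patterndb);
  filter(f_login_failure);
  destination(d_recent);
};
```

The point made above is visible here: nothing in this configuration is specific to one application, so covering a new service should only require new patterndb rules that set the same name-value pairs.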

Nice idea, thanks Valentijn.

syslog-ng’s development drivers

Sunday, February 6, 2011 @ 01:02 PM Author: Balázs Scheidler

I got some interesting comments in a forum posting, outlining a perception of how syslog-ng’s development is driven by BalaBit. The original post is here, but the interesting quote I’d like to react to is this:

@all Some general points:
A main difference between rsyslog and syslog-ng is that syslog-ng is backed by a large commercial organisation and a commercial fork of syslog-ng. This is not necessarily bad. But it is a difference to how rsyslog is driven: I get some funding for rsyslog development from Adiscon, but Adiscon is much, much smaller. So rsyslog is more of a community project. I have to admit that we at Adiscon still hope to find some commercial funding, e.g. via support contracts or custom development. But we do have far less of the traditional commercial machinery behind that (maybe it would be better we had, at least for me ;) ).

[...]

To sum up: both rsyslog and syslog-ng are quite good implementations, each with their own strength. I think the main difference is the process in which they are created (more commercial-focus vs. community/technology-focus).

This is an interesting perspective, although, as you may imagine, it doesn’t match my own idea of how things work. But probably many of my reasons are not visible from the outside, thus this post. I hope to shed some light on how syslog-ng development is being done; maybe that’ll help you understand the situation better.

BalaBit may be larger than Adiscon, but syslog-ng development has never been its primary focus. syslog-ng was represented differently in the company for a long time: it was my hobby. Later on, we received so many development requests and support queries that we decided to build a product that is not simply the game (and responsibility) of one single guy, and thus Premium Edition was born. The way we carried that out could have been better, but I think this was discussed well enough in earlier posts (here, and here). I think we’ve addressed most of the concerns in this area with the change in licensing, dropping the requirement for copyright assignment and so on.

Today, there’s a development team inside BalaBit, which delivers Premium Edition to paying customers, and there’s me and a couple of other guys in the company, who are not members of that team, but still want to work on syslog-ng. Since we fulfill other roles, in work-time we are not working on syslog-ng, but in our free-time (and coffee breaks :) we can talk, extend and work on syslog-ng hand-in-hand with the community.

That’s how syslog-ng is driven.

This means that the Open Source and the Premium editions are on different tracks: features/bugfixes may be present in one and not present in the other. Letting them drift apart would be quite cumbersome to maintain, thus from time to time we synchronize them. That’s when features go to the OSE version or vice versa.

The point is that syslog-ng Open Source Edition is independent from the Premium Edition, different people are doing it, with a different motivation. The OSE version in our case is not a bait for people to buy the Premium Edition, you can be a happy OSE user and be completely satisfied, or you can be a PE user if you need the backing services. It’s up to you.

The motivation for customers to buy syslog-ng Premium Edition is the professional support, the predictable release schedule, the number of platforms it reliably runs (and gets tested) on, the installation packages and the fact that you can actually have legal assurances that a piece of your infrastructure works, and there’s someone to blame and turn to if it doesn’t. Also, the PE team does thorough testing on functionality coming from the OSE branch, not because it’d be done badly on purpose, but because people usually focus on different things when they implement functionality in an open source setting.

I don’t see a big difference between syslog-ng and rsyslog in the way they are driven. In fact Rainer is in about the same position at Adiscon as I am at BalaBit.

Article on message correlation

Tuesday, February 1, 2011 @ 06:02 PM Author: Balázs Scheidler

There’s a good writeup on syslog-ng correlation functions on LWN. Since it is currently for subscribers only, here’s a link that you can use to read it until it becomes publicly available.

http://lwn.net/SubscriberLink/424459/dc2ec3fee7d80d3b/

LWN is a great publication by the way, so consider subscribing if you can.

syslog-ng releases

Sunday, January 16, 2011 @ 04:01 PM Author: Balázs Scheidler

I’ve made a round of syslog-ng releases in the last couple of weeks.

Of these, 3.0.10 and 3.1.4 are quite similar, as they carry almost the same set of bugfixes, which you can find in the respective changelogs. 3.2.2 is different, however: it is a slightly larger update, as the 3.2.x branch of syslog-ng is the most recent. I’m quite happy with how the 3.2.x beta period went, as the bugs found since the initial 3.2 release were not at all earthquakes: although they certainly affect some people, they are mainly in the newly introduced functionality (e.g. the correlation engine), while the basic functionality of syslog-ng remained quite stable. And considering the size of the 3.1 -> 3.2 update, this is a result on its own.

mongodb() driver for syslog-ng

Tuesday, January 11, 2011 @ 05:01 PM Author: Balázs Scheidler

Update: The driver has a homepage of its own at http://asylum.madhouse-project.org/projects/syslog-ng/mongodb/

Though I had no chance to look at it yet, Algernon has posted a MongoDB destination driver for syslog-ng. I can’t wait to have a closer look at it, hopefully I get a chance in the coming days, but until then be sure to check it out. The announcement went to the syslog-ng mailing list, here’s a direct link:

http://comments.gmane.org/gmane.comp.syslog-ng/10322

Threading + epoll on 3.3 mainline

Tuesday, December 21, 2010 @ 04:12 PM Author: Balázs Scheidler

I’ve achieved an important milestone on the current threading stuff and I’m happy to tell you that the multi-processing and epoll-related performance work is progressing nicely. The current master branch of the syslog-ng-3.3 tree passes the testsuite (make check) and performs much better than earlier releases.

The only performance data so far was measured on my laptop, where throughput grew from about 60k to 180k msg/sec. While doing fixes and adding locking here and there, it went down to 160k and I didn’t investigate why that happened. But anyway, 160k msg/sec from a single client is not really bad either, and my guess is that adding more clients (and CPUs) to the picture will scale syslog-ng into the range of several hundred thousand messages per second.

I still have some locking work to do, and I’ve just found problems with the udp() destination driver, so it is currently quite fragile, but I’d appreciate any kind of feedback you could give by installing it on your test systems. Production is of course out of the question.

Until now, the work was available in the “wip/epoll” branch, which I rebased regularly, so that fixes for the problems I found were incorporated into the original “threading” patchset. However, that patch has grown quite large by now and I feel it’d be easier to track changes as individual patches instead of folding them back into the original series. Therefore I merged it back to “master”, and from now on the wip/epoll branch will be removed and further fixes will be published on the “master” branch.

In order to compile this stuff you’ll need one dependency library: ivykis. Ivykis is written by Lennert Buytenhek and encapsulates an epoll based event loop. It also supports other systems like FreeBSD’s kqueue, Solaris’s /dev/poll and of course the traditional select/poll system calls. I needed a couple of modifications against ivykis, those are hosted on git.balabit.hu, more specifically at: git://git.balabit.hu/bazsi/ivykis.git. I’m working with Lennert to incorporate my changes, so that hopefully no changes to upstream ivykis will be necessary.

In the coming days, I’ll try to fix up the things that broke, and then quite possibly do a 3.3alpha1 release once I feel that it is getting stable enough for anyone to try.

Stay tuned!

syslog-ng 3.2 in openSUSE

Wednesday, December 8, 2010 @ 09:12 PM Author: Balázs Scheidler

The adoption rate of syslog-ng 3.2 is marvellous. It was made available for Mandriva on the date of the release, and about a week later openSUSE Factory had a package, thanks to Marius Tomaschewsky. I also received a patch to include support for cygwin in the system() source, courtesy of Corinna Vinschen. FreeBSD ports still has 3.2beta1; hopefully it’ll be updated soon.

I’m happy.