Future-proof Logging

Some time ago I worked on a project that dealt with lots of exchange orders. These orders were arriving in the system at a rate of hundreds per second: some were filled, others modified or canceled. In other words, there was always some processing involved that needed to leave traces, so that we could understand what had happened to a particular order in case of a client dispute.

The project was developed from scratch, so it was my decision how to introduce logging. Since the feature seemed low priority, not much time was invested in it, and as a result simple unstructured text logging with a timestamp and a priority was implemented. It was very human-readable and seemed good enough until the project went public…

Every week I got a few gigabytes of text logs to process in order to find this or that problem, or just to make sure the system was functioning as designed and everything was fine. During that time, a couple of severe problems were detected very late because of this inconvenient and time-consuming process.
After thinking this mess over, two things became clear: text files are a very inconvenient storage for logs, and a Nagios probe (or some other automated solution) should be reading the log in my place.

Logging mechanism improvements were in order: introduce some structure into log messages (define mandatory and optional parameters that every log message carries) and use a database for log storage.
With these improvements in place, we could easily search the DB for errors ourselves, or set up Nagios probes that regularly do it for us.
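As an illustration, a structured log message of this kind might look as follows. This is only a sketch: the field names (`ts`, `priority`, `order_id` and so on) are my own assumptions, not the actual schema we used.

```python
import json
import datetime

def log_event(priority, message, **params):
    """Emit one structured log record as a JSON line.

    The timestamp, 'priority' and 'message' are the mandatory fields;
    any keyword arguments become optional, message-specific parameters.
    """
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "priority": priority,
        "message": message,
    }
    record.update(params)  # arbitrary per-message parameters
    return json.dumps(record)

# Each order event carries its own set of parameters:
line = log_event("ERROR", "order rejected",
                 order_id="A-1042", reason="insufficient funds")
print(line)
```

A line like this is still readable by a human, but it can also be parsed and filtered mechanically, which is exactly what an automated probe needs.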

To be fair, I must confess that I left the company before the improved logging mechanism was fully introduced. It was only halfway there: only some of the log messages were stored in the DB, namely those produced as a consequence of an exception in the code, and the automatic probes could analyze only this portion of the log, searching the DB for red flags, so to speak.

Even so, this portion of the log stored in the DB showed that some problems remained with this approach: an SQL database, with its fixed schema, is too rigid and inflexible a storage for log messages, which can have an arbitrary number of arbitrary parameters.
The answer is to store the logs in a NoSQL schema-less document database, in an easily parsable form that allows arbitrary parameters for any specific message.
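To sketch the idea, here is a tiny simulation of what an automated probe could do against such a document store. The list of dicts stands in for documents in a schema-less DB (a real probe would run a query instead, e.g. MongoDB's `find({"priority": "ERROR"})`), and all field names here are hypothetical:

```python
# Documents with different parameter sets coexist happily,
# something a fixed SQL schema cannot accommodate cleanly.
records = [
    {"ts": "2012-05-01T10:00:00Z", "priority": "INFO",
     "message": "order filled", "order_id": "A-1040", "qty": 100},
    {"ts": "2012-05-01T10:00:01Z", "priority": "ERROR",
     "message": "order rejected", "order_id": "A-1042",
     "reason": "insufficient funds"},
    {"ts": "2012-05-01T10:00:02Z", "priority": "INFO",
     "message": "order canceled", "order_id": "A-1041"},
]

def red_flags(docs):
    """Return the documents an alerting probe should report."""
    return [d for d in docs if d["priority"] == "ERROR"]

for doc in red_flags(records):
    print(doc["ts"], doc["message"], doc.get("reason", ""))
```

The point is that the probe filters on the mandatory fields while each document keeps whatever optional parameters its message needed.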

I should give the guy who wrote this post credit for making a case for using JSON as a log message format; he also mentions systems that can be used for storing, searching and presenting the logged information.

All of the above makes me think that using structured logging and storing everything in a schema-less DB provides a pretty flexible solution that could withstand the test of time, and in particular ever-changing requirements, which are the hardest test for any system. Even the syslog protocol has had a place for structured log information since RFC 5424 (see section 6.3, STRUCTURED-DATA), although this RFC is still at the Proposed Standard stage.
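For illustration, the STRUCTURED-DATA part of an RFC 5424 message is a sequence of bracketed SD-ELEMENTs carrying name="value" parameters. The sketch below formats one such element (simplified: the RFC also restricts SD-NAME characters and lengths), using the `exampleSDID@32473` SD-ID that appears in the RFC's own examples:

```python
def sd_element(sd_id, **params):
    """Format one RFC 5424 STRUCTURED-DATA element.

    Per the RFC, the characters '\\', '"' and ']' inside a
    PARAM-VALUE must be escaped with a backslash.
    """
    def esc(value):
        return (value.replace("\\", "\\\\")
                     .replace('"', '\\"')
                     .replace("]", "\\]"))
    body = " ".join(f'{name}="{esc(val)}"' for name, val in params.items())
    return f"[{sd_id} {body}]"

print(sd_element("exampleSDID@32473", iut="3", eventSource="Application"))
# → [exampleSDID@32473 iut="3" eventSource="Application"]
```

So even the venerable syslog has converged on the same idea: a few fixed header fields plus an open-ended bag of named parameters.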