Handling of "trusted" properties coming from untrusted inputs
by Miloslav Trmač
Hello,
while working on the Lumberjack message processing "pipeline", a design question has come up that is probably a known issue for logging in general, so I'd like to ask before we start implementing.
What to do about "trusted" properties coming from untrusted input sources?
In particular, we are aiming at an rsyslog configuration where anything passed to /dev/log is automatically annotated with "pid", "uid" and other fields; at that point the information is 100% reliable.
Similarly, syslog messages can come from remote hosts over TCP, and if those hosts are similarly configured by the same administrators, the data is reliable as well.
OTOH, if syslog messages come from hosts that are not under the same control (e.g. data coming from user-administered machines), this data is not "reliable" - what to do about it?
We have come up with three possibilities:
a) Just leave it in, and let the user filter the data out based on other properties (e.g. host name) at the time of log searching (or just let the user notice when reading the log message ad hoc) - some queries and statistics may have misleading results.
b) Delete the untrusted fields - drops data.
c) Somehow mark the fields as "untrusted" - preserves all data, and allows queries that ignore it.
c1) Is it really any better than just filtering on host names as in a)?
c2) How can we mark the fields as "untrusted", in particular in the context of Lumberjack?
I'm leaning towards a) as the UNIX-tradition "worse is better" solution, but I really don't have that much experience in logging, so any feedback would be appreciated.
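To make the comparison concrete, here is a minimal sketch of the three options applied to one parsed message. It is not Lumberjack or rsyslog code: the field list, the set of trusted hosts, the "_untrusted" key and the apply_policy() helper are all hypothetical names chosen for illustration, and nesting the fields under a marker key is only one possible answer to c2).

```python
# Hypothetical names throughout: the field list, the trusted-host set and the
# "_untrusted" key are illustrative assumptions, not Lumberjack or rsyslog API.
TRUSTED_FIELDS = ("pid", "uid", "gid")
TRUSTED_HOSTS = {"localhost", "log1.example.com"}


def apply_policy(event, source_host, policy):
    """Return a copy of `event` handled according to option 'a', 'b' or 'c'."""
    if policy not in ("a", "b", "c"):
        raise ValueError("unknown policy: %r" % (policy,))

    result = dict(event)

    # Messages from hosts under the same administration are kept unchanged.
    if source_host in TRUSTED_HOSTS:
        return result

    if policy == "a":
        # a) Leave the fields in; searches have to filter on host name later.
        return result

    if policy == "b":
        # b) Delete the unverifiable fields; this loses data.
        for name in TRUSTED_FIELDS:
            result.pop(name, None)
        return result

    # c) Keep the values, but move them under a marker key so that queries
    #    on the top-level "pid"/"uid" fields never see them.
    moved = {name: result.pop(name) for name in TRUSTED_FIELDS if name in result}
    if moved:
        result["_untrusted"] = moved
    return result


if __name__ == "__main__":
    sample = {"msg": "session opened", "pid": 4242, "uid": 0}
    for option in ("a", "b", "c"):
        print(option, apply_policy(sample, "laptop.example.org", option))
```

With a) and c) no data is lost; the difference is whether a query on "uid" has to remember to exclude untrusted hosts (a) or is safe by default (c).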
Mirek
11 years, 8 months
[ANN]: libumberlog 0.3.0 released [API & ABI breakage]
by Gergely Nagy
Hi!
Since a detailed NEWS[1] file is available, I'll keep this short: with
many thanks to Miloslav Trmač <mitr(a)redhat.com>, libumberlog 0.3.0 is
out now.
There are a couple of breaking changes in this release: both the API and
the ABI changed, and 0.3.0 is not backwards compatible with earlier
versions in any way. The three most significant breaking changes are:
* The library comes in two variants now: an LD_PRELOAD-able version that
  overrides syslog() & friends, and a version one can link against,
  which does not override syslog() & co.
* The LOG_UL_NODISCOVER flag was renamed to LOG_UL_NOIMPLICIT.
* Instead of bolting the new flags onto ul_openlog(), one must call a
  separate, new function: ul_set_log_flags() to set the new flags.
As far as the LD_PRELOAD usage goes, the only change this brings is that
the shared object to LD_PRELOAD is now called
$pkglibdir/libumberlog_preload.so.
For users of the ul_*() family of functions: migrating to
ul_set_log_flags() and LOG_UL_NOIMPLICIT gets the job done.
The other changes - while also important - have significantly less
effect overall; see the NEWS file for details.
I highly recommend switching to this version, as it also fixes a couple
of bugs, and the new API is the way forward in the future.
As usual, it is available on github, both via git[2], and as a
tarball[3].
[1]: https://github.com/algernon/libumberlog/blob/master/NEWS
[2]: git://github.com/algernon/libumberlog.git
[3]: https://github.com/downloads/algernon/libumberlog/libumberlog-0.3.0.tar.xz
--
|8]
11 years, 8 months