Is journalctl in the tail -f mode called "follow" supposed to be realtime? I'm seeing it more or less output log lines in realtime for many hours and then eventually it falls behind with half an hour or one hour delay.
The (simplified) lines are from a perl sshd tracker are:
open( LOG, "journalctl -o short-precise -u sshd --follow |"); while (<LOG>) { ... }
I suppose, it is possible that the delay is in perl, but journalctl seems more likely to be the culprit, hence the subject line.
-wolfgang
On Mon, Feb 9, 2015 at 11:59 PM, Wolfgang S. Rupprecht wolfgang.rupprecht@gmail.com wrote:
Is journalctl in the tail -f mode called "follow" supposed to be realtime? I'm seeing it more or less output log lines in realtime for many hours and then eventually it falls behind with half an hour or one hour delay.
The (simplified) lines are from a perl sshd tracker are:
open( LOG, "journalctl -o short-precise -u sshd --follow |"); while (<LOG>) { ... }
I suppose, it is possible that the delay is in perl, but journalctl seems more likely to be the culprit, hence the subject line.
I haven't seen this. If you quit and then issue a new journalctl -f, do you see a bunch of things that previously weren't there with (approximately) current time? It might be a bug worth inquiring about on systemd-devel@.
Chris Murphy lists@colorremedies.com writes:
On Mon, Feb 9, 2015 at 11:59 PM, Wolfgang S. Rupprecht wolfgang.rupprecht@gmail.com wrote:
Is journalctl in the tail -f mode called "follow" supposed to be realtime? I'm seeing it more or less output log lines in realtime for many hours and then eventually it falls behind with half an hour or one hour delay.
I haven't seen this. If you quit and then issue a new journalctl -f, do you see a bunch of things that previously weren't there with (approximately) current time? It might be a bug worth inquiring about on systemd-devel@.
I do see journalctl output the delayed lines when I run either journalctl by itself or with "-f".
(with slight editing, just to toy with the script kiddies probing the system. ;-))
# journalctl -o short-precise -u ssh-ban -u sshd --lines 73 ... Feb 10 09:30:12.267795 xxx.example.com sshd[10846]: Set /proc/self/oom_score_adj to 0 Feb 10 09:30:12.278631 xxx.example.com sshd[10846]: Connection from 104.236.247.20 port 59270 on 192.168.35.32 port 22 Feb 10 09:49:22.061551 xxx.example.com sshd[10952]: Set /proc/self/oom_score_adj to 0 Feb 10 09:49:22.069974 xxx.example.com sshd[10952]: Connection from 219.153.36.198 port 41053 on 192.168.35.32 port 22 Feb 10 09:55:47.553083 xxx.example.com sshd[10966]: Set /proc/self/oom_score_adj to 0 Feb 10 09:55:47.556836 xxx.example.com sshd[10966]: Connection from 103.41.124.32 port 51058 on 192.168.35.32 port 22 Feb 10 09:55:47.560852 xxx.example.com ssh-ban[764]: Connection 104.236.247.20 Count: 1 Feb 10 09:55:47.561618 xxx.example.com ssh-ban[764]: Connection 219.153.36.198 Count: 2 Feb 10 09:55:47.562250 xxx.example.com ssh-ban[764]: Connection 103.41.124.32 Count: 4 Feb 10 09:55:47.562861 xxx.example.com ssh-ban[764]: SSHBANNED: 103.41.124.32
My script will print significant events to its output which systemd will then throw into the logs. This lets me see the original sshd printf timestamp and the time that my script (ssh-ban) saw it at.
In this case the first connection, from 104.236.247.20 was logged at 09:30:12.278631 but the script saw it at 09:55:47.560852 . That's a delay of 25 minutes.
Thanks for the tip on systemd-devel@ mailing list.
-wolfgang
Is this delayed behavior reproducible in a shell running journalctl -f ? Or only happens with the script log? If you do both at the same time, i.e. one instance of journalctl -f running in a terminal window, another instance executes within your Perl script, once the delay is happening does it appear in both the Perl generated log and the plain journalctl -f terminal window?
If you aren't using rsyslog, I wonder if you can use the single socket designed for this instead, if that's more reliable for something like this?
Anyway, feel free to ignore all of the above. I think you'll get better help from systemd-devel@.
Chris Murphy
Chris Murphy lists@colorremedies.com writes:
Is this delayed behavior reproducible in a shell running journalctl -f ? Or only happens with the script log?
I guess I need to get off my duff and run that test.
If you aren't using rsyslog, I wonder if you can use the single socket designed for this instead, if that's more reliable for something like this?
I didn't know about this. Thanks! I'll look into it and give it a shot.
Anyway, feel free to ignore all of the above. I think you'll get better help from systemd-devel@.
Your outlook is always appreciated. It is clear to me that you have a very good handle on the Linux internals. Thanks for thinking about this and speaking up!
-wolfgang
On Tue, Feb 10, 2015 at 5:22 PM, Wolfgang S. Rupprecht wolfgang.rupprecht@gmail.com wrote:
Chris Murphy lists@colorremedies.com writes:
Is this delayed behavior reproducible in a shell running journalctl -f ? Or only happens with the script log?
I guess I need to get off my duff and run that test.
If you aren't using rsyslog, I wonder if you can use the single socket designed for this instead, if that's more reliable for something like this?
I didn't know about this. Thanks! I'll look into it and give it a shot.
Yeah I don't know any details about this hook; but basically systemd-journald is the single source for logging now, but it provides a socket for rsyslog (and other conventional loggers that have been updated) can grab the stream and do their own thing like they have in the past. But there is only one such socket, and no two loggers can share it, as I understand things.
The typical use case for this is setting systemd-journald to use volatile storage instead of writing to disk, and then via this socket you'd have rsyslog write a conventional /var/log/messages file. However, in your case you don't have to disable persistent storage (to disk), you can keep the default behavior writing the journal to disk (or redirect it to a server for longer term storage); and then have your special case logger doing its thing locally. I have no idea if there's a perl hook for this socket or how difficult it'd be to build one. But I bet dollars to donuts someone on systemd-devel@ already knows the answer to this question.
Anyway, feel free to ignore all of the above. I think you'll get better help from systemd-devel@.
Your outlook is always appreciated. It is clear to me that you have a very good handle on the Linux internals. Thanks for thinking about this and speaking up!
I think I have a below average handle on most Linux internals. I mostly just have a vivid imagination!
Chris Murphy lists@colorremedies.com writes:
Yeah I don't know any details about this hook; but basically systemd-journald is the single source for logging now, but it provides a socket for rsyslog (and other conventional loggers that have been updated) can grab the stream and do their own thing like they have in the past. But there is only one such socket, and no two loggers can share it, as I understand things.
Thanks for the background.
I think I have a handle on what was happening. The delay wasn't in the input side of things but in my using the stdout/stderr to do a quick and dirty logging via systemd's redirection of stdout/stderr into the logs. Systemd seems to delay things a bit more as time goes on. Maybe a delay based on the total byte-count as some auto-adjusting efficiency hack?
In any case this shell script with the output to a gnome terminal showed only 3 or 4 milliseconds delay over a 24+ hr period.
journalctl -f -o short-precise -u apcupsd | \ while read line do date="$(date '+%b %d %H:%M:%S.%N')" echo "$date -- $line" done
Feb 12 17:32:50.127051639 -- Feb 12 17:32:50.124113 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,29.0,00.0,000.0,000.0,119.0,100.0,1 Feb 12 17:33:50.615503948 -- Feb 12 17:33:50.610058 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,29.0,00.0,000.0,000.0,119.0,100.0,0 Feb 12 17:35:04.240717368 -- Feb 12 17:35:04.237443 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,29.0,00.0,000.0,000.0,119.0,100.0,1 Feb 12 17:36:57.216696271 -- Feb 12 17:36:57.213388 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,28.0,00.0,000.0,000.0,119.0,100.0,0 Feb 12 17:37:57.702263685 -- Feb 12 17:37:57.699308 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,28.0,00.0,000.0,000.0,119.0,100.0,1 Feb 12 17:38:58.198243702 -- Feb 12 17:38:58.185059 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,28.0,00.0,000.0,000.0,119.0,100.0,0 Feb 12 17:39:58.669490321 -- Feb 12 17:39:58.666990 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,28.0,00.0,000.0,000.0,119.0,100.0,1 Feb 12 17:40:59.152423114 -- Feb 12 17:40:59.148905 arbol.wsrcc.com apcupsd[1147]: 000.0,000.0,000.0,27.27,00.00,28.0,00.0,000.0,000.0,119.0,100.0,0
That ultra-low delay compared to what I was seeing got me wondering about the delay in the output of my perl program.
Thanks for the ideas. Now that I replace the lazy logging in my perl script with perl calls to syslog(), things are logging so rapidly that the scripts log entries and sshd's log entries are interleaved. That's never happened before.
I think I have a below average handle on most Linux internals. I mostly just have a vivid imagination!
;-)
A good imagination is the most valuable debugging skill.
-wolfgang