Hi,
This patch splits abrt-dump-oops into a generic log watcher tool, abrt-watch-log, and oops finder, which retains the name abrt-dump-oops:
$ abrt-watch-log --help Usage: abrt-watch-log [-vsw] FILE PROG [ARGS]
Watch log file FILE, run PROG when it grows or is replaced
-v, --verbose Be verbose -s Log to syslog -w Do not exit, watch FILE for changes
$ abrt-dump-oops --help Usage: abrt-dump-oops [-vsox] [-d DIR]/[-D]
Extract oops from standard input
-v, --verbose Be verbose -s Log to syslog -o Print found oopses on standard output -d DIR Create ABRT dump in DIR for every oops found -D Same as -d DumpLocation, DumpLocation is specified in abrt.conf -x Make the problem directory world readable
Instead of running
abrt-dump-oops -rwxD /var/log/messages
about the same effect is achieved by
abrt-watch-log -w /var/log/messages -- abrt-dump-oops -xD
This opens up a possibility to have other log files watched, with different processing apps.
Problems: * How to kill a log watcher which watches particular file?
Ideas: * Add --match='STR' and --match-regex='REGEX' switches to limit the frequency of executions of the helper. * Remove -w switch as redundant? * Retain optional FILE parameter of abrt-dump-oops?
1.patch is preparatory, it removes old abrt-dump-oops.c - otherwise it's hard to read the changes. 2.patch is the gist of the change.
Please review.
On 03/14/2012 01:51 PM, Denys Vlasenko wrote:
This patch splits abrt-dump-oops into a generic log watcher tool, abrt-watch-log, and oops finder, which retains the name abrt-dump-oops.
...
Instead of running
abrt-dump-oops -rwxD /var/log/messages
about the same effect is achieved by
abrt-watch-log /var/log/messages -- abrt-dump-oops -xD
And here is the second user of abrt-watch-log: Xorg crash analyzer:
$ abrt-dump-xorg --help Usage: abrt-dump-xorg [-vsox] [-d DIR]/[-D] [FILE]
Extract Xorg crash from FILE (or standard input)
-v, --verbose Be verbose -s Log to syslog -o Print found crash data on standard output -d DIR Create ABRT dump in DIR for every crash found -D Same as -d DumpLocation, DumpLocation is specified in abrt.conf -x Make the problem directory world readable
Please review the attached patch.
Denys Vlasenko dvlasenk@redhat.com writes:
diff --git a/src/plugins/abrt-dump-xorg.c b/src/plugins/abrt-dump-xorg.c new file mode 100644 index 0000000..f90a6c0 --- /dev/null +++ b/src/plugins/abrt-dump-xorg.c .... .... .... +/* I want to use -Werror, but gcc-4.4 throws a curveball: + * "warning: ignoring return value of 'ftruncate', declared with attribute warn_unused_result" + * and (void) cast is not enough to shut it up! Oh God... + */ +#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0)
src/daemon/abrt-dbus.c:28:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0) src/daemon/abrtd.c:34:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0) src/hooks/abrt-hook-ccpp.c:33:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0)
we already have it 3 times. move it to libabrt.h
+static char *skip_pfx(char *p) +{ + if (p[0] != '[') + return p; + char *q = strchr(p, ']'); + if (!q) + return p; + if (q[1] == ' ') + return q + 2; + return p; +}
we already have this code. little buried in src/lib/kernel.c:record_oops() 148 /* remove jiffies time stamp counter if present 149 * jiffies are unsigned long, so it can be 2^64 long, which is 150 * 20 decimal digits*/ 151 if (*c == '[') 152 { 153 char *c2 = strchr(c, '.'); 154 char *c3 = strchr(c, ']'); 155 if (c2 && c3 && (c2 < c3) && (c3-c) < 21 && (c2-c) < 8) 156 { 157 c = c3 + 1; 158 if (*c == ' ') 159 c++; 160 } 161 }
the refactorization would be nice
+static char *list2lines(GList *list) +{ + struct strbuf *s = strbuf_new(); + while (list) + { + strbuf_append_str(s, (char*)list->data); + strbuf_append_char(s, '\n');
strbuf_append_strf("%s\n", (char*)list->data); will do the job at one call
except those small things is every thing ok
On 03/15/2012 05:44 PM, Nikola Pajkovsky wrote:
Denys Vlasenkodvlasenk@redhat.com writes:
diff --git a/src/plugins/abrt-dump-xorg.c b/src/plugins/abrt-dump-xorg.c new file mode 100644 index 0000000..f90a6c0 --- /dev/null +++ b/src/plugins/abrt-dump-xorg.c .... .... .... +/* I want to use -Werror, but gcc-4.4 throws a curveball:
- "warning: ignoring return value of 'ftruncate', declared with attribute warn_unused_result"
- and (void) cast is not enough to shut it up! Oh God...
- */
+#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0)
src/daemon/abrt-dbus.c:28:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0) src/daemon/abrtd.c:34:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0) src/hooks/abrt-hook-ccpp.c:33:#define IGNORE_RESULT(func_call) do { if (func_call) /* nothing */; } while (0)
we already have it 3 times. move it to libabrt.h
It's a compiler workaround/hack akin to those ugly (void)func() casts, only even worse looking one. I guarantee you: in 1-2 years when gcc changes again and we don't need IGNORE_RESULT(), we will start removing it. I don't want it to be in the header: it's ugly...
+static char *skip_pfx(char *p) +{
- if (p[0] != '[')
return p;
- char *q = strchr(p, ']');
- if (!q)
return p;
- if (q[1] == ' ')
return q + 2;
- return p;
+}
we already have this code. little buried in src/lib/kernel.c:record_oops() 148 /* remove jiffies time stamp counter if present 149 * jiffies are unsigned long, so it can be 2^64 long, which is 150 * 20 decimal digits*/ 151 if (*c == '[') 152 { 153 char *c2 = strchr(c, '.'); 154 char *c3 = strchr(c, ']'); 155 if (c2&& c3&& (c2< c3)&& (c3-c)< 21&& (c2-c)< 8) 156 { 157 c = c3 + 1; 158 if (*c == ' ') 159 c++; 160 } 161 }
the refactorization would be nice
Good idea.
+static char *list2lines(GList *list) +{
- struct strbuf *s = strbuf_new();
- while (list)
- {
strbuf_append_str(s, (char*)list->data);
strbuf_append_char(s, '\n');
strbuf_append_strf("%s\n", (char*)list->data); will do the job at one call
Yes. I did it this way because printf code is slow. I don't feel strongly about it though...
Denys Vlasenko dvlasenk@redhat.com writes:
+static void run_scanner_prog(int fd, struct stat *statbuf, char **prog) +{ + /* fstat(fd, &statbuf) was just done by caller */ + + off_t cur_pos = lseek(fd, 0, SEEK_CUR); + if (statbuf->st_size <= cur_pos) + { + /* If file was truncated, treat it as a new file. + * (changing inode# causes caller to think that file was closed or renamed) + */ + if (statbuf->st_size < cur_pos) + statbuf->st_ino++; + return; /* we are at EOF, nothing to do */ + } + + VERB3 log("File grew by %llu bytes, from %llu to %llu", + (long long)(statbuf->st_size - cur_pos), + (long long)(cur_pos), + (long long)(statbuf->st_size)); + + pid_t pid = vfork(); + if (pid < 0) + perror_msg_and_die("vfork"); + if (pid == 0) + { + xmove_fd(fd, STDIN_FILENO); + execvp(prog[0], prog); + perror_msg_and_die("Can't execute '%s'", prog[0]); + } + + safe_waitpid(pid, NULL, 0);
I think, we should here check returned value from execvp program. Program which SIGSEGV or fails from whatever reason is pointless to run.
I can't help myself, but I think I'm reading very stupid cron, which is activated by inotify :)
Apart of this, its looks good
On 03/15/2012 05:26 PM, Nikola Pajkovsky wrote:
Denys Vlasenkodvlasenk@redhat.com writes:
+static void run_scanner_prog(int fd, struct stat *statbuf, char **prog) +{
- /* fstat(fd,&statbuf) was just done by caller */
- off_t cur_pos = lseek(fd, 0, SEEK_CUR);
- if (statbuf->st_size<= cur_pos)
- {
/* If file was truncated, treat it as a new file.
* (changing inode# causes caller to think that file was closed or renamed)
*/
if (statbuf->st_size< cur_pos)
statbuf->st_ino++;
return; /* we are at EOF, nothing to do */
- }
- VERB3 log("File grew by %llu bytes, from %llu to %llu",
(long long)(statbuf->st_size - cur_pos),
(long long)(cur_pos),
(long long)(statbuf->st_size));
- pid_t pid = vfork();
- if (pid< 0)
perror_msg_and_die("vfork");
- if (pid == 0)
- {
xmove_fd(fd, STDIN_FILENO);
execvp(prog[0], prog);
perror_msg_and_die("Can't execute '%s'", prog[0]);
- }
- safe_waitpid(pid, NULL, 0);
I think, we should here check returned value from execvp program. Program which SIGSEGV or fails from whatever reason is pointless to run.
- but you need to run it to find out that it doesn't work, so I don't understand your proposal - you mean if we detect it failed once then disable it and never try to run it again?
I can't help myself, but I think I'm reading very stupid cron, which is activated by inotify :)
- yes, because we want it to be event driven not timer driver..
Apart of this, its looks good
Jiri Moskovcak jmoskovc@redhat.com writes:
On 03/15/2012 05:26 PM, Nikola Pajkovsky wrote:
Denys Vlasenkodvlasenk@redhat.com writes:
+static void run_scanner_prog(int fd, struct stat *statbuf, char **prog) +{
- /* fstat(fd,&statbuf) was just done by caller */
- off_t cur_pos = lseek(fd, 0, SEEK_CUR);
- if (statbuf->st_size<= cur_pos)
- {
/* If file was truncated, treat it as a new file.
* (changing inode# causes caller to think that file was closed or renamed)
*/
if (statbuf->st_size< cur_pos)
statbuf->st_ino++;
return; /* we are at EOF, nothing to do */
- }
- VERB3 log("File grew by %llu bytes, from %llu to %llu",
(long long)(statbuf->st_size - cur_pos),
(long long)(cur_pos),
(long long)(statbuf->st_size));
- pid_t pid = vfork();
- if (pid< 0)
perror_msg_and_die("vfork");
- if (pid == 0)
- {
xmove_fd(fd, STDIN_FILENO);
execvp(prog[0], prog);
perror_msg_and_die("Can't execute '%s'", prog[0]);
- }
- safe_waitpid(pid, NULL, 0);
I think, we should here check returned value from execvp program. Program which SIGSEGV or fails from whatever reason is pointless to run.
- but you need to run it to find out that it doesn't work, so I don't
understand your proposal - you mean if we detect it failed once then disable it and never try to run it again?
try to think what would happen if you have an app with auto start and it fails to start or do some operation (we have inotify). running it over and over doesn't bring you anything good. in worst case it could eat cpu, memory or i/o.
so if execvp(prog[0], prog) return on pid exit != 0, we should stop watch-log and report to abrt; hey man, we have some hiccups please report to abrt <email>
I can't help myself, but I think I'm reading very stupid cron, which is activated by inotify :)
- yes, because we want it to be event driven not timer driver..
Apart of this, its looks good
On 03/16/2012 10:53 AM, Nikola Pajkovsky wrote:
Jiri Moskovcakjmoskovc@redhat.com writes:
On 03/15/2012 05:26 PM, Nikola Pajkovsky wrote:
Denys Vlasenkodvlasenk@redhat.com writes:
- pid_t pid = vfork();
- if (pid< 0)
perror_msg_and_die("vfork");
- if (pid == 0)
- {
xmove_fd(fd, STDIN_FILENO);
execvp(prog[0], prog);
perror_msg_and_die("Can't execute '%s'", prog[0]);
- }
- safe_waitpid(pid, NULL, 0);
I think, we should here check returned value from execvp program. Program which SIGSEGV or fails from whatever reason is pointless to run.
- but you need to run it to find out that it doesn't work, so I don't
understand your proposal - you mean if we detect it failed once then disable it and never try to run it again?
try to think what would happen if you have an app with auto start and it fails to start or do some operation (we have inotify). running it over and over doesn't bring you anything good. in worst case it could eat cpu, memory or i/o.
It will start at most once per two seconds. So, yes, it won't be nice, but not disastrous either.
so if execvp(prog[0], prog) return on pid exit != 0, we should stop watch-log
Maybe. Maybe not. Maybe we need to make it optional? (--die-on-worker-failure?)
Denys Vlasenko dvlasenk@redhat.com writes:
On 03/16/2012 10:53 AM, Nikola Pajkovsky wrote:
Jiri Moskovcakjmoskovc@redhat.com writes:
On 03/15/2012 05:26 PM, Nikola Pajkovsky wrote:
Denys Vlasenkodvlasenk@redhat.com writes:
- pid_t pid = vfork();
- if (pid< 0)
perror_msg_and_die("vfork");
- if (pid == 0)
- {
xmove_fd(fd, STDIN_FILENO);
execvp(prog[0], prog);
perror_msg_and_die("Can't execute '%s'", prog[0]);
- }
- safe_waitpid(pid, NULL, 0);
I think, we should here check returned value from execvp program. Program which SIGSEGV or fails from whatever reason is pointless to run.
- but you need to run it to find out that it doesn't work, so I don't
understand your proposal - you mean if we detect it failed once then disable it and never try to run it again?
try to think what would happen if you have an app with auto start and it fails to start or do some operation (we have inotify). running it over and over doesn't bring you anything good. in worst case it could eat cpu, memory or i/o.
It will start at most once per two seconds. So, yes, it won't be nice, but not disastrous either.
so if execvp(prog[0], prog) return on pid exit != 0, we should stop watch-log
Maybe. Maybe not. Maybe we need to make it optional? (--die-on-worker-failure?)
yup, I like the idea about --die-on-worker-failure.
crash-catcher@lists.fedorahosted.org