Hi Xunlei,
On Friday 09 December 2016 08:17 AM, Xunlei Pang wrote:
On 12/08/2016 at 02:47 PM, Pratyush Anand wrote:
> This tool enables a few tracepoints like mm_page_alloc, mm_page_free,
> module_load and module_put into the kernel and then keeps track of the peak
> memory usage by a task or kernel module.
>
> Signed-off-by: Pratyush Anand <panand(a)redhat.com>
> ---
>
> Hi All,
>
> This is an initial version of the C code, which is yet to take the shape of a
> full tool, and is open for your review comments. My idea is to have a systemd
> service which launches this tool as early as possible after kernel boot. It
> will start tracking peak memory usage from then on, until it receives a
> SIGTERM signal, which we can probably send from a systemd stop service.
>
[snip]
> + *
> + * usage:
> + * # gcc -o memtrace memtrace.c
> + * # ./memtrace &
> + * (if tracing directory is not mounted at /sys/kernel/debug/tracing/ then
> + * pass path of tracing directory as argument like following)
> + * # ./memtrace /sys/kernel/tracing/ &
> + * (to get current stats on screen)
> + * # killall -s SIGUSR1 memtrace
> + * (to save current stats in file /tmp/mem_debug_log)
> + * # killall -s SIGUSR2 memtrace
> + * (to terminate the application and to save current stats in file)
> + * # killall -s SIGTERM memtrace
> + */
> +
> +#include <fcntl.h>
> +#include <search.h>
> +#include <signal.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> Hi Pratyush,
>
> Thanks for the work, I think it's useful; I once used it to analyze the
> memory consumption of the lvm2 tools. We also need such a dedicated tool in
> the restricted kdump environment to debug various memory issues.
>
> Here are some comments.
Thanks for your comments.
> I think you missed the following headers, I got some gcc warnings when compiling:
Hmm.. I did not get these warnings.
> #include <ctype.h>
> #include <unistd.h>
>
> There are also some other warnings when building with "gcc -Wall":
Yes, I did not use -Wall. Thanks, I will fix them.
> memtrace.c: In function 'read_next_entry':
> memtrace.c:161:15: warning: format '%d' expects argument of type 'int',
> but argument 2 has type 'int *' [-Wformat=]
>    printf("cpu:%d\n", cpu);
>                       ^
> memtrace.c: In function 'process_entries':
> memtrace.c:329:15: warning: unused variable 'pidno' [-Wunused-variable]
>    int fd, ret, pidno, cpu, memory;
>                 ^~~~~
> memtrace.c: In function 'main':
> memtrace.c:437:12: warning: unused variable 'tid' [-Wunused-variable]
>    pthread_t tid;
>              ^~~
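For what it's worth, the -Wformat warning means an int pointer is being passed where "%d" expects the int itself; dereferencing fixes it. A minimal illustration of that class of warning (not the tool's actual code):

```c
#include <stdio.h>

/* gcc -Wall flags printf("cpu:%d\n", cpu) when cpu is an int pointer;
 * "%d" consumes an int, so the pointer must be dereferenced first. */
static int print_cpu(const int *cpu)
{
	return printf("cpu:%d\n", *cpu);	/* not: printf("cpu:%d\n", cpu); */
}
```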
> +
> +#define MAX_TASK_TO_MONITOR 200
> In the shell world many tasks get forked; for example, the following will
> result in failures:
>
>     for i in $(seq 1 1 210); do ls > /dev/null; done
I agree, and so I had written these as open points in comments.
Probably, I will allocate all these dynamically.
> This is very common for shell, so I would suggest also monitoring the
> "sched:sched_process_exit" event, outputting the result and freeing the
> trace_entry accordingly.
We can free them only after printing the stats for the user. Now we need to
decide whether we should also print the stats for a task when it exits.
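Dropping the MAX_TASK_TO_MONITOR limit could look roughly like this: a growable array of trace_entry, with slots released when a sched_process_exit event is seen. This is only a sketch under assumed names (alloc_entry/release_entry are hypothetical, not part of the posted patch):

```c
#include <stdlib.h>
#include <string.h>

struct trace_entry {
	char key[64];
	char comm[17];
	int peak_memory;
	int memory;
};

static struct trace_entry *entries;
static size_t nr_entries, capacity;

/* Grow the table on demand instead of using a fixed-size array. */
static struct trace_entry *alloc_entry(const char *key)
{
	if (nr_entries == capacity) {
		size_t newcap = capacity ? capacity * 2 : 64;
		struct trace_entry *tmp;

		tmp = realloc(entries, newcap * sizeof(*tmp));
		if (!tmp)
			return NULL;
		entries = tmp;
		capacity = newcap;
	}
	memset(&entries[nr_entries], 0, sizeof(entries[nr_entries]));
	strncpy(entries[nr_entries].key, key,
		sizeof(entries[nr_entries].key) - 1);
	return &entries[nr_entries++];
}

/* On sched:sched_process_exit: report the task's stats first, then
 * reuse the slot.  Swap-remove keeps the array dense. */
static void release_entry(struct trace_entry *e)
{
	/* (print e->peak_memory for the user here, if desired) */
	*e = entries[--nr_entries];
}
```

Note that swap-remove moves entries around, so any hash table (hsearch) holding pointers into the array would need re-insertion after a removal.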
> +#define MAX_NUMBER_OF_CPUS 128
> +
> +#define DEFAULT_LOG_PATH "/tmp/mem_trace_log"
> I'd suggest outputting the result to stdout, which won't surprise users
> ("where is the result?"). We can rely on the shell's ">>" to do the
> redirection as needed.
I have both mechanisms. SIGUSR1 prints to stdout and SIGUSR2 prints to a
file. This is mentioned in the comment at the top of the file as well as in
the README:
https://github.com/pratyushanand/memtrace/blob/master/README.md
However, I agree that we could have an even better user interface.
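For context, the two-signal interface can be wired up roughly as below. dump_stats is a stand-in for the tool's real reporting routine; the handler only records the request, since stdio calls are not async-signal-safe:

```c
#include <signal.h>
#include <stdio.h>

#define DEFAULT_LOG_PATH "/tmp/mem_trace_log"

static volatile sig_atomic_t dump_req;	/* 1 = SIGUSR1, 2 = SIGUSR2 */

/* Record which dump was requested; the main loop does the actual I/O. */
static void handle_sig(int sig)
{
	dump_req = (sig == SIGUSR1) ? 1 : 2;
}

/* Stand-in for the tool's reporting routine. */
static void dump_stats(FILE *out)
{
	fprintf(out, "peak memory stats ...\n");
}

/* Called from the trace-reading loop whenever a dump was requested:
 * SIGUSR1 goes to stdout, SIGUSR2 appends to the log file. */
static void maybe_dump(void)
{
	if (dump_req == 1) {
		dump_stats(stdout);
	} else if (dump_req == 2) {
		FILE *f = fopen(DEFAULT_LOG_PATH, "a");

		if (f) {
			dump_stats(f);
			fclose(f);
		}
	}
	dump_req = 0;
}
```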
> +
> +struct trace_entry {
> + char key[64];
> + char comm[17];
> + int peak_memory;
> + int memory;
> +};
> +
> +struct debug_trace_info {
> + char trace_path[64];
> + char tracing_on_path[64];
> + char set_event_path[64];
> + char events_enable_path[64];
> + char cur_mod[MAX_NUMBER_OF_CPUS][64];
> What's the purpose of cur_mod here? We can easily move the module info into
> the individual "trace_entry" using "task-pid" as the hint, like what I did
> in the dracut "memtrace-ko.sh".
The key for allocating a new entry is module+pid, because we can have
multiple module insertions from the same modprobe. cur_mod was necessary
because I was reading trace_pipe. Now that we can also read
per_cpu/trace_pipe_raw, it can probably be avoided.
In the combined trace_pipe we can have something like this:

  comm:modprobe -> module_load for mod_1 at cpu 1
  comm:modprobe -> module_load for mod_2 at cpu 3
  comm:modprobe -> mm_page_alloc for mod_1 at cpu 1

So, if we do not keep cur_mod for each cpu, we would charge the above
mm_page_alloc to mod_2, while it actually belongs to mod_1.
Anyway, with per_cpu data we will not need this.
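The interleaving above is why cur_mod is indexed by CPU: a page allocation is charged to whichever module was last loaded on the same CPU. A simplified sketch of that attribution logic (function names are illustrative, not the patch's actual code):

```c
#include <string.h>

#define MAX_NUMBER_OF_CPUS 128

/* Last module seen in a module_load event, per CPU. */
static char cur_mod[MAX_NUMBER_OF_CPUS][64];

/* module_load event on @cpu: remember the module name. */
static void on_module_load(int cpu, const char *mod)
{
	strncpy(cur_mod[cpu], mod, sizeof(cur_mod[cpu]) - 1);
}

/* mm_page_alloc event on @cpu: return the module the allocation should
 * be charged to, or NULL if no module was loaded on this CPU yet. */
static const char *charge_alloc(int cpu)
{
	return cur_mod[cpu][0] ? cur_mod[cpu] : NULL;
}
```

With the interleaving from the example, the allocation on cpu 1 is correctly charged to mod_1 even though mod_2's module_load arrived later on cpu 3.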
~Pratyush