Hi Xunlei,
On Friday 09 December 2016 08:17 AM, Xunlei Pang wrote:
On 12/08/2016 at 02:47 PM, Pratyush Anand wrote:
> This tool enables a few tracepoints like mm_page_alloc, mm_page_free,
> module_load and module_put into the kernel and then keeps track of the peak
> memory usage by a task or kernel module.
>
> Signed-off-by: Pratyush Anand <panand(a)redhat.com>
> ---
>
> Hi All,
>
> This is an initial version of the C code, which is yet to take the shape of a
> full tool, and is open for your review comments. My idea is to have a systemd
> service which launches this tool as early as possible after kernel boot. It
> will start tracking peak memory usage from then on, until it receives a
> SIGTERM signal, which we can probably send from a systemd stop service.
>
[snip]
> + *
> + * usage:
> + * # gcc -o memtrace memtrace.c
> + * # ./memtrace &
> + * (if tracing directory is not mounted at /sys/kernel/debug/tracing/ then
> + * pass path of tracing directory as argument like following)
> + * # ./memtrace /sys/kernel/tracing/ &
> + * (to get current stats on screen)
> + * # killall -s SIGUSR1 memtrace
> + * (to save current stats in file /tmp/mem_debug_log)
> + * # killall -s SIGUSR2 memtrace
> + * (to terminate the application and to save current stats in file)
> + * # killall -s SIGTERM memtrace
> + */
> +
> +#include <fcntl.h>
> +#include <search.h>
> +#include <signal.h>
> +#include <stdbool.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> Hi Pratyush,
>
> Thanks for the work, I think it's useful; I once used it to analyze the
> memory consumption of the lvm2 tools. We also need such a dedicated tool in
> the restricted kdump environment to debug various memory issues.
>
> Here are some comments.
Thanks for your comments.
> I think you missed the following headers, I got some gcc warnings when compiling:
Hmm.. I did not get these warnings.
> #include <ctype.h>
> #include <unistd.h>
>
> There are also some other warnings when building with "gcc -Wall":
Yes, I did not use -Wall. Thanks, I will fix them.
> memtrace.c: In function 'read_next_entry':
> memtrace.c:161:15: warning: format '%d' expects argument of type 'int',
> but argument 2 has type 'int *' [-Wformat=]
>    printf("cpu:%d\n", cpu);
>                       ^
> memtrace.c: In function 'process_entries':
> memtrace.c:329:15: warning: unused variable 'pidno' [-Wunused-variable]
>    int fd, ret, pidno, cpu, memory;
>                 ^~~~~
> memtrace.c: In function 'main':
> memtrace.c:437:12: warning: unused variable 'tid' [-Wunused-variable]
>    pthread_t tid;
>              ^~~
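For what it's worth, the -Wformat warning means an int pointer is being passed where "%d" expects the int itself; dereferencing fixes it. A minimal illustration of that class of warning (not the tool's actual code):

```c
#include <stdio.h>

/* gcc -Wall flags printf("cpu:%d\n", cpu) when cpu is an int pointer;
 * "%d" consumes an int, so the pointer must be dereferenced first. */
static int print_cpu(const int *cpu)
{
	return printf("cpu:%d\n", *cpu);	/* not: printf("cpu:%d\n", cpu); */
}
```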
> +
> +#define MAX_TASK_TO_MONITOR 200
> In the shell world many tasks get forked; for example, the following will
> result in failures:
>
>     for i in $(seq 1 1 210); do ls > /dev/null; done
I agree, and so I had written these as open points in comments.
Probably, I will allocate all these dynamically.
> This is very common for shell, so I would suggest also monitoring the
> "sched:sched_process_exit" event, outputting the result and freeing the
> trace_entry accordingly.
We can free them only after printing the stats for the user. Now we need to
decide whether we should also print the stats for a task when it exits.
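Dropping the MAX_TASK_TO_MONITOR limit could look roughly like this: a growable array of trace_entry, with slots released when a sched_process_exit event is seen. This is only a sketch under assumed names (alloc_entry/release_entry are hypothetical, not part of the posted patch):

```c
#include <stdlib.h>
#include <string.h>

struct trace_entry {
	char key[64];
	char comm[17];
	int peak_memory;
	int memory;
};

static struct trace_entry *entries;
static size_t nr_entries, capacity;

/* Grow the table on demand instead of using a fixed-size array. */
static struct trace_entry *alloc_entry(const char *key)
{
	if (nr_entries == capacity) {
		size_t newcap = capacity ? capacity * 2 : 64;
		struct trace_entry *tmp;

		tmp = realloc(entries, newcap * sizeof(*tmp));
		if (!tmp)
			return NULL;
		entries = tmp;
		capacity = newcap;
	}
	memset(&entries[nr_entries], 0, sizeof(entries[nr_entries]));
	strncpy(entries[nr_entries].key, key,
		sizeof(entries[nr_entries].key) - 1);
	return &entries[nr_entries++];
}

/* On sched:sched_process_exit: report the task's stats first, then
 * reuse the slot.  Swap-remove keeps the array dense. */
static void release_entry(struct trace_entry *e)
{
	/* (print e->peak_memory for the user here, if desired) */
	*e = entries[--nr_entries];
}
```

Note that swap-remove moves entries around, so any hash table (hsearch) holding pointers into the array would need re-insertion after a removal.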
> +#define MAX_NUMBER_OF_CPUS 128
> +
> +#define DEFAULT_LOG_PATH "/tmp/mem_trace_log"
> I'd suggest outputting the result to stdout, which won't surprise users
> ("where is the result?"). We can rely on the shell's ">>" to do the
> redirection as needed.
I have both mechanisms. SIGUSR1 prints to stdout and SIGUSR2 prints to a
file. This is mentioned in the comment at the top of the file as well as in
the README:
https://github.com/pratyushanand/memtrace/blob/master/README.md
However, I agree that we could have an even better user interface.
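For context, the two-signal interface can be wired up roughly as below. dump_stats is a stand-in for the tool's real reporting routine; the handler only records the request, since stdio calls are not async-signal-safe:

```c
#include <signal.h>
#include <stdio.h>

#define DEFAULT_LOG_PATH "/tmp/mem_trace_log"

static volatile sig_atomic_t dump_req;	/* 1 = SIGUSR1, 2 = SIGUSR2 */

/* Record which dump was requested; the main loop does the actual I/O. */
static void handle_sig(int sig)
{
	dump_req = (sig == SIGUSR1) ? 1 : 2;
}

/* Stand-in for the tool's reporting routine. */
static void dump_stats(FILE *out)
{
	fprintf(out, "peak memory stats ...\n");
}

/* Called from the trace-reading loop whenever a dump was requested:
 * SIGUSR1 goes to stdout, SIGUSR2 appends to the log file. */
static void maybe_dump(void)
{
	if (dump_req == 1) {
		dump_stats(stdout);
	} else if (dump_req == 2) {
		FILE *f = fopen(DEFAULT_LOG_PATH, "a");

		if (f) {
			dump_stats(f);
			fclose(f);
		}
	}
	dump_req = 0;
}
```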
> +
> +struct trace_entry {
> + char key[64];
> + char comm[17];
> + int peak_memory;
> + int memory;
> +};
> +
> +struct debug_trace_info {
> + char trace_path[64];
> + char tracing_on_path[64];
> + char set_event_path[64];
> + char events_enable_path[64];
> + char cur_mod[MAX_NUMBER_OF_CPUS][64];
> What's the purpose of cur_mod here? We can easily move the module info into
> the individual "trace_entry" using "task-pid" as the hint, like what I did
> in the dracut "memtrace-ko.sh".
The key for allocating a new entry is module+pid, because we can have
multiple module insertions from the same modprobe. cur_mod was necessary
because I was reading trace_pipe. Now that we can also read
per_cpu/trace_pipe_raw, it can probably be avoided.
In the combined trace_pipe we can have something like this:

  comm:modprobe -> module_load for mod_1 at cpu 1
  comm:modprobe -> module_load for mod_2 at cpu 3
  comm:modprobe -> mm_page_alloc for mod_1 at cpu 1

So, if we do not keep cur_mod for each cpu, we would charge the above
mm_page_alloc to mod_2, while it actually belongs to mod_1.
Anyway, with per_cpu data we will not need this.
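The interleaving above is why cur_mod is indexed by CPU: a page allocation is charged to whichever module was last loaded on the same CPU. A simplified sketch of that attribution logic (function names are illustrative, not the patch's actual code):

```c
#include <string.h>

#define MAX_NUMBER_OF_CPUS 128

/* Last module seen in a module_load event, per CPU. */
static char cur_mod[MAX_NUMBER_OF_CPUS][64];

/* module_load event on @cpu: remember the module name. */
static void on_module_load(int cpu, const char *mod)
{
	strncpy(cur_mod[cpu], mod, sizeof(cur_mod[cpu]) - 1);
}

/* mm_page_alloc event on @cpu: return the module the allocation should
 * be charged to, or NULL if no module was loaded on this CPU yet. */
static const char *charge_alloc(int cpu)
{
	return cur_mod[cpu][0] ? cur_mod[cpu] : NULL;
}
```

With the interleaving from the example, the allocation on cpu 1 is correctly charged to mod_1 even though mod_2's module_load arrived later on cpu 3.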
~Pratyush