Hi,
Disclaimer: This is clearly marked OT, with the only connection to this group being the fact that I am running F23 on a 20-core Dell T5810 (3.1 GHz per core) with 64 GiB of memory. My OT queries over the past (almost) 13 years here have elicited a great wealth of information, so I am posting here.
So, I am trying to compare two kinds of methods in a C program. Both are written as efficiently as possible (assumed, because otherwise there is no point). I would like to know which of these is more efficient. I have been using getrusage() but I was wondering whether there is a better way?
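For reference, a minimal sketch of the getrusage() approach (method_a here is a hypothetical stand-in for one of the two methods, not code from this thread):

#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* hypothetical stand-in for one of the two methods being compared */
static double method_a(void)
{
    double sum = 0;
    for (long i = 1; i < 100000000L; i++)
        sum += 1.0 / i;
    return sum;
}

/* user-mode CPU seconds consumed so far by this process */
static double user_seconds(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
}

int main(void)
{
    double before = user_seconds();
    double result = method_a();
    double after = user_seconds();
    printf("result = %g, user CPU time = %.6f s\n", result, after - before);
    return 0;
}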
Separately, is there a way to get the number of floating point instructions in C? Both FLOPS and MIPS?
Many thanks and best wishes, Ranjan
On Fri, 26 Feb 2016 08:31:07 -0600 Ranjan Maitra wrote:
I have been using get_rusage but I was wondering whether there is a better way?
Your most accurate wall time is going to come from looking at the TSC register (if you are on an x86 processor)
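A minimal sketch of reading the TSC with GCC's __rdtsc() intrinsic (this counts cycles, not seconds, and the work loop is just a placeholder):

#include <stdio.h>
#include <x86intrin.h> /* __rdtsc() on GCC/clang, x86 only */

int main(void)
{
    double sum = 0;
    unsigned long long start = __rdtsc(); /* cycle counter before */
    for (int i = 1; i < 1000000; i++)
        sum += 1.0 / i;
    unsigned long long end = __rdtsc();   /* cycle counter after */
    printf("sum = %g, elapsed = %llu cycles\n", sum, end - start);
    return 0;
}

Divide by the TSC frequency to convert cycles to seconds; on most modern CPUs the TSC ticks at a constant rate regardless of frequency scaling.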
On Fri, 26 Feb 2016 10:16:18 -0500 Tom Horsley horsley1953@gmail.com wrote:
On Fri, 26 Feb 2016 08:31:07 -0600 Ranjan Maitra wrote:
I have been using get_rusage but I was wondering whether there is a better way?
Your most accurate wall time is going to come from looking at the TSC register (if you are on an x86 processor)
https://en.wikipedia.org/wiki/Time_Stamp_Counter
Thanks! Yes, I am on an x86_64 Intel processor.
So, here is the problem I am having. I tried running the following timing program:
https://github.com/ChisholmKyle/PosixMachTiming
with the value in the loop increased to UINT_MAX to get better precision. (It also uses clock_gettime().)
And then I ran it twice but got different timings. How does this happen? The number of operations is exactly the same (or should be).
Many thanks and best wishes, Ranjan
On Fri, 26 Feb 2016 09:51:24 -0600 Ranjan Maitra wrote:
How does this happen? The number of operations is exactly the same (or should be).
The number of operations in your program is the same, but your program is running on the same machine as the Linux OS, which has daemons running in the background, and may even be stopping to page in code your program needs, or to grow pages as it allocates memory. Vast numbers of things affect timing. Even the stupid dynamic-library load-address randomization Linux does can result in totally different cache hits in memory. The list goes on and on...
Apart from Linux, most motherboards these days have SMI interrupts happening behind everyone's back, which leave missing chunks of time that no one can account for.
On 02/26/2016 10:44 AM, Tom Horsley wrote:
On Fri, 26 Feb 2016 09:51:24 -0600 Ranjan Maitra wrote:
How does this happen? The number of operations is exactly the same (or should be).
The number of operations in your program is the same, but your program is running on the same machine as the Linux OS, which has daemons running in the background, and may even be stopping to page in code your program needs, or to grow pages as it allocates memory. Vast numbers of things affect timing. Even the stupid dynamic-library load-address randomization Linux does can result in totally different cache hits in memory. The list goes on and on...
Apart from Linux, most motherboards these days have SMI interrupts happening behind everyone's back, which leave missing chunks of time that no one can account for.
Timings based on realtime clocks are not the same as the per-task timers, which are incremented only when the task is actually executing.
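One such per-task timer is available from clock_gettime() as CLOCK_PROCESS_CPUTIME_ID; a minimal sketch (link with -lrt on older glibc):

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;
    double sum = 0;

    /* this clock advances only while the process is executing */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
    for (int i = 1; i < 10000000; i++)
        sum += 1.0 / i;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("sum = %g, process CPU time = %.9f s\n", sum, secs);
    return 0;
}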
On Fri, 26 Feb 2016 11:02:41 -0700 jd1008 jd1008@gmail.com wrote:
Timings based on realtime clocks are not the same as the per-task timers, which are incremented only when the task is actually executing.
Hi,
I am not interested in realtime. I am interested in time spent by the processor in executing a set of operations (which can include memory allocation and deallocation) which have to do (only) with the operations being executed. Is this at all possible?
Many thanks, Ranjan
On Fri, 26 Feb 2016 12:44:10 -0500 Tom Horsley horsley1953@gmail.com wrote:
On Fri, 26 Feb 2016 09:51:24 -0600 Ranjan Maitra wrote:
How does this happen? The number of operations is exactly the same (or should be).
The number of operations in your program is the same, but your program is running on the same machine as the Linux OS, which has daemons running in the background, and may even be stopping to page in code your program needs, or to grow pages as it allocates memory. Vast numbers of things affect timing. Even the stupid dynamic-library load-address randomization Linux does can result in totally different cache hits in memory. The list goes on and on...
Apart from Linux, most motherboards these days have SMI interrupts happening behind everyone's back, which leave missing chunks of time that no one can account for.
Thank you! So, is there any way that these other processes can be separated out in the time calculations? I cannot come up with definitive statements unless I can do these comparisons in a fair manner.
Best wishes, Ranjan
On 02/26/2016 11:06 AM, Ranjan Maitra wrote:
Thank you! So, is there any way that these other processes can be separated out in the time calculations? I cannot come up with definitive statements unless I can do these comparisons in a fair manner.
Best wishes, Ranjan
Hi Ranjan, you have to use virtual timers instead of hard clock timers.
Usually, since you just want process time, you start the itimer at the very start of the process and give it a very long time to expire (say, the maximum allowed time). Then, just before the call to exit, query the itimer structure values and print them out. Cheers,
JD
On 02/26/2016 11:28 AM, jd1008 wrote:
Hi Ranjan, you have to use virtual timers instead of hard clock timers.
Usually, since you just want process time, you start the itimer at the very start of the process and give it a very long time to expire (say, the maximum allowed time). Then, just before the call to exit, query the itimer structure values and print them out. Cheers,
JD
Please read /usr/include/linux/time.h and /usr/include/sys/time.h and look for ITIMER_VIRTUAL and which structure member must be set to ITIMER_VIRTUAL.
Cheers,
JD
Hi Ranjan, you have to use virtual timers instead of hard clock timers.
Usually, since you just want process time, you start the itimer at the very start of the process and give it a very long time to expire (say, the maximum allowed time). Then, just before the call to exit, query the itimer structure values and print them out. Cheers,
JD
Please read /usr/include/linux/time.h and /usr/include/sys/time.h and look for ITIMER_VIRTUAL and which structure member must be set to ITIMER_VIRTUAL.
"JD",
Thanks very much! I will look for examples on how to write this code right now and see if this is what I want.
Best wishes, Ranjan
On Fri, 2016-02-26 at 12:06 -0600, Ranjan Maitra wrote:
Thank you! So, is there any way that these other processes can be separated out in the time calculations? I cannot come up with definitive statements unless I can do these comparisons in a fair manner.
Not really. The days of purely deterministic computing are long gone. What you can do is what most benchmarks do: run the program many times and calculate an average time. Depending on what your program does, though, you may have to take steps to ensure the second and later runs are truly representative, such as making sure nothing is still cached in the CPU cache, that pages have been purged from memory, that nothing is residing in the disc cache, etc. Be sure to consider everything that might make the second run go faster than the first.
Those sorts of issues are why the literature on benchmarking gets complicated really fast.
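A sketch of that run-it-many-times idea for a single in-process function (method_under_test is a placeholder; per the caveat above, you may want to discard the first, cache-cold run):

#include <stdio.h>
#include <time.h>

#define RUNS 30

/* placeholder for the code being benchmarked */
static double method_under_test(void)
{
    double sum = 0;
    for (int i = 1; i < 1000000; i++)
        sum += 1.0 / i;
    return sum;
}

int main(void)
{
    double total = 0;
    for (int r = 0; r < RUNS; r++) {
        struct timespec t0, t1;
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
        method_under_test();
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);
        total += (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }
    printf("mean CPU time over %d runs: %.9f s\n", RUNS, total / RUNS);
    return 0;
}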
On Fri, 2016-02-26 at 08:31 -0600, Ranjan Maitra wrote:
So, I am trying to compare two kinds of methods in a C program. Both are written as efficiently as possible (assumed, because otherwise there is no point). I would like to know which of these is more efficient. I have been using getrusage() but I was wondering whether there is a better way?
I don't claim to be an expert, but at first glance I wonder if you've defined what you mean by efficiency. Execution time? Program size? Memory locality (which affects virtual memory performance)? Etc.
You might want to read some of the extensive literature on benchmarking.
Separately, is there a way to get the number of floating point instructions in C? Both FLOPS and MIPS?
Instructions executed on every code path? On the most likely code path?
Remember: MIPS = Meaningless Indicator of the Performance of Systems.
poc
On Fri, 26 Feb 2016 17:29:58 +0000 "Patrick O'Callaghan" pocallaghan@gmail.com wrote:
On Fri, 2016-02-26 at 08:31 -0600, Ranjan Maitra wrote:
So, I am trying to compare two kinds of methods in a C program. Both are written as efficiently as possible (assumed, because otherwise there is no point). I would like to know which of these is more efficient. I have been using getrusage() but I was wondering whether there is a better way?
I don't claim to be an expert, but at first glance I wonder if you've defined what you mean by efficiency. Execution time? Program size? Memory locality (which affects virtual memory performance)? Etc.
I responded to this on another thread, but I am interested in the amount of time taken by a program in executing a set of instructions. If that includes memory allocation, etc., then fine.
You might want to read some of the extensive literature on benchmarking.
I have found the literature extremely confusing and unclear.
Separately, is there a way to get the number of floating point instructions in C? Both FLOPS and MIPS?
Instructions executed on every code path? On the most likely code path?
Remember: MIPS = Meaningless Indicator of the Performance of Systems.
OK, is there a way to calculate the FLOP instructions in C?
Many thanks and best wishes, Ranjan
On Fri, 2016-02-26 at 12:40 -0600, Ranjan Maitra wrote:
I don't claim to be an expert, but at first glance I wonder if you've defined what you mean by efficiency. Execution time? Program size? Memory locality (which affects virtual memory performance)? Etc.
I responded to this on another thread, but I am interested in the amount of time taken by a program in executing a set of instructions. If that includes memory allocation, etc., then fine.
Sorry, still not clear. Memory allocation is done by a combination of library functions and system calls. You need to know what you're trying to measure if your measurements are going to mean anything. The numbers you get can change a lot between a first run and subsequent runs of the same program (because of cache effects), so you need to decide how much that matters.
You might want to read some of the extensive literature on benchmarking.
I have found the literature extremely confusing and unclear.
Perhaps if you asked specific questions about what you find confusing, people could help you understand.
OK, is there a way to calculate the FLOP instructions in C?
What do you mean "calculate the FLOP instructions"? Are you trying to evaluate an algorithm or benchmark an implementation? These are two different things. You can compare algorithms theoretically or by measurement, but only measurement will work for implementations. You can count each floating point instruction generated by the compiler by looking at the binary code, but maybe you want to count every instruction during execution, or see which instructions take longer, or just measure the total execution time of the algorithm on a given set of input. All of these things are different and need different techniques to measure them. The degree of accuracy you need will also have an effect.
poc
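For the static count poc mentions (floating-point instructions the compiler actually generated), something along these lines is a rough approach — the pattern below is an assumption that only catches scalar SSE arithmetic, not an exhaustive list:

$ gcc -O2 -o prog prog.c
$ objdump -d prog | grep -cE 'adds[sd]|subs[sd]|muls[sd]|divs[sd]'

That counts instructions present in the binary, not instructions executed; a dynamic count needs a tool such as perf.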
Thanks again!
OK, is there a way to calculate the FLOP instructions in C?
What do you mean "calculate the FLOP instructions"? Are you trying to evaluate an algorithm or benchmark an implementation? These are two different things. You can compare algorithms theoretically or by measurement, but only measurement will work for implementations. You can count each floating point instruction generated by the compiler by looking at the binary code, but maybe you want to count every instruction during execution, or see which instructions take longer, or just measure the total execution time of the algorithm on a given set of input.
Indeed, I wanted to measure the total execution time of the algorithms (i.e., the difference in CPU time before and after the function executing the algorithm is called), independent of extraneous issues such as what other processes are running at the time, etc. I wanted to see whether, in some cases, some of the available theoretical guarantees actually hold or not.
Thanks again!
Best wishes, Ranjan
On 02/26/2016 05:28 PM, Ranjan Maitra wrote:
Indeed, I wanted to measure the total execution time of the algorithms (i.e., the difference in CPU time before and after the function executing the algorithm is called), independent of extraneous issues such as what other processes are running at the time, etc. I wanted to see whether, in some cases, some of the available theoretical guarantees actually hold or not.
Thanks again!
Best wishes, Ranjan
Did you see my reply? It shows you the basics of what you need to use in order to get exactly what you want.
On Fri, 2016-02-26 at 17:31 -0700, jd1008 wrote:
Did you see my reply? It shows you the basics of what you need to use in order to get exactly what you want.
Note that the accuracy of this depends on several factors, such as how long the measured interval is compared to the basic unit of timekeeping (IOW if you want to measure the execution time of a short sequence of instructions, you need to loop a few million times and divide the result), and how much overhead is incurred in calling the timing routines.
poc
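One way to gauge that timing-routine overhead is to time an empty interval with back-to-back calls; a minimal sketch:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec t0, t1;
    /* two consecutive reads: the difference approximates per-call overhead */
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t0);
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t1);
    printf("timer overhead ~ %ld ns\n",
           (t1.tv_sec - t0.tv_sec) * 1000000000L + (t1.tv_nsec - t0.tv_nsec));
    return 0;
}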
Btw, this webpage https://stackoverflow.com/questions/23847588/how-do-i-use-the-functions-seti... says that getitimer and setitimer are obsolete and I should be using timer_gettime() and timer_settime() instead; however, is it possible to handle a virtual clock with these new functions?
Best wishes, Ranjan
On 02/26/2016 06:00 PM, Ranjan Maitra wrote:
Btw, this webpage https://stackoverflow.com/questions/23847588/how-do-i-use-the-functions-seti... says that getitimer and setitimer are obsolete and I should be using timer_gettime() and timer_settime() instead; however, is it possible to handle a virtual clock with these new functions?
Best wishes, Ranjan
the man pages for the POSIX-compliant interfaces timer_settime and timer_gettime do not even mention the word virtual; at least not in my Fedora 22 man pages. Whereas the still-valid setitimer and getitimer CAN get the per-running-process virtual time, which excludes all system time and interrupt-handling time. There are plenty of examples online for using setitimer() and getitimer().
OK, trying an example:
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>

#define INTERVAL 1 /* number of milliseconds to go off */

int main() {
    double sum = 0;
    struct itimerval initial, updated;

    initial.it_value.tv_sec = INTERVAL/1000000;
    initial.it_value.tv_usec = (INTERVAL/1000000) * 1000000;
    initial.it_interval = initial.it_value;

    if (setitimer(ITIMER_VIRTUAL, &initial, NULL) == -1) {
        perror("error calling setitimer()");
        exit(1);
    }

    for (unsigned int i; i < UINT_MAX; i++) {
        sum += 1./i;
    }
    if (getitimer(ITIMER_VIRTUAL, &updated) == -1) {
        perror("error calling getitimer()");
        exit(1);
    }

    printf("Time started = %ld\n; Time taken = %ld\n: Time taken = %ld\n",
           initial.it_value.tv_usec, updated.it_value.tv_usec,
           updated.it_value.tv_usec - initial.it_value.tv_usec);
    return 0;
}
compiled with:
$ gcc -o timer -std=c99 -Wall -pedantic getitimer.c -lrt
However: I always get zero.
$ ./timer
Time started = 0
; Time taken = 0
: Time taken = 0
What am I doing wrong?
Many thanks, Ranjan
On Sat, 2016-02-27 at 07:44 -0600, Ranjan Maitra wrote:
#define INTERVAL 1 /* number of milliseconds to go off */
int main() {
    double sum = 0;
    struct itimerval initial, updated;

    initial.it_value.tv_sec = INTERVAL/1000000;
    initial.it_value.tv_usec = (INTERVAL/1000000) * 1000000;
To start with, these are both integer values, so you're initializing them to 0. There may be other bugs but I stopped looking when I saw this.
poc
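In other words, if INTERVAL is meant to be in milliseconds, the arithmetic would need to be something like this sketch (keeping tv_usec below one million):

initial.it_value.tv_sec = INTERVAL / 1000;           /* whole seconds */
initial.it_value.tv_usec = (INTERVAL % 1000) * 1000; /* remainder in microseconds */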
On Sat, 27 Feb 2016 14:09:25 +0000 "Patrick O'Callaghan" pocallaghan@gmail.com wrote:
initial.it_value.tv_sec = INTERVAL/1000000;
initial.it_value.tv_usec = (INTERVAL/1000000) * 1000000;
To start with, these are both integer values, so you're initializing them to 0. There may be other bugs but I stopped looking when I saw this.
poc
OK, thanks! So, I tried this:
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>

#define INTERVAL 1 /* number of milliseconds to go off */

int main() {
    double sum = 0;
    struct itimerval initial, updated;

    initial.it_value.tv_sec = INTERVAL;
    initial.it_value.tv_usec = INT_MAX;
    initial.it_interval = initial.it_value;

    printf("%ld\n", initial.it_value.tv_usec);
    if (setitimer(ITIMER_VIRTUAL, &initial, NULL) == -1) {
        perror("error calling setitimer()");
        exit(1);
    }

    for (unsigned int i; i < 100000; i++)
        sum += 1./i;
    if (getitimer(ITIMER_REAL, &updated) == -1) {
        perror("error calling getitimer()");
        exit(1);
    }

    printf("Time started = %ld\n; Time taken = %ld\n: Time taken = %ld\n",
           initial.it_value.tv_usec, updated.it_value.tv_usec,
           initial.it_value.tv_usec - updated.it_value.tv_usec);
    return 0;
}
But now setitimer does not execute:-(
$ gcc -o timer -std=c99 -Wall -pedantic getitimer.c -lrt -O3
$ ./timer
2147483647
error calling setitimer(): Invalid argument
Thanks! Ranjan
On 02/27/2016 06:48 AM, Ranjan Maitra wrote:
OK, thanks! So, I tried this:
#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>

#define INTERVAL 1 /* number of milliseconds to go off */

int main() {
    double sum = 0;
    struct itimerval initial, updated;

    initial.it_value.tv_sec = INTERVAL;
    initial.it_value.tv_usec = INT_MAX;
    initial.it_interval = initial.it_value;
If you want the timer to go off every millisecond:
initial.it_value.tv_sec = 0;
initial.it_value.tv_usec = 1000;
The ".tv_usec" must fall in the range 0 <= .tv_usec <= 999999.
printf("%ld\n", initial.it_value.tv_usec);
if (setitimer(ITIMER_VIRTUAL, &initial, NULL) == -1) {
    perror("error calling setitimer()");
    exit(1);
}
for (unsigned int i; i < 100000; i++) sum += 1./i;
if (getitimer(ITIMER_REAL, &updated) == -1) {
Uh, why are you setting ITIMER_VIRTUAL, then reading ITIMER_REAL?
perror("error calling getitimer()"); exit(1);}
printf("Time started = %ld\n; Time taken = %ld\n: Time taken = %ld\n", initial.it_value.tv_usec, updated.it_value.tv_usec, initial.it_value.tv_usec - updated.it_value.tv_usec); return 0; }
But now setitimer does not execute:-(
$ gcc -o timer -std=c99 -Wall -pedantic getitimer.c -lrt -O3
$ ./timer
2147483647
error calling setitimer(): Invalid argument
That error message is pretty conclusive and indicates what I said at the top.
-- Rick Stevens, Systems Engineer, AllDigital (ricks@alldigital.com)
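Putting Rick's two fixes together (keep tv_usec in the range 0..999999, and read back the same ITIMER_VIRTUAL that was set), a sketch of a version that should print a nonzero time — it also initializes the loop index, which the original left uninitialized:

#include <sys/time.h>
#include <stdlib.h>
#include <stdio.h>
#include <limits.h>

int main(void) {
    double sum = 0;
    struct itimerval initial = {0}, updated;

    /* arm the virtual timer with a long expiry; tv_usec must be 0..999999 */
    initial.it_value.tv_sec = 1000000;  /* ~11.5 days: effectively "never" */
    initial.it_value.tv_usec = 0;
    initial.it_interval = initial.it_value;

    if (setitimer(ITIMER_VIRTUAL, &initial, NULL) == -1) {
        perror("error calling setitimer()");
        exit(1);
    }

    for (unsigned int i = 1; i < UINT_MAX; i++)  /* i initialized this time */
        sum += 1./i;

    /* read back the same timer: it counts down only while we run in user mode */
    if (getitimer(ITIMER_VIRTUAL, &updated) == -1) {
        perror("error calling getitimer()");
        exit(1);
    }

    double used = (initial.it_value.tv_sec - updated.it_value.tv_sec)
                + (initial.it_value.tv_usec - updated.it_value.tv_usec) / 1e6;
    printf("sum = %g, virtual time used = %.6f s\n", sum, used);
    return 0;
}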
On 02/26/2016 05:41 PM, Patrick O'Callaghan wrote:
Note that the accuracy of this depends on several factors, such as how long the measured interval is compared to the basic unit of timekeeping (IOW if you want to measure the execution time of a short sequence of instructions, you need to loop a few million times and divide the result), and how much overhead is incurred in calling the timing routines.
poc
Hi Patrick. Actually, no. Since the virtual timer is incremented only while the process is actually running on the CPU, the time difference from when you first "start" the timer to when you "stop" it (in Ranjan's case, just before exit) gives the most accurate time possible on a system like Linux. What Ranjan can also do is set the process scheduling class to REALTIME (which is as close to realtime as a multiuser virtual-memory system can provide). Thus he will get a very accurate benchmark for the segment of code he wants to measure.
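A sketch of the scheduling-class part of that suggestion, using sched_setscheduler() with SCHED_FIFO (needs root or CAP_SYS_NICE; priority 50 is an arbitrary choice):

#include <stdio.h>
#include <sched.h>

int main(void)
{
    struct sched_param param = { .sched_priority = 50 };

    /* pid 0 means the calling process */
    if (sched_setscheduler(0, SCHED_FIFO, &param) == -1) {
        perror("sched_setscheduler");
        return 1;
    }
    /* ... run the code being measured here ... */
    return 0;
}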
On Fri, 2016-02-26 at 18:02 -0700, jd1008 wrote:
Hi Patrick. Actually, no. Since the virtual timer is incremented only while the process is actually running on the CPU, the time difference from when you first "start" the timer to when you "stop" it (in Ranjan's case, just before exit) gives the most accurate time possible on a system like Linux.
Not what I meant. I know the system is only timing the process while it's running, but there are still overheads to calling the timing routines which count against the process (as they are running in user state before entering the system and after exiting it). If the code of interest is short compared to this, the overhead will skew the result, but Ranjan hasn't said anything about the code to be measured so we don't know if this matters or not. Eliminating these factors is possible but tricky.
What Ranjan can also do is set the process scheduling class to REALTIME (which is as close to realtime as a multiuser virtual-memory system can provide). Thus he will get a very accurate benchmark for the segment of code he wants to measure.
I doubt that will make any difference. As you said, the timers are measuring process virtual time, so whether the process is scheduled more or less frequently shouldn't matter.
poc
On Sat, 27 Feb 2016 11:39:11 +0000 Patrick O'Callaghan wrote:
I doubt that will make any difference. As you said, the timers are measuring process virtual time, so whether the process is scheduled more or less frequently shouldn't matter.
You'd be less confident if you'd seen the number of bugs in the high-resolution accounting code in the Linux kernel :-). It may be working moderately well these days in newer kernels.
On Fri, Feb 26, 2016 at 08:31:07AM -0600, Ranjan Maitra wrote:
So, I am trying to compare two kinds of methods in a C program. Both are written as efficiently as possible (assumed, because otherwise there is no point). I would like to know which of these is more efficient. I have been using getrusage() but I was wondering whether there is a better way?
Separately, is there a way to get the number of floating point instructions in C? Both FLOPS and MIPS?
Many thanks and best wishes, Ranjan
It's been ages, so I may have faulty memory.
The compiler can generate code to "profile" a piece of code. As I recall, the profile includes info on the number of times a function is entered, the total time spent in the function, and maybe min/max times in the function.
You only need to add the profiling option for the code you are interested in, and perhaps at the link stage also.
Jon
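For reference, the compiler-profiling flow Jon describes is gcc's -pg option plus gprof; roughly:

$ gcc -pg -O2 -o prog prog.c   # compile *and* link with -pg
$ ./prog                       # writes gmon.out to the current directory
$ gprof prog gmon.out          # flat profile: call counts and time per function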