Jakub's Recommendations for ia32 Support

Josh Boyer jwboyer at gmail.com
Tue Feb 3 20:52:41 UTC 2009


On Tue, Feb 03, 2009 at 09:45:46PM +0100, Dominik 'Rathann' Mierzejewski wrote:
>On Tuesday, 03 February 2009 at 21:01, Ulrich Drepper wrote:
>> Dominik 'Rathann' Mierzejewski wrote:
>> > I'd like to see a case (not involving Pentium 4) where using cmov is slower
>> > than not using it. It definitely is faster for decoding H.264 in FFmpeg
>> > for example.
>> 
>> I don't have a specific test case.  But I do talk to the CPU
>> architectures at Intel regularly.
>
>I didn't know architectures could talk. ;)
>
>> They always say that cmov should be
>> avoided.  Especially with the introduction of fused micro-ops, the
>> various cmp+jcc pairs are likely to be faster.
>> 
>> And from the code generation perspective using cmp+jcc is also more
>> flexible.  With cmov you have to tie up two registers.  This is
>> particularly bad with the x86 ABI.
>> 
>> There are certainly cases where cmov can be faster.  Perhaps exclusively
>> on older microarchitectures (P4s, early Core2, maybe AMD, haven't
>> checked).  But in general it's no win.
>
>Well, I talk to people who write hand-optimized assembly and care to
>squeeze every cycle out of various CPUs and they say it's definitely
>a win. So please, show me some code instead of hand-waving.

If they can do that, then why can't they rebuild things themselves?

josh



