P4s, Athlons and bandwidth

Jean Francois Martinez jfm512 at free.fr
Wed Aug 13 22:43:46 UTC 2003

On Wed, 2003-08-13 at 22:35, Mark Mielke wrote:
> The point of the i686 instructions, is that certain key instructions
> became available, not that the scheduling is perfect. Have you timed
> the benefit of using -mcpu=p4 over -mcpu=i686? Is it really worthwhile?
> Or are you just guessing?

In fact I was the guy who timed the gains for gcc 2.9.x on processors
based on the PPro core (ie P2, P3, PPro, Celeron).  I don't have a P4,
so I cannot evaluate how big the gain would be on -mcpu=p4 versus

> I wouldn't mind if a PIII version came with RedHat - but then again, it
> is very easy to recompile the kernel for PIII, deselect all the crap
> that I don't want, and have an optimal system. I don't need RedHat to
> change their distribution to satisfy my whim.

The -mcpu=p4 option means that the compiler will generate only 386
instructions but will use the P4 timing tables for optimization.
To use P4 instructions you need -march=p4, and that is not
what I advocate.
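
To make the difference between tuning and instruction selection concrete,
here is a minimal sketch (mine, not something measured in this thread).
It relies on the predefined macros __MMX__, __SSE__ and __SSE2__ that
recent GCC versions set according to the instruction set the build is
allowed to use.  The names isa_level and report_build are made up for
illustration.

```c
#include <stdio.h>

/* Reports the highest x86 SIMD extension the compiler was allowed to
 * emit for this translation unit.  This reflects the instruction-set
 * choice (-march and friends); a tuning-only option is assumed not to
 * change the result. */
const char *isa_level(void)
{
#if defined(__SSE2__)
    return "sse2";   /* e.g. a build targeting the P4 instruction set */
#elif defined(__SSE__)
    return "sse";    /* e.g. a build targeting the PIII instruction set */
#elif defined(__MMX__)
    return "mmx";
#else
    return "base";   /* plain 386-compatible code, or a non-x86 target */
#endif
}

/* Hypothetical helper: prints the result for a quick manual check. */
void report_build(void)
{
    printf("instruction set allowed by the build: %s\n", isa_level());
}
```

Built twice for 32-bit x86, once with only the tuning option and once
with -march for the P4, the first binary should still report "base" and
the second "sse2".  On x86-64 compilers SSE2 is part of the baseline, so
both builds would report "sse2" there.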

If we assume that most boxes in use are P4s or Athlons, or at least
that P4s and Athlons are the majority where speed matters (ie not
for mere firewalls and similar tasks, which are often handled by
obsolete boxes), then RedHat should be optimized for the P4 (notice
that this does NOT mean "will run only on the P4").

> As for GLIBC optimized support for P4 or the latest AMD chips, RedHat
> is probably the wrong organization to ask. What is the business case?
> I'm sure people on the GLIBC mailing lists would be glad to receive
> patches from you that implement proven optimizations...

There is a business case if, for instance, Linux is left in the dust at
web serving because Apache uses memcpy heavily and Linux's memcpy
is three or four times slower on a P4 than the version used by W2K.
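
Whether memcpy really is that slow can be checked rather than guessed.
Below is a minimal sketch of such a timing, under my own assumptions: the
function name memcpy_bytes_per_sec, the buffer size and the repetition
count are all arbitrary, and clock() is a coarse timer.  It measures
only the C library's memcpy on the box it runs on and says nothing
about W2K.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Copies `size` bytes `reps` times with the C library's memcpy and
 * returns the measured throughput in bytes per second.
 * Returns -1.0 on bad arguments or allocation failure. */
double memcpy_bytes_per_sec(size_t size, int reps)
{
    if (size == 0 || reps <= 0)
        return -1.0;

    unsigned char *src = malloc(size);
    unsigned char *dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0xA5, size);   /* touch the source pages before timing */
    memset(dst, 0, size);      /* same for the destination */

    volatile unsigned char sink = 0;
    clock_t start = clock();
    for (int i = 0; i < reps; i++) {
        memcpy(dst, src, size);
        /* Read one byte back through a volatile to discourage the
         * compiler from discarding the copy. */
        sink = dst[(size_t)i % size];
    }
    clock_t end = clock();
    (void)sink;

    /* Sanity check: the destination must match the source. */
    assert(memcmp(src, dst, size) == 0);
    free(src);
    free(dst);

    double secs = (double)(end - start) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        secs = 1.0 / CLOCKS_PER_SEC;   /* run was below timer resolution */
    return (double)size * (double)reps / secs;
}
```

Calling it with, say, a 1 MB buffer and a few hundred repetitions on
each machine of interest, and comparing the figures between builds of
glibc, would give a first rough number to argue about.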

> Just some perspective... :-)
> mark
> On Wed, Aug 13, 2003 at 10:23:01PM +0200, Jean Francois Martinez wrote:
> > Given that most/all of the recent boxes (ie the ones doing the real
> > work) are P4s and Athlons it is time RedHat stopped compiling
> > with -mcpu=i686 and started optimizing for the P4: -mcpu=p4
> > 
> > Another point is that there is no such thing as low-level glibc
> > functions for the P4 and the Athlon.  The highest targeted
> > processor is the PIII.  However documents on AMD's web site show
> > that moving data (ie memcpy and friends) can be made several times
> > faster by using 3DNow instructions and data prefetching.  I gave only
> > a cursory glance to the assembler parts of glibc but it didn't look
> > like those parts (targeting the PIII) would be even remotely ideal
> > for the Athlon.  Same thing about the P4.
> > 
> > Would it be possible for RedHat to contact those with an interest, ie
> > AMD/Intel, in order to get high-performance assembly versions of those
> > low level routines?  Or failing that to have them written by an
> > employee?
Jean Francois Martinez <jfm512 at free.fr>
