On Wed, 2003-08-13 at 22:35, Mark Mielke wrote:
The point of the i686 instructions is that certain key instructions
became available, not that the scheduling is perfect. Have you timed
the benefit of using -mcpu=p4 over -mcpu=i686? Is it really worthwhile?
Or are you just guessing?
In fact I was the guy who timed the gains for gcc 2.9.x for processors
based on the PPro core (i.e. P2, P3, PPro, Celeron). I don't have a P4,
so I cannot evaluate how big the gain would be with -mcpu=p4 versus
-mcpu=i686.
I wouldn't mind if a PIII version came with RedHat - but then again, it
is very easy to recompile the kernel for PIII, deselect all the crap
that I don't want, and have an optimal system. I don't need RedHat to
change their distribution to satisfy my whim.
The -mcpu=p4 option means that the compiler will only generate 386
instructions but will use the P4 timing tables for optimization.
To use P4 instructions you need -march=p4, and that is not
what I advocate.
If we assume that most boxes in use are P4s or Athlons, or at least
that P4s and Athlons are the majority when speed matters (i.e. not
for mere firewalls and similar tasks, which are often handled by
obsolete boxes), then RedHat should be optimized for the P4 (notice
that this does NOT mean "will run only on the P4").
As for GLIBC optimized support for P4 or the latest AMD chips, RedHat
is probably the wrong organization to ask. What is the business case?
I'm sure people on the GLIBC mailing lists would be glad to receive
patches from you that implement proven optimizations...
There is a business case: for instance, Linux could be left in the dust
at web serving if Apache uses memcpy heavily and Linux's memcpy is three
or four times slower on a P4 than the version used by W2K.
Just some perspective... :-)
mark
On Wed, Aug 13, 2003 at 10:23:01PM +0200, Jean Francois Martinez wrote:
> Given that most/all of the recent boxes (ie the ones doing the real
> work) are P4s and Athlons it is time RedHat stopped compiling
> with -mcpu=i686 and started optimizing for the P4: -mcpu=p4
>
> Another point is that there are no low-level glibc functions written
> for the P4 and the Athlon. The highest targeted processor is the
> PIII. However, documents on AMD's web site show that moving data
> (i.e. memcpy and friends) can be made several times faster by using
> 3DNow instructions and data prefetching. I gave only a cursory
> glance to the assembler parts of glibc, but it didn't look like
> those parts (targeting the PIII) would be even remotely ideal
> for the Athlon. The same goes for the P4.
>
> Would it be possible for RedHat to contact those with an interest,
> i.e. AMD/Intel, in order to get high-performance assembly versions of
> those low-level routines? Or, failing that, to have them written by
> an employee?
--
Jean Francois Martinez <jfm512(a)free.fr>