Fixing the glibc adobe flash incompatibility
Dave Jones
davej at redhat.com
Thu Nov 18 16:15:57 UTC 2010
On Thu, Nov 18, 2010 at 04:23:56PM +0100, Jakub Jelinek wrote:
> It is very sad that Intel/AMD just didn't make sure rep movsb
> is the fastest copying sequence on all of their CPUs,
> which underneath could do whatever magic based on size and src/dst
> alignment (e.g. for small length handle it in hw so it is as quick as
> possible, for larger sizes perhaps handle it in microcode) - rep movsb
> can be easily inlined and is quite short as well. But on many, especially
> recent, CPUs it performs very badly compared to these much larger SSE* optimized
> routines.
>
> If you want exact numbers, best ask Intel folks who wrote and tuned the
> SSE4.2 memcpy routine.
I wonder if the Intel people who benchmarked memcpy throughput also benchmarked
the increased context switch time that will happen now that the kernel's lazy FPU
state saving is effectively disabled every time something calls memcpy.
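
For reference, a minimal sketch of the kind of inlined rep movsb copy
mentioned in the quoted text, assuming GCC-style inline asm on x86. The
name copy_rep_movsb is made up for illustration and this is not glibc's
code. It uses only general-purpose registers, so unlike an SSE routine it
does not touch the FPU/SSE state.

```c
#include <stddef.h>

/* Hypothetical helper (not from glibc): copy n bytes with one "rep movsb". */
static inline void *copy_rep_movsb(void *dst, const void *src, size_t n)
{
    void *ret = dst;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* rep movsb copies (R|E)CX bytes from [(R|E)SI] to [(R|E)DI] and
     * updates all three registers, hence the "+" constraints. The
     * direction flag is assumed clear, as the ABI requires at function
     * entry. */
    __asm__ volatile ("rep movsb"
                      : "+D" (dst), "+S" (src), "+c" (n)
                      :
                      : "memory");
#else
    /* Plain byte loop so the sketch still builds on other targets. */
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
#endif
    return ret;
}
```

How this compares in speed to the SSE routines depends on the CPU and on
the copy size; the sketch only shows how short the inlined form is.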
Dave