On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
On 09/05/2017 03:03 PM, Steven Munroe wrote:
> On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
>> On 09/01/2017 09:03 AM, Steven Munroe wrote:
>>> If you want to cut back you can try 970 as base, plus power6 and
>>> power7 with -mtune=power8.
>> I assume you're suggesting:
>> * 970 base multilib.
>> * power6 multilib.
>> * power7 multilib with power8 tuning.
>> Which implies:
>> * Drop the power8 multilib.
>> That drops only 1 multilib.
>> Do we need power6 or can we fold that into the 970?
>> * 970 base multilib with power6 tuning.
>> * power7 multilib with power8 tuning.
> That seems simple, but it ignores the major features added with each ISA.
> So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM.
> This is over 300 new instructions including additions to existing ISA
> categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec)
> beyond what was in 970.
> So what you propose means Power6 would be restricted to software
> emulation of DFP, leaving the DFP hardware unused. It also will not have
> the mutex lock hint and compare bytes optimizations.
> Power7 will get both HW DFP and VSX (vector double and extended scalar
> double) but without Scalar extended Float. It will also leave the HTM and
> direct move (GPR <-> VSR) optimizations disabled.
> So if you are holding on to your Apple G5 you may not know what you're
> missing, but your P6 JS21 blade will be disappointing. And your P8 will
> run slow due to missing direct move support.
The library, glibc, *may* be missing the use of some of those instructions,
but only *iff* the compiler would have generated them as part of the
compilation of generic C code.
The POWER7 multilib will still have the POWER8 IFUNCs, which select the
POWER8 assembly implementations that IBM has contributed.
How many of these features you quote are actually used by glibc when
compiled for that architecture?
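
To illustrate why IFUNC selection does not depend on the multilib baseline, here is a small Python model of capability-based dispatch. It is only a sketch under stated assumptions, not glibc's resolver: the implementation names (copy_power8 and so on) and the capability strings are hypothetical labels chosen for this example.

```python
# Hypothetical implementation table: (required capability, implementation).
# Ordered newest first; the final entry needs no special capability and
# stands in for the code compiled for the multilib baseline.
IMPLEMENTATIONS = [
    ("arch_2_07", "copy_power8"),  # ISA 2.07 is the POWER8 level
    ("arch_2_06", "copy_power7"),  # ISA 2.06 is the POWER7 level
    (None, "copy_generic"),
]


def resolve(hw_caps):
    """Pick the first implementation whose requirement the hardware meets.

    `hw_caps` is the set of capabilities reported at run time. The
    baseline the library was compiled for plays no part in the choice.
    """
    for required, name in IMPLEMENTATIONS:
        if required is None or required in hw_caps:
            return name
    raise LookupError("no usable implementation")


if __name__ == "__main__":
    # A POWER8 machine running a library built with a POWER7 baseline
    # still reports arch_2_07, so it gets the POWER8 variant.
    print(resolve({"arch_2_06", "arch_2_07"}))
    print(resolve({"arch_2_06"}))
    print(resolve(set()))
```

The point of the model is that the choice is driven by what the hardware reports, so only compiler-generated code in the rest of the library is tied to the baseline.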
I expect user applications would make use of libdfp to access the
full power of POWER6 DFP, but glibc itself doesn't use DFP internally,
so I still don't see how the above suggested reduction makes things
We are only talking about limiting the glibc multilibs that we build.
* 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes, but customers do use DFP, just not the customers you have talked to.
So why not build libdfp with 970 as the base for emulation, and for power6
with hardware DFP? You can alias the power6 libdfp for power7/8.
Then I am OK with skipping power6 for glibc.
* POWER7 with POWER8 tuning.
- Enables POWER8 IFUNCs.
- Enables HTM by checking for capability (not limited to strict POWER7).
- Yes, glibc loses out on direct move optimizations.
That works as long as libdfp is available (power6 build will work).
- Enables everything.
Will need this for PPC64LE with power8 (base) and power9 optimization.
We have no plans for native power9 on PPC64BE (only power8 compatibility
mode). So I am not concerned if the PPC64BE multilib stops with power7 or 8.
What is the minimum number of multilibs we need to take real
advantage of the hardware?
Having 5 multilibs (970, P6, P7, P8, P9) is a lot, and it would make Fedora
builds for CI and integration testing really slow to finish.
Can IBM help identify some subset that would reduce our build times?
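
One way to compare candidate subsets is to tabulate, for each machine, which features the compiler cannot use because the nearest usable multilib has an older baseline. The Python sketch below does that using only the features named earlier in this thread. It is a rough model with assumptions: tuning (-mtune) is treated as enabling no new instructions, power9 is left out because no power9 features are listed here, and IFUNCs or run-time capability checks (as with HTM) are not counted, so the real loss inside glibc may be smaller.

```python
# ISA levels discussed in the thread, oldest first.
ORDER = ["970", "power6", "power7", "power8"]

# Features first available at each level, as listed in the thread.
NEW_FEATURES = {
    "970": set(),
    "power6": {"DFP", "mutex lock hint", "compare bytes"},
    "power7": {"VSX"},
    "power8": {"VSX scalar float", "HTM", "direct moves"},
}


def features_up_to(cpu):
    """All listed features available on `cpu`, including older levels."""
    feats = set()
    for level in ORDER[: ORDER.index(cpu) + 1]:
        feats |= NEW_FEATURES[level]
    return feats


def best_multilib(cpu, baselines):
    """Newest baseline that `cpu` can execute (assumes 970 is always built)."""
    usable = [b for b in baselines if ORDER.index(b) <= ORDER.index(cpu)]
    return max(usable, key=ORDER.index)


def not_compiled_in(cpu, baselines):
    """Features `cpu` has that its best multilib was not compiled to use."""
    return features_up_to(cpu) - features_up_to(best_multilib(cpu, baselines))


# Candidate multilib sets (big-endian, power9 excluded from the model).
CANDIDATES = {
    "970+P6+P7+P8": ["970", "power6", "power7", "power8"],
    "970+P6+P7": ["970", "power6", "power7"],
    "970+P7": ["970", "power7"],
}

if __name__ == "__main__":
    for label, baselines in CANDIDATES.items():
        for cpu in ORDER:
            missing = sorted(not_compiled_in(cpu, baselines))
            print(f"{label:14} on {cpu:7}: {', '.join(missing) or 'nothing'}")
```

Under this model the two-multilib set costs power6 machines the DFP, mutex lock hint and compare bytes support in compiled code, and both reduced sets cost power8 machines the direct moves, HTM and VSX scalar float support, which matches the trade-offs argued above.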