On Tue, Sep 5, 2017 at 10:20 PM, Steven Munroe
<munroesj(a)linux.vnet.ibm.com> wrote:
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
> On 09/05/2017 03:03 PM, Steven Munroe wrote:
> > On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
> >> On 09/01/2017 09:03 AM, Steven Munroe wrote:
> >>> If you want to cut back you can't try 970 as base, plus power6 and
> >>> power7 -mtune power8
> >>
> >> s/can't/can/g?
> >>
> >> I assume you're suggesting:
> >>
> >> * 970 base multilib.
> >> * power6 multilib.
> >> * power7 multilib with power8 tuning.
> >>
> >> Which implies:
> >>
> >> * Drop the power8 multilib.
> >>
> >> That drops only 1 multilib.
> >>
> >> Do we need power6 or can we fold that into the 970?
> >> e.g.
> >>
> >> * 970 base multilib with power6 tuning.
> >> * power7 multilib with power8 tuning.
> >>
> >>
> >
> > That seems simple but is ignoring the major features added with each ISA
> > level.
> >
> > So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM.
> > This is over 300 new instructions including additions to existing ISA
> > categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec)
> > beyond what was in 970.
> >
> > So what you propose will mean Power6 would be restricted to software
> > emulation of DFP, leaving the DFP hardware unused. It would also lack
> > the mutex lock hint and compare bytes optimizations.
> >
> > Power7 will get both HW DFP and VSX (vector double and extended scalar
> > double) but without Scalar extended Float. It would also leave HTM and
> > the direct move (GPR <-> VSR) optimization disabled.
> >
> > So if you are holding on to your Apple G5 you may not know what you're
> > missing, but your P6 JS21 blade will be disappointing. And your P8 will
> > run slow due to missing direct move support.
>
> The library, glibc, *may* miss out on some of those instructions, but
> *iff* the compiler would have generated them as part of the compilation
> of generic C code.
>
> The POWER7 multilib will still have the POWER8 IFUNCs, which use all
> the POWER8 assembly implementations that IBM has contributed
> upstream.
>
> How many of the features you quote are actually used by glibc when
> compiled for that architecture?
>
> I expect user applications would make use of libdfp to access the
> full power of POWER6 DFP, but glibc itself doesn't use DFP internally,
> so I still don't see how the above suggested reduction makes things
> really worse.
>
> We are only talking about limiting the glibc multilibs that we build.
>
> How about:
>
> * 970 with POWER6 tuning.
> - Will not use DFP within glibc, but we have never used DFP in glibc.
>
> Yes, but customers do use DFP, just not the customers you have talked to.

How many of those customers run Fedora? That is the context of this
conversation.

josh

> So why not build libdfp for 970 as the base with emulation, and for
> power6 with hardware DFP? You can alias the power6 libdfp for power7/8.
>
> Then I am OK with skipping power6 for glibc.
>
>> * POWER7 with POWER8 tuning.
>> - Enables POWER8 IFUNCs.
>> - Enables HTM by checking for capability (not limited to strict POWER7).
>> - Yes, glibc loses out on direct move optimizations.
>>
> That works as long as libdfp is available (power6 build will work).
>
>> * POWER9
>> - Enables everything.
>>
> Will need this for PPC64LE with power8 (base) and power9 optimization.
>
> We have no plans for power9 native PPC64BE (only power8 compatibility
> mode). So I am not concerned if the PPC64BE multilib stops with power7 or 8.
>
>> What is the minimum number of multilibs we need to take real advantage
>> of the hardware?
>>
>> Having 5 multilibs is a lot (970, P6, P7, P8, P9) and would be really slow
>> to finish Fedora builds for CI and integration testing.
>>
>> Can IBM help identify some subset that would reduce our build times?
>>
>
>
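
A minimal sketch of how the reduced multilib set discussed above would map
onto hardware, assuming the loader simply picks the newest built multilib
whose base ISA the CPU supports. The names GENERATIONS, pick_multilib, FULL
and REDUCED are hypothetical, and the model deliberately ignores -mtune
settings and IFUNC dispatch, which recover part of the difference at run time.

```python
# Hypothetical ordering of the CPU generations named in the thread, oldest first.
GENERATIONS = ["970", "power6", "power7", "power8", "power9"]


def pick_multilib(cpu, built):
    """Return the newest built multilib whose base ISA the given CPU supports."""
    cpu_rank = GENERATIONS.index(cpu)
    candidates = [m for m in built if GENERATIONS.index(m) <= cpu_rank]
    if not candidates:
        raise ValueError("no usable multilib for " + cpu)
    return max(candidates, key=GENERATIONS.index)


# The five-multilib set versus the proposed three-multilib set.
FULL = {"970", "power6", "power7", "power8", "power9"}
REDUCED = {"970", "power7", "power9"}

if __name__ == "__main__":
    for cpu in GENERATIONS:
        print(cpu, "full:", pick_multilib(cpu, FULL),
              "reduced:", pick_multilib(cpu, REDUCED))
```

Under this simplified model, only POWER6 and POWER8 machines end up on an
older base ISA with the reduced set (970 and power7 respectively), which is
where the DFP and direct move concerns raised in the thread apply.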