I would like to tweak glibc so that it only builds one POWER 620/970 libc for ppc64, and not multiple libcs with POWER6/POWER7/POWER8 optimizations.
This does not affect ppc64le in any way.
The reason for this change is that the ppc64 builders have been awfully slow of late, and the multiple builds (four at present) make ppc64 the slowest architecture to build by quite a margin.
Any comments?
Thanks, Florian
On 08/31/2017 04:29 AM, Florian Weimer wrote:
I would like to tweak glibc so that it only builds one POWER 620/970 libc for ppc64, and not multiple libcs with POWER6/POWER7/POWER8 optimizations.
This does not affect ppc64le in any way.
The reason for this change is that the ppc64 builders have been awfully slow of late, and the multiple builds (four at present) make ppc64 the slowest architecture to build by quite a margin.
Any comments?
IBM has argued in the past that these POWER6/POWER7/POWER8 variants were important and brought performance gains that were not attainable with IFUNCs alone.
Therefore we implemented this using IBM's guidance in Fedora to provide the best possible performance for our users' systems.
This problem doesn't go away for ppc64le either. Once we have POWER9 systems, will it be a question of doing POWER9 multilibs as well?
In summary:
- Is the added performance of the multilib builds for users worth the build cost and development slowdown caused by slow builds?
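For readers unfamiliar with the distinction drawn above: an IFUNC selects one implementation of a single function at load time, based on what the CPU reports, whereas a multilib is a separate build of the whole library compiled for a given CPU. The following is a minimal sketch of the IFUNC idea only, written in Python purely for illustration. The capability bits and function names are hypothetical and are not glibc's.

```python
# Illustrative model of IFUNC-style dispatch; this is not glibc code.
# HAS_VSX and HAS_ISA_2_07 are hypothetical capability bits.
HAS_VSX = 1 << 0
HAS_ISA_2_07 = 1 << 1


def memcpy_generic(data):
    # Stand-in for the baseline implementation.
    return bytes(data)


def memcpy_power7(data):
    # Stand-in for a hand-written POWER7 variant.
    return bytes(data)


def memcpy_power8(data):
    # Stand-in for a hand-written POWER8 variant.
    return bytes(data)


def resolve_memcpy(hwcap):
    """Pick one implementation from the reported capability bits."""
    if hwcap & HAS_ISA_2_07:
        return memcpy_power8
    if hwcap & HAS_VSX:
        return memcpy_power7
    return memcpy_generic


# The resolver runs once; later calls go straight to the chosen function.
memcpy = resolve_memcpy(HAS_VSX)
```

Only functions that have hand-written variants benefit from this kind of dispatch. Everything else in the library is whatever the compiler emitted for the build's baseline, which is the gap the multilib builds were meant to close.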
On Thu, Aug 31, 2017 at 11:32 AM, Carlos O'Donell carlos@redhat.com wrote:
On 08/31/2017 04:29 AM, Florian Weimer wrote:
I would like to tweak glibc so that it only builds one POWER 620/970 libc for ppc64, and not multiple libcs with POWER6/POWER7/POWER8 optimizations.
This does not affect ppc64le in any way.
The reason for this change is that the ppc64 builders have been awfully slow of late, and the multiple builds (four at present) make ppc64 the slowest architecture to build by quite a margin.
Any comments?
IBM has argued in the past that these POWER6/POWER7/POWER8 variants were important and brought performance gains that were not attainable with IFUNCs alone.
Therefore we implemented this using IBM's guidance in Fedora to provide the best possible performance for our users' systems.
Yes. That is partly why Florian is reaching out now, to ensure we solicit feedback from the relevant community of which IBM is a large part.
As a middle ground, does disabling POWER6 and POWER7 variants save significant time? Having 970 and POWER8 covers the lowest commonly available "cheap" option as well as the newest released POWER hardware.
This problem doesn't go away for ppc64le either. Once we have POWER9 systems, will it be a question of doing POWER9 multilibs as well?
Perhaps we come up with a de facto standard of supporting N and N-1 for the current POWER hardware generations, with the change happening on a Fedora release boundary? So when POWER9 comes out, we build POWER8 and POWER9?
In summary:
- Is the added performance of the multilib builds for users worth the build cost and development slowdown caused by slow builds?
That is indeed a good question overall. If the answer is no, then all of the above is irrelevant. If the answer is yes, which I suspect it to be, then data showing that would be good to have.
josh
On Thu, 2017-08-31 at 15:40 -0400, Josh Boyer wrote:
As a middle ground, does disabling POWER6 and POWER7 variants save significant time? Having 970 and POWER8 covers the lowest commonly available "cheap" option as well as the newest released POWER hardware.
POWER6 is in-order, so *if* we still care about it at all then there is potentially some benefit in having a dedicated build for it. Dropping POWER7 is probably a no-brainer; it ought to cope with whatever we throw at it.
On 08/31/2017 09:45 PM, David Woodhouse wrote:
On Thu, 2017-08-31 at 15:40 -0400, Josh Boyer wrote:
As a middle ground, does disabling POWER6 and POWER7 variants save significant time? Having 970 and POWER8 covers the lowest commonly available "cheap" option as well as the newest released POWER hardware.
POWER6 is in-order, so *if* we still care about it at all then there is potentially some benefit in having a dedicated build for it. Dropping POWER7 is probably a no-brainer; it ought to cope with whatever we throw at it.
Could we build the 970 baseline build with -mtune=power6? Then perhaps we only need two builds.
But I can certainly drop POWER7 as a first step and see how the build time compares with armhfp.
Thanks, Florian
On Thu, 2017-08-31 at 22:11 +0200, Florian Weimer wrote:
On 08/31/2017 09:45 PM, David Woodhouse wrote:
On Thu, 2017-08-31 at 15:40 -0400, Josh Boyer wrote:
As a middle ground, does disabling POWER6 and POWER7 variants save significant time? Having 970 and POWER8 covers the lowest commonly available "cheap" option as well as the newest released POWER hardware.
POWER6 is in-order, so *if* we still care about it at all then there is potentially some benefit in having a dedicated build for it. Dropping POWER7 is probably a no-brainer; it ought to cope with whatever we throw at it.
Could we build the 970 baseline build with -mtune=power6? Then perhaps we only need two builds.
We are only talking about PPC64BE here...
970 is really power4 plus altivec, so it is very back level compared to power6, which added DFP.
But I can certainly drop POWER7 as a first step and see how the build time compares with armhfp.
power7 is a major update with Vector Scalar Extensions (more registers and 200+ new instructions).
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
Thanks, Florian
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggesting:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tuning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggesting:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tuning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose means Power6 would be restricted to software emulation of DFP, with the DFP hardware left unused. It would also not get the mutex lock hint and compare bytes optimizations.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. It also leaves the HTM and direct move (GPR <-> VSR) optimizations disabled.
So if you are holding on to your Apple G5 you may not know what you're missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
On 09/05/2017 03:03 PM, Steven Munroe wrote:
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose means Power6 would be restricted to software emulation of DFP, with the DFP hardware left unused. It would also not get the mutex lock hint and compare bytes optimizations.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. It also leaves the HTM and direct move (GPR <-> VSR) optimizations disabled.
So if you are holding on to your Apple G5 you may not know what you're missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
The library, glibc, *may* miss out on some of those instructions, but *iff* the compiler would have generated them when compiling generic C code.
The POWER7 multilib will still have the POWER8 IFUNCs, so it makes use of all the POWER8 assembly implementations that IBM has contributed upstream.
How many of the features you quote are actually used by glibc when compiled for that architecture?
I expect user applications would make use of libdfp to access the full power of POWER6 DFP, but glibc itself doesn't use DFP internally, so I still don't see how the above suggested reduction makes things really worse.
We are only talking about limiting the glibc multilibs that we build.
How about:
- 970 with POWER6 tuning.
  - Will not use DFP within glibc, but we have never used DFP in glibc.
- POWER7 with POWER8 tuning.
  - Enables POWER8 IFUNCs.
  - Enables HTM by checking for capability (not limited to strict POWER7).
  - Yes, glibc loses out on direct move optimizations.
- POWER9
  - Enables everything.
What is the minimum number of multilibs we need to take real advantage of the hardware?
Having 5 multilibs is a lot (970, P6, P7, P8, P9) and would be really slow to finish Fedora builds for CI and integration testing.
Can IBM help identify some subset that would reduce our build times?
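To make the trade-off in the proposal above easier to see, here is a rough sketch in Python of which glibc build each CPU generation would end up running, before and after the reduction. It covers only the four ppc64 variants built today and leaves POWER9 out. The fallback rule (a CPU runs the newest built variant whose baseline it supports) is an assumption made for illustration; the real choice depends on how the library directories and any aliases are laid out.

```python
# Simplified model, for illustration only. Assumption: a CPU uses the
# newest built variant whose baseline ISA it supports.
ISA_ORDER = ["970", "power6", "power7", "power8"]

CURRENT_BUILDS = ["970", "power6", "power7", "power8"]
# Proposed: 970 baseline (tuned for power6) and power7 baseline
# (tuned for power8). Tuning does not change which CPUs can run a build.
PROPOSED_BUILDS = ["970", "power7"]


def variant_for(cpu, builds):
    """Return the newest variant in `builds` that `cpu` can execute."""
    level = ISA_ORDER.index(cpu)
    usable = [b for b in builds if ISA_ORDER.index(b) <= level]
    return max(usable, key=ISA_ORDER.index)


def plan(builds):
    """Map every CPU generation to the variant it would run."""
    return {cpu: variant_for(cpu, builds) for cpu in ISA_ORDER}
```

Under this model, `plan(PROPOSED_BUILDS)` puts POWER6 machines on the 970 build and POWER8 machines on the POWER7 build, which is exactly where the objections about unused DFP hardware and missing direct moves come from, while halving the number of builds.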
On Tue, Sep 5, 2017 at 9:35 PM, Carlos O'Donell carlos@redhat.com wrote:
POWER9
- Enables everything.
I thought Power9 was being focused towards ppc64le and not ppc64, so for BE it might not be worth enabling, as you're likely going to want to run this HW as LE anyway.
On Tue, Sep 5, 2017 at 5:24 PM, Peter Robinson pbrobinson@gmail.com wrote:
I thought Power9 was being focused towards ppc64le and not ppc64, so for BE it might not be worth enabling, as you're likely going to want to run this HW as LE anyway.
We haven't had much discussion on Power9 on this list as it's still not publicly GAed hardware, and therefore not something we can build for in Fedora yet. When that time comes, I'd definitely suggest ONLY enabling for little-endian.
josh
On Tue, 5 Sep 2017 18:21:06 -0400 Josh Boyer jwboyer@fedoraproject.org wrote:
We haven't had much discussion on Power9 on this list as it's still not publicly GAed hardware, and therefore not something we can build for in Fedora yet. When that time comes, I'd definitely suggest ONLY enabling for little-endian.
I think the main question regarding ppc64 (BE) is how many users there are at all. Is anyone using Power-based HW in addition to the G5, and if yes, which one: Power 6, 7 or 8? I suspect the number of users of Power-based HW is close to zero (especially for Power<8). If I am wrong, then please speak up now :-)
Dan
On Wed, Sep 6, 2017 at 5:45 AM, Dan Horák dan@danny.cz wrote:
I think the main question regarding ppc64 (BE) is how many users there are at all. Is anyone using Power-based HW in addition to the G5, and if yes, which one: Power 6, 7 or 8? I suspect the number of users of Power-based HW is close to zero (especially for Power<8). If I am wrong, then please speak up now :-)
Hi all, I'm late to this discussion but wanted to chime in as a hobbyist. My main use is software development & exploring different architectures.
I run Fedora ppc64 BE on Power7 (IBM PS701, PS703 blades). Pretty sure it's Fedora 25, had no problems installing and running.
I also have two IBM QS22 blades I'd like to make use of. Should I expect Fedora to work on these?
Would like to run a graphical version on Apple G5 but had boot-time trouble last time I tried installing Fedora. Maybe the same issue Al Dunsmuir mentioned? Ubuntu MATE mostly works on the G5, but it was an unpleasant adventure getting it on there in a working state.
How much would glibc tuning make a difference, given my purpose of custom software development? Most of my stuff is multi-threaded C++11 with vector intrinsics. I'm not running databases or general-purpose software. As long as the shipped compiler can generate tuned code for the host... isn't that enough?
Thanks for keeping ppc alive!
Mike Erwin musician, naturalist, pixel pusher, hacker extraordinaire
On Wed, Dec 13, 2017 at 12:21 PM, Mike Erwin significant.bit@gmail.com wrote:
Hi all, I'm late to this discussion but wanted to chime in as a hobbyist. My main use is software development & exploring different architectures.
I run Fedora ppc64 BE on Power7 (IBM PS701, PS703 blades). Pretty sure it's Fedora 25, had no problems installing and running.
I also have two IBM QS22 blades I'd like to make use of. Should I expect Fedora to work on these?
No. Cell support was dropped a while ago.
Would like to run a graphical version on Apple G5 but had boot-time trouble last time I tried installing Fedora. Maybe the same issue Al Dunsmuir mentioned? Ubuntu MATE mostly works on the G5, but it was an unpleasant adventure getting it on there in a working state.
How much would glibc tuning make a difference, given my purpose of custom software development? Most of my stuff is multi-threaded C++11 with vector intrinsics. I'm not running databases or general-purpose software. As long as the shipped compiler can generate tuned code for the host... isn't that enough?
I'd think so.
josh
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
How about:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes, but customers do use DFP, just not the customers you have talked to.
So why not build libdfp for 970 as the base for emulation, and for power6 with hardware DFP? You can alias the power6 libdfp for power7/8.
Then I am OK with skipping power6 for glibc.
- POWER7 with POWER8 tuning.
- Enables POWER8 IFUNCs.
- Enables HTM by checking for capability (not limited to strict POWER7).
- Yes, glibc loses out on direct move optimizations.
That works as long as libdfp is available (power6 build will work).
- POWER9
- Enables everything.
Will need this for PPC64LE with power8 (base) and power9 optimization.
We have no plans for native PPC64BE on power9 (only power8 compatibility mode). So I am not concerned if PPC64BE multilib stops with power7 or 8.
What is the minimum number of multilibs we need to take real advantage of the hardware?
Having 5 multilibs is a lot (970, P6, P7, P8, P9) and would be really slow to finish Fedora builds for CI and integration testing.
Can IBM help identify some subset that would reduce our build times?
On Tue, Sep 5, 2017 at 10:20 PM, Steven Munroe munroesj@linux.vnet.ibm.com wrote:
Yes, but customers do use DFP, just not the customers you have talked to.
How many of those customers run Fedora? That is the context of this conversation.
josh
On Wed, 2017-09-06 at 10:31 -0400, Josh Boyer wrote:
On Tue, Sep 5, 2017 at 10:20 PM, Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
On 09/05/2017 03:03 PM, Steven Munroe wrote:
On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggestion:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tunning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose will mean Power6 would be restricted to software emulation of DFP and the DFP hardware unused. And will not have the mutex lock hint and compare bytes optimization.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. Also will leave the HTM and direct moves (GPR <-> VSR) optimization disabled.
So if you holding on to your Apple G5 you may not know what your missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
The library, glibc, *may* be missing the use of some of those instructions in the library and *iff* the compiler generated them as part of the compilation of generic C code.
The POWER7 multilib will still have POWER8 IFUNCs which make use of all the POWER8 assembly implemented IFUNCs that IBM has contributed upstream.
How much of these features you quote are actually used by glibc when compiled for that architecture?
I expect user applications would make use of libdfp to access the full power of POWER6 DFP, but glibc itself doesn't use DFP internally, so I still don't see how the above suggested reduction makes things really worse.
We are only talking about limiting the glibc multilibs that we build.
How about:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
How many of those customers run Fedora? That is the context of this conversation.
That is the problem with open source: customers don't have to tell you (packagers or distros) what they are using. We only hear when there is a performance complaint or a bug.
I have talked to customers that are using DFP. They were usually banks trying to fake decimal arithmetic using binary float, and having problems as a result. They were happy to hear that we had real Decimal Floating Point with a fast implementation in hardware.
Once they had a solution, we never heard from them again. As is usual for open source.
So I don't remember any that were specifically planning to use Fedora (usually an enterprise distro), but that is NOT proof that none exist.
Also they can always use the IBM Advance Toolchain which always provides multilib and libdfp.
But it would be better (easier for the customer) if it was just there.
On Wed, 2017-09-06 at 10:31 -0400, Josh Boyer wrote:
On Tue, Sep 5, 2017 at 10:20 PM, Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
On 09/05/2017 03:03 PM, Steven Munroe wrote:
On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggestion:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tunning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose will mean Power6 would be restricted to software emulation of DFP and the DFP hardware unused. And will not have the mutex lock hint and compare bytes optimization.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. Also will leave the HTM and direct moves (GPR <-> VSR) optimization disabled.
So if you holding on to your Apple G5 you may not know what your missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
The library, glibc, *may* be missing the use of some of those instructions in the library and *iff* the compiler generated them as part of the compilation of generic C code.
The POWER7 multilib will still have POWER8 IFUNCs which make use of all the POWER8 assembly implemented IFUNCs that IBM has contributed upstream.
How much of these features you quote are actually used by glibc when compiled for that architecture?
I expect user applications would make use of libdfp to access the full power of POWER6 DFP, but glibc itself doesn't use DFP internally, so I still don't see how the above suggested reduction makes things really worse.
We are only talking about limiting the glibc multilibs that we build.
How about:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
How many of those customers run Fedora? That is the context of this conversation.
And you are not going to get an answer to that, because the data is not available to me (or anyone).
And you don't have data that says they are not...
On Wed, Sep 6, 2017 at 11:59 AM, Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Wed, 2017-09-06 at 10:31 -0400, Josh Boyer wrote:
On Tue, Sep 5, 2017 at 10:20 PM, Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
On 09/05/2017 03:03 PM, Steven Munroe wrote:
On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggestion:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tunning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose will mean Power6 would be restricted to software emulation of DFP and the DFP hardware unused. And will not have the mutex lock hint and compare bytes optimization.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. Also will leave the HTM and direct moves (GPR <-> VSR) optimization disabled.
So if you holding on to your Apple G5 you may not know what your missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
The library, glibc, *may* be missing the use of some of those instructions in the library and *iff* the compiler generated them as part of the compilation of generic C code.
The POWER7 multilib will still have POWER8 IFUNCs which make use of all the POWER8 assembly implemented IFUNCs that IBM has contributed upstream.
How much of these features you quote are actually used by glibc when compiled for that architecture?
I expect user applications would make use of libdfp to access the full power of POWER6 DFP, but glibc itself doesn't use DFP internally, so I still don't see how the above suggested reduction makes things really worse.
We are only talking about limiting the glibc multilibs that we build.
How about:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
How many of those customers run Fedora? That is the context of this conversation.
And you are not going to get answer to that. Because the data is not available to me (or anyone).
And you don't have data that says they are not...
That is true, certainly. However, "customer" in the Fedora sense doesn't make sense so we must translate that to "user" and/or "contributor". Contributors to Fedora make the decisions taking users into account. Considering we have very limited data on users, we have to give more weight to what the contributors think is most supportable going forward. The goal here is continued, sustainable support with the limited contributor base we have.
josh
On 09/06/2017 04:20 AM, Steven Munroe wrote:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
So why not build libdfp for 970 as base for emulation and power6 with hardware DFP. You can alias the power6 libdfp for power7/8
Then I am OK with skipping power6 for glibc.
libdfp currently isn't in Fedora. glibc doesn't use _Decimal, so I don't see how these things are related.
Would you please elaborate?
Thanks, Florian
On Wed, 2017-09-06 at 18:08 +0200, Florian Weimer wrote:
On 09/06/2017 04:20 AM, Steven Munroe wrote:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
So why not build libdfp for 970 as base for emulation and power6 with hardware DFP. You can alias the power6 libdfp for power7/8
Then I am OK with skipping power6 for glibc.
libdfp currently isn't Fedora. glibc doesn't use _Decimal, so I don't see how these things are related.
Would you please elaborate?
I thought we were talking about a distro and supporting the platform. GLIBC is too narrow a topic.
POWER has supported IEEE 754R Decimal Floating Point in hardware since POWER6 (2007).
_Decimal is included in C14 and C17 standards.
libdfp is NOT new (also circa 2007) and is included in RHEL6, RHEL7, and other enterprise distros (Supplement or extras).
https://github.com/libdfp/libdfp
We can't use the messed-up BID code and format from x86, which is software emulation only.
So the question is: why doesn't Fedora build and ship libdfp?
On Wed, 2017-09-06 at 11:32 -0500, Steven Munroe wrote:
libdfp currently isn't Fedora. glibc doesn't use _Decimal, so I don't see how these things are related. Would you please elaborate?
I though we here talking about a distro and supporting the platform. GLIBC is too narrow a topic.
No.
We are talking *purely* about what builds of glibc should be done, as the current set seems excessive and is taking too long.
On Wed, 2017-09-06 at 20:12 +0100, David Woodhouse wrote:
On Wed, 2017-09-06 at 11:32 -0500, Steven Munroe wrote:
libdfp currently isn't Fedora. glibc doesn't use _Decimal, so I don't see how these things are related.
Would you please elaborate?
I though we here talking about a distro and supporting the platform. GLIBC is too narrow a topic.
No.
We are talking *purely* about what builds of glibc should be done, as the current set seems excessive and is taking too long.
Yes, and I suggested, in the spirit of compromise, dropping the power6 glibc build if Fedora would provide a libdfp multilib for emulation and power6 HW DFP. That eliminates 1 glibc build in exchange for 2 much smaller libdfp builds.
That seems fair and equitable to me in the interest of supporting the POWER platform and the Linux ecosystem. For example, one could develop and test _Decimal[32|64|128] applications and libraries on a G5 970 for deployment on power6/7/8/9 systems.
Because ... Decimal Floating Point is in all the relevant standards and fully supported in the last 3 generations of POWER hardware (4 generations with power9).
On Wed, 06 Sep 2017 11:32:55 -0500 Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Wed, 2017-09-06 at 18:08 +0200, Florian Weimer wrote:
On 09/06/2017 04:20 AM, Steven Munroe wrote:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
So why not build libdfp for 970 as base for emulation and power6 with hardware DFP. You can alias the power6 libdfp for power7/8
Then I am OK with skipping power6 for glibc.
libdfp currently isn't Fedora. glibc doesn't use _Decimal, so I don't see how these things are related.
Would you please elaborate?
I though we here talking about a distro and supporting the platform. GLIBC is too narrow a topic.
POWER supports IEEE754R Decimal Floating Point in Hardware, Since Power6 (2007).
_Decimal is included in C14 and C17 standards.
libdfp is NOT new (also circa 2007) And is included in RHEL6 RHEL7 and other enterprise distros (Supplement or extras)
https://github.com/libdfp/libdfp
We cant use the messed up BID code and format from X86. Which is software emulation only.
So the question is: why doesn't Fedora build and ship libdfp.
Weren't there some licensing issues originally? And later no one volunteered to maintain it, it seems :-(
Dan
On Wed, 2017-09-06 at 21:52 +0200, Dan Horák wrote:
On Wed, 06 Sep 2017 11:32:55 -0500 Steven Munroe munroesj@linux.vnet.ibm.com wrote:
On Wed, 2017-09-06 at 18:08 +0200, Florian Weimer wrote:
On 09/06/2017 04:20 AM, Steven Munroe wrote:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes but customer do use DFP, Just not the customers you have talked to.
So why not build libdfp for 970 as base for emulation and power6 with hardware DFP. You can alias the power6 libdfp for power7/8
Then I am OK with skipping power6 for glibc.
libdfp currently isn't Fedora. glibc doesn't use _Decimal, so I don't see how these things are related.
Would you please elaborate?
I though we here talking about a distro and supporting the platform. GLIBC is too narrow a topic.
POWER supports IEEE754R Decimal Floating Point in Hardware, Since Power6 (2007).
_Decimal is included in C14 and C17 standards.
libdfp is NOT new (also circa 2007) And is included in RHEL6 RHEL7 and other enterprise distros (Supplement or extras)
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_libdfp_libdf...
We cant use the messed up BID code and format from X86. Which is software emulation only.
So the question is: why doesn't Fedora build and ship libdfp.
weren't there any licensing issues originally? And later no one volunteered to maintain it it seems :-(
GNU Lesser General Public License v2.1
Seems simple enough.
Yes, we lost some continuity when Ryan took a new job, but Tulio and team support libdfp now.
Dan
On Tue, 2017-09-05 at 15:35 -0500, Carlos O'Donell wrote:
On 09/05/2017 03:03 PM, Steven Munroe wrote:
On Fri, 2017-09-01 at 09:24 -0500, Carlos O'Donell wrote:
On 09/01/2017 09:03 AM, Steven Munroe wrote:
If you want to cut back you can't try 970 as base, plus power6 and power7 -mtune power8
s/can't/can/g?
I assume you're suggestion:
- 970 base multilib.
- power6 multilib.
- power7 multilib with power8 tunning.
Which implies:
- Drop the power8 multilib.
That drops only 1 multilib.
Do we need power6 or can we fold that into the 970? e.g.
- 970 base multilib with power6 tuning.
- power7 multilib with power8 tuning.
That seems simple but is ignoring the major features added with each ISA level.
So between 970 and power8 we added DFP, VSX, VSX Scalar Float, and HTM. This is over 300 new instructions including additions to existing ISA categories, like Fixed Point 64-bit, Floating Point, and VMX (Altivec) beyond what was in 970.
So what you propose will mean Power6 would be restricted to software emulation of DFP and the DFP hardware unused. And will not have the mutex lock hint and compare bytes optimization.
Power7 will get both HW DFP and VSX (vector double and extended scalar double) but without Scalar extended Float. Also will leave the HTM and direct moves (GPR <-> VSR) optimization disabled.
So if you holding on to your Apple G5 you may not know what your missing, but your P6 JS21 blade will be disappointing. And your P8 will run slow due to missing direct move support.
The library, glibc, *may* be missing the use of some of those instructions in the library and *iff* the compiler generated them as part of the compilation of generic C code.
The POWER7 multilib will still have POWER8 IFUNCs which make use of all the POWER8 assembly implemented IFUNCs that IBM has contributed upstream.
How much of these features you quote are actually used by glibc when compiled for that architecture?
I expect user applications would make use of libdfp to access the full power of POWER6 DFP, but glibc itself doesn't use DFP internally, so I still don't see how the above suggested reduction makes things really worse.
We are only talking about limiting the glibc multilibs that we build.
How about:
- 970 with POWER6 tuning.
- Will not use DFP within glibc, but we have never used DFP in glibc.
Yes, but customers do use DFP, just not the customers you have talked to.
So why not build libdfp for 970 as the base for emulation, and for power6 with hardware DFP? You can alias the power6 libdfp for power7/8.
Then I am OK with skipping power6 for glibc.
- POWER7 with POWER8 tuning.
- Enables POWER8 IFUNCs.
- Enables HTM by checking for capability (not limited to strict POWER7).
- Yes, glibc looses out on direct move optimizations.
That works as long as libdfp is available (power6 build will work).
- POWER9
- Enables everything.
Will need this for PPC64LE with power8 (base) and power9 optimization.
We have no plans for power9-native PPC64BE (only power8 compatibility mode). So I am not concerned if the PPC64BE multilib stops with power7 or 8.
What are the minimum number of multilibs we need to take real advantage of the hardware.
Having 5 multilibs is a lot (970, P6, P7, P8, P9) and would be really slow to finish Fedora builds for CI and integration testing.
Can IBM help identify some subset that would reduce our build times?
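To make the trade-off in the proposals above concrete, here is a rough model of which multilib each CPU generation would end up running. It is a simplification under stated assumptions: the ISA levels are treated as a strict linear order, the system is assumed to load the newest built variant that the CPU can execute, and `-mtune` is treated as affecting scheduling only, not the instruction set. The names `ISA_ORDER` and `pick_multilib` are hypothetical.

```python
# Assumed linear ordering of ISA levels, oldest first (illustrative only).
ISA_ORDER = ["ppc970", "power6", "power7", "power8", "power9"]


def pick_multilib(cpu, built):
    """Return the newest built multilib whose ISA level does not exceed the CPU's."""
    cpu_level = ISA_ORDER.index(cpu)
    candidates = [m for m in built if ISA_ORDER.index(m) <= cpu_level]
    if not candidates:
        raise ValueError(f"no multilib in {built} can run on {cpu}")
    return max(candidates, key=ISA_ORDER.index)


if __name__ == "__main__":
    current = ["ppc970", "power6", "power7", "power8"]   # four builds today
    proposed = ["ppc970", "power7"]                      # 970 base + power7 (-mtune=power8)
    for cpu in ["ppc970", "power6", "power7", "power8"]:
        print(cpu, "current:", pick_multilib(cpu, current),
              "proposed:", pick_multilib(cpu, proposed))
```

Under this model a POWER6 machine falls back to the 970 build and a POWER8 machine to the power7 build, which is the loss being debated in the thread. IFUNC-selected routines are not captured here, since they are chosen at run time regardless of which multilib is loaded.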
On 08/31/2017 02:40 PM, Josh Boyer wrote:
In summary:
- Is the added performance of the multilib builds for users worth the build cost and development slowdown caused by slow builds?
That is indeed a good question overall. If the answer is no, then all of the above is irrelevant. If the answer is yes, which I suspect it to be, then data showing that would be good to have.
Josh,
I like your suggestion of removing POWER6 and POWER7 optimized versions in anticipation that we'll have POWER8 and POWER9 going forward, leaving only 3 multilibs: (a) most compatible and (b) N and N-1 for new hardware. It's a reasonable tradeoff.
Florian,
Are you able to take master and run the microbenchmark on each of POWER6, POWER7, and POWER8, building twice, once with no special options, and once again with the required '-mcpu=powerX -mtune=powerX' (and --with-cpu=powerX in configure)?
Nobody at this point has provided any data on whether there are discernible differences between the multilibs, but now with what we have in our microbenchmark it would represent some kind of data.
I might argue that if the benchmark doesn't show any statistically significant gains between a generic build and a -mcpu/-mtune build, we go ahead with your suggestion to drop all of them except the most compatible one. At that point our position is defensible.
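One way to judge "statistically significant" for such a comparison is Welch's t-test on repeated timings of the same benchmark from the two builds. The sketch below is an illustration only: the timing values are made up, the function names are hypothetical, and it is not part of the glibc benchmark suite.

```python
import math
import statistics


def welch_t(generic, tuned):
    """Return (t statistic, approximate degrees of freedom) for two timing samples."""
    n_g, n_t = len(generic), len(tuned)
    # Squared standard errors of the two sample means.
    se2_g = statistics.variance(generic) / n_g
    se2_t = statistics.variance(tuned) / n_t
    t = (statistics.mean(generic) - statistics.mean(tuned)) / math.sqrt(se2_g + se2_t)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_g + se2_t) ** 2 / (se2_g ** 2 / (n_g - 1) + se2_t ** 2 / (n_t - 1))
    return t, df


def relative_gain(generic, tuned):
    """Fraction of run time saved by the tuned build (positive means faster)."""
    return 1.0 - statistics.mean(tuned) / statistics.mean(generic)


if __name__ == "__main__":
    # Hypothetical timings in nanoseconds per call, not measured data.
    generic_ns = [10.0, 10.2, 9.8, 10.1, 9.9]
    tuned_ns = [9.0, 9.1, 8.9, 9.2, 8.8]
    t, df = welch_t(generic_ns, tuned_ns)
    print(f"t = {t:.2f}, df = {df:.1f}, gain = {relative_gain(generic_ns, tuned_ns):.1%}")
```

The t statistic would then be compared against a critical value for the computed degrees of freedom; a small |t| would support dropping the tuned variants.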
On 09/01/2017 02:13 AM, Carlos O'Donell wrote:
Are you able to take master and run the microbenchmark on each of POWER6, POWER7, and POWER8, building twice, once with no special options, and once again with the required '-mcpu=powerX -mtune=powerX' (and --with-cpu=powerX in configure)?
I don't have access to the necessary hardware. Installation of Fedora 25 and 26 fails in Beaker. I used jobs like this one:
<job retention_tag="scratch">
  <whiteboard></whiteboard>
  <recipeSet priority="High">
    <recipe whiteboard="" role="RECIPE_MEMBERS" ks_meta="" kernel_options="" kernel_options_post="">
      <autopick random="false"/>
      <watchdog panic="ignore"/>
      <packages/>
      <ks_appends/>
      <repos/>
      <distroRequires>
        <and>
          <distro_family op="=" value="Fedora25"/>
          <distro_variant op="=" value="Server"/>
          <distro_arch op="=" value="ppc64"/>
        </and>
      </distroRequires>
      <hostRequires>
        <cpu><model_name op="like" value="%POWER5%"/></cpu>
        <system_type value="Machine"/>
      </hostRequires>
      <partitions/>
      <task name="/distribution/install" role="STANDALONE"/>
      <task name="/distribution/reservesys" role="STANDALONE">
        <params>
          <param name="RESERVETIME" value="86400"/>
        </params>
      </task>
    </recipe>
  </recipeSet>
</job>
Installation times out with no log messages printed to the console. The same happens with %POWER6% and/or Fedora 26.
Thanks, Florian
On 09/01/2017 02:48 AM, Florian Weimer wrote:
On 09/01/2017 02:13 AM, Carlos O'Donell wrote:
Are you able to take master and run the microbenchmark on each of POWER6, POWER7, and POWER8, building twice, once with no special options, and once again with the required '-mcpu=powerX -mtune=powerX' (and --with-cpu=powerX in configure)?
I don't have access to the necessary hardware. Installation of Fedora 25 and 26 fails in Beaker. I used jobs like this one:
I'm taking this up with Dan Horak to see how we can arrange for machine access.
Thanks.
On Thu, 31 Aug 2017 11:29:42 +0200 Florian Weimer fweimer@redhat.com wrote:
I would like to tweak glibc so that it only builds one POWER 620/970 libc for ppc64, and not multiple libcs with POWER6/POWER7/POWER8 optimizations.
This does not affect ppc64le in any way.
The reason for this change is that the ppc64 builders are awfully slow as of late, and the multiple builds (four times currently) currently make ppc64 the slowest architecture to build by quite a margin.
Any comments?
I think the reason for keeping the lower bound at powerpc 970 is the Apple G5 as the last desktop-class hardware. With Talos II coming as the next viable desktop solution, I would even vote for having power8 as the baseline for ppc64 in some future Fedora version. Even when new developments are done in the ppc64le flavour, ppc64 has its value, for example in easier debugging of big-endian-related issues. Until then, ppc970 + power8 variants would be good enough (IMO).
Dan
On Thursday, August 31, 2017, 4:41:59 PM, Dan Horák wrote:
On Thu, 31 Aug 2017 11:29:42 +0200 Florian Weimer fweimer@redhat.com wrote:
I would like to tweak glibc so that it only builds one POWER 620/970 libc for ppc64, and not multiple libcs with POWER6/POWER7/POWER8 optimizations.
This does not affect ppc64le in any way.
The reason for this change is that the ppc64 builders are awfully slow as of late, and the multiple builds (four times currently) currently make ppc64 the slowest architecture to build by quite a margin.
Any comments?
I think the reason for keeping lower bound as powerpc 970 is Apple G5 as the last desktop class hw, with Talos II as the next viable desktop solution coming I would even vote for having power8 as the base line for ppc64 for some next Fedora version. Even when new developments are done in the ppc64le flavour, ppc64 has it's value in for example easier debugging big endian related issues. Until then ppc970 + power8 variants would be good enough (IMO).
Dan,
A couple of years ago I posted about my intent to help correct the G5 boot situation.
I saw that you were working on and off with another gent who was working on some boot changes on GitHub, so there has been some progress.
Aside from the obvious Anaconda and blivet changes, there need to be a number of changes to improve support for Mac APM-format disks in parted, pyparted, gparted, libblockdev, et al. I came up with prototype changes which Brian C. Lane thought were good (including marking the APM disk map partition as R/O to prevent a user accidentally trashing the disk). The parted primary maintainer (Phillip Susi) rejected them because he insisted that the APM disk map partition be hidden, not just R/O. A full restart is required, but my available free time evaporated around that time.
Since then I've been busy with home renovations (a total basement redo for a home office) that have taken the last year, and real work (including porting nearly 700 RPMs to AIX this year).
I've been doing my AIX work on my own P4+ (7028-6C4 4X - CHRP) box. I've picked up a P5 (9113-550) that I've not had time to seriously use yet. My intent is to also get both working with a Fedora PPC64 BE release. Not really relevant to current Fedora, but I've got a couple of 7046-B50 (32-bit CHRP) boxes that I'm using for some AIX, and I intend to try out NetBSD/ofppc on them.
I plan to finish the bulk of the AIX RPM work in November, and then to restart the boot-related work. Once I get up to speed and that is solved, there seem to be some PREP partition (and GRUB CHRP) related issues that also never seem to get any love, as most new hardware uses PowerVM. I can work on those to the extent that my hardware allows me to reproduce them.
The G5, P4+ and P5 boxes are too outdated for RHEL/CentOS (currently P6+) but represent the kind of hardware that a hobbyist can pick up for a reasonable price. The Apple and IBM boxes tend to be very heavily constructed (literally: 100LB+ for the P4+/P5) and if they have made it this far (no capacitor issues) tend to soldier on forever. It seems to me that folks using Fedora ppc64 tend to be either on (b)leading edge hardware (P8 and now P9) or outdated hardware. Anything in between tends to be RHEL/CentOS.
The price of newer systems like Talos II presents a high barrier to entry for personal use. The crowd-sourcing failed for this exact reason. It would be fine for commercial use (where one can write off the expense) but that tends to go RHEL/CentOS or be used LE only.
For big-endian platforms as a hobbyist, one is basically constrained to vintage Apple/IBM for real hardware, or zSeries emulation on Hercules (on x86).
It would be really helpful to keep the PPC BE port running for this old hardware (absent restrictions in areas like go, rust and java) as it's an interesting and different place than generic commodity PCs.
For these reasons, I'm clearly in favour of retaining the P8+ and 970 variants if the number of variations has to be reduced.
Al
I had to disable the glibc power6 multilib in rawhide due to a POWER6-specific codegen issue which was only recently fixed on the GCC 7 branch upstream.
Thanks, Florian