Hi,
RStudio is failing consistently on armv7l on F35 [1, 2] and rawhide [3, 4] with this message (memory exhausted). The same build on the same machine (CPU and RAM) succeeds on F34 [5]. Any clue what's going on? Why rawhide and F35 and not F34? Anything I can do in the SPEC to fix this?
[1] https://koji.fedoraproject.org/koji/taskinfo?taskID=77159810
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=77170457
[3] https://koji.fedoraproject.org/koji/taskinfo?taskID=77159795
[4] https://koji.fedoraproject.org/koji/taskinfo?taskID=77170370
[5] https://koji.fedoraproject.org/koji/taskinfo?taskID=77159829
Regards,
Hi Iñaki,
You could try to build without LTO, iirc that requires a lot of memory during linking.
And if that doesn't help: ExcludeArch…
Hope this helps,
Dan
Building with lto disabled is a bad idea, as Fedora intentionally enabled lto by default.
What you describe as LTO requiring a lot of memory is caused by building LTO and non-LTO code into the same object file, which requires significantly more memory. For that reason, one can disable building non-LTO code alongside LTO by using the `-fno-fat-lto-objects` compiler flag instead of `-ffat-lto-objects`, if and *only IF* the package in question does *NOT* ship static libraries.
Björn
More background: this default is, of course, backwards. Fedora packages do not generally ship static libraries, so it makes more sense for the few packages that do to opt-in instead of opt-out. Jeff proposed a change to improve that here:
https://fedoraproject.org/wiki/Changes/LTOBuildImprovements
but he left Red Hat, so it hasn't been implemented.
Michael
I'd still like to tackle this but my time is limited.
However, I strongly suspect fat-lto-objects is not the problem here. If the build is running out of memory at link time, that is the LTO phase. The best solution for that is to either disable LTO on the arm target, or (better) limit the parallelism at link time. There was a change to redhat-rpm-config that I think made it into f35 to allow a package to throttle the link-time parallelism.
jeff
This makes sense, because f34 builds consistently succeed on exactly the same hardware. How do I limit just the link-time parallelism?
Could this be related to this [1] commit?
[1] https://src.fedoraproject.org/rpms/redhat-rpm-config/c/bc8fa85e907d4b2b88760...
On 10/13/2021 10:06 AM, Björn 'besser82' Esser wrote:
Building with lto disabled is a bad idea, as Fedora intentionally enabled lto by default.
Yes, but there is nothing inherently wrong with not using LTO. Many packages opt-out for various reasons.
What you describe as LTO requiring a lot of memory is caused by building LTO and non-LTO code into the same object file, which requires significantly more memory. For that reason, one can disable building non-LTO code alongside LTO by using the `-fno-fat-lto-objects` compiler flag instead of `-ffat-lto-objects`, if and *only IF* the package in question does *NOT* ship static libraries.
I doubt fat-lto-objects is the issue here. The "fat" stuff is ignored at link time as the LTO bytecodes will take precedence. It is far more likely that the parallel link with all those LTO bytecodes is what's sucking up all the memory.
Note that we build with fat-lto-objects for a reason. If a package installs any static objects, then those objects must be compiled down to machine code as the LTO bytestreams are not compatible across GCC releases. -ffat-lto-objects ensures this.
To remedy this situation you have to put in bits to redhat-rpm-config/brp-whatever to fail builds when they install .o/.a files into the buildroot that are purely LTO bytecode streams and identify every package that fails that test and fix them to turn on fat-lto-objects. I actually wrote some code to do this and ran a Fedora build, but never had the time to act on the resultant data. I've since left Red Hat and haven't had time to come back to the issue.
Jeff
I think you could try either reducing LTO parallelism (-flto=1 instead of -flto=auto, e.g. by overriding "%_lto_cflags") or reducing debuginfo verbosity (compiling with -g1 or even -g0 instead of -g; I think you need to "sed" the C(XX)FLAGS in this case).
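A minimal sketch of the first option, in plain shell. The sample flag string is an assumption standing in for the distribution defaults, and only the text substitution is shown. Wrapping it in a spec macro, or overriding %_lto_cflags directly, is untested here.

```shell
# Sample flag string; an assumption, not the real value of the build flags.
flags='-O2 -flto=auto -ffat-lto-objects -g -pipe'

# Replace the automatic LTO job count with a single job.
serial_lto=$(echo "$flags" | sed -e 's/-flto=auto/-flto=1/')

echo "$serial_lto"
```

With -flto=1 the link-time optimization runs as a single job, which trades build time for a lower peak memory use.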
Fabio
As the package doesn't build any *distributed* static library, you can try to avoid building the linker object files to contain non-lto code:
%global optflags %(echo '%{optflags}' | sed -e 's!-ffat-lto-objects!-fno-fat-lto-objects!g')
That should drastically cut the amount of memory the linker needs to create the final ELF binary. It doesn't hurt to do that on all arches / releases, as it will also result in significantly shorter build time.
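To see what the substitution does, here is the same sed expression run in plain shell on a sample flag string. The sample value is an assumption; the real one comes from %{optflags} on the builder.

```shell
# Sample flags resembling the defaults; an assumption, not the real %{optflags}.
optflags='-O2 -flto=auto -ffat-lto-objects -g -pipe'

# Same substitution as in the spec line: request slim LTO objects instead of fat ones.
slim_flags=$(echo "$optflags" | sed -e 's!-ffat-lto-objects!-fno-fat-lto-objects!g')

echo "$slim_flags"
```

If the flag is not present in the input, sed leaves the string unchanged, so the macro is harmless on releases that do not set it.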
Björn
Works as suggested in a scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=77177126
Thanks, Björn, Dan and Fabio for your comments.
And thanks for this. I launched a scratch build to test this too, but you were faster, so cancelling now. ;-) I'll implement this suggestion then.
On 10/13/2021 8:51 AM, Björn 'besser82' Esser wrote:
As the package doesn't build any *distributed* static library, you can try to avoid building the linker object files to contain non-lto code:
%global optflags %(echo '%{optflags}' | sed -e 's!-ffat-lto-objects!-fno-fat-lto-objects!g')
That should drastically cut the amount of memory the linker needs to create the final ELF binary. It doesn't hurt to do that on all arches / releases, as it will also result in significantly shorter build time.
I would strongly discourage this. This really needs to be addressed in redhat-rpm-config. See my change proposal for the details.
jeff
On 13/10/2021 15:44, Iñaki Ucar wrote:
Why rawhide and F35 and not F34?
Random issue due to the different builders.
Anything I can do in the SPEC to fix this?
Try this:
%ifarch %{arm}
%global _smp_build_ncpus 1
%endif
If that does not help, you can also try this:
%ifarch %{arm}
%global _smp_build_ncpus 1
%global optflags %(echo %{optflags} | sed 's/-g /-g1 /')
%endif
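A small shell sketch of what the sed expression in the second snippet does. The sample flag strings are assumptions standing in for %{optflags}.

```shell
# Sample flag string; an assumption standing in for %{optflags}.
flags='-O2 -g -pipe -flto=auto'

# Same sed expression as in the spec snippet: the first "-g " becomes "-g1 ".
reduced=$(echo "$flags" | sed 's/-g /-g1 /')

# The pattern needs a trailing space, so a "-g" at the very end is left alone.
unchanged=$(echo '-O2 -pipe -g' | sed 's/-g /-g1 /')

echo "$reduced"
echo "$unchanged"
```

The second case is worth checking against the actual flags: if -g happens to be the last flag, the pattern would need adjusting.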