On 07/27/2017 09:16 AM, Kaleb S. KEITHLEY wrote:
On 07/26/2017 06:25 PM, Al Stone wrote:
> I've been experimenting in a slightly different environment (RHEL vs Fedora)
> but have been seeing oddly similar results. The use or not of the "-pipe" in
> GCC didn't seem to help. If I forced the make in the %build step to be just
> "make" (aka, "make -j1"), I could always get a build to work, albeit slowly.
>
> It turns out there is a typo in the spec file; look for the string
> "WTIH_BABELTRACE" -- that should be "WITH_BABELTRACE". In the environment I'm
> using, "make -j32" is the default state. If I leave the typo alone and do not
> change the "make -j32", I can pretty consistently get the ceph build to fail;
> the failure moves around a bit but generally seems to hang around with where
> the babeltrace headers are being used (somewhere in RBD code, usually). If I
> fix the typo -- and change nothing else -- the build succeeds.
>
> Would you mind trying this one change -- fixing the typo *only* -- and see if
> you get the same results?
If by same result you mean the build still fails, then yes, I get the same result.
It's still running out of memory, though not in the same way as the prior builds.
...
[100%] Building CXX object src/rgw/CMakeFiles/radosgw.dir/rgw_main.cc.o
/usr/include/c++/7/bits/stl_map.h: In static member function 'static void pg_missing_set<TrackChanges>::generate_test_instances(std::__cxx11::list<pg_missing_set<TrackChanges>*>&) [with bool TrackChanges = false]':
/usr/include/c++/7/bits/stl_map.h:493:4: note: parameter passing for argument of type 'std::_Rb_tree<hobject_t, std::pair<const hobject_t, pg_missing_item>, std::_Select1st<std::pair<const hobject_t, pg_missing_item> >, std::less<hobject_t>, std::allocator<std::pair<const hobject_t, pg_missing_item> > >::const_iterator {aka std::_Rb_tree_const_iterator<std::pair<const hobject_t, pg_missing_item> >}' changed in GCC 7.1
__i = _M_t._M_emplace_hint_unique(__i, std::piecewise_construct,
^~~
virtual memory exhausted: Operation not permitted
...
make: *** [Makefile:141: all] Error 2
RPM build errors:
error: Bad exit status from /var/tmp/rpm-tmp.RgosXb (%build)
Bad exit status from /var/tmp/rpm-tmp.RgosXb (%build)
Child return code was: 1
...
See https://koji.fedoraproject.org/koji/taskinfo?taskID=20797264 for full logs.
--
Kaleb
Rats. I thought I had a clue or a pointer; thanks for trying, though. When a
colleague double-checked my results, we discovered that what I was seeing was
not a change in behavior from fixing the typo but a fluke, pure random chance.
We've probably got a race condition in the build somehow: if exactly the right
combination of compiles happens to run in parallel, the OOM killer gets invoked
and compilation fails.
You may have to force the %build section to use "make -j1", as well as remove
the use of "-pipe" as you have done.
In the meantime, I'm following up on a g++ bug that may or may not be relevant;
it definitely stresses memory horribly but I need to prove that something in the
compile is actually hitting the bug. As a workaround, something like this might
help:
diff --git a/ceph.spec b/ceph.spec
index b321a1b..f15171e 100644
--- a/ceph.spec
+++ b/ceph.spec
@@ -811,8 +811,11 @@ cmake .. \
 %endif
     -DBOOST_J=%{_smp_ncpus}
 
+%ifarch aarch64
+make
+%else
 make %{?_smp_mflags}
-
+%endif
 
 %if 0%{with make_check}
 %check
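
If serializing the whole aarch64 build turns out to be too slow, a variation on the workaround above would be to cap the job count by available memory rather than by architecture. The following is only a sketch: the helper name `safe_jobs` and the 2500 MB-per-compile budget are assumptions for illustration, not measured values, and nothing like this is in the spec file as it stands.

```shell
# safe_jobs MEM_MB CPUS PER_JOB_MB
# Print how many parallel jobs fit into MEM_MB if each compile may need up to
# PER_JOB_MB. The result is never more than CPUS and never fewer than 1.
# (Hypothetical helper; the per-job budget is a guess the caller supplies.)
safe_jobs() {
    sj_mem_mb=$1
    sj_cpus=$2
    sj_per_job_mb=$3

    # Integer division: jobs that fit into memory.
    sj_n=$((sj_mem_mb / sj_per_job_mb))

    # Do not exceed the number of CPUs.
    if [ "$sj_n" -gt "$sj_cpus" ]; then
        sj_n=$sj_cpus
    fi

    # Always allow at least one job, i.e. fall back to a serial build.
    if [ "$sj_n" -lt 1 ]; then
        sj_n=1
    fi

    echo "$sj_n"
}

# Possible use in a %build script (left commented out; 2500 is an assumed budget):
#   mem_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
#   make -j"$(safe_jobs $((mem_kb / 1024)) "$(nproc)" 2500)"
```

On a builder with plenty of memory this leaves the job count at the CPU count, and on a memory-starved one it degrades toward "make -j1" without needing an %ifarch.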
--
ciao,
al
-----------------------------------
Al Stone
Software Engineer
Red Hat, Inc.
ahs3(a)redhat.com
-----------------------------------