help needed to find a bug in zorba (or gcc 4.9)

Jerry James loganjerry at gmail.com
Tue Jun 10 18:44:54 UTC 2014


On Tue, Jun 10, 2014 at 6:16 AM, Martin Gieseking
<martin.gieseking at uos.de> wrote:
> Hi,
>
> I've tried to fix the broken zorba package in rawhide for a couple of weeks
> now but, unfortunately, without much success. The upstream developers don't
> seem to be able to find the cause for the issue either.
>
> The problem is that the package fails to build with gcc 4.9.0 (all archs)
> because the generated zorba binary segfaults for some queries due to
> accessing already freed memory. The issue only occurs with optimized builds
> (-O1, -O2, -O3) using gcc 4.9.0. With gcc 4.8.x the binary and thus the
> whole package build and work correctly. Therefore, it might also be possible
> that there's a bug in gcc's optimizer, but I'm not sure.
>
> valgrind and gcc's address sanitizer report the code sections where the
> error occurs but when stepping through them with a debugger, I'm unable to
> understand what's actually going on there. It looks as if the affected code
> should work properly. So I got stuck now.
>
> It would be great if someone could help to track down the issue in order to
> keep the package available in Fedora.

Here's the first problem pointed out by valgrind:
- class Store (src/store/naive/store.h) has a public member "zstring theEmptyNs"
- that object is set to a string that is also added to
"StringPool *theNamespacePool" inside Store::init()
(src/store/naive/store.cpp)
- when the ZorbaImpl destructor runs on the singleton ZorbaImpl
object, it starts this call chain:
  o shutdownInternal(false)
  o StoreManager::shutdownStore(&GENV_STORE)
  o SimpleStore::shutdown(false)
  o Store::shutdown(false)
- Since theNamespacePool is non-NULL, we do this:
  theEmptyNs.~zstring();
  theXmlSchemaNs.~zstring();
  delete theNamespacePool;
  theNamespacePool = NULL;

We destroyed theEmptyNs ... but left it sitting in theNamespacePool.  So
when theNamespacePool's destructor runs, it examines that string,
leading to the crash.  The same thing happens with theXmlSchemaNs.
The fix is to remove those strings from the StringPool instead of
explicitly calling their destructors, and then let the Store destructor
destroy the two strings as ordinary members, like so:

--- src/store/naive/store.cpp.orig 2013-11-06 00:20:44.000000000 -0700
+++ src/store/naive/store.cpp 2014-06-10 12:00:00.000000000 -0600
@@ -333,8 +333,8 @@

     if (theNamespacePool != NULL)
     {
-      theEmptyNs.~zstring();
-      theXmlSchemaNs.~zstring();
+      theNamespacePool->erase(theEmptyNs);
+      theNamespacePool->erase(theXmlSchemaNs);

       delete theNamespacePool;
       theNamespacePool = NULL;
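
To make the ordering in that patch concrete, here is a minimal sketch of
the same idea using toy types.  ToyString, ToyPool and ToyStore are
hypothetical stand-ins built on std::shared_ptr; they are not zorba's
zstring, StringPool or Store, and the namespace text is an arbitrary
placeholder.  The only point is the order of operations: drop the pool's
reference to each string, destroy the pool, and leave the members to be
destroyed by the owning object's destructor.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>

// Toy stand-in for a reference-counted string (NOT zorba's zstring).
using ToyString = std::shared_ptr<const std::string>;

// Toy stand-in for a string pool (NOT zorba's StringPool).
// The pool holds one shared reference to every string it hands out.
class ToyPool {
public:
    // Return the pooled copy of `text`, adding it if it is not there yet.
    ToyString insert(const std::string& text) {
        for (const ToyString& s : entries_) {
            if (*s == text) return s;
        }
        ToyString fresh(new std::string(text));
        entries_.insert(fresh);
        return fresh;
    }

    // Drop the pool's reference; returns true if the string was pooled.
    bool erase(const ToyString& s) { return entries_.erase(s) > 0; }

    std::size_t size() const { return entries_.size(); }

private:
    std::set<ToyString> entries_;
};

// Toy stand-in for the store that owns both the pool and the members.
class ToyStore {
public:
    void init() {
        pool_.reset(new ToyPool);
        emptyNs = pool_->insert("");
        xmlSchemaNs = pool_->insert("urn:example:schema");  // placeholder
    }

    // Mirrors the patched shutdown: erase from the pool, then destroy
    // the pool.  No explicit destructor calls on the members.
    // Returns how many strings were still pooled when the pool died.
    std::size_t shutdown() {
        std::size_t leftover = 0;
        if (pool_) {
            pool_->erase(emptyNs);
            pool_->erase(xmlSchemaNs);
            leftover = pool_->size();
            pool_.reset();
        }
        return leftover;
    }

    bool hasPool() const { return static_cast<bool>(pool_); }

    ToyString emptyNs;
    ToyString xmlSchemaNs;

private:
    std::unique_ptr<ToyPool> pool_;
};
```

In this sketch, calling init() and then shutdown() on a ToyStore leaves
each member as the sole owner of its string, so the member destructors
that run later (when the ToyStore itself goes away) never touch anything
the pool has already torn down.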

Unfortunately, that does not appear to be the only bug.  Valgrind
shows at least two more bugs, both also tied into SimpleStore and
Store somehow, but I'm out of time to look at them.

Off topic: the check for unicode/coll.h (ZORBA_HAVE_COLL_H) is failing
spuriously because CHECK_INCLUDE_FILES is used where
CHECK_INCLUDE_FILE_CXX should be used (coll.h is a C++-only header, so
it has to be checked with the C++ compiler).  One fix is to do this in
%prep:

# Fix detection of unicode/coll.h
sed -i 's,\(CHECK_INCLUDE_FILE\)S\( ("unicode/coll.h"\),\1_CXX\2,' CMakeLists.txt

Good luck and regards,
-- 
Jerry James
http://www.jamezone.org/

