Shared libraries are heavily used throughout Linux distributions. Unfortunately, some
library functions have undefined behavior for certain inputs. A dependence on that
undefined behavior is not reported immediately; instead, the application may later fail in
odd and seemingly random ways. One particular example of this problem is the memcpy
function, which has undefined behavior when the source and destination regions overlap.
This resulted in the following bug being filed about "Strange sound on mp3 flash
website":
http://koji.fedoraproject.org/koji/taskinfo?taskID=2898613
The diagnosis of this problem was not straightforward because memcpy silently corrupted
the data in the copy. There are many other examples of this type of memcpy problem in
Bugzilla.
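
To make the overlap condition concrete, here is a minimal sketch in C. The helper names
regions_overlap and shift_up are hypothetical and exist only for illustration. The sketch
shows one way to test whether two n-byte regions share any bytes, and uses memmove, which
is defined for overlapping regions, where memcpy would not be.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero when [dst, dst+n) and [src, src+n) share bytes. */
static int regions_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    /* The regions overlap when the start addresses are less than n bytes apart. */
    return (d > s) ? (d - s < n) : (s - d < n);
}

/* Hypothetical helper: copy the first n bytes of buf to buf + shift. */
static void shift_up(char *buf, size_t n, size_t shift)
{
    /* memcpy(buf + shift, buf, n) would be undefined whenever shift < n,
     * so memmove is used here instead. */
    memmove(buf + shift, buf, n);
}
```

With buf holding "abcdefgh", shift_up(buf, 6, 2) leaves "ababcdef" in the buffer. The
same call written with memcpy might appear to work on one libc build and corrupt data on
another.
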
What would be desirable is catching the dependency on undefined behavior when it occurs.
The LD_PRELOAD environment variable allows wrappers for shared library functions to be
inserted. These wrappers can do additional checks and flag those issues when they occur.
The mutrace package in Fedora is one example of this approach. It makes use of this
mechanism to instrument the mutex operations and can trigger a gdb breakpoint when a
problem mutex operation occurs.
I have taken the code in the mutrace package and made memstomp, which looks for memcpy
calls on overlapping regions.
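
The wrapper approach described above can be sketched roughly as follows. This is not the
memstomp source, only an illustration under stated assumptions: the helper ranges_overlap
and the counter overlap_reports are hypothetical names, the lazy lookup is not
thread-safe, and the sketch assumes dlsym does not itself call memcpy before the lookup
completes. It relies on dlsym with RTLD_NEXT (a GNU extension) to find the real memcpy.

```c
#define _GNU_SOURCE            /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Hypothetical counter of overlapping calls seen so far. */
static unsigned long overlap_reports;

/* Hypothetical helper: nonzero when the two n-byte regions share bytes. */
static int ranges_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    return (d > s) ? (d - s < n) : (s - d < n);
}

/* Interposed memcpy: report overlapping arguments, then forward the call. */
void *memcpy(void *dst, const void *src, size_t n)
{
    static memcpy_fn real_memcpy;

    if (!real_memcpy) {
        /* Look up the next definition of memcpy after this one. */
        real_memcpy = (memcpy_fn)dlsym(RTLD_NEXT, "memcpy");
        if (!real_memcpy)
            abort();
    }

    if (ranges_overlap(dst, src, n)) {
        static const char msg[] = "memcpy: source and destination overlap\n";
        /* write() is used rather than stdio to keep the wrapper simple. */
        ssize_t r = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)r;
        overlap_reports++;
    }

    /* The call is forwarded unchanged, so the program behaves as before. */
    return real_memcpy(dst, src, n);
}
```

Assuming the file is saved as overlapcheck.c (a hypothetical name), it could be built
with "gcc -shared -fPIC -o liboverlapcheck.so overlapcheck.c -ldl" and tried by running a
program with LD_PRELOAD=./liboverlapcheck.so set in its environment. A real tool would
also want a backtrace or a debugger trap at the point of the report.
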
git repo at:
http://fedorapeople.org/gitweb?p=wcohen/public_git/memstomp;a=summary
A Fedora scratch package RPM is at:
http://koji.fedoraproject.org/koji/taskinfo?taskID=2898613
Valgrind does check the arguments for memcpy (and performs many other memory-related
checks). The main advantage of specialized wrappers like memstomp is lower overhead. Most
people are not willing to pay for the overhead that valgrind introduces (4x-100x
slowdowns). The overhead for the memstomp wrappers should be low enough that it would be
feasible to set LD_PRELOAD for Fedora alpha releases. This would make problems that
depend on undefined behavior obvious when they occur, rather than requiring a large amount
of time to replicate and then diagnose them.
-Will