Using LD_PRELOAD wrappers to identify problem use of shared library functions

John Reiser jreiser at bitwagon.com
Mon Mar 21 18:28:21 UTC 2011


On 03/10/2011, Jakub Jelinek wrote:

> And, it would be nice when you have such a library not to check just
> memcpy, there are plenty of other commonly used calls which could
> be warned about.
> 
> memcpy, strcpy, strncpy, strcat, strncat, strtok, strtok_r, mempcpy, strsep,
> stpcpy, stpncpy, memccpy

The vast majority of the benefit of checking for overlap comes from checking
only memcpy, strcat, and strcpy.  The other mem* and str* routines are used
much less often.  [And detecting overlap in routines such as strncpy is not
trivial.  I have been working on W.Cohen's memstomp.]
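
As a rough illustration of the overlap test itself (a sketch only, not the
code in memstomp), the predicates below assume a flat address space in which
pointers can be compared as integers.  All function names are made up for
this example.  The strncpy case shows one plausible reason it is harder than
memcpy: the destination range is always n bytes, but the source range depends
on where the terminating NUL falls.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if [a, a+alen) and [b, b+blen) share at least one byte.
 * Assumes a flat address space (pointer-to-integer conversion is
 * implementation-defined in ISO C). */
int ranges_overlap(const void *a, size_t alen, const void *b, size_t blen)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;

    if (alen == 0 || blen == 0)
        return 0;
    return pa < pb ? (pb - pa < alen) : (pa - pb < blen);
}

/* memcpy reads n bytes and writes n bytes. */
int memcpy_args_overlap(const void *dst, const void *src, size_t n)
{
    return ranges_overlap(dst, n, src, n);
}

/* strcpy reads and writes strlen(src) + 1 bytes. */
int strcpy_args_overlap(const char *dst, const char *src)
{
    size_t n = strlen(src) + 1;

    return ranges_overlap(dst, n, src, n);
}

/* strncpy always writes n bytes, but reads only up to and including
 * the first NUL within the first n bytes of src. */
int strncpy_args_overlap(const char *dst, const char *src, size_t n)
{
    const char *nul = memchr(src, '\0', n);
    size_t nread = nul ? (size_t)(nul - src) + 1 : n;

    return ranges_overlap(dst, n, src, nread);
}
```

With a buffer holding "abcdef", the call strncpy(buf + 7, buf, 8) touches
disjoint ranges under this model, although a memcpy-style test with the same
arguments would flag an overlap.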

> just to name a few from <string.h>, then for -D_FORTIFY_SOURCE also
> __memcpy_chk, __mempcpy_chk, __strcpy_chk, __stpcpy_chk, __strncpy_chk,
> __stpncpy_chk, __strcat_chk, __strncat_chk.

Those *_chk routines already perform length checks and have a mechanism for
handling bad usage.  Any implementation of *_chk which ignores the possibility
of overlap (does not detect violations of 'restrict') arguably has a bug
in Usability, if not in [the spirit of] Functionality.
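
To show how little an overlap test adds to an existing length check, here is
a sketch of a checked copy that takes the same four arguments as glibc's
__memcpy_chk (dest, src, len, destlen).  The name sketch_memcpy_chk and the
enum are hypothetical, and this version returns a status code where the real
routine would call its failure handler.  It again assumes a flat address
space.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for this sketch only. */
enum copy_violation { COPY_OK = 0, COPY_TOO_LONG, COPY_OVERLAP };

/* Length check plus overlap check, then the actual copy.
 * Not glibc's implementation; an illustration of the idea. */
enum copy_violation sketch_memcpy_chk(void *dst, const void *src,
                                      size_t len, size_t dstlen)
{
    uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;

    if (len > dstlen)
        return COPY_TOO_LONG;           /* the existing *_chk test */
    if (len != 0 && (d < s ? s - d : d - s) < len)
        return COPY_OVERLAP;            /* the added 'restrict' test */
    memcpy(dst, src, len);
    return COPY_OK;
}
```

The added test is one subtraction and one comparison per call.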

> In wchar.h e.g.
> wcscpy, wcsncpy, wcscat, wcsncat, wcstok, wmemcpy, wmempcpy
> and maybe
> mbrtowc, wcrtomb, mbrlen, mbsrtowcs, wcsrtombs, mbsnrtowcs,
> wcsnrtombs, wcstol, wcstoul, wcstoll, wcstoull, ...
> 
> Maybe also sprintf/snprintf if format string contains some %s/%ls/%S
> specifiers and those arguments overlap the target.
> 
> Basically, most of the __restrict/restrict qualified prototypes
> in glibc headers would be good candidates for overlap tests (if possible
> to determine length).

For 90% or more of the 'restrict' declarations, it is not worth developing
a separate library with checking code.  Those checks should already be in
the everyday library, and always active.  For instance, any syscall
wrapper (such as select()) certainly should enforce 'restrict'; the few
cycles needed to check for overlap are insignificant compared to the cost
of a syscall.  Failing to detect overlap among multiple outputs (select()
is an example again) produces *wrong answers*; this certainly is a bug to
be fixed.  For most routines with 'restrict' arguments, the frequency of use
is so small that the total time cost of checking for overlap (for all
such routines *combined*) probably is less than a few seconds per day.
(An example is wcstouq.)
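
For the select() case, a check in the wrapper could look roughly like the
sketch below.  The name checked_select and the choice of EINVAL are
assumptions made for this example, not anything glibc does.  Each fd_set is
both an input and an output, so if two of them share storage the results
written for one set can clobber those of another.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>

/* Nonzero if two non-NULL fd_set objects share any storage.
 * Assumes a flat address space. */
static int sets_overlap(const fd_set *a, const fd_set *b)
{
    uintptr_t pa, pb;

    if (a == NULL || b == NULL)
        return 0;
    pa = (uintptr_t)a;
    pb = (uintptr_t)b;
    return (pa < pb ? pb - pa : pa - pb) < sizeof(fd_set);
}

/* Hypothetical wrapper: reject overlapping sets before the syscall. */
int checked_select(int nfds, fd_set *rd, fd_set *wr, fd_set *ex,
                   struct timeval *timeout)
{
    if (sets_overlap(rd, wr) || sets_overlap(rd, ex) ||
        sets_overlap(wr, ex)) {
        errno = EINVAL;     /* error code chosen for this sketch */
        return -1;
    }
    return select(nfds, rd, wr, ex, timeout);
}
```

The three comparisons cost a handful of cycles, against a syscall that costs
far more.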

Using a separate library to detect overlap suffers from bad psychology.
The recent episode of memcpy in libflashplayer might account for 1/3 or more
of all the important cases in this year 2011.  A project which uses a
checking library likely will find a few overlaps immediately, but after
that only a couple per year.  Repeated reports of "No bad usage detected"
are a strong encouragement to discontinue "extra" work, especially
anything that requires additional setup or administration.
There is no substitute for detecting overlap all the time and every time.
Such checking likely is less expensive than the cost of *ONE* bug per year,
and therefore the everyday library should do it.

More information about the devel mailing list