On 11/29/2011 03:04 PM, Andrew Haley wrote:
On 11/29/2011 02:46 PM, Gordan Bobic wrote:
> On 11/29/2011 02:42 PM, Andrew Haley wrote:
>> On 11/29/2011 02:01 PM, Gordan Bobic wrote:
>>> One other thing - one of the manifestations of this bug appears to be
>>> random memory corruption (strange, I know - unless I am dealing with two
>>> totally unrelated problems). Specifically, I have seen the bug manifest
>>> during compile jobs where, for example, linking would segfault, and
>>> re-making would segfault again. But doing:
>>> echo 3 > /proc/sys/vm/drop_caches
>>> would fix the problem.
>>> My first suspicion was duff hardware/RAM on my AC100. So I got another
>>> one, and it behaves in the exact same way.
>> The most likely explanation is that you've got a data race somewhere.
>> SMP ARM, unlike x86, has a weakly-ordered memory model. Unless
>> everyone is extremely careful, problems like the one you're describing
>> are very likely.
> Indeed, I was thinking about some kind of a concurrency issue, too, but
> the question is how to fix it. Assuming for a moment that it is not a
> kernel issue (other people are running the same kernel with Ubuntu
> without this problem), are we talking about glibc? Or are you saying
> that _any_ package could be responsible for such a thing?
It's possible that glibc is the problem, but not very likely. User
programs can't cause this problem unless they're multi-threaded. As
far as I know the linker isn't multi-threaded, though.
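
To make the weak-ordering point concrete, here is a minimal sketch of the
kind of user-space handoff that is only safe with explicit ordering. It is
not taken from glibc, the linker, or any package in this thread; the names
payload, ready, producer and run_handoff are hypothetical, and it assumes a
C11 compiler with <stdatomic.h> and POSIX threads (build with -pthread).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, written before the flag is set */
static atomic_int ready;   /* flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the write to payload may not be reordered after it. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Runs one handoff and returns the value the consuming thread observed. */
int run_handoff(void)
{
    pthread_t t;
    int rc, seen;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);
    (void)rc;

    /* Acquire load: pairs with the release store in producer(). */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;  /* spin until the flag is published */

    seen = payload;  /* guaranteed to see the producer's write */
    pthread_join(t, NULL);
    return seen;
}
```

If ready were a plain int, or both accesses used memory_order_relaxed, an
SMP ARM core would be allowed to make the flag visible before the payload,
so the reader could see ready == 1 with stale data. x86 hardware does not
reorder those two stores, which is why code with this bug can appear to
work there and fail only on ARM.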

So could a broken program cause something else to crash, or is the
danger limited to the program that is running? The reason I ask is
because I have also seen some weirdness where, for example, gcc would
get stuck in an infinite loop during compiling, typically bloating until
the OOM killer terminates it. It doesn't happen often, but I have seen
it more than once.