On 04Mar2015 18:27, Rick Stevens <ricks@alldigital.com> wrote:
On 03/04/2015 05:01 PM, Tom Horsley wrote:
Here's a weird one: A system at work has (God knows why) a gazillion symlinks directly under / pointing to NFS mountpoints for filesystems (some of which might well have high latency).
If I run "df -l", using the -l option in the apparently vain hope that it might not time out forever on some NFS mount, it hangs for a long time.
"Walks"? This is df, not du. Consult mount table, do fstats.
[...]
Since strace requires it to report what it's doing, it's getting interrupted a lot, so it doesn't hang. I mean, it's still waiting on the I/O to complete from NFS, but rather than waiting an indeterminate time, it's getting interrupts (signals) from strace rather than hanging on the one from the NFS system.
I'm fairly certain that strace does not work that way. The traced process is not doing work for the tracer.
Just a wild guess.
I think so too.
Tom:
- _after_ a fast straced df, is an unstraced df slow again? (thinking about cached answers to calls, caches in the OS, possibly quite briefly)
- see if the result of strace's -T option is informative.
- since df's output is line buffered on a terminal, the presentation of the lines should tell you where it is hanging.
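If the -T output is long, something like the following might help pick out the slow calls. It is only a sketch: it assumes the trace was saved with "strace -T -o df.trace df -l", that each completed call ends with its elapsed time in angle brackets (which is what -T appends), and the function name slowest_calls is my own invention.

```python
import re

# With -T, strace appends the time spent in each call as "<seconds>"
# at the end of the line. Lines without that suffix are ignored.
TIMED_CALL = re.compile(r"^(\w+)\(.*\s<(\d+\.\d+)>\s*$")

def slowest_calls(lines, top=5):
    """Return up to `top` (seconds, syscall, line) tuples, slowest first."""
    timed = []
    for line in lines:
        line = line.strip()
        m = TIMED_CALL.match(line)
        if m:
            timed.append((float(m.group(2)), m.group(1), line))
    timed.sort(key=lambda entry: entry[0], reverse=True)
    return timed[:top]
```

Usage would be along the lines of slowest_calls(open("df.trace")), then printing the result. A statfs() at the top of the list with a time in whole seconds would name the slow mount.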
df makes pleasingly few system calls on a handy RHEL5 host. It opens /etc/mtab and essentially just calls statfs() on each name. This implies that determining localness is done based entirely on the contents of /etc/mtab.
Also, this shows that there are no other system calls between the write() reporting the prior filesystem and the statfs() inquiring about the next, so watching an unstraced on-a-terminal df should pinpoint where it stalls.
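To take df out of the picture entirely, the same sequence can be imitated by hand. This is a rough sketch, not what df itself does: it assumes the mount table is in the usual fstab-like format, it uses Python's os.statvfs() as a stand-in for the statfs() calls seen in the trace, and the function names are made up for the example.

```python
import os
import re
import time

def mount_points(table="/etc/mtab"):
    """Yield (mount_point, fs_type) pairs from an fstab-format mount table."""
    with open(table) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 3:
                # Mount tables encode spaces and similar as octal escapes, e.g. \040.
                path = re.sub(r"\\([0-7]{3})",
                              lambda m: chr(int(m.group(1), 8)), fields[1])
                yield path, fields[2]

def time_statvfs(path):
    """Return the seconds one statvfs() on path took; this may block on a dead NFS server."""
    start = time.monotonic()
    try:
        os.statvfs(path)
    except OSError:
        pass  # an error is still an answer; only the elapsed time matters here
    return time.monotonic() - start

def report(table="/etc/mtab"):
    """Print each mount point before querying it, then the time the query took."""
    for path, fs_type in mount_points(table):
        print(f"{path} ({fs_type}) ...", end=" ", flush=True)
        print(f"{time_statvfs(path):.3f}s")
```

Calling report() prints and flushes each name before the call is made, so if it hangs, the last name on the screen is the one being waited on.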
Is it similar on your Fedora box?
Cheers, Cameron Simpson <cs@zip.com.au>
Against stupidity....the Gods themselves contend in vain!