Please stop apps going into state D uninterrupted sleep !!

Patrick O'Callaghan pocallaghan at gmail.com
Tue May 8 20:32:35 UTC 2012


On Tue, 2012-05-08 at 13:16 -0700, Joe Zeff wrote:
> On 05/08/2012 12:39 PM, Patrick O'Callaghan wrote:
> > On Tue, 2012-05-08 at 19:42 +0100, Andrew Gray wrote:
> >> >  Hi
> >> >
> >> >  Either give us a way to kill a hung cp or rsync when the VPN goes down
> >> >  and they end up in state D uninterrupted sleep or stop apps being able
> >> >  to go into uninterrupted sleep !!
> > It is *not possible* to kill a process in D state. D state can be
> > defined as "the state which cannot be interrupted".
> 
> I think it's fairly clear that Mr. O'Callaghan knows that.

I think you mean Mr. Gray.

> He's complaining about the consequences of there being an uninterruptible
> sleep.  If I read him right, he's saying that it should always be 
> possible for the user to force a hung app to die when it's clear to the 
> user that something has happened that makes it impossible for the app to 
> continue, such as rsync completing when the remote server's known to 
> have crashed.  At this point, probably the best way to proceed is to 
> request that whoever maintains the programs in question modify them so 
> that they don't enter this state when accessing a remote file system or 
> that there's some way to get the app's attention and force it to abort.
> On the surface, at least, the request sounds reasonable, although I'll
> be the first to admit that things like this are often much more 
> difficult than they sound.

As I tried to explain, rewriting a couple of apps is not going to hack
it. The apps don't *know* they're using a networked filesystem, they're
just accessing files. They could find out and try to take measures, but
then what about all the other apps that also write files? Rewrite tar,
cpio, dd, cat, ...?
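
To illustrate what "finding out" might look like, here is a minimal
sketch, assuming Linux and the /proc/mounts table. The function names
and the list of network filesystem types are hypothetical choices made
for this example, not anything cp or rsync actually does. It picks the
longest mount point that contains the path and checks that mount's type.

```python
import os

# Assumed, non-exhaustive set of type names treated as "networked".
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smbfs", "fuse.sshfs"}


def fs_type_for_path(path, mounts_text):
    """Return the fs type of the longest mount point containing path."""
    # abspath is purely lexical, so it cannot hang on a dead mount.
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        # /proc/mounts escapes spaces in mount points as \040.
        mount_point = fields[1].replace("\\040", " ")
        fs_type = fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same point win.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_network_fs(path, mounts_text=None):
    """Guess whether path lives on a networked filesystem."""
    if mounts_text is None:
        # Linux-specific; other systems need a different source.
        with open("/proc/mounts") as f:
            mounts_text = f.read()
    return fs_type_for_path(path, mounts_text) in NETWORK_FS_TYPES
```

Note that this only tells the program where the file lives. It does
nothing to get a process out of D state once a read or write on that
mount has blocked, and every other file-handling tool would need the
same check.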

The price of treating a networked fs as equivalent to a local one is
that you get screwed when it doesn't behave like a local one. Dealing
with this in a coherent and consistent way is hard. See the literature
on distributed filesystems. The semantics of an NFS filesystem are *not*
the same as those of a local one. We brush this under the carpet most of
the time because it usually works, but sometimes the differences bite.

poc


