[OT] Hardlinks and directories
Marko Vojinovic
vvmarko at gmail.com
Sat Feb 13 04:12:03 UTC 2010
On Saturday 13 February 2010 00:09:38 Suvayu Ali wrote:
> On Friday 12 February 2010 06:06 AM, Marko Vojinovic wrote:
> > Now, hard links are not allowed for directories since they would allow
> > for creation of loops (a directory containing itself), which is a Bad
> > Idea, since it breaks recursion. The filesystem needs to be a *tree* if
> > recursion is to function, so loops are forbidden (how would you delete a
> > directory which contains itself?).
>
> I don't quite follow you here, don't you mean hardlinking directories
> have the risk of introducing recursion?
No, I am saying that hard linking directories allows the creation of loop
structures. These structures break any recursive algorithm that tries to go
"downwards" through a directory tree.
Deleting directories is a textbook example. In order to delete a directory,
you first have to delete all files and subdirectories it contains, and once it
is empty, delete the directory itself. So deletion proceeds via a recursive
algorithm (a toy Python sketch follows the list):
1) check whether there are files or dirs in target dir
2) delete any files present
3) execute yourself (from step 1) for every subdir as target dir
4) target dir is now empty, unlink it
5) done
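To make this concrete, here is a minimal sketch of those steps in Python (a
toy illustration only, not a replacement for rm -r; the name delete_tree is
made up):

    import os

    def delete_tree(target):
        # step 1: look at everything inside the target dir
        for entry in os.listdir(target):
            path = os.path.join(target, entry)
            if os.path.isdir(path) and not os.path.islink(path):
                delete_tree(path)    # step 3: recurse, subdir is new target
            else:
                os.unlink(path)      # step 2: delete files (and soft links)
        os.rmdir(target)             # step 4: target is now empty, unlink it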
Now consider a directory structure where inside dir a/ you have dir b/, and
inside that a dir c/ which is a hard link back to a/ (this is the simplest
loop situation; much more complicated ones are possible). IOW, the directory
a/ contains itself as a subdirectory two levels down. Now execute the above
algorithm to delete a/ --- the algorithm will try to delete all
subdirectories, namely b/, and execute itself over b/ in step 3. But to
delete b/, it needs to execute itself over c/, which is actually a/. But to
delete a/ it needs to execute itself over b/...
As you can see, the algorithm starts to traverse a loop, and goes into an
infinite recursion --- executes more and more instances of itself without
control, while never reaching step 4 in any of the instances. This is a Very
Bad Thing, as you can imagine.
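You can watch this happen without any real hard links, by modelling the
directories as a plain Python dict (a toy model, names made up):

    # Each name maps to the subdirectories it "contains". b contains c,
    # but c is a hard link to a, so in effect b contains a again.
    fs = {"a": ["b"], "b": ["a"]}

    def naive_delete(d):
        for sub in fs[d]:        # step 3: recurse into every subdir
            naive_delete(sub)
        print("unlinking", d)    # step 4: never reached for a or b

    naive_delete("a")            # blows the stack with RecursionError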
Now, you might think that it is possible to make a more clever deletion
algorithm which would remember where it has already been and deal with this
situation. Such an algorithm can in fact be written, and it copes even with an
arbitrarily complicated graph where two or three loops are interconnected
(think lots of chain-links randomly connected to each other in the shape of
your favorite Moore sculpture :-)... ) --- it just has to keep a record of
every directory it has already visited. But that is not free: the walker needs
memory proportional to the number of directories it has seen, and every single
program that recurses through the tree (rm, find, tar, fsck, your backup
script...) would have to carry the same machinery. Worse, a loop that gets
disconnected from the rest of the tree keeps every directory in it at a
nonzero link count, so its disk space could never be reclaimed.
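For the record, here is what such a loop-aware walker could look like on the
same toy model (a real one would track (st_dev, st_ino) pairs from os.stat
instead of names):

    fs = {"a": ["b"], "b": ["a"]}

    def careful_delete(d, seen=None):
        if seen is None:
            seen = set()
        if d in seen:            # been here before: a loop, stop following
            return
        seen.add(d)
        for sub in fs[d]:
            careful_delete(sub, seen)
        print("unlinking", d)

    careful_delete("a")          # prints "unlinking b", then "unlinking a"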
So the only way out is to forbid hard links to directories in general (with
the tightly controlled exception of . and ..).
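And the kernel does enforce the ban: on Linux, link(2) on a directory fails
with EPERM no matter who you are, which is easy to check from Python
(directory names here are made up):

    import os

    os.mkdir("somedir")
    try:
        os.link("somedir", "somedir-link")   # hard link to a directory
    except OSError as e:
        print(e)    # on Linux: [Errno 1] Operation not permitted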
Finally, note that soft links do not suffer from this problem. A soft link to
a directory is not regarded as a subdirectory, but rather as a file, and gets
deleted in step 2 without being followed (no going "inside"). A soft link is
not a directory itself, so it has no contents that would require recursive
deletion. Creating loops with soft links is therefore completely safe for any
algorithm that can choose not to follow a soft link. :-)
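You can try this out, too: the loop below is the a/b/c situation from before,
but built with a soft link, and Python's shutil.rmtree (which, like rm -r,
unlinks soft links instead of following them) removes it without any drama
(paths made up):

    import os, shutil

    os.makedirs("a/b")
    os.symlink("..", "a/b/c")   # c -> a: a loop, but only a soft link

    shutil.rmtree("a")          # fine: c is unlinked as a file in step 2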
Best, :-)
Marko