Performance and the hard disk size

Dave Ihnat dihnat at dminet.com
Sun Jun 23 13:24:11 UTC 2013


Ok, folks, I just want to inject a bit of reality here.  I started
working on Unix internals in 1980, and have worked ever since on just
about every OS that has come my way--almost all variants of Unix, Linux,
Windows, and a bunch of others that are irrelevant to this conversation.
Why do I say this?  To point out that I work in a heterogeneous
environment, and have for a very long time.

In the following, I respond both to Reindl and to the poster he's
responding to (since I didn't save the original post); note the double
carets.

Once, long ago--actually, on Sat, Jun 22, 2013 at 07:27:11AM CDT--Reindl Harald (h.reindl at thelounge.net) said:
> ...
> Am 22.06.2013 14:18, schrieb Tim:
> > Have you seen Windows complain at boot up that you hadn't shut down
> > properly, and it needs to check the drive? ...

We've all seen Unix/Linux have the same complaint, and force a fsck.
ALL operating systems occasionally have cause to believe their
filesystem(s) need checking.  And ALL operating systems occasionally
crash, or have filesystem issues.

Also, since the introduction of NTFS, the Windows filesystem has been
at least as stable in operation as most *nix variants.

> > ...
> > Then you have the fun of waiting for it to scan through one hell
> > of a huge drive. ...

Never sat through a fsck of a really big *nix drive or array, have you?
It's all a matter of the level of the check, allocation unit size,
number of large files, and volume of data.
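
For the curious, the geometry that drives those numbers is easy to
inspect.  Here's a minimal sketch--assuming any POSIX system; the "/"
path is just an example--that prints the allocation unit size and the
block and inode counts a full check has to walk:

    /* Print filesystem geometry relevant to fsck time: the
     * allocation unit size and the number of blocks and inodes.
     * Minimal sketch using the standard statvfs(3) call. */
    #include <stdio.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        struct statvfs vfs;

        if (statvfs("/", &vfs) != 0) {
            perror("statvfs");
            return 1;
        }

        printf("allocation unit: %lu bytes\n", (unsigned long)vfs.f_frsize);
        printf("total blocks:    %llu\n", (unsigned long long)vfs.f_blocks);
        printf("total inodes:    %llu\n", (unsigned long long)vfs.f_files);
        return 0;
    }

Multiply those counts out on a multi-terabyte array and you see why a
full check is an overnight affair on any OS.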

> > More so if your computer likes to regularly screw up.
> 
> which does typically not happen

True.  Quite seriously, Windows filesystems since NTFS are nowhere near as
fragile as they used to be.  Stability improved significantly after XP SP3,
as well, and especially under Windows Server.

> > Then there's drive fragmentation.  Windows still seems to be horrid for
> > that.  I'd hate to have to wait for a 2 TB drive to defrag.  Even if I
> > wasn't watching the box, waiting for it to finish, because I wanted to
> > use it, but left it overnight - it'd be at it all night
> 
> which has nothing to do with *a disk* larger than 1 TB
> it's more depending on the partitions you create

Even more dependent on the allocation unit size selected at filesystem
creation.

> in context of Linux it does not matter at all

Beg to differ.  Do you all know how a file is structured in *nix?
There is a primary inode.  In it are direct block pointers--the
number varies with the OS, filesystem type, etc.--but classically
there are 12 pointers to direct blocks, 1 to a single-indirect block,
1 to a double-indirect block, and 1 to a triple-indirect block.

What does this mean?  Well, essentially, *nix tries to optimize for
small files--that is, files that fit in those twelve direct blocks.
How big a block is depends on the allocation unit you picked when
formatting the filesystem (as with NTFS).  But once a file grows past
that size, things get less efficient.  Grow past the storage that the
direct block pointers can address and you have two lookups to carry
out--one for the indirect block, then one for the data blocks it
points to.  Double indirects guarantee three lookups; triples, four.
And every one of those allocation units can be scattered anywhere on
the disk--meaning that, after a while, yes, *nix fragments, too.
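
To put rough numbers on it, here's a little sketch--assuming the
classic ext2-style layout with 4 KiB blocks and 4-byte block pointers;
real filesystems vary--of how much data each level can address:

    /* Capacity reachable at each level of a classic ext2-style
     * inode.  Block size and pointer width are assumptions for
     * illustration; adjust for your filesystem. */
    #include <stdio.h>

    int main(void)
    {
        const unsigned long long block = 4096;        /* allocation unit    */
        const unsigned long long ptrs  = block / 4;   /* pointers per block */

        unsigned long long direct = 12 * block;
        unsigned long long single = ptrs * block;
        unsigned long long dbl    = ptrs * ptrs * block;
        unsigned long long triple = ptrs * ptrs * ptrs * block;

        printf("direct (12 blocks):  %llu KiB\n", direct >> 10);
        printf("single indirect:    +%llu MiB\n", single >> 20);
        printf("double indirect:    +%llu GiB\n", dbl    >> 30);
        printf("triple indirect:    +%llu TiB\n", triple >> 40);
        return 0;
    }

That's 48 KiB for free via the direct pointers, then another 4 MiB,
4 GiB, and 4 TiB at the cost of one, two, and three extra lookups
respectively--which is exactly why small files are cheap and huge
files aren't.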

NTFS has a similarly complicated, but very different, system for directory
and file management.  But it, too, ends up having mechanisms for dealing
with larger files, and it, too, has to deal with fragmentation.  And
the effects of fragmentation have been much reduced in NTFS relative
to the earlier FAT filesystems.

Both *nix and Windows play games with the disk drivers (f'rinstance, look
up the elevator algorithm, and scatter-gather), caching, etc. to minimize
the effect of fragmentation (as do, in fact, all operating systems).
Disks have done their part to obfuscate the issue, since the allocation
unit you think you're reading is certainly remapped internally by the
disk firmware to different physical block(s).
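
If the elevator algorithm is new to you, the idea fits in a few lines.
A toy sketch--my own illustration, not any real kernel's scheduler--
that sorts pending block requests and services them in one sweep up
from the head position, then one sweep back down:

    /* Toy SCAN ("elevator") pass over pending block requests. */
    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b)
    {
        unsigned long x = *(const unsigned long *)a;
        unsigned long y = *(const unsigned long *)b;
        return (x > y) - (x < y);
    }

    int main(void)
    {
        unsigned long head  = 500;   /* current head position (example) */
        unsigned long req[] = { 90, 700, 120, 510, 950, 30, 600 };
        size_t n = sizeof req / sizeof req[0];
        size_t i;

        qsort(req, n, sizeof req[0], cmp);

        /* Sweep up from the head position... */
        for (i = 0; i < n; i++)
            if (req[i] >= head)
                printf("service block %lu\n", req[i]);

        /* ...then sweep back down for the rest. */
        for (i = n; i-- > 0; )
            if (req[i] < head)
                printf("service block %lu\n", req[i]);

        return 0;
    }

The payoff is fewer long seeks: the head never ping-pongs across the
platter just because requests arrived out of order.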

Both *nix and Windows play other games with their internal data
structures to mitigate filesystem corruption and hardware failure--log
files (journals), mirrored MFTs, backup superblocks, and the like--and
these have gotten both more complicated and more sophisticated over
time.
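
The heart of those log schemes is ordering: record the intent durably
*before* touching the real structures, so a crash leaves you with
something recovery can reason about.  A toy sketch of that write-ahead
discipline--file names and record format made up for illustration,
nothing like any real journal:

    /* Toy write-ahead ordering: journal the intent, force it to
     * stable storage, apply the change, then journal the commit. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void journal(const char *entry)
    {
        FILE *j = fopen("journal.log", "a");
        if (!j) { perror("journal"); exit(1); }
        fprintf(j, "%s\n", entry);
        fflush(j);
        fsync(fileno(j));   /* intent must be durable before the change */
        fclose(j);
    }

    int main(void)
    {
        journal("BEGIN  update data.db");
        /* ... apply the actual metadata/data change here ... */
        journal("COMMIT update data.db");
        return 0;
    }

Crash after BEGIN but before COMMIT, and recovery knows the on-disk
structures are suspect; crash after COMMIT, and it knows they're safe.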

Essentially, all filesystems fragment.  All filesystems and operating
systems have mechanisms to minimize the effects of fragmentation.
And all have gotten very, very much better at it over time.

We do the cause of promoting Linux (I've given up on Unix _per se_)
no good if we carelessly repeat canards that are no longer applicable,
or at best are much less applicable, when discussing the differences
between operating systems.

Enough pre-coffee pontificating.  I just hit a tipping point, and had
to point out that before we post something we "all know"--"Windows
filesystems are fragile", "Linux doesn't fragment", etc.--we should
think twice.

Cheers,
--
	Dave Ihnat
	dihnat at dminet.com

