SATA II causes system freeze

David A. De Graaf dad at datix.us
Tue Nov 18 19:59:22 UTC 2014


On Thu, Nov 13, 2014 at 06:12:36PM -0500, Bill Davidsen wrote:
> Bill Davidsen wrote:
> >David A. De Graaf wrote:
> >>On Fri, Oct 03, 2014 at 02:19:55PM -0400, David A. De Graaf wrote:
> >>>On Fri, Oct 03, 2014 at 04:01:30AM +0930, Tim wrote:
> >>>>Allegedly, on or about 02 October 2014, Chris Murphy sent:
> >>>>>Cables are often the source of weird problems. Specifically it's the
> >>>>>connectors that are flakey, not the cable portion itself.
> >>>>
> >>>>Though, if you savagely bend SATA leads, the way some of them are
> >>>>supplied in a flattened up zig-zag style, with a rubber band around
> >>>>them, you can mess up the data transmission.
> >>>>
> >>>
> >>>Some quick feedback:  It's now apparent that the cables or SATA
> >>>sockets have nothing to do with my problem.  The finger of guilt
> >>>now seems to point to the RAM sticks.  However, experiments are
> >>>slow.  More later.
> >>
> >>After weeks of experimentation it's clear that my machine crashes have
> >>nothing to do with the SATA connections or the harddrives.
> >>They are caused by a too-small swap space!
> >>
> >>Zero is OK; large is OK; but small is NG.
> >>
> >>For reasons I can't recall, the system is set up with only a 2 GB swap
> >>partition, and for a long while it had a single 4 GB RAM memory stick.
> >>This was OK.
> >>
> >>Then I added a second 4 GB memory stick, identical to the first.
> >>With 8 GB RAM and 2 GB swap the system crashed - froze - after a
> >>random few hours.
> >>
> >>This was maddening.  Not knowing the real cause, I bought a different
> >>motherboard, changed power supply, tried different SATA and ATA
> >>harddrive connections, changed the SATA cable, removed the extra data
> >>drive, removed the ATA CD drive, used one or the other RAM stick,
> >>disconnected everything and ran with only a Live F20 Xfce USB stick.
> >>I ran memtest86 for days without error.  The only thing that worked
> >>was to revert to only a single memory stick - 4 GB.  Either stick
> >>was OK.
> >>
> >>I put everything back together, using an ATA/SATA converter for the
> >>350 GB primary disk, the SATA 1TB data harddrive, and the ATA CD.
> >>
> >>Then I noticed the size of the swap partition was 2 GB and, having
> >>nothing else to try, added an 8 GB swap file.
> >>
> >>Eureka!  It ran.
> >>
> >>I have a matrix of test cases which I won't bore you with.
> >>They can be summarized as follows:
> >>1 - with 4 GB RAM, either 0 or 2 GB swap space is OK.
> >>2 - with 8 GB RAM, 0 swap space is OK.
> >>3 - with 8 GB RAM, 2 GB swap space will reliably freeze the system
> >>4 - with 8 GB RAM, 4 GB swap file is OK.
> >>5 - with 8 GB RAM, 2 GB swap partition + 8 GB swap file is OK,
> >>       even if the priority of the smaller one is forced higher.
> >>
> >>At no time during these experiments was swap space actually used
> >>according to the gkrellm display;  the RAM usage remained well
> >>below what was available.
> >>
> >>This is clearly a bug.  No rational design would work like this.
> >>Is it a kernel bug?  Some other component?
> >>Which one gets the Bugzilla?
> >>
> >
> >It's probably too late to check now, but did you try taking the 2GB swap offline
> >and running mkswap on it to check for a glitch somewhere? Yes, I know that's
> >nominally a "can't happen" thing, but having had success with that, I mention
> >it. My sample size (one) is pretty small.
> >
> And to reply to my own suggestion, my notes on that also say you may want to
> change to deadline scheduler on the swap device.
> 

Thanks, Bill Davidsen.
I am now using *only* an 8 GB swap file on the mostly unused 1 TB SATA
disk.  Yesterday the machine froze again.
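
To confirm which swap areas are actually active (rather than relying on
the gkrellm display), the kernel's /proc/swaps table can be totalled.
This is only a minimal sketch: swap_total_kib is a helper name made up
here, and the sample line with /data/swapfile is hypothetical, not
copied from this machine.

```shell
# Sum the Size column (in KiB) of text in /proc/swaps format,
# skipping the one-line header.
swap_total_kib() {
    awk 'NR > 1 { total += $3 } END { print total + 0 }'
}

# Hypothetical sample in /proc/swaps layout: one 8 GB swap file.
printf '%s\n' \
    'Filename Type Size Used Priority' \
    '/data/swapfile file 8388604 0 -1' | swap_total_kib

# On a live system:
#   swap_total_kib < /proc/swaps
```

With no active swap at all the function prints 0, since only the header
line is present.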

Just for grins I will try your suggestion to use the deadline
scheduler.  Googling shows that the way to do that is to 
  echo SCHEDULER > /sys/block/DEVICE/queue/scheduler
where SCHEDULER is one of cfq, noop, or deadline and DEVICE the block
device (sda for example). 

[root@datwiz /sys/block/sdb/queue]
# cat scheduler
noop deadline [cfq]
[root@datwiz /sys/block/sdb/queue]
# echo deadline > scheduler
[root@datwiz /sys/block/sdb/queue]
# cat scheduler
noop [deadline] cfq

As you can see, the original scheduler was 'cfq'; it is now 'deadline'.
Evidently, the scheduler setting applies to the entire /dev/sdb device,
not just to the swap file or the partition that holds it.
That should be OK.
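
For anyone who wants to check the active scheduler from a script, here
is a minimal sketch.  It assumes the file format seen in the session
above, where the active scheduler is the one in square brackets;
active_scheduler is a helper name made up for this example.

```shell
# Print the active I/O scheduler from text in the format of
# /sys/block/DEVICE/queue/scheduler (active entry is in [brackets]).
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# The two states from the session above; no root or real device needed.
echo 'noop deadline [cfq]' | active_scheduler
echo 'noop [deadline] cfq' | active_scheduler

# On a live system (sdb is the device used in this message):
#   active_scheduler < /sys/block/sdb/queue/scheduler
```

Note that a value written to the sysfs file this way does not survive a
reboot, so the echo has to be repeated (or automated) after each boot.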

If there's an improvement, I'll report here.
Thanks, again.

-- 
        David A. De Graaf    DATIX, Inc.    Hendersonville, NC
        dad at datix.us         www.datix.us


"Physics is like sex...it may give some practical results, but that's
not why we do it."                                 -- Richard Feynman

