Hi Gene,
On Mon, 02 Apr 2007 23:01:02 -0400
Gene Heskett <gene.heskett(a)verizon.net> wrote:
On Monday 02 April 2007, Andre Costa wrote:
>Hi,
>
>I've been experiencing weird reboots lately -- not the "bad-RAM"
>kinda reboots, they seem to be software-related, because they seem
>to happen always at the same time.
>
>It usually happens right after boot, when ntpd is synchronizing. This
>is what appears on /var/log/messages:
>
>Apr 2 22:21:43 localhost ntpd[2300]: synchronized to LOCAL(0), stratum 10
>Apr 2 22:21:43 localhost ntpd[2300]: kernel time sync enabled 0001
>Apr 2 22:22:49 localhost ntpd[2300]: synchronized to 200.218.160.160, stratum 2
>
>>>>>> here the system rebooted
>
>Apr 2 22:23:45 localhost syslogd 1.4.1: restart.
>Apr 2 22:23:45 localhost kernel: klogd 1.4.1, log source = /proc/kmsg started.
>
>After it reboots, it "survives" -- and even synchronizes again:
>
>Apr 2 22:27:09 localhost ntpd[2319]: synchronized to LOCAL(0), stratum 10
>Apr 2 22:27:09 localhost ntpd[2319]: kernel time sync enabled 0001
>Apr 2 22:29:16 localhost ntpd[2319]: synchronized to 193.6.222.47, stratum 2
>Apr 2 22:43:16 localhost ntpd[2319]: time reset -2.639872 s
>Apr 2 22:47:17 localhost ntpd[2319]: synchronized to LOCAL(0), stratum 10
>Apr 2 22:48:22 localhost ntpd[2319]: synchronized to 193.6.222.47, stratum 2
>
>Anyone ever seen something similar? Is it really possible that time
>syncs could cause reboots? Any other log file I could check for
>additional clues?
I believe that big crash corrections backwards can cause this. And I
know that the Fedoras all save the time in the mobo's hardware clock
at shutdown, so the clock should be reasonably close when it's used
to set the system time at the next boot. However, if the CMOS battery
is getting on in years and lying down on the job of keeping the
hardware clock somewhere near coherent while powered off, the wrong
time might be recovered at bootup and not corrected until ntpd
starts. That startup actually does a crash correction using ntpdate
before handing the keep-it-correct chores off to ntpd, which in turn
fine-tunes the length of the second to keep the system clock within a
few milliseconds of the network time servers.
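One way to see how far the hardware clock has wandered is to compare it with the system clock right after boot, before ntpd has had a chance to correct anything. The following is only a minimal sketch: it assumes a Linux kernel that exposes the RTC at /sys/class/rtc/rtc0/since_epoch and a hardware clock kept in UTC (if it is kept in local time, the result will be off by the timezone offset). The function names and the 2-second tolerance are made up for illustration.

```python
import time
from pathlib import Path

# Assumption: the first RTC device is the motherboard clock and the
# kernel exposes it here as whole seconds since the Unix epoch.
RTC_EPOCH_PATH = Path("/sys/class/rtc/rtc0/since_epoch")


def rtc_offset_seconds(rtc_epoch, system_epoch):
    """Return RTC minus system time; positive means the RTC is ahead."""
    return rtc_epoch - system_epoch


def describe_offset(offset, tolerance=2.0):
    """Turn an offset in seconds into a short human-readable verdict.

    The tolerance is an arbitrary illustrative value; the RTC only has
    one-second resolution, so tiny offsets are not meaningful.
    """
    if abs(offset) <= tolerance:
        return "hardware clock agrees with system clock"
    direction = "ahead of" if offset > 0 else "behind"
    return f"hardware clock is {abs(offset):.0f} s {direction} system clock"


def read_rtc_epoch(path=RTC_EPOCH_PATH):
    """Read the RTC value from sysfs (raises OSError if the path is missing)."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    offset = rtc_offset_seconds(read_rtc_epoch(), time.time())
    print(describe_offset(offset))
```

Once ntpd has been running for a while the kernel may already have written the corrected time back to the hardware clock, so a check like this says most when run early in the boot.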
Mmmh... the weary CMOS battery theory indeed makes sense, it's about
time I replaced that. But what's new to me is that ntpdate needs to
reboot the machine in order to correct large time drifts.
The location of the reboot in your logs would be the #1 clue for
this theory to me. The fact that it usually keeps running after one
reboot, because the hardware clock hasn't had time to go doofy now
that it's running on electric power, is another clue.
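To check whether the reboots really do cluster right after a network time sync, the log can be scanned mechanically. This is a rough sketch, not a definitive tool: it assumes the /var/log/messages line layout shown above, treats a "syslogd ... restart." line as the sign of a reboot (it can also appear when syslogd is restarted for other reasons), and guesses the year because syslog timestamps do not carry one. The function names and the 5-minute window are illustrative choices.

```python
import re
from datetime import datetime, timedelta

# Month, day, HH:MM:SS, hostname, then the rest of the message.
LINE_RE = re.compile(r"^(\w{3})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2}) \S+ (.*)$")


def parse(line, year=2007):
    """Split a syslog line into (timestamp, message), or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    month, day, clock, msg = m.groups()
    ts = datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return ts, msg


def reboots_after_sync(lines, window=timedelta(minutes=5)):
    """Return (sync_time, restart_time) pairs where a syslogd restart
    follows a sync to a network server within the given window."""
    last_sync = None
    hits = []
    for line in lines:
        parsed = parse(line)
        if parsed is None:
            continue
        ts, msg = parsed
        # Ignore syncs to the local fallback clock, LOCAL(0).
        if msg.startswith("ntpd[") and "synchronized to" in msg and "LOCAL(" not in msg:
            last_sync = ts
        elif msg.startswith("syslogd") and msg.endswith("restart."):
            if last_sync is not None and timedelta(0) <= ts - last_sync <= window:
                hits.append((last_sync, ts))
            last_sync = None
    return hits


SAMPLE = [
    "Apr 2 22:21:43 localhost ntpd[2300]: synchronized to LOCAL(0), stratum 10",
    "Apr 2 22:22:49 localhost ntpd[2300]: synchronized to 200.218.160.160, stratum 2",
    "Apr 2 22:23:45 localhost syslogd 1.4.1: restart.",
]

if __name__ == "__main__":
    for sync, restart in reboots_after_sync(SAMPLE):
        print(f"restart {(restart - sync).total_seconds():.0f} s after sync at {sync}")
```

Run against the three sample lines taken from the log above, it reports a single restart 56 seconds after the sync to 200.218.160.160. Keep in mind that the restart line is stamped with whatever the clock said after the reboot, so the gap is only as trustworthy as the clock itself.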
Right, so far I agree it makes perfect sense. Also, IIRC it tends to
happen more frequently after I spend a couple of days away from the
computer (I guess the clock drifts more and more the longer it is
counting solely on the CMOS battery). E.g., today the "auto-reboot" did not
take place (I used the computer yesterday).
So I'd check the CMOS battery on the motherboard with a digital meter
as step one, after it's been off overnight. Over 3 volts would be
considered decent for a wee bit yet; below 2.7 or so would be grounds
to replace it soonest. ISTR most of them are around 3.3 to 3.6 volts
new, but read the voltage stamped on the cell to be sure. Less than
say 85% of that rated voltage would be grounds to write the cell's
type number down and get one the next time you are in town.
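The 85% rule of thumb above is simple arithmetic, sketched here for convenience. The function names are made up for illustration, and the 3.0 V rating in the example is only an assumed value; use whatever is stamped on the actual cell.

```python
def replacement_threshold(rated_v, fraction=0.85):
    """Voltage below which the cell should be replaced (85% rule of thumb)."""
    return rated_v * fraction


def cell_verdict(measured_v, rated_v, fraction=0.85):
    """Return 'replace' if the measured voltage is under the threshold, else 'ok'."""
    return "replace" if measured_v < replacement_threshold(rated_v, fraction) else "ok"


if __name__ == "__main__":
    rated = 3.0  # assumed rating for the example
    print(f"threshold: {replacement_threshold(rated):.2f} V")
    print(cell_verdict(2.5, rated))
    print(cell_verdict(2.9, rated))
```

For a cell rated at 3.0 V the threshold works out to 2.55 V, so a reading of 2.5 V means "replace" and 2.9 V means "ok".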
Nah, I will replace it right away, it is about time =)
Thanks a lot for sharing your thoughts. Even though it is still
unconfirmed, your theory makes perfect sense so far (I would never
have thought of that =)). And, if it proves to be right, it was a hardware
problem after all ;-)
Regards,
Andre
--
Andre Oliveira da Costa