Hi Pasi,
Had a hiccup overnight:
The host became unresponsive in a weird way. The time stopped incrementing.
Turns out the clock stopped ticking (which I put down to the interrupts being
disconnected).
Anyway I decided I'd reset the time using 'time -s 10:41:30'.
Kaboom, or actually deathly silence. The machine fully stopped dead in its
tracks.
Just prior to this I connected to the console of one of the 64PV machines
which was just running a ping from yesterday. Anyway, 60,000 or so lines of
pings went to the console zipping up the screen. Then it was dead. I did a
CTRL-C and eventually it returned to the prompt.
So I looked at the other 64PV machine, which was also pinging, and it was in
an identical situation.
So I reckon there's some kind of buffer overflow going on when you're not
connected via "xm console MACHINE". Once you pass 60,000 lines of text this
buffer overflow causes the RTC to hang somehow.
Do you have xenconsoled running?
I've noticed PV guests that print a lot to the console will stall if xenconsoled is
not running..
xenconsoled needs to clear the guest console buffer..
-- Pasi
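
To illustrate the stall described above, here is a minimal toy sketch. It assumes the guest console behaves like a fixed-size buffer that only the console daemon empties. It is not Xen code: the names ConsoleRing and guest_print, the capacity of 8 bytes and the retry limit are all made up for illustration.

```python
from collections import deque


class ConsoleRing:
    """Toy bounded console buffer (hypothetical; capacity is illustrative)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.buf = deque()

    def write(self, data):
        """Guest side: accept only as many bytes as fit, return that count."""
        free = self.capacity - len(self.buf)
        accepted = data[:free]
        self.buf.extend(accepted)
        return len(accepted)

    def drain(self):
        """Daemon side: empty the buffer and return what was in it."""
        out = bytes(self.buf)
        self.buf.clear()
        return out


def guest_print(ring, data, daemon_running, max_attempts=100):
    """Try to push data through the ring; return how many bytes got out.

    max_attempts stands in for "the guest keeps retrying"; a real guest
    would simply keep waiting.
    """
    sent = 0
    attempts = 0
    while sent < len(data) and attempts < max_attempts:
        sent += ring.write(data[sent:])
        if daemon_running:
            ring.drain()  # stand-in for the daemon clearing the buffer
        attempts += 1
    return sent


if __name__ == "__main__":
    payload = b"x" * 20
    print(guest_print(ConsoleRing(8), payload, daemon_running=False))
    print(guest_print(ConsoleRing(8), payload, daemon_running=True))
```

With nothing draining the buffer, the writer fills it once and then makes no further progress. With the drain step in place, the whole payload goes through.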
I pressed the reset button, but this time the 2 64PV machines are not
logged in. I'll just let it go and see if it keeps going.
Cheers
V
On Tue, 22 Jun 2010 04:29:06 pm Pasi Kärkkäinen wrote:
> On Tue, Jun 22, 2010 at 12:03:53PM +1000, Virgil wrote:
> > Hi Pasi,
> >
> > On Mon, 21 Jun 2010 08:57:55 pm Pasi Kärkkäinen wrote:
> > > On Mon, Jun 21, 2010 at 01:56:36PM +0300, Pasi Kärkkäinen wrote:
> > > > On Mon, Jun 21, 2010 at 02:28:15PM +1000, Virgil wrote:
> > > > > Another quick update....
> > > > >
> > > > > xen-4.0.1-0.1.rc3.fc13.src.rpm just compiled this under fc12.
> > > > >
> > > > > > Identical results with this too (i.e. it's probably in the
> > > > > > kernel).
> > > > >
> > > > > I have a (silly) idea for the serial console. The wiki page
> > > > > recommends using a phone camera to capture the screen....
> > > > >
> > > > > > Well my idea is to add an n-millisecond delay every time the
> > > > > > output stream in Xen sees a \n. This would delay the screen
> > > > > > updates enough for the camera to see them. The n should be
> > > > > > configurable on the kernel boot command line. It's set to 0
> > > > > > right now.
> > > >
> > > > Yeah, we really need to get a log somehow to troubleshoot your
> > > > problem.
> > > >
> > > > Serial console log would be the best:
> > > >
> > > > > http://wiki.xensource.com/xenwiki/XenSerialConsole
> > >
> > > Btw are you running the latest kernel:
> > >
> > > > http://koji.fedoraproject.org/koji/taskinfo?taskID=2254110
> > >
> > > Or are you running custom/self compiled kernel?
> >
> > Everything is working with:
> > xen-4.0.1-0.1.rc3 compiled from source on fc12 machine and
> > 2.6.32.14-1.2.107.xendom0.fc12.x86_64 from the myoung repo.
> >
> > All fixed.
>
> Good to hear it works!
>
> > We also now have a "null modem" cable to another old computer with a
> > COM port. Turns out I was the only old man that could remember what a
> > null modem cable is. The young guy said "wtf"? Also turns out I'm the
> > only one who knows what minicom is and what 8N1 means :-)
>
> Hehe.. yeah I guess young people don't get to play with serial consoles
> nowadays, until they're doing networking stuff..
>
> So I guess most SOL devices in servers go unused.. :)
>
> -- Pasi
>
> > All VMs are now running concurrently.
> >
> > Very happy again. Thanks.
> > V
> >
> > > -- Pasi
> > >
> > > > > Cheers
> > > > > V
> > > > >
> > > > > On Mon, 21 Jun 2010 12:10:17 pm Virgil wrote:
> > > > > > Just a quick update:
> > > > > >
> > > > > > > Just tried xen-4.0.0-2. Recompile from source on
> > > > > > > fc12.x86_64.
> > > > > >
> > > > > > identical behaviour.
> > > > > >
> > > > > > Cheers
> > > > > > V
> > > > > >
> > > > > > On Fri, 18 Jun 2010 03:17:19 pm Virgil wrote:
> > > > > > > On Sat, 29 May 2010 11:26:50 pm M A Young wrote:
> > > > > > > > > If anyone wants to test xen 3.4.3, I have put up a source RPM
> > > > > > > > > at
> > > > > > > > > http://myoung.fedorapeople.org/dom0/src/xen-3.4.3-0.91.fc13.src.rpm
> > > > > > > >
> > > > > > > > Michael Young
> > > > > > > >
> > > > > > > > --
> > > > > > > > xen mailing list
> > > > > > > > xen(a)lists.fedoraproject.org
> > > > > > > >
> > > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > > >
> > > > > > > Hi list,
> > > > > > >
> > > > > > > > Host crashing on 64FC12 kernel -105 dom0 when 2 PV64 machines
> > > > > > > > are run.
> > > > > > >
> > > > > > > > I can run HV32WinXP and HV32FC12 and 1 PV64FC12 all at the same
> > > > > > > > time.
> > > > > > >
> > > > > > > > However, when any combination involves 2 PV64FC12 (kernel
> > > > > > > > version doesn't matter) the host crashes.
> > > > > > >
> > > > > > > > Running on the -97 dom0 everything works in all combos.
> > > > > > >
> > > > > > > Using Xen 3.4.3.
> > > > > > >
> > > > > > > > Turning off the virt network cards in the PV64FC12 machines
> > > > > > > > makes things go (obviously not much use though).
> > > > > > >
> > > > > > > Tried disabling IPV6, firewall stuff etc. etc.
> > > > > > >
> > > > > > > > Sometimes it would fire up and go but whichever machine is
> > > > > > > > started second gets really long ping times like it's not
> > > > > > > > receiving unless it sends something (if that makes sense).
> > > > > > > > Sooner or later the host crashes.
> > > > > > >
> > > > > > > > Strangely a PV64FC12 and a PV64FC10 machine coexist happily.
> > > > > > > > It's only when a second PV64FC12 machine starts up.
> > > > > > >
> > > > > > > V
> > > > > > > --
> > > > > > > xen mailing list
> > > > > > > xen(a)lists.fedoraproject.org
> > > > > > >
https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > >
> > > > > > --
> > > > > > xen mailing list
> > > > > > xen(a)lists.fedoraproject.org
> > > > > >
https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > >
> > > > > --
> > > > > xen mailing list
> > > > > xen(a)lists.fedoraproject.org
> > > > >
https://admin.fedoraproject.org/mailman/listinfo/xen
> > > >
> > > > --
> > > > xen mailing list
> > > > xen(a)lists.fedoraproject.org
> > > >
https://admin.fedoraproject.org/mailman/listinfo/xen