----- Original Message -----
From: "Kamil Paral" <kparal(a)redhat.com>
To: "AutoQA development" <autoqa-devel(a)lists.fedorahosted.org>
Sent: Monday, November 28, 2011 11:34:52 PM
Subject: Re: stage: hanging jobs
> > On Mon, 28 Nov 2011 09:20:05 -0500 (EST)
> > Kamil Paral <kparal(a)redhat.com> wrote:
> >
> > > I noticed today that we have about 50 queued x86_64/noarch depchecks
> > > on our staging server. The problem was in this job:
> > >
> > > http://autoqa-stg.fedoraproject.org/results/2335-autotest/virt27.qa/debug/
> > >
> > > It was hanging for two days in the "running" state. This seems like
> > > an autotest problem: it should detect when the connection times out
> > > and abort the job automatically. (But now I'm not sure whether we
> > > made some adjustments to autotest's watchdog timer; we need to
> > > investigate.)
> > >
> > > I aborted the hanging job and all but the last two depchecks.
> > >
> > > If this happens again we should investigate and report to autotest
> > > devs.
> >
> > Have you seen the same thing in production? We upgraded to the newer
> > version of autotest last week.
I checked production and ... surprise surprise ... there are 4 jobs
hanging from Nov 26! It seems that Nov 26 was a bad day.
http://autoqa.fedoraproject.org/results/238579-autotest/qa02.qa.fedorapro...
http://autoqa.fedoraproject.org/results/238665-autotest/qa07.qa.fedorapro...
http://autoqa.fedoraproject.org/results/238685-autotest/qa03.qa.fedorapro...
http://autoqa.fedoraproject.org/results/238799-autotest/qa01.qa.fedorapro...
I'll abort them all. We will need to report a bug in autotest. The
reproducer will be tricky, however.
By the way, I have a question: on my local machine, all the jobs are queued
but never start running. What configuration is needed to make them run? Thanks.
Hongqing