last 541 kernel damages filesystem .. starting fresh FC3T2 candidate.

Janina Sajka janina at rednote.net
Wed Sep 8 00:11:56 UTC 2004


Stephen C. Tweedie writes:
> Hi,
> 
> On Sun, 2004-09-05 at 20:25, Janina Sajka wrote:
> 
> > I can get a kernel oops from intense disk writes/reads. On the other
> > hand, heavy CPU usage seems fine. So, I think there's something funky
> > about disk i/o as well.
> 
> Do you have a record of the oops, by any chance?

Well, not exactly. But what I do have may prove useful.
Just in the last hour I've had a scary experience.

Situation: I noticed several dozen lynx cache files sitting in my $HOME.
Since I wasn't running lynx, I figured it was OK to rm L* -- bad choice.
The system froze.
On the subsequent reboot it froze again during the file system check.

So, I tried going single user -- never got a shell prompt. It froze
during the /home check.

So, I booted with init=/bin/sh. This gave me something like:

Switching to console tty0 ... No such file or directory
Kernel panic, unable to sync, trying to kill init

Got this from both 541 UP and 541 SMP ...

So, I went back to 521.

The kernel oopsed during my first run of e2fsck on /home. On the second
run it almost finished. It found errors, including an illegal inode
in the orphan list. Just as it was writing the results, I got a GPF that
I was able to capture, namely:



/home: ***** FILE SYSTEM WAS MODIFIED *****
/home: 473361/122077184 files (3.2% non-contiguous), 43865614/244129756 blocks

general protection fault: 0000 [1] SMP
CPU 0
Modules linked in: pcspkr ext3 jbd 3w_9xxx aic79xx sd_mod scsi_mod
Pid: 270, comm: e2fsck Not tainted 2.6.8-1.521smp
RIP: 0010:[<ffffffff80159233>] <ffffffff80159233>{find_get_pages+106}
RSP: 0018:000001007f785d98  EFLAGS: 00010087
RAX: 000000000000000f RBX: 000001003ffae8a0 RCX: 000000000000000f
RDX: 0080000000000000 RSI: 0000000000000010 RDI: 0000000000000032
RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000256
R10: 00000000031df7b3 R11: 0000000000000000 R12: 00000000031de326
R13: 000001007f785df0 R14: ffffffffffffffff R15: 000001003ffae8a0
FS:  0000002a9557bd60(0000) GS:ffffffff804e3080(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a972ae000 CR3: 0000000000101000 CR4: 00000000000006e0
Process e2fsck (pid: 270, threadinfo 000001007f784000, task 000001003f6d44b0)
Stack: 0000010040b35ed0 0000010040b35ed0 000001007f785de8 000001007f785de8
       00000000031de326 0000000000000010 0000000000001fb0 ffffffff80164105
       0000010040b35ed0 ffffffff801646cf
Call Trace:<ffffffff80164105>{pagevec_lookup+23} <ffffffff801646cf>{invalidate_mapping_pages+203}
       <ffffffff8017f046>{invalidate_bh_lru+45} <ffffffff801839fb>{kill_bdev+14}
       <ffffffff80184c03>{blkdev_put+157} <ffffffff8017ce34>{__fput+77}
       <ffffffff8017b7d1>{filp_close+105} <ffffffff8017b8af>{sys_close+215}
       <ffffffff801115b6>{system_call+126}

Code: 8b 02 48 c1 e8 13 a8 01 74 04 48 8b 52 10 f0 ff 42 04 ff c1
RIP <ffffffff80159233>{find_get_pages+106} RSP <000001007f785d98>
 Segmentation fault
sh-2.05b#
Looking through my logs, I do find one Oops in there. It seems to be
triggered by DNS lookups somehow ...


Sep  5 17:49:11 concerto named[2192]: lame server resolving '30.177.132.209.bl.blueshore.net' (in 'blueshore.NET'?): 63.166.78.19#53
Sep  5 17:49:25 concerto named[2192]: lame server resolving 'ns1.ly.com' (in 'ly.com'?): 63.166.78.19#53
Sep  5 17:49:25 concerto named[2192]: lame server resolving 'ns2.ly.com' (in 'ly.com'?): 63.166.78.19#53
Sep  5 13:55:20 concerto kernel: __journal_remove_journal_head: freeing b_committed_data
Sep  5 13:55:20 concerto kernel: Unable to handle kernel paging request at 0000000000486680 RIP: 
Sep  5 13:55:20 concerto kernel: <ffffffff801632c8>{kfree+75}
Sep  5 13:55:20 concerto kernel: PML4 3a6d9067 PGD 0 
Sep  5 13:55:20 concerto kernel: Oops: 0000 [1] SMP 
Sep  5 13:55:20 concerto kernel: CPU 0 
Sep  5 13:55:20 concerto kernel: Modules linked in: ipt_state ipt_MASQUERADE iptable_nat ip_conntrack nfsd exportfs lockd md5 ipv6 parport_pc lp parport autofs4 rfcomm l2cap bluetooth ds yenta_socket pcmcia_core sunrpc iptable_filter ip_tables dm_mod snd_usb_audio snd_pcm snd_timer snd_page_alloc snd_usb_lib snd_rawmidi snd_seq_device snd soundcore ohci_hcd e100 mii tg3 floppy sg speakup_ltlk speakupmain pcspkr ext3 jbd 3w_9xxx aic79xx sd_mod scsi_mod
Sep  5 13:55:20 concerto kernel: Pid: 1264, comm: kjournald Not tainted 2.6.8-1.541.rootsmp
Sep  5 13:55:20 concerto kernel: RIP: 0010:[<ffffffff801632c8>] <ffffffff801632c8>{kfree+75}
Sep  5 13:55:20 concerto kernel: RSP: 0018:000001003d917aa8  EFLAGS: 00010016
Sep  5 13:55:20 concerto kernel: RAX: 000000007fff0000 RBX: 0000010006676450 RCX: 0000000000000018
Sep  5 13:55:20 concerto kernel: RDX: ffffffff80415168 RSI: 0000000000000001 RDI: 0080000000000000
Sep  5 13:55:20 concerto kernel: RBP: 0080000000000000 R08: ffffffff80415168 R09: 0000000100000000
Sep  5 13:55:20 concerto kernel: R10: ffffff0000086118 R11: 0000010001e426d0 R12: 000001007fe06600
Sep  5 13:55:20 concerto kernel: R13: 000001007ca0aa80 R14: 0000000000000000 R15: 0000000000000008
Sep  5 13:55:21 concerto kernel: FS:  0000002a9557b4c0(0000) GS:ffffffff804efb00(0000) knlGS:0000000000000000
Sep  5 13:55:21 concerto kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep  5 13:55:21 concerto kernel: CR2: 0000000000486680 CR3: 0000000000101000 CR4: 00000000000006e0
Sep  5 13:55:21 concerto kernel: Process kjournald (pid: 1264, threadinfo 000001003d916000, task 000001003cfc6130)
Sep  5 13:55:21 concerto kernel: Stack: ffffffff80415168 0000000000000206 0000010006676450 0000010006a504a8 
Sep  5 13:55:21 concerto kernel:        000001007fe06600 ffffffffa006bb44 0000000000000000 0000010006a504a8 
Sep  5 13:55:21 concerto kernel:        0000010006676450 ffffffffa006bba0 
Sep  5 13:55:21 concerto kernel: Call Trace:<ffffffffa006bb44>{:jbd:__journal_remove_journal_head+309} 
Sep  5 13:55:21 concerto kernel:        <ffffffffa006bba0>{:jbd:journal_remove_journal_head+48} 
Sep  5 13:55:21 concerto kernel:        <ffffffffa0065e46>{:jbd:journal_commit_transaction+2246} 
Sep  5 13:55:21 concerto kernel:        <ffffffffa0069337>{:jbd:kjournald+333} <ffffffff801369e2>{autoremove_wake_function+0} 
Sep  5 13:55:21 concerto kernel:        <ffffffff801369e2>{autoremove_wake_function+0} <ffffffffa00691e4>{:jbd:commit_timeout+0} 
Sep  5 13:55:21 concerto kernel:        <ffffffff801121b3>{child_rip+8} <ffffffffa00691ea>{:jbd:kjournald+0} 
Sep  5 13:55:21 concerto kernel:        <ffffffff801121ab>{child_rip+0} 
Sep  5 13:55:21 concerto kernel: 
Sep  5 13:55:21 concerto kernel: Code: 48 0f b6 80 80 66 49 80 48 8b 0c c5 80 67 49 80 48 b8 ff ff 
Sep  5 13:55:21 concerto kernel: RIP <ffffffff801632c8>{kfree+75} RSP <000001003d917aa8>
Sep  5 13:55:21 concerto kernel: CR2: 0000000000486680

> 
> Thanks,
>  Stephen
> 

-- 
	
				Janina Sajka, Chair
				Accessibility Workgroup
				Free Standards Group (FSG)

janina at freestandards.org	Phone: +1 202.494.7040





