Per request, here is the Feb 1 message reporting the problem in the 2895 kernel. I have tried two or three different manufacturers' cards for eth0 with the same result.
I do not recall ever seeing this failure with 2869 and older kernels.
This bug has been present for the last three months. I wonder if it has propagated to the commercial Linux products.
Chuck Forsberg WA7KGX N2469R wrote:
Per request, here is the Feb 1 message reporting the problem in the 2895 kernel. I have tried two or three different manufacturers' cards for eth0 with the same result.
I do not recall ever seeing this failure with 2869 and older kernels.
This bug has been present for the last three months. I wonder if it has propagated to the commercial Linux products.
This is BZ #231687.
I asked about this on linux-kernel and netdev last month and nobody replied.
On Tue, 2007-05-01 at 14:23 -0400, Chuck Ebbert wrote:
Chuck Forsberg WA7KGX N2469R wrote:
Per request, here is the Feb 1 message reporting the problem in the 2895 kernel. I have tried two or three different manufacturers' cards for eth0 with the same result.
I do not recall ever seeing this failure with 2869 and older kernels.
This bug has been present for the last three months. I wonder if it has propagated to the commercial Linux products.
This is BZ #231687.
That bug is against FC6. Have you tried the kernel mentioned in that bug? (2.6.20-1.2944.fc6)
If this is also present in F7, you should probably file another bug, this one against Fedora Core/devel/kernel.
Be sure to include lspci output and the contents of /proc/interrupts.
-w
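For anyone collecting the /proc/interrupts contents requested above, here is a minimal Python sketch that lists IRQ lines shared by more than one device. It assumes the 2.6-era layout (a CPU header row, then per-CPU counts, a single-token controller name, and a comma-separated device list); the function name `shared_irqs` and the sample text are hypothetical, not taken from the reporter's machine.

```python
def shared_irqs(text):
    """Return {irq_number: [device, ...]} for IRQs claimed by 2+ devices.

    Assumes the controller name is a single token (e.g. IO-APIC-fasteoi),
    as in 2.6-era kernels; newer layouts may need a different split.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, rest = line.split(":", 1)
        label = label.strip()
        if not label.isdigit():
            continue  # skip summary rows such as NMI, LOC, ERR
        fields = rest.split()
        tail = fields[ncpus:]  # controller name followed by device names
        if len(tail) < 2:
            continue
        devices = [d.strip() for d in " ".join(tail[1:]).split(",") if d.strip()]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Hypothetical sample, for illustration only.
SAMPLE = """\
           CPU0
  0:     123456    IO-APIC-edge  timer
 17:      98765   IO-APIC-fasteoi   eth0, ohci_hcd:usb1
 18:       4321   IO-APIC-fasteoi   eth1
NMI:          0
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, ", ".join(devices))
```

On a live system the text could come from reading /proc/interrupts directly. If eth0 appears on a shared line, that is worth mentioning in the bug report.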
On Tue, 2007-05-01 at 11:10 -0700, Chuck Forsberg WA7KGX N2469R wrote:
Per request, here is the Feb 1 message reporting the problem in the 2895 kernel. I have tried two or three different manufacturers' cards for eth0 with the same result.
I do not recall ever seeing this failure with 2869 and older kernels.
This bug has been present for the last three months. I wonder if it has propagated to the commercial Linux products.
This still doesn't say what chipset the devices are. It's fairly hard to track problems when we don't know what driver they're in.
You should open a bug in bugzilla (Product "Fedora Core", version "FC6", component "kernel") and add an attachment containing the output of 'lspci'.
-w
Chuck Forsberg WA7KGX N2469R wrote:
I have tried two or three different manufacturers' cards for eth0 with the same result.
What type cards? You mention your motherboard type below (Asus A8N-E), but you indicate the onboard nic (eth1) isn't giving you problems.
Have you tried these PCI add-in nics in different slots?
Posting the output of dmesg and lspci would help identify which driver is in play.
I do not recall ever seeing this failure with 2869 and older kernels.
This bug has been present for the last three months. I wonder if it has propagated to the commercial Linux products.
Subject: Ethernet problem in 2895 kernel
From: Chuck Forsberg WA7KGX N2469R <caf@omen.com>
Date: Thu, 01 Feb 2007 10:11:21 -0800
To: Fedora-Test-List <fedora-test-list@redhat.com>
Since updating to the FC6 386 2895 kernel I have had a number of partial failures of the Ethernet system under heavy load. System: Asus a8n-e socket 939 with 3 GB RAM running 386 FC6. Eth0 is a PCI 10/100 NIC, eth1 is the onboard gigabit NIC. NAT controlled by rc.firewall 2.4. Eth0 connects to the cable modem, eth1 to the local net.
Under heavy load, the following error messages appear and then eth0 stops working, severing the internet connection. I have not seen this problem prior to the 2895 kernel.
Jan 31 22:01:19 omen kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 31 22:01:19 omen kernel: eth0: transmit timed out, tx_status 00 status 8601.
Jan 31 22:01:19 omen kernel: diagnostics: net 0cd8 media 8880 dma 0000003a fifo 0000
Jan 31 22:01:19 omen kernel: eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
Jan 31 22:01:19 omen kernel: Flags; bus-master 1, dirty 80(0) current 80(0)
Jan 31 22:01:19 omen kernel: Transmit list 00000000 vs. f714b200.
Jan 31 22:01:19 omen kernel: 0: @f714b200 length 80000042 status 00010042
Jan 31 22:01:19 omen kernel: 1: @f714b2a0 length 80000052 status 0c010052
Jan 31 22:01:19 omen kernel: 2: @f714b340 length 80000043 status 0c010043
Jan 31 22:01:19 omen kernel: 3: @f714b3e0 length 80000045 status 0c010045
Jan 31 22:01:19 omen kernel: 4: @f714b480 length 80000042 status 00010042
Jan 31 22:01:19 omen kernel: 5: @f714b520 length 8000004a status 0c01004a
Jan 31 22:01:19 omen kernel: 6: @f714b5c0 length 8000004c status 0c01004c
Jan 31 22:01:19 omen kernel: 7: @f714b660 length 8000004d status 0c01004d
Jan 31 22:01:19 omen kernel: 8: @f714b700 length 80000052 status 0c010052
Jan 31 22:01:19 omen kernel: 9: @f714b7a0 length 80000043 status 0c010043
Jan 31 22:01:19 omen kernel: 10: @f714b840 length 80000045 status 0c010045
Jan 31 22:01:19 omen kernel: 11: @f714b8e0 length 8000004a status 0c01004a
Jan 31 22:01:19 omen kernel: 12: @f714b980 length 8000004c status 0c01004c
Jan 31 22:01:19 omen kernel: 13: @f714ba20 length 8000004c status 0c01004c
Jan 31 22:01:19 omen kernel: 14: @f714bac0 length 80000052 status 8c010052
Jan 31 22:01:19 omen kernel: 15: @f714bb60 length 80000056 status 8c010056
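To gauge how often the failure recurs under load, the watchdog messages in the log above can be tallied per interface. The following is a rough Python sketch; the regular expression is based only on the message text quoted in this report, and the function name `count_tx_timeouts` is hypothetical.

```python
import re

# Matches the watchdog line as it appears in the log quoted in this thread.
WATCHDOG = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def count_tx_timeouts(lines):
    """Count NETDEV WATCHDOG transmit timeouts per interface name."""
    counts = {}
    for line in lines:
        match = WATCHDOG.search(line)
        if match:
            iface = match.group(1)
            counts[iface] = counts.get(iface, 0) + 1
    return counts


# Two lines copied from the report; only the first is a watchdog event.
SAMPLE_LINES = [
    "Jan 31 22:01:19 omen kernel: NETDEV WATCHDOG: eth0: transmit timed out",
    "Jan 31 22:01:19 omen kernel: eth0: transmit timed out, tx_status 00 status 8601.",
]

if __name__ == "__main__":
    print(count_tx_timeouts(SAMPLE_LINES))
```

Feeding it the lines of the system log (often /var/log/messages on Fedora, though that path is an assumption here) would show whether only eth0 is affected and how many times the watchdog fired.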
On Tuesday 01 May 2007 14:50, Jay Cliburn wrote:
What type cards? You mention your motherboard type below (Asus A8N-E), but you indicate the onboard nic (eth1) isn't giving you problems.
I see the same problem. This is the lspci output:
05:09.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express
I think Chuck traced the text that comes out to 3Com, but it's been a while. AFAICT, the problem started around the 2834 kernel/Feb 15.
-Steve
Steve Grubb wrote:
On Tuesday 01 May 2007 14:50, Jay Cliburn wrote:
What type cards? You mention your motherboard type below (Asus A8N-E), but you indicate the onboard nic (eth1) isn't giving you problems.
I see the same problem. This is the lspci output:
05:09.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
3f:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5752 Gigabit Ethernet PCI Express
I think Chuck traced the text that comes out to 3Com, but it's been a while. AFAICT, the problem started around the 2834 kernel/Feb 15.
-Steve
The only seemingly relevant change I see in the 3c59x driver in the February time frame is this one, although the change appears harmless enough: http://lkml.org/lkml/2007/2/1/183
commit 0d38ff1d3d34ca9ae2a61cf98cf47530f9d51dee
Author: Jiri Kosina <jkosina@suse.cz>
Date: Mon Feb 5 16:29:48 2007 -0800
NET-3c59x: turn local_save_flags() + local_irq_disable() into local_irq_save
drivers/net/3c59x.c::poll_vortex() contains local_irq_disable() after local_save_flags(). Turn it into local_irq_save().
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index 80bdcf8..716a472 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -792,8 +792,7 @@ static void poll_vortex(struct net_device *dev)
 {
 	struct vortex_private *vp = netdev_priv(dev);
 	unsigned long flags;
-	local_save_flags(flags);
-	local_irq_disable();
+	local_irq_save(flags);
 	(vp->full_bus_master_rx ? boomerang_interrupt:vortex_interrupt)(dev->irq
 	local_irq_restore(flags);
 }
You might try reverting it to see if it helps.
Jay