Nehalem network performance

Kelvin Ku kelvin at telemetry-investments.com
Wed Jan 27 15:49:53 UTC 2010


On Wed, Jan 27, 2010 at 09:01:53AM +0200, Gilboa Davara wrote:
> On Tue, 2010-01-26 at 19:07 -0500, Kelvin Ku wrote:
> > We recently purchased our first Nehalem-based system with a single Xeon E5530
> > CPU. We were unable to boot FC6 on it and are trying to upgrade our network to
> > F11/F12 anyway, so we installed F11 on it.
> > 
> > Our existing hardware includes Xeon 5100- and 5400-series CPUs running mainly
> > FC6 (2.6.22), except for a single Xeon 5150 system running F11. Our target
> > application consumes multicast data during business hours and has been dropping
> > packets more frequently on the new hardware/OS combination than on our older
> > systems. I've tried using the on-board Intel 82574L dual-port NIC (e1000e
> > driver) and a discrete Intel 82576 dual-port NIC (igb driver). Counters for the
> > NIC, socket layer, and switch don't show any dropped packets.
> > 
> > My question is this: has anyone experienced performance degradation running a
> > UDP-consuming application after moving to a Nehalem-based system? We have yet
> > to identify whether the culprit is the hardware, the OS, or the combination of
> > the two. However, note that our app works fine on the 5150 system running F11
> > that I mentioned above.
> > 
> > Likewise, if you've migrated such an app to a Nehalem system and had to make
> > adjustments to get it to work as before, I'd like to hear from you too.
> > 
> > Thanks,
> > Kelvin Ku
> 
> Please post the output of:
> $ cat /proc/interrupts | grep eth

We rename our interfaces to lan:

$ grep lan /proc/interrupts 
 61:          1          0          0          0   PCI-MSI-edge      lan0
 62:    7194004          0          0          0   PCI-MSI-edge      lan0-TxRx-0
 63:          0          1          0          0   PCI-MSI-edge      lan1
 64:          0          0   49842410          0   PCI-MSI-edge      lan1-TxRx-0

$ pgrep irqbalance
$ 

Note that irqbalance is disabled; I found that it wasn't balancing IRQs the way it
does on our older machines. The irqbalance documentation also says that NIC
interrupts should not be balanced, which is what we see whether or not irqbalance
is running.

> $ ethtool -S ethX

lan0 (LAN interface):

NIC statistics:
     rx_packets: 7429553
     tx_packets: 85327
     rx_bytes: 9752917197
     tx_bytes: 66766666
     rx_broadcast: 7386732
     tx_broadcast: 8610
     rx_multicast: 0
     tx_multicast: 42
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 6893
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 9752917197
     rx_csum_offload_good: 7429553
     rx_csum_offload_errors: 0
     tx_dma_out_of_sync: 0
     alloc_rx_buff_failed: 1487
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     tx_queue_0_packets: 85327
     tx_queue_0_bytes: 65978674
     rx_queue_0_packets: 7429553
     rx_queue_0_bytes: 9693480773

lan1 (the multicast interface) is below. Note that rx_missed_errors is non-zero. I
previously encountered this with the e1000e NIC after disabling cpuspeed, which
had been throttling the CPUs to 1.6 GHz (from a maximum of 2.4 GHz). I tried to
remedy it by setting InterruptThrottleRate=0,0 in the e1000e driver; after that
we had one full day of testing with zero rx_missed_errors, but the application
still reported packet loss.

Today is the first day of testing with the igb NIC since I disabled cpuspeed. The igb driver is also running with InterruptThrottleRate=0,0.
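
For reference, the override is just a module option; it's set from a modprobe.d
file along these lines (a sketch; the file name is arbitrary, the modules have to
be reloaded or the box rebooted for it to take effect, and it assumes a driver
build that accepts InterruptThrottleRate, as described above):

# /etc/modprobe.d/nic-throttle.conf
# disable interrupt throttling on both ports of each NIC
options e1000e InterruptThrottleRate=0,0
options igb InterruptThrottleRate=0,0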

NIC statistics:
     rx_packets: 54874782
     tx_packets: 161
     rx_bytes: 35581821239
     tx_bytes: 18479
     rx_broadcast: 10
     tx_broadcast: 25
     rx_multicast: 54874635
     tx_multicast: 16
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     multicast: 54874635
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_no_buffer_count: 1
     rx_missed_errors: 22192
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     tx_restart_queue: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 35581821239
     rx_csum_offload_good: 54874782
     rx_csum_offload_errors: 0
     tx_dma_out_of_sync: 0
     alloc_rx_buff_failed: 9598
     tx_smbus: 0
     rx_smbus: 0
     dropped_smbus: 0
     tx_queue_0_packets: 161
     tx_queue_0_bytes: 17013
     rx_queue_0_packets: 54874783
     rx_queue_0_bytes: 35362322772

> 
> Which board are you using?

Supermicro X8DTL-iF

> Have you enabled hyper-threading?

It is currently disabled.

> Have you disabled IO vt-d?

This is disabled by default in the BIOS. I'll double-check the setting later
today.

> In which slot did you install the igb card?

The slot is PCIe x16. The NIC itself is x4.

> Have you tried enabling pci=msi in your kernel's command line?

No. Do I need to do this? MSI seems to be enabled:

$ dmesg | grep -i msi
pcieport-driver 0000:00:01.0: irq 48 for MSI/MSI-X
pcieport-driver 0000:00:03.0: irq 49 for MSI/MSI-X
pcieport-driver 0000:00:07.0: irq 50 for MSI/MSI-X
pcieport-driver 0000:00:09.0: irq 51 for MSI/MSI-X
pcieport-driver 0000:00:1c.0: irq 52 for MSI/MSI-X
pcieport-driver 0000:00:1c.4: irq 53 for MSI/MSI-X
pcieport-driver 0000:00:1c.5: irq 54 for MSI/MSI-X
e1000e 0000:06:00.0: irq 55 for MSI/MSI-X
e1000e 0000:06:00.0: irq 56 for MSI/MSI-X
e1000e 0000:06:00.0: irq 57 for MSI/MSI-X
e1000e 0000:07:00.0: irq 58 for MSI/MSI-X
e1000e 0000:07:00.0: irq 59 for MSI/MSI-X
e1000e 0000:07:00.0: irq 60 for MSI/MSI-X
igb 0000:03:00.0: irq 61 for MSI/MSI-X
igb 0000:03:00.0: irq 62 for MSI/MSI-X
igb: eth2: igb_probe: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
igb 0000:03:00.1: irq 63 for MSI/MSI-X
igb 0000:03:00.1: irq 64 for MSI/MSI-X
igb: eth3: igb_probe: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
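
For further confirmation, the MSI-X capability state can be read straight from the
device; 03:00.0 is the first igb port from the output above:

$ lspci -vv -s 03:00.0 | grep -i msi-x

An "Enable+" flag on the MSI-X capability line means MSI-X is active for that
function.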

> 
> Per your question, at least when dealing with packets from within the
> kernel, a Nehalem box is fully capable of handling >20Gbps (depending on
> the packet size) - so I doubt that this is a hardware issue.

Agreed. I ran a local netperf test and saw about 8 Gbps of throughput on a single
core, which should be more than adequate for 1 Gbps of traffic.
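
For the record, the local test was nothing fancy; something along these lines
reproduces it (a sketch, not the exact invocation; netserver has to be running
first, and 1472 bytes is just the largest UDP payload that fits in a 1500-byte
MTU without fragmenting):

$ netserver
$ netperf -H 127.0.0.1 -t UDP_STREAM -l 30 -- -m 1472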

> 
> - Gilboa
> 

Thanks,
Kelvin

