I'm seeing errors after upgrading from 4.2.8-300 to 4.3.3-300. I also see the same errors on newer kernels for f23 from koji and on 4.5.0-0.rc0.git6.1.vanilla.knurd.1.
dmesg | egrep -i 'mlx|dmar'
[ 17.816756] mlx4_core 0000:82:00.0: Mapped 1 chunks/256 KB at 120040000 for ICM
[ 17.825330] mlx4_core 0000:8a:00.0: SRIOV, disabling HA mode for intf proto 0
[ 17.825541] <mlx4_ib> mlx4_ib_add: counter index 0 for port 1 allocated 0
[ 17.833869] <mlx4_ib> mlx4_ib_add: counter index 1 for port 2 allocated 0
[ 17.906397] mlx4_core 0000:8a:00.0: Mapped 1 chunks/256 KB at 120040000 for ICM
[ 17.911403] mlx4_core 0000:8a:00.0: mlx4_ib: multi-function enabled
[ 17.925065] mlx4_core 0000:8a:00.0: mlx4_ib: initializing demux service for 128 qp1 clients
[ 17.937459] mlx4_core 0000:8a:00.0: Mapped 1 chunks/256 KB at 128040000 for ICM
[ 17.938766] mlx4_core 0000:8a:00.0: Mapped 1 chunks/256 KB at 1200c0000 for ICM
[ 29.527780] mlx4_core 0000:8a:00.0: Mapped 1 chunks/256 KB at 128080000 for ICM
[ 29.529083] mlx4_core 0000:8a:00.0: Mapped 1 chunks/256 KB at 120140000 for ICM
[ 31.330799] DMAR: DRHD: handling fault status reg 2
[ 31.330803] DMAR: DMAR:[DMA Write] Request device [8a:06.1] fault addr fc26e000 DMAR:[fault reason 02] Present bit in context entry is clear
[ 31.330865] DMAR: DRHD: handling fault status reg 102
[ 31.330868] DMAR: DMAR:[DMA Read] Request device [8a:06.1] fault addr fc632000 DMAR:[fault reason 02] Present bit in context entry is clear
[ 31.530006] DMAR: DRHD: handling fault status reg 202
. . .
All previous f22 and f23 releases I’ve used were fine.
I have two IB cards, both with firmware version 2.9.1000:
82:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
8a:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
The first one has SR-IOV off; the second has SR-IOV on.
Thu, 21 Jan 2016 18:21:01 +0000 Nate Pearlstein npearl@sgi.com wrote:
Seeing errors after upgrading from 4.2.8-300 to 4.3.3-300, I also see
I get similar...

WARNING: Kernel Errors Present
Buffer I/O error on dev dm-1, log ...: 102 Time(s)
Buffer I/O error on device dm-1, ...: 390 Time(s)
EXT4-fs error (device dm-1): _ ...: 11 Time(s)
EXT4-fs warning (device dm-1): ext4_end_bio:329: I/O error -5 writing to in ...: 204 Time(s)
WARNING: CPU: 0 PID: 415 at fs/buffer.c:1160 mar ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:08:30:ed:6f/00:00:09:00:00/40 Emask 0x60 (host bus error) ...: 2 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:10:e0:de:ff/00:00:08:00:00/40 Emask 0x60 (host bus error) ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:20:00:70:bc/00:00:09:00:00/40 Emask 0x60 (host bus error) ...: 2 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:20:10:93:55/00:00:04:00:00/40 Emask 0x60 (host bus error) ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:20:d8:8c:58/00:00:05:00:00/40 Emask 0x60 (host bus error) ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:60:90:a4:6c/00:00:0d:00:00/40 Emask 0x60 (host bus error) ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res 40/00:70:00:cf:6b/00:00:0d:00:00/40 Emask 0x60 (host bus error) ...: 1 Time(s)
ata1.00: cmd 61/00:00:98:05:7c/08:00:09:00:00/40 tag 0 ncq 1048576 out#012 res
I went back to the older kernel. uname -a:
Linux oh1mrr.ampr.org 4.2.8-300.fc23.i686+PAE #1 SMP Tue Dec 15 17:13:23 UTC 2015 i686 i686 i386 GNU/Linux

And no errors/warnings...
Jarmo
I’ve filed https://bugzilla.redhat.com/show_bug.cgi?id=1301210