3.9 merge window kernels

poma pomidorabelisima at gmail.com
Sat Feb 23 05:47:39 UTC 2013


On 02/23/13 02:30, Mauro Carvalho Chehab wrote:
> Em Sat, 23 Feb 2013 02:09:01 +0100
> poma <pomidorabelisima at gmail.com> escreveu:
> 
>> On 02/23/13 01:09, Mauro Carvalho Chehab wrote:
>>> Em Fri, 22 Feb 2013 18:48:25 -0500
>>> Josh Boyer <jwboyer at redhat.com> escreveu:
>>>
>>>> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote:
>>>>> On 02/22/13 19:15, Bruno Wolff III wrote:
>>>>>> On Fri, Feb 22, 2013 at 10:17:17 -0500,
>>>>>>   Josh Boyer <jwboyer at redhat.com> wrote:
>>>>>>> Just a quick heads up that the 3.9 merge window kernels are being built
>>>>>>> in rawhide now.  I've tried to at least test boot kernels on my machine
>>>>>>> before submitting them to koji, but testers beware.  Merge window
>>>>>>> kernels can be tricky.
>>>>>>
>>>>>
>>>>> Euphemism for broken? ;)
>>>>
>>>> No.  A euphemism for that would be "operationally challenged".
>>>>
>>>>>> I have git3 running on three machines and things seem fine. I'll be
>>>>>> trying out git5 very soon.
>>>>>
>>>>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64
>>>>> (mockbuild at bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220
>>>>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013
>>>>>>>>>> =============================================
>>>>> [ INFO: possible recursive locking detected ]
>>>>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted
>>>>> ---------------------------------------------
>>>>> kworker/0:1/36 is trying to acquire lock:
>>>>>  (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60
>>>>> [videodev]
>>>>>
>>>>> but task is already holding lock:
>>>>>  (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
>>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
>>>>>
>>>>> other info that might help us debug this:
>>>>>  Possible unsafe locking scenario:
>>>>>
>>>>>        CPU0
>>>>>        ----
>>>>>   lock(hdl->lock);
>>>>>   lock(hdl->lock);
>>>>>
>>>>>  *** DEADLOCK ***
>>>>>
>>>>>  May be due to missing lock nesting notation
>>>>>
>>>>> 3 locks held by kworker/0:1/36:
>>>>>  #0:  (events){.+.+.+}, at: [<ffffffff8108d870>]
>>>>> process_one_work+0x190/0x680
>>>>>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>]
>>>>> process_one_work+0x190/0x680
>>>>>  #2:  (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
>>>>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
>>>>>
>>>>> stack backtrace:
>>>>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
>>>>> Call Trace:
>>>>>  [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80
>>>>>  [<ffffffff81021d69>] ? sched_clock+0x9/0x10
>>>>>  [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100
>>>>>  [<ffffffff810db092>] lock_acquire+0xa2/0x1f0
>>>>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>>>  [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0
>>>>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
>>>>>  [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190
>>>>>  [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev]
>>>>>  [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev]
>>>>>  [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev]
>>>>>  [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev]
>>>>>  [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv]
>>>>>  [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv]
>>>>>  [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130
>>>>>  [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70
>>>>>  [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10
>>>>>  [<ffffffff813890ae>] local_pci_probe+0x3e/0x70
>>>>>  [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20
>>>>>  [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
>>>>>  [<ffffffff8108d870>] ? process_one_work+0x190/0x680
>>>>>  [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
>>>>>  [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
>>>>>  [<ffffffff810938fd>] kthread+0xed/0x100
>>>>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>>>  [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
>>>>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>>>>>>>
>>>>> ------------[ cut here ]------------
>>>>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100()
>>>>>>>>>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
>>>>> Call Trace:
>>>>>  [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0
>>>>>  [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50
>>>>>  [<ffffffff81378dd0>] check_for_stack+0xa0/0x100
>>>>>  [<ffffffff81379180>] debug_dma_map_page+0x100/0x140
>>>>>  [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620
>>>>>  [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0
>>>>>  [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470
>>>>>  [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0
>>>>>  [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190
>>>>>  [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170
>>>>>  [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2]
>>>>>  [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015]
>>>>>  [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170
>>>>>  [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015]
>>>>>  [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2]
>>>>>  [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
>>>>>  [<ffffffff8108d870>] ? process_one_work+0x190/0x680
>>>>>  [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
>>>>>  [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
>>>>>  [<ffffffff810938fd>] kthread+0xed/0x100
>>>>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>>>  [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
>>>>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
>>>>> ---[ end trace b4a4208b0ed39626 ]---
>>>>
>>>> Mauro, any idea on these two?
>>>
>>> The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying
>>> initialization to happen after registering the device. There's a patch for it,
>>> pending upstream merge.
>>>
>>
>> This one is gone after after applying
>> 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of
>> git://linuxtv.org/media_build
> 
> OK.
> 
>>> The second one seems to be a bad lock. I think I saw some discussion about 
>>> that at the media ML, but I can't remember the odd details. Perhaps there's
>>> a fix for it already. It should be noticed that there's a pending pull
>>> request with about ~600 patches for 3.9 that is pending its merge.
>>> The ivtv fix should be there. I hope Linus will merge it soon.
>>>
>>
>> This one stayed;
>>>> ------------[ cut here ]------------
>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0()
>> Hardware name: M720-US3
>> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack
>> [addr=ffff8801283d7be1]
>> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF)
>> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)…
>> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF)
>> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)…
>> videodev(OF)… video i2c_algo_bit… (irrelevant omitted)
>> Pid: 38, comm: kworker/2:1 Tainted: GF          O
>> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1
>> Call Trace:
>> usb 4-2: pl2303 converter now attached to ttyUSB0
>>  [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0
>>  [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50
>>  [<ffffffff81386467>] check_for_stack+0xa7/0xf0
>>  [<ffffffff81388228>] debug_dma_map_page+0x118/0x150
>>  [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630
>>  [<ffffffff811be091>] ? set_track+0x61/0x1b0
>>  [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900
>>  [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540
>>  [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0
>>  [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0
>>  [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50
>>  [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180
>>  [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2]
>>  [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015]
>>  [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210
>>  [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210
>>  [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015]
>>  [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2]
>>  [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
>>  [<ffffffff8108e3db>] process_one_work+0x20b/0x690
>>  [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
>>  [<ffffffff8108ec10>] worker_thread+0x110/0x380
>>  [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260
>>  [<ffffffff8109569d>] kthread+0xed/0x100
>>  [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
>>  [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0
>>  [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
>> ---[ end trace 4f4bc1c2abe300b5 ]---
>> --
> 
> Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines,
> almost all memories can do DMA, including the area used by stack. That's
> not true on other archs like arm.
> 

uname -m
x86_64
Morning is wiser than evening. :)


> I suspect that Antti never tried to run this driver outside x86.
> 
> Let me look into it...
> 
> static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
> {
> #define BUF_LEN 63
> #define REQ_HDR_LEN 8 /* send header size */
> #define ACK_HDR_LEN 2 /* rece header size */
>         struct af9015_state *state = d_to_priv(d);
>         int ret, wlen, rlen;
>         u8 buf[BUF_LEN];
> ...
>         ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen);
> 
> 
> 
> int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf,
>                 u16 rlen)
> {
>         int ret, actual_length;
> 
> ...
>         ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev,
>                         d->props->generic_bulk_ctrl_endpoint), wbuf, wlen,
>                         &actual_length, 2000);
> 
> 
> That's the problem: "buf" is at stack, and it is passed directly to
> usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause
> data to be lost even on x86).
> 
> In order to fix it, the driver should be doing, instead, something like:
> 
> static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
> {
> ...
> 	u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC);
> 
> 	...
> 
> 	kfree(buf);
> ...
> }
> 
> You shouldn't blame the merge windows kernel here. This never worked
> on arm, even on older kernel versions ;)
> 

Given that a zillion of code is in the rush, to work at all is miracle
enough.

> A more permanent fix would be to allocate a temporary 80 bytes buffer
> embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to
> always use this temporary buffer for the control URB transfers.
> 
> Antti,
> 
> Could you please tale a look in it?
> 
> Thanks!
> Mauro
> 

Mauro, with your patch 'af9015.c-3.9.0-0.rc0.git5.1.fc18.x86_64.diff'
now the media tree is fine, *but* network… here we go again, rock you
like a hurricane! :)

------------[ cut here ]------------
WARNING: at lib/dma-debug.c:933 check_unmap+0x407/0x8a0()
Hardware name: M720-US3
skge 0000:01:09.0: DMA-API: device driver failed to check map
error[device address=0x0000000105d52522] [size=90 bytes] [mapped as single]
Modules linked in:
 …uetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4
nf_defrag_ipv6 nf_defrag_ipv4 xt_conntrack ip6table_filter nf_conntrack
ip6_tables… vhost_net tun macvtap… macvlan… nfsd auth_rpcgss nfs_acl
lockd… r8169… mii skge… sunrpc
Pid: 0, comm: swapper/3 Tainted: GF          O
3.9.0-0.rc0.git5.1.fc18.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff81387a47>] check_unmap+0x407/0x8a0
 [<ffffffff810ab62f>] ? try_to_wake_up+0x1ff/0x350
 [<ffffffff813880fe>] debug_dma_unmap_page+0x5e/0x70
 [<ffffffffa00750ac>] ? skge_intr+0x2c/0x570 [skge]
 [<ffffffffa0073525>] skge_poll+0x1b5/0x9c0 [skge]
 [<ffffffff815e7672>] net_rx_action+0x172/0x380
 [<ffffffff810737e0>] __do_softirq+0x100/0x400
 [<ffffffff81021dd3>] ? native_sched_clock+0x13/0x80
 [<ffffffff81739c7c>] call_softirq+0x1c/0x30
 [<ffffffff8101c435>] do_softirq+0xa5/0xe0
 [<ffffffff81073cd5>] irq_exit+0xd5/0xe0
 [<ffffffff8173a5b3>] do_IRQ+0x63/0xe0
 [<ffffffff8172f732>] common_interrupt+0x72/0x72
 <EOI>  [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50
 [<ffffffff8172f154>] ? _raw_spin_unlock_irq+0x34/0x50
 [<ffffffff8172f150>] ? _raw_spin_unlock_irq+0x30/0x50
 [<ffffffff810a3ddc>] finish_task_switch+0x7c/0x120
 [<ffffffff810a3d9f>] ? finish_task_switch+0x3f/0x120
 [<ffffffff8172cd02>] __schedule+0x442/0xa00
 [<ffffffff8172d5f9>] schedule+0x29/0x70
 [<ffffffff81023abf>] cpu_idle+0x12f/0x140
 [<ffffffff8171a70e>] start_secondary+0x269/0x26b
---[ end trace d41e441560f83f6a ]---
Mapped at:
 [<ffffffff813881b9>] debug_dma_map_page+0xa9/0x150
 [<ffffffffa0074c83>] skge_xmit_frame+0x163/0x560 [skge]
 [<ffffffff815e8330>] dev_hard_start_xmit+0x260/0x6f0
 [<ffffffff8160bb0e>] sch_direct_xmit+0xfe/0x2a0
 [<ffffffff815e8a07>] dev_queue_xmit+0x247/0x970
…
--


Cheers,
poma





More information about the kernel mailing list