3.9 merge window kernels

Mauro Carvalho Chehab mchehab at redhat.com
Sat Feb 23 01:30:14 UTC 2013


Em Sat, 23 Feb 2013 02:09:01 +0100
poma <pomidorabelisima at gmail.com> escreveu:

> On 02/23/13 01:09, Mauro Carvalho Chehab wrote:
> > Em Fri, 22 Feb 2013 18:48:25 -0500
> > Josh Boyer <jwboyer at redhat.com> escreveu:
> > 
> >> On Sat, Feb 23, 2013 at 12:29:24AM +0100, poma wrote:
> >>> On 02/22/13 19:15, Bruno Wolff III wrote:
> >>>> On Fri, Feb 22, 2013 at 10:17:17 -0500,
> >>>>   Josh Boyer <jwboyer at redhat.com> wrote:
> >>>>> Just a quick heads up that the 3.9 merge window kernels are being built
> >>>>> in rawhide now.  I've tried to at least test boot kernels on my machine
> >>>>> before submitting them to koji, but testers beware.  Merge window
> >>>>> kernels can be tricky.
> >>>>
> >>>
> >>> Euphemism for broken? ;)
> >>
> >> No.  A euphemism for that would be "operationally challenged".
> >>
> >>>> I have git3 running on three machines and things seem fine. I'll be
> >>>> trying out git5 very soon.
> >>>
> >>> Linux version 3.9.0-0.rc0.git5.1.fc19.x86_64
> >>> (mockbuild at bkernel01.phx2.fedoraproject.org) (gcc version 4.8.0 20130220
> >>> (Red Hat 4.8.0-0.14) (GCC) ) #1 SMP Fri Feb 22 16:35:19 UTC 2013
> >>> …
> >>> =============================================
> >>> [ INFO: possible recursive locking detected ]
> >>> 3.9.0-0.rc0.git5.1.fc19.x86_64 #1 Not tainted
> >>> ---------------------------------------------
> >>> kworker/0:1/36 is trying to acquire lock:
> >>>  (hdl->lock){+.+...}, at: [<ffffffffa042d745>] find_ref_lock+0x25/0x60
> >>> [videodev]
> >>>
> >>> but task is already holding lock:
> >>>  (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
> >>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
> >>>
> >>> other info that might help us debug this:
> >>>  Possible unsafe locking scenario:
> >>>
> >>>        CPU0
> >>>        ----
> >>>   lock(hdl->lock);
> >>>   lock(hdl->lock);
> >>>
> >>>  *** DEADLOCK ***
> >>>
> >>>  May be due to missing lock nesting notation
> >>>
> >>> 3 locks held by kworker/0:1/36:
> >>>  #0:  (events){.+.+.+}, at: [<ffffffff8108d870>]
> >>> process_one_work+0x190/0x680
> >>>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff8108d870>]
> >>> process_one_work+0x190/0x680
> >>>  #2:  (hdl->lock){+.+...}, at: [<ffffffffa0430888>]
> >>> v4l2_ctrl_add_handler+0x78/0xf0 [videodev]
> >>>
> >>> stack backtrace:
> >>> Pid: 36, comm: kworker/0:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
> >>> Call Trace:
> >>>  [<ffffffff810da843>] __lock_acquire+0x19e3/0x1a80
> >>>  [<ffffffff81021d69>] ? sched_clock+0x9/0x10
> >>>  [<ffffffff810acf15>] ? sched_clock_cpu+0xb5/0x100
> >>>  [<ffffffff810db092>] lock_acquire+0xa2/0x1f0
> >>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
> >>>  [<ffffffff81711820>] mutex_lock_nested+0x80/0x3c0
> >>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
> >>>  [<ffffffffa042d745>] ? find_ref_lock+0x25/0x60 [videodev]
> >>>  [<ffffffff810d8a7d>] ? trace_hardirqs_on_caller+0xfd/0x190
> >>>  [<ffffffffa042d745>] find_ref_lock+0x25/0x60 [videodev]
> >>>  [<ffffffffa042fce6>] handler_new_ref+0x46/0x1f0 [videodev]
> >>>  [<ffffffffa04308c7>] v4l2_ctrl_add_handler+0xb7/0xf0 [videodev]
> >>>  [<ffffffffa042a657>] v4l2_device_register_subdev+0x97/0x180 [videodev]
> >>>  [<ffffffffa04caa5e>] ivtv_gpio_init+0x13e/0x180 [ivtv]
> >>>  [<ffffffffa04c5cb4>] ivtv_probe+0x914/0x1bf0 [ivtv]
> >>>  [<ffffffff810d8904>] ? mark_held_locks+0xb4/0x130
> >>>  [<ffffffff81714636>] ? _raw_spin_unlock_irqrestore+0x36/0x70
> >>>  [<ffffffff810d8b1d>] ? trace_hardirqs_on+0xd/0x10
> >>>  [<ffffffff813890ae>] local_pci_probe+0x3e/0x70
> >>>  [<ffffffff810898a4>] work_for_cpu_fn+0x14/0x20
> >>>  [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
> >>>  [<ffffffff8108d870>] ? process_one_work+0x190/0x680
> >>>  [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
> >>>  [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
> >>>  [<ffffffff810938fd>] kthread+0xed/0x100
> >>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
> >>>  [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
> >>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
> >>> …
> >>
> >>> ------------[ cut here ]------------
> >>> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa0/0x100()
> >>> …
> >>> Pid: 38, comm: kworker/2:1 Not tainted 3.9.0-0.rc0.git5.1.fc19.x86_64 #1
> >>> Call Trace:
> >>>  [<ffffffff81068670>] warn_slowpath_common+0x70/0xa0
> >>>  [<ffffffff810686ec>] warn_slowpath_fmt+0x4c/0x50
> >>>  [<ffffffff81378dd0>] check_for_stack+0xa0/0x100
> >>>  [<ffffffff81379180>] debug_dma_map_page+0x100/0x140
> >>>  [<ffffffff814e6fdb>] usb_hcd_map_urb_for_dma+0x55b/0x620
> >>>  [<ffffffff814e7335>] usb_hcd_submit_urb+0x295/0x8d0
> >>>  [<ffffffff810d625c>] ? lockdep_init_map+0x9c/0x470
> >>>  [<ffffffff814e8905>] usb_submit_urb+0x155/0x3d0
> >>>  [<ffffffff814e9274>] usb_start_wait_urb+0x74/0x190
> >>>  [<ffffffff814e9831>] usb_bulk_msg+0xc1/0x170
> >>>  [<ffffffffa058c646>] dvb_usbv2_generic_rw+0xc6/0x260 [dvb_usb_v2]
> >>>  [<ffffffffa05943be>] af9015_ctrl_msg+0x16e/0x260 [dvb_usb_af9015]
> >>>  [<ffffffff8136be2a>] ? debug_object_deactivate+0x8a/0x170
> >>>  [<ffffffffa0594bbc>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015]
> >>>  [<ffffffffa058b08d>] dvb_usbv2_init_work+0x6d/0x10e0 [dvb_usb_v2]
> >>>  [<ffffffff8108d8d9>] process_one_work+0x1f9/0x680
> >>>  [<ffffffff8108d870>] ? process_one_work+0x190/0x680
> >>>  [<ffffffff8108e0e1>] worker_thread+0x111/0x3c0
> >>>  [<ffffffff8108dfd0>] ? rescuer_thread+0x270/0x270
> >>>  [<ffffffff810938fd>] kthread+0xed/0x100
> >>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
> >>>  [<ffffffff8171e16c>] ret_from_fork+0x7c/0xb0
> >>>  [<ffffffff81093810>] ? insert_kthread_work+0x80/0x80
> >>> ---[ end trace b4a4208b0ed39626 ]---
> >>
> >> Mauro, any idea on these two?
> > 
> > The first one looks to be a know issue: __initdev inside ivtv/cx18, is delaying
> > initialization to happen after registering the device. There's a patch for it,
> > pending upstream merge.
> > 
> 
> This one is gone after after applying
> 7d180f91eaabfbb813f1eda9e39e4445bfb826fc branch 'master' of
> git://linuxtv.org/media_build

OK.

> > The second one seems to be a bad lock. I think I saw some discussion about 
> > that at the media ML, but I can't remember the odd details. Perhaps there's
> > a fix for it already. It should be noticed that there's a pending pull
> > request with about ~600 patches for 3.9 that is pending its merge.
> > The ivtv fix should be there. I hope Linus will merge it soon.
> >
> 
> This one stayed;
>> ------------[ cut here ]------------
> WARNING: at lib/dma-debug.c:947 check_for_stack+0xa7/0xf0()
> Hardware name: M720-US3
> ehci-pci 0000:00:04.1: DMA-API: device driver maps memory fromstack
> [addr=ffff8801283d7be1]
> Modules linked in: dvb_usb_af9015(OF+)… dvb_usb_v2(OF)… dvb_core(OF)
> wm8775(OF) tda9887(OF) tda8290(OF)… tuner(OF)… cx25840(OF)… bttv(OF+)…
> rc_core(OF) btcx_risc(OF) ivtv(OF) snd_bt87x… videobuf_dma_sg(OF)
> videobuf_core(OF)… cx2341x(OF) tveeprom(OF) v4l2_common(OF)…
> videodev(OF)… video i2c_algo_bit… (irrelevant omitted)
> Pid: 38, comm: kworker/2:1 Tainted: GF          O
> 3.9.0-0.rc0.git5.1.fc18.x86_64 #1
> Call Trace:
> usb 4-2: pl2303 converter now attached to ttyUSB0
>  [<ffffffff81069b5f>] warn_slowpath_common+0x7f/0xc0
>  [<ffffffff81069c56>] warn_slowpath_fmt+0x46/0x50
>  [<ffffffff81386467>] check_for_stack+0xa7/0xf0
>  [<ffffffff81388228>] debug_dma_map_page+0x118/0x150
>  [<ffffffff814fa5d9>] usb_hcd_map_urb_for_dma+0x569/0x630
>  [<ffffffff811be091>] ? set_track+0x61/0x1b0
>  [<ffffffff814fa945>] usb_hcd_submit_urb+0x2a5/0x900
>  [<ffffffff810d814c>] ? lockdep_init_map+0xac/0x540
>  [<ffffffff814fc07f>] usb_submit_urb+0xff/0x3d0
>  [<ffffffff814fd042>] usb_start_wait_urb+0x82/0x1a0
>  [<ffffffff814fbdae>] ? usb_alloc_urb+0x1e/0x50
>  [<ffffffff814fd22c>] usb_bulk_msg+0xcc/0x180
>  [<ffffffffa05be7e8>] dvb_usbv2_generic_rw+0xd8/0x270 [dvb_usb_v2]
>  [<ffffffffa05cc5da>] af9015_ctrl_msg+0x1da/0x280 [dvb_usb_af9015]
>  [<ffffffff810ae7aa>] ? cpuacct_charge+0xfa/0x210
>  [<ffffffff810ae6c3>] ? cpuacct_charge+0x13/0x210
>  [<ffffffffa05ccb4c>] af9015_identify_state+0x3c/0x90 [dvb_usb_af9015]
>  [<ffffffffa05bcf18>] dvb_usbv2_init_work+0x68/0x11a0 [dvb_usb_v2]
>  [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
>  [<ffffffff8108e3db>] process_one_work+0x20b/0x690
>  [<ffffffff8108e36f>] ? process_one_work+0x19f/0x690
>  [<ffffffff8108ec10>] worker_thread+0x110/0x380
>  [<ffffffff8108eb00>] ? rescuer_thread+0x260/0x260
>  [<ffffffff8109569d>] kthread+0xed/0x100
>  [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
>  [<ffffffff8173882c>] ret_from_fork+0x7c/0xb0
>  [<ffffffff810955b0>] ? flush_kthread_worker+0x190/0x190
> ---[ end trace 4f4bc1c2abe300b5 ]---
> --

Ah, you're running it on a non-x86 hardware, right? On (modern) x86 machines,
almost all memories can do DMA, including the area used by stack. That's
not true on other archs like arm.

I suspect that Antti never tried to run this driver outside x86.

Let me look into it...

static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
{
#define BUF_LEN 63
#define REQ_HDR_LEN 8 /* send header size */
#define ACK_HDR_LEN 2 /* rece header size */
        struct af9015_state *state = d_to_priv(d);
        int ret, wlen, rlen;
        u8 buf[BUF_LEN];
...
        ret = dvb_usbv2_generic_rw(d, buf, wlen, buf, rlen);



int dvb_usbv2_generic_rw(struct dvb_usb_device *d, u8 *wbuf, u16 wlen, u8 *rbuf,
                u16 rlen)
{
        int ret, actual_length;

...
        ret = usb_bulk_msg(d->udev, usb_sndbulkpipe(d->udev,
                        d->props->generic_bulk_ctrl_endpoint), wbuf, wlen,
                        &actual_length, 2000);


That's the problem: "buf" is at stack, and it is passed directly to
usb_bulk_msg(). This doesn't work on -arm (and sometimes may cause
data to be lost even on x86).

In order to fix it, the driver should be doing, instead, something like:

static int af9015_ctrl_msg(struct dvb_usb_device *d, struct req_t *req)
{
...
	u8 *buf = kmalloc(BUF_LEN, GFP_ATOMIC);

	...

	kfree(buf);
...
}

You shouldn't blame the merge windows kernel here. This never worked
on arm, even on older kernel versions ;)

A more permanent fix would be to allocate a temporary 80 bytes buffer
embedded with dvb_usb_v2 structs, and change dvb_usbv2_generic_rw to
always use this temporary buffer for the control URB transfers.

Antti,

Could you please tale a look in it?

Thanks!
Mauro


More information about the kernel mailing list