Bug Hunting Diary 0015: Plugging a Cable into the Freescale 8360e Gadget Port Causes a Kernel Panic
From Jack's Lab
Latest revision as of 16:59, 19 May 2014 (Monday)
1 Phenomenon
Environment:
- Freescale 8360e
- Linux 2.6.34.6
The product kernel was upgraded from 2.6.27 to 2.6.34.6. After the kernel boots, the Gadget Ether module g_ether.ko is loaded manually. On success it prints:
$ dmesg | tail
g_ether gadget: using random self ethernet address
g_ether gadget: using random host ethernet address
usb0: MAC 8a:13:2e:1b:03:4f
usb0: HOST MAC 4a:25:7f:fd:8f:c4
g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
g_ether gadget: g_ether ready
fsl_qe_udc e01006c0.usb: fsl_qe_udc bind to driver g_ether
But as soon as the USB cable is plugged into a PC, the kernel panics:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc02a0d04
Oops: Kernel access of bad area, sig: 11 [#1]
NIP: c02a0d04 LR: c02a0cd8 CTR: c029976c
REGS: c051bcc0 TRAP: 0300 Not tainted (2.6.34.6-WR4.0.0.0_standard)
MSR: 00001032 <ME,IR,DR> CR: 22008042 XER: 20000000
DAR: 00000000, DSISR: 20000000
TASK = c04ee450[0] 'swapper' THREAD: c051a000
GPR00: 00000003 c051bd70 c04ee450 0000003f 000022f5 ffffffff c0260f98 0000000
GPR08: 00000000 cf9679c0 00000000 c051a000 22008044 9a6c7e8a 00000001 2000000
GPR16: 40000000 00800000 00400000 cf98c0f0 0c000000 cf98c0d8 00000000 0000000
GPR24: c050d560 00000000 cf98c000 00000000 c050d560 cf98c000 cf9785c0 cf97860
NIP [c02a0d04] composite_setup+0xa68/0xb58
LR [c02a0cd8] composite_setup+0xa3c/0xb58
Call Trace:
[c051bd70] [c02a0cd8] composite_setup+0xa3c/0xb58 (unreliable)
[c051bdb0] [c029d23c] qe_udc_irq+0xbbc/0xe10
[c051be20] [c0092e48] handle_IRQ_event+0xb8/0x30c
[c051be70] [c0095f7c] handle_level_irq+0xb8/0x184
[c051be90] [c0030138] qe_ic_cascade_low_ipic+0x3c/0x50
[c051bea0] [c00064e4] native_do_IRQ+0x98/0xb4
[c051bec0] [c0005164] do_IRQ+0x10/0x20
[c051bed0] [c0015bb4] ret_from_except+0x0/0x14
--- Exception: 501 at cpu_idle+0x88/0xec
LR = cpu_idle+0x88/0xec
[c051bf90] [c0009b78] cpu_idle+0xe8/0xec (unreliable)
[c051bfb0] [c0003e90] rest_init+0xb0/0xe0
[c051bfc0] [c04b3890] start_kernel+0x304/0x318
[c051bff0] [00003438] 0x3438
Instruction dump:
4812b041 939e000c 3ac00000 3b200000 3ae00001 81380034 2f890000 419e00c4
801a0010 2f800003 419e00a0 81090008 <81680000> 3949003c 2f8b0000 419e0040
Kernel panic - not syncing: Fatal exception in interrupt
Call Trace:
[c051bc00] [c00089b0] show_stack+0x50/0x160 (unreliable)
[c051bc30] [c03cbc94] panic+0x128/0x1a8
[c051bc80] [c0012eb8] die+0x168/0x224
[c051bca0] [c0018fb0] bad_page_fault+0x90/0xc8
[c051bcb0] [c00159b8] handle_page_fault+0x7c/0x80
--- Exception: 300 at composite_setup+0xa68/0xb58
LR = composite_setup+0xa3c/0xb58
[c051bdb0] [c029d23c] qe_udc_irq+0xbbc/0xe10
[c051be20] [c0092e48] handle_IRQ_event+0xb8/0x30c
[c051be70] [c0095f7c] handle_level_irq+0xb8/0x184
[c051be90] [c0030138] qe_ic_cascade_low_ipic+0x3c/0x50
[c051bea0] [c00064e4] native_do_IRQ+0x98/0xb4
[c051bec0] [c0005164] do_IRQ+0x10/0x20
[c051bed0] [c0015bb4] ret_from_except+0x0/0x14
--- Exception: 501 at cpu_idle+0x88/0xec
LR = cpu_idle+0x88/0xec
[c051bf90] [c0009b78] cpu_idle+0xe8/0xec (unreliable)
[c051bfb0] [c0003e90] rest_init+0xb0/0xe0
[c051bfc0] [c04b3890] start_kernel+0x304/0x318
[c051bff0] [00003438] 0x3438
Rebooting in 180 seconds..
2 Analysis
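This is clearly a NULL pointer dereference: the faulting data address (DAR) is 0x00000000. The NIP value puts the faulting instruction at 0xc02a0d04.
Disassemble the kernel and find where this instruction lives: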
c02a0cfc: 41 9e 00 a0 beq- cr7,c02a0d9c <composite_setup+0xb00>
c02a0d00: 81 09 00 08 lwz r8,8(r9)
c02a0d04: 81 68 00 00 lwz r11,0(r8)
c02a0d08: 39 49 00 3c addi r10,r9,60
c02a0d0c: 2f 8b 00 00 cmpwi cr7,r11,0
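The NIP line of the Oops already carries the symbol name and the offset of the faulting instruction inside it, so the start address of the function can be computed instead of searched for. The following is a minimal sketch of that arithmetic, assuming the line has exactly the layout shown in the log above; `struct nip_info`, `parse_nip_line` and `function_start` are hypothetical names introduced for illustration and are not part of the kernel.

```c
#include <stdio.h>

/* Hypothetical container for the fields of a line such as
 * "NIP [c02a0d04] composite_setup+0xa68/0xb58". */
struct nip_info {
    unsigned long addr;    /* absolute address of the faulting instruction */
    char symbol[64];       /* name of the enclosing function */
    unsigned long offset;  /* offset of the instruction inside the function */
    unsigned long size;    /* total size of the function in bytes */
};

/* Parse one NIP line. Returns 0 on success, -1 if the layout does not match. */
static int parse_nip_line(const char *line, struct nip_info *out)
{
    int n = sscanf(line, "NIP [%lx] %63[^+]+0x%lx/0x%lx",
                   &out->addr, out->symbol, &out->offset, &out->size);
    return n == 4 ? 0 : -1;
}

/* Start address of the function: faulting address minus the offset. */
static unsigned long function_start(const struct nip_info *info)
{
    return info->addr - info->offset;
}
```

For the NIP line in this Oops, the result is 0xc02a029c. That agrees with the branch target label in the disassembly: 0xc02a0d9c - 0xb00 is also 0xc02a029c.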
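The instruction is inside composite_setup(). In drivers/usb/gadget/composite.c the function body is fairly large, so it is hard to pin down the exact C statement by reading alone. Disassemble once more, this time running objdump -S -d on g_ether.o (-S makes objdump interleave the original C source with the disassembly; the full kernel source is large, so targeting one small object file is enough). The output: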
if (gadget->speed == USB_SPEED_HIGH)
3800: 80 1a 00 10 lwz r0,16(r26)
3804: 2f 80 00 03 cmpwi cr7,r0,3
3808: 41 9e 00 a0 beq- cr7,38a8 <composite_setup+0xb00>
descriptors = f->hs_descriptors;
else
descriptors = f->descriptors;
380c: 81 09 00 08 lwz r8,8(r9)
for (; *descriptors; ++descriptors) {
3810: 81 68 00 00 lwz r11,0(r8)
continue;
ep = (struct usb_endpoint_descriptor *)*descriptors;
addr = ((ep->bEndpointAddress & 0x80) >> 3)
| (ep->bEndpointAddress & 0x0f);
set_bit(addr, f->endpoints);
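g_ether.o contains no absolute addresses, so the listing can only be lined up with the kernel disassembly by matching instruction encodings. The faulting source line is 'for (; *descriptors; ++descriptors) '. The fault is obviously the *descriptors dereference, so the next step is to look further up and find out why descriptors was not assigned a valid value.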
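Matching by instruction encoding can be sketched as a search for a run of 32-bit words in the code bytes of the object file. The helper below is a hypothetical illustration, not a tool used in the original investigation. It assumes big-endian 4-byte PowerPC instructions and that the code bytes have already been read into memory. Instructions that are patched at link time (for example calls to other functions) may differ between the .o file and the linked kernel, so the pattern should be built from instructions such as the lwz and cmpwi around the fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one 32-bit big-endian word from a byte buffer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical helper: find the first occurrence of `count` consecutive
 * instruction words in `code`. Only 4-byte-aligned positions are checked.
 * Returns the byte offset of the match, or -1 if there is none.
 */
static long find_insn_sequence(const unsigned char *code, size_t len,
                               const uint32_t *pattern, size_t count)
{
    if (count == 0 || len < count * 4)
        return -1;

    for (size_t off = 0; off + count * 4 <= len; off += 4) {
        size_t i;
        for (i = 0; i < count; i++) {
            if (read_be32(code + off + i * 4) != pattern[i])
                break;
        }
        if (i == count)
            return (long)off;
    }
    return -1;
}
```

The offsets in the two listings give an independent cross-check. In g_ether.o the label 38a8 is composite_setup+0xb00, so the function starts at 0x38a8 - 0xb00 = 0x2da8, and 0x2da8 + 0xa68 = 0x3810, which is exactly the lwz r11,0(r8) line.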
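After adding a few printk calls, it quickly became clear that the speed mode in gadget->speed was set incorrectly, so descriptors picked up the wrong value: it should have been f->descriptors rather than f->hs_descriptors.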
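The failure mode can be reproduced in a small user-space model: the descriptor table is chosen purely from the reported speed, and if that speed does not match the table the function actually populated, the loop receives a NULL pointer. The sketch below is a simplified illustration and not the kernel code or the fix applied here. All `model_` names are hypothetical, and the NULL check exists only so the model reports the problem instead of crashing.

```c
#include <stddef.h>

/* Speed values mirroring the comparison against 3 in the disassembly. */
enum model_speed { MODEL_SPEED_FULL = 2, MODEL_SPEED_HIGH = 3 };

/* Simplified stand-in for a descriptor header. */
struct model_desc {
    unsigned char bLength;
    unsigned char bDescriptorType;
};

/* Simplified stand-in for a function with two NULL-terminated tables. */
struct model_function {
    struct model_desc **descriptors;     /* full-speed table */
    struct model_desc **hs_descriptors;  /* high-speed table */
};

/* Same branch as in the listing: the table depends only on the speed. */
static struct model_desc **select_descriptors(const struct model_function *f,
                                              enum model_speed speed)
{
    return speed == MODEL_SPEED_HIGH ? f->hs_descriptors : f->descriptors;
}

/* Count the entries of a table; returns -1 rather than dereferencing NULL. */
static int count_descriptors(struct model_desc **table)
{
    int n = 0;

    if (table == NULL)
        return -1;
    for (; *table; ++table)
        n++;
    return n;
}
```

With only one of the two tables populated, selecting by the wrong speed yields NULL, which is the situation the `for (; *descriptors; ++descriptors)` loop ran into.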