Bug-Hunting Diary 0011: insmod failed on RMI XLR (1)

From Jack's Lab
Revision as of 17:16, 19 May 2014 (Mon) by Comcat (talk | contribs)


1 Phenomenon

Environment:

  • RMI XLR732 (8 core, 32 threads)
  • Linux 2.6.27.14


After the product kernel was upgraded from 2.6.27.8 to 2.6.27.14, the following appeared during boot, at the point where udev modprobes the modules:

......
VFS: Mounted root (nfs filesystem).
Freeing unused kernel memory: 216k freed
INIT: version 2.86 booting
        Welcome to Wind River Linux
Starting udev: [ OK ]
Setting hostname localhost: [ OK ]
Checking filesystems
Checking all file systems.
[ OK ]
Mounting local filesystems: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffff800000d0, ra == ffffffff80000008
Oops[#1]:
Cpu 0
$ 0   : 0000000000000000 000000002aab0748 0000000000000000 ffffff0000000000
$ 4   : c000000000000000 000000002aaaa000 0000000000011d88 ffffffffffffffbf
$ 8   : 7f454c4602020100 0000000000000000 0001000800000001 0000000000000000
$12   : 0000000000000000 ffffffff834056fc ffffffff83485048 0000000000002af8
$16   : 000000002aaaa000 0000000000006748 000000002aaaa000 0000000000417148
$20   : c000000000000000 0000000000417100 0000000000417330 000000002aaaa000
$24   : 0000000000000008 ffffffff835e2710                                 
$28   : a800000126710000 a800000126713d20 000000000041733c ffffffff80000008
Hi    : 0000000000000007
Lo    : 0000000000000001
epc   : ffffffff800000d0 0xffffffff800000d0
    Not tainted
ra    : ffffffff80000008 0xffffffff80000008
Status: 10005ce3    KX SX UX KERNEL EXL IE
Cause : 00808008
BadVA : 0000000000000000
PrId : 000c0b04 (RMI Phoenix)
Modules linked in:
Process modprobe (pid: 1573, threadinfo=a800000126710000, task=a8000001266aafe8, tls=000000002aaafec0)
Stack : a800000126716710 a8000001265e2438 a8000001265e23f0 a8000001265e2dc8
        a8000001265e3150 a8000001265e3140 a8000001265587c0 0000000000007000
        a8000001265e2dc8 ffffffff834b25e8 0000000000007000 ffffffff8340fb68
        a8000001276686c0 ffffffff834bdcec a800000127668598 a8000001265e3140
        000000002aaaa000 a8000001265e3150 a8000001265e23f0 0000000008100073
        a8000001265587c0 ffffffff834bfb3c 0000000000000000 0000000000000003
        0000000000000000 ffffffff8340ff98 a800000127668598 000000002aaaa000
        a800000126725380 ffffffff836169e8 0000000010005ce1 a800000126558824
        000000002aaaa000 a800000126725380 0000000000006748 0000000000417148
        000000002aaaa000 ffffffff838c0000 0000000000000000 0000000000417100
        ...
Call Trace:
[<ffffffff834b25e8>] vma_prio_tree_insert+0x28/0x60
[<ffffffff8340fb68>] _spin_unlock+0x18/0x38
[<ffffffff834bdcec>] vma_link+0x13c/0x218
[<ffffffff834bfb3c>] mmap_region+0x67c/0x728
[<ffffffff8340ff98>] _spin_lock_irqsave+0x20/0xf8
[<ffffffff836169e8>] __up_write+0x40/0x2c8
[<ffffffff834850bc>] sys_init_module+0x74/0x1b0
[<ffffffff8340394c>] handle_sys+0x16c/0x188
[<ffffffff834056f4>] __bzero+0x90/0x11c


Code: df7b0000 335a0ff0 037ad82d <df7a0000> df7b0008 001ad1ba 409a1000 001bd9ba 409b1800
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Starting portmap: [ OK ]
Mounting other filesystems: [ OK ]
Starting sshd:       <------------- kernel hangs here


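This reproduces every time, and the dumped information is almost identical on each run.

If the modules are removed from the rootfs, the system boots into the rootfs. Running insmod after login then gives: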

root@localhost:/root> insmod ./hwtimer_test.ko
CPU 29 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffff800000c4, ra == ffffffff80000008
Oops[#1]:
Cpu 29
$ 0   : 0000000000000000 00000000004b5c0c 0000000000000000 ffffff0000000000
$ 4   : c000000000000000 000000000049cf58 0000000000024c74 ffffffffffffffbf
$ 8   : 7f454c4602020100 0000000000000000 0001000800000001 0000000000000000
$12   : 0000000000000000 ffffffff834056fc ffffffff83485048 000000000000c808
$16   : 000000000049cf58 0000000000018cb4 000000000049cf58 000000000049cf48
$20   : c000000000000000 0000000000000003 000000000040e284 000000000040b6fc
$24   : 0000000000000034 ffffffff835e2818
$28   : a800000123e6c000 a800000123e6fd20 0000000000000000 ffffffff80000008
Hi    : 0000000000000007
Lo    : 0000000000000001
epc   : ffffffff800000c4 0xffffffff800000c4
    Not tainted
ra    : ffffffff80000008 0xffffffff80000008
Status: 1000dce3    KX SX UX KERNEL EXL IE
Cause : 00800008
BadVA : 0000000000000000
PrId : 000c0b04 (RMI Phoenix)
Modules linked in:
Process insmod (pid: 2052, threadinfo=a800000123e6c000, task=a8000001252fafe8, tls=00000000004a3470)
Stack : 0000000000000000 00000001ffffffff a800000125f59d40 0000000000000000
        0000000000000000 0000000000000000 000001a8000001a7 000001a8000001a8
        a800000125c8eb48 ffffffff838f4720 a800000125c8ea80 ffffffff834100e8
        a80000012666a540 a800000127428960 ffffffff83961240 ffffffff8360fe38
        a800000125c8ea80 ffffffff834100e8 a800000127428960 ffffffff8340fb68
        000001a7000001a6 000001a7000001a7 a800000125c8eb48 ffffffff838f4720
        a800000125c8ea80 ffffffff834fee74 a800000125f59d40 0000000000000000
        a800000125301680 0000000000000003 a800000125301680 0000000000000018
        a8000001253016e0 a800000125f59d40 0000000000018cb4 000000000049cf48
        000000000049cf58 ffffffff838c0000 000000007f94df11 0000000000000003
        ...
Call Trace:
[<ffffffff834100e8>] _spin_lock+0x20/0xb8
[<ffffffff8360fe38>] _atomic_dec_and_lock+0x178/0x240
[<ffffffff834100e8>] _spin_lock+0x20/0xb8
[<ffffffff8340fb68>] _spin_unlock+0x18/0x38
[<ffffffff834fee74>] mntput_no_expire+0x54/0x2b0
[<ffffffff834850bc>] sys_init_module+0x74/0x1b0
[<ffffffff8340394c>] handle_sys+0x16c/0x188


Code: 335a0ff8 037ad82d 403aa000 <df7b0000> 335a0ff0 037ad82d df7a0000 df7b0008 001ad1ba
Segmentation fault
root@localhost:/root> ls
abc hd.tst hwtimer_test.ko microperl n.out t.pl



2 Analysis

From the dumped context we can read off the following:

    Cause: 00808008             ----> TLB load exception
    BadVA: 0000000000000000     ----> access 0x0
    epc: ffffffff800000d0       ----> in TLB refill handler
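
The reading of Cause above can be checked mechanically. The sketch below is only an illustration: it extracts ExcCode (bits 6..2 of Cause) and the low Status bits the way the MIPS privileged architecture lays them out. The function names and the small lookup table are made up for this example and are not kernel code.

```python
# Subset of the ExcCode values defined by the MIPS privileged architecture.
EXC_CODES = {
    0: "Int",
    1: "Mod",
    2: "TLBL",   # TLB exception on load or instruction fetch
    3: "TLBS",   # TLB exception on store
    4: "AdEL",
    5: "AdES",
    8: "Sys",
}

def exc_code(cause):
    """Return Cause.ExcCode, which lives in bits 6..2."""
    return (cause >> 2) & 0x1F

def status_flags(status):
    """Decode the low Status bits in the order the oops prints them."""
    flags = [name for bit, name in ((7, "KX"), (6, "SX"), (5, "UX"))
             if status & (1 << bit)]
    # KSU field (bits 4..3) selects the base operating mode.
    flags.append(("KERNEL", "SUPERVISOR", "USER", "RESERVED")[(status >> 3) & 0x3])
    for bit, name in ((2, "ERL"), (1, "EXL"), (0, "IE")):
        if status & (1 << bit):
            flags.append(name)
    return flags

# Cause values taken from the oops dumps in this diary.
for cause in (0x00808008, 0x00800008):
    print(hex(cause), EXC_CODES[exc_code(cause)])
print(" ".join(status_flags(0x10005CE3)))
```

Both Cause values from the dumps decode to ExcCode 2 (TLBL). The Status word shows EXL set, meaning the CPU was already at exception level when the fault was taken.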

Decoding the dumped Code line (df7b0000 335a0ff0 037ad82d <df7a0000> ...) gives the following instructions:

 c:   df7b0000    ld      k1,0(k1)
10:   335a0ff0    andi    k0,k0,0xff0
14:   037ad82d    daddu   k1,k1,k0
18:   df7a0000    ld      k0,0(k1)        <--- causes the exception, so presumably k1 == 0
1c:   df7b0008    ld      k1,8(k1)
20:   001ad1ba    dsrl    k0,k0,0x6
24:   409a1000    mtc0    k0,c0_entrylo0
28:   001bd9ba    dsrl    k1,k1,0x6
2c:   409b1800    mtc0    k1,c0_entrylo1

This is clearly the tail end of the TLB refill handler.
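
For readers without a MIPS objdump at hand, the few encodings that appear in the Code line can be decoded by hand. The following is a minimal sketch, assuming only the standard MIPS64 field layout (opcode in bits 31..26, rs/rt/rd/sa/funct below it). It handles just the instructions seen in this dump, and the helper names are invented for the example.

```python
def reg(n):
    """Name the two kernel scratch registers; print others as $n."""
    return {26: "k0", 27: "k1"}.get(n, "$%d" % n)

def decode(word):
    """Decode the handful of MIPS64 encodings that occur in the oops Code line."""
    op = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    sa = (word >> 6) & 0x1F
    funct = word & 0x3F
    imm = word & 0xFFFF
    simm = imm - 0x10000 if imm & 0x8000 else imm   # sign-extended offset
    if op == 0x37:                                  # LD
        return "ld %s,%d(%s)" % (reg(rt), simm, reg(rs))
    if op == 0x0C:                                  # ANDI
        return "andi %s,%s,0x%x" % (reg(rt), reg(rs), imm)
    if op == 0x00 and funct == 0x2D:                # SPECIAL / DADDU
        return "daddu %s,%s,%s" % (reg(rd), reg(rs), reg(rt))
    if op == 0x00 and funct == 0x3A:                # SPECIAL / DSRL
        return "dsrl %s,%s,0x%x" % (reg(rd), reg(rt), sa)
    if op == 0x10 and rs == 0x04:                   # COP0 / MTC0
        return "mtc0 %s,$%d" % (reg(rt), rd)
    return ".word 0x%08x" % word

def faulting_word(code_line):
    """Return the word the kernel marked with <...>, i.e. the one at epc."""
    for tok in code_line.split():
        if tok.startswith("<") and tok.endswith(">"):
            return int(tok[1:-1], 16)
    return None

code = "df7b0000 335a0ff0 037ad82d <df7a0000> df7b0008 001ad1ba 409a1000 001bd9ba 409b1800"
print(decode(faulting_word(code)))
```

CP0 register 2 is EntryLo0 and register 3 is EntryLo1, which is why the two mtc0 words show up as c0_entrylo0 and c0_entrylo1 in the objdump listing.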

So the sequence of events is:

sys_init_module() ---> ... ---> TLB refill ---> there is no related page table entry for the virtual address ---> exception

2009.2.8 20:55



The old dev git tree (old wrlinux 3.0 with upstream 2.6.27.8) on my machine is OK.

I have tried the following upstream git trees:

        kernel.org 2.6.27.14
        kernel.org 2.6.27.11
        kernel.org 2.6.27.9
        kernel.org 2.6.27.8
        kernel.org 2.6.27.7
        kernel.org 2.6.27.4

        linux-mips 2.6.27.14
        linux-mips 2.6.27.8

All of them fail. It's cool!

2009.2.8 23:21



Tried the tester's wrlinux version; same phenomenon:

root@localhost:/root> uname -a
Linux localhost 2.6.27.12-WR3.0zz_standard #1 SMP PREEMPT Mon Feb 9 14:16:26 CST 2009 mips64 mips64 mips64 GNU/Linux

root@localhost:/root> insmod ./nls_cp437.ko
Oops[#3]:
Cpu 29
$ 0   : 0000000000000000 000000000049dae8 0000000000000000 ffffff0000000000
$ 4   : c00000000000a000 000000000049af58 0000000000006190 ffffffffffffffbf
$ 8   : 7f454c4602020100 0000000000000000 0001000800000001 0000000000000000
$12   : 0000000000000000 ffffffff834056ec ffffffff83484ee0 0000000000001cb8
$16   : 000000000049af58 0000000000002b90 000000000049af58 000000000049af48
$20   : c00000000000a000 0000000000000003 0000000000000000 0000000000519da8
$24   : 0000000000000010 0000000000000000
$28   : a800000124c5c000 a800000124c5fd20 0000000000000000 ffffffff80000008
Hi    : 0000000000000007
Lo    : 0000000000000001
epc   : ffffffff800000d0 0xffffffff800000d0
    Tainted: G      D
ra    : ffffffff80000008 0xffffffff80000008
Status: 1000dce3    KX SX UX KERNEL EXL IE
Cause : 00800008
BadVA : 0000000000000000
PrId : 000c0b04 (RMI Phoenix)
Modules linked in:
Process insmod (pid: 2463, threadinfo=a800000124c5c000, task=a800000123f45220, tls=00000000004a1470)
Stack : 0000000000000000 ffffffff83400a00 0000000000000000 0000000000000001
        0000000000000000 0000000000000000 00000000000001e3 0000000000010000
        00000000000001e3 0000000000000001 a80000012755dc68 00000000004120b8
        000000000049d1d0 000000000050baa0 0000000000000000 ffffffffc0000008
        ffffffff834da880 0000000300000002 0000000000002b90 0000000000000003
        a800000126529e00 0000000000000003 a800000123f4a580 0000000000000018
        a800000123f4a5e0 a8000001264ab200 0000000000000000 ffffffff835e3b60
        a800000123f4a580 0000000000000003 a800000124c5c000 a800000124c5fe60
        0000000000000000 ffffffff834da9c0 0000000000002b90 000000000049af48
        000000000049af58 ffffffff838c0000 000000000049af48 0000000000000003
        ...
Call Trace:
[<ffffffff83400a00>] ret_from_irq+0x0/0x4
[<ffffffff834da880>] sys_close+0x0/0x210
[<ffffffff835e3b60>] cap_file_free_security+0x0/0x8
[<ffffffff834da9c0>] sys_close+0x140/0x210
[<ffffffff83484f54>] sys_init_module+0x74/0x1b0
[<ffffffff8340394c>] handle_sys+0x16c/0x188
[<ffffffff834705e0>] sys_ni_syscall+0x0/0x8


Code: df7b0000 335a0ff0 037ad82d <df7a0000> df7b0008 001ad1ba 409a1000 001bd9ba 409b1800
Segmentation fault

2009.2.9 12:40



yshi tried it on a Cavium board and confirmed that this is not a generic MIPS issue but one specific to the RMI XLR. At least that is some good news.

2009.2.9 16:07



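Switched to another board whose peripherals differ slightly but which is otherwise identical, and the problem is gone. Downright spooky.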

2009.2.9 18:40



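Together with a few colleagues I found a temporary workaround: a very ugly TLB refill handler in place of the dynamically generated one. At last we can catch our breath. Tomorrow I need to compare the two TLB refill handlers carefully. Thanks to Guijin and Mark for working on this with me.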

2009.2.9 22:45



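I strongly suspected the driver for the MAC inside the XLR SoC, so I disabled the MAC in the kernel and mounted NFS through an 8139 NIC instead. The problem persisted, which confirms that it has nothing to do with the MAC driver.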

2009.2.10 11:15



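Added a system call, sys_comcat >^..^<, which directly accesses address 0x0, and triggered it from a small user-space program in an attempt to reproduce the context in which the problem occurs. The kernel behaved as follows: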

root@localhost:/root> uname -a
Linux localhost 2.6.27.15-WR3.0zz_standard-dirty #7 SMP PREEMPT Tue Feb 10 13:53:31 CST 2009 mips64 mips64 mips64 GNU/Linux
root@localhost:/root> ./syscall
CPU 25 Unable to handle kernel paging request at virtual address 0000000000000000, epc == 0000000000000000, ra == ffffffff8340394c
Oops[#1]:
Cpu 25
$ 0   : 0000000000000000 fffffffffffffff8 00000000000010ea 00000000004005f0
$ 4   : 0000000000000001 000000007f90c774 000000007f90c77c 0000000000000000
$ 8   : 000000007f90c774 000000002e0267fc 000000002e176950 0000000000000000
$12   : 0000000000000000 ffffffffc0000008 0000000000000000 fffffffff0000000
$16   : 0000000000400618 00000000004004c0 00000000004b0000 00000000004dc968
$20   : 00000000004dc988 00000000004dc968 0000000000000000 00000000004e6e60
$24   : 0000000000000000 00000000004005e0                                 
$28   : a800000123d24000 a800000123d27eb0 000000007f90c698 ffffffff8340394c
Hi    : 00000000000001a5
Lo    : 0000000000005e17
epc   : 0000000000000000 0x0
    Not tainted
ra    : ffffffff8340394c handle_sys+0x16c/0x188
Status: 1000dce3    KX SX UX KERNEL EXL IE
Cause : 00800008
BadVA : 0000000000000000
PrId : 000c0b04 (RMI Phoenix)
Modules linked in:
Process syscall (pid: 1984, threadinfo=a800000123d24000, task=a80000012519b6c0, tls=000000002aaafea0)
Stack : 0000000000000000 fffffffffffffff8 00000000000010ea 000000002aaa8a30
        0000000000000001 000000007f90c774 000000007f90c77c 0000000000000000
        000000002e170344 000000000ffffffe 000000000000006c 000000007f90c670
        ffffffffffffffff 000000007f90c5c8 000000002dfd9c24 fffffffff0000000
        0000000000400618 00000000004004c0 00000000004b0000 00000000004dc968
        00000000004dc988 00000000004dc968 0000000000000000 00000000004e6e60
        0000000000000000 00000000004005e0 0000000000000000 0000000000000000
        000000002e176950 000000007f90c698 0000000000000000 000000002e026948
        000000000000dcf3 00000000000001a5 0000000000005e17 000000002e03e8a0
        0000000000800020 00000000004005f4 ffffffff8396c000 ffffffff8396c000
        ...
Call Trace:

Code: (Bad address in epc)

Segmentation fault


As can be seen, this did not yield much.

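It does, however, confirm the sequence. The kernel accesses an address and enters the TLB refill handler. When the handler indexes the PMD, the corresponding entry is empty, so it picks up a wrong page-table base address. A second TLB miss follows immediately ---> TLB refill exception. Because EXL is 1 at that point, the CPU enters the general exception vector at offset 0x180. ExcCode is 2, i.e. a TLB Load exception, so the kernel handles it in handle_tlbl. For reasons not yet understood, handle_tlbl ends up in the final do_page_fault(), which dumps the registers and stack and then sends SIGSEGV to the process.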
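
The role of EXL in choosing the entry point can be summarised in a few lines. This is a simplified sketch of the architectural rule, assuming Status.BEV = 0 (vectors based at 0xffffffff80000000) and ignoring every other exception type. The function name and parameters are hypothetical.

```python
# Exception vector base when Status.BEV = 0 (an assumption for this sketch).
EBASE = 0xFFFFFFFF80000000

def vector_for_tlb_miss(exl, extended):
    """Pick the entry point for a TLB miss.

    exl      -- Status.EXL at the time of the miss
    extended -- True when 64-bit addressing is enabled for the faulting
                segment (KX/SX/UX), which selects the XTLB refill vector
    """
    if exl:
        # Already at exception level: no dedicated refill vector is used,
        # the miss is delivered through the general exception vector.
        return EBASE + 0x180
    return EBASE + (0x080 if extended else 0x000)

print(hex(vector_for_tlb_miss(exl=False, extended=True)))  # first miss
print(hex(vector_for_tlb_miss(exl=True, extended=True)))   # miss inside the refill handler
```

The first call corresponds to the original miss, which lands in the XTLB refill handler at 0xffffffff80000080. The second corresponds to the nested miss taken while that handler is still running with EXL = 1.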

The baffling part is that the walk hits an empty PMD entry, which amounts to reading the page table from address 0x0.
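
To make the "reading the page table from 0x0" observation concrete, here is a small model of the three-level walk. The shifts and masks are the ones visible in the dumped handler in this diary (dsrl 0x1b with andi 0x1ff8, dsrl 0x12 with andi 0xff8, and XContext with andi 0xff0). Everything else is an assumption for illustration: memory is a plain dict, an empty entry is modelled as 0, and the PMD address used in the example is invented.

```python
def walk_offsets(va):
    """Byte offsets into the PGD, PMD and PTE tables for a virtual address."""
    pgd_off = (va >> 27) & 0x1FF8   # va[39:30] * 8
    pmd_off = (va >> 18) & 0xFF8    # va[29:21] * 8
    pte_off = (va >> 9) & 0xFF0     # va[20:13] * 16, what XContext & 0xff0 yields
    return pgd_off, pmd_off, pte_off

def pte_pair_address(mem, pgd_base, va):
    """Address from which the handler would load the even/odd PTE pair.

    mem maps an address to the 64-bit value stored there; a missing key
    reads as 0, which models an empty table entry.
    """
    pgd_off, pmd_off, pte_off = walk_offsets(va)
    pmd_base = mem.get(pgd_base + pgd_off, 0)   # ld k1,0(k1): get p_pmd
    pt_base = mem.get(pmd_base + pmd_off, 0)    # ld k1,0(k1): get p_pt
    return pt_base + pte_off                    # address used by ld k0,0(k1)

PGD_BASE = 0xFFFFFFFF83960000          # value seen in the dumped handler
FAKE_PMD = 0xA800000000100000          # hypothetical PMD location
mem = {PGD_BASE: FAKE_PMD}             # PGD entry present, PMD entry empty

print(hex(pte_pair_address(mem, PGD_BASE, 0xC000000000000000)))
```

With an empty PMD entry and a faulting address whose PTE offset is 0, the final load targets address 0x0, which matches the BadVA of 0 reported in the oops.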

2009.2.10 14:23



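Went out for an aimless stroll in the evening, Jintang Yuxian, watching all sorts of people, some leisurely and some hurried, and felt a little lighter. Back home and calmer, I dumped the TLB refill handler from the failing board and analyzed it properly: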

dtlb.o:     file format elf64-tradbigmips

Disassembly of section .text:

0000000000000000 <main>:
   0:    67bdfff0     daddiu    sp,sp,-16
   4:    ffbe0000     sd    s8,0(sp)
   8:    03a0f02d     move    s8,sp

--------------------------------------------------------------

    @0xffffffff80000000                       
   c:    07610005     bgez    k1,24 <main+0x24> /* vaddr[61] != 1 (0xc000..0000 ~ 0xdfff..ffff), branch */   
10:    3c1bc000     lui    k1,0xc000        -----> at delay slot, commit new value to k1 after reading k1 (bgez)
                                           
                                            # vaddr is at 0xe000000000000000 ~ 0xffffffffffffffff
14:    035bd02f     dsubu    k0,k0,k1        /* k0 = vaddr - 0xc000000000000000 */
18:    3c1b8396     lui    k1,0x8396
1c:    10000023     b    ac <main+0xac>        ----> @@@
20:    277ba000     addiu    k1,k1,-24576    /* 0xffffffff8395a000, module_pg_dir */

24:    001bd83c     dsll32    k1,k1,0x0        /* (0xffffffffc0000000 << 32) */
28:    035bd02f     dsubu    k0,k0,k1        /* k0 = vaddr - 0xc000000000000000 */
2c:    1000001f     b    ac <main+0xac>        ----> @@@
30:    3c1b8396     lui    k1,0x8396            /* 0xffffffff83960000, swapper_pg_current */
    ...
    ...
    ...
    @0xffffffff80000080
8c:    403a4000     dmfc0    k0,c0_badvaddr        

90:    0740001a     bltz    k0,fc <main+0xfc>    /* if badvaddr >= 0x80000000 00000000 branch */

94:   403b2000    dmfc0   k1,c0_context
98:   001bddfa    dsrl    k1,k1,0x17      /* get (smp_processor_id() << 3) (26-23), see asm/mmu_context.h */
9c:   3c1a8396    lui k0,0x8396           /* swapper_pg_current = 0xffffffff83960000 */
a0:   037ad82d    daddu   k1,k1,k0        /* looks like p_tmp = swapper_pg_current[smp_processor_id()] */
a4:   403a4000    dmfc0   k0,c0_badvaddr
a8:   df7b2000    ld k1,8192(k1)         /*
                                             * pgd = *((void *)(p_tmp + 8192), 8 bytes per pgd entry, pgd_current = 0xffffffff83962000,
                                             * 0x2000 = 8192
                                             * actually it's pgd = pgd_current[smp_processor_id()]
                                             */

@@@
ac:    001ad6fa     dsrl    k0,k0,0x1b        # >> 27
b0:    335a1ff8     andi    k0,k0,0x1ff8    /* get (vaddr[39:30] << 3), for indexing pgd */
b4:    037ad82d     daddu    k1,k1,k0        /* index pgd */

b8:    403a4000     dmfc0    k0,c0_badvaddr
bc:    df7b0000     ld    k1,0(k1)            /* get p_pmd */
c0:    001ad4ba     dsrl    k0,k0,0x12
c4:    335a0ff8     andi    k0,k0,0xff8        /* get (vaddr[29:21] << 3), for indexing pmd */
c8:    037ad82d     daddu    k1,k1,k0        /* index pmd */

cc:    403aa000     dmfc0    k0,c0_xcontext   
d0:    df7b0000     ld    k1,0(k1)            /* get p_pt */       
d4:    335a0ff0     andi    k0,k0,0xff0        /* get (va[20:13] << 4), actually use va[20:12] index the pt, va[12]=0, for indexing pt */
d8:    037ad82d     daddu    k1,k1,k0        /* index pt */

dc:    df7a0000     ld    k0,0(k1)            /* get even page addr */    <-------------
e0:    df7b0008     ld    k1,8(k1)            /* get odd page addr */

e4:    001ad1ba     dsrl    k0,k0,0x6        /* ignore the low 6 bits, it's for os */
e8:    409a1000     mtc0    k0,c0_entrylo0    /* tlb even page entry */
ec:    001bd9ba     dsrl    k1,k1,0x6        /* same as above */
f0:    409b1800     mtc0    k1,c0_entrylo1    /* tlb odd page entry */
f4:    42000006     tlbwr                    /* random write tlb */
f8:    42000018     eret                   

fc:    001ad8b8     dsll    k1,k0,0x2        # go here, vaddr is in xkphys or xkseg (>0x8000..0000)
                                            # 0x8000..0000 ~ 0xc000..0000 is xkphys, unmapped,
                                            # so vaddr > 0xc000..0000; vaddr << 2
100:    1000ffc2     b    c <main+0xc>

104:    00000000     nop
108:    00000000     nop

--------------------------------------------------------------

10c:    0000102d     move    v0,zero
110:    03c0e82d     move    sp,s8
114:    dfbe0000     ld    s8,0(sp)
118:    03e00008     jr    ra
11c:    67bd0010     daddiu    sp,sp,16


The listing above comes from the raw hex values dumped by the kernel, lightly processed, embedded in a C file, and disassembled with objdump.
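
An alternative to embedding the words in C is to write them out as a raw big-endian binary and let objdump disassemble that. The sketch below only does the packing. The helper name and the output file name are made up, and the objdump invocation mentioned afterwards is an assumption about a MIPS-capable binutils build, not what was actually done here.

```python
import struct

def words_to_bytes(hex_words):
    """Pack whitespace-separated 32-bit hex words as big-endian bytes."""
    return b"".join(struct.pack(">I", int(w, 16)) for w in hex_words.split())

# The tail of the refill handler, as printed in the oops Code line.
dump = "df7b0000 335a0ff0 037ad82d df7a0000 df7b0008 001ad1ba 409a1000 001bd9ba 409b1800"

blob = words_to_bytes(dump)
with open("handler.bin", "wb") as f:   # hypothetical output file
    f.write(blob)
print(len(blob), "bytes written")
```

A cross objdump could then be pointed at the file with something along the lines of `objdump -D -b binary -m mips:isa64 -EB handler.bin`. The exact machine name depends on the binutils build.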

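I always thought I understood the MIPS TLB well, yet faced with a really COOL problem I was still at a loss. The analytical ability I used to be so proud of seems to have slipped a lot, and I am thoroughly disappointed in myself.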

2009.2.11 01:15



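Went carefully back over the unclear parts of last night's analysis. It confirms that the mips64 TLB refill handler in the current kernel chooses which PGD to use based on the VA:

0xe0000000 00000000 ~ 0xffffffff ffffffff -----> uses module_pg_dir
0xc0000000 00000000 ~ 0xdfffffff ffffffff -----> uses swapper_pg_current
0x00000000 00000000 ~ 0x7fffffff ffffffff -----> uses pgd_current

Evidently, by design the kernel uses 0xc0000000 00000000 as the start of the vmalloc area and 0xe0000000 00000000 as the start of the memory used for modules.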
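
The selection rule can be written down as a small lookup. This is a sketch of the mapping as described in this diary, not of the generated handler itself. The function name is hypothetical, and the returned strings are simply the table names used in the notes above.

```python
def pgd_for_va(va):
    """Return the name of the PGD the refill handler uses for a 64-bit VA."""
    if va < 0x8000000000000000:
        return "pgd_current"            # user addresses, per-CPU current pgd
    if va < 0xC000000000000000:
        return None                     # xkphys: unmapped, never reaches the TLB
    if va < 0xE000000000000000:
        return "swapper_pg_current"     # vmalloc range
    return "module_pg_dir"              # module range

for va in (0x000000002AAAA000, 0xA800000126710000,
           0xC000000000000000, 0xE000000000000000):
    print("%016x -> %s" % (va, pgd_for_va(va)))
```

In the handler the same decision is made with two sign tests: bltz on BadVAddr checks bit 63, and bgez on BadVAddr shifted left by 2 checks bit 61.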

2009.2.11 11:30



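Compared this with the TLB refill handler used as a temporary workaround two nights ago and found that it does not use module_pg_dir at all:

0x00000000 00000000 ~ 0x7fffffff ffffffff -----> uses pgd_current
Every virtual address from 0xc0000000 00000000 upward uses swapper_pg_current (xkphys is a fixed mapping, so it is ignored)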

2009.2.11 15:40



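Felt a bit worn out in the afternoon, so I went for a soak in the water, floating on my back and letting the clear water cover my ears until the whole world went quiet...... Limbs drifting freely, searching for the most comfortable, most primal posture. I came out feeling light as a swallow. A pity it was only a feeling.

Back with a clear head, I exported the patches from the working old git tree (2.6.27.4, without the workaround patch) and tried them on the linux-mips git tree at 2.6.27.4 and 2.6.27.14. Neither showed the problem. Since that old git tree contains only some of the peripheral drivers, and recalling that the same kernel behaved differently on a board with different peripherals, I immediately suspected a driver.

So I took the finally checked-in patches and applied only the serial, Ethernet and PCI-X parts: no problem! Added SPI4.2: still no problem. Added IDE: the problem appears! I had vaguely suspected that some patch was changed inadvertently during a later review and was already bracing myself for a round of diffs. Mercifully, the search space has shrunk a great deal.

The whole detour happened because I paid too much attention to the register and stack information dumped by the kernel and got lost in my own curiosity.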

2009.02.11 22:02



It is now confirmed in detail that the problem is caused by a PCI IDE card driver (PDC202xx_new):

root@localhost:/root> insmod ./nls_cp437.ko
root@localhost:/root> ls
lib      raza_xlr_atx_64_be-linux-modules-WR3.0zz_standard.tar.bz2
nls_cp437.ko     syscall
pdc202xx_new.ko
root@localhost:/root> lsmod
Module                  Size Used by
nls_cp437               8064 0
root@localhost:/root> rmmod /nls_cp437.ko
root@localhost:/root> ls
lib      raza_xlr_atx_64_be-linux-modules-WR3.0zz_standard.tar.bz2
nls_cp437.ko     syscall
pdc202xx_new.ko
root@localhost:/root> insmod ./pdc202xx_new.ko

isa bounce pool size: 16 pages
*********************************************
cpu_12 received a bus/cache error
*********************************************
Bridge: Phys Addr = 0x0128000000, Device_AERR = 0x00000020
Bridge: The devices reporting AERR are:
    cpu 5
CPU: (XLR specific) Cache Error log = 0x0000000ae43bf601, Phy Addr = 0x0015c877e8
CPU: epc = 0xffffffff83423fc8, errorepc = 0xffffffff83422438, cacheerr = 0x00000000
Can not handle bus/cache error - Halting cpu
Unhandled kernel unaligned access[#1]:
Cpu 6586379
$ 0   : 0000000000000000 0000000000000010 0000000000000000 0000000000000000
$ 4   : ffffffff83423fb0 0000000000000000 ffffffff83431330 afa2001c00001921
$ 8   : 000000000380302d a800000124f1bcd0 a8000001266f2030 ffffffffffffffff
$12   : 000000001000dce1 000000001000001e a800000127807f70 a800000127807f80
$16   : a800000124f1bcd0 ffffffffde530100 afa2001c00001821 ae8200001000fff6
$20   : a8000001250041f0 0000000000000000 0000000000000060 000000000040b6fc
$24   : 0000000000000010 ffffffff8361d8e8                                 
$28   : a800000124f18000 a800000124f1bca0 0000000000000000 ffffffff83400a00
Hi    : fffffffffffffabd
Lo    : 300011aab40b6000
epc   : ffffffff83423fc8 do_ade+0x3c0/0x428
    Not tainted
ra    : ffffffff83400a00 ret_from_exception+0x0/0x20
Status: 1000dce3    <1>CPU 0 Unable to handle kernel paging request at virtual address 0000000000000002, epc == ffffffff83431298, ra == ffffffff83431394


root@localhost:/root> insmod ./pdc202xx_new.ko
isa bounce pool size: 16 pages
root@localhost:/root> lsmod
Module                  Size Used by
<<hung here>> ----> the kernel no longer responds to the SysRq interrupt.

2009.02.11 00:55



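Took a moment to fix another small problem: after the kernel upgrade (HRT is enabled by default), hwtimer became inaccurate. Tiejun helped look into it. A proper fix would be troublesome and would also affect the later HRT implementation, so I simply made vCPU0 use the same clock source as the others. That makes everything symmetric, and the hwtimer and HRT problems resolve themselves naturally. I like this simple design. It reminds me of the recently reinforced principle of "大道至简" (the greatest truths are the simplest). So COOL, orz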

2009.02.12 22:21

















