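MIPS Linux SMP boot flow
1. Taking the Cavium Octeon multi-core MIPS as an example: at boot, every core enters at kernel_entry, and the paths fork inside the kernel_entry_setup macro. The boot core continues on, while each of the other cores spins in a polling loop until the boot core wakes it: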
#ifdef CONFIG_SMP
rdhwr v0, $0 # v0 = id of the current core
bne a2, zero, octeon_main_processor # the bootloader sets a2 to 1 on the boot core
nop
#
# All cores other than the master need to wait here for SMP bootstrap
# to begin
#
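PTR_LA t0, octeon_processor_boot # Global variable: once the boot core has finished the
# necessary initialization and brings up the other cores one by one, it writes
# the number of the next core to bring up into this variable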
octeon_spin_wait_boot:
LONG_L t1, (t0) # read the value of octeon_processor_boot
bne t1, v0, octeon_spin_wait_boot # Keep looping if it isn't me :)
nop
PTR_LA t0, octeon_processor_cycle # It's me: sync the cycle counter so it agrees with the other cores
LONG_L t0, (t0)
daddu t0, 122 # Approximately how many cycles we will be off
dmtc0 t0, CP0_CYCLE_COUNTER
PTR_LA t0, octeon_processor_gp # When the boot core cpu_up()s a core, it first forks an idle
# kernel thread, then points octeon_processor_gp at the idle thread's thread_info structure
LONG_L gp, (t0)
PTR_LA t0, octeon_processor_sp # octeon_processor_sp points to the idle thread's kernel stack
LONG_L sp, (t0)
LONG_S zero, (t0) # The boot core keeps reading octeon_processor_sp to tell whether
# the core being cpu_up()ed has finished booting. Setting it to 0 means
# "I am up" :)
#ifdef __OCTEON__
syncw
#else
sync
#endif
b smp_bootstrap # Jump to the normal Linux SMP entry point
nop
octeon_main_processor:
#endif
.endm
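The handshake in the macro above can be modeled in plain C. The following is only a single-threaded sketch of the protocol, not kernel code: the struct and function names (boot_mailbox, core_state, boot_core_release, secondary_poll, boot_core_sees_ack) and the NO_CORE value are made up for illustration, and the memory barriers that real hardware needs (the syncw/sync in the listing) are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_CORE 0xffu   /* assumed "matches no core" mailbox value */

/* Shared mailbox; fields are named after the variables in the listing. */
struct boot_mailbox {
    uint64_t processor_boot;  /* id of the core allowed to leave the spin loop */
    uint64_t processor_sp;    /* kernel stack top of that core's idle thread */
    uint64_t processor_gp;    /* thread_info pointer of that core's idle thread */
};

/* Register state a secondary core ends up with (hypothetical model). */
struct core_state {
    uint64_t id, sp, gp;
    bool booted;
};

/* Boot-core side: publish sp/gp first, then the id that releases the spinner. */
static void boot_core_release(struct boot_mailbox *mb, uint64_t id,
                              uint64_t sp, uint64_t gp)
{
    mb->processor_sp = sp;
    mb->processor_gp = gp;
    mb->processor_boot = id;  /* written last: this is what the spinner polls */
}

/* Secondary side: one iteration of octeon_spin_wait_boot.
 * Returns true once this core has been released. */
static bool secondary_poll(struct boot_mailbox *mb, struct core_state *core)
{
    if (mb->processor_boot != core->id)
        return false;         /* not me: keep looping */
    core->gp = mb->processor_gp;
    core->sp = mb->processor_sp;
    mb->processor_sp = 0;     /* acknowledge: tells the boot core "I am up" */
    core->booted = true;
    return true;
}

/* Boot-core side: the secondary is up once it has zeroed processor_sp. */
static bool boot_core_sees_ack(const struct boot_mailbox *mb)
{
    return mb->processor_sp == 0;
}
```

Writing the core id last matters: a secondary that sees its own id must already be able to read valid sp and gp values.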
Note that in Linux the generic SMP entry point is smp_bootstrap, defined in [arch/mips/kernel/head.S]:
#ifdef CONFIG_SMP
/*
* SMP slave cpus entry point. Board specific code for bootstrap calls this
* function after setting up the stack and gp registers.
*/
NESTED(smp_bootstrap, 16, sp)
#ifdef CONFIG_MIPS_MT_SMTC
/*
* Read-modify-writes of Status must be atomic, and this
* is one case where CLI is invoked without EXL being
* necessarily set. The CLI and setup_c0_status will
* in fact be redundant for all but the first TC of
* each VPE being booted.
*/
DMT 10 # dmt t2 /* t0, t1 are used by CLI and setup_c0_status() */
jal mips_ihb
#endif /* CONFIG_MIPS_MT_SMTC */
setup_c0_status_sec
smp_slave_setup # extra platform-specific work for slave cores; usually empty
#ifdef CONFIG_MIPS_MT_SMTC
andi t2, t2, VPECONTROL_TE
beqz t2, 2f
EMT # emt
2:
#endif /* CONFIG_MIPS_MT_SMTC */
j start_secondary
END(smp_bootstrap)
#endif /* CONFIG_SMP */
.macro setup_c0_status_sec
#ifdef CONFIG_64BIT
setup_c0_status ST0_KX ST0_BEV
#else
setup_c0_status 0 ST0_BEV
#endif
.endm
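What setup_c0_status ST0_KX ST0_BEV does:
It clears the low 5 bits of cp0_status (IE, EXL, ERL and the two KSU bits), sets the CU0 and KX bits, and clears the BEV bit. Note the difference from setup_c0_status_pri, which does not clear BEV: when the boot core runs it, the exception entry points have not been set up yet, so BEV is only cleared later, in trap_init(). By the time a slave core comes up, the system exception entry points are already initialized, so BEV can be cleared right away.
It then jumps straight to start_secondary(), the first C function a slave core executes. A new era begins :)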
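The bit manipulation done by setup_c0_status can be written out in C. This is a sketch, not the kernel macro itself: the or-then-xor form is assumed to match the macro, ST0_LOW5 and the two wrapper functions are names invented here, and the bit values are the standard MIPS Status register layout.

```c
#include <stdint.h>

/* MIPS CP0 Status register bits. ST0_LOW5 is a name made up for this sketch. */
#define ST0_LOW5 0x0000001fu  /* IE | EXL | ERL | KSU */
#define ST0_KX   0x00000080u
#define ST0_BEV  0x00400000u
#define ST0_CU0  0x10000000u

/* Set CU0 and the bits in `set`; clear the low 5 bits and the bits in `clr`. */
static uint32_t setup_c0_status(uint32_t status, uint32_t set, uint32_t clr)
{
    status |= ST0_CU0 | set | ST0_LOW5 | clr;  /* force every affected bit to 1 */
    status ^= ST0_LOW5 | clr;                  /* then flip the unwanted ones to 0 */
    return status;
}

/* 64-bit boot core variant: BEV is left untouched. */
static uint32_t setup_c0_status_pri64(uint32_t status)
{
    return setup_c0_status(status, ST0_KX, 0);
}

/* 64-bit slave core variant: BEV is cleared as well. */
static uint32_t setup_c0_status_sec64(uint32_t status)
{
    return setup_c0_status(status, ST0_KX, ST0_BEV);
}
```

Starting from a reset-like value with BEV and ERL set (0x00400004), the slave variant yields 0x10000080 and the boot core variant yields 0x10400080, so the only difference is the BEV bit.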
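2. In some multi-core implementations only one core (the boot core) enters kernel_entry at first; the other cores are either not powered up or sitting in a busy loop. The boot core then brings them up one at a time through cpu_up(), which calls prom_boot_secondary(); that function points the slave core's entry at smp_bootstrap and, of course, also sets up the slave core's sp and gp.
The details vary by platform. Broadcom's implementation relies on CFE (Common Firmware Environment) to set sp, gp and the entry address (smp_bootstrap), and then start the slave core: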
[arch/mips/sibyte/cfe/smp.c]
/*
* Setup the PC, SP, and GP of a secondary processor and start it
* running!
*/
void __cpuinit prom_boot_secondary(int cpu, struct task_struct *idle)
{
int retval;
retval = cfe_cpu_start(cpu_logical_map(cpu), &smp_bootstrap,
__KSTK_TOS(idle),
(unsigned long)task_thread_info(idle), 0);
if (retval != 0)
printk("cfe_start_cpu(%i) returned %i\n" , cpu, retval);
}
[arch/mips/sibyte/cfe/cfe_api.c]
#if defined(CFE_API_cpu_start) || defined(CFE_API_ALL)
int cfe_cpu_start(int cpu, void (*fn) (void), long sp, long gp, long a1)
{
cfe_xiocb_t xiocb;
xiocb.xiocb_fcode = CFE_CMD_FW_CPUCTL;
xiocb.xiocb_status = 0;
xiocb.xiocb_handle = 0;
xiocb.xiocb_flags = 0;
xiocb.xiocb_psize = sizeof(xiocb_cpuctl_t);
xiocb.plist.xiocb_cpuctl.cpu_number = cpu;
xiocb.plist.xiocb_cpuctl.cpu_command = CFE_CPU_CMD_START;
xiocb.plist.xiocb_cpuctl.gp_val = gp;
xiocb.plist.xiocb_cpuctl.sp_val = sp;
xiocb.plist.xiocb_cpuctl.a1_val = a1;
xiocb.plist.xiocb_cpuctl.start_addr = (long) fn;
cfe_iocb_dispatch(&xiocb);
return xiocb.xiocb_status;
}
#endif /* CFE_API_cpu_start || CFE_API_ALL */
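This is also easy to implement inside the kernel: place a small piece of code in a memory region, write these values there, and point the slave core's exception entry at that code (which, of course, ultimately jumps to smp_bootstrap); then write the slave core's reset control register. Most multi-core platforms provide such a mechanism, and Cavium Octeon does too; its default implementation simply does not use it.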
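As a rough illustration of that in-kernel approach, the sketch below models the steps against plain memory. Everything in it is hypothetical: secondary_boot_params, soc_regs, MAX_CORES, kick_secondary and the register layout do not correspond to any real SoC, and real code would use the platform's MMIO accessors and memory barriers.

```c
#include <stdint.h>

#define MAX_CORES 4  /* hypothetical core count */

/* Hypothetical parameter block read by the trampoline stub. */
struct secondary_boot_params {
    uint64_t entry;  /* where the stub finally jumps, i.e. smp_bootstrap */
    uint64_t sp;     /* kernel stack top of the idle thread */
    uint64_t gp;     /* thread_info pointer of the idle thread */
};

/* Hypothetical SoC registers, modeled as ordinary memory. */
struct soc_regs {
    uint64_t boot_vector[MAX_CORES];  /* per-core exception/reset entry */
    uint32_t reset_hold;              /* bit n set = core n held in reset */
};

/* Fill in the parameters, point the core at the stub, release it from reset.
 * Returns 0 on success, -1 if cpu is out of range. */
static int kick_secondary(struct soc_regs *regs,
                          struct secondary_boot_params *params, int cpu,
                          uint64_t trampoline, uint64_t entry,
                          uint64_t sp, uint64_t gp)
{
    if (cpu < 0 || cpu >= MAX_CORES)
        return -1;
    params->entry = entry;
    params->sp = sp;
    params->gp = gp;
    regs->boot_vector[cpu] = trampoline;  /* stub loads sp/gp, jumps to entry */
    regs->reset_hold &= ~(1u << cpu);     /* last step: let the core run */
    return 0;
}
```

The reset release comes last so that the core never starts executing before its parameters and entry vector are in place.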
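Addendum: Octeon reads the coremask passed in by the bootloader from octeon_boot_desc_ptr->core_mask and sets the corresponding bits in the CPU bitmap. This has no direct connection to the kernel parameter numcores=; presumably the bootloader computes the coremask from the value of numcores.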
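The coremask handling can be sketched as follows. Both functions are illustrative only: coremask_to_cpu_map and numcores_to_coremask are invented names, placing the boot core first as logical CPU 0 is an assumption, and the numcores mapping simply encodes the guess above that the bootloader picks the lowest N cores.

```c
#include <stdint.h>

/* Fill cores[] with the physical core ids set in coremask, boot core first
 * (assumed to become logical CPU 0). Requires 0 <= boot_core < 32.
 * Returns the number of entries written, at most max. */
static int coremask_to_cpu_map(uint32_t coremask, int boot_core,
                               int *cores, int max)
{
    int n = 0;

    if (max > 0 && (coremask & (1u << boot_core)))
        cores[n++] = boot_core;
    for (int id = 0; id < 32 && n < max; id++) {
        if (id != boot_core && (coremask & (1u << id)))
            cores[n++] = id;  /* remaining cores in ascending order */
    }
    return n;
}

/* Guess at the bootloader side: numcores=N selects the N lowest cores. */
static uint32_t numcores_to_coremask(unsigned numcores)
{
    return numcores >= 32 ? 0xffffffffu : (1u << numcores) - 1u;
}
```

For example, a coremask of 0xD with core 2 as the boot core gives the map {2, 0, 3}, and numcores=4 gives a coremask of 0xF under the stated guess.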