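【Posted】: 2017-03-20 10:55:51
【Question】:
When I insmod a ko that runs an infinite loop in a kernel thread, the kthread occupies a CPU core and that core can no longer run any other process. The NMI watchdog also fires repeatedly: "NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [pradeep:1403]".
Why?
The ko code is below (I copied it from the web, so it may contain errors; I am also aware that the ko cannot be removed with rmmod):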
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *task;
static int data;

/* Short busy loop; it never sleeps or yields. */
static void zg___aaa(void)
{
	int a = 0;

	while (a < 1000)
		++a;
}

/* Thread body: spins forever and never reaches a scheduling point. */
static int zg___thread_function(void *data)
{
	int var = 10;

	printk(KERN_INFO "IN THREAD FUNCTION\n");
	while (1) {
		zg___aaa();
	}
	return var;	/* never reached */
}

static int kernel_init(void)
{
	data = 20;
	printk(KERN_INFO "--------------------------------------------\n");
	task = kthread_run(zg___thread_function, &data, "pradeep");
	if (IS_ERR(task))	/* kthread_run() returns an ERR_PTR on failure */
		return PTR_ERR(task);
	printk(KERN_INFO "Kernel Thread : %s\n", task->comm);
	return 0;
}

/* Never completes: the thread does not check kthread_should_stop(). */
static void kernel_exit(void)
{
	kthread_stop(task);
}

module_init(kernel_init);
module_exit(kernel_exit);
MODULE_AUTHOR("SHRQ");
MODULE_LICENSE("GPL");
The kernel config file is too large to post, so I can only include the relevant entries:
~/build-linux$ cat ./.config | grep PREEMPT
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
I then moved the infinite loop into kernel_init, and the result was the same. Here is the error log from the kernel:
[ 4463.800938] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [insmod:1605]
[ 4463.800943] Modules linked in: testko(OE+) xt_CHECKSUM iptable_mangle .......
[ 4463.800986] CPU: 0 PID: 1605 Comm: insmod Tainted: G OEL 4.11.0-rc2+ #14
[ 4463.800987] Hardware name: Hewlett-Packard /304Bh, BIOS 786H1 v01.13 07/14/2011
[ 4463.800988] task: ffff89c378773800 task.stack: ffffb18883264000
[ 4463.800992] RIP: 0010:kernel_init+0x2f/0x40 [testko]
[ 4463.800993] RSP: 0018:ffffb18883267cc8 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff10
[ 4463.800994] RAX: 0000000000000012 RBX: ffffffffc06d6030 RCX: 0000000000000006
[ 4463.800995] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff89c39bc0e0a0
[ 4463.800995] RBP: ffffb18883267cc8 R08: 0000000000000000 R09: 000000000000030f
[ 4463.800996] R10: 0000000000000004 R11: 0000000000000000 R12: ffff89c3837038c0
[ 4463.800996] R13: 0000000000000000 R14: ffff89c37862e5a0 R15: ffffb18883267eb0
[ 4463.800997] FS: 00007feb6e1c45c0(0000) GS:ffff89c39bc00000(0000) knlGS:0000000000000000
[ 4463.800998] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4463.800999] CR2: 00007feb6d717450 CR3: 000000020edc0000 CR4: 00000000000006f0
[ 4463.801000] Call Trace:
[ 4463.801006] do_one_initcall+0x51/0x1b0
[ 4463.801009] ? __vunmap+0x85/0xd0
[ 4463.801013] ? kmem_cache_alloc_trace+0x15c/0x1c0
[ 4463.801014] ? kfree+0x13b/0x180
[ 4463.801016] do_init_module+0x60/0x1fa
[ 4463.801019] load_module+0x22dd/0x2870
[ 4463.801021] ? __symbol_put+0x40/0x40
[ 4463.801022] SYSC_finit_module+0x96/0xd0
[ 4463.801024] SyS_finit_module+0xe/0x10
[ 4463.801027] entry_SYSCALL_64_fastpath+0x1a/0xa9
[ 4463.801028] RIP: 0033:0x7feb6d6aebf9
[ 4463.801028] RSP: 002b:00007ffca2026c48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 4463.801030] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007feb6d6aebf9
[ 4463.801030] RDX: 0000000000000000 RSI: 0000558f2ec2c186 RDI: 0000000000000003
[ 4463.801031] RBP: 0000000000000086 R08: 0000000000000000 R09: 00007feb6d96fe80
[ 4463.801031] R10: 0000000000000003 R11: 0000000000000246 R12: 0000558f2fda0130
[ 4463.801032] R13: 0000000000000001 R14: 0000000000000000 R15: 00007ffca2025acc
......
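My question: when the infinite loop runs in kernel mode, why can't the preemptive kernel scheduler preempt it and switch to another thread? When the same infinite loop runs in user mode, the scheduler works normally.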
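A possible explanation, offered as a sketch rather than a confirmed diagnosis: the config above has CONFIG_PREEMPT_VOLUNTARY=y and CONFIG_PREEMPT not set. In that mode, code running in kernel mode is only rescheduled when it sleeps or reaches an explicit preemption point such as cond_resched(), whereas user-mode code can be preempted on return from an interrupt. A loop that never reaches such a point would therefore keep the CPU. The module below shows the loop with an explicit preemption point and a stop check added. The names yielding_thread_fn, yield_demo_init, yield_demo_exit and the thread name "yield_demo" are hypothetical and not part of the original ko; building it needs a kernel build tree.

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/err.h>

static struct task_struct *task;

/* Hypothetical thread body: same busy work, but it yields and can be stopped. */
static int yielding_thread_fn(void *unused)
{
	/* kthread_should_stop() becomes true once kthread_stop() is called. */
	while (!kthread_should_stop()) {
		int a = 0;

		while (a < 1000)	/* bounded chunk of work */
			++a;

		/* Explicit preemption point: lets the scheduler run another task. */
		cond_resched();
	}
	return 0;
}

static int __init yield_demo_init(void)
{
	task = kthread_run(yielding_thread_fn, NULL, "yield_demo");
	if (IS_ERR(task))
		return PTR_ERR(task);
	return 0;
}

static void __exit yield_demo_exit(void)
{
	/* Returns once the thread sees kthread_should_stop() and exits. */
	kthread_stop(task);
}

module_init(yield_demo_init);
module_exit(yield_demo_exit);
MODULE_LICENSE("GPL");
```

Under this assumption the thread still uses as much CPU as it is given, but other tasks on the same core, including the watchdog thread, get a chance to run, and rmmod can complete because the loop checks kthread_should_stop().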
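【Comments】:
-
The kernel version is 4.11-rc.
-
Can you paste the full kernel config?
-
OK, I will paste the whole config file.
-
I took the kernel config file from Fedora 24; it should have been adjusted for the 4.11-rc2 kernel.
Tags: linux-kernel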