kernel: scheduling while atomic

Error message:

[ 68.382231] BUG: scheduling while atomic: Dequeue buffers/4498/0x00000002
[ 68.389418] kernel BUG at mm/vmalloc.c:2068!
[ 68.393689] Internal error: Oops - BUG: 0 1 PREEMPT SMP
[ 68.399171] Modules linked in: wlan_hasting(O) wlan_cnss_core_pcie(O)
[ 68.405614] CPU: 4 PID: 4498 Comm: Dequeue buffers Tainted: G W O 5.10.110-ge2d35e47d4fc-ab4 #3
[ 68.422827] pstate: 00400009 (nzcv daif +PAN UAO -TCO BTYPE=-)
[ 68.428834] pc : __get_vm_area_node+0x188/0x1b0
[ 68.433359] lr : __vmalloc_node_range+0x6c/0x298
[ 68.437971] sp : ffff8000274cb760
[ 68.441281] x29: ffff8000274cb760 x28: 00000000fffffff3
[ 68.446588] x27: 0000000000000001 x26: ffff000d13f16980
[ 68.451895] x25: ffff000c89173680 x24: 0068000000000f03
[ 68.457201] x23: ffff8000274cb938 x22: 0000000000001fc0
[ 68.462509] x21: 00000000ffffffff x20: ffff8000116d7000
[ 68.467815] x19: 0000000000000cc0 x18: ffff800011fc1f50
[ 68.473122] x17: ffff80001126d074 x16: ffff80001126d074
[ 68.478429] x15: 0000000000000001 x14: 0000000000000002
[ 68.483737] x13: 0000000000000000 x12: 0000000000000a20
[ 68.489044] x11: 0000000f08ff8000 x10: 0000000000000001
[ 68.494351] x9 : 0000000000000000 x8 : 00000000ffffffff
[ 68.499658] x7 : ffff800010bda2e4 x6 : 0000000000000cc0
[ 68.504967] x5 : 00000000ffffffff x4 : fffffdffbfff0000
[ 68.510274] x3 : ffff800010000000 x2 : 0000000000000022
[ 68.515581] x1 : 0000000000000001 x0 : 0000000000001fc0
[ 68.520888] Call trace:
[ 68.523332] __get_vm_area_node+0x188/0x1b0
[ 68.527510] __vmalloc_node_range+0x6c/0x298
[ 68.531775] __vmalloc_node+0x58/0x68
[ 68.535433] vmalloc+0x38/0x50
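
The last field of the "scheduling while atomic" line (comm/pid/value) is the task's preempt_count at the time of the report. As a reading aid, here is a minimal user-space sketch that splits that value into its counters. The bit layout (8 bits preempt depth, 8 bits softirq, 4 bits hardirq, 4 bits NMI) is my understanding of include/linux/preempt.h in 5.10 and should be checked against your tree. The names decode_preempt_count and print_preempt_count are hypothetical helpers, not kernel APIs.

```c
#include <stdint.h>
#include <stdio.h>

/* Counters packed into preempt_count (assumed 5.10 layout, see lead-in). */
struct preempt_fields {
    unsigned preempt;  /* bits 0-7:   preempt_disable() nesting depth */
    unsigned softirq;  /* bits 8-15:  softirq count */
    unsigned hardirq;  /* bits 16-19: hardirq nesting */
    unsigned nmi;      /* bits 20-23: NMI nesting */
};

/* Split a raw preempt_count value, as printed in the BUG line, into fields. */
struct preempt_fields decode_preempt_count(uint32_t pc)
{
    struct preempt_fields f;

    f.preempt = pc & 0xffu;
    f.softirq = (pc >> 8) & 0xffu;
    f.hardirq = (pc >> 16) & 0xfu;
    f.nmi     = (pc >> 20) & 0xfu;
    return f;
}

/* Print the decoded fields on one line. */
void print_preempt_count(uint32_t pc)
{
    struct preempt_fields f = decode_preempt_count(pc);

    printf("0x%08x: preempt=%u softirq=%u hardirq=%u nmi=%u\n",
           (unsigned)pc, f.preempt, f.softirq, f.hardirq, f.nmi);
}
```

Calling print_preempt_count(0x00000002) for the log above gives a preempt depth of 2 with all interrupt counters at zero. That points at preemption being disabled by the code path itself (for example a held spinlock) rather than at interrupt context. As far as I understand, schedule() disables preemption once on its own before the check, so a value of 2 is consistent with a single spinlock being held.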

Causes:
"Scheduling while atomic" is reported when schedule() is called while the current task is in atomic context, that is, while preemption is disabled (for example with a spinlock held or inside an interrupt handler). Code in such a section must not sleep or otherwise give up the CPU, so schedulable code ends up nested inside a non-schedulable section.

A typical example is sleeping inside a section protected by a spinlock. Taking a sleeping lock (semaphore, mutex, ...) inside spinlock-protected code triggers it for the same reason, as does any call that may block, such as the vmalloc() in the trace above. Using spinlocks in user space is also reported to provoke this behaviour.

Suspected code:

static void _enable_clock(void)
{
    unsigned long flags;
 
    spin_lock_irqsave(&clk_lock, flags);
    do {
        // clock
        if( 0 != devm_clk_bulk_get(device, MAX_CLK_SOURCES, clks)){
            pr_err("enable_clock: devm_clk_bulk_get failed!\n");
            break;
        }
        if ( 0 != clk_bulk_prepare(MAX_CLK_SOURCES, clks)) {
            pr_err("enable_clock: clk_bulk_prepare error!\n");
            break;
        }
        if ( 0 != set_clock_rate(0) ) {
            pr_warn("enable_clock: clk_set_rate error!\n");
        }
        if( 0 != clk_bulk_enable(MAX_CLK_SOURCES, clks)) {
            pr_err("enable_clock: clk_bulk_enable error!\n");
            break;
        }
 
        // reset
        if (rst_ctrls == NULL) {
            rst_ctrls = devm_reset_control_array_get(device, true, true);
            if (PTR_ERR(rst_ctrls) == -EPROBE_DEFER) {
                rst_ctrls = NULL;
                pr_err("enable_clock: devm_reset_control_array_get error!\n");
                break;
            }
        }
        if (0 != reset_control_deassert(rst_ctrls)) {
            pr_err("enable_clock: reset_control_deassert error!\n");
            break;
        }
        // wait (32 + 128) cycles for the slowest clock (pclk 200M) before ready, it is about 1us
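        /* Originally flagged as "possibly go into sleep". udelay() busy-waits and
         * does not sleep; the calls above that can sleep are more likely suspects:
         * devm_clk_bulk_get(), clk_bulk_prepare(), devm_reset_control_array_get(). */
        udelay(1);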
        is_clk_on = 1;
        pr_info("g2d: _enable_clock\n");
    }while(0);
 
    spin_unlock_irqrestore(&clk_lock, flags);
}
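
One possible way out, sketched below under assumptions, is to keep every call that may sleep outside the spinlock and to protect only the is_clk_on flag with it. This is a kernel fragment, not a tested patch. It reuses clks, rst_ctrls, clk_lock, is_clk_on and MAX_CLK_SOURCES from the function above. It assumes the devm_clk_bulk_get() and devm_reset_control_array_get() lookups have been moved to probe(), and that the function is only called from process context. clk_mutex and _enable_clock_sleepable are hypothetical names, and the set_clock_rate() step is left out for brevity.

```c
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/mutex.h>
#include <linux/reset.h>
#include <linux/spinlock.h>

/* Hypothetical: serializes the enable path; a mutex may be held while sleeping. */
static DEFINE_MUTEX(clk_mutex);

static int _enable_clock_sleepable(void)
{
	unsigned long flags;
	int ret;

	mutex_lock(&clk_mutex);

	/* May sleep, so it must not run under clk_lock. */
	ret = clk_bulk_prepare(MAX_CLK_SOURCES, clks);
	if (ret)
		goto err_unlock;

	ret = clk_bulk_enable(MAX_CLK_SOURCES, clks);
	if (ret)
		goto err_unprepare;

	ret = reset_control_deassert(rst_ctrls);
	if (ret)
		goto err_disable;

	/* Busy-wait about 1us for the slowest clock, as in the original. */
	udelay(1);

	/* Only the flag update needs the IRQ-safe spinlock. */
	spin_lock_irqsave(&clk_lock, flags);
	is_clk_on = 1;
	spin_unlock_irqrestore(&clk_lock, flags);

	mutex_unlock(&clk_mutex);
	return 0;

err_disable:
	clk_bulk_disable(MAX_CLK_SOURCES, clks);
err_unprepare:
	clk_bulk_unprepare(MAX_CLK_SOURCES, clks);
err_unlock:
	mutex_unlock(&clk_mutex);
	return ret;
}
```

If the clocks really have to be switched from atomic context, a common split is to call clk_bulk_prepare() once at probe time and to use only clk_bulk_enable() / clk_bulk_disable() under the spinlock, since the enable and disable calls are designed to be usable from atomic context.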