A Discussion of CPU Load

Original link: http://blog.chinaunix.net/uid-12693781-id-368837.html

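Abstract: This article pins down the definition of CPU load, helps administrators choose a CPU load threshold, and suggests likely causes of excessive load, so that servers keep running normally.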

1. Definition of CPU load

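First, the definition of CPU load. The load of a single-core CPU can generally be pictured as a one-lane bridge. A value of 1 means the CPU is just keeping up: every car gets across, no cars are waiting outside, and the bridge flows freely. A value above 1 means cars are waiting to get onto the bridge; a value below 1 means cars pass through quickly. A single-core CPU can therefore handle 1 task at a time. On a multi-core system, the number of tasks that can be processed in parallel is (number of CPUs) * (cores per CPU), and the load should preferably not exceed that number. For example, on a single 4-core CPU the practical ceiling for cpu_load is 4. The load must not stay above 4 for long; otherwise tasks are not handled in time, the load keeps accumulating, and the system slows down.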
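
The rule of thumb above can be sketched in a few lines of C. This is only an illustration, not part of the original article: the function names `load_per_cpu`, `is_overloaded` and `read_loadavg` are hypothetical, and the sketch assumes a Linux system where `/proc/loadavg` begins with the 1-, 5- and 15-minute averages.

```c
#include <assert.h>
#include <stdio.h>

/* Load normalised by the number of cores; 1.0 means "exactly full".
 * (Hypothetical helper, named for this example only.) */
double load_per_cpu(double load, int ncpus)
{
    assert(ncpus > 0);
    return load / ncpus;
}

/* Returns 1 when the load exceeds what ncpus cores can run in parallel. */
int is_overloaded(double load, int ncpus)
{
    return load_per_cpu(load, ncpus) > 1.0;
}

/* Reads the three averages from a /proc/loadavg-style file.
 * Returns 0 on success and -1 on failure. */
int read_loadavg(const char *path, double out[3])
{
    FILE *f = fopen(path, "r");
    int n;

    if (!f)
        return -1;
    n = fscanf(f, "%lf %lf %lf", &out[0], &out[1], &out[2]);
    fclose(f);
    return n == 3 ? 0 : -1;
}
```

With a load of 5.0 on a 4-core machine, `is_overloaded(5.0, 4)` returns 1, while a load of 3.5 returns 0. A monitoring script would more reasonably check the 5- or 15-minute figure than the 1-minute one, since the text warns only against exceeding the limit for a long time.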

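On most Unix systems the load counts only processes in the running and runnable states. Linux differs: it also counts processes in uninterruptible sleep. When such processes are blocked on I/O and cannot run, they can raise the reported CPU load significantly. The load is therefore computed differently on Unix and on Linux, which needs particular attention when choosing monitoring thresholds.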
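
The difference between the two counting rules can be shown with a small sketch. It assumes the state letters that ps(1) prints on Linux (R for running or runnable, D for uninterruptible sleep, S for interruptible sleep); the function names and the `linux_style` flag are made up for this example.

```c
#include <stddef.h>

/* Returns 1 if a task in the given state counts toward the load.
 * linux_style != 0: uninterruptible sleep (D) is counted as well.
 * linux_style == 0: only running/runnable tasks (R) are counted. */
int counts_toward_load(char state, int linux_style)
{
    if (state == 'R')
        return 1;
    if (state == 'D')
        return linux_style ? 1 : 0;
    return 0;
}

/* Counts the tasks in states[0..n-1] that contribute to the load. */
int count_load_tasks(const char *states, size_t n, int linux_style)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += counts_toward_load(states[i], linux_style);
    return total;
}
```

For five tasks in the states `RRDDS`, the traditional rule counts 2 and the Linux rule counts 4, so two processes stuck on I/O double the figure without using any CPU time.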

The kernel source shows where the load figure comes from and gives the complete calculation. The code below is excerpted from kernel/sched.c in kernel 2.6.32 and computes the CPU load average.

/* Variables and functions for calc_load */
static atomic_long_t calc_load_tasks;
static unsigned long calc_load_update;
unsigned long avenrun[3];
EXPORT_SYMBOL(avenrun);
 
/**
 * get_avenrun - get the load average array
 * @loads: pointer to dest load array
 * @offset: offset to add
 * @shift: shift count to shift the result left
 *
 * These values are estimates at best, so no need for locking.
 */
void get_avenrun(unsigned long *loads, unsigned long offset, int shift)
{
    loads[0] = (avenrun[0] + offset) << shift;
    loads[1] = (avenrun[1] + offset) << shift;
    loads[2] = (avenrun[2] + offset) << shift;
}
 
static unsigned long
calc_load(unsigned long load, unsigned long exp, unsigned long active)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    return load >> FSHIFT;
}
 
/*
 * calc_load - update the avenrun load estimates 10 ticks after the
 * CPUs have updated calc_load_tasks.
 */
void calc_global_load(void)
{
    unsigned long upd = calc_load_update + 10;
    long active;

    if (time_before(jiffies, upd))
        return;

    active = atomic_long_read(&calc_load_tasks);
    active = active > 0 ? active * FIXED_1 : 0;

    avenrun[0] = calc_load(avenrun[0], EXP_1, active);
    avenrun[1] = calc_load(avenrun[1], EXP_5, active);
    avenrun[2] = calc_load(avenrun[2], EXP_15, active);

    calc_load_update += LOAD_FREQ;
}
 
/*
 * Either called from update_cpu_load() or from a cpu going idle
 */
static void calc_load_account_active(struct rq *this_rq)
{
    long nr_active, delta;

    /* runnable tasks on this CPU's run queue */
    nr_active = this_rq->nr_running;
    /* plus tasks in uninterruptible sleep */
    nr_active += (long) this_rq->nr_uninterruptible;

    if (nr_active != this_rq->calc_load_active) {
        delta = nr_active - this_rq->calc_load_active;
        this_rq->calc_load_active = nr_active;
        atomic_long_add(delta, &calc_load_tasks);
    }
}
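
The arithmetic in calc_load() is a fixed-point exponential moving average. It can be tried outside the kernel with the user-space sketch below. The constants are not shown in the excerpt above; the values used here are the ones I understand include/linux/sched.h to define for kernels of this generation (FSHIFT = 11, EXP_1 = 1884, EXP_5 = 2014, EXP_15 = 2037, each EXP value being exp(-5s/period) scaled by 2048), so treat them as an assumption and check them against your own tree. `load_int` and `load_frac` are intended to mirror the kernel's LOAD_INT and LOAD_FRAC macros, and `simulate` is a hypothetical helper for this example.

```c
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL             /* decay factor, 1-minute average */
#define EXP_5    2014UL             /* decay factor, 5-minute average */
#define EXP_15   2037UL             /* decay factor, 15-minute average */

/* Same arithmetic as the kernel function: old value decayed by exp,
 * plus the current sample weighted by (1 - exp), all in fixed point. */
unsigned long calc_load(unsigned long load, unsigned long exp,
                        unsigned long active)
{
    load *= exp;
    load += active * (FIXED_1 - exp);
    return load >> FSHIFT;
}

/* Integer part of a fixed-point load value. */
unsigned long load_int(unsigned long x)
{
    return x >> FSHIFT;
}

/* Two-digit decimal fraction of a fixed-point load value. */
unsigned long load_frac(unsigned long x)
{
    return load_int((x & (FIXED_1 - 1)) * 100);
}

/* Applies `steps` updates starting from zero load, with a constant
 * number of active tasks (hypothetical helper for this example). */
unsigned long simulate(unsigned long exp, unsigned long nr_active, int steps)
{
    unsigned long load = 0;
    unsigned long active = nr_active * FIXED_1;

    for (int i = 0; i < steps; i++)
        load = calc_load(load, exp, active);
    return load;
}
```

Starting from an idle system with one active task, a single update gives `simulate(EXP_1, 1, 1) == 164`, which formats as 0.08 via `load_int` and `load_frac`. Because calc_global_load() advances calc_load_update by LOAD_FREQ (about 5 seconds) on each pass, the 1-minute figure climbs toward 1.00 only gradually, and the 5- and 15-minute figures climb more slowly still.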

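The bookkeeping in calc_load_account_active() is a delta scheme: each run queue remembers the count it last reported and adds only the difference to the global total. The sketch below imitates this in user space with C11 atomics. `struct rq_sim`, `account_active` and `global_active` are hypothetical names for this example, and the kernel's atomic_long_t is replaced by `atomic_long` from <stdatomic.h>.

```c
#include <stdatomic.h>

/* Cut-down stand-in for the kernel's per-CPU run queue. */
struct rq_sim {
    long nr_running;          /* runnable tasks */
    long nr_uninterruptible;  /* tasks in uninterruptible sleep */
    long calc_load_active;    /* count last folded into the global total */
};

/* Global number of tasks that contribute to the load. */
static atomic_long calc_load_tasks;

/* Folds this queue's change since its last report into the global
 * total and returns the delta that was applied. */
long account_active(struct rq_sim *rq)
{
    long nr_active = rq->nr_running + rq->nr_uninterruptible;
    long delta = nr_active - rq->calc_load_active;

    if (delta != 0) {
        rq->calc_load_active = nr_active;
        atomic_fetch_add(&calc_load_tasks, delta);
    }
    return delta;
}

/* Reads the global total, the value that calc_global_load() samples. */
long global_active(void)
{
    return atomic_load(&calc_load_tasks);
}
```

Each CPU touches the shared counter only when its own count has changed, and it never has to read another CPU's run queue. The global update routine can then obtain the total with a single atomic read.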