How do I get the total CPU usage of a Slurm job?

Asked: 2014-08-03 20:38:52

Tags: slurm

I am trying to get the total CPU time used by each job. I have found a few promising sacct fields, but which one should I use?

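According to the documentation (https://computing.llnl.gov/linux/slurm/sacct.html), TotalCPU is the sum of SystemCPU and UserCPU, but it does not account for child processes. What I want is the total including child processes.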

TotalCPU
    The sum of the SystemCPU and UserCPU time used by the job or job step. The total CPU time of the job may exceed the job's elapsed time for jobs that include multiple job steps. The format of the output is identical to that of the elapsed field.

NOTE: TotalCPU provides a measure of the task's parent process and does not include CPU time of child processes.

As for the other candidates, the documentation for cputime and cputimeraw does not go into the same level of detail:

cputime
    Formatted number of cpu seconds a process was allocated.

cputimeraw
    How much cpu time process was allocated in second format, not formatted like above. 

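I am leaning towards using cputimeraw rather than TotalCPU, but I want to make sure it is a total that includes every child process spawned by the job. The documentation says nothing either way about child processes for this field.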

Does anyone have any suggestions?

Thanks,

Robert

1 Answer:

Answer 0 (score: 0)

The following command gives a nice summary:

seff jobid

Output:

Job ID: jobid
Cluster: cluster
User/Group: doe/clusterusers
State: TIMEOUT (exit code 0)
Nodes: 6
Cores per node: 28
CPU Utilized: 32-01:15:44
CPU Efficiency: 9.54% of 336-00:44:48 core-walltime
Job Wall-clock time: 2-00:00:16
Memory Utilized: 58.76 GB
Memory Efficiency: 8.74% of 672.00 GB
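
To see how the figures in this output relate to each other, here is a minimal Python sketch that recomputes the core-walltime and the CPU efficiency from the sample values above. The helper name slurm_duration_to_seconds is made up for this illustration, and the sketch assumes durations are written as [DD-][HH:]MM:SS. It also assumes that core-walltime is simply nodes x cores per node x wall-clock time, which is what the sample numbers suggest; it is not taken from the seff source.

```python
def slurm_duration_to_seconds(text: str) -> float:
    """Convert a duration such as '32-01:15:44' or '15:44' to seconds.

    Hypothetical helper; assumes the [DD-][HH:]MM:SS layout.
    """
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in text.split(":")]
    # Pad missing leading fields (e.g. no hours) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


# Values copied from the sample seff output above.
nodes = 6
cores_per_node = 28
wall = slurm_duration_to_seconds("2-00:00:16")
cpu_used = slurm_duration_to_seconds("32-01:15:44")

# Assumption: core-walltime = allocated cores * wall-clock time.
core_walltime = nodes * cores_per_node * wall
efficiency = 100 * cpu_used / core_walltime

print(f"core-walltime: {core_walltime:.0f} s")
print(f"CPU efficiency: {efficiency:.2f}%")
```

Under these assumptions the script prints 29033088 s, which is the 336-00:44:48 shown by seff, and 9.54%. So "CPU Utilized" is the CPU time actually consumed, while core-walltime is the time the allocated cores were reserved, which seems to be the quantity the cputime/cputimeraw descriptions ("cpu seconds a process was allocated") refer to.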