Process Accounting

Process Accounting (to find and remove the cause of slowness)
Unix Process Accounting is a tool that literally takes account of every single process on your server, recording how much CPU (processor) and memory (RAM) each process consumes. When summarized, this information can be very useful for tracking down overzealous processes and hogs of resources so that you may restore your server to it's rightful speed and performance!

DreamHost keeps a daily accounting of your processes, which can be found in /var/log/sa/. For example, /var/log/sa/sa.itemized.0 is an itemized report of all the processes run from yesterday, sa.itemized.1 is the itemized report for the day before yesterday, sa.itemized.2 is the itemized report for two days before yesterday. Along those same lines is sa.summary.0, which is a summary of each user's consumption of resources.


 * Note: most web server users can only access files within their /home/username directory. VPS or dedicated servers can have admin users who can use sudo to access logs in /var/logs.

You may have the daily reports of your processes in your home directory, in the logs/resources/ directory, by enabling CPU reporting for the user. See Where can I find resource reports? for more information.

What does all of this mean? The best way to use this information is to examine the CPU-minutes. A CPU-minute can be considered a unit of work for a server. No, CPU-minutes are not "real time", that is, there are not 60minutesX24hours=1440 CPU-minutes in a day. If a machine is idle, doing nothing at all, that doesn't count towards CPU time. Only when the processor is actually doing something does it count as CPU-minutes.

So knowing how many CPU-minutes per day your users and processes are consuming can help you to find the user or process which is causing problems for the entire machine. The best way to understand all of this is to breeze through the examples below, and then try it out on your own machine.

One final confusing note: Yes, there are CPU-seconds as well: 1 CPU-minute = 60 CPU-seconds

In the examples below: "cp" = CPU-minute "cpu" = CPU-second

yourserver:$ head /var/log/sa/sa.summary.0 133573 100.00% 46089720.15re 100.00% 599.89cp 100.00% 0avio 78478595k*sec bob 84064 62.93% 618688.32re 1.34% 349.68cp 58.29% 0avio 35499807k*sec mysql 2158 1.62% 29884017.65re 64.84% 147.47cp 24.58% 0avio 413986887k*sec dhapache 9201 6.89% 14111172.13re 30.62% 87.92cp 14.66% 0avio 17167069k*sec root 35136 26.30% 860049.95re 1.87% 14.14cp 2.36% 0avio 395749k*sec postfix 1267 0.95% 612272.05re 1.33% 0.44cp 0.07% 0avio 21756k*sec mailuser1 271 0.20% 542.82re 0.00% 0.03cp 0.01% 0avio 728k*sec mailuser2 271 0.20% 612.53re 0.00% 0.03cp 0.01% 0avio 673k*sec louanne 271 0.20% 539.83re 0.00% 0.03cp 0.01% 0avio 659k*sec sarahlee 271 0.20% 610.13re 0.00% 0.03cp 0.01% 0avio 655k*sec
 * 1) Give me a summary of the 10 "biggest" users
 * 2) (Note that the entire machine utilized 599 CPU-minutes for the entire day yesterday) yourserver:

yourserver:$ /dh/misc/bin/sa.analysis.pl /var/log/sa/sa.itemized.0 | head Process CPU seconds user machine count average php.cgi 20980.6200 58.291% 87.419% 84063 0.250 mysqld 8786.3600 24.411% 36.610% 2217 3.963 libhttpd.ep 5290.4900 14.699% 22.044% 8650 0.612 ps 286.3300 0.796% 1.193% 726 0.394 sshd 266.5500 0.741% 1.111% 1459 0.183 analog 83.8400 0.233% 0.349% 204 0.411 mysqld 62.3700 0.173% 0.260% 12 5.197 analyze_log.pl 37.9200 0.105% 0.158% 5 7.584 top 26.7400 0.074% 0.111% 2 13.370 gzip 18.5000 0.051% 0.077% 85 0.218
 * 1) Give me an analysis of all the processes run by all the users.
 * 1) That was the information that you really want to look at.
 * 2) Notice that for this machine, php.cgi was run a whopping
 * 84,000 times in one day (about once a second), consuming
 * 1) about 350 CPU-minutes that day (which is a lot!). But it only
 * 2) consumed on average 0.25 CPU-minutes. It looks pretty efficient,
 * 3) yet even a 15% increase in efficiency would drop the amount of
 * 4) resources consumed by php.cgi to under 300.
 * 5) Give me a summary of all the processes run by the user `bob' yesterday:

yourserver:$ grep ^bobblank: /var/log/sa/sa.itemized.0 > /tmp/asdf yourserver:$ /dh/misc/bin/sa.analysis.pl /tmp/asdf |head -10 Process CPU seconds user machine count average php.cgi 20980.6200 99.999% 87.419% 84063 0.250 dothis.pl 0.1300 0.001% 0.001% 1 0.130

yourserver:$ tail -10 /var/log/sa/sa.itemized.0 root 0.01 cpu 311k mem 0 io chmod dhapache 0.04 cpu 3370k mem 0 io libhttpd.ep * root 0.00 cpu 314k mem 0 io date root 0.02 cpu 514k mem 0 io savelog bob 0.23 cpu 1723k mem 0 io php.cgi bob 0.11 cpu 1723k mem 0 io php.cgi dhapache 0.01 cpu 3370k mem 0 io libhttpd.ep * bob 0.03 cpu 1719k mem 0 io php.cgi root 0.01 cpu 310k mem 0 io uname dhapache 0.02 cpu 3370k mem 0 io libhttpd.ep * root 0.00 cpu 310k mem 0 io cut
 * 1) Give me some of the raw data for the fun of it yourserver:

If you have admin access (create an admin user in the VPS > Manage Admin Users section of the Dreamhost panel) you can also check auxiliary processes. These can hang and chew up resources.

yourserver:$ sudo ps aux

USER      PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND root        1  0.0  0.1   6124   676 ? Ss  Nov16   0:21 init [2] dhapache 1893  0.0  2.7 306280 11328 ? S   09:11   0:00 /dh/apache2/apa dhapache 2666  0.0  2.7 306132 11236 ? S   09:12   0:00 /dh/apache2/apa dhapache 2667  0.0  0.0      0     0 ? Z   09:12   0:00 [apa] root     2712  0.0  0.1  18548   424 ? Ss  Nov18   0:02 /usr/sbin/cron root     2714  0.0  0.4  20184  2044 pts/3    S+   09:12   0:00 links http://as root     2786  0.0  0.2  14728   988 pts/1    R+   09:12   0:00 ps aux root     3068  1.6  6.3 305972 25916 ? Ss  05:34   3:35 /dh/apache2/apa root     3540  0.0  0.1  36840   736 ? Ss  Nov18   0:06 /usr/lib/postfi postfix  3606  0.0  0.2  38944   940 ? S   Nov18   0:06 qmgr -l -t fifo nobody   3675  0.0  0.1 106580   704 ? Ss  Nov18   0:37 proftpd: (accep root      9612  0.0  0.4  66076  1816 ?        Ss   08:48   0:00 sshd: root@pts/ root      9617  0.0  0.3 101556  1516 pts/3    Ss   08:48   0:00 -bash postfix  11847  0.0  0.3  38896  1356 ?        S    08:51   0:00 pickup -l -t fi root     12502  0.0  0.4  65928  1672 ?        Ss   08:52   0:00 sshd: pammy pammy	 12561  0.0  0.3  65928  1276 ?        S    08:52   0:00 sshd: pammy@ pammy	 12563  0.0  0.3  17996  1580 pts/1    Ss   08:52   0:00 -bash dhapache 24934  0.0  2.4 293256  9892 ?        S    08:32   0:00 /dh/apache2/apa dhapache 24943  0.0  2.4 305972 10144 ?        S    08:32   0:00 /dh/apache2/apa bunny	 24984  0.0  4.9 134248 20460 ?        S    05:21   0:01 php5.cgi bunny	 25025  0.0  4.9 134244 20460 ?        S    05:21   0:02 php5.cgi bunny	 25092  0.0  6.7 141380 27696 ?        S    05:21   0:01 php5.cgi bunny 	 25093  0.0  3.1 126432 12752 ? S   05:21   0:00 php5.cgi bunny	 25107 0.0  6.6 141024 27284 ? S   05:21   0:00 php5.cgi bunny 	 25170 0.0  6.7 141336 27656 ? S   05:21   0:03 php5.cgi bunny 	 25171 0.0  6.7 141380 27696 ? S   05:21   0:02 php5.cgi bunny 	 25212 0.0  6.6 141068 27416 ? S   05:21   0:02 php5.cgi bunny 	 25226 0.0  6.6 141064 27404 ? S   05:21   0:00 php5.cgi bunny 	 25229 0.0  6.9 141880 28276 ? S   05:21   0:03 php5.cgi root    28627  0.0  0.1  48864   716 ? Ss  05:26   0:02 /usr/sbin/sshd dhapache 30376 0.0  2.7 306280 11300 ? S   09:10   0:00 /dh/apache2/apa bunny 	 31309 2.0  9.0 147332 37028 ? S   08:40   0:40 php5.cgi bunny 	 31316 2.4  9.3 148712 38104 ? S   08:40   0:48 php5.cgi daemon  31529  0.0  0.0  16356   244 ? Ss  Nov18   0:00 /usr/sbin/atd daemon  31670  0.0  0.0  16356   208 ? Ss  Nov18   0:00 /usr/sbin/atd

yourserver:$ sudo pkill php5.cgi processes
 * 1) Some of the CGI scripts in this example are using significant memory (%MEM column).
 * 1) will kill those processes.