Process Accounting
Process Accounting (to find and remove the cause of slowness)
Unix Process Accounting is a tool that literally takes account of every single process on your server, recording how much CPU (processor) and memory (RAM) each process consumes. When summarized, this information can be very useful for tracking down overzealous processes and hogs of resources so that you may restore your server to it's rightful speed and performance!
DreamHost keeps a daily accounting of your processes, which can be found in /var/log/sa/ (note: you may have to request DreamHost to provide you with access to those files, and they are unavailable for PS customers). For example, /var/log/sa/sa.itemized.0 is an itemized report of all the processes run from yesterday, sa.itemized.1 is the itemized report for the day before yesterday, sa.itemized.2 is the itemized report for two days before yesterday. Along those same lines is sa.summary.0, which is a summary of each user's consumption of resources.
You may have the daily reports of your processes in your home directory, in the logs/resources/ directory, by enabling CPU reporting for the user. See Where can I find resource reports? for more information.
What does all of this mean? The best way to use this information is to examine the CPU-minutes. A CPU-minute can be considered a unit of work for a server. No, CPU-minutes are not "real time", that is, there are not 60minutesX24hours=1440 CPU-minutes in a day. If a machine is idle, doing nothing at all, that doesn't count towards CPU time. Only when the processor is actually doing something does it count as CPU-minutes.
So knowing how many CPU-minutes per day your users and processes are consuming can help you to find the user or process which is causing problems for the entire machine. The best way to understand all of this is to breeze through the examples below, and then try it out on your own machine.
One final confusing note: Yes, there are CPU-seconds as well: 1 CPU-minute = 60 CPU-seconds
In the examples below: "cp" = CPU-minute "cpu" = CPU-second
# Give me a summary of the 10 "biggest" users # (Note that the entire machine utilized 599 CPU-minutes for the entire day yesterday) yourserver: yourserver:$ head /var/log/sa/sa.summary.0 133573 100.00% 46089720.15re 100.00% 599.89cp 100.00% 0avio 78478595k*sec bob 84064 62.93% 618688.32re 1.34% 349.68cp 58.29% 0avio 35499807k*sec mysql 2158 1.62% 29884017.65re 64.84% 147.47cp 24.58% 0avio 413986887k*sec dhapache 9201 6.89% 14111172.13re 30.62% 87.92cp 14.66% 0avio 17167069k*sec root 35136 26.30% 860049.95re 1.87% 14.14cp 2.36% 0avio 395749k*sec postfix 1267 0.95% 612272.05re 1.33% 0.44cp 0.07% 0avio 21756k*sec mailuser1 271 0.20% 542.82re 0.00% 0.03cp 0.01% 0avio 728k*sec mailuser2 271 0.20% 612.53re 0.00% 0.03cp 0.01% 0avio 673k*sec louanne 271 0.20% 539.83re 0.00% 0.03cp 0.01% 0avio 659k*sec sarahlee 271 0.20% 610.13re 0.00% 0.03cp 0.01% 0avio 655k*sec # Give me an analysis of all the processes run by all the users. yourserver:$ /dh/misc/bin/sa.analysis.pl /var/log/sa/sa.itemized.0 | head Process CPU seconds user machine count average php.cgi 20980.6200 58.291% 87.419% 84063 0.250 mysqld 8786.3600 24.411% 36.610% 2217 3.963 libhttpd.ep 5290.4900 14.699% 22.044% 8650 0.612 ps 286.3300 0.796% 1.193% 726 0.394 sshd 266.5500 0.741% 1.111% 1459 0.183 analog 83.8400 0.233% 0.349% 204 0.411 mysqld 62.3700 0.173% 0.260% 12 5.197 analyze_log.pl 37.9200 0.105% 0.158% 5 7.584 top 26.7400 0.074% 0.111% 2 13.370 gzip 18.5000 0.051% 0.077% 85 0.218 # That was the information that you really want to look at. # Notice that for this machine, php.cgi was run a whopping # 84,000 times in one day (about once a second), consuming # about 350 CPU-minutes that day (which is a lot!). But it only # consumed on average 0.25 CPU-minutes. It looks pretty efficient, # yet even a 15% increase in efficiency would drop the amount of # resources consumed by php.cgi to under 300. # Give me a summary of all the processes run by the user `bob' yesterday: yourserver:$ grep ^bob[[:blank:]] /var/log/sa/sa.itemized.0 > /tmp/asdf yourserver:$ /dh/misc/bin/sa.analysis.pl /tmp/asdf |head -10 Process CPU seconds user machine count average php.cgi 20980.6200 99.999% 87.419% 84063 0.250 dothis.pl 0.1300 0.001% 0.001% 1 0.130 # Give me some of the raw data for the fun of it yourserver: yourserver:$ tail -10 /var/log/sa/sa.itemized.0 root 0.01 cpu 311k mem 0 io chmod dhapache 0.04 cpu 3370k mem 0 io libhttpd.ep * root 0.00 cpu 314k mem 0 io date root 0.02 cpu 514k mem 0 io savelog bob 0.23 cpu 1723k mem 0 io php.cgi bob 0.11 cpu 1723k mem 0 io php.cgi dhapache 0.01 cpu 3370k mem 0 io libhttpd.ep * bob 0.03 cpu 1719k mem 0 io php.cgi root 0.01 cpu 310k mem 0 io uname dhapache 0.02 cpu 3370k mem 0 io libhttpd.ep * root 0.00 cpu 310k mem 0 io cut
If you have admin access (create an admin user in the VPS > Manage Admin Users section of the Dreamhost panel) you can also check auxiliary processes. These can hang and chew up resources.
yourserver:$ sudo ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1 0.0 0.1 6124 676 ? Ss Nov16 0:21 init [2] dhapache 1893 0.0 2.7 306280 11328 ? S 09:11 0:00 /dh/apache2/apa dhapache 2666 0.0 2.7 306132 11236 ? S 09:12 0:00 /dh/apache2/apa dhapache 2667 0.0 0.0 0 0 ? Z 09:12 0:00 [apa] <defunct> root 2712 0.0 0.1 18548 424 ? Ss Nov18 0:02 /usr/sbin/cron root 2714 0.0 0.4 20184 2044 pts/3 S+ 09:12 0:00 links http://as root 2786 0.0 0.2 14728 988 pts/1 R+ 09:12 0:00 ps aux root 3068 1.6 6.3 305972 25916 ? Ss 05:34 3:35 /dh/apache2/apa root 3540 0.0 0.1 36840 736 ? Ss Nov18 0:06 /usr/lib/postfi postfix 3606 0.0 0.2 38944 940 ? S Nov18 0:06 qmgr -l -t fifo nobody 3675 0.0 0.1 106580 704 ? Ss Nov18 0:37 proftpd: (accep root 9612 0.0 0.4 66076 1816 ? Ss 08:48 0:00 sshd: root@pts/ root 9617 0.0 0.3 101556 1516 pts/3 Ss 08:48 0:00 -bash postfix 11847 0.0 0.3 38896 1356 ? S 08:51 0:00 pickup -l -t fi root 12502 0.0 0.4 65928 1672 ? Ss 08:52 0:00 sshd: pammy pammy 12561 0.0 0.3 65928 1276 ? S 08:52 0:00 sshd: pammy@ pammy 12563 0.0 0.3 17996 1580 pts/1 Ss 08:52 0:00 -bash dhapache 24934 0.0 2.4 293256 9892 ? S 08:32 0:00 /dh/apache2/apa dhapache 24943 0.0 2.4 305972 10144 ? S 08:32 0:00 /dh/apache2/apa bunny 24984 0.0 4.9 134248 20460 ? S 05:21 0:01 php5.cgi bunny 25025 0.0 4.9 134244 20460 ? S 05:21 0:02 php5.cgi bunny 25092 0.0 6.7 141380 27696 ? S 05:21 0:01 php5.cgi bunny 25093 0.0 3.1 126432 12752 ? S 05:21 0:00 php5.cgi bunny 25107 0.0 6.6 141024 27284 ? S 05:21 0:00 php5.cgi bunny 25170 0.0 6.7 141336 27656 ? S 05:21 0:03 php5.cgi bunny 25171 0.0 6.7 141380 27696 ? S 05:21 0:02 php5.cgi bunny 25212 0.0 6.6 141068 27416 ? S 05:21 0:02 php5.cgi bunny 25226 0.0 6.6 141064 27404 ? S 05:21 0:00 php5.cgi bunny 25229 0.0 6.9 141880 28276 ? S 05:21 0:03 php5.cgi root 28627 0.0 0.1 48864 716 ? Ss 05:26 0:02 /usr/sbin/sshd dhapache 30376 0.0 2.7 306280 11300 ? S 09:10 0:00 /dh/apache2/apa bunny 31309 2.0 9.0 147332 37028 ? S 08:40 0:40 php5.cgi bunny 31316 2.4 9.3 148712 38104 ? S 08:40 0:48 php5.cgi daemon 31529 0.0 0.0 16356 244 ? Ss Nov18 0:00 /usr/sbin/atd daemon 31670 0.0 0.0 16356 208 ? Ss Nov18 0:00 /usr/sbin/atd # Some of the CGI scripts in this example are using significant memory (%MEM column). yourserver:$ sudo pkill php5.cgi processes # will kill those processes.