WordPress and PHP

= Examining Possible WordPress Configurations =

There are four basic configurations you can set up WordPress to run with: out-of-the-box without caching and using the default php5 interpreter, with caching and the default php5 interpreter, without caching and using FastCGI, or with caching and using FastCGI. Each has benefits and drawbacks. Below you'll find an examination of how effective each is at optimizing your WordPress resource usage.

Each setup was created and tested using httperf (a command-line tool that's used to measure web server performance). It allows you to simulate high-traffic conditions on your site and measures the response time each request takes. All tests were set to make 1000 connections at a rate of 20 connections per second and were conducted on a fresh WordPress installation (default theme and plugins with only the single default post). Each configuration includes a screenshot of a top view at the peak load of its test. For comparison's sake, the load average on the machine these tests were conducted on was around 3.00 while tests were not being run. Things to note in the top output are the load average (top right of the output), the memory usage of the running processes (the RES column), and the CPU % column.

== nocache + nofcgi ==
This is the out-of-the-box condition you get with an Advanced One-Click install or if you manually install WordPress yourself. It's easy to do and will generally work fine if you have little content, few viewers, and don't get indexed very often. Since those conditions generally aren't met by anyone who would be concerned with optimizing a WordPress blog, it's fairly safe to say this is not a good option.

Under this configuration, a php5.cgi process will spawn with every request that hits your site. Each process can take anywhere from 8-16 MB of memory. During high-traffic loads this can increase your user's memory usage significantly -- adversely affecting your server and thus your site. In most cases, your site will start slowing down, and once you reach the memory allowed for each user on a server or the maximum user process limit, viewers will start seeing errors when browsing to your site.
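If you want to watch this happen on your own server while a test runs, a small shell helper can total the resident memory of all php5.cgi processes. This is a sketch of my own, not from the article -- the `memsum` name is made up, and `ps -C` is the GNU/Linux procps syntax:

```shell
# Sum the resident memory (RSS, in KB) of every process with a given name.
# "memsum" is a hypothetical helper; php5.cgi is the process discussed above.
memsum() {
  ps -C "$1" -o rss= | awk '{kb += $1} END {printf "%d KB\n", kb}'
}

memsum php5.cgi   # run this during the httperf test to watch memory climb
```

Running it repeatedly (for example under `watch`) during a test makes the per-request processes stacking up easy to see.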

At the peak load of the test, top looked like this:



Notice the load average of 19 (up from about 3) and the 21 active php5.cgi processes. Even with the minimal nature of the site, it's now using well over 200 MB of memory and quite a large chunk of the server's CPU. This configuration obviously won't hold up to a site that has any kind of decent traffic -- especially if it's running additional plugins, has a busier theme, and/or displays images in posts.

$ httperf --hog --server=blog.example.com --num-conns=1000 --rate=20 --timeout=5
httperf --hog --timeout=5 --client=0/1 --server=blog.example.com --port=80 --uri=/ --rate=20 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=1
Maximum connect burst length: 1

Total: connections 1000 requests 1000 replies 892 test-duration 52.007 s

Connection rate: 19.2 conn/s (52.0 ms/conn, <=69 concurrent connections)
Connection time [ms]: min 60.4 avg 364.7 max 5916.5 median 72.5 stddev 946.6
Connection time [ms]: connect 31.7
Connection length [replies/conn]: 1.000

Request rate: 19.2 req/s (52.0 ms/req)
Request size [B]: 79.0

Reply rate [replies/s]: min 15.8 avg 17.5 max 19.0 stddev 1.0 (10 samples)
Reply time [ms]: response 316.4 transfer 16.6
Reply size [B]: header 307.0 content 598.0 footer 0.0 (total 905.0)
Reply status: 1xx=0 2xx=49 3xx=0 4xx=0 5xx=843

As you can see from this output, the minimum response time was 60 ms, the average was 364 ms, and the max was nearly 6 seconds; the median was 72 ms. Also note the reply status line: 843 of the 892 replies received were 5xx server errors, so most viewers during the test were seeing error pages rather than the site. One plus is that this configuration has slightly faster response times once the server is warmed up. Initial requests after idle time are quite slow, which would be more noticeable on a lower-traffic site.
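Rather than eyeballing the output each time, you can pull the headline numbers back out of a saved httperf run with a short awk one-liner. The filename results.txt is illustrative -- redirect httperf's output into whatever file you like:

```shell
# Extract the connection-time summary and the count of 5xx replies from saved
# httperf output. "results.txt" is an assumed filename, not from the article.
awk '/^Connection time \[ms\]: min/ { print "min=" $5 " avg=" $7 " max=" $9 " median=" $11 }
     /^Reply status:/              { split($7, s, "="); print "5xx replies: " s[2] }' results.txt
```

This makes it easy to diff the key latency and error figures across the four configurations.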

== supercache + nofcgi ==
This configuration requires a bit more setup, but as you'll see the benefits FAR outweigh the initial inconvenience. It uses the wp-super-cache plugin for WordPress. Essentially, the first time a page is viewed a php5.cgi process fires up to generate it, but once it's done the output is written to a static file, caching it. The next time that page is viewed, WordPress serves the static HTML file that was created, and the request doesn't require a php5.cgi process to fire up. One potential downside is that changes you make to published articles might not show up immediately without manually purging the cache, but that's a pretty small price to pay for the resource optimizations you get.
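The static-file serving is driven by mod_rewrite rules in your .htaccess file. The fragment below shows the general shape of the rules wp-super-cache generates -- it is a simplified sketch, not the exact generated rule set (the real rules also check things like logged-in-user cookies):

```apache
# Simplified sketch of wp-super-cache's mod_rewrite logic -- not the exact
# generated rules. If a cached HTML file exists for the request, serve it
# directly and PHP never runs.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} !POST
RewriteCond %{QUERY_STRING} ^$
RewriteCond %{DOCUMENT_ROOT}/wp-content/cache/supercache/%{HTTP_HOST}%{REQUEST_URI}index.html -f
RewriteRule ^(.*) /wp-content/cache/supercache/%{HTTP_HOST}%{REQUEST_URI}index.html [L]
```

The key point is the `-f` condition: the rewrite only happens when the cached file actually exists, so uncached pages still fall through to WordPress.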

Under this configuration, what you'll see is literally no additional processes firing during the same performance test that was conducted previously. That's right. None. Zip. Zilch. Nada.



As you can see, that process list is much shorter than the other one and contains no php5.cgi processes at all. This screenshot was taken at the peak load during the test. Note that the load average peaked at 4.50, which is only a 1.5 increase compared to the 16 increase we saw in the previous configuration. By simply installing and configuring this plugin you have kept your site from using an additional 200 MB of memory and kept it from monopolizing a hefty chunk of your user's CPU. Pretty much a big win.

$ httperf --hog --server=blog.example.com --num-conns=1000 --rate=20 --timeout=5
httperf --hog --timeout=5 --client=0/1 --server=blog.example.com --port=80 --uri=/ --rate=20 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=1
Maximum connect burst length: 1

Total: connections 1000 requests 1000 replies 1000 test-duration 50.055 s

Connection rate: 20.0 conn/s (50.1 ms/conn, <=8 concurrent connections)
Connection time [ms]: min 88.9 avg 113.7 max 1075.1 median 105.5 stddev 42.5
Connection time [ms]: connect 33.7
Connection length [replies/conn]: 1.000

Request rate: 20.0 req/s (50.1 ms/req)
Request size [B]: 79.0

Reply rate [replies/s]: min 19.6 avg 20.0 max 20.2 stddev 0.2 (10 samples)
Reply time [ms]: response 45.7 transfer 34.2
Reply size [B]: header 335.0 content 5489.0 footer 0.0 (total 5824.0)
Reply status: 1xx=0 2xx=1000 3xx=0 4xx=0 5xx=0

As you can see from this output, the minimum response time was 89 ms, the average was 114 ms, and the max was just over 1 second; the median was 105 ms. If you compare these with the nocache/nofcgi version, you'll see that the minimum and median numbers are higher -- the median is a solid 30 ms higher. However, the maximum response time was nearly 5x faster, and every one of the 1000 replies came back with a 2xx status. As such, the first viewer of a cached page after a period of idle time will see that page significantly faster.

== nocache + fcgi ==
This configuration takes advantage of something called FastCGI. Essentially, it spawns a number of php processes that remain active to handle requests: after a request is processed, the process stays open rather than exiting. This saves significant overhead because most of the cost of a php5.cgi process is in starting it up and shutting it down -- the request itself doesn't usually take much time. By keeping these processes open, they're able to handle more requests faster. You're also able to limit how many can spawn, so the amount of memory used is more predictable and controllable. The downside is that this base amount of memory is always used -- even when no traffic is coming through at the time. Eventually the spawned processes will die due to idle time. When that happens, the next request that comes in will fire them back up and (predictably) take significantly longer, since the processes have to load again.
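On hosts that launch PHP under FastCGI via a wrapper script, the number of persistent workers is typically controlled with environment variables. The script below is a hypothetical sketch -- PHP_FCGI_CHILDREN and PHP_FCGI_MAX_REQUESTS are standard PHP FastCGI variables, but the path and values shown are illustrative and not taken from this article:

```shell
#!/bin/sh
# Hypothetical FastCGI wrapper script; the interpreter path and the values
# below are illustrative assumptions, not from the article.
export PHP_FCGI_CHILDREN=3        # keep three persistent PHP workers alive
export PHP_FCGI_MAX_REQUESTS=500  # recycle each worker after 500 requests
exec /usr/local/bin/php5.cgi
```

Capping the worker count is what makes memory usage predictable: three workers at 8-16 MB apiece bounds the footprint regardless of traffic.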

Under this configuration, what you'll see is three processes running that are processing all of the incoming requests.



As you can see, with this configuration you save significant amounts of memory over the non-cached version that didn't use FCGI. That said, you still end up monopolizing the CPU in a pretty significant way. The load average went up by about two in this case, but would likely continue increasing if the traffic to your site continued in this fashion. Furthermore, if that level of CPU utilization kept up, it's entirely likely that those processes could start getting killed which would yield errors for your viewers.

$ httperf --hog --server=blog.example.com --num-conns=1000 --rate=20 --timeout=5
httperf --hog --timeout=5 --client=0/1 --server=blog.example.com --port=80 --uri=/ --rate=20 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=1
Maximum connect burst length: 1

Total: connections 1000 requests 1000 replies 896 test-duration 53.763 s

Connection rate: 18.6 conn/s (53.8 ms/conn, <=102 concurrent connections)
Connection time [ms]: min 42.3 avg 1318.9 max 5003.8 median 77.5 stddev 1809.9
Connection time [ms]: connect 45.3
Connection length [replies/conn]: 1.000

Request rate: 18.6 req/s (53.8 ms/req)
Request size [B]: 79.0

Reply rate [replies/s]: min 1.6 avg 17.5 max 32.2 stddev 8.7 (10 samples)
Reply time [ms]: response 1265.6 transfer 7.5
Reply size [B]: header 308.0 content 1449.0 footer 0.0 (total 1757.0)
Reply status: 1xx=0 2xx=201 3xx=0 4xx=0 5xx=695

As you can see from this output, the minimum response time was 42 ms, the average was 1.3 seconds, and the max was just over 5 seconds; the median was 77 ms. The average response time was much higher than in the non-cached, non-FCGI configuration, and 695 of the 896 replies were still 5xx server errors. Essentially, all you're saving with this configuration is memory, along with reducing the risk of having processes killed due to your user hitting the maximum allowed process limit. Overall, not that great.

== supercache + fcgi ==
This configuration takes advantage of both wp-super-cache and FastCGI. It works the same as the version without supercache, except pages are cached as HTML files after they're viewed. The downside is that every time a page needs to be cached, all of your FCGI processes will spawn, using close to the maximum amount of memory this configuration can use. As such, this setup is likely only useful when you have content that changes very frequently -- not a common occurrence with blogs -- and is not one you'll likely want to use.

Under this configuration, what you'll see is three processes running that are essentially idling.



As you can see, this configuration saves you from monopolizing the CPU, but also uses a fair bit of memory that wouldn't be used in the non-FCGI configuration that uses supercache. The load average increased by nearly 1 in this case.

$ httperf --hog --server=blog.example.com --num-conns=1000 --rate=20 --timeout=5
httperf --hog --timeout=5 --client=0/1 --server=blog.example.com --port=80 --uri=/ --rate=20 --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=1
Maximum connect burst length: 1

Total: connections 1000 requests 1000 replies 1000 test-duration 50.055 s

Connection rate: 20.0 conn/s (50.1 ms/conn, <=30 concurrent connections)
Connection time [ms]: min 89.6 avg 149.3 max 3371.6 median 105.5 stddev 247.3
Connection time [ms]: connect 32.9
Connection length [replies/conn]: 1.000

Request rate: 20.0 req/s (50.1 ms/req)
Request size [B]: 79.0

Reply rate [replies/s]: min 19.6 avg 20.0 max 20.0 stddev 0.1 (10 samples)
Reply time [ms]: response 82.4 transfer 34.0
Reply size [B]: header 337.0 content 5488.0 footer 0.0 (total 5825.0)
Reply status: 1xx=0 2xx=1000 3xx=0 4xx=0 5xx=0

As you can see from this output, the minimum response time was 90 ms, the average was 149 ms, and the max was about 3.4 seconds; the median was 105 ms. Compared to the non-supercache FCGI version, it has a significantly better average response time, and the maximum response time was also significantly faster. That said, there really aren't many circumstances where this configuration would be all that beneficial.

== Conclusion ==
Ultimately, it seems clear that the supercache + nofcgi version wins out. It has great performance while having the lowest overall resource hit on the server, and it should always use the least memory and CPU time it possibly can. The only time it might be less efficient is if a large number of un-cached pages are accessed simultaneously, as a number of php5.cgi processes would spawn to handle those. Once the pages are cached, though, those processes won't need to appear again, whereas in a FCGI configuration the number of processes configured to spawn would remain open whether or not any requests were coming in. You can get instructions on how to install the wp-super-cache plugin here. Keep in mind that you'll likely need to add the .htaccess content specified in those instructions manually, and the rules listed there for the .htaccess file in the root web directory should replace any default WordPress rules that may already be in place.

=See Also=


 * WordPress Optimization
 * Wordpress_performance - Additional WordPress performance tweaks
 * WordPress Shared Hosting Performance