Webalizer

From DreamHost
The instructions provided in this article or section are considered advanced.

You are expected to be knowledgeable in the UNIX shell.
Support for these instructions is not available from DreamHost tech support.
Server changes may cause this to break. Be prepared to troubleshoot this yourself if this happens.
We seriously aren't kidding about this.

Analog is the default statistics package that DreamHost provides with its accounts. However, it's not the most intuitive package. Webalizer is a decent statistics package whose reports are easier to read. Here's how to get it working on your domain.
  1. Go to ftp://ftp.mrunix.net/pub/webalizer/old/ and download the webalizer-2.01-10-linuxelf-x86-bin.tgz binary package to your home directory. (If you want to compile a newer version yourself, you're on your own.)
  2. In your home directory, create a directory named webalizer, or some other helpful name.
  3. For each domain on which you will run Webalizer, create a directory to house the stats files. For mine, I created ~/mydomain.com/webalizer
  4. Extract the Webalizer archive with tar -xvzf.
  5. Copy (cp) the extracted files into the webalizer directory in your home directory. Important: confirm that the permissions on the webalizer program file (~/webalizer/webalizer) allow the owner to execute it. Otherwise, you will receive a "permission denied" error.
  6. With Webalizer installed, you can now create a .conf file for each domain in your account. I have four domains to log, so I created four .conf files in ~/webalizer. (It's recommended that you keep the .conf files in the directory where you installed Webalizer.)
  7. Edit each .conf file appropriately. For example, set OutputDir to the stats directory from step 3 (~/mydomain.com/webalizer/), and choose the log that Webalizer should read. I recommend ~/logs/mydomain.com/http/access.log. The knowledge base suggests access.log.0; however, if you run Webalizer from an hourly cron job (step 9), you will want the log that is updated live.
     [Actually, there's a good reason to use access.log.0 and have your crontab run Webalizer around 2 a.m. instead of midnight. The log rotator doesn't rotate your logs until well after midnight, up to an hour later. If you run Webalizer on access.log very close to midnight, hits will continue to be logged to access.log after Webalizer runs, access.log will then be rotated out to access.log.0, and those late entries will never be analysed. Waiting until 2 a.m. and analysing access.log.0 prevents any log entries from being missed. --LarryGilbert 14:56, 26 Apr 2006 (PDT)]
     [Excellent point. I just figured this out myself: ~/logs/mydomain.com/http/access.log.0 holds yesterday's log, so a run at 2 a.m. parses yesterday's complete log file. --Fmarshall]
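Steps 2 through 5 can be sketched as a small shell function. This is only an illustration: install_webalizer is a hypothetical helper name, the paths in the example call are placeholders, and the function makes no assumption about how the archive is laid out internally (it searches the unpacked tree for a file named webalizer). It also assumes that mktemp is available on the server.

```shell
#!/bin/sh
# Sketch of steps 2-5. install_webalizer is a hypothetical helper,
# not something that ships with Webalizer.
#   $1 = path to the downloaded .tgz
#   $2 = install directory (e.g. ~/webalizer)
#   $3 = stats directory   (e.g. ~/mydomain.com/webalizer)
install_webalizer() {
    tarball=$1
    install_dir=$2
    stats_dir=$3

    # Steps 2 and 3: create the install and stats directories.
    mkdir -p "$install_dir" "$stats_dir" || return 1

    # Step 4: unpack into a scratch directory.
    workdir=$(mktemp -d) || return 1
    tar -xzf "$tarball" -C "$workdir" || { rm -rf "$workdir"; return 1; }

    # Locate the webalizer program inside the unpacked tree.
    binary=$(find "$workdir" -type f -name webalizer | head -n 1)
    if [ -z "$binary" ]; then
        echo "webalizer binary not found in $tarball" >&2
        rm -rf "$workdir"
        return 1
    fi

    # Step 5: copy everything that shipped alongside the binary.
    cp -R "$(dirname "$binary")"/. "$install_dir"/ || { rm -rf "$workdir"; return 1; }
    rm -rf "$workdir"

    # The owner must be able to execute the program.
    chmod u+x "$install_dir/webalizer"
}

# Example call (paths are placeholders):
# install_webalizer ~/webalizer-2.01-10-linuxelf-x86-bin.tgz ~/webalizer ~/mydomain.com/webalizer
```

Unpacking into a scratch directory first keeps the home directory clean if the archive turns out to contain something unexpected.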



Example .conf lines:

OutputDir /home/username/mydomain.com/webalizer/
LogFile /home/username/logs/mydomain.com/http/access.log.0

You will also need to enable incremental processing, which requires another two lines in the .conf file:

Incremental     yes
IncrementalName webalizer.current

If you don't enable incremental processing, you'll see only the last day's worth of logs.

Lastly, you might want to add the HostName directive, so that reports for vhosts show your own domain name rather than the name of the DreamHost server on which your vhost resides:

HostName    mydomain.com
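If you have several domains, the directives above can be generated rather than typed by hand. The following is a minimal sketch, assuming the directory layout used in this article; write_conf is a hypothetical helper name, and username and the domain names are placeholders. It writes absolute paths, as in the example lines above.

```shell
#!/bin/sh
# write_conf is a hypothetical helper: it prints a minimal Webalizer
# config for one domain, using only the directives discussed above.
#   $1 = shell username
#   $2 = domain name
write_conf() {
    user=$1
    domain=$2
    cat <<EOF
OutputDir       /home/$user/$domain/webalizer/
LogFile         /home/$user/logs/$domain/http/access.log.0
Incremental     yes
IncrementalName webalizer.current
HostName        $domain
EOF
}

# Example: one .conf per domain (names are placeholders).
# for d in mydomain.com myotherdomain.com; do
#     write_conf username "$d" > ~/webalizer/"$d".conf
# done
```

Review each generated file before running Webalizer on it, since any other directive you need still has to be added by hand.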

8. Navigate to the directory in which you installed Webalizer, using SSH. Type:

./webalizer -c mydomain.conf

This will run webalizer. If it's successful, you can go to http://www.yourdomain.com/webalizer and see the results!

9. Finally, create a cron job that will run according to your needs, using the command 'crontab -e' from the shell.

1 * * * * /home/youruser/pathtowebalizer/webalizer -c /home/youruser/pathtowebalizer/yourdomain.conf >/dev/null 2>&1

This crontab entry runs at one minute past every hour. See the crontab documentation (man 5 crontab) for more information about the schedule fields.

If you have multiple domains for which to run Webalizer, here's a command that runs it once for every .conf file in a specified folder. Test it from the shell before putting it in your crontab! Also rename example.conf (e.g. to example.confoo) or remove it, so that the *.conf pattern does not pick it up.

for i in /home/youruser/pathtowebalizer/*.conf; do /home/youruser/pathtowebalizer/webalizer -c "$i" >/dev/null 2>&1; done

Preface that command with the time set up like "59 23 * * * " and your stats will be updated every night just before midnight.
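The loop above can also live in a small script, which keeps the crontab line short. This is a sketch only: run_all_confs is a hypothetical helper name and the paths in the example are placeholders. Unlike the one-liner, it reports a failed run on standard error, so cron can mail you when a run goes wrong.

```shell
#!/bin/sh
# run_all_confs is a hypothetical helper: run Webalizer once per .conf file.
#   $1 = path to the webalizer program
#   $2 = directory holding the .conf files
run_all_confs() {
    webalizer_bin=$1
    conf_dir=$2
    failed=0
    for conf in "$conf_dir"/*.conf; do
        # With no matching files the pattern stays literal; skip it.
        [ -f "$conf" ] || continue
        # Discard Webalizer's own output, but note a non-zero exit status.
        if ! "$webalizer_bin" -c "$conf" >/dev/null 2>&1; then
            echo "webalizer failed for $conf" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Example call (paths are placeholders):
# run_all_confs /home/youruser/pathtowebalizer/webalizer /home/youruser/pathtowebalizer
```

Saved as an executable script with the example call uncommented, the crontab entry then only needs the schedule followed by the script's path.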


If you get an error along the lines of '"/tmp/crontab.16288":1: bad minute', make sure each entry takes up exactly one line (no line breaks; the default editor inserts a real line break when a long line wraps in the window), or try using vi to edit your crontab file:

export EDITOR=vi

vi is harder to use, but it does not break long lines, which avoids these errors. If you aren't familiar with vi, a quick tutorial will help you make sense of it. (Press i to edit text, ESC to stop editing, and type :x to save and close the file.) Then run crontab -e again and enter the cron jobs above.
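Another way around the line-wrapping problem is to skip the editor and build the new crontab in the shell. The following is a sketch under assumptions: add_cron_entry is a hypothetical helper name, and it relies on the server's crontab command accepting a crontab on standard input (crontab -), which common cron implementations support but which you should check on your machine.

```shell
#!/bin/sh
# add_cron_entry is a hypothetical helper: it reads an existing crontab on
# standard input and prints it with one entry appended, unless that exact
# line is already present.
#   $1 = the complete crontab line to add
add_cron_entry() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: treat the entry as a fixed string; -x: the whole line must match.
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Example (replaces your crontab with the old contents plus the new line):
# crontab -l 2>/dev/null \
#   | add_cron_entry '1 * * * * /home/youruser/pathtowebalizer/webalizer -c /home/youruser/pathtowebalizer/yourdomain.conf >/dev/null 2>&1' \
#   | crontab -
```

Run crontab -l afterwards to confirm that the entry appears on a single line.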

If you want to be sure that your jobs are taking place, omit ">/dev/null 2>&1" from your crontab so that you will be e-mailed when the jobs occur.

That's all there is to it! Enjoy your stats.

SlimStat was also easy for me to install on DreamHost, as was phpMyVisites, but I prefer Webalizer for a really quick visual overview. I find it better than Google Analytics, which is also an option. --Fmarshall

http://wettone.com/code/slimstat

http://www.phpmyvisites.net/phpmv2/phpmyvisites.php


Resolving IP addresses to hostnames. DreamHost's web servers do not resolve IP addresses to hostnames when generating the logs. If you want the visitor IPs in your Webalizer statistics to appear as hostnames, you might find the logresolve utility useful. Passing your access log through /usr/sbin/logresolve before you run Webalizer gives you name resolution in your site statistics. Below is a draft shell script that can be run as a nightly cron job; it first resolves the host names and then generates statistics from the resulting temporary file. (It assumes that Webalizer resides in /home/username/webalizer/; change username and domain to match your account.) --Liljeqvist.com 14:25, 9 July 2007 (PDT)

#!/bin/sh
# Copy yesterday's log to a temporary file.
cp ~/logs/domain.com/http/access.log.0 ~/domain.yesterdayslog.raw
# Run logresolve on the temp file and create a new temp file with resolved host names.
/usr/sbin/logresolve < ~/domain.yesterdayslog.raw > ~/domain.yesterdayslog.resolved
# Run webalizer on the resolved copy of the log; options come first, the log file last.
~/webalizer/webalizer -c ~/webalizer/webalizer.conf -o ~/domain.com/webalizer/ ~/domain.yesterdayslog.resolved
# Lastly, remove the temp files we just created.
rm ~/domain.yesterdayslog.resolved ~/domain.yesterdayslog.raw