There are already plenty of tutorials on the internet on how to install awstats for nginx. I didn’t find any for the configuration I wanted, so I’m writing one, for my own records.
I have some custom needs. Let’s suppose I have 3 domains:
- master-domain.com
- alpha.com
- beta.com
And I want stats for the last 2 domains. master-domain.com is the main domain of the server, and awstats should be available at awstats.master-domain.com, instead of alpha.com/awstats and beta.com/awstats. The idea is to group all the server scripts and tools (phpmyadmin, zabbix, etc.) under master-domain.com.
We also want to password-protect the stats, with different credentials for each vhost.
These steps have been tested on Debian Squeeze, on a Kimsufi.
Install Awstats
apt-get install awstats
On Debian Squeeze, awstats installs files in 3 places:
- /etc/awstats : contains the conf files for each of your awstats installations
- /usr/share/awstats : contains all the tools and libraries used by awstats
- /usr/share/doc/awstats : docs, tools for building the static html pages, icons and other static files used by the html pages
Formatting Nginx log
By default, Nginx writes access logs that awstats can already read, as long as you use the combined format. If you set your access log like this:
access_log /path/to/log.log;
Then you’re good. The combined format is implicit. It’s equivalent to
access_log /path/to/log.log combined;
Optional step
Using the default format is fine, but you can log one more field that could be relevant: http_x_forwarded_for.
It captures the client IP address when the client connects through a proxy or load balancer.
For that, we define another log format, named main, in /etc/nginx/nginx.conf. In the http scope, add:
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
It’s the same as the combined format, plus the $http_x_forwarded_for bit at the end. To use this format, add main at the end of your access_log directive.
access_log /path/to/log.log main;
As awstats does not use this last field, we should tell it to ignore it. In /etc/awstats/awstats.conf.local, add:
LogFormat = "%host %other %logname %time1 %methodurl %code %bytesd %refererquot %uaquot %otherquot"
This file should be empty by default. It holds the settings shared by all your awstats configs.
The LogFormat tells awstats the meaning of each field when parsing the log. The last token (%otherquot) means “this quoted string here does not mean anything, skip it”.
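To see how the quoted fields of a main-format line line up with the end of the LogFormat, here is a small shell sketch. The sample log line and the quoted_field helper are made up for illustration; they are not part of nginx or awstats.

```shell
#!/bin/sh
# Hypothetical access log line in the "main" format (all values are invented).
line='203.0.113.7 - - [10/Oct/2012:13:55:36 +0200] "GET /index.html HTTP/1.1" 200 2326 "http://example.org/" "Mozilla/5.0" "198.51.100.4"'

# Print the Nth double-quoted field of a log line:
# 1 = request, 2 = referer, 3 = user agent, 4 = X-Forwarded-For.
quoted_field() {
    printf '%s\n' "$1" | awk -F'"' -v n="$2" '{ print $(2 * n) }'
}

quoted_field "$line" 1   # the request, read by awstats as %methodurl
quoted_field "$line" 4   # the extra field that %otherquot skips
```

Running it against a line from your own log is a quick way to check that the fourth quoted field really is the forwarded address before you change the awstats LogFormat.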
Creating a configuration file for each vhost
Awstats is picky about configuration files: you need one config file per vhost, named following the convention awstats.domain.tld.conf, and placed inside the /etc/awstats/ directory.
So, for the vhosts alpha.com and beta.com, you should create these two files:
- awstats.alpha.com.conf
- awstats.beta.com.conf
The official method
There is already a model configuration file inside the /etc/awstats/ directory: awstats.conf. The documentation says to clone that file when creating your own config files, with
cp /etc/awstats/awstats.conf /etc/awstats/awstats.alpha.com.conf
cp /etc/awstats/awstats.conf /etc/awstats/awstats.beta.com.conf
Then you just edit these files to your needs… A method I’m not fond of. If you take a look at awstats.conf, you’ll see that it’s a very complete conf, with plenty of comments and all the available settings, all of that for just * suspense music * … 1500 lines.
I’m personally not interested in having multiple conf files of 1500 lines each, with each file differing by just 4 lines.
The DRY method
If you ls the /etc/awstats folder, you’ll see that there are 2 files by default:
- awstats.conf
- awstats.conf.local
awstats.conf is the main conf file, the origin of all the other conf files. Awstats also falls back to this file if no other config file exists.
awstats.conf.local is an empty file. It’s the parent of all the other config files. If you have some rules that are shared among all your configs, you put them here.
What I do is copy all the contents of awstats.conf into awstats.conf.local, and put only the important rules inside each vhost config, so they’re shorter and easier to read.
What to put in the conf files
Let’s create the conf file for alpha.com.
vi /etc/awstats/awstats.alpha.com.conf
We start with an empty file and insert the following lines
# Path to your nginx vhost log file
LogFile="/var/log/nginx/access.alpha.com.log"
# Domain of your vhost
SiteDomain="www.alpha.com"
# Directory where to store the awstats data
DirData="/var/lib/awstats/"
# Other alias, basically other domain/subdomain that's the same as the domain above
HostAliases="www.alpha.com"
By default, awstats stores all its data inside /var/lib/awstats/. You could change that to another directory, or have a subdirectory for each vhost, like /var/lib/awstats/alpha.com/.
But even if you use the default setting, you have to set it in each config, as it cannot be inherited from awstats.conf.local.
You’re free to add more settings if some of your vhosts require additional customization.
Repeat the same steps for each vhost.
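Since the files only differ by the domain name, the repetition can be scripted. The sketch below writes the same four directives for each vhost into a preview directory (./awstats-conf-preview, a name I made up) so you can review them before copying them to /etc/awstats. The write_vhost_conf helper and the CONF_DIR variable are hypothetical, and it assumes every vhost follows the access.domain.tld.log naming used above.

```shell
#!/bin/sh
# Where to write the generated files; a preview directory, not /etc/awstats.
CONF_DIR="${CONF_DIR:-./awstats-conf-preview}"
mkdir -p "$CONF_DIR"

# Write a minimal awstats config for one vhost: $1 = domain, $2 = target directory.
write_vhost_conf() {
    domain="$1"
    conf_dir="$2"
    cat > "$conf_dir/awstats.$domain.conf" <<EOF
# Path to your nginx vhost log file
LogFile="/var/log/nginx/access.$domain.log"
# Domain of your vhost
SiteDomain="www.$domain"
# Directory where to store the awstats data
DirData="/var/lib/awstats/"
# Other domains/subdomains that serve the same site
HostAliases="www.$domain"
EOF
}

for d in alpha.com beta.com; do
    write_vhost_conf "$d" "$CONF_DIR"
done
```

If your per-vhost files do not pick up awstats.conf.local on their own, awstats has an Include directive that can pull in another file; whether you need it depends on your setup.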
Tune the global settings
Edit awstats.conf.local:
- Disable DNS lookup: DNSLookup = 0
- Remove the LogFile, SiteDomain, DirData and HostAliases directives, as they’re useless outside their context.
- Set LogFormat to combined (if you skipped the optional step when formatting the nginx log): LogFormat = 1
- You could also enable some plugins, like GeoIP (requires additional steps besides uncommenting the line).
Computing data
Awstats is now configured for each vhost. We will now tell it to read the log files and generate the stats from them. It’s a boring operation that should be done regularly (e.g. once a day, every 6 hours, etc.) depending on your needs. The longer you wait, the bigger the log file grows and the longer it takes to process. It depends on your website traffic.
To compute the data, a perl script is available in /usr/share/doc/awstats/examples. awstats_updateall.pl computes the stats for each available config. It’s easy, just run:
/usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl
The -awstatsprog flag tells the script where to find the awstats.pl script, because awstats_updateall.pl is just a wrapper that executes awstats.pl for each of your configs.
The obvious solution to run this script regularly is to use a cron job. The drawback is that nginx logs are rotated with logrotate. It means that every X days, the log file is archived (and renamed), and a new log file is created. If you use a cron job to compute the stats:
- Just before the log rotation, you’ll lose all data between the computation and the rotation, as the file is renamed and no longer accessible to awstats.
- After the rotation, you’ll also lose all data between the computation and the next rotation.
- During the rotation, you’ll experience some weird things.
Solution #1
We could prevent the data loss by telling awstats to always parse 2 log files: the regular one, and the last archived log.
Logrotate always renames files using the convention filename.1, filename.2. At each rotation, all the numbers are incremented, and filename becomes filename.1. A new filename is created, so the newest archive is always filename.1.
In the awstats config for your vhost, edit the LogFile setting
LogFile="/usr/share/awstats/tools/logresolvemerge.pl /path/to/log/access.domain.tld.log /path/to/log/access.domain.tld.log.1 |"
logresolvemerge.pl combines the 2 log files into one.
You’ll never lose data because of the rotation, since you’ll parse the rotated file too.
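The renaming is easy to reproduce by hand. The sketch below simulates one rotation in a throwaway directory and shows why reading filename.1 together with filename recovers everything. The file names and the "hit N" lines are invented, and plain cat stands in for logresolvemerge.pl here; it only illustrates the idea, not what the real tool does internally.

```shell
#!/bin/sh
# Simulate one rotation in a throwaway directory (file names are examples).
tmp=$(mktemp -d)
log="$tmp/access.domain.tld.log"

printf 'hit 1\nhit 2\n' > "$log"   # requests logged before the rotation
mv "$log" "$log.1"                 # logrotate renames the current file
printf 'hit 3\n' > "$log"          # requests logged after the rotation

# Reading only "$log" would miss hits 1 and 2; reading both files sees all 3.
merged=$(cat "$log.1" "$log")
printf '%s\n' "$merged"

rm -r "$tmp"
```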
Solution #2
Execute the computation just before the rotation, using the logrotate prerotate hook. This is especially useful if your computation interval equals the rotation interval (e.g. you rotate every day at midnight, and you also compute every day at midnight).
Edit the logrotate config for nginx:
vi /etc/logrotate.d/nginx
I like to rotate logs every day, to keep them lighter. By default, nginx logs are rotated weekly.
/var/log/nginx/*.log {
    # Rotate daily
    daily
    missingok
    # Keep 52 days
    rotate 52
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
    sharedscripts
    prerotate
        # Trigger awstats computation
        /usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl
    endscript
    postrotate
        # Tell Nginx to reopen its log files
        [ ! -f /var/run/nginx.pid ] || kill -USR1 `cat /var/run/nginx.pid`
    endscript
}
You could also trigger the computation manually by running
/usr/share/doc/awstats/examples/awstats_updateall.pl now -awstatsprog=/usr/lib/cgi-bin/awstats.pl
directly in the shell, if you don’t want to wait for the log rotation at midnight.
You could use a regular cron job on a single log file if you compute more than once a day, and use the prerotate hook just for the computation near midnight.
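For the cron job case, a tiny wrapper makes failures easier to spot than a bare command line. This is only a sketch: update_stats is a name I made up, the two default paths are the Debian ones used in this post, and it assumes perl is on the PATH.

```shell
#!/bin/sh
# Run the awstats update for all configs, but only if both scripts are present.
# $1 = path to awstats_updateall.pl, $2 = path to awstats.pl (both optional).
update_stats() {
    updater="${1:-/usr/share/doc/awstats/examples/awstats_updateall.pl}"
    prog="${2:-/usr/lib/cgi-bin/awstats.pl}"
    if [ ! -f "$updater" ] || [ ! -f "$prog" ]; then
        echo "awstats scripts not found: $updater / $prog" >&2
        return 1
    fi
    perl "$updater" now -awstatsprog="$prog"
}
```

Save it as a script that ends with a call to update_stats and point your cron entry at that script; a non-zero exit status then tells you the scripts have moved, for instance after a package upgrade.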
Building the html reports
awstats_updateall.pl computes new stats, but does not build the html pages. Awstats comes with 2 options:
- Build the static html pages yourself
- Use cgi to build the pages dynamically
I’ll use the dynamic option, explained below. There are already plenty of articles on the internet explaining how to build static pages, if that’s the way you want to go.
Exposing awstats
Now that awstats is configured and loaded with data, let’s make it viewable from the internet.
Let’s create the subdomain where awstats will live: awstats.master-domain.com, linked to /var/www/awstats.
Let’s assume that the subdomain already points to your server (creating the subdomain is out of the scope of this post). You just have to create the nginx virtual host for awstats.master-domain.com.
How you create it is your own choice, there are multiple ways (single conf file, ‘sites-enabled’ à la apache, etc.).
A regular nginx vhost conf should look like this:
server {
    listen 80;
    server_name awstats.master-domain.com;
    root /var/www/awstats;
}
Let’s define the error log, and disable access log
error_log /var/log/nginx/awstats.master-domain.com.error.log;
access_log off;
log_not_found off;
Alias the icon folder, so it’s viewable online, instead of copy/pasting it.
location ^~ /icon {
alias /usr/share/awstats/icon/;
}
Finally, configure the /cgi-bin/ scripts to go through php-fastcgi
location ~ ^/cgi-bin/.*\.(cgi|pl|py|rb) {
    gzip off;
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index cgi-bin.php;
    fastcgi_param SCRIPT_FILENAME /etc/nginx/cgi-bin.php;
    fastcgi_param SCRIPT_NAME /cgi-bin/cgi-bin.php;
    fastcgi_param X_SCRIPT_FILENAME /usr/lib$fastcgi_script_name;
    fastcgi_param X_SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param REMOTE_USER $remote_user;
}
Edit the fastcgi_pass to point to your own php-fpm server.
Create the /etc/nginx/cgi-bin.php file
<?php
$descriptorspec = array(
    0 => array("pipe", "r"), // stdin is a pipe that the child will read from
    1 => array("pipe", "w"), // stdout is a pipe that the child will write to
    2 => array("pipe", "w")  // stderr is a pipe that the child will write to
);
$newenv = $_SERVER;
$newenv["SCRIPT_FILENAME"] = $_SERVER["X_SCRIPT_FILENAME"];
$newenv["SCRIPT_NAME"] = $_SERVER["X_SCRIPT_NAME"];
if (is_executable($_SERVER["X_SCRIPT_FILENAME"])) {
    $process = proc_open($_SERVER["X_SCRIPT_FILENAME"], $descriptorspec, $pipes, NULL, $newenv);
    if (is_resource($process)) {
        fclose($pipes[0]);
        // Forward the CGI headers; they end at the first blank line (or at EOF)
        $head = fgets($pipes[1]);
        while ($head !== false && rtrim($head, "\r\n") !== "") {
            header(rtrim($head, "\r\n"));
            $head = fgets($pipes[1]);
        }
        // Send the rest of the script output (the page body) as-is
        fpassthru($pipes[1]);
        fclose($pipes[1]);
        fclose($pipes[2]);
        $return_value = proc_close($process);
    } else {
        header("Status: 500 Internal Server Error");
        echo("Internal Server Error");
    }
} else {
    header("Status: 404 Page Not Found");
    echo("Page Not Found");
}
?>
Final vhost config :
server {
    listen 80;
    server_name awstats.master-domain.com;
    root /var/www/awstats;
    error_log /var/log/nginx/awstats.master-domain.com.error.log;
    access_log off;
    log_not_found off;
    location ^~ /icon {
        alias /usr/share/awstats/icon/;
    }
    location ~ ^/cgi-bin/.*\.(cgi|pl|py|rb) {
        gzip off;
        include fastcgi_params;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index cgi-bin.php;
        fastcgi_param SCRIPT_FILENAME /etc/nginx/cgi-bin.php;
        fastcgi_param SCRIPT_NAME /cgi-bin/cgi-bin.php;
        fastcgi_param X_SCRIPT_FILENAME /usr/lib$fastcgi_script_name;
        fastcgi_param X_SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param REMOTE_USER $remote_user;
    }
}
Beautifying the url
You can now view the stats of multiple websites from a single site: awstats.master-domain.com.
But awstats doesn’t use url rewriting for pretty links, and you end up with long and ugly urls like:
http://awstats.master-domain.com/cgi-bin/awstats.pl?config=alpha.com
http://awstats.master-domain.com/cgi-bin/awstats.pl?config=beta.com
We could make them easier to share, by transforming them into :
http://awstats.master-domain.com/alpha.com
http://awstats.master-domain.com/beta.com
In the nginx vhost for awstats.master-domain.com, add:
location ~ ^/([a-z0-9-_\.]+)$ {
    return 301 $scheme://awstats.master-domain.com/cgi-bin/awstats.pl?config=$1;
}
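To check which paths the pattern will catch before reloading nginx, you can replay the same mapping in the shell. This is a rough sketch: pretty_to_cgi is a made-up helper, and sed stands in for nginx’s regex engine, so treat it as a sanity check rather than a faithful emulation.

```shell
#!/bin/sh
# Map a short path to the long awstats path, mirroring the location regex.
# Prints nothing when the path does not match.
pretty_to_cgi() {
    printf '%s\n' "$1" | sed -n -E 's#^/([a-z0-9_.-]+)$#/cgi-bin/awstats.pl?config=\1#p'
}

pretty_to_cgi /alpha.com            # prints /cgi-bin/awstats.pl?config=alpha.com
pretty_to_cgi /cgi-bin/awstats.pl   # prints nothing: the path contains a second slash
```

The second call shows why the redirect does not loop: the long url has more than one slash, so it never matches the short-url pattern.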
Protecting the stats
Let’s now protect the stats. The idea is to have different credentials for each awstats config. The login used to view alpha.com stats should not let the user browse beta.com stats.
Let’s edit the /cgi-bin/ location block in the vhost
location ~ ^/cgi-bin/.*\.(cgi|pl|py|rb) {
    # Protect each config with a different credential
    if ($args ~ "config=([a-z0-9-_\.]+)") {
        set $domain $1;
    }
    auth_basic "Admin";
    auth_basic_user_file /etc/awstats/awstats.$domain.htpasswd;
    gzip off;
    include fastcgi_params;
    fastcgi_pass unix:/var/run/php5-fpm.sock;
    fastcgi_index cgi-bin.php;
    fastcgi_param SCRIPT_FILENAME /etc/nginx/cgi-bin.php;
    fastcgi_param SCRIPT_NAME /cgi-bin/cgi-bin.php;
    fastcgi_param X_SCRIPT_FILENAME /usr/lib$fastcgi_script_name;
    fastcgi_param X_SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param REMOTE_USER $remote_user;
}
This protects each awstats config with its own credentials, stored in /etc/awstats/awstats.domain.tld.htpasswd. Authentication is based on HTTP Basic Authentication.
For the example alpha.com and beta.com websites, the login and password are stored in
- /etc/awstats/awstats.alpha.com.htpasswd
- /etc/awstats/awstats.beta.com.htpasswd
Each file contains the credentials for the corresponding domain.
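If you are unsure which password file nginx will look for given a query string, the lookup can be mimicked in the shell. This is a sketch under assumptions: htpasswd_for_query is a hypothetical helper, and grep stands in for nginx’s regex matching (it takes the first config= it finds).

```shell
#!/bin/sh
# Print the htpasswd path that corresponds to a query string.
htpasswd_for_query() {
    domain=$(printf '%s\n' "$1" | grep -oE 'config=[a-z0-9_.-]+' | head -n 1 | cut -d= -f2)
    printf '/etc/awstats/awstats.%s.htpasswd\n' "$domain"
}

htpasswd_for_query 'config=alpha.com'
htpasswd_for_query 'config=beta.com&lang=en'
htpasswd_for_query 'lang=en'   # no config given: the domain part of the path is empty
```

The last call is worth noticing: without a config parameter the path points at a file that should not exist, so make sure you never create an htpasswd file with an empty domain part.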
You can create these files with htpasswd (a tool shipped with apache):
htpasswd -c /etc/awstats/awstats.alpha.com.htpasswd username
You’ll be prompted for the password next.
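If you’d rather not install the apache tools just for htpasswd, openssl can produce an entry in the same apr1 (MD5) format, which nginx understands. This is an alternative I’m adding, not the method used above: it assumes openssl is installed, make_htpasswd_line is a made-up helper, and the username and password are placeholders. Passing the password as an argument leaves it visible in the process list and shell history, so it’s fine for a test but think twice before doing it on a shared machine.

```shell
#!/bin/sh
# Print one "user:hash" line in htpasswd format: $1 = login, $2 = password.
make_htpasswd_line() {
    printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}

# Example: print an entry for a placeholder user.
make_htpasswd_line username 'example-password'
```

Redirect the output into /etc/awstats/awstats.alpha.com.htpasswd (or append with >> to add more users) once you are happy with it.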
Final Nginx Awstats vHost
server {
    listen 80;
    server_name awstats.master-domain.com;
    root /var/www/awstats;
    error_log /var/log/nginx/awstats.master-domain.com.error.log;
    access_log off;
    log_not_found off;
    location ^~ /icon {
        alias /usr/share/awstats/icon/;
    }
    location ~ ^/([a-z0-9-_\.]+)$ {
        return 301 $scheme://awstats.master-domain.com/cgi-bin/awstats.pl?config=$1;
    }
    location ~ ^/cgi-bin/.*\.(cgi|pl|py|rb) {
        if ($args ~ "config=([a-z0-9-_\.]+)") {
            set $domain $1;
        }
        auth_basic "Admin";
        auth_basic_user_file /etc/awstats/awstats.$domain.htpasswd;
        gzip off;
        include fastcgi_params;
        fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_index cgi-bin.php;
        fastcgi_param SCRIPT_FILENAME /etc/nginx/cgi-bin.php;
        fastcgi_param SCRIPT_NAME /cgi-bin/cgi-bin.php;
        fastcgi_param X_SCRIPT_FILENAME /usr/lib$fastcgi_script_name;
        fastcgi_param X_SCRIPT_NAME $fastcgi_script_name;
        fastcgi_param REMOTE_USER $remote_user;
    }
}
And voilà!
The alpha.com webmaster can browse its stats via awstats.master-domain.com/alpha.com, and the beta.com webmaster via awstats.master-domain.com/beta.com. Each is protected with its own credentials, no peeking.
Thanks a lot for this explanation but I get a 502 bad gateway. My server is working with Nginx, Python. Since we are using a cgi-bin.php, I assume I also have to install php7?