Recently I noticed that AdWords-generated traffic had disappeared from the Analytics panel. I thought: WTH?
I checked the logs and saw that the URL requested by AdWords:
http://my-site.com/?gclid=342343445345....
generated a 403 (Forbidden) server response. This was caused by a recent change in the Lighttpd filtering rules. I was paying for AdWords traffic, but customers were hitting a 403 error page. Oops!
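To spot such problems in the future, I created the following one-liner, which lists every error response (status 400 and above) in the access log. It relies on the status code being the ninth whitespace-separated field, as it is in the common access log layout: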
awk '$9>=400' /var/log/lighttpd/access.log | less
If you are bored of 404 errors, you can filter them out as well (leaving only errors such as 403 / 500 for investigation):
awk '$9>=400 && $9 != 404' /var/log/lighttpd/access.log | less
I discovered that the following URLs were inaccessible:
- /robots.txt (exclusion rules for web crawlers)
- /favicon.ico (icon used by web browsers)
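When the log is long, paging through raw lines gets tedious, so it can help to group the errors by status code and URL. The following is only a sketch: the function name summarize_errors is made up for this example, and it assumes the request path is field $7 and the status code is field $9, as in the one-liners above. The query string is stripped so that requests differing only in their gclid value are counted together.

```shell
#!/bin/sh
# Sketch: summarize error responses by status code and path.
# Assumption: $7 is the request path and $9 is the status code
# (common access log layout). summarize_errors is a hypothetical name.
summarize_errors() {
    # Keep only status >= 400, drop the query string from the path,
    # then count identical "status path" pairs, most frequent first.
    awk '$9 >= 400 { split($7, parts, "?"); print $9, parts[1] }' "$1" \
        | sort | uniq -c | sort -rn
}

# Example call (commented out so the file can be sourced safely):
# summarize_errors /var/log/lighttpd/access.log | head -20
```

Each output line holds a count, a status code and a path, so the most frequently failing URLs end up at the top.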
The next step could be to automate this check (a cron job that sends an alert if the number of error responses is higher than N). That is left as an exercise for the reader.
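One possible starting point for that exercise, as a rough sketch rather than a finished monitoring tool: count the non-404 error responses and print a message when the count exceeds a threshold. The function names count_errors and check_errors, the log path and the threshold of 10 are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: report when the log contains more than N error responses.
# count_errors and check_errors are hypothetical names; the status code
# is assumed to be field $9, as in the one-liners above.

count_errors() {
    # Count responses with status >= 400, ignoring 404s.
    # "n + 0" makes awk print 0 instead of an empty string.
    awk '$9 >= 400 && $9 != 404 { n++ } END { print n + 0 }' "$1"
}

check_errors() {
    log=$1
    threshold=$2
    count=$(count_errors "$log")
    if [ "$count" -gt "$threshold" ]; then
        echo "ALERT: $count error responses in $log (threshold: $threshold)"
        return 1
    fi
    return 0
}

# Example call (log path and threshold are illustrative):
# check_errors /var/log/lighttpd/access.log 10
```

Saved as a script with the final call uncommented and run from cron, the message would normally be mailed to the address set in MAILTO, provided the machine can deliver mail. Note that this counts errors in the whole file on every run, so the result depends on how often the log is rotated.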