Guide to Logaholic



Guide to Logaholic Web Analytics

This is a basic guide to the Logaholic 2.0 series website statistics software. It is installed on the server, and reads the server access logs in order to analyse and present website performance data. Its main use is to monitor and improve website traffic and performance; subsidiary functions include resolution of website errors.

Using Logaholic

Logaholic website statistics software is a popular choice as it combines good performance with low price. There are four areas in which it can be used:
  • The web metrics overview - website stats can be monitored to ensure the website is running correctly. This is a core requirement for safe operation.

  • Organic traffic tuning - visitor / traffic overview, keyword and referrer tuning, page and navigation tuning, site conversion, and many other tasks.

  • PPC tuning - advertising performance, PPC referrer monitoring, click fraud reduction.

  • Error removal - precise location and resolution of faults.


This document concentrates on basic use for organic traffic performance improvement and error resolution.


1 - Logaholic Installation

Logaholic can be installed on a normal server, i.e. a LAMP server. It is a PHP/MySQL application. It may also be installed on a Windows server that has PHP and MySQL installed. SQLite is now also supported, though with a slightly reduced range of features.

The application files are placed in a webroot folder via FTP. The folder can have any name, though /logaholic/ will be convenient. For websites that experience a high level of attacks, the folder can profitably be given a totally unrelated name, since this does not affect the results.

The geoIP plugin data file must be placed in the /logaholic/geoip/ folder. The file, geolitecity.dat, is a large one of around 27MB that usually takes time to FTP up to the site. If it is in zip format (.zip, .gz etc) then unzip it before uploading. This plugin locates visitors and referrers down to city level in many cases.

Make sure that there is an empty folder, /files/, within the Logaholic folder, so that it has a path of /logaholic/files/ (though this can vary if the Logaholic folder is given a different name). The folder permissions must allow writing to this folder and it may be found that a permission of 777 is needed.

You will need a spare MySQL database, and the username and password for it. Alternatively, you can share a DB and use a table prefix such as log_ - although we do not advise sharing with a main site application such as a CMS.

After installation is complete, you should:
1. Delete the install.php file.
2. Password-protect the /logaholic/ folder.
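
On an Apache server, step 2 is commonly done with a local htaccess file inside the Logaholic folder. The following is a minimal sketch only, assuming Apache basic authentication is available; the realm name and the path to the password file are placeholders, not values from Logaholic itself. The password file should be kept outside the webroot.

---------------------------------------------------------
## htaccess for logaholic directory - password protection
## AuthUserFile path is a placeholder - use the full server path to your own htpasswd file
AuthType Basic
AuthName "Private stats"
AuthUserFile /home/youraccount/.htpasswd
Require valid-user
---------------------------------------------------------

Many hosting control panels will create an equivalent file for you, so check there first.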

Desktop version

There is now also a Windows Desktop version. This is a client-side program that can view and use the access logs on the home / office PC. It means that MySQL is no longer required; and that 'difficult' servers are now more easily used.

The access log is downloaded by FTP, and viewed on the local machine. This will be a convenient option with IIS servers in some circumstances.


Hosted version

You can also use the hosted version. This requires JS page tags as for Google Analytics. There are pros and cons to this method (see page one of our website statistics guide for more detail). The big advantage is that the application is hosted by Logaholic and the data updates are processed automatically. There is a low monthly fee.

This version will suit, among others:
  • Large and busy sites
  • Sites on non-standard servers
  • Sites where the log files are faulty or of an unusual format
  • Windows servers with no MySQL, or other issues
  • Sites that don't want yet another PHP app on the server

Viewing data

The standard server-based application is installed via browser, and is viewed in the browser. A tabbed browser such as Firefox or Internet Explorer 7 is recommended as that will allow more than one page of data to be displayed.

To view the application, go to:

www.a3webtech.com/logaholic/

...changing 'a3webtech' to your own domain name, and the folder name too if you varied it. Bookmark this page immediately so that you can return with one click.

Log type options

The application will read several types of logs or scripts. There are two main options:
  • Collecting the data by reading the server logs (the website access logs)
  • Using a page tag tracking script

The latter option is used for high traffic sites or when the server logs are unusable.


Normally the server logs can be read directly from their usual file location, which is often in a folder in the root directory (the directory level above the webroot). If this is not possible, a cron job can be set up to dump the logs into a webroot folder daily.

Note that if Logaholic is set to read the actual server log, it can report near real-time data, as it sees the data as soon as the server logs it (immediately). However, if it is necessary to use cron dumped files, then data will be one to two days behind, as it is necessary to wait until the cron job dumps the files daily (unless a 12-hourly or 6-hourly dump is arranged).

For this reason the first method is preferred, but there is no effect on the final data. If the daily dump file method is used, then today's data cannot be accessed, though this has little effect on final analysis. The application will report that it is viewing 'Today', but if you look at the date, it will be for yesterday (the day for which it has received the dump file).

Where large log files of over 50MB are accumulated, Logaholic takes time to process them. In this case it is best to use a tabbed browser. Allow the application to start processing the file, and open another tab on your browser, so you can work elsewhere while the data is processed, which may take 5 minutes or so. Normally, Logaholic remembers the point it reached in the log file, so it does not need to process the entire log again.

Advantages of Logaholic

Logaholic is a sophisticated application in that it is hard to fool, and it can extract, rationalise and display data in advanced ways. Its performance in general is far in excess of what the initial cost would indicate.

Coverage and interpretation of visits and visitor numbers - 'uniques' - is particularly good, and this is just what we need to tune organic search performance. We could work on this aspect to the exclusion of all else, until good basic traffic and conversion results allow us to go further. Keyword and referrer data need particular examination and improvement, and we have the tools here to make progress. Do not be sidetracked into less important areas such as pageviews and clickthroughs until the basics are taken care of.

Logaholic statistics presentation and analysis is superior to that of applications costing many times more. In addition, even some of these expensive web analytics solutions will not allow us to resolve all the website errors with the ease that Logaholic allows.

The way it prioritises visitor data is unequalled until a very much larger sum is spent. We need to take maximum advantage of this and make best use of these tools. Don't forget to use the fine error data resolution tools either.

As with any software, there are areas for improvement. Bot data is one such area: too much data of bot origin is still incorporated into the results. Bots include both searchbots and 'attack' bots such as worms and other probes, and website rippers. At least, though, it is fairly clear to see where this data is being included, and to discount it. Each program upgrade brings improvement here in any case.

Logaholic certainly doesn't amass such data in the final visit figures to the extent that AWStats and Webalizer do, for example, so the results are far more accurate. Webalizer will usually tend to read 20% too high on the visit numbers compared to AWStats, which has a better final result. Logaholic, though, excludes even more of the 'false' data, so that the final visitor and visits figures are generally more accurate. This is especially true in those cases where results are being skewed by a false data source. Where the visit numbers from Logaholic, AWStats and Webalizer disagree (as they always will), use the Logaholic figure. It may be lower, but it will be more accurate.

A top-class server access log analyser such as this will always give superior results to one that uses page tags, since more data will be available. It is true, though, that a hybrid application that uses both can potentially be more powerful - at extended cost of course. Since Logaholic can use either, it is not unreasonable to expect that in the future it may be set up to use both concurrently.

Create a profile

The first task after successful installation is to create a profile: a configuration set that allows the program to read the server logs in a given location, and to interpret them according to how you specify. Go to the Profiles section >> Manage Profiles, and create a new one:

General Settings
The Profile Name should be a short and relevant one.
Enter the domain name as www.a3webtech.com or whatever it may be.

Data Collection
An individual log file can be named; or a directory - in which case all logs will be read. A filter can be applied so that only the required log or logs are read.

At the foot of the page, the text box should be filled with your home and office IPs, so that hits from these locations are not counted in the stats. The IPs are separated by a comma, and without spaces. IP exclusion of this type is a useful feature, and can also be used to exclude certain other sites.

KPIs
These are targets, usually pages, which when triggered signify that a sale or other success has occurred. For example a 'Thank You' page that is displayed after a sale would be a good target.

These 'sales targets' can be monitored and looked at weekly or monthly to see how website improvements are working. Targets have to be hit more frequently, or something is not being done right. A KPI can be set at the end of a funnel, and split testing can be used to tune the funnel or click route to that target.

After completing all parameters, save and exit to the stats display. The profiles can be viewed by clicking the link in the top left corner of the main Logaholic page. You can return and edit the profile at any time.


2 - Using Logaholic

Logaholic Menus

There are 4 top-level menus in Logaholic:

1. The Profiles menu - in the top left corner.
2. The Main menu - the top horizontal tab menu.
3. The Date menu - just below the Main menu.
4. The Category menu - the left column vertical menu.

The Profiles menu can be used to edit and update your current profile. A common use is to add an additional IP to the exclude filter (the list of your own IPs, hits from which are ignored).

The Main menu tabs give access to function groups. The first tab, Summary Reports, is used most.

The Date menu is used to look at data for periods (such as the last week or month) or for individual days (such as yesterday).

The Category menu breaks down all the data into small categories that can be examined by day or by period.

Organic traffic

The best way to learn to use the application is by following links through the pages of data, and getting an overview of what is available.

Then, pick a task to complete. For example, we could check which are our top (most popular or often-viewed) pages. In the Category menu, go down to the Popular Content section, and click Top Pages.

You can see that the date range is for the current month. This is normally a good place to start; but will not perhaps be so useful in the first week of any month, as it uses the calendar month. Therefore, go to the Date menu just above, choose the Quick Date spin box at left (aka jump menu), and pick Last 30 Days (for example).

Another important category is Top Keywords. Move down the Category menu and choose this one; and again, pick a date range. Last 7 Days is a useful one to look at here.

Tuning example

Let's take a look at a simple way to improve traffic. Go to the Top Keywords category and bring up the page showing results for the Last 7 Days - go to the Date menu and choose that period, then hit the report button to the right.

Go right down to the bottom of the page and look at the keywords that appear only once; that is, keywords that got you 1 hit in the last week.

All these 1-hit keywords (which of course are actually keyphrases) should be looked at as targets for you to improve, so that they get more traffic and move up the page. Look at them carefully. Some will be misspellings; or unusually-phrased requests; or searches that do not correspond to site areas that you wish to promote in any case. But: one in five, or maybe one in three, will be keywords that should perform much better. Unless they are really obscure searches, they should get more than one hit in a week.

Look to the right, at the page the search ended at. Open another browser tab and go to that page and check how many times that keyword occurred on the page; if perhaps it only appears once, then increase it. Use both the singular and the plural of the term; use variations; and perhaps even a misspelling. Also ensure the keyphrase exists several times on similar, related pages on the site.

Check that the page is very easily accessed from a main menu - and if not, fix it. See if there would be an advantage to linking to that page from other routes, such as sub-menus or text links. And if the page really needs promoting strongly, then find a way to insert an additional link to it from the front page; a Recent Content or Featured Pages menu display (called a module if within a CMS) can be used here. Backlinks to the page from other websites are important, using that keyword as anchor text.

Come back to the Logaholic start page. Click on the keyword and you get an Action menu. Hit View Click Trails, and see what results. On the right you can see what pages were visited as a result of this keyword search, and how long the visitor spent on each page. Open another browser tab to view this page data. If the visitor bounced (left immediately), you should consider placing a subheading with that keyword at the top of the page (if it is important to you).

Occasionally you will find that a 1-hit keyword is one you should be promoting far more strongly, and you will go ahead and build a page specifically for it.

Tuning example #2

Let's look at our keyword results as seen by the search engines. The easiest way to start is by going to the left column menu Summary Reports >> track down to the Incoming Traffic section >> Google Rankings link.

Hit this and look at the list of Google results for keywords that resulted in a site visit. The list shows your keyword (phrase / term) in the left column, and the page it was found on in Google, in the centre column. Obviously our task is to (a) get more keywords on this list, on pages of a higher position; and (b) once on page 1, to then get them in position #1 to #4.

A keyword needs to be in the 1 to 4 slot on Google.com in order to earn money. Getting there simply requires better SEO. We interpret this to mean better quality resources, which are then given better visibility - but of course there are all sorts of interpretations.

It's true that you can find this info elsewhere, though perhaps at greater effort - but having it laid out clearly and cohesively here in a list - which you can print or download to Excel - means that you will appreciate the issues better.

Ultimate success here would be the whole page full of #1's in the centre column, with one-word terms at left, which when checked would all show up in the top 4 of page 1. Well worth aiming for, if not achieving in full.

Tuning out negative factors

Sometimes you will find 'negative' keywords in the list - phrases you don't want to be found for. These can often include terms found on your Contact or About Us page. This shows that you should consider making these pages harder to index, so that priority falls on your more valuable pages. If you then look at a search engine's cache for your site, you might well find that pages such as your 404 page, Contact Us, webforms of several types, and so forth are indexed. You don't want this, because you should try to get just your main content pages indexed and ranked well. If your resource pages and assorted forms and so on are indexed and ranked, it tends to dilute the value of the important pages. In any case you don't want certain pages appearing in search results.

To fix this you should add a 'noindex' meta tag to these unimportant pages. This type of adjustment is one facet of an internal linking policy, or 'PageRank sculpting' if you like, another being the internal linking methods we briefly looked at earlier. In general it isn't worth the time to get too involved with this sort of esoteric technical factor. However, basic tuning requires that we should promote the content pages and try to restrict indexing of low-value pages, such as webforms and resource pages like your privacy policy.
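
The meta tag itself goes in the head section of each page. Where editing the pages is awkward, an alternative on Apache servers is to send the same instruction as an X-Robots-Tag response header, which Google documents as carrying the same directives as the meta tag. This is a sketch under assumptions: it needs mod_headers enabled, it goes in the htaccess file of the folder holding the pages, and the file names are hypothetical examples. On a CMS where pages are generated dynamically, file matching of this kind may not apply.

---------------------------------------------------------
## ask search engines not to index low-value pages
## file names are examples only - replace with your own
<FilesMatch "^(contact|privacy-policy)\.html$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
---------------------------------------------------------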

Tabbed browser pages

When using Logaholic, you can work with two or three browser tabs open. The main data start page is kept in the left (first) tab, and you can open new tabs for subsequent data so that you don't lose your place on the main page. Stepping backward and forward through pages is the wrong way to do it, since this is too slow and you tend to lose the thread; you need to be able to look at different pages rapidly and sequentially. If you don't normally use tabbed browsing then this is the time to start.

Error tracking

To find and remove errors, we need to know:

1. The error code.
2. What page the request was made from that resulted in the error.
3. What page or file was requested.
4. How many times and on what dates the error(s) occurred.

The two important errors to remove are 404s and 302s. Of these, the 302s represent more of a threat since search engines will penalise a site for using them extensively. It can happen that a redirect defaults to a 302, so we need to find these and eliminate them.

However, many 302s will be generated by bot probes (attacks), as the web application will refuse the request and redirect the probe elsewhere. These we can ignore as they are invisible to a search engine.

In the left column menu, under Various, choose Error Report. Click on 302. Click a page name in small blue text at left. Choose 'View Click Trails to this page' in the action menu that appears. Here, you will see the pages that were being viewed when this error was returned.

If the page has a gibberish URL such as:
//admin/include/lib.module.php  -or-  /0//includes/functions_portal.php

...then we can ignore these since they are bot probes that were redirected away (most likely to a 403 page). However if the page is one that you recognise as being on the site (or as having been deleted at some time), then action must be taken as the page is being redirected in the wrong way. All redirects of any form whatsoever on a website, or associated with it in any way, should always be 301 (permanent) redirects.

A 302 (temporary) redirect is NEVER used for ANY purpose on a website or a web server.
The only occasion a 302 is seen (or should be) is for bot attack probes.
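
On Apache servers, one way a 302 creeps in is a Redirect line written without a status code, since that directive issues a 302 unless told otherwise. A sketch of the corrected form, assuming mod_alias is available and using hypothetical page names; it belongs in the htaccess file of the webroot (not the Logaholic folder):

---------------------------------------------------------
## permanent (301) redirect from a deleted page to its replacement
## page names and domain are examples only
Redirect 301 /old-page.html http://www.a3webtech.com/new-page.html
---------------------------------------------------------

Request the old URL afterwards and confirm that the 302 entries for that page stop appearing in the Error Report.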

The same procedure is used to locate and identify 404 errors - the missing page (or file) and the page it was requested on are viewable. Therefore a dead link or similar problem can be found easily.


I can't find that feature - where is it?

If you can't find, for example, the Google Rankings page, or the Export to Excel button - then you need to upgrade to a later version of Logaholic.


3 - Logaholic security


This stats app is robust, and has only ever had one minor exploit to the best of my knowledge. The single vulnerability was patched immediately. This is a fine record since all server software has been exploited at some time, usually with multiple exploits. There is no such thing as a server application that has never been exploited. If there were, the authors would rightfully advertise that fact widely, due to its extreme rarity, and I know of no such example. Indeed, some server apps are well known for their poor record in this area.

However, you should recognise that servers are under continuous attack and any steps to improve their resistance to exploitation should be taken. Here are some you might find useful; they are not all endorsed by the software author so you should act accordingly.


1. Install the application to a 'camouflage' directory. This means don't use a name such as /logaholic/, use something else. Any name can be chosen, the program will still work.

2. Password-protect the folder. You can do this from your cPanel website control panel, where there is an icon 'Password Protect Directories'. If your host does not have a proper control panel (which always has this facility), then you should do two things:

a) Research the use of htaccess files. Use a local htaccess file (i.e. one within the Logaholic directory), and its counterpart, the htpasswd file, up in the root directory (the folder level above the webroot).
b) Move hosts as soon as is convenient. Find a real host.

3. And best of all use a local htaccess file to restrict access by IP.

This means to place an htaccess file within the Logaholic folder that only allows access (of any type) from IPs you specify. This is useful (and in my opinion vital) because otherwise it may be possible for others to view your stats. This is sensitive commercial information that may be of use to your competitors.

As an example of how this can happen, even if you install to a camouflage folder: the IA Archiver is a well-known spambot, and is the front for Alexa, the Internet data purveyors. The bot gets in everywhere and provides no useful service unless your site is one of the top 100,000 on the Net.

If you have the Alexa toolbar or widget installed on your browser, any page you visit (even a private page) will have its URL logged by Alexa. The bot then attempts to access these pages. If they are password-protected this is a first step - but basic security has been breached because a bot now knows the private URL and keeps attempting to access it.

Luckily there is a way to improve security here: deny by IP. Simply place an htaccess file in the /logaholic/ folder (or whatever you called it), using IP selectors - you need no other file or anything anywhere else - and the problem is solved. Only you, at your office or home address, can view the stats. There are some important points to remember about this method:
  • Always include the server's own IP in the Allow instruction.

  • Put all your IPs in there - office and home for example.

  • Don't use any other format than the one given here unless you test it exhaustively.

  • Especially, don't vary the allow, deny positions or insert other allow / deny variations. I have seen other variants of this script and some DON'T WORK.

  • Test it to see if it works - try it without your IP allowed. Obviously it must block you.

The file must be placed in the Logaholic folder (only). Don't place it in your webroot! That would lock everyone out of your site except you. The file locks access to the folder it is in - large or small. Find your IP by using an online IP identifier tool. Find your server's IP by looking in your site hosting records or using an online domain name to IP converter.

The first allowed IP is the server itself; the second is your office; the third is your home. You can put in as many as you like. Make the file up as a plain text file first, called logaholic-htaccess.txt, then FTP it up to the Logaholic folder.

When it is on the server, in the Logaholic folder, then change the name to .htaccess and it will then work. First, send a version up there without your IP allowed. Of course, it must block you from seeing the stats. Then add your IP - and you should be allowed in. Give someone else your Logaholic access URL and ask them to try and view your stats - they must be blocked out with a 403 error. If they get any other error code they're just going to the wrong address, maybe by typing wrongly.

Note how lines are 'commented-out'. You should use two ## on lines of info, and one # for script lines that are switched off. This makes the difference clear but has no effect on the result, as both are disregarded. Extensive text comments are inserted because in 6 months' time you won't be able to remember why you did what you did. It is very good practice to comment code like this.

Note that if/when your IP changes, you'll be locked out. You fix it by changing the IP in your htaccess file then sending up the new version by FTP. So if you are elsewhere, you won't be able to view your stats as the IP will be blocked. And if your IP changes at home, unless you can FTP a new htaccess file up - you're blocked. If you ever find yourself blocked out, your IP has changed. All standard broadband (aka DSL, ADSL) has dynamic IPs - meaning they can and will change. Cable DSL and business DSL are more stable.

If you want to understand what the script does, it says, "Files - all of them in this directory - block to everyone - but allow access to the addresses listed below".

The 3 sets of numbers must be replaced with your own IP numbers.

---------------------------------------------------------
## htaccess for logaholic directory
## deny all IPs except those listed here
## allows the server itself - office IP - home IP
<Files *>
order deny,allow
deny from all
allow from 88.xxx.xx.xxx
allow from 86.xx.xxx.xx
allow from 212.xxx.xxx.xxx
</Files>

---------------------------------------------------------
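
The order / deny / allow lines above are the older Apache 2.2 syntax. On Apache 2.4 they are deprecated and depend on the mod_access_compat module being loaded. If the script above returns a 500 error on a newer server, the following is a sketch of the equivalent in the newer syntax. It is an assumption that your host runs Apache 2.4 with mod_authz_host; the three addresses are documentation placeholders to be replaced with your own, and the test procedure described above applies equally here. Use one script or the other, not both in the same file.

---------------------------------------------------------
## htaccess for logaholic directory (Apache 2.4 syntax)
## allow only the IPs listed here - everything else gets a 403
## allows the server itself - office IP - home IP
Require ip 192.0.2.10 198.51.100.20 203.0.113.30
---------------------------------------------------------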

Logaholic tips

Include the file extension .htc in the files excluded filter, for filetypes that are not counted. It's done on the Data Collection tab, in the Skip Files text box, in your profile settings. Then Save.

The .htc file is an additional script, referenced from the CSS, that is used to correct deficiencies in Internet Explorer. The file is iepngfix.htc (by Angus Turnbull) and is seen in CMS templates. It adds instructions to older versions of Internet Explorer to overcome the fact that they cannot handle transparency in .png images, one of the three main types of image on the Net.

If you don't do this, and you are running a CMS or other comprehensively managed application, the web analytics will list this file repeatedly as a viewed page, and it will become extremely popular and prominent in your stats. Not really wanted.


Daily check

This has been a basic guide and overview. The best way to utilise the application is to make full use of it. Look at it daily, and you will soon find ways to make it work for you and earn you money.

You can download and try Logaholic free here:

   Logaholic Web Analytics

Do you have a Logaholic tip? Tell us in the Forum.
 