Dynamically count exit link click-throughs using JavaScript and PHP
Posted Monday 28.07.08
Background
If you are looking for a way to count how often outbound links located in your content are clicked, it doesn’t take long to find some sort of solution documented on the Internet. Most exit link counters work in one of two ways:
- They require you to actually set the link to point to a script on your website that in turn redirects the user to the real location. For example, you would set the link as /link/?id=1 and then your script would look up what real destination ID 1 is, log a click against it and then redirect the user.
- A JavaScript onClick event needs to be added that in turn calls a link counter script, perhaps related to Google Analytics. The downside is that in many cases this logs a unique URI that you have to associate with the link. For example, an outbound link to www.yahoo.com might be registered with onClick="registerClick('/outclick/yahoo_com');" and you would then look up in your stats software how many hits have been registered against /outclick/yahoo_com to see how popular the link is. Obviously for long URLs this can get messy.
The major downside to both of these methods is that you have to manually adjust every link in your content for it to be logged. If you run a blog, a news site or any other type of website whose content is submitted by users or continually updated by people other than yourself, it would be very difficult, if not nigh on impossible, to instruct all users on how to add link tracking code, and extremely inconvenient to go through all the content and adjust the links yourself.
This was the situation I faced with PhDSeek.com, a website of mine that allows educational organisations to advertise their available PhD projects and studentships free of charge. It seems that more often than not, universities and institutes like to use their listings as a gateway to their website. Although I can see the popularity of a given listing in my Google Analytics reports, it is hard to know whether those reading a listing actually click the link(s) in its text to visit the organisation’s website and, if they do, whether they ever come back to PhDSeek.com, given that the listers probably don’t set the link to open in a new window.
With these points in mind I set about creating a simple link tracking set-up that would dynamically tag all links to external websites on a page, automatically open them in a new window and log an outclick count against them when clicked. The beauty of the script is that it can be customised further so that outclicks are logged against the listing. In future this means I can add a feature to the listing software that lets users listing PhDs see how well a listing is performing in terms of generating click-throughs, without them having to request access to their organisation’s log files and learn how to process them to find out whether my site is generating traffic (extremely unlikely to happen given that these users are on the whole very non-technical).
The solution - overview
The solution I created uses JavaScript to find all external links and adjust them accordingly. When an outbound link is clicked, JavaScript dynamically adds a hidden image to the document, the source (or src) of which is a PHP script on the server that returns a 1×1 transparent GIF. When this image is added to the document, it is called with the URL of the link as an argument. Because the request for the image is essentially coming from the user, their IP and user agent can also be registered with the hit.
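To make the flow above concrete, here is a minimal sketch of the client side. It is not the actual tracker.js: the function names, the /log_click.php path and the www.example.com hostname are placeholders I have assumed for illustration. Only aIgnoreHosts, sLogScript and the 'url' argument come from the real script.

```javascript
// Assumed configuration values; substitute your own.
var sLogScript = '/log_click.php';       // hypothetical path to the logging script
var aIgnoreHosts = ['www.example.com'];  // hostnames that should NOT be tracked

// True when the hostname is non-empty and not in the ignore list.
function isExternalHost(sHost) {
  if (!sHost) { return false; }          // e.g. mailto: links have no hostname
  var sLower = sHost.toLowerCase();
  for (var i = 0; i < aIgnoreHosts.length; i++) {
    if (aIgnoreHosts[i].toLowerCase() === sLower) { return false; }
  }
  return true;
}

// Build the image request; the clicked link travels as the 'url' argument.
function buildLogUrl(sHref) {
  return sLogScript + '?url=' + encodeURIComponent(sHref);
}

// Add a hidden image to the document; setting src triggers the request.
function logClick(sHref) {
  var img = document.createElement('img');
  img.style.display = 'none';
  img.src = buildLogUrl(sHref);
  document.body.appendChild(img);
}

// Tag every external anchor on the page (browser only).
function tagLinks() {
  var aLinks = document.getElementsByTagName('a');
  for (var i = 0; i < aLinks.length; i++) {
    if (isExternalHost(aLinks[i].hostname)) {
      aLinks[i].onclick = function () {
        this.setAttribute('target', '_blank'); // open in a new window
        logClick(this.href);
        return true;                           // let the browser follow the link
      };
    }
  }
}

// Guarded so the helper functions can also be loaded outside a browser.
if (typeof document !== 'undefined') { tagLinks(); }
```

Note that the href of each anchor is never rewritten; only an onclick handler is attached to it.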
The PHP back-end script logs the link (unless it has already logged it in the past) and assigns it a unique ID. The link’s ID is then logged in a separate log table against the IP and user agent of the user, as well as the click-through date and time. This data can then be used in whatever way you wish to generate your own custom reports. In the case of PhDSeek.com, I have further extended the code to log the ID of the listing a link appears in, given that the same link can be present in multiple adverts that are not necessarily owned by the same user.
Because this solution uses JavaScript to dynamically tag all external links on a page, the majority of search engine crawlers will not cause it to kick in. As the href of each anchor tag remains the real destination link, you can be fairly sure your click-throughs will only be logged when a real user follows the link, not a search engine. Furthermore, as we’re not messing with the links themselves, any penalisation that may occur from a /redirect/123/ style link should not apply. As the user agent and IP address are stored for each hit though, you can easily remove non-human generated hits and adjust the back-end PHP script to ignore future counts should a bot cause counts to be registered.
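The real back end is PHP with MySQL, but the two-step logic is easy to model. The sketch below is an in-memory JavaScript illustration only: the arrays and objects stand in for the database tables, and every name in it is hypothetical rather than taken from the PHP script.

```javascript
// In-memory stand-ins for the two database tables (illustration only).
var linkIdsByUrl = Object.create(null); // url -> unique link ID
var nNextLinkId = 1;
var aClickLog = [];                     // one row per click-through

// Return the existing ID for a URL, or assign a new one the first time it is seen.
function getLinkId(sUrl) {
  if (!(sUrl in linkIdsByUrl)) {
    linkIdsByUrl[sUrl] = nNextLinkId++;
  }
  return linkIdsByUrl[sUrl];
}

// Record one click against the link's ID, with IP, user agent and timestamp.
function recordClick(sUrl, sIp, sUserAgent, dWhen) {
  var row = {
    linkId: getLinkId(sUrl),
    ip: sIp,
    userAgent: sUserAgent,
    clickedAt: dWhen || new Date()
  };
  aClickLog.push(row);
  return row;
}
```

Keeping the links and the click log separate means a long URL is stored once, however many times it is clicked.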
One downside to the solution and code below though is that it overrides any existing onClick handler attached to an anchor that points to an external site (it doesn’t touch your own internal links with onClick handlers). For my situation this is fine though, in fact preferred, given that I don’t want advertisers having the option to add their own onClick events. If you think this is going to be a problem for you, there is a function you can adjust to utilise addEventListener and attachEvent which preserve existing onClick event handlers and append your own.
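If you do need to keep existing handlers, a helper along these lines is one way to do it. This is a sketch under my own naming (addClickHandler is hypothetical, not the function in the supplied script), using addEventListener where available and attachEvent for older versions of Internet Explorer.

```javascript
// Attach a click handler without replacing handlers that are already there.
function addClickHandler(el, fn) {
  if (el.addEventListener) {
    // Standards-based browsers: handlers accumulate rather than overwrite.
    el.addEventListener('click', fn, false);
  } else if (el.attachEvent) {
    // Older Internet Explorer: wrap fn so that 'this' is the element.
    el.attachEvent('onclick', function () {
      return fn.call(el, window.event);
    });
  } else {
    // Last resort: chain onto whatever onclick already exists.
    var fnOld = el.onclick;
    el.onclick = function (e) {
      if (typeof fnOld === 'function') { fnOld.call(el, e); }
      return fn.call(el, e);
    };
  }
}
```

You would then call addClickHandler(link, yourLoggingFunction) instead of assigning to link.onclick directly.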
I have used PHP’s PEAR DB library in my code as the database abstraction layer. This is purely out of habit. If you don’t use PEAR DB, don’t want to download the library or would simply prefer to use another abstraction layer or method for querying your database, it is very straightforward to see where the minimal changes are required. I have used MySQL as my database of choice, but naturally the choice is yours.
The solution - files
Two files are required - you can view them in full or save them directly:
- The JavaScript
- The PHP script (don’t forget to rename it!)
The PHP script has the MySQL table structure at the top of it as a comment.
Both files are fully commented. It should be very straightforward to modify them for your exact needs if they don’t do quite what you want.
The only configurables you need to worry about are:
In the JS:
- aIgnoreHosts - an array of hostnames to ignore and NOT track. You’ll probably want to include your site’s main hostname here.
- sLogScript - the location of the PHP script that does the logging. It can be local or remote.
If you don’t want tracked links to open in a new window, or you would rather that choice was left to the person putting the links into the content, look for the CT_LOGGER.log_click function and remove the line that sets the ‘target’ attribute.
In the PHP: Your database connection details. Find and replace db_username, db_password, db_hostname and db_name accordingly.
The JavaScript should be included at the end of your pages that you want to monitor the click throughs of. If you have a general ‘footer’ template, this should be easy. Whatever your scenario, ensure the include is just before the </body> tag. e.g.:
<script type="text/javascript" src="/javascript/tracker.js"></script>
</body>
Advisable modifications
It is unlikely your PHP logging script will be discovered easily, but as with all things web - never say never. So it is advisable to implement a precautionary step or two to avoid the script being exploited if it is found. A systematic attack could result in your database filling up with thousands of bogus ‘hits’. Here are a few suggested checks you can make that, if not satisfied, should result in your script exiting or just returning the 1×1 GIF without any processing/logging being carried out:
- Check $_SERVER['HTTP_REFERER'] and, if it is not set or does not point to your site’s hostname, assume abuse and bail out. In most legitimate cases it will be set, so your results will still be quite accurate even though the occasional genuine click without a referer will go unlogged.
- Monitor how many times an IP is logging clicks within a given time-frame. If you see more than a handful per minute over the course of a few minutes, you might like to start getting suspicious.
- Set a cookie when the user arrives at your site and then check this cookie is present before logging any clicks.
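The first two checks can be sketched as follows. In the real set-up they belong in the PHP script; this is just a JavaScript model of the logic, and the hostname, the threshold of five clicks per minute and all function names are assumptions of mine. The cookie check is not shown.

```javascript
var sSiteHost = 'www.example.com';   // assumed: your site's hostname (lower case)
var nMaxClicks = 5;                  // assumed threshold: clicks allowed per window
var nWindowMs = 60 * 1000;           // window length: one minute
var clicksByIp = Object.create(null); // ip -> timestamps (ms) of recent clicks

// Check 1: the referer must parse as a URL whose host is your own site.
function isRefererOk(sReferer) {
  if (!sReferer) { return false; }
  try {
    return new URL(sReferer).hostname === sSiteHost;
  } catch (e) {
    return false; // not a parseable absolute URL
  }
}

// Check 2: allow at most nMaxClicks per IP inside the sliding window.
function isRateOk(sIp, nNowMs) {
  var aRecent = (clicksByIp[sIp] || []).filter(function (t) {
    return nNowMs - t < nWindowMs;
  });
  var bOk = aRecent.length < nMaxClicks;
  if (bOk) { aRecent.push(nNowMs); } // only accepted clicks count towards the limit
  clicksByIp[sIp] = aRecent;
  return bOk;
}

// Log only when both checks pass; otherwise just return the GIF.
function shouldLog(sReferer, sIp, nNowMs) {
  return isRefererOk(sReferer) && isRateOk(sIp, nNowMs);
}
```

Comparing the parsed hostname, rather than searching the referer string for your domain, avoids being fooled by a URL that merely contains your hostname somewhere in its path.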
If, as in my case, the same link can appear on multiple pages and you want to be able to see which pages generate a higher outclick rate, modify the JavaScript to pull out a unique ID from the host page URL. Say for instance you have an outclick on http://www.mysite.com/article.php?id=123: get the JS to extract the ‘123’ and send it along as an additional parameter alongside the ‘url’. Update the PHP to grab this ID and then log it in the ct_Log table with the IP and user agent. You’ll then know which article a particular user was viewing when they clicked out.
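A rough sketch of that modification, assuming the page ID lives in an ‘id’ query parameter as in the example URL. The function names, the /log_click.php path and the name of the extra ‘id’ parameter sent to the logging script are my own placeholders.

```javascript
var sLogScript = '/log_click.php'; // hypothetical path to the logging script

// Pull the numeric 'id' value out of a query string such as '?id=123'.
function getPageId(sSearch) {
  var aMatch = /[?&]id=(\d+)/.exec(sSearch || '');
  return aMatch ? aMatch[1] : null;
}

// Build the image request, adding the page ID when one is present.
function buildLogUrl(sHref, sSearch) {
  var sUrl = sLogScript + '?url=' + encodeURIComponent(sHref);
  var sId = getPageId(sSearch);
  if (sId !== null) { sUrl += '&id=' + sId; }
  return sUrl;
}
```

In the browser you would call buildLogUrl(link.href, window.location.search), then read the extra parameter in the PHP alongside ‘url’.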