CodeTron (Rob) commented on one of my posts. Looking at Rob's blog reminded me of something I did a while ago: he was analysing Google's PageRank and mentioned the probability of being on a web page.
I ran an internal website for a couple of years and wanted an efficient and effective design. There are many programs that will analyse your web statistics, but few that tell you how to improve your site.
Running our website, we wanted to provide information quickly and efficiently, and I used one simple measure: for each page, I counted how many pages were viewed before the visitor reached the current page.
The number of web pages viewed is a stopping time and followed a geometric distribution; the probability of stopping gives a measure of how good your website is at providing information. The higher the probability of someone leaving, the better the site is at providing information, i.e. the user found the information quickly and left.
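As a minimal sketch of that measure, assuming you have the number of pages viewed in each session (the data below is invented for illustration), the maximum-likelihood estimate for a geometric distribution starting at 1 is simply one over the mean session length:

```python
def estimate_stopping_probability(session_lengths):
    """MLE for the geometric parameter p on 1, 2, 3, ...: p = 1 / mean length."""
    mean_length = sum(session_lengths) / len(session_lengths)
    return 1.0 / mean_length

# Hypothetical sessions: pages viewed before the visitor left the site.
sessions = [1, 1, 2, 1, 3, 2, 1, 4, 1, 2]
p = estimate_stopping_probability(sessions)
print(f"Estimated probability of stopping on any given page: {p:.2f}")
```

A rising estimate of `p` over time means visitors are, on average, finding what they want in fewer clicks.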
Re-reading that (25 Oct 07), I need to explain. Assuming the site is an information site or product catalogue, you want visitors to find what they are looking for as quickly as possible, i.e. in as few clicks as possible: they find what they want in the minimum number of clicks and leave the site. Most analysis programs count the number of pages viewed per hour/day/whatever and want that number to increase. There are a few ways of achieving that: increasing the number of visitors, increasing the number of visits, or increasing the number of pages viewed per visit.
Assuming they left on the page they were looking for, I moved the pages on which visitors most frequently stopped higher up the hierarchy (search).
By moving the pages I effectively increased the probability of finding the answer. Within a month I had doubled the probability of finding the required page (as measured by the slope of the geometric distribution). At the same time the number of hits increased dramatically: as the site became more useful, it was used more frequently. I had more visits that viewed fewer pages, but my overall traffic (visitors per day) increased substantially.
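The ranking step can be sketched as follows, assuming each session is recorded as a list of pages in viewing order (the page names here are invented):

```python
from collections import Counter

# Hypothetical sessions: each is the ordered list of pages a visitor viewed.
sessions = [
    ["home", "products", "widget-specs"],
    ["home", "widget-specs"],
    ["home", "faq"],
    ["home", "products", "widget-specs"],
    ["home", "contact"],
]

# Count how often each page was the last one viewed in a session.
last_page_counts = Counter(session[-1] for session in sessions)

# Pages with the highest stop counts are candidates for promotion
# higher up the hierarchy (fewer clicks from the home page).
for page, count in last_page_counts.most_common():
    print(page, count)
```

In this made-up data, `widget-specs` ends the most sessions, so it would be the first page to move closer to the home page.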
2 comments:
So you were effectively attempting to minimize your expected stopping time? This was a custom application that you built? Was it done by manual trial and error, or did you solve any equations to get there? This is pretty cool stuff, you have inspired me to write up some more material. Also, I like your claim of the stopping time following a geometric distribution. With the Google model, (1-d) is your probability of exiting to the outside world at any time, and d is the probability of staying within your site (though there is a negligible chance of re-entering your site randomly), so P(exiting at time n) = d^n(1-d).
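The commenter's model is easy to sanity-check numerically. A small sketch, using the conventional PageRank damping value d = 0.85 purely for illustration:

```python
# Stopping model from the comment: P(exit at step n) = d**n * (1 - d),
# where d is the probability of staying on the site at each step.
d = 0.85

probs = [d**n * (1 - d) for n in range(10_000)]
total = sum(probs)  # should be ~1: the exit probabilities form a distribution
expected_steps = sum(n * p for n, p in enumerate(probs))  # ~ d / (1 - d)

print(f"sum of probabilities = {total:.6f}")
print(f"expected steps before exit = {expected_steps:.3f}")
```

The expected number of steps before exit comes out at d/(1-d), about 5.7 pages for d = 0.85, which matches the intuition that a higher stay-probability means longer browsing sessions.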
I was reducing the number of links someone had to follow to find the information they wanted. The stopping count (number of pages viewed before leaving the site) was a measure of this. Assuming they stopped browsing once they found the data they wanted, you only needed to count how many pages were viewed in a session and which page was viewed last.
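Deriving those two measurements from a raw log can be sketched like this; the (session id, page) event format is an assumption for illustration:

```python
# Hypothetical event log: one (session_id, page) pair per page view, in order.
raw_log = [
    ("s1", "home"), ("s1", "products"), ("s1", "widget-specs"),
    ("s2", "home"), ("s2", "faq"),
    ("s3", "home"),
]

# Group page views by session, preserving viewing order.
sessions = {}
for session_id, page in raw_log:
    sessions.setdefault(session_id, []).append(page)

# For each session: the stopping count and the last page viewed.
for session_id, pages in sessions.items():
    print(session_id, len(pages), pages[-1])
```

From there, the session lengths feed the geometric fit and the last pages feed the promotion ranking.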
I discovered this simple measure after analysing a lot more data. I started by doing a box and whisker plot for each page using custom software. I noticed that the most frequently viewed pages had a very short tail as if people had stopped on that page. Then it struck me that they had found the page they wanted and stopped looking.
From your description, the Google model is similar to mine, although I was trying to optimise a single website and find the measurements to do so.
The effort was driven by a dislike of programs that treat page views as the holy grail: serve up as many pages as you can, for as long as you can, to each visitor. These programs have lots of data but no information on how to improve your website. I was after something that would actually help by showing what to do and where to apply effort.
Counter to convention, I was reducing the number of pages viewed. The result was that traffic to the internal site trebled as it became more useful.
Remember this is for an information site. Other types of site may want a different model.