Anyway, a recent paper collected traffic going into and out of the servers at Indiana University. Using this traffic, the authors were able to disprove three major assumptions underlying PageRank. PageRank assumes that
- a user is equally likely to follow any link on a page.
- the probability of "teleporting" (jumping directly) to a web page is the same for every page.
- the probability of "teleporting" away from a web page is the same for every page.
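
To make the three assumptions concrete, here is a minimal sketch of the textbook PageRank iteration in Python. It is not the paper's code or any search engine's actual implementation. The function name `pagerank`, the `damping` value of 0.85, and the toy graph are my own choices for illustration, and the sketch assumes every link target also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook PageRank by power iteration (illustrative sketch only).

    links maps each page to the list of pages it links to.
    Assumes every link target is also a key of links.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        new_rank = {p: 0.0 for p in pages}
        for page in pages:
            outlinks = links[page]

            # Assumption 3: the chance of teleporting away is the same
            # (1 - damping) no matter which page the user is on.
            # Assumption 2: a teleport lands on every page with equal
            # probability 1 / n.
            for target in pages:
                new_rank[target] += (1 - damping) * rank[page] / n

            if outlinks:
                # Assumption 1: every link on the page is equally likely
                # to be followed.
                for target in outlinks:
                    new_rank[target] += damping * rank[page] / len(outlinks)
            else:
                # A page with no links: treat it as a uniform teleport.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Toy graph: both "a" and "b" link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph "c" ends up ranked highest because two pages link to it, regardless of how many people would actually click those links. Swapping the uniform terms for probabilities measured from real traffic is exactly where the paper's findings would come in.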
The bottom line is that the link structure of the web is not that good at predicting the paths people actually follow while browsing. Yet the major search engines are built on the premise that link structure determines popularity. The redeeming quality of search engines, according to this paper, is that they lead people to less popular sites, or sites we would not otherwise find out about, and thus spread the wealth of clicks around (which conflicts with what I said in my first post on Google bias).