I believe that just because something claims to be healthy does not mean it is. For example, "low fat" to me means "less bad" rather than actually "good". I originally held the same opinion of educational TV. TV is brain rot for children (which is sometimes worth the quiet it brings). However, two cartoons on PBS have changed my view: Super Why and Sid the Science Kid.
Our daughter rarely seemed interested in learning letters from her mother and father, but once she started watching Super Why she really began to pick up everything the show taught about letters. She had the alphabet down after just a couple of weeks of watching the show once a day. Entertaining education really worked for her.
Sid the Science Kid is the latest show that actually teaches our daughter something. She has learned about washing your hands to remove germs, what "melting" means, and what seeds are good for. I am a real fan of this show.
Wednesday, December 3, 2008
Wednesday, November 19, 2008
Google's Ranking Algorithm In Review
Google started on the basis of a ranking algorithm called PageRank (discussed in previous posts here and here). Of course there is so much more to the secret sauce for these search engines now. We just don't know what they are using.
Anyway, a recently published paper collected the traffic going into and out of the servers at Indiana University. Using this traffic, the authors were able to disprove three major assumptions underlying PageRank. PageRank assumes that:
- a user is equally likely to follow any link on a page.
- the probability of "teleporting" (or going directly) to a web page is the same for every web page.
- the probability of "teleporting" away from a web page is the same for every web page.
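To make the three assumptions concrete, here is a minimal sketch of the textbook PageRank iteration. The toy link graph, the `pagerank` function name, and the damping value of 0.85 are my own choices for illustration, not something taken from the paper or from Google's actual system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Assumptions 2 and 3: the chance of teleporting (1 - damping) is the
        # same on every page, and every page is an equally likely destination.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Assumption 1: each link on the page is equally likely.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no links hands its rank to all pages evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "d" links out but nobody links to it.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

Each of the three uniform splits is a place where real traffic data could be substituted, which is roughly the kind of comparison the paper's measurements make possible.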
The bottom line is that the links of the web are not that good at predicting the actual paths people follow while browsing. Yet the idea that link structure determines popularity is the basis of the major search engines. The redeeming quality of search engines, according to this paper, is that they lead people to less popular sites, or sites we would not otherwise find out about, and thus spread the wealth of clicks around (which conflicts with what I said in my first post on Google bias).
Thursday, November 13, 2008
The Machine is Us/ing Us
Monday, November 3, 2008
Google Bias Take 2
I posted earlier that Google's ranking of search results causes a rich-get-richer problem. In other words, the sites linked to most often are ranked first, which leads to even more links.
Here is a paper that uses traffic information from Alexa to disprove this theory. It turns out that queries on search engines are very diverse. This leads to sites appearing towards the top that more specifically target the keywords given. For example, Google's Udi Manber said "20 to 25% of the queries we see today, we have never seen before".
Current traffic from Alexa more closely follows the random surfer model, in which people discover web pages by viewing non-search web pages and clicking on links. It is good to see that worrisome theories are being put to the test.
Wednesday, October 29, 2008
Pandora.com
For a time I had no hope that recommender systems like Amazon.com's "Recommended for You" section would be useful to me specifically. The predictions were often predictable. Buy a CD from artist A and get a list of the most popular CDs from that artist. Not useful.
Some time ago I came across Pandora.com, an adaptive radio station that chooses songs to play based on the songs you have added to a station and the songs you rate positively. I actually learned of several songs and artists I was unfamiliar with that I now like (such as "Question Everything" by 8Stops7). However, not every song it plays is similar to the songs I give it, and some days I find myself disagreeing with all of the songs played.
I think that as time goes on recommender systems will improve and we will give some credibility to recommenders. Perhaps the Netflix prize will help in that regard.
Netflix Recommender System
Netflix is trying to motivate research in the area of recommender systems, and on Oct. 2, 2006 it offered $1 million to anyone who could improve upon its current recommender system by a specific measure (improve RMSE by 10%). Recently I took a look at the current standings, and one team is very close (improvement around 9%). Interestingly enough, they have published a few papers showing how they do it.
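For anyone unfamiliar with the measure, RMSE (root mean squared error) is the square root of the average squared gap between predicted and actual ratings, so lower is better. Here is a small sketch of how a percentage improvement would be computed. The ratings and predictions are made up, and the helper names are mine.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Made-up star ratings and two made-up sets of predictions.
actual = [5, 3, 4, 1]
baseline_predictions = [4, 4, 4, 2]
improved_predictions = [4.5, 3.5, 4, 1.5]

baseline_score = rmse(baseline_predictions, actual)
improved_score = rmse(improved_predictions, actual)
print(f"baseline RMSE: {baseline_score:.3f}")
print(f"improved RMSE: {improved_score:.3f}")
print(f"improvement: {relative_improvement(baseline_score, improved_score):.0%}")
```

On this toy data the second set of predictions halves the error. The prize target is a 10% reduction relative to Netflix's own system, measured the same way.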
Specifically, what we are talking about is collaborative filtering. There are two main approaches: either you look for global patterns in the matrix of ratings, or you use the ratings from similar items or users. BellKor (team name) was able to successfully merge these two ideas into a single solution that outperformed (at the time of submission) any approach using only one of the two.
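To give a feel for the two families, here is a toy sketch of each: an item-neighborhood predictor and a small matrix factorization trained by stochastic gradient descent. This is not BellKor's integrated model and makes no attempt to reproduce it. The ratings, function names, and hyperparameters are all made up for illustration.

```python
import math
import random

# Made-up ratings: (user, movie) -> stars. Dan has not rated "heat".
ratings = {
    ("ann", "alien"): 5, ("ann", "heat"): 4, ("ann", "up"): 1,
    ("bob", "alien"): 4, ("bob", "heat"): 5, ("bob", "up"): 2,
    ("cat", "alien"): 1, ("cat", "heat"): 2, ("cat", "up"): 5,
    ("dan", "alien"): 5, ("dan", "up"): 1,
}


def item_similarity(item_a, item_b):
    """Cosine similarity of two movies over the users who rated both."""
    by_user_a = {u: r for (u, i), r in ratings.items() if i == item_a}
    by_user_b = {u: r for (u, i), r in ratings.items() if i == item_b}
    common = set(by_user_a) & set(by_user_b)
    if not common:
        return 0.0
    dot = sum(by_user_a[u] * by_user_b[u] for u in common)
    norm_a = math.sqrt(sum(by_user_a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(by_user_b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)


def predict_neighborhood(user, item):
    """Neighborhood approach: similarity-weighted average of the user's
    own ratings on the other movies."""
    weighted_sum = weight_total = 0.0
    for (u, other), r in ratings.items():
        if u == user and other != item:
            weight = item_similarity(item, other)
            weighted_sum += weight * r
            weight_total += weight
    return weighted_sum / weight_total if weight_total else None


def factorize(factors=2, epochs=500, learning_rate=0.02, reg=0.05, seed=0):
    """Global-pattern approach: learn a short vector per user and per movie
    so that mean + dot(user_vector, movie_vector) approximates each rating."""
    rng = random.Random(seed)
    mean = sum(ratings.values()) / len(ratings)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(factors)] for i in items}
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            error = r - (mean + sum(a * b for a, b in zip(p[u], q[i])))
            for f in range(factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += learning_rate * (error * qi - reg * pu)
                q[i][f] += learning_rate * (error * pu - reg * qi)

    def predict(user, item):
        return mean + sum(a * b for a, b in zip(p[user], q[item]))

    return predict


neighborhood_guess = predict_neighborhood("dan", "heat")
predict_factor = factorize()
factor_guess = predict_factor("dan", "heat")
print("neighborhood guess for dan/heat:", round(neighborhood_guess, 2))
print("factorization guess for dan/heat:", round(factor_guess, 2))
```

The neighborhood predictor only looks at what Dan himself rated, weighted by how similar those movies are to the target, while the factorization fits every rating at once. As I understand the paper, its contribution is in combining the two views in one model rather than running them side by side like this.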
What impressed me most about the paper I read (Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model) was that, in addition to measuring RMSE on the test set, they tried to look at it from the user's perspective. We want to know what movie to watch now. They compared other approaches against theirs on whether each would recommend, in its top 5 or top 20, a movie you would watch and rate a 5. Well done. We should all keep the end user in mind.
Anyone have a really good or bad experience with recommendations made by computers?
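A rough sketch of that kind of top-K check: for each movie a user really did rate a 5, rank it against a pool of other movies by predicted score and see whether it lands in the top K. The scores, the candidate pool, and the function name here are assumptions of mine, and the exact protocol used in the paper may differ.

```python
def hit_rate_at_k(score, five_star_pairs, candidates, k):
    """Fraction of held-out 5-star (user, movie) pairs that the scoring
    function ranks within the top k of the candidate pool."""
    hits = 0
    for user, favorite in five_star_pairs:
        pool = [m for m in candidates if m != favorite] + [favorite]
        ranked = sorted(pool, key=lambda m: score(user, m), reverse=True)
        if favorite in ranked[:k]:
            hits += 1
    return hits / len(five_star_pairs)


# Made-up predicted scores standing in for a real recommender.
predicted = {
    ("ann", "alien"): 4.6, ("ann", "heat"): 4.1,
    ("ann", "up"): 2.0, ("ann", "big"): 3.0,
    ("bob", "alien"): 3.9, ("bob", "heat"): 4.4,
    ("bob", "up"): 4.5, ("bob", "big"): 2.5,
}


def toy_score(user, movie):
    return predicted[(user, movie)]


movies = ["alien", "heat", "up", "big"]
five_star_pairs = [("ann", "alien"), ("bob", "heat")]

top1 = hit_rate_at_k(toy_score, five_star_pairs, movies, k=1)
top2 = hit_rate_at_k(toy_score, five_star_pairs, movies, k=2)
print("top-1 hit rate:", top1)
print("top-2 hit rate:", top2)
```

Two recommenders with nearly the same RMSE can differ a lot on a measure like this, because it only cares about what ends up at the top of the list.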
Tuesday, September 30, 2008
Do Good Grades Predict Success? (Freakonomics blog entry)
I recently read the post named in the title of this entry on the Freakonomics blog, which I frequent. I love the question, and I have wondered about some of the following related questions myself:
- Do grades measure our understanding or ability to learn?
- How fair is it to compare grades of different students from different schools, classes, teachers? (Some teachers are "easy" and some "hard".)
- Looking for a job.
- Interviewing well.
- Being a programmer in the real-world.