Introduction

Ok, so smaller result sets show an MSN Search bias to IIS... what about a random selection of 1000 words? What happens then?

So one dictionary file and Perl shuffle later... et voilà.
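For anyone wanting to repeat the word selection, here is a minimal sketch of the "dictionary file plus shuffle" step. The original was done in Perl; this is a Python stand-in, and the dictionary path and both function names are assumptions made up for illustration.

```python
import random


def load_dictionary(path="/usr/share/dict/words"):
    """Read one word per line. The default path is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()


def pick_random_words(words, count=1000, seed=None):
    """Return up to `count` distinct words chosen uniformly at random."""
    # Drop blank lines and duplicates so no word is searched twice.
    unique = sorted({w.strip() for w in words if w.strip()})
    rng = random.Random(seed)
    # sample() draws without replacement, which is equivalent to
    # shuffling the list and keeping the first `count` entries.
    return rng.sample(unique, min(count, len(unique)))
```

Passing a fixed `seed` makes the selection repeatable, which helps if the same word list needs to be re-run later.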

The Results:

The results might need a bit more processing... I think unreachable hosts (that currently drop into the 'Unknown' category) might be altering the results slightly.
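To see how much the 'Unknown' bucket could shift the percentages, something like the following sketch could be used. The matching rules and the names `classify_server` and `server_shares` are assumptions; the actual classification used for these results isn't described here.

```python
from collections import Counter


def classify_server(header):
    """Map a raw Server header (None for an unreachable host) to a category."""
    if not header:
        return "Unknown"
    value = header.lower()
    if "apache" in value:
        return "Apache"
    if "microsoft-iis" in value:
        return "IIS"
    return "Other"


def server_shares(headers, include_unknown=True):
    """Return each category's percentage share of the classified hosts."""
    counts = Counter(classify_server(h) for h in headers)
    if not include_unknown:
        # Leave unreachable hosts out of the denominator entirely.
        counts.pop("Unknown", None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

Comparing `server_shares(headers)` with `server_shares(headers, include_unknown=False)` shows how far the unreachable hosts move each percentage.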

But there's still a 7% IIS bias with this large set of completely random words.

A few of the accented characters are causing problems; I'll sort out escaping the character codes asap.
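The escaping problem comes down to percent-encoding each word before it goes into a search URL. A rough sketch follows; the helper name `escape_query_word` is hypothetical, and which encoding each search engine expects is an assumption that would need checking per engine.

```python
from urllib.parse import quote


def escape_query_word(word, encoding="utf-8"):
    """Percent-encode a word for use in a URL query string."""
    # safe="" makes quote() escape every reserved character, including "/".
    return quote(word, safe="", encoding=encoding)
```

With the default UTF-8 setting an accented "é" becomes two escaped bytes, while `encoding="latin-1"` produces a single escaped byte, so the same word can yield different query strings depending on what the engine expects.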

Detailed Results:

The full set of 1000 words creates a rather heavy page, so the detailed per-word analysis is now on a separate page and can be found here.

Out of the full set of words (currently 984, actually), a startling trend can be seen by looking at the number of times Apache is the lowest-counted webserver for each search engine and IIS the highest. Using MSN Search, you are three times as likely to get a result set containing more IIS server results than using Google, Teoma or Yahoo.

 


These graphs show the number of times a particular search engine had the lowest count of Apache servers in its result set, or the highest number of IIS servers. With an unbiased set of results you would expect the ratio to be even across each search engine.
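The tally behind these graphs can be sketched as below. The input layout and the name `tally_extremes` are assumptions, and crediting every engine that ties for an extreme value is a guess at how ties were handled.

```python
from collections import Counter


def tally_extremes(per_word_counts):
    """Count how often each engine has the lowest Apache or highest IIS count.

    per_word_counts maps word -> engine -> {"Apache": n, "IIS": n}.
    """
    lowest_apache = Counter()
    highest_iis = Counter()
    for engines in per_word_counts.values():
        if not engines:
            continue
        min_apache = min(c["Apache"] for c in engines.values())
        max_iis = max(c["IIS"] for c in engines.values())
        for engine, c in engines.items():
            # Ties credit every engine that shares the extreme value.
            if c["Apache"] == min_apache:
                lowest_apache[engine] += 1
            if c["IIS"] == max_iis:
                highest_iis[engine] += 1
    return lowest_apache, highest_iis
```

With ties counted this way, the per-engine totals can add up to more than the number of words searched.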

 

 

Lowest Apache count:

MSN - 502
Google - 216
Teoma - 216
Yahoo - 161

Highest IIS count:

MSN - 590
Google - 199
Teoma - 155
Yahoo - 122
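As a rough check on the "three times as likely" figure, the ratio of MSN's count to each other engine's count can be worked out from the two lists above. The helper name `ratios_against` and the variable names are made up for illustration; the numbers are copied from the lists as given.

```python
def ratios_against(counts, baseline="MSN"):
    """Return baseline_count / engine_count for every other engine."""
    base = counts[baseline]
    return {engine: base / n for engine, n in counts.items() if engine != baseline}


# Counts copied from the two lists above, in the order they appear.
first_list = {"MSN": 502, "Google": 216, "Teoma": 216, "Yahoo": 161}
second_list = {"MSN": 590, "Google": 199, "Teoma": 155, "Yahoo": 122}

for label, counts in (("first list", first_list), ("second list", second_list)):
    for engine, ratio in ratios_against(counts).items():
        print(f"{label}: MSN vs {engine} = {ratio:.2f}x")
```

For the second list the MSN to Google ratio works out at roughly 2.96, which is where a "three times" figure would come from; the ratios against Teoma and Yahoo are higher still.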

 

Top 100 summary

These are the total results, totted up across all words, for each search engine.

As you can see, the results for Google, Teoma and Yahoo are fairly consistent, but the MSN Search results are significantly skewed.

The numbers right at the bottom of the table show the total number of times, over the full list of words, that each search engine returned either the lowest number of Apache servers or the highest number of IIS servers.

Server   %
Apache   65.03%
IIS      30.89%
Other    4.08%

Server   %
Apache   67.68%
IIS      26.37%
Other    5.95%

Server   %
Apache   71.38%
IIS      23.50%
Other    5.11%

Server   %
Apache   69.83%
IIS      24.66%
Other    5.50%

 

Top 10 summary

This shows the distribution of webservers just within the crucial front-page top ten.

The results are in line with the top 100 results.

Server   %
Apache   66%
IIS      31%
Other    5%

Server   %
Apache   68%
IIS      27%
Other    6%

Server   %
Apache   72%
IIS      24%
Other    6%

Server   %
Apache   70%
IIS      25%
Other    6%


All content Copyright © 2005 Ivor Hewitt.

http://www.ivor.it - Technology - http://www.ivor.org - The Hedge.