[Search-l] Wikia - Global focus or country level search?

John McCormac jmcc at hackwatch.com
Mon Aug 13 15:25:17 UTC 2007


peter burden wrote:
> This suggests that it is possible, accurately, to determine the country
> of a web-site. This isn't possible. Neither DNS nor IP block based
> geo-coding will deliver the information for a variety of reasons. And
> should a national SE ignore global content? If I (in England) am
> searching for a restaurant, I'm clearly not interested in those in
> Seattle, but if I'm searching for information about a computing
> problem, I really am interested in what a certain company in the
> Seattle area might have to say.

Actually it is possible to group sites by country with a reasonable 
level of accuracy. The more mature markets tend to be a lot easier 
because the bulk of the country's domains/sites are hosted on that 
country's IP space. However, it requires a lot more than simple DNS/IP 
analysis to get that level of accuracy up from the 
Google/Yahoo/Microsoft level.
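
For illustration only, the "simple DNS/IP analysis" baseline might look 
something like the sketch below. It is not the algorithm referred to in 
this message. The IP blocks are documentation ranges standing in for 
real country allocations, and the names COUNTRY_BLOCKS, CCTLDS and 
classify are hypothetical. The hosting IP is passed in rather than 
resolved, so the example runs without network access.

```python
import ipaddress

# Hypothetical data: documentation IP ranges standing in for real
# per-country allocations. Real work would need registry data.
COUNTRY_BLOCKS = {
    "IE": [ipaddress.ip_network("192.0.2.0/24")],
    "GB": [ipaddress.ip_network("198.51.100.0/24")],
}

# Country-code TLDs mapped to the same country labels.
CCTLDS = {"ie": "IE", "uk": "GB"}


def country_from_tld(hostname):
    """Return the country implied by the hostname's ccTLD, or None."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return CCTLDS.get(tld)


def country_from_ip(ip_text):
    """Return the country whose IP blocks contain the address, or None."""
    addr = ipaddress.ip_address(ip_text)
    for country, blocks in COUNTRY_BLOCKS.items():
        if any(addr in block for block in blocks):
            return country
    return None


def classify(hostname, ip_text):
    """Combine the two signals into (country, confidence).

    Agreement between ccTLD and hosting IP gives "high" confidence.
    A single signal, or a disagreement (where the ccTLD is preferred),
    gives "low". No signal at all gives (None, "none").
    """
    by_tld = country_from_tld(hostname)
    by_ip = country_from_ip(ip_text)
    if by_tld is not None and by_tld == by_ip:
        return by_tld, "high"
    if by_tld is not None or by_ip is not None:
        return (by_tld or by_ip), "low"
    return None, "none"


if __name__ == "__main__":
    print(classify("example.ie", "192.0.2.10"))
    print(classify("example.com", "192.0.2.10"))
    print(classify("example.com", "203.0.113.5"))
```

The "low" and "none" cases are where a baseline like this runs out: a 
.com site hosted abroad gives it nothing to work with, which is why 
more than DNS/IP data is needed.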

Whether a national SE should ignore global content is, as they used to 
say in the "Father Ted" sitcom, an ecumenical question. But it does 
strengthen the argument for a clustered approach with separate global 
and country indices.

What is more difficult is identifying associated sites outside the 
country's IP/DNS space. This is where Google/Yahoo/Microsoft fail when 
dealing with TLDs/gTLDs associated with a country. It is also a lot 
easier than it appears, though I have only run the algorithm for 
identifying Irish sites. It had a relatively high success rate (about 
95%), but that is due to the quality of the data I have on the Irish 
domain space. I could probably apply it to the lower quality UK data 
(less than 40% .uk coverage) that I have here to see if the algorithm 
would work as well.

Regards...jmcc
-- 
******************************************************
John McCormac  *  e-mail: jmcc at whoisireland.com
MC2            *  voice:  +353-51-873640
22 Viewmount   *  web:  http://www.whoisireland.com/
Waterford      *  blog: http://blog.whoisireland.com
Ireland        *  Irish Domain Stats & Market Research
******************************************************


