[Search-l] Wikia - Global focus or country level search?
John McCormac
jmcc at hackwatch.com
Mon Aug 13 15:25:17 UTC 2007
peter burden wrote:
> This suggests that it is possible, accurately, to determine the country
> of a web-site.
> This isn't possible. Neither DNS nor IP block based geo-coding will
> deliver the information for a variety of reasons. And should a
> national SE ignore global content? If I (in England) am searching for
> a restaurant, I'm clearly not interested in those in Seattle, but if
> I'm searching for information about a computing problem, I really am
> interested in what a certain company in the Seattle area might have
> to say.
Actually, it is possible to group sites by country with a reasonable
level of accuracy. The more mature markets tend to be a lot easier
because the bulk of the country's domains/sites are hosted on that
country's IP space. However, it requires a lot more than simple DNS/IP
analysis to get that level of accuracy above the
Google/Yahoo/Microsoft level.
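
To make the "simple DNS/IP analysis" baseline concrete, here is a rough
sketch in Python. It is not the algorithm I ran. The names
(COUNTRY_BLOCKS, country_of_ip, country_signals) are made up for
illustration, and the address blocks are reserved documentation ranges
standing in for a country's real IP allocations. It only collects the two
obvious signals, the ccTLD and the hosting IP space, which is about where
the simple approach stops.

```python
import ipaddress

# Hypothetical data: reserved documentation ranges used as stand-ins
# for the IP space allocated to each country. Real allocations would
# have to come from registry data.
COUNTRY_BLOCKS = {
    "ie": [ipaddress.ip_network("192.0.2.0/24")],
    "uk": [ipaddress.ip_network("198.51.100.0/24")],
}


def country_of_ip(ip):
    """Return the country code whose blocks contain ip, or None."""
    addr = ipaddress.ip_address(ip)
    for country, blocks in COUNTRY_BLOCKS.items():
        if any(addr in block for block in blocks):
            return country
    return None


def country_signals(domain, ip):
    """Collect the simple DNS/IP signals for one site.

    Returns a list of (signal_name, country_code) pairs. An empty list
    means neither the TLD nor the hosting IP points at a known country.
    """
    signals = []
    tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
    if tld in COUNTRY_BLOCKS:
        signals.append(("cctld", tld))
    hosted_in = country_of_ip(ip)
    if hosted_in is not None:
        signals.append(("hosting", hosted_in))
    return signals
```

A .ie domain hosted on the "ie" block yields both signals, a .com hosted
there yields only the hosting signal, and a .ie domain hosted elsewhere
yields only the ccTLD signal. The last two cases are the ones that need
more than this to classify properly.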
Whether a national SE should ignore global content is, as they used to
say in the "Father Ted" sitcom, an ecumenical question. But it does
strengthen the argument for a clustered, global/country indices approach.
What is more difficult is identifying associated sites outside the
country's IP/DNS space. This is where Google/Yahoo/Microsoft fail when
dealing with TLDs/gTLDs associated with a country. It is also a lot
easier than it appears, though I have only run the algorithm for
identifying Irish sites. It had a relatively high success rate (about
95%), but that is due to the quality of data I have on the Irish domain
space. I could probably apply it to the lower quality UK data (less than
40% .uk coverage) that I have here to see if the algorithm would work as
well.
Regards...jmcc
--
******************************************************
John McCormac * e-mail: jmcc at whoisireland.com
MC2 * voice: +353-51-873640
22 Viewmount * web: http://www.whoisireland.com/
Waterford * blog: http://blog.whoisireland.com
Ireland * Irish Domain Stats & Market Research
******************************************************