[Search-l] Wikia - Global focus or country level search?

John McCormac jmcc at hackwatch.com
Thu Aug 16 12:24:39 UTC 2007


jer wrote:
>>> There isn't "a" plan at all, there's lots of ideas and projects,
>>> and many more to come.
>>
>>
>> So there isn't "a" plan then. Is there even a strategy?
> 
> 
> Transparency, Community, Quality, and Privacy. The strategy is to
> support development of any search technology or resource according to
> those four principles.

Finally some answers. :)

>> No. I was building a search index and cleaning out rubbish from it -
>> real search engine work.
> 
> 
> Real closed search engine work.

No. It is all highly automated at this stage using software tools that 
I've developed over seven years or so of search engine work. The process 
has to take about 2M websites and produce a viable index. It also has to 
classify each website and build a metadata model for each site (Peter 
Burden's idea is similar, but this includes some more elements so that 
the detection of clones / junk / coming soon / ppc / geolocation / 
framesrc / metadata / similarities etc. is efficient for a country level 
search engine).
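
To make the classification step concrete, here is a minimal sketch in
Python. It is not the actual tooling described above. The phrase lists,
category names, SiteRecord fields and the exact-match fingerprint are
all assumptions chosen for illustration, and a real system would need
fuzzier similarity detection than a plain hash.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical marker phrases; a real list would be tuned per country.
PLACEHOLDER_PHRASES = ("coming soon", "under construction")
PPC_PHRASES = ("sponsored listings", "related searches")

FRAME_SRC = re.compile(r"""<i?frame[^>]+src=["']([^"']+)""", re.I)


@dataclass
class SiteRecord:
    """Illustrative per-site metadata record (field names are assumed)."""
    domain: str
    category: str
    fingerprint: str
    frame_srcs: list = field(default_factory=list)


def fingerprint(html):
    """Hash the visible text, ignoring tags, case and whitespace runs."""
    text = re.sub(r"<[^>]+>", " ", html)
    text = " ".join(text.lower().split())
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def classify(domain, html, seen):
    """Classify one home page; `seen` maps fingerprint -> first domain."""
    fp = fingerprint(html)
    lowered = html.lower()
    if fp in seen:
        # Exact duplicate of an earlier site's text: treat as a clone.
        category = "clone"
    elif any(p in lowered for p in PLACEHOLDER_PHRASES):
        category = "coming soon"
    elif any(p in lowered for p in PPC_PHRASES):
        category = "ppc"
    else:
        category = "content"
    seen.setdefault(fp, domain)
    return SiteRecord(domain, category, fp, FRAME_SRC.findall(html))
```

The clone check runs first, so a second site carrying the same holding
page is recorded as a clone of the first rather than as another
"coming soon" page.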

>> A bit of dot.bomb deja vu. Search is a business. Those who don't
>> approach it as such end up getting devoured by it. That's why Google,
>> Yahoo and Microsoft dominate the market. They don't blindly hope that
>> things will happen. They make them happen.
> 
> 
> A quote I just saw seems appropriate:
> 
> "Hope is definitely not the same thing as optimism. It is not the  
> conviction that something will turn out well, but the certainty that  
> something makes sense, regardless of how it turns out." - Vaclav Havel.

To those who have faced Google/Yahoo/Microsoft in the search market, 
the quote that "the basis for optimism is sheer terror" could be more 
apt, though you might reply with "action is the last refuge of those 
who cannot dream". But beyond all that wild literary abandon, the 
reality is that work has to be done.

> Collective knowledge? This is about *creating* knowledge, new ways of
> organizing and building a search engine openly. If you're talking
> about collecting content and indexing, we're just not there yet, but
> getting closer and closer.

Perhaps "search expertise" would have been a better phrase. What I was 
asking was whether there is any knowledge within Wikia that people 
could draw on to solve problems.

> Now that some of the kinks are worked out of grub.org we can finally
> get the source repos online and begin expanding the vision
> significantly. The resulting crawls/updates will be available for any
> OS developer to start experimenting with, at least lowering the bar a
> little bit, it's a step and an experiment.

When will the first data be available? And is there any idea as to the 
size of the dataset?

Regards...jmcc
-- 
******************************************************
John McCormac  *  e-mail: jmcc at whoisireland.com
MC2            *  voice:  +353-51-873640
22 Viewmount   *  web:  http://www.whoisireland.com/
Waterford      *  blog: http://blog.whoisireland.com
Ireland        *  Irish Domain Stats & Market Research
******************************************************


