[Search-l] Wikia - Global focus or country level search?
John McCormac
jmcc at hackwatch.com
Thu Aug 16 12:24:39 UTC 2007
jer wrote:
>>> There isn't "a" plan at all, there's lots of ideas and projects,
>>> and many more to come.
>>
>>
>> So there isn't "a" plan then. Is there even a strategy?
>
>
> Transparency, Community, Quality, and Privacy. The strategy is to
> support development of any search technology or resource according to
> those four principles.
Finally some answers. :)
>> No. I was building a search index and cleaning out rubbish from it -
>> real search engine work.
>
>
> Real closed search engine work.
No. It is all highly automated at this stage using software tools that
I've developed over seven years or so of search engine work. The process
has to take about 2M websites and produce a viable index. It also has to
classify each website and build a metadata model for each site (Peter
Burden's idea is similar but this includes some more elements so that
the detection of clones / junk / coming soon / ppc / geolocation /
framesrc / metadata / similarities etc. is efficient for a country level
search engine).
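
To make the idea of a per-site metadata model a little more concrete, here is a minimal sketch in Python. It is not the tooling described above. The record fields, phrase lists, label names and the exact-hash clone check are all assumptions chosen for illustration, and a real pipeline would need far more robust signals (near-duplicate detection, geolocation, and so on).

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical phrase lists, for illustration only.
COMING_SOON_PHRASES = ("coming soon", "under construction")
PPC_PHRASES = ("sponsored listings", "this domain is for sale")


@dataclass
class SiteRecord:
    """Hypothetical per-site metadata record."""
    domain: str
    content_hash: str = ""
    frame_src: list = field(default_factory=list)
    labels: set = field(default_factory=set)


def normalise(html):
    """Strip tags, lowercase, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)
    return " ".join(text.lower().split())


def classify(domain, html, seen_hashes):
    """Build a record for one site and label it.

    seen_hashes maps a content hash to the first domain seen with it,
    so a later site with identical normalised text is labelled a clone.
    """
    text = normalise(html)
    rec = SiteRecord(domain)
    rec.content_hash = hashlib.sha1(text.encode("utf-8")).hexdigest()
    # Collect frame/iframe sources from the raw markup.
    rec.frame_src = re.findall(
        r'<i?frame[^>]*\bsrc=["\']([^"\']+)', html, flags=re.I
    )
    if any(p in text for p in COMING_SOON_PHRASES):
        rec.labels.add("coming_soon")
    if any(p in text for p in PPC_PHRASES):
        rec.labels.add("ppc")
    if rec.content_hash in seen_hashes:
        rec.labels.add("clone")
    else:
        seen_hashes[rec.content_hash] = domain
    if not rec.labels:
        rec.labels.add("candidate")  # nothing flagged; keep for the index
    return rec


seen = {}
first = classify("a.example", "<body>Welcome to our shop</body>", seen)
second = classify("b.example", "<body>Welcome  to our SHOP</body>", seen)
parked = classify("c.example", "<p>Site coming soon</p>", seen)
framed = classify(
    "d.example", '<frameset><frame src="http://x.example/"></frameset>', seen
)
```

An exact hash only catches identical pages after normalisation; similarity detection across near-identical sites would need something like shingling, which this sketch leaves out.
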
>> A bit of dot.bomb déjà vu. Search is a business. Those who don't
>> approach it as such end up getting devoured by it. That's why Google,
>> Yahoo and Microsoft dominate the market. They don't blindly hope that
>> things will happen. They make them happen.
>
>
> A quote I just saw seems appropriate:
>
> "Hope is definitely not the same thing as optimism. It is not the
> conviction that something will turn out well, but the certainty that
> something makes sense, regardless of how it turns out." - Vaclav Havel.
To those who have faced Google/Yahoo/Microsoft in the search market,
though, the quote that "the basis for optimism is sheer terror" could be
apt. You might reply with "action is the last refuge of those who
cannot dream". But beyond all that wild literary abandon, the reality is
that work has to be done.
> Collective knowledge? This is about *creating* knowledge, new ways of
> organizing and building a search engine openly. If you're talking
> about collecting content and indexing, we're just not there yet, but
> getting closer and closer.
Perhaps "search expertise" would have been a better phrase. What I was
asking was whether there is any knowledge in Wikia on which people
could draw to solve problems.
> Now that some of the kinks are worked out of grub.org we can finally
> get the source repos online and begin expanding the vision
> significantly. The resulting crawls/updates will be available for any
> OS developer to start experimenting with, at least lowering the bar a
> little bit, it's a step and an experiment.
When will the first data be available? And is there any idea as to the
size of the dataset?
Regards...jmcc
--
******************************************************
John McCormac * e-mail: jmcc at whoisireland.com
MC2 * voice: +353-51-873640
22 Viewmount * web: http://www.whoisireland.com/
Waterford * blog: http://blog.whoisireland.com
Ireland * Irish Domain Stats & Market Research
******************************************************