[Search wiki] Scale of Search wiki

Rainer Blome rainer.blome at gmx.de
Thu Jan 24 19:59:46 UTC 2008


Jimmy Wales <jwales at wikia dot com> wrote:
> writing mini articles is only ever going to cover the "fat head" not 
> the "long tail" of search, for sure.

Surely the set of queries can never be covered entirely by human editing:
it is too big and changes too fast.
But it is possible, and can be beneficial, to cover some of the tail.
Wikia Search is already doing it.  Here's why:

It depends on how you define "cover", "head" and "long tail".
Let us define "cover a query" here as "there is edited content for the query" (at Wikia Search that would be a mini-article).

In the "tail" picture, the position of a query on the x-axis is determined
by the frequency of the query (queries are ordered from most to least
frequent), right?  Let me define two more notions:
"Dense" coverage means that all or most queries of a given set are
covered.  "Sparse" coverage means that only some queries of that set are
covered.
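
To make the two notions concrete, here is a minimal Python sketch.  The
function names, the toy query log and the 0.9 cut-off for "all or most"
are assumptions made for illustration only; nothing here describes how
Wikia Search actually measures coverage.

```python
from collections import Counter

def coverage_ratio(query_log, covered, min_frequency=1):
    """Fraction of distinct queries seen at least `min_frequency` times
    that have edited content (a mini-article)."""
    counts = Counter(query_log)
    candidates = [q for q, n in counts.items() if n >= min_frequency]
    if not candidates:
        return 0.0
    return sum(1 for q in candidates if q in covered) / len(candidates)

def is_dense(query_log, covered, min_frequency=1, threshold=0.9):
    """Coverage is called 'dense' when at least `threshold` of the
    queries in the set are covered (0.9 is an arbitrary assumption)."""
    return coverage_ratio(query_log, covered, min_frequency) >= threshold

# Hypothetical toy data: "a" is frequent, "c" is a one-off query.
log = ["a", "a", "a", "b", "b", "c"]
covered = {"a", "c"}
```

With this toy log, coverage is dense down to a frequency of 3 (only "a"
qualifies, and it is covered) but merely sparse over the whole set, even
though the rare query "c" is covered.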

Some search services expressly aim for dense coverage, down to a certain
frequency.  However, coverage does not have to be dense to make a search
service useful; covering queries sparsely can be beneficial too.  At Wikia
Search, coverage does not need to be dense: it is determined by the users
themselves.

As an example, the first user searches for something, doesn't find an
answer at first, then finds one and writes a mini-article about it.
Then a second user comes along and finds the mini-article straight away.
The frequency of the query is "two" at this point, which places it
almost at the end of the "long tail".  Yet it was covered all the same.

This can have two effects:
  First, new queries can be covered *very* quickly (starting from the
second occurrence of the query).  Methods based on query frequency do not
care about new queries until they have seen a significant number of
them.
  Second, rare queries *can* be covered (not all of them, of course,
but at least those that can be answered).  This is one of the features of
Wikia Search that I love.
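
The two effects can be illustrated with a small simulation.  This is only
a sketch under strong assumptions: the function name and the query stream
are hypothetical, every first searcher is assumed to write a mini-article
(in reality only some will), and the frequency-based method is modelled as
"write an article once the query has been seen min_frequency times".

```python
from collections import Counter

def first_covered_occurrence(stream, min_frequency):
    """For each query, return the occurrence number at which it is first
    answered from edited content, under two policies:
    user-driven (first searcher writes the article) and
    frequency-driven (article written once the count reaches min_frequency).
    Queries that are never answered are absent from the result."""
    seen = Counter()
    user_articles, freq_articles = set(), set()
    user_hit, freq_hit = {}, {}
    for q in stream:
        seen[q] += 1
        # User-driven: a miss leads the searcher to write a mini-article
        # (assumption: this always happens).
        if q in user_articles:
            user_hit.setdefault(q, seen[q])
        else:
            user_articles.add(q)
        # Frequency-driven: nothing is written until the query is popular.
        if q in freq_articles:
            freq_hit.setdefault(q, seen[q])
        elif seen[q] >= min_frequency:
            freq_articles.add(q)
    return user_hit, freq_hit

stream = ["rare", "rare", "hot", "hot", "hot", "hot"]
user_hit, freq_hit = first_covered_occurrence(stream, min_frequency=3)
```

In this toy run the user-driven policy answers both queries at their second
occurrence, while the frequency-driven policy answers "hot" only at its
fourth occurrence and never covers "rare".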

As an aside, both properties contribute to NPOV, because they allow
coverage of queries that are not mainstream yet or never will be.

> Even the first million most 
> popular search terms, if we become as successful a wiki as major 
> language wikipedias, would not cover the long tail.

A million human-edited articles is, I assume, much better than just 10 000. 
And they would sparsely cover some of the "long tail".

By the way, Wikia Search delivered excellent results when I searched for
"long tail", among them
[http://www.longtail.com/the_long_tail/2008/01/the-fat-tail-wi.html
 The fat tail will be human, the medium tail social, the long tail
 algorithmic].  Currently, "social" is effectively the same as "human", but
I assume that "human" was supposed to mean "editorial".  In the analogy
used there, Wikia Search's mini-articles would cover the "medium tail" (or
"fat middle", as it should have been).  

Some day, "search query" wikis such as the Wikia Search one will densely cover the fat middle.  To me, the question is not if, but when, how many, and which ones.  The race is on.

Rainer


