[Search-l] Delta

qiip at freesurf.ch
Thu May 10 08:24:16 UTC 2007


Nathan Braun wrote:

> On 5/9/07, qiip at freesurf.ch wrote:
>
>     Some time ago I designed and implemented a prototype system (Delta)
>     which allows users to rate search engine results[...]:
>
>     1) Ratings can be used to generate recommendations for certain
>     queries.
>     2) Ratings can be used to learn a relevance function which takes a
>     (user,query,document)-tuple as input and returns a relevance value.
>     3) Ratings can be used to cluster users into groups, a prerequisite
>     for group-specific (personalized) search.
>
>
> Would this function somewhat like a "Digg" of search (with up/down 
> ratings?)?  If so, I believe this would be a phenomenally effective 
> approach.

Yes, similar. Put simply, if a (query,document)-pair is frequently rated
highly, then a search for that query will return the document.
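
A minimal sketch of this idea, not Delta's actual code: it assumes
ratings are stored as (user, query, document, value) tuples with value
+1 or -1, and the names recommend, min_votes and threshold are
hypothetical.

```python
from collections import defaultdict


def recommend(ratings, query, min_votes=2, threshold=0.5):
    """Return documents rated highly and often for `query`.

    `ratings` is an iterable of (user, query, document, value) tuples,
    where value is +1 (relevant) or -1 (not relevant). This layout is
    an assumption made for the sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # document -> [sum of values, vote count]
    for _user, rated_query, document, value in ratings:
        if rated_query == query:
            totals[document][0] += value
            totals[document][1] += 1

    scored = []
    for document, (value_sum, votes) in totals.items():
        mean = value_sum / votes
        # "Often" is modelled by min_votes, "highly" by threshold.
        if votes >= min_votes and mean >= threshold:
            scored.append((mean, document))

    # Best mean rating first; ties broken alphabetically for stable output.
    return [doc for _mean, doc in sorted(scored, key=lambda t: (-t[0], t[1]))]
```

With two positive ratings for one document and a single rating for
another, only the first document would be recommended under these
default settings.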

>
> If so, this could also effectively address Fred Bauder's concerns:
>
> -----Fred Bauder [mailto:fredbaud at waterwiki.info] wrote-----
> "Let's think a little bit about how users fit into the scheme. I use 
> google quite a bit; I'm not going to use a start-up since I'm there to 
> get information, not fool around....

Exactly. The user can continue using his/her favorite search engine.

>
> (ie, this could also work similar to Del.icio.us <http://Del.icio.us>)
>
> ... accessible to "everyone" (not just registered users like on Digg??).

Indeed!

> Such a voting capability, however, would necessarily need to be a) 
> personalized (so that your voting only directly applied to you and 
> influenced your own search results) and/or b) restricted access (to 
> prevent abuse).

Here is where user clustering comes in. Similar users are clustered into 
groups. Ratings from the same or related clusters are preferred. Again, 
this is just the general idea. There are quite a few subtleties and 
technical challenges involved...
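
One way to picture "preferring" ratings from the same cluster is a
weighted average. This is only a sketch under assumptions: the cluster
assignment is given as a dict from user to cluster id, and
cluster_weighted_score and other_weight are hypothetical names, not
part of Delta.

```python
def cluster_weighted_score(ratings, clusters, user, query, document,
                           other_weight=0.1):
    """Score a (query, document) pair from the point of view of `user`.

    Ratings by users in the same cluster as `user` count fully; all
    other ratings are down-weighted by `other_weight` (an assumed
    tuning parameter). Returns 0.0 if nobody rated the pair.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for rater, rated_query, rated_doc, value in ratings:
        if rated_query != query or rated_doc != document:
            continue
        same_cluster = rater in clusters and clusters[rater] == clusters.get(user)
        weight = 1.0 if same_cluster else other_weight
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0
```

A hard cut-off (other_weight=0) would ignore other clusters entirely; a
small positive weight keeps some signal when a user's own cluster has
few ratings.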

>
> Spam-type results could effectively be removed from the system by a 
> community approach (similar to Wikipedia) where only "recognized and 
> accredited contributors" contribute to the overall group's total 
> datapool, in terms of ratings...

Actually, clustering might take care of spam to a certain degree. Assume
that clustering happens automatically: since the ratings of some user U
and those of a spammer strongly disagree, U and the spammer will be
placed in distant clusters. Hence, the spammer's ratings will not affect
the results shown to U.
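
To illustrate the effect, here is a toy sketch in which users are
grouped by how much their ratings agree. The greedy grouping rule and
the names agreement, cluster_users and min_agreement are assumptions
chosen for brevity; a real system would likely use a more robust
clustering method.

```python
def agreement(a, b):
    """Mean product of two users' ratings on the pairs both have rated.

    `a` and `b` map (query, document) -> +1 or -1. The result is +1.0
    for full agreement, -1.0 for full disagreement, 0.0 if no overlap.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[key] * b[key] for key in shared) / len(shared)


def cluster_users(profiles, min_agreement=0.5):
    """Greedily group users who agree with every member of a group."""
    clusters = []  # list of lists of user names
    for user in sorted(profiles):
        for group in clusters:
            if all(agreement(profiles[user], profiles[member]) >= min_agreement
                   for member in group):
                group.append(user)
                break
        else:
            # No existing group fits, so the user starts a new one.
            clusters.append([user])
    return clusters
```

A spammer who rates spam up and good results down disagrees with
ordinary users on every shared pair and therefore ends up in a group of
their own.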

Besides the (non-trivial!) technical challenges involved, two important
questions are:
1) Are there enough users willing to invest 1-3 seconds in rating the
results of their last query?
2) Is the concept of rating (query,document)-pairs easy to understand
and adopt?


Regards,

Joel


