[Search-l] [canonizers] Fwd: Re: NPOV for Search?

Jimmy Wales jwales at wikia.com
Wed Jan 9 16:43:04 UTC 2008


Bryan Bishop wrote:
> What about a balanced selection of 'bad' web pages? What about a search 
> that can leave the moral compass (that's what good/bad is, after all) 
> up to its users?

Yes, this is my intention, but perhaps I should be specific, so that we 
can check if we are more or less on the same page here.

My goal is to take every point in the search engine process where there 
is an editorial decision, and push that decision out of the company and 
into the community.  That's a goal, but of course it will take time to 
achieve it; indeed, it will take time even to determine how to do it in 
many cases.

Leaving aside for a moment personal customization (a concept I think we 
should partly leave for the future and which I think has limited 
usefulness across a broad spectrum of searches anyway), when we type 
"Barack Obama" with no qualifiers, or "Thai food" with no qualifiers, I 
think the community will generally agree (within some very broad 
parameters) about what kinds of sites should show up.

There is a notion of "quality search result" which can be articulated 
by most users, and there can be a general consensus about what such 
results look like, even as we might quibble endlessly over the details.

This is very similar, by the way, to neutrality as it plays out in 
Wikipedia.  Almost everyone agrees (and I would go further and say that 
everyone _serious about the question_ agrees) that an encyclopedia 
article about Barack Obama should not say either of "Obama sucks!!11!" 
or "Obama rulezzzy!"  We may not agree about Obama, and we may indeed 
have some difficult and serious questions about how the article should 
read, but we do nonetheless have a broad middle ground where we can 
function effectively because we have a shared understanding of what a 
quality encyclopedia article should look like.

We may not agree on every detail of what should be contained in a search 
result, of course.  But I think most people would agree that pages 
trying to sell subscriptions to a gambling website, pages which contain 
the single word "Obama" repeated 10,000 times along with pornographic 
pictures linking to a site selling subscriptions to a porn site, pages 
which are incoherent, pages which used to be a real page but which now 
are just link farms... these are bad search results and should not be there.

So I think it is perfectly consistent to say "leave the moral compass up 
to the users" while also saying that we want to return good web pages 
and not return bad ones.

> Anyway, some of the guys in #wikiasearch on freenode have pointed out 
> that this is too philosophical and getting into some 'heated debate'. I 
> want to say that this is not my intention. I started this thread for 
> the expressed purpose of adding value to search engines. I think that 
> this discussion can result in fruitful code. (But, equally, in good 
> intentions, I'll readily stop if our benevolent leader asks me so.)

I think this discussion has been absolutely wonderful and has been 
conducted in the spirit of rational inquiry.  I think it is of course 
true that being philosophical forever doesn't result in any useful code 
being written, but on the other hand, I think a shared understanding of 
the sorts of things we are looking to achieve, and some reflection on 
what our editorial goals for the end product are, can be quite useful.

One of the great things a discussion like this can do is to help us 
avoid treating the results of some algorithm as being God-like and 
unquestionable.

This story may be apocryphal, but I have been told it by someone who 
worked with the Altavista team, back when Google was just starting to 
stomp them into the ground.  Apparently, the Altavista team had an 
ideological bias against using link text to generate keywords for a web 
page, in any way, shape or form.  They had a view of what a search 
algorithm was supposed to look like, and stuck to it come hell or high 
water.

It would be easy for us to fall into the same trap.  "Sort and rank the 
results based on what the user feedback says about the underlying 
websites, and if things look weird at the end, that's the fault of the 
users or of the person who thinks it looks weird, because that's the 
Judgment Of The Community(tm)."

Well, I think we rather need some broad shared understanding of what 
high quality search results look like, so that we can evaluate the 
algorithm that takes community input -> long tail search results and see 
if it seems sane... and probably we will find that it is sane in some 
areas and not others, etc.  And we will have to revise, revise, revise.

--Jimbo



