[Search-l] [canonizers] Fwd: Re: NPOV for Search?

Bryan Bishop kanzure at gmail.com
Wed Jan 9 23:22:13 UTC 2008


On Wednesday 09 January 2008, Jimmy Wales wrote:
> Yes, this is my intention, but perhaps I should be specific, so that
> we can check if we are more or less on the same page here.
<snip>

Transparency of editorial decisions. Agree. 

> Leaving aside for a moment personal customization (a concept I think
> we should partly leave for the future and which I think has limited
> usefulness across a broad spectrum of searches anyway), when we type
> "Barack Obama" with no qualifiers, or "Thai food" with no qualifiers,
> I think the community will generally agree (within some very broad
> parameters) about what kinds of sites should show up.

(Ideally, in the MPOV search model, the users would provide their 
context so that they do not have to provide the qualifiers, and I bet 
some people can figure out a way to build up qualifiers behind the 
scenes (editable, of course) to track a user's search progress if they 
want to see what fruitful results might turn up.)
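
To make the "qualifiers behind the scenes" idea concrete, here is a minimal sketch in Python. It assumes a per-user context object that stores qualifiers, lets the user edit them, and appends them to a bare query. The names `SearchContext`, `add`, `remove`, and `expand` are hypothetical and not part of any existing search codebase.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchContext:
    """Hypothetical per-user context: qualifiers the user can inspect and edit."""

    qualifiers: List[str] = field(default_factory=list)

    def add(self, qualifier: str) -> None:
        # Record a qualifier once, preserving the order in which it was learned.
        if qualifier not in self.qualifiers:
            self.qualifiers.append(qualifier)

    def remove(self, qualifier: str) -> None:
        # "Editable, of course": the user can drop any qualifier they dislike.
        if qualifier in self.qualifiers:
            self.qualifiers.remove(qualifier)

    def expand(self, query: str) -> str:
        # Append the stored qualifiers to the bare query the user typed.
        return " ".join([query, *self.qualifiers])


ctx = SearchContext()
ctx.add("restaurants")
ctx.add("Austin")
expanded = ctx.expand("Thai food")  # "Thai food restaurants Austin"
```

Keeping the qualifiers as a plain, visible list is a deliberate choice in this sketch: the user can always see and undo what the system has inferred about them.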

> There is a notion of "quality search result" which can be
> articulated by most users, and there can be a general consensus
> about what they look like, even as we might quibble endlessly over
> the details.

Oh, yes, definitely. But that's the old way, and a new way can coexist. 
One where we are allowed to define our own qualities.

> This is very similar, by the way, to neutrality as it plays out in
> Wikipedia.  Almost everyone agrees (and I would go further and say
> that everyone _serious about the question_ agrees) that an
> encyclopedia article about Barack Obama should not say either of
> "Obama sucks!!11!" or "Ombada rulezzzy!"  We may not agree about
> Obama, and we may indeed have some difficult and serious questions
> about how the article should read, but we do nonetheless have a broad
> middle ground where we can function effectively because we have a
> shared understanding of what a quality encyclopedia article should
> look like.

Jimmy, I do not know if you have been reading the other emails on the 
list, but some others have been talking about the way that we're 
talking about 'bias' and 'neutrality' here. We're not talking about 
editorial bias like on Wikipedia, which is a very serious issue and I 
agree there. But we're talking about automated *selection* for just one 
person. One person's experience. We want to give them the best 
searching experience they'll ever have, with the most relevant results 
tailored to *them* and not to the average and not to a general 
consensus. Granted, this might be called discrimination or stereotyping 
of some users ... but for some it might very well work better than no 
tailoring at all.

> We may not agree on every detail of what should be contained in a
> search result, of course.  But I think most people would agree that
> pages trying to sell subscriptions to a gambling website, pages which
> contain the single word "Obama" repeated 10,000 times along with
> pornographic pictures linking to a site selling subscriptions to a
> porn site, pages which are incoherent, pages which used to be a real
> page but which now are just link farms... these are bad search
> results and should not be there.

Yeah, I think that's squarely under the definition of spam.

> So I think it is perfectly consistent to say "leave the moral compass
> up to the users" while also saying that we want to return good web
> pages and not return bad ones.

Yep.

> > Anyway, some of the guys in #wikiasearch on freenode have pointed
> > out that this is too philosophical and getting into some 'heated
> > debate'. I want to say that this is not my intention. I started
> > this thread for the express purpose of adding value to search
> > engines. I think that this discussion can result in fruitful code.
> > (But, equally, in good intentions, I'll readily stop if our
> > benevolent leader asks me so.)
>
> I think this discussion has been absolutely wonderful and has been
> conducted in the spirit of rational inquiry.  I think it is of course
> true that being philosophical forever doesn't result in any useful
> code being written, but on the other hand, I think a shared

Maybe I'll start drafting up an API for the type of module-compatible 
system that I am thinking of. I don't know if you do any programming, 
but it sounds like I'll have to take a look at Nutch? (And then the web 
interface can come after.)

> One of the great things a discussion like this can do is to help us
> avoid treating the results of some algorithm as being God-like and
> unquestionable.

Many people, many gods. And hopefully, one day, many algorithms. ;)

> This story may be apocryphal, but I have been told it by someone who
> worked with the Altavista team, back when Google was just starting to
> stomp them into the ground.  Apparently, the Altavista team had an
> ideological bias against using link text to generate keywords for a
> web page, in any way, shape or form.  They had a view of what a
> search algorithm was supposed to look like, and stuck to it come hell
> or high water.

That's a horror story if I've ever heard one.

> It would be easy for us to fall into the same trap.  "Sort and rank
> the results based on what the user feedback says about the underlying
> websites, and if things look weird at the end, that's the fault of
> the users or of the person who thinks it looks weird, because that's
> the Judgment Of The Community(tm)."

Ideally, the MPOV modules would be selectable by any of the users, so if 
they think the community has gotten its algorithms wrong, they can come 
up with a new variation and create a better searching experience.
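
As a rough illustration of user-selectable modules, the sketch below assumes a ranking module is just a function from a query and a list of candidate URLs to a reordered list, registered under a name the user can pick. Everything here (`MODULES`, `register`, `search`, and the two example modules) is hypothetical; it is not Nutch's API or any existing interface.

```python
from typing import Callable, Dict, List

# Assumption: a ranking module maps (query, candidate URLs) to reordered URLs.
RankingModule = Callable[[str, List[str]], List[str]]

# Registry of available modules, keyed by the name a user would select.
MODULES: Dict[str, RankingModule] = {}


def register(name: str) -> Callable[[RankingModule], RankingModule]:
    """Decorator that adds a ranking function to the registry under `name`."""

    def decorator(fn: RankingModule) -> RankingModule:
        MODULES[name] = fn
        return fn

    return decorator


@register("community")
def community_rank(query: str, results: List[str]) -> List[str]:
    # Stand-in for a community-consensus ranking: keep the given order.
    return list(results)


@register("shortest-url-first")
def shortest_url_first(query: str, results: List[str]) -> List[str]:
    # A deliberately simple user-written variation: prefer shorter URLs.
    return sorted(results, key=len)


def search(query: str, results: List[str], module: str = "community") -> List[str]:
    # Rank the candidates with whichever module the user selected.
    return MODULES[module](query, results)


candidates = ["http://a.example/long/path", "http://b.example"]
default_order = search("Thai food", candidates)
custom_order = search("Thai food", candidates, module="shortest-url-first")
```

A user who disagrees with the default ranking would register their own function and select it by name, without changing the community module for anyone else.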

> Well, I think we rather need some broad shared understanding of what
> high quality search results look like, so that we can evaluate the
> algorithm that takes community input -> long tail search results and
> see if it seems sane... and probably we will find that it is sane in
> some areas and not others, etc.  And we will have to revise, revise,
> revise.

You are aiming for one large, over-ruling community? It doesn't have to 
be that way.

- Bryan
________________________________________
Bryan Bishop
http://heybryan.org/


