[Search-l] wrong questions
Paul Vixie
vixie at isc.org
Thu May 15 15:14:11 UTC 2008
"I love the smell of napalm in the morning...smells like...victory."
jwales at wikia.com (Jimmy Wales) writes:
> Paul Vixie wrote:
> > 1. why is ISC's the only backend? ...
>
> We strongly support this notion of backend syndication, and we are
> hopeful that as the infrastructure and protocols mature, we will get more
> and more people working with us on this.
working with which "us"? is it wikia's remit to recruit new co-backends,
to add flooding features to the backend so that crawlers can bulk-share
their findings with other crawlers? if so, please be as transparent as
possible about it, up to and including writing "open letters" to prospects
which are cc'd to search-l, and doing an RFC-like public review on search-l
of the protocols and mechanisms that will be used at the backend-backend
layer. this request follows from my observation that everybody is excited
about search syndication, but the voices we most need to hear belong to
people who don't want to be "along for the ride" with wikia as sole driver.
(note, i am NOT saying that being a sole driver was wikia's intent, only
that if somebody had other reasons to question this effort, they could use
any lack of transparency as support for their assertion that wikia was the
sole driver and the rest of us were along for the ride, untrue though it is.)
the first step on this path would seem to be an exhaustive list of other
search crawlers/indexers, with a confidence rating of each, such that we'd
all know who the players are and we'd all know how likely it was that each
of them would want to be part of a coordinated universal search syndicate.
(for example, i'd assign confidence level "0%" to google.)
> > 3. who is driving the syndication model? ...
>
> We just transitioned our New York office to fulltime work on the search
> project, and Dan Lewis is being put fulltime on the task of community
> outreach: answering critics, beating the drum, and doing the detailed
> work of working with inbound inquiries from potential partners who are
> already interested, outreach to potential partners who are not yet
> interested, etc.
i hope i get to meet dan, in my ISC role. i hope everybody on search-l gets
to meet dan or at least hear from dan, in our roles as interested supporters
of the syndicated search model that wikia is championing here.
> > 4. what else is jer working on? ...
>
> Jer is fulltime on search, as are several others. Dennis, Seth,
> Jeffrey, David, Aaron, Dan... I feel that I am forgetting someone.
>
> We are prepared to ramp up our commitment as we start to get traction,
> as well. At the present time, every time I ask the team what we need to
> buy, they say "not yet, we are coding". :)
at the backend, i want to suggest that even though we have another thousand
or so donated servers in our warehouse, they are power-inefficient and anti-green,
and it might be good to cap the power utilization at 20kW (where it is now)
and eventually buy more modern 1U's as a way to get more computrons here. we
could throw another 10kW at it during a transition, to avoid downtime, but in
the long run we need all Pentium III chips to move to museums, or landfills,
and starting pretty soon, Pentium 4 chips as well.
> > 5. who else is working on this, outside of wikia? ...
>
> Strong preference that we get lots of people coding on a fully open
> system, as they like it. I think so far we have not done a great job of
> outreach, but then again, we have not had everything in place to get
> people oriented and started.
>
> Also, we view ourselves as a "good neighbor" part of the existing Nutch
> project: Dennis is a Nutch committer who is starting to work on a set of
> ideas he is calling "Nutch 2.0".
domain names are brands, and swlabs.org isn't a meaningful one for an SVN
used by this project. i suggest moving it to svn.search.wikia.com or
svn.search.isc.org or similar, and adding a bug tracker, or installing a
sourceforge instance, or similar. a wiki by itself does not a community
make.
> > 6. where are the mini-articles stored? ...
>
> It's all GFDL, and we make available database dumps. We would have to
> consider a "flood" of incoming data from a community/editorial point of
> view, but totally welcome it, and are totally committed to sharing
> everything extremely liberally.
that's good to hear. it'll be even better to have a PoC. who here can run
a non-wikia.com social search backend, which can consume these database dumps
and allow local editing/authorship of mini-articles, so that the editorial
considerations of this kind of flooding can be learned while the monster is
still in its larval stage? note that if it's free and open, ISC could host
it, and i'd even be willing to throw a couple more P-III's and P4's on the
stove for it, as long as nobody at ISC has to do any sysadmin on them.
> > 7. given that the idea of "taking on google" is silly, ...
>
> I think this is a really great question. :)
>
> One of the things I have been arguing is that we are no threat to google
> even if we are wildly successful at "making search part of the internet
> infrastructure" as you put it...
>
> Google's brand is tied up with search, but Google's business is not
> search, per se, but the matching of advertisements to user actions and
> intentions online. The threat to google is not an open source
> alternative that helps 1,000 small competitors to flourish, but a single
> large proprietary competitor (Powerset?) that captures enough market
> share to take away the advertising marketplace.
>
> 1,000 small competitors are much more likely to simply partner with
> Google for ad revenues, because buyers go where the sellers are, and
> sellers go where the buyers are.
you've gotta write a whitepaper to that effect. not just an e-mail message
and not a wiki article, but an honest to gods high end marketing hit piece
which you roll up and brandish at audiences when you speak about wikia
search. the noncredibility of the early claims about "taking on google" is
the biggest weakness wikia search has got.
> > 8. has anybody reached out to yahoo and microsoft to see if they'd like to
> > join this effort or at least sponsor it, since as #2 and #3 in internet
> > search today, they're the ones with the most to gain if we change the game.
> > and if nobody's doing this now, and i did it, what would wikia say about
> > sharing the sponsorship burden with other players, perhaps larger players?
>
> We have done some of this, and would be eager to support you if you want
> to help us with it. We can talk privately about the status of current
> talks but there is nothing to report and nothing likely to happen right
> away... but there are a lot of interested parties in the industry.
ok, have dan call me. i understand that strategic relationships like this are
covered by NDA during negotiations, and that i need to know what irons are
in which fires before i go poking around. note that more whitepapers about
the design and goals of all this would help me in that kind of outreach.
--
Paul Vixie