[Search-l] wrong questions
Paul Vixie
vixie at isc.org
Sun May 11 22:53:15 UTC 2008
various folks here, as well as sethf in his blogs, have asked plenty of hard
questions about wikia.com's search project -- who owns it, is it truly open,
what is wikia's agenda, and so on. these are the wrong questions, since they
focus on the company that sponsors the work, and not on the work itself. i've
been trying to think of the right questions. here's an approximation of what
critics of this project ought to be investigating.
1. why is ISC's the only backend? jer's vision is backend syndication, so,
if his XML schema is stable and if there's at least one f/l/oss implementation
of crawling and of indexing, then, why aren't there more crawlers and more
indexers, conforming to jer's XML, possibly flooding data between each other
and possibly dividing up the workload so that all crawlers don't have to
crawl all sites? ISC ought to have peers, and we ought to be able to have
gentlemen's agreements, like, "we'll do [a-l].com, you do [m-z].com", etc.
2. why is Wikia's the only frontend? again referring to the syndication
model, and knowing that there are other "social search engines", when will
we see someone other than wikia use ISC's backend, or any other backend whose
data can be reached using jer's XML?
3. who is driving the syndication model? it's clear that ISC knows how to
provide network and power, and that jer knows how to design the system and
build various parts of it, but who is the champion for jer's vision -- who
will drive us to better answers for #1 and #2 above? who ought to be in here
answering critics and beating the drum, which is a distraction to jer (and
candidly he's too busy to do this part well unless he drops other stuff
that's already late)? remembering that jimbo keeps this issue alive in the
press, the overall project still lacks a day to day "programme manager".
4. what else is jer working on? has wikia dedicated him to this project or
does he also handle day to day fire fighting on wikia's existing services to
justify his paycheck? and while we're on that topic, what other personnel
has wikia dedicated to this -- how seriously are they really taking it, in
terms of cash on the barrel head?
5. who else is working on this, outside of wikia? what outside volunteers
or wikia competitors' employees have commit access to the source pool for
the crawler, or indexer, or front end, or have root access to the donated
back end machines hosted by ISC? if the answer is nobody, then is that due
to lack of outreach (see #3 above) or is it wikia's preference that outsiders
contribute content rather than code and sysops? (is that written anywhere?)
6. where are the mini-articles stored? if outside volunteers are mostly
contributing data, is that data stored on wikia's front end? if so, what are
the redistribution terms -- would wikia flood this data to competing front
end operators, and accept incoming floods of similar data from competitors?
or, is this the "secret sauce", such that there's no way to get access to
contributed data of this kind except one article at a time, inside wikia's
advertising system?
7. given that the idea of "taking on google" is silly, given their size and
focus and ambition and brand strength and so on, and that what we can
actually hope to achieve with this project is to change the game and make
search part of the internet infrastructure, where are the white papers,
journal articles, and outreach glossies explaining what the new world of
internet search could look like, and what effect this change will have on
google, microsoft, yahoo, and the current market hierarchy, and the rest of
the "social search" scene?
8. has anybody reached out to yahoo and microsoft to see if they'd like to
join this effort or at least sponsor it, since as #2 and #3 in internet
search today, they're the ones with the most to gain if we change the game?
and if nobody's doing this now, and i did it, what would wikia say about
sharing the sponsorship burden with other players, perhaps larger players?
this list of questions isn't meant to be exhaustive. but as in my own
controversial efforts over the years, i find the quality of criticism here
somewhat low. forget about jimbo and his tv news girlfriend and the instant
messenger chat logs. forget about wikia's corporate interests, or whether
wikipedia was a once in a lifetime event, or whether wikipedia's admins are
running amok, or what the wikimedia foundation board is up to. there are
plenty of excellent questions about the wikia-sponsored search project whose
backend is hosted at ISC, which are not salacious or even delicate. the
above list is meant to show the kind of questions i mean.
for the record, i'd like it if jer did not jump in and answer any of these,
because he's got more important stuff to do, and because i'd like to see
wikia provide another face, another name, another voice, to this forum.
also for the record, ISC's hosting of this project has been a cash neutral
event for us, which is important since we don't have cash for this kind of
thing. the 15-ton air handler wikia bought feeds a room that has other
projects in it too, and our network is a fixed cost, and wikia has agreed
to pay for the power we use for search, and the servers were all donated,
and that donation was targeted for this project, and we got a lot more
servers than we needed, and we've been passing the excess along to other
f/l/oss and internet security projects. so no matter whether this project
changes the world, ISC is already winning.
--
Paul Vixie