[Search-l] wrong questions
Mark (Markie)
newsmarkie at googlemail.com
Thu May 15 14:29:11 UTC 2008
:-D thanks
mark
On Thu, May 15, 2008 at 3:25 PM, Jimmy Wales <jwales at wikia.com> wrote:
> Paul Vixie wrote:
> > 1. why is ISC's the only backend? jer's vision is backend syndication,
> > so, if his XML schema is stable and if there's at least one f/l/oss
> > implementation of crawling and of indexing, then, why aren't there more
> > crawlers and more indexers, conforming to jer's XML, possibly flooding
> > data between each other and possibly dividing up the workload so that
> > all crawlers don't have to crawl all sites? ISC ought to have peers, and
> > we ought to be able to have gentlemen's agreements, like, "we'll do
> > [a-l].com, you do [m-z].com", etc.
>
> We strongly support this notion of backend syndication, and we are
> hopeful that as the infrastructure and protocols mature, we will get
> more and more people working with us on this.
>
> > 2. why is Wikia's the only frontend? again referring to the syndication
> > model, and knowing that there are other "social search engines", when
> > will we see someone other than wikia use ISC's backend, or any other
> > backend whose data can be reached using jer's XML?
>
> Hopefully people can start on this soon... as has already been pointed
> out, a great start might be to use our code... sounds like Jer is
> deciding on the license now.
>
> > 3. who is driving the syndication model? it's clear that ISC knows how
> > to provide network and power, and that jer knows how to design the
> > system and build various parts of it, but who is the champion for jer's
> > vision -- who will drive us to better answers for #1 and #2 above? who
> > ought to be in here answering critics and beating the drum, which is a
> > distraction to jer (and candidly he's too busy to do this part well
> > unless he drops other stuff that's already late)? remembering that jimbo
> > keeps this issue alive in the press, the overall project still lacks a
> > day to day "programme manager".
>
> We just transitioned our New York office to fulltime work on the search
> project, and Dan Lewis is being put fulltime on the task of community
> outreach: answering critics, beating the drum, and doing the detailed
> work of working with inbound inquiries from potential partners who are
> already interested, outreach to potential partners who are not yet
> interested, etc.
>
> > 4. what else is jer working on? has wikia dedicated him to this project
> > or does he also handle day to day fire fighting on wikia's existing
> > services to justify his paycheck? and while we're on that topic, what
> > other personnel has wikia dedicated to this -- how seriously are they
> > really taking it, in terms of cash on the barrel head?
>
> Jer is fulltime on search, as are several others. Dennis, Seth,
> Jeffrey, David, Aaron, Dan... I feel that I am forgetting someone.
>
> We are prepared to ramp up our commitment as we start to get traction,
> as well. At the present time, every time I ask the team what we need to
> buy, they say "not yet, we are coding". :)
>
> > 5. who else is working on this, outside of wikia? what outside
> > volunteers or wikia competitor's employees have commit access to the
> > source pool for the crawler, or indexer, or front end, or have root
> > access to the donated back end machines hosted by ISC? if the answer is
> > nobody, then is that due to lack of outreach (see #3 above) or is it
> > wikia's preference that outsiders contribute content rather than code
> > and sysops? (is that written anywhere?)
>
> Strong preference that we get lots of people coding on a fully open
> system, as they like it. I think so far we have not done a great job of
> outreach, but then again, we have not had everything in place to get
> people oriented and started.
>
> Also, we view ourselves as a "good neighbor" part of the existing Nutch
> project: Dennis is a Nutch committer who is starting to work on a set of
> ideas he is calling "Nutch 2.0".
>
> > 6. where are the mini-articles stored? if outside volunteers are mostly
> > contributing data, is that data stored on wikia's front end? if so, what
> > are the redistribution terms -- would wikia flood this data to competing
> > front end operators, and accept incoming floods of similar data from
> > competitors? or, is this the "secret sauce", there's no way to get
> > access to contributed data of this kind except one article at a time,
> > inside wikia's advertising system?
>
> It's all GFDL, and we make available database dumps. We would have to
> consider a "flood" of incoming data from a community/editorial point of
> view, but totally welcome it, and are totally committed to sharing
> everything extremely liberally.
>
> > 7. given that the idea of "taking on google" is silly, given their size
> > and focus and ambition and brand strength and so on, and that what we
> > can actually hope to achieve with this project is to change the game and
> > make search part of the internet infrastructure, where are the white
> > papers, journal articles, and outreach glossies explaining what the new
> > world of internet search could look like, and what effect this change
> > will have on google, microsoft, yahoo, and the current market hierarchy,
> > and the rest of the "social search" scene?
>
> I think this is a really great question. :)
>
> One of the things I have been arguing is that we are no threat to google
> even if we are wildly successful at "making search part of the internet
> infrastructure" as you put it...
>
> Google's brand is tied up with search, but Google's business is not
> search, per se, but the matching of advertisements to user actions and
> intentions online. The threat to google is not an open source
> alternative that helps 1,000 small competitors to flourish, but a single
> large proprietary competitor (Powerset?) that captures enough market
> share to take away the advertising marketplace.
>
> 1,000 small competitors are much more likely to simply partner with
> Google for ad revenues, because buyers go where the sellers are, and
> sellers go where the buyers are.
>
> > 8. has anybody reached out to yahoo and microsoft to see if they'd like
> > to join this effort or at least sponsor it, since as #2 and #3 in
> > internet search today, they're the ones with the most to gain if we
> > change the game. and if nobody's doing this now, and i did it, what
> > would wikia say about sharing the sponsorship burden with other players,
> > perhaps larger players?
>
> We have done some of this, and would be eager to support you if you want
> to help us with it. We can talk privately about the status of current
> talks but there is nothing to report and nothing likely to happen right
> away... but there are a lot of interested parties in the industry.
>
> > this list of questions isn't meant to be exhaustive. but as in my own
> > controversial efforts over the years, i find the quality of criticism
> > here somewhat low.
>
> :-) Quality criticism is extremely valuable.
>
> > also for the record, ISC's hosting of this project has been a cash
> > neutral event for us, which is important since we don't have cash for
> > this kind of thing. the 15-ton air handler wikia bought feeds a room
> > that has other projects in it too, and our network is a fixed cost, and
> > wikia has agreed to pay for the power we use for search, and the servers
> > were all donated, and that donation was targeted for this project, and
> > we got a lot more servers than we needed, and we've been passing the
> > excess along to other f/l/oss and internet security projects. so no
> > matter whether this project changes the world, ISC is already winning.
>
> :-)
>
> --Jimbo