[Search-l] What Is Wikia and How Real Is It?
Chris Desouza
chrisdesouza at yahoo.com
Mon Aug 6 15:46:03 UTC 2007
ah! the bickering about search dollars aside, i had a
momentary lapse in distraction.
we all have to keep this in mind - google grew
gradually. it picked up speed out of the gate and had
time and resources to shore up the tracks.
wikia search will not have this advantage where
scalability is concerned. with so much media coverage,
wikia search must be prepared to handle the search
onslaught.
a search engine launch is one party where the host
cannot afford to run out of food and drinks.
google will find its match. and soon enough!
chris
--- Jimmy Wales <jwales at wikia.com> wrote:
> John McCormac wrote:
> > The venture is being portrayed as a Google Killer in the media coverage
> > and spin. The problem is that there is no actual basis for such a claim
> > other than it gives the media a nice soundbite and keeps the investors
> > happy.
>
> Actually, I think it makes the investors wonder what kind of lunatic I
> am. :)
>
> We are trying to downplay the "google killer" story line, but it is a
> great story line, and so the media runs with it anyway.
>
> You would get the same story lines about RedHat and Microsoft a few
> years ago. It's an interesting story, but has little relationship to
> getting some work done.
>
> > So if I read this right, there is no search engine?
>
> There currently is no search engine. This is a project to build one,
> but more importantly, to build this:
>
> > It is just an idea for a platform that is scalable and can be used for
> > search engine development? But without knowing the processing
> > requirements, the storage requirements and the bandwidth requirements,
> > it is difficult to design such a platform.
>
> Figuring out those things is part of the process, yes?
>
> > The bandwidth required to spider tens of millions of websites on an
> > ongoing basis is considerable. Therefore such a venture would need a lot
> > of available bandwidth.
> >
> > The hardware is also a very significant requirement. It would need a lot
> > of servers to do a proper crawl of the web. It would also require a
> > backend to process the resulting data into something usable. And a
> > search interface would be required.
>
> Yes, so that matches my own very scientific estimates. "a lot of
> bandwidth" and a "lot of servers". :)
>
> > The search index is the hard part. It takes a long time to develop a
> > good, clean index. The Infinite Monkeys approach to building an index
> > (following links and hoping that they will lead to new pages) is not the
> > most efficient method of building an index quickly when any of the prior
> > requirements are absent or deficient.
>
> I absolutely agree with that. I don't think anyone is proposing an
> Infinite Monkey approach to spidering.
>
> > A good index makes the difference between a great search engine and a
> > spam infested pile of junk. I'm not convinced that the Wikia people
> > quite appreciate the level of work that goes into that aspect of
> > developing a search engine. Crawling a clearly defined index such as
> > that of Wikipedia or some other silo site is easy. However crawling the
> > web is like trying to take a slice of a swirling nebula.
>
> Would it help if I say that I *do* appreciate the level of work that
> goes into that aspect of things? Not sure what you are looking for here.
>
> The task at the moment for me is to design the social aspect of the
> community part of the site. The goal is to have good tools to allow the
> community to control the crawl in intelligent ways. This is not
> Infinite Monkeys, and it has to deal with interesting questions about
> self-interested editors, trust, etc.
>
> > So what exactly can Wikia offer? Bandwidth? Hardware? Expertise? Can you
> > give us some descriptions and specifications of the resources and
> > expertise that is available to search engine developers? For most of us,
> > we have to deal with the realities imposed by hardware and bandwidth
> > limitations. We don't have the luxury of just theorising - everything we
> > do is geared towards survival in a highly competitive market. Perhaps we
> > SE people really are on a different wavelength to the Wikia people.
>
> Well, we do have the luxury of being able to provide hardware and
> bandwidth to the community. So we don't have to cut corners in those areas.
>
> > Perhaps the question foremost in the minds of many of the SE people on
> > this list is this: why should we provide the search expertise? Or, to
> > put it less diplomatically, why should we make you rich?
>
> I am not asking you to make me rich. If you don't want to participate,
> then don't.
>
> If you think you can go out on your own and build a proprietary search
> engine that makes you money, go ahead. If you think that you could find
> it useful to work with a broader community to leverage each other's
> talents so that in whatever you are doing (enterprise search? niche
> search on the web? social search?), there is a chance for you to compete
> with the big players on a much more level playing field, then come and
> help us.
>
> If you want us to build it for you, for free, giving it all to you and
> asking for nothing in return, then... well, that's fine too. :) That's
> what we do.
>
> --Jimbo
> _______________________________________________
> Search-l mailing list
> Search-l at wikia.com
> http://lists.wikia.com/mailman/listinfo/search-l
> Change options or unsubscribe:
> http://lists.wikia.com/mailman/options/search-l
>