From sethf at sethf.com Mon Jul 2 18:47:22 2007 From: sethf at sethf.com (Seth Finkelstein) Date: Mon, 2 Jul 2007 14:47:22 -0400 Subject: [Search-l] Wales - "Wikia as a search portal" Message-ID: <20070702184722.GA17752@sethf.com> [Disclaimer: I'm not sure what the following even means, much less agree or disagree with it, I just thought it was interesting] http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=36790 O'Reilly Conference Plots Pathways through the Digital Frontier for Publishers and Authors By Jon Leland - Posted Jun 29, 2007 What's Next Jimmy Wales, co-founder of Wikipedia, illuminated his latest venture, a site that is totally independent of Wikipedia called Wikia.com. Wikias, as these content-creating enthusiasts are known, are wiki software enabled online communities that are creating free content that Wales described as a large online "library" when compared to Wikipedia's online "encyclopedia." He boasts that Wikia is growing just as fast as Wikipedia did three years ago and showcased some StarTrek and Muppets-related pages -- tens of thousands of articles in just these specific subject areas. In fact, these user-authored and edited pages may well represent a whole new dimension in user-generated content. His vision is that this content in some cases could provide more useful search results than Google's and he is thus promoting Wikia as a search portal as well. 
-- 
Seth Finkelstein Consulting Programmer http://sethf.com/
Infothought blog - http://sethf.com/infothought/blog/
Interview: http://sethf.com/essays/major/greplaw-interview.php

From jwales at wikia.com Mon Jul 2 19:31:41 2007
From: jwales at wikia.com (Jimmy Wales)
Date: Mon, 2 Jul 2007 15:31:41 -0400
Subject: [Search-l] Wales - "Wikia as a search portal"
In-Reply-To: <20070702184722.GA17752@sethf.com>
References: <20070702184722.GA17752@sethf.com>
Message-ID:

It is basically by someone who jumped to random conclusions based on incomplete understanding, so Seth was right to not understand what it means. I don't understand either. :)

From stpeter at jabber.org Mon Jul 2 20:58:50 2007
From: stpeter at jabber.org (Peter Saint-Andre)
Date: Mon, 02 Jul 2007 14:58:50 -0600
Subject: [Search-l] Wales - "Wikia as a search portal"
In-Reply-To:
References: <20070702184722.GA17752@sethf.com>
Message-ID: <4689670A.5090702@jabber.org>

Jimmy Wales wrote:
> It is basically by someone who jumped to random conclusions based on
> incomplete understanding

Isn't that the definition of a journalist?

/psa

From jason at calacanis.com Mon Jul 2 21:32:41 2007
From: jason at calacanis.com (Jason Calacanis)
Date: Mon, 2 Jul 2007 14:32:41 -0700
Subject: [Search-l] Wales - "Wikia as a search portal"
In-Reply-To:
References: <20070702184722.GA17752@sethf.com>
Message-ID: <70b3cf150707021432v6d3c0117iba04a360918bc840@mail.gmail.com>

> On Jul 2, 2007, at 2:47 PM, Seth Finkelstein wrote
> > [Disclaimer: I'm not sure what the following even means, much
> > less agree or disagree with it, I just thought it was interesting]
> > http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=36790

OK, well, I was there and Jimmy specifically said... umm.... well.... come to think of it he didn't say much! He certainly didn't say anything specific about Wikia's plans.
All I can remember him saying was that he thought search software should be open source and that the algorithms should be transparent. That's it.... really. Other than that he just took a *LOT* of notes and let other folks talk. Now, during werewolf he had a LOT to say (inside joke). ;-) [[ more here: http://www.eblong.com/zarf/werewolf.html and http://en.wikipedia.org/wiki/Mafia_%28game%29 ]] best j --------------------- Jason McCabe Calacanis CEO, http://www.Mahalo.com Mobile: 310-456-4900 My blog: http://www.calacanis.com AOL IM/Skype: jasoncalacanis From gil at wikia.com Mon Jul 2 22:06:09 2007 From: gil at wikia.com (gil penchina) Date: Mon, 02 Jul 2007 15:06:09 -0700 Subject: [Search-l] Search-l Digest, Vol 8, Issue 1 In-Reply-To: References: Message-ID: <468976D1.5020503@wikia.com> Jason, I'm wondering if there's a way to limit you to posting less than HALF the total emails on this DL.... or is that just your overly competitive nature that forces you to be the most vocal person here? :-) Gil search-l-request at wikia.com wrote: > Send Search-l mailing list submissions to > search-l at wikia.com > > To subscribe or unsubscribe via the World Wide Web, visit > http://lists.wikia.com/mailman/listinfo/search-l > or, via email, send a message with subject or body 'help' to > search-l-request at wikia.com > > You can reach the person managing the list at > search-l-owner at wikia.com > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Search-l digest..." > > > Today's Topics: > > 1. Re: "directory" vs. "search engine" (Aerik Sylvan) > 2. Re: "directory" vs. "search engine" (Seth Finkelstein) > 3. Mahalo Greenhouse (Jason Calacanis) > 4. Re: Mahalo Greenhouse (Aerik Sylvan) > 5. Re: Mahalo Greenhouse (Allen Stern) > 6. Re: Mahalo Greenhouse (Jason Calacanis) > 7. Re: Mahalo Greenhouse (Aerik Sylvan) > 8. Re: Mahalo Greenhouse (Jason Calacanis) > 9. Re: Mahalo Greenhouse (Jason Calacanis) > 10. Wiki Culture? 
(Aerik Sylvan) > 11. Re: Wiki Culture? ( Jason McCabe Calacanis ) > 12. /Talkshow: Jeremie Miller interview on June 28 at 10:30am PT > (Seth Finkelstein) > 13. NYT: The Human Touch That May Loosen Google's Grip > (Seth Finkelstein) > 14. Mahalo (Michael Wechner) > 15. Re: Mahalo (Ashkan Karbasfrooshan) > 16. Re: Mahalo ( Jason McCabe Calacanis ) > 17. Re: Mahalo (Michael Wechner) > 18. Mahalo Greenhouse (Jason Calacanis) > 19. Hi to all! (Liberic Development) > 20. Oreilly OSCON next month! (jer) > 21. open source search - on the desktop (jer) > 22. Re: /Talkshow: Jeremie Miller interview on June 28 at 10:30am > PT (jer) > 23. Wales - "Wikia as a search portal" (Seth Finkelstein) > 24. Re: Wales - "Wikia as a search portal" (Jimmy Wales) > 25. Re: Wales - "Wikia as a search portal" (Peter Saint-Andre) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 13 Jun 2007 12:28:49 -0700 > From: "Aerik Sylvan" > Subject: Re: [Search-l] "directory" vs. "search engine" > To: "Jason Calacanis" , search-l at wikia.com > Message-ID: > <355a36af0706131228l5987c3c9o55676c8c8304dd6e at mail.gmail.com> > Content-Type: text/plain; charset="iso-8859-1" > > On 6/13/07, Jason Calacanis wrote: >> On 6/11/07, Nitin Borwankar wrote: >>> There are different kinds of bloggers. >>> Those who >>> b) have "fun" blogging AND are self employed due to ad-revenue (and >>> possibly premium subscription revenue) from blogging >> Nitin: You nailed it. We paid 300 bloggers a month when I was running >> Weblogs, Inc. (which produces Engadget, Joystiq, Autoblog, etc). So, >> there is a model between work for free for fun, and get paid and hate >> your job. Frankly, I find it crazy that folks would work for free for >> a venture-backed company when sales people, CEOs, programmers and >> others are getting paid--but that's just me. > > > > Hmm... I'd like to look at this whole question from some new angles. First, > is this topic of "working for free". 
I think we can safely say that no one > wants to "work for free". Pretty much by definition, we work for money, or > some other compensation (fame, glory, to alleviate feelings of guilt, > whatever). And certainly there are folks who work for some of those other > forms of compensation - to be brutally honest, I hope to achieve something > with many my contributions. Creating the Wiki Directory ( > http://wikidweb.com) was fun and interesting, but I hopeed to eventually get > something - not necessarily money, but something - out of it. So not all > work is for money or immediate compensation. > > BUT, there is also the category of contributions that are either altruistic, > or just fun. Probably many wikipedia editors fall in this category. They > are not being fleeced. They are not hoping to get rich or famous (perhaps a > few are, but many are not). They're having fun, or they're doing it out of > altruism. > > That leads nicely to another topic, which is the question of culture. Jason > rather famously said that Wikipedia should have advertisements ( > http://www.calacanis.com/2006/10/28/wikipedia-leaves-100m-on-the-table-or-please-jimbo-reconsider/ > ). > I think the culture of Wikipeida is probably kind of like public television > or radio - it's not a perfect analogy, because NPR pays (most) of it's > contributors, but many folks who call in or participate on shows are not > paid - so it's not too much of a stretch. Okay, so contrast public radio > and television to the for-profit broadcast channels (who are monetized by > advertising). > > The content and culture of the two are totally different. And I'm inclined > to theorize that there's a dynamic there that drives that. A large > non-profit supported by advertisements would be an interesting experiment, > but it's easy to think that the content decisions might be colored by a > desire to increase revenue. 
If you're pinching pennies and begging for > donations to make ends meet, the environment is one where it's easier to > stay focussed on > the goal of the endeavor, and have no temptations to tweak the content to > increase revenues. > > Now, the reason I bring it up is (well, it's intesting to think about on > it's own) that there is more to be considered than just business models. > Or, maybe more accurately in the case of Wikia, there's more to the business > model than just the revenue model. Wikipedia has a culture that drives it. > Mahalo will have a very different culture. What kind of culture will Wikia > Search have? > > (As a total side note, I can tell you that I am trying to foster a culture > more like Wikipedia at my wiki directory - thus, no quick sources of revenue > (Google Ads)... I'd guess that what I'm attempting to do is a funky hybrid > business model since I don't think donations could support it, at scale). > > Best Regards, > Aerik > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: http://lists.wikia.com/pipermail/search-l/attachments/20070613/bb3e5740/attachment-0001.html > > ------------------------------ > > Message: 2 > Date: Wed, 13 Jun 2007 16:03:11 -0400 > From: Seth Finkelstein > Subject: Re: [Search-l] "directory" vs. "search engine" > To: search-l at wikia.com > Message-ID: <20070613200311.GA740 at sethf.com> > Content-Type: text/plain; charset=iso-8859-1 > > On Wed, Jun 13, 2007 at 02:31:53AM -0700, Jason Calacanis wrote: >> On 6/11/07, Nitin Borwankar wrote: >>> There are different kinds of bloggers. >>> Those who >>> b) have "fun" blogging AND are self employed due to ad-revenue (and >>> possibly premium subscription revenue) from blogging >> Nitin: You nailed it. We paid 300 bloggers a month when I was running >> Weblogs, Inc. (which produces Engadget, Joystiq, Autoblog, etc). So, >> there is a model between work for free for fun, and get paid and hate >> your job. > > Hmm ... 
> > http://www.blogherald.com/2005/08/29/time-for-a-long-cold-shower-on-blogging-pay-rates/ > http://www.blogherald.com/2005/08/26/weblogs-inc-pay-rates-revealed-by-disgruntled-potential-recruit/ > http://www.metafilter.com/44548/Weblogs-Inc-Contract#1021977 > > Now, now, Jason - you're a businessman. I'm sure you don't > pay more than the market will bear. Which is better than zero, of > course, where relevant. But still, a job's a job. > >> Frankly, I find it crazy that folks would work for free for a >> venture-backed company when sales people, CEOs, programmers and >> others are getting paid--but that's just me. > > People sell flowers at airports for the greater glory of > their guru. Real-world economics is a complicated topic :-). > >> ...today I'm thrilled to announce that you can create search results >> at Mahalo.com and get paid for doing them! More details here: >> >> http://greenhouse.mahalo.com/ > > Piecework! Yet another old business model brought to digital > sharecropping :-). > From jason at calacanis.com Mon Jul 2 23:06:23 2007 From: jason at calacanis.com (=?utf-8?B?SmFzb24gTWNDYWJlIENhbGFjYW5pcw==?=) Date: Mon, 2 Jul 2007 23:06:23 +0000 Subject: [Search-l] Search-l Digest, Vol 8, Issue 1 In-Reply-To: <468976D1.5020503@wikia.com> References: <468976D1.5020503@wikia.com> Message-ID: <1569473234-1183417758-cardhu_decombobulator_blackberry.rim.net-1605016823-@bxe112.bisx.prod.on.blackberry> Competitive?!? Really? I thought this was an open source project that everyone could be involved in. As I told jimmy at foo camp Mahalo.com hopes to a) use Wikia's open source search software and b) wants to help build it. We *share* the mission to open up search. I find it odd you see us competitive at all. We're both outsiders with zero marketshare at this point after all. Also, didn't Jimmy say he wanted to have all the up and coming search engines involved in the project. It isn't a zero sum game in your mind is it? 
If you want exclude Mahalo.com from the project... Well, I will respect your wishes I guess, but it feels odd after jimmy was so open about us being involved. Also, how can I be half the posts when seth is 80% of them?!? :-)

All the best,
Jason
---------------
Jason at Calacanis.com | 310-456-4900
www.calacanis.com

_______________________________________________
Search-l mailing list
Search-l at wikia.com
http://lists.wikia.com/mailman/listinfo/search-l
Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l

From ic at thelastfrontier.com.au Tue Jul 3 00:21:40 2007
From: ic at thelastfrontier.com.au (Grahame Gould)
Date: Tue, 3 Jul 2007 08:21:40 +0800
Subject: [Search-l] Competitive contributing?
In-Reply-To: <1569473234-1183417758-cardhu_decombobulator_blackberry.rim.net-1605016823-@bxe112.bisx.prod.on.blackberry>
References: <468976D1.5020503@wikia.com> <1569473234-1183417758-cardhu_decombobulator_blackberry.rim.net-1605016823-@bxe112.bisx.prod.on.blackberry>
Message-ID: <3ADB1E55E4668048BF0A96E7828915F8A23DCA@jabiru.SWEK.local>

Oops, you're sounding overly touchy, Jason. I think the competitive comment (and the whole email) was tongue-in-cheek. At least, that's how I read it.
I'd therefore be very surprised if the email should be read as meaning Mahalo should not be involved. We're all here because we support open source and involvement. I certainly don't think you (or anyone else) is contributing too much!

Grahame Gould
Information Coordinator
Shire of Wyndham East Kimberley
Kununurra, WA, Australia
(08) 9168 4100 Phone
(08) 9168 1798 Fax
www.thelastfrontier.com.au

From sethf at sethf.com Tue Jul 3 13:54:56 2007
From: sethf at sethf.com (Seth Finkelstein)
Date: Tue, 3 Jul 2007 09:54:56 -0400
Subject: [Search-l] FYI - June author posting statistics
Message-ID: <20070703135456.GA22522@sethf.com>

If anyone is interested, here are the list posting statistics for the month of June:

Rank  Number  Percent  Name
   1      13    16.7%  Jason Calacanis
   2      11    14.1%  Aerik Sylvan
   3      10    12.8%  jer
   4       9    11.5%  Seth Finkelstein
   5       6     7.7%  William Surowiec
   6       5     6.4%  Nitin Borwankar
   7       4     5.1%  peter burden
   8       3     3.8%  Jimmy Wales
   9       3     3.8%  Pushparajan V
  10       2     2.6%  Sami M
  11       2     2.6%  Michael Wechner
  12       2     2.6%  Tall Street
  13       2     2.6%  Fred Benenson
  14       1     1.3%  Patrick Corcoran
  15       1     1.3%  Ashkan Karbasfrooshan
  16       1     1.3%  Allen Stern
  17       1     1.3%  Chris
  18       1     1.3%  Hua Fang
  19       1     1.3%  Liberic Development
Total = 78 posts

I parsed the list archive by-date page's HTML to generate this. In the spirit of Open Source, here's some Perl code if anyone wants to play with it:

    # Get basic data from HTML. Assumption: on Pipermail's by-date page each
    # author line starts with <I>; $' holds the text after the match (the name).
    while (<>) { chomp; $posts->{$'}++ if (/^<I>/); }
    # merge names
    $posts->{'Jason Calacanis'} += $posts->{'Jason McCabe Calacanis'};
    delete($posts->{'Jason McCabe Calacanis'});
    # add up total posts
    map { $total += $_ } values(%$posts);
    # statistics, sorted by post count, highest first
    foreach $author (sort { $posts->{$b} <=> $posts->{$a} } keys(%$posts)) {
        printf("%4d\t%d\t%4.1f%%\t%s\n", ++$i, $posts->{$author},
               100.0*$posts->{$author}/$total, $author);
    }
    print "Total = $total\n";

[Amusingly, Jason ended up being a special case! :-)]

-- 
Seth Finkelstein Consulting Programmer http://sethf.com/
Infothought blog - http://sethf.com/infothought/blog/
Interview: http://sethf.com/essays/major/greplaw-interview.php

From vprajan at gmail.com Tue Jul 3 15:15:05 2007
From: vprajan at gmail.com (Pushparajan V)
Date: Tue, 3 Jul 2007 20:45:05 +0530
Subject: [Search-l] Is Search-I a vision less project ?
Message-ID:

I am in this mailing list for about 2 months and nothing fruitful discussed which leads to a particular development model.. This project diverts like anything and now Mahalo coming in is quite confusing. What is the future plans ?.. I don't know why only very few question about the vision of this project..

Is this project exponentially leading to death ??.. How many users are there in this mailing list ?

I was really excited when Wikipedia started such a project and loved to contribute.. But now no passion is getting mixed in this mailing list.. Is wikia project going into hands of Mahalo or something similar going towards marketing perspective ??..

Jer, why not set a focus instead of experimenting with various technologies ?.. In the end, technology is not going to be big matter. The product that comes out of this mailing list matters..

First of all, Why this mailing list goes under domain "wikia.com" ??.. Itching is the ".com"... Is this project encourages commercial interest at the first plot itself ?.. If wikia want to make profit out of the project, encourage open source development first.. Get popularity first by placing it to open source developers.. Don't just shutdown the whole flow.. Please..

Why not open up a project in sourceforge.net or freshmeat.net ?.. Why not take YaCy project and move with it ?..

Thanks,
Pushparajan V
http://www.vprajan.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20070703/32cd6cee/attachment.html

From jason at calacanis.com Tue Jul 3 16:28:47 2007
From: jason at calacanis.com (Jason McCabe Calacanis)
Date: Tue, 3 Jul 2007 16:28:47 +0000
Subject: [Search-l] FYI - June author posting statistics
In-Reply-To: <20070703135456.GA22522@sethf.com>
References: <20070703135456.GA22522@sethf.com>
Message-ID: <1888942085-1183480203-cardhu_decombobulator_blackberry.rim.net-1942984633-@bxe028.bisx.prod.on.blackberry>

See, much less than 50%! :-) June was the month mahalo was launched and folks were talking about it. Perhaps a three/six month view might look a little different no?

J
---------------
Jason at Calacanis.com | 310-456-4900
www.calacanis.com

From jason at calacanis.com Tue Jul 3 16:54:38 2007
From: jason at calacanis.com (Jason Calacanis)
Date: Tue, 3 Jul 2007 09:54:38 -0700
Subject: [Search-l] FYI - June author posting statistics
In-Reply-To: <20070703135456.GA22522@sethf.com>
References: <20070703135456.GA22522@sethf.com>
Message-ID: <70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com>

On 7/3/07, Seth Finkelstein wrote:
> If anyone is interested, here's the list posting statistics for the
> month of June:

A sample for May take from:
http://lists.wikia.com/pipermail/search-l/2007-May/date.html

Seth Finkelstein: 29
Jer: 22
Jimmy Wales: 10
Jason Calacanis: 5 (all in relation to Seth's non-stop posts about Mahalo's launch)

To quote Jimmy Wales "I'm sorry Seth is trolling about it, but you know, that's Seth's way"

I still love you Seth.
:-)

best j
---------------------
Jason McCabe Calacanis
CEO, http://www.Mahalo.com
Mobile: 310-456-4900
My blog: http://www.calacanis.com
AOL IM/Skype: jasoncalacanis

From aerik at thesylvans.com Tue Jul 3 18:09:44 2007
From: aerik at thesylvans.com (Aerik Sylvan)
Date: Tue, 3 Jul 2007 11:09:44 -0700
Subject: [Search-l] FYI - June author posting statistics
In-Reply-To: <70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com>
References: <20070703135456.GA22522@sethf.com> <70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com>
Message-ID: <355a36af0707031109y5ddaff7cpd0c97733a3add1c4@mail.gmail.com>

I was gonna bite my tongue, really, but I just can't. This is both compliment and criticism: Jason, you are a master marketer. I tried to pitch wikidweb in some relevant forums and got blasted several times for self promotion. You have taken probably every opportunity of anything remotely relevant to mention (and promote) Mahalo. It's really quite impressive and perhaps a bit annoying, and honestly, I wish I had that talent.

Best Regards,
Aerik

-- 
http://www.wikidweb.com - the Wiki Directory of the Web

From aerik at thesylvans.com Tue Jul 3 18:16:15 2007
From: aerik at thesylvans.com (Aerik Sylvan)
Date: Tue, 3 Jul 2007 11:16:15 -0700
Subject: [Search-l] Is Search-I a vision less project ?
In-Reply-To:
References:
Message-ID: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com>

On 7/3/07, Pushparajan V wrote:
>
> I am in this mailing list for about 2 months and nothing fruitful
> discussed which leads to a particular development model.. This project
> diverts like anything and now Mahalo coming in is quite confusing. What is
> the future plans ?.. I don't know why only very few question about the
> vision of this project..
>
> Is this project exponentially leading to death ??.. How many users are
> there in this mailing list ?

Yeah, I'm kind of with you on this. I mentioned a few times that I think what the project needs is a leader with a vision - wait, that's not fair. Jimmy has given us a vision, but it's too vague. We, the mailing list, have proposed all kinds of disparate ideas, but thus far the bazaar has failed to self organize enough to actually get going. And I think that may be due at least in part to the .com - the for-profit motive of the parent. But I also think it may be largely inherent in the model.
Past bazaar-developed software projects probably at least had a preconceived idea of what their software was going to do, function-wise. Going to the masses and saying "let's build something cool and open - what should we build?" just may not work. And indeed, it doesn't seem to be working. There have been many hints of stuff in the works, and perhaps we, the general mailing list, do not have access to all the goings on in Wikia... In the spirit of open-ness, I hope that is not true - that at least our fearless leaders would clue us in on the *plans*, if indeed there are any concrete ones, to build a Wikia search engine. Best Regards, Aerik -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070703/04c7ccc0/attachment.html From aerik at thesylvans.com Tue Jul 3 18:26:04 2007 From: aerik at thesylvans.com (Aerik Sylvan) Date: Tue, 3 Jul 2007 11:26:04 -0700 Subject: [Search-l] Tech question about clucene... Message-ID: <355a36af0707031126v1fdd9c1dt9b18b31f78e5c48c@mail.gmail.com> Has anyone on the list used clucene ( http://clucene.sourceforge.net/index.php/Main_Page)? I've been playing with Lucene indexes for a side project (category intersections on Wikipedia), and while it has been easy to develop using zend_search_lucene, I want to try for faster results. I'm thinking this stuff is very relevant to a Lucene-backed search engine and that, even though I'm risking offending the "premature optimization" gods, a search app written in C/C++ is very likely to be inherently faster (in execution, not development) than anything else (all other factors being equal). So, that brings me to my question. I can clumsily write in C, and compile basic programs, but do not know any of the nuances or "advanced features" of C++. I'm trying to compile some example code and having problems.
If anybody was even remotely interested or was willing to humor a few newbie questions, it might lead to some exploratory development that might be of use to the search project. (So, if anyone can help, would you email me and we'll talk "off-line"?) Thanks! Aerik -- http://www.wikidweb.com - the Wiki Directory of the Web -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070703/7528ec43/attachment.html From allen at centernetworks.com Tue Jul 3 18:43:28 2007 From: allen at centernetworks.com (Allen Stern) Date: Tue, 03 Jul 2007 14:43:28 -0400 Subject: [Search-l] FYI - June author posting statistics In-Reply-To: <355a36af0707031109y5ddaff7cpd0c97733a3add1c4@mail.gmail.com> References: <20070703135456.GA22522@sethf.com> <70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com> <355a36af0707031109y5ddaff7cpd0c97733a3add1c4@mail.gmail.com> Message-ID: <468A98D0.3090800@centernetworks.com> Aerik - on the same topic, yesterday I compared three Silicon Valley Stars: Rose, Calacanis, Guy and their ability to generate buzz from their star power. While I am still not a fan of Mahalo, I gave Jason the highest mark of the three. http://www.centernetworks.com/silicon-valley-star-face-off-rose-calacanis-guy -- Allen Aerik Sylvan wrote: > I was gonna bite my tongue, really, but I just can't. This is both > compliment and criticism: > > Jason, you are a master marketer. I tried to pitch wikidweb in some > relevant forums and got blasted several times for self promotion. You > have taken probably every opportunity of anything remotely relevant to > mention (and promote) Mahalo. It's really quite impressive and > perhaps a bit annoying, and honestly, I wish I had that talent. 
> > Best Regards, > Aerik > > > > On 7/3/07, *Jason Calacanis* < jason at calacanis.com > > wrote: > > On 7/3/07, Seth Finkelstein > wrote: > > If anyone is interested, here's the list posting statistics for the > > month of June: > > A sample for May take from: > http://lists.wikia.com/pipermail/search-l/2007-May/date.html > > Seth Finkelstein: 29 > Jer: 22 > Jimmy Wales: 10 > Jason Calacanis: 5 (all in relation to Seth's non-stop posts about > Mahalo's launch) > > To quote Jimmy Wales "I'm sorry Seth is trolling about it, but you > know, that's Seth's way" > > I still love you Seth. :-) > > best j > --------------------- > Jason McCabe Calacanis > CEO, http://www.Mahalo.com > Mobile: 310-456-4900 > My blog: http://www.calacanis.com > AOL IM/Skype: jasoncalacanis > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > > > -- > http://www.wikidweb.com - the Wiki Directory of the Web > ------------------------------------------------------------------------ > > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l From beesley at gmail.com Tue Jul 3 19:45:37 2007 From: beesley at gmail.com (Angela) Date: Tue, 3 Jul 2007 20:45:37 +0100 Subject: [Search-l] Is Search-I a vision less project ? In-Reply-To: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com> References: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com> Message-ID: <8b722b800707031245g2d048898vb590d660380fbd35@mail.gmail.com> I realize there aren't as many updates here as many people would like, but that doesn't mean nothing is happening. 
We do have the vision (Jimmy) and the leader (Jeremie) and even a non-commercial domain (swlabs.org). I'm sure there will be more information soon as Jeremie and Jimmy are meeting before OSCON to work on Wikia Search. Angela -- Angela Beesley Wikia.com From mb at mariobehling.de Tue Jul 3 19:52:31 2007 From: mb at mariobehling.de (Mario Behling) Date: Wed, 04 Jul 2007 00:22:31 +0430 Subject: [Search-l] Is Search-I a vision less project ? In-Reply-To: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com> References: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com> Message-ID: <468AA8FF.4080309@mariobehling.de> An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070704/1b3b9567/attachment.html From vprajan at gmail.com Tue Jul 3 20:11:47 2007 From: vprajan at gmail.com (Pushparajan V) Date: Wed, 4 Jul 2007 01:41:47 +0530 Subject: [Search-l] Is Search-I a vision less project ? In-Reply-To: <468AA8FF.4080309@mariobehling.de> References: <355a36af0707031116x328609bcjcc0da1cf3d3cb908@mail.gmail.com> <468AA8FF.4080309@mariobehling.de> Message-ID: On 7/4/07, Mario Behling wrote: > > Aerik Sylvan schrieb: > > > On 7/3/07, Pushparajan V wrote: > > > > What is the future plans ?.. > > > > > There have been many hints of stuff in the works, and perhaps we, the > general mailing list, do not have access to all the goings on in Wikia... In > the spirit of open-ness, I hope that is not true - that at least our > fearless leaders would clue is in on the *plans*, if indeed there are any > concrete ones, to build a Wikia search engine. > > Best Regards, > Aerik > > Hi! > > I am on the list only for a few days, but I am also curious about the > ideas/plans - what is to become of this project. > > I also see working examples of search engines that function already. I met > Doug Cutting about three years ago when he introduced Nutch in Berlin. 
As he > put it, starting a search engine based on free software is mainly a > financial problem - running a server farm and so on. (By the way, I did not > find information what software Mahalo is using.) > > In Germany we have founded an association to promote free search engines ( > www.suma-ev.de, Website going Web 2.0 soon). Michael Christen from yacy is > a member too. To support developers with helping to get better funding could > really give projects like yacy the additional drive they need. Generally it > seems to me that some projects kind of lack the promotion they deserve. > Wikiasearch had lots of promotion, but I do not yet see how the ideas Jimmy > told us could be implemented right now. > The association for decentralized search is very interesting.. But ideas like this can be popularized in local LUGs, Linux conferences and in Linux magazines also.. The only thing YaCy-like projects lack is promotion and more developers getting involved.. Infrastructure and funding will not be a problem if the project is known to many people and it's useful for their daily needs.... Best regards, > > Mario > > > http://freelayers.org > http://perspektive89.com > -- Pushparajan V http://www.vprajan.org - - - - - - - - Know me: http://www.hackerkey.com/decrypt.php?hackerkey=v4sw57BCHJUY$hw3/5ln2pr6AFOPSck3ma4u7FLMSw7DTWXm6l6FGIKLRSU$i862NLJ0CAe6$t3b4en4a23Ns3MSr9g5AGO - - - - - - - - -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070704/0b0677f6/attachment.html From jeremie at jabber.org Tue Jul 3 20:59:07 2007 From: jeremie at jabber.org (jer) Date: Tue, 3 Jul 2007 15:59:07 -0500 Subject: [Search-l] Is Search-I a vision less project ?
In-Reply-To: References: Message-ID: <2E750102-F74D-4AEA-952C-6E0E972DE46D@jabber.org> Give me a day here to pull together a summary for everyone of everything I'm aware of, I've been meaning to do this but stuff keeps changing :) Imagine right now this effort as a young Nebula, a gas cloud with lots of elements, starting to collapse into some hazy objects. I'll do my best to name them and predict what they might become, but there's a lot of raw physics going on so who really knows, I'll do my best to contribute some positive gravity, *grin*. Jer On Jul 3, 2007, at 10:15 AM, Pushparajan V wrote: > I am in this mailing list for about 2 months and nothing fruitful > discussed which leads to a particular development model.. This > project diverts like anything and now Mahalo coming in is quite > confusing. What is the future plans ?.. I don't know why only very > few question about the vision of this project.. > > Is this project exponentially leading to death ??.. How many users > are there in this mailing list ? > > I was really existed when Wikipedia started such a project and > loved to contribute.. But now no passion is getting mixed in this > mailing list.. Is wikia project going into hands of Mahalo or > something similar going towards marketing perspective ??.. > > Jer, why not set a focus instead of experimenting with various > technologies ?.. In the end, technology is not going to be big > matter. The product that comes out of this mailing list matters.. > > First of all, Why this mailing list goes under domain " > wikia.com" ??.. Itching is the ".com"... Is this project encourages > commercial interest at the first plot itself ?.. > > If wikia want to make profit out of the project, encourage open > source development first.. Get popularity first by placing it to > open source developers.. Don't just shutdown the whole flow.. Please.. > > Why not open up a project in sourceforge.net or freshmeat.net ?.. > Why not take YaCy project and move with it ?.. 
> > Thanks, > Pushparajan V > http://www.vprajan.org > > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: http://lists.wikia.com/mailman/ > options/search-l -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070703/b0583af7/attachment.html From sethf at sethf.com Tue Jul 3 21:14:02 2007 From: sethf at sethf.com (Seth Finkelstein) Date: Tue, 3 Jul 2007 17:14:02 -0400 Subject: [Search-l] hipmojo.com - "Jimmy Wales vs. Jason Calacanis" Message-ID: <20070703211402.GA27219@sethf.com> [Disclaimer: I didn't forward this earlier because it would seem to open a can of worms I thought better left undisturbed at the time. But since they appear to be squirming around already today, well, I suppose now I won't get flamed *too* badly ... Again, I didn't write the post below, though I'm forwarding it with awareness of the context, that it brings up problematic issues, and it's a very thoughtful exploration.] http://www.watchmojo.com/web/blog/?p=1723 24 Jun 2007 by Ashkan Karbasfrooshan Jimmy Wales vs. Jason Calacanis Sometimes, the nicest things you can say about someone come across as awfully critical. I say that because this post actually is intended to give a lot of credit to both Jason Calacanis and Jimmy Wales, but a cynic would argue it strives to do the opposite. This morning it occurred to me that Jimmy Wales might have blown his opportunity for redemption, before he even had a chance. I'm hoping I'll be proven wrong, but we'll see. The Background? Last year, between Christmas and New Year's, Wikipedia co-founder Jimmy Wales said that he was looking to rely on humans and open source technology to build a Google-killing search engine. That raised a lot of eyebrows, but it got a lot of people excited.
Over the next few months, people rushed to join his army of developers in the attempt to develop an open-source search engine built by legions of programmers. That's right, people signed up to a mailing list and got cracking. I always wondered: how many Google employees signed up to that list, too? Frankly, from the get-go, it was a hazy objective with a murky action plan. To his credit, Wales had built one of the most successful social media properties in Wikipedia.org, the world's largest free encyclopedia, relying on wiki software that allows everyone and anyone to make changes to a text. Wikia = a for-profit Wikipedia When the plans were being conceived, people were confused as to Wikipedia.org's role in the new startup, whose name, it was reported, would be Wikiasari, then Wikia Search. Wikia was the for-profit company Wales had set up to develop wiki-powered communities. By the looks of it today, the name of his search is Wikia Search, though we could be wrong and this could change. All Quiet on the Inbox Front? Sometime in February or March 2007, I was going to publish a cynical post wondering if all of the members on the mailing list had been kidnapped, because I went one whole day without getting an email. At its peak, subscribers would get bombarded with 20-30 emails per day with everyone chiming in on everything you can imagine. It would start off with someone introducing themselves, why they wanted to collaborate and what they saw as a problem with the current search landscape. It was information overload but a great social experiment. Things have settled down now, but you still get the odd flurry of emails, usually when there is something in the news that touches on the project. Enter a New Incumbent In all honesty, this mailing list information overload phenomenon led me to realize why Google was successful: there were only two guys early on. It was easy to get things done. Ultimately, I thought, Wikia Search's biggest drawback was its greatest strength.
The wisdom of the crowds theory was making it unlikely for anything to get done, at least on the surface. Over the subsequent months, web entrepreneur Jason Calacanis - who had founded and run Silicon Alley Reporter in the late 1990s and sold his second startup Weblogs Inc. to AOL for $25M - encouraged Wales to embrace advertising on Wikipedia. That prompted me to publish "What would Wikipedia.org be worth as a for-profit," which got a lot of people excited about the commercial upside of Wikipedia.org. Calacanis was one of them. What If... But, because Wikipedia.org had launched as a non-profit, it had to remain as such, and Wales to this day maintains, quite honorably and candidly, that launching Wikipedia.org as a non-profit was his best and / or worst decision ever. No doubt, Wikipedia.org might not have become as big had it been a for-profit, but then seeing socially-edited content sites like Digg and YouTube explode runs counter to that assertion... what if Wikipedia.org had been a for-profit all along? With that thought in the back of Wales' mind, he set out to start a new company, one that was structured outside of Wikipedia.org, but built on a lot of the tangible and intangible tenets thereof, that would become his brass ring. Man's Hubris: Search And in order not to try to re-invent the wheel - though technically he was doing that too, with Wikia - Wales decided to tackle search. I've long stated that search is the hubris of the Web entrepreneur. Why, I thought, would someone with Wales' mythical stature risk it all to lose to Google in search? And lose badly? By way of disclosure: I should state that I too developed a search engine, called MetaMojo.com, but I did so out of a personal interest as a hobby, and because it was something that would not violate my employment and non-competition agreement.
Today, I spend a good amount of my time and energy on search, but the bulk of it goes to video, a far more nascent field where we've already developed a leadership position. After all, search, while still a somewhat new sector of the overall communications and commercial economy, is a more mature space. Oh, there's also a massive player in the room called Google that has gone from $0 to a market cap of $160B in less than ten years, which last year generated $10B in revenues, over $3 billion in profits and could at current growth rates surpass Microsoft in market valuation by 2010, maybe at least. Enter the Naysayers Wales was not only ridiculed by many for attempting to create a Google-killer out of thin air, but he also has been criticized in the past six months for what comes across as inaction. As the mailing list at Wikia Search expands every day, people become impatient and Wales himself has frequently stated that "you can't build a search engine like his overnight." To his credit, he recently hired Jabber founder Jeremie Miller to spearhead the project. I say this to emphasize: no doubt Wikia Search is making a lot of inroads and advancing, and there's plenty of time left in search, but clearly, others are not standing still. Thank You for nothing! Of course, as excitement and innocence surrounding that mailing list gave way to cynicism and frustration, others have not sat still. Today's New York Times article on search and the human touch should have technically talked mainly about Wales' project, be it Wikipedia or Wikia Search. But it does not; instead, it touched on Jason Calacanis' Mahalo. And not too long after someone in that same venerable mailing list sent the article around, another subscriber asked: I am not sure if something similar like Mahalo (e.g. http://www.mahalo.com/Java) was considered by Wikia. It seems to me that their way of providing search results is closely related to the idea of a public Wiki.
I wonder what the reactions of both Calacanis and Wales were after that email was opened. Surely, Calacanis must have grinned when he clicked and opened that message, and Wales must have been somewhat miffed, for Calacanis is surely combining a lot of the elements of Wikipedia into his new project, Mahalo. I'm not saying that Calacanis borrowed from Wikipedia, though the use of wiki software, the human-compiled database etc., are all signature tactics of Wikipedia.org, though clearly by way of its open source status, not proprietary to it. What Could Have Been? It should also be stated that had Wales long ago encouraged Wikipedia editors to add the most relevant and pertinent links for each topic, it would already have a human-compiled directory consisting of millions upon millions of web sites. Of course the same things that plagued Yahoo! Directory, Looksmart, DMOZ and that will plague Mahalo would threaten it, but as it stands now, you can't help but think that Calacanis pulled a Digg/Netscape 2 by borrowing heavily from Wales' baby and applying it to search at Mahalo.com. In no shape, form or fashion does this imply that the two are directly competing etc., but when I opened that email this morning (yes, I'm on the list) my reaction was: "Wales must not be smiling." Heavy Backers While Wales was able to secure funding from the likes of Amazon.com and raised $3.5M for his search project, Mahalo raised a massive $16M at a valuation of $100M. That is an insane amount of money for a startup that many will ridicule as Looksmart or Squidoo, or even Chacha, all companies with many obstacles in the marketplace and no real strategy to be relevant. In Search, Distribution is Everything After all, the key consideration in search is not the quality of the algorithm, size of the index etc., but rather the distribution thereof. While Wales had been openly talking about his new search engine since Christmas in a mailing list, Calacanis was coy.
It was only recently that Valleywag eerily accurately described Calacanis' project that the details came to light. Shortly thereafter, Calacanis unveiled Mahalo.com and the naysayers outnumbered the believers quite a bit. I'm still not sure of how relevant Mahalo.com or Wikia Search will be, frankly. I do think however, that by having secured Sequoia as a backer, Calacanis has managed to ensure his company's exit before Wales even hits the entrance. Of course, if distribution is king in search, then Wales cannot be written off, since adding Wikia Search (whenever it launches) across the sprawling Wikipedia.org site would overnight create a winner out of his project, too. The lesson, I suppose, is that online advertising is only starting to come to fruition so even the laggards in the space might have better prospects than the winners of many of the new segments of online commerce and communications that many are getting over-excited about. Posted in Internet & Web, Financing, Search Wars, Open Source, Management, Online Advertising, Wikipedia | Ashkan Karbasfrooshan -- Seth Finkelstein Consulting Programmer http://sethf.com/ Infothought blog - http://sethf.com/infothought/blog/ Interview: http://sethf.com/essays/major/greplaw-interview.php From jwales at wikia.com Tue Jul 3 21:34:14 2007 From: jwales at wikia.com (Jimmy Wales) Date: Tue, 3 Jul 2007 14:34:14 -0700 Subject: [Search-l] Wales - "Wikia as a search portal" In-Reply-To: <70b3cf150707021432v6d3c0117iba04a360918bc840@mail.gmail.com> References: <20070702184722.GA17752@sethf.com> <70b3cf150707021432v6d3c0117iba04a360918bc840@mail.gmail.com> Message-ID: <5D5F3721-E408-4241-AA88-73008C63745D@wikia.com> On Jul 2, 2007, at 2:32 PM, Jason Calacanis wrote: > OK, well, I was there and Jimmy specifically said... umm.... well.... > come to think of it he didn't say much! I'm a pretty quiet guy, actually. I like to listen and learn. 
--Jimbo From jason at calacanis.com Tue Jul 3 21:43:22 2007 From: jason at calacanis.com (Jason Calacanis) Date: Tue, 3 Jul 2007 14:43:22 -0700 Subject: [Search-l] FYI - June author posting statistics In-Reply-To: <468A98D0.3090800@centernetworks.com> References: <20070703135456.GA22522@sethf.com> <70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com> <355a36af0707031109y5ddaff7cpd0c97733a3add1c4@mail.gmail.com> <468A98D0.3090800@centernetworks.com> Message-ID: <70b3cf150707031443u31334f7bn966bf14f75ec5959@mail.gmail.com> For the good of the group--and my relationship with Jimmy/Gil--can we please take discussions of Mahalo/me being on the list/etc. off the list going forward please? For the record: I'm here to listen, learn, and participate in the open-source Wikia Search project. Mahalo is a search service that plans to *USE* Wikia search software--like 100 other search companies will I'm sure. We're not competition for Wikia search software. Note: Anyone can reach me at jason at calacanis dot com any time and I'll talk with you one-on-one. best j --------------------- Jason McCabe Calacanis CEO, http://www.Mahalo.com Mobile: 310-456-4900 My blog: http://www.calacanis.com AOL IM/Skype: jasoncalacanis Add my podcast to iTunes: http://phobos.apple.com/WebObjects/MZStore.woa/wa/viewPodcast?id=204314545 From ic at thelastfrontier.com.au Wed Jul 4 00:52:58 2007 From: ic at thelastfrontier.com.au (Grahame Gould) Date: Wed, 4 Jul 2007 08:52:58 +0800 Subject: [Search-l] Open forum In-Reply-To: <70b3cf150707031443u31334f7bn966bf14f75ec5959@mail.gmail.com> References: <20070703135456.GA22522@sethf.com><70b3cf150707030954i38326fcn81435a2b271b9b03@mail.gmail.com><355a36af0707031109y5ddaff7cpd0c97733a3add1c4@mail.gmail.com><468A98D0.3090800@centernetworks.com> <70b3cf150707031443u31334f7bn966bf14f75ec5959@mail.gmail.com> Message-ID: <3ADB1E55E4668048BF0A96E7828915F8A23F88@jabiru.SWEK.local> I think you're being overly sensitive again, Jason. 
I assume you're referring to Ashkan's article posted by Seth. The article specifically says that the author thinks you're not competing (directly) with Wikia. People are entitled to speculate and wonder, and form opinions, and this listserve should be an open discussion. I'm glad you've put on record your position. Hopefully, it will clear things up for those who aren't sure what the real purpose/motivation of Mahalo is. Instead of asking for censorship, continue to address what you see as incorrect information. We're all listening, including Ashkan. Grahame Gould Information Coordinator Shire of Wyndham East Kimberley Kununurra, WA, Australia (08) 9168 4100 Phone (08) 9168 1798 Fax www.thelastfrontier.com.au From jwales at wikia.com Wed Jul 4 18:04:11 2007 From: jwales at wikia.com (Jimmy Wales) Date: Wed, 4 Jul 2007 11:04:11 -0700 Subject: [Search-l] hipmojo.com - "Jimmy Wales vs. Jason Calacanis" In-Reply-To: <20070703211402.GA27219@sethf.com> References: <20070703211402.GA27219@sethf.com> Message-ID: On Jul 3, 2007, at 2:14 PM, Seth Finkelstein wrote: > Why, I thought, would someone with Wales' mythical stature risk it all > to lose to Google in search? And lose badly? I always like to do whatever seems like it will be the most fun. And if the mission were viewed as "beating Google" then I suppose I would lose, but I don't look at it that way at all, any more than the mission with Wikipedia was or is to "beat Britannica". The difference between Mahalo and Wikia is pretty clear, isn't it? Mahalo is proprietary. Wikia is free. Or at least, I haven't heard Jason say that he's going to release either the software or data under a free license. And so, for me, his project is just not that interesting. I mean, I am sure it is lovely and all, but I really don't care about it. I do wish he would stop spamming this list with irrelevancies, but of course he is welcome to contribute and welcome to use our work if it suits him. 
As to the pace, I can only advise people to have a bit of patience! I may have "mythical stature" to some, but the truth is I always move slowly, think a lot, talk a lot, discuss a lot. Review the history of Wikipedia... going allllll the way back. There were first nearly 2 years of "Nupedia" which ended up going nowhere but involved a lot of thoughtful people discussing how to build an encyclopedia, a lot of friendship building, community building, vision shaping, etc. That's where we are today. Hopefully we will have the first stab of something that sucks up by the end of the year. And then we will start to revise, reconsider, rethink, delete, add, edit, change, until we start making something better and better over time. Will it take 2 years? 5 years? I dunno. It will take however long it takes. But it will be fun. :-) At that meeting at Foo Camp, the one where I didn't say much, there was something that I did say. I said that for me, this is a political (small-p politics) mission in the same way that Wikipedia is a political mission. Search is a part of the fundamental infrastructure of the Internet, and it should be free in the same way that much of the rest of the infrastructure is free. Talk about competition always confuses or bores me. I just don't think of the world in that way. I think we should just build something cool and have a good time doing it. If it becomes huge and important, great. If someone else does something better, great. As long as we do work we are proud of, and have a productive impact on the world, I'm happy. One of my accidentally famous remarks is "I advise the world to relax a notch or two!" I really mean it, too. :-) In practical news, Jeremie is coming out to spend a week brainstorming with me in a couple of weeks. We hope to put together some useful proposals for initial steps after that.
--Jimbo From jason at calacanis.com Wed Jul 4 19:43:31 2007 From: jason at calacanis.com (Jason Calacanis) Date: Wed, 4 Jul 2007 12:43:31 -0700 Subject: [Search-l] hipmojo.com - "Jimmy Wales vs. Jason Calacanis" In-Reply-To: References: <20070703211402.GA27219@sethf.com> Message-ID: <70b3cf150707041243j76ff873bua0bc5716b0682b8e@mail.gmail.com> On 7/4/07, Jimmy Wales wrote: > On Jul 3, 2007, at 2:14 PM, Seth Finkelstein wrote: > The difference between Mahalo and Wikia is pretty clear, isn't it? > Mahalo is proprietary. Wikia is free. Or at least, I haven't heard > Jason say that he's going to release either the software or data > under a free license. And so, for me, his project is just not that > interesting. I mean, I am sure it is lovely and all, but I really > don't care about it. Mahalo for the feedback Jimmy--you are about 1/3rd correct. 1. On software ------------------------ All the changes to MediaWiki, the software we are using, are being contributed back. We've made a great threaded message board that I think would be a great tool for Wikipedia to transition their discussion pages to and I hope they consider it. In fact, you know this Jimmy because at Foo Camp I told you about our Message Board while we drank scotch and offered it to you to use at Wikia--I guess you forgot (maybe the scotch?). :-) In terms of search software we use Nutch right now and might use Wikia's software down the road. Again, all software is free and we are contributing all changes back. We are also using Squid and contributing back changes. On the software front we are exactly the same as Wikia. 2. Editorial ------------------------ In terms of the hand-written search results we are paying people to create them in the Greenhouse (http://greenhouse.mahalo.com) and on our staff. We are retaining ownership of the editorial we pay for so we can continue to pay for it. 
That is where Jimmy and I diverge it seems: we feel since we're paying for the results we should own them; Jimmy isn't paying editors so doesn't have the same worry. Now, this is not written in stone. In the future we might move to a Creative Commons model for the results--perhaps non-commercial so someone doesn't just lift the entire Mahalo index and dilute our ability to pay the contributors. That's my main concern: figuring out a way to keep paying folks who want to get paid for their contributions. So, I like CC Noncommercial and I like paying people. My previous company Weblogs, Inc. paid people to create amazing blogs like HackADay, Engadget, Autoblog, Joystiq, etc. These folks were able to make careers out of working from home--many were stay-at-home parents with five hours to work while their kids were at school. I'm very proud of the fact that we were able to pay them. In fact, I see being able to create sustainable ways to pay folks for working on distributed editorial projects as a great success--that is what I hope to do with Mahalo. If folks do not want to be paid--if that kills the excitement for them--we are letting them donate the fees to Wikipedia or Mozilla (we might add more projects in the future). I think this is the best model going forward balancing Yochi's dreams while letting people keep a roof over their head and take care of their kids. My feeling on the subject of people getting paid for work is that venture-backed companies should pay people for their work *if* those people want to be paid. If you're spending five hours a day editing Mahalo I feel it's only right to give you something other than a pat on the back or the rush of building a search engine/directory. If you're spending five minutes a month suggesting links? Maybe not. There really are many classes of people in these systems and you can create various compensation systems for them from monetary to altruistic. Again, I take nothing away from folks who do it for the fun... 
I'm one of them! My edits to Wikipedia are free. My podcast and blog sponsorships go to a charity that puts foster kids into private schools with small class sizes. Honestly, as a writer by trade, I always find it odd that the creative people are the ones left behind when VC-backed companies start a project. The management team, core developers, administrators always seem to get paid... the "content creators" (to use a horrible term) always seem to get left behind. I hope we can turn that model on its head and help the creative/editors/artists make a living. This is just my vision... other people can--and have--created large systems that thrive without paying creative types. > I do wish he would stop spamming this list with > irrelevancies, but of course he is welcome to contribute and welcome > to use our work if it suits him. At this point Gil and Jimmy feel I am spamming the list, but are telling me off list they want me involved in the project. Talk about mixed messages!!! My emails to the list have only been in response to people talking about Mahalo. If folks want to discuss Mahalo and Wikia I'm willing to discuss it--I'm not sure why Jimmy and Gil want to censor this discussion or consider it irrelevant--the people posting messages about Mahalo don't seem to think they are irrelevant, Jimmy. Also, Jimmy I'm kind of hurt (really--just crushed :-) that you're resorting to using aggressive language like "spamming" when all I've ever done is offer to support the Wikia search project. I even offered to share costs for the servers at Foo Camp. You said you wanted to form a group of up and coming search companies who could leverage open software--that's what we want too. What happened to the "assume good faith" thing? Other folks are asking me to be involved on the list (and project), Gil is asking me to be involved off list, so... it feels like I'm damned if I do and damned if I don't to a certain extent. 
I mean, I have to respond to this email because it's got wrong assumptions, but is Jimmy going to continue saying I'm a spammer because I'm correcting the errors in his assumptions? Ugh. I'm going to keep my position of only replying to direct questions and Mahalo misinformation/clarifications (like this one) going forward. If you want me to respond to anything else my email and phone number are below and I'm more than willing to talk to anyone at any time.... but for now I'm gonna jump in the pool!!!! Happy 4th everyone... best j --------------------- Jason McCabe Calacanis CEO, http://www.Mahalo.com Mobile: 310-456-4900 My blog: http://www.calacanis.com AOL IM/Skype: jasoncalacanis From ash at mojosupreme.com Wed Jul 4 20:57:08 2007 From: ash at mojosupreme.com (Ashkan Karbasfrooshan) Date: Wed, 4 Jul 2007 16:57:08 -0400 Subject: [Search-l] hipmojo.com - "Jimmy Wales vs. Jason Calacanis" In-Reply-To: <70b3cf150707041243j76ff873bua0bc5716b0682b8e@mail.gmail.com> References: <20070703211402.GA27219@sethf.com> <70b3cf150707041243j76ff873bua0bc5716b0682b8e@mail.gmail.com> Message-ID: Shouldn't you boys be BBQing something, watching fireworks and drinking beer (or scotch) today? ;) Happy 4th of July to all US recipients of this email. Ash On 7/4/07, Jason Calacanis wrote: > > On 7/4/07, Jimmy Wales wrote: > > On Jul 3, 2007, at 2:14 PM, Seth Finkelstein wrote: > > The difference between Mahalo and Wikia is pretty clear, isn't it? > > Mahalo is proprietary. Wikia is free. Or at least, I haven't heard > > Jason say that he's going to release either the software or data > > under a free license. And so, for me, his project is just not that > > interesting. I mean, I am sure it is lovely and all, but I really > > don't care about it. > > Mahalo for the feedback Jimmy--you are about 1/3rd correct. > > 1. On software > ------------------------ > All the changes to MediaWiki, the software we are using, are being > contributed back. 
> [...]
> > best j > --------------------- > Jason McCabe Calacanis > CEO, http://www.Mahalo.com > Mobile: 310-456-4900 > My blog: http://www.calacanis.com > AOL IM/Skype: jasoncalacanis > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -- Ashkan Karbasfrooshan President Mojo Supreme 5413 St. Laurent Blvd Suite 200 Montreal, QC H2T 1S5 Canada p: 1-514/448-1631 c: 1-514/827-2532 f: 1-866/868-0981 Ash at MojoSupreme.com http://www.WatchMojo.com http://www.MojoSupreme.com/about.html -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070704/fc0e5254/attachment.html From gevangasteren at gmx.net Wed Jul 4 21:38:34 2007 From: gevangasteren at gmx.net (=?ISO-8859-15?Q?G=E9_van_Gasteren?=) Date: Wed, 04 Jul 2007 23:38:34 +0200 Subject: [Search-l] Wales - "Wikia as a search portal" Message-ID: <468C135A.4020903@gmx.net> >I'm a pretty quiet guy, actually. I like to listen and learn. >--Jimbo Just like to give you some cheers for that remark! 1. Wikipedia's success is speaking for you, and very eloquently too! 2. A story from India: When the Creator wanted to make the universe, he tried and tried, but nothing came out. Then he got the inspiration to first meditate. So he sat for a thousand years. And when he came out of it, his thoughts were so powerful and perfect that they manifested instantly and effortlessly. 
From jeremie at jabber.org Thu Jul 5 05:00:08 2007 From: jeremie at jabber.org (jer) Date: Thu, 5 Jul 2007 00:00:08 -0500 Subject: [Search-l] Short Summary Message-ID: <390ECF8C-77F6-43A7-9441-BCA2AFC58BB5@jabber.org> Hope everyone in these parts had a great 4th, I'm sitting around a campfire yet with the family as I type this :) Jimmy posted the most important part already: this is a long-term process and still very much evolving. I'll break that evolution into three general categories... First, we're here to support any open source search project around the four principles we discussed earlier: Transparency, Community, Quality, and Privacy. We've got swlabs.org servers humming and anyone needing more than sourceforge style project hosting is welcome to take advantage of this. Any open source projects that form as a result of anything here, this will be their home. Also, we hope to time a number of exciting project announcements with OSCON :) Where I solve problems with protocols, Jimmy does it with social systems, and he has some really beautiful ideas to contribute to the search space in a number of unique and important ways. These systems require a lot of patience and time to build and get right; expect this discussion to really blossom at OSCON as well. The plan is to use this mailing list for all related discussions and transform search.wikia.com into the user-driven system, and any/all content will always be available under the GFDL and any code under an open source license. Finally, the one ring to bind them is the protocol (well, more of an open architecture really) I've been teasing with for way too long. I'll follow up this message with a short introduction to "Atlas." Not a lot of written detail yet as the vision and fundamentals have gotten all the attention, but it won't take long for the details to emerge. As Jimmy said, it's going to be a lot of fun! 
Jer From jeremie at jabber.org Thu Jul 5 05:03:57 2007 From: jeremie at jabber.org (jer) Date: Thu, 5 Jul 2007 00:03:57 -0500 Subject: [Search-l] Atlas - Internet Search Infrastructure Message-ID: <30D55DA4-47A8-4AEB-A0A4-466FEAA9DE33@jabber.org> This is a brief overview of a large vision: enabling search to become a part of the Internet's infrastructure. Building on Atlas as an open protocol, search can become a fully distributed and interoperable world-wide community. All of the participants can interact openly and in any role where they believe they can add value to the network. A search engine can be constructed from many independent entities serving different roles instead of one monolithic system. These entities are exchanging aggregate information, or knowledge, and can decide with whom they want to work. To design this working economy based on knowledge, there must be balance between these various entities. Each actor must have incentive to act both for their own benefit and for the benefit of the whole, and enough information to make and validate those decisions. Reputations and relationships are the essential fabric of Atlas, just as they are in a real-world free market. There are three primary roles within Atlas: Factory - Responsible to the content. Collector - Responsible to the keyword. Broker - Responsible to the searcher. Each of these actors must interact with the others to complete any search request. Any two roles could be performed by a single entity (whereas if all three are performed by one entity, the result would be a traditional, monolithic search engine). A Factory is akin to a crawler in today's search engines. An Atlas Factory must fetch and process the content as intelligently as possible, performing analysis (such as Natural Language Processing) and normalizing it into distinct units. A Factory shares its highly refined and processed output with one or more Collectors based on who they believe is best utilizing it. 
A Collector absorbs and indexes output from one or more Factories, with one primary goal: ranking. An Atlas Collector must provide the most intelligent ranking and relationship analysis possible. A Collector has to compete for the output of a Factory, as well as compete to provide the best ranking quality for Brokers. A Broker must provide a searcher with the best possible results. It does so by combining diverse ranking results from Collectors and also by retrieving content from the original Factories. This last step, a Broker interacting with a Factory, is critical to maintaining a balanced ecosystem. All Factories must be aware of and approve how their results are being used and by whom. Reputation and reward are bi-directional between all parties (Factory-Collector, Collector-Broker, and Broker-Factory). Each entity may choose to interact on principle (free, Commons), attribution (results provided by), or commercially (as a paid service); the Atlas protocol is purely a facilitator and does not restrict how the relationships between any entities are formed. In considering these motives for the various entities, it's likely that the free-based networks will tend to become more specialized, commercial ones will compete on quality, and attribution based networks will mature in both directions. This simple yet powerful division of roles, responsibilities, and relationships will result in a distributed economic foundation for an Internet Search Infrastructure. The wire protocol and further definition of the interactions between these entities is openly evolving; anyone interested is welcome to join the discussions and see the initial proposals at http://lists.wikia.com/mailman/listinfo/atlas-l over the coming weeks. 
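To make the division of roles above concrete, here is a minimal in-process sketch in Python. It is only an illustration under loud assumptions: the Atlas wire protocol is not defined in this message, so every class, method, and field name below (Factory.process, Collector.absorb, Broker.search, approved_brokers, and so on) is hypothetical, whitespace splitting stands in for real content analysis, and a plain term count stands in for real ranking.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A normalized unit of content (hypothetical shape, not from the Atlas proposal)."""
    url: str
    terms: tuple


class Factory:
    """Responsible to the content: fetches and normalizes it."""

    def __init__(self, name, pages):
        self.name = name
        self._pages = pages            # url -> raw text; stands in for a real crawl
        self.approved_brokers = set()  # Brokers this Factory has agreed to serve

    def process(self):
        # Stand-in for real analysis (e.g. NLP): lowercase and split on whitespace.
        return [Unit(url, tuple(text.lower().split()))
                for url, text in sorted(self._pages.items())]

    def fetch(self, broker_name, url):
        # A Factory approves who uses its results, so unknown Brokers are refused.
        if broker_name not in self.approved_brokers:
            raise PermissionError(f"{broker_name} is not approved by {self.name}")
        return self._pages[url]


class Collector:
    """Responsible to the keyword: indexes Factory output and ranks it."""

    def __init__(self, name):
        self.name = name
        self._index = defaultdict(dict)  # term -> {(factory name, url): count}

    def absorb(self, factory):
        for unit in factory.process():
            key = (factory.name, unit.url)
            for term in unit.terms:
                self._index[term][key] = self._index[term].get(key, 0) + 1

    def rank(self, term):
        # Toy ranking: highest term count first, ties broken by key.
        hits = self._index.get(term.lower(), {})
        return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


class Broker:
    """Responsible to the searcher: merges rankings and retrieves content."""

    def __init__(self, name, collectors, factories):
        self.name = name
        self.collectors = collectors
        self.factories = {f.name: f for f in factories}

    def search(self, term, limit=5):
        scores = defaultdict(int)
        for collector in self.collectors:
            for key, score in collector.rank(term):
                scores[key] += score
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
        # The Broker-Factory step: content comes from the original Factory.
        return [(url, score, self.factories[fname].fetch(self.name, url))
                for (fname, url), score in ranked]


def demo():
    factory = Factory("f1", {
        "http://example.org/a": "open search open protocol",
        "http://example.org/b": "search engine",
    })
    collector = Collector("c1")
    collector.absorb(factory)
    broker = Broker("b1", [collector], [factory])
    factory.approved_brokers.add("b1")
    return broker.search("open")
```

Running all three classes in one process, as demo() does, corresponds to the monolithic case mentioned above; the point of the split is that each class could just as well be run by a different entity, with the method calls replaced by whatever the eventual protocol specifies.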
Thanks, looking forward to a radically different search ecosystem in the coming years :) Jer From jwales at wikia.com Thu Jul 5 06:54:35 2007 From: jwales at wikia.com (Jimmy Wales) Date: Wed, 4 Jul 2007 23:54:35 -0700 Subject: [Search-l] hipmojo.com - "Jimmy Wales vs. Jason Calacanis" In-Reply-To: <70b3cf150707041243j76ff873bua0bc5716b0682b8e@mail.gmail.com> References: <20070703211402.GA27219@sethf.com> <70b3cf150707041243j76ff873bua0bc5716b0682b8e@mail.gmail.com> Message-ID: On Jul 4, 2007, at 12:43 PM, Jason Calacanis wrote: > At this point Gil and Jimmy feel I am spamming the list, but are > telling me off list they want me involved in the project. Talk about > mixed messages!!! Love to have you involved. Just try to be more collaborative, eh? From wsurowiec at gmail.com Thu Jul 5 22:48:30 2007 From: wsurowiec at gmail.com (William Surowiec) Date: Thu, 05 Jul 2007 18:48:30 -0400 Subject: [Search-l] Atlas - Internet Search Infrastructure In-Reply-To: <30D55DA4-47A8-4AEB-A0A4-466FEAA9DE33@jabber.org> References: <30D55DA4-47A8-4AEB-A0A4-466FEAA9DE33@jabber.org> Message-ID: <468D753E.4060902@gmail.com> Jer, I appreciate high level statements that point out a direction. And your email certainly does that. I already sense it having a "pot stirring" effect within me. But something is troubling me that I would first like to clarify. I will highlight the phrases that have raised this concern: ... A Factory shares its highly refined and processed output with one or more Collectors based on who they believe is best utilizing it. ... A Collector has to compete for the output of a Factory, as well as compete to provide the best ranking quality for Brokers. ... All Factories must be aware of and approve how their results are being used and by whom. ... This seems to imply that all data will not be freely available to all. Is that the intent? 
If so, one question immediately arises in my mind (sorry, maybe this is a New York City instinct): could some one (or small entity) create a Factory + Collector + Broker (and authorize only that specific set as its authorized users?) If so, then could that entity then use the groups' servers and bandwidth to offer search services? Sorry if I am off target (but I would be glad to be wrong in this instance.) Bill From jwales at wikia.com Fri Jul 6 00:02:15 2007 From: jwales at wikia.com (Jimmy Wales) Date: Thu, 5 Jul 2007 17:02:15 -0700 Subject: [Search-l] Atlas - Internet Search Infrastructure In-Reply-To: <468D753E.4060902@gmail.com> References: <30D55DA4-47A8-4AEB-A0A4-466FEAA9DE33@jabber.org> <468D753E.4060902@gmail.com> Message-ID: <08A96629-6470-420C-A010-6BE7DA1EF0BE@wikia.com> On Jul 5, 2007, at 3:48 PM, William Surowiec wrote: > This seems to imply that all data will not be freely available to all. > Is that the intent? Wikia has the intention of making all of our stuff freely available to all. As I understand Jeremie in terms of the protocol/structure he is proposing, other players may want to do things differently. The idea is to create a framework where competition is possible. But he should answer for himself, since he understands his ideas far better than I do. :) From dizzyd at gmail.com Fri Jul 6 01:21:58 2007 From: dizzyd at gmail.com (Dave Smith) Date: Thu, 5 Jul 2007 19:21:58 -0600 Subject: [Search-l] Atlas - Internet Search Infrastructure In-Reply-To: <468D753E.4060902@gmail.com> References: <30D55DA4-47A8-4AEB-A0A4-466FEAA9DE33@jabber.org> <468D753E.4060902@gmail.com> Message-ID: > If so, one question immediately arises in my mind (sorry, maybe this is > a New York City instinct): could some one (or small entity) create a > Factory + Collector + Broker (and authorize only that specific set as > its authorized users?) If so, then could that entity then use the > groups' servers and bandwidth to offer search services? 
In general, I believe the answer is yes -- there could be entirely private sets of F/C/Bs that are only available for select users. The key point is that the architecture/protocol remains the same, independent of deployment. This is in keeping with lots of other "internet scale" architecture/protocol (HTTP, Jabber, Email, etc.) where it's possible to do private deployments using the same pieces as public deployments. I don't _think_ that the services Wikia will be providing (hardware wise) will permit large scale deployment of these "private" services, but I may be wrong. At the heart of all these pieces is a belief that interoperability _facilitates_ competition, and (feature/value) competition is a good thing for the Internet. At least, that's my $0.02... :) D. From vprajan at gmail.com Tue Jul 17 19:29:32 2007 From: vprajan at gmail.com (Pushparajan V) Date: Wed, 18 Jul 2007 00:59:32 +0530 Subject: [Search-l] Discussions about search-l project at Barcamp, Bangalore, India Message-ID: Hi all, I am doing a small BoF at Barcamp, Bangalore, India, explaining the atlas project and the future plans so that it will reach more Indian open source developers too. If any one in this mailing list is from Bangalore, please join me and we will have hot discussions on this Barcamp. Venue and date: *28th and 29th July 2007 @ IIM Bangalore* Register your name here: http://barcampbangalore.org/wiki/BCB4_Internet_Collective Thanks, Pushparajan V http://www.vprajan.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20070718/e3ba1718/attachment.html From jeremie at jabber.org Tue Jul 24 22:00:57 2007 From: jeremie at jabber.org (jer) Date: Tue, 24 Jul 2007 15:00:57 -0700 Subject: [Search-l] Search BOF tomorrow night @OSCON Message-ID: For those that are here at OSCON or in the area, be sure to stop in Wednesday evening 8:30-9:30 at the Oregon Convention Center room F151 for a BOF to discuss anything and everything open-source search related :) Jer From stpeter at jabber.org Tue Jul 24 22:32:29 2007 From: stpeter at jabber.org (Peter Saint-Andre) Date: Tue, 24 Jul 2007 15:32:29 -0700 Subject: [Search-l] Search BOF tomorrow night @OSCON In-Reply-To: References: Message-ID: <46A67DFD.7070902@jabber.org> jer wrote: > For those that are here at OSCON or in the area, be sure to stop in > Wednesday evening 8:30-9:30 at the Oregon Convention Center room F151 > for a BOF to discuss anything and everything open-source search > related :) Rock on. See you there. :) Peter -- Peter Saint-Andre https://stpeter.im/ -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 7354 bytes Desc: S/MIME Cryptographic Signature Url : http://lists.wikia.com/pipermail/search-l/attachments/20070724/affc3225/attachment.bin From vprajan at gmail.com Wed Jul 25 06:55:38 2007 From: vprajan at gmail.com (Pushparajan V) Date: Wed, 25 Jul 2007 12:25:38 +0530 Subject: [Search-l] Search BOF tomorrow night @OSCON In-Reply-To: <46A67DFD.7070902@jabber.org> References: <46A67DFD.7070902@jabber.org> Message-ID: On 7/25/07, Peter Saint-Andre wrote: > > jer wrote: > > For those that are here at OSCON or in the area, be sure to stop in > > Wednesday evening 8:30-9:30 at the Oregon Convention Center room F151 > > for a BOF to discuss anything and everything open-source search > > related :) > > Rock on. See you there. 
:) Please don't forget to give a summary of these meet ups on the mailing list.. ;) Peter > > -- > Peter Saint-Andre > https://stpeter.im/ > > > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l > > > -- Pushparajan V http://www.vprajan.org - - - - - - - - Know me: http://www.hackerkey.com/decrypt.php?hackerkey=v4sw57BCHJUY$hw3/5ln2pr6AFOPSck3ma4u7FLMSw7DTWXm6l6FGIKLRSU$i862NLJ0CAe6$t3b4en4a23Ns3MSr9g5AGO - - - - - - - - -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20070725/7e83bdc6/attachment.html From ray.slakinski at gmail.com Wed Jul 25 13:24:23 2007 From: ray.slakinski at gmail.com (Ray Slakinski) Date: Wed, 25 Jul 2007 06:24:23 -0700 Subject: [Search-l] Search BOF tomorrow night @OSCON In-Reply-To: References: <46A67DFD.7070902@jabber.org> Message-ID: <46A74F07.6060709@gmail.com> I'm here as well, call my cell if you want to meet up 905 399 3845 Ray Slakinski Mahalo.com Pushparajan V wrote: > On 7/25/07, *Peter Saint-Andre* > wrote: > > jer wrote: > > For those that are here at OSCON or in the area, be sure to stop in > > Wednesday evening 8:30-9:30 at the Oregon Convention Center room > F151 > > for a BOF to discuss anything and everything open-source search > > related :) > > Rock on. See you there. :) > > > Please don't forget to give a summary of these meet ups on the mailing > list.. 
;) > > Peter > > -- > Peter Saint-Andre > https://stpeter.im/ > > > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > > > > -- > Pushparajan V > http://www.vprajan.org > - - - - - - - - > Know me: > http://www.hackerkey.com/decrypt.php?hackerkey=v4sw57BCHJUY$hw3/5ln2pr6AFOPSck3ma4u7FLMSw7DTWXm6l6FGIKLRSU$i862NLJ0CAe6$t3b4en4a23Ns3MSr9g5AGO > - - - - - - - - > ------------------------------------------------------------------------ > > _______________________________________________ > Search-l mailing list > Search-l at wikia.com > http://lists.wikia.com/mailman/listinfo/search-l > Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 249 bytes Desc: OpenPGP digital signature Url : http://lists.wikia.com/pipermail/search-l/attachments/20070725/5fc939ef/attachment.bin From jeremie at jabber.org Fri Jul 27 00:31:19 2007 From: jeremie at jabber.org (jer) Date: Thu, 26 Jul 2007 17:31:19 -0700 Subject: [Search-l] quick update from OSCON Message-ID: <907B2362-AD1B-4EA7-A6AC-CC9B24B87F11@jabber.org> Our BOF was fun the other night, a very varied and casual discussion with about a dozen folks. My talk this AM started by focusing on why I believe it's so important to tackle this space and make search open (relating the advent of search to that of the printing press), and then ended with a whiteboarding session about the architecture of Atlas. I believe there's one or more audio recordings of it so maybe they'll get posted at some point there :) Lots of hallway banter, tons of interest, great to see that so many people are supportive and we're all looking forward to having a framework to start piling on to help out! 
Jer From jeremie at jabber.org Fri Jul 27 16:10:46 2007 From: jeremie at jabber.org (jer) Date: Fri, 27 Jul 2007 09:10:46 -0700 Subject: [Search-l] big announcement: grub is back! Message-ID: I'm sending this as Jimmy is on stage announcing it in his keynote here at OSCON, telling how Wikia has been working with LookSmart to acquire the Grub project and open-source it again. A lot of work has gone into this process and I want to thank Jimmy, Gil, and some of the thoughtful folks at LookSmart. Grub was open a long time ago, and it is now again. We're going to be working hard over the coming weeks to get the codebase up on a repository and keep the service running in a testing mode so people can start to play with it, so keep an eye on grub.org and its wiki page, http://search.wikia.com/wiki/Grub. On the "big vision" map, this is just one project, a distributed crawler, and all of the contents and results it will compile will be fully available under an open document license (yay!). There is more coming along these lines, more projects, this looks like an exciting trend :) Jer From seth.ford at gmail.com Fri Jul 27 16:20:38 2007 From: seth.ford at gmail.com (Seth Ford) Date: Fri, 27 Jul 2007 10:20:38 -0600 Subject: [Search-l] quick update from OSCON In-Reply-To: <907B2362-AD1B-4EA7-A6AC-CC9B24B87F11@jabber.org> References: <907B2362-AD1B-4EA7-A6AC-CC9B24B87F11@jabber.org> Message-ID: I met you all at the OSCON conference and have to say your approach is unique but different than I thought that it would be. I have a community powered search mash-up (mediawiki lucene based) going here internally where I work; I can't publish it publicly but am willing to send out screen shots and go over technology if there is interest. Personally I think it is very close to what community search should be and would be the killer app if someone would take it and run. 

Seth

On 7/26/07, jer wrote:
>
> Our BOF was fun the other night, a very varied and casual discussion
> with about a dozen folks.
>
> My talk this AM started by focusing on why I believe it's so
> important to tackle this space and make search open (relating the
> advent of search to that of the printing press), and then ended with
> a whiteboarding session about the architecture of Atlas. I believe
> there's one or more audio recordings of it so maybe they'll get
> posted at some point there :)
>
> Lots of hallway banter, tons of interest, great to see that so many
> people are supportive and we're all looking forward to having a
> framework to start piling on to help out!
>
> Jer
> _______________________________________________
> Search-l mailing list
> Search-l at wikia.com
> http://lists.wikia.com/mailman/listinfo/search-l
> Change options or unsubscribe:
> http://lists.wikia.com/mailman/options/search-l
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20070727/1682d80c/attachment.html

From jeremie at jabber.org Fri Jul 27 17:19:56 2007
From: jeremie at jabber.org (jer)
Date: Fri, 27 Jul 2007 10:19:56 -0700
Subject: [Search-l] big announcement: grub is back!
In-Reply-To: <20070727171842.GC5720@ompka.net>
References: <20070727171842.GC5720@ompka.net>
Message-ID:

> So - is this the first Factory ?

It's a great place to start :)

Jer

From piers at ompka.net Fri Jul 27 17:18:42 2007
From: piers at ompka.net (Piers Harding)
Date: Fri, 27 Jul 2007 17:18:42 +0000
Subject: [Search-l] big announcement: grub is back!
In-Reply-To:
References:
Message-ID: <20070727171842.GC5720@ompka.net>

So - is this the first Factory ?

Cheers.

On Fri, Jul 27, 2007 at 09:10:46AM -0700, jer wrote:
> I'm sending this as Jimmy is on stage announcing it in his keynote
> here at OSCON, telling how Wikia has been working with LookSmart to
> acquire the Grub project and open-source it again.
>
> A lot of work has gone into this process and I want to thank Jimmy,
> Gil, and some of the thoughtful folks at LookSmart.
>
> Grub was open a long time ago, and it is now again. We're going to
> be working hard over the coming weeks to get the codebase up on a
> repository and keep the service running in a testing mode so people
> can start to play with it, so keep an eye on grub.org and its wiki
> page, http://search.wikia.com/wiki/Grub.
>
> On the "big vision" map, this is just one project, a distributed
> crawler, and all of the contents and results it will compile will be
> fully available under an open document license (yay!).
>
> There is more coming along these lines, more projects, this looks
> like an exciting trend :)
>
> Jer

--
Home - http://www.piersharding.com
xmpp:piers at ompka.net

From jwales at wikia.com Sun Jul 29 00:19:12 2007
From: jwales at wikia.com (Jimmy Wales)
Date: Sat, 28 Jul 2007 17:19:12 -0700
Subject: [Search-l] Privacy and sharing browsing data
In-Reply-To:
References: <46A95FB1.10807@wikia.com> <20070727194130.NNML6544.mta11.adelphia.net@FEZZIK.usa.net>
Message-ID: <46ABDD00.6020607@wikia.com>

(This was about faroo.com )

jer wrote:
> Yeah, noticed them too, completely not open source...

Yup! But doing this:

>> "When an user opens a page with the browser, it will be automatically
>> inserted into the distributed index of the p2p network. The
>> additional network load and the site submission of a traditional
>> crawler is omitted. Assuming a wide spread of FAROO this enables an
>> almost complete index, updated in real time."

Seems pretty easy to do with a simple Firefox extension.

The difficult bit is thinking about user privacy and stopping spam.
Let me explain what I mean:

When we have a public way for people to submit, tag, and rate urls there
are no particularly difficult issues with privacy because when you submit
something, you are doing it publicly and if you want privacy, you'd best
use a pseudonym to login... just like with any wiki. Anyone who is
inserting junk into the index will be quickly detected and blocked or
rated as a spammer, and there you go.

But simply browsing the web is a different matter. I would not be happy
with having my click stream of what I am surfing made public -- even if
I was using a pseudonym. There are simply too many ways to guess who I
am from my click stream.

And yet, if no one can see my click stream, then I might just be a
spammer merrily trolling around on my own spamtastic crap site.

I think some clever solutions to this are possible. One would be
that my browsing history would never be made public BUT if urls that I
have submitted made it into the index, and people subsequently mark them
as spam, then this fact shows up publicly in the form of a number: "This
user has submitted X urls which were subsequently judged by the
community to be spam." This could be said without revealing what they
were.

That is just the first thought of how to go about it.

I am eager to think about ways that we can encourage passive
participation by GOOD people who simply believe in our mission, would
like to give us good data on real browsing patterns, but who rightly
value their privacy, while at the same time preventing spammers from
wasting too much of our time.
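
To make the "number without the urls" idea above concrete, here is a
minimal sketch in Python. It is only an illustration under stated
assumptions, not anything that exists in Wikia's code: the class name
SubmissionLedger, its methods, and the example users and urls are all
hypothetical. Submitted urls stay inside the ledger, and the only thing
it ever hands out is a per-user count.

```python
from collections import defaultdict


class SubmissionLedger:
    """Hypothetical ledger: submitted urls stay private, only a spam count is public."""

    def __init__(self):
        self._submitted = defaultdict(set)  # user -> urls they submitted (never published)
        self._spam_urls = set()             # urls the community has judged to be spam

    def record_submission(self, user, url):
        """Remember privately that this user submitted this url."""
        self._submitted[user].add(url)

    def mark_spam(self, url):
        """Record a community judgement that a url is spam."""
        self._spam_urls.add(url)

    def public_spam_count(self, user):
        """Return how many of the user's urls were judged spam, never which ones."""
        return len(self._submitted.get(user, set()) & self._spam_urls)


ledger = SubmissionLedger()
ledger.record_submission("alice", "http://example.org/article")
ledger.record_submission("mallory", "http://spam.example/pills")
ledger.record_submission("mallory", "http://spam.example/casino")
ledger.mark_spam("http://spam.example/pills")
ledger.mark_spam("http://spam.example/casino")

for user in ("alice", "mallory"):
    print(f"{user} has submitted {ledger.public_spam_count(user)} urls "
          "which were subsequently judged by the community to be spam.")
```

Even a bare count can leak a little: if a user's number ticks up right
after a particular url is flagged, an observer can guess who submitted
it, so a real system would probably want to batch or delay the updates.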

--Jimbo

From stpeter at jabber.org Mon Jul 30 19:26:48 2007
From: stpeter at jabber.org (Peter Saint-Andre)
Date: Mon, 30 Jul 2007 13:26:48 -0600
Subject: [Search-l] Privacy and sharing browsing data
In-Reply-To: <46ABDD00.6020607@wikia.com>
References: <46A95FB1.10807@wikia.com> <20070727194130.NNML6544.mta11.adelphia.net@FEZZIK.usa.net> <46ABDD00.6020607@wikia.com>
Message-ID: <46AE3B78.4000408@jabber.org>

Jimmy Wales wrote:
> (This was about faroo.com )
>
> jer wrote:
>> Yeah, noticed them too, completely not open source...
>
> Yup! But doing this:
>
>>> "When an user opens a page with the browser, it will be automatically
>>> inserted into the distributed index of the p2p network. The
>>> additional network load and the site submission of a traditional
>>> crawler is omitted. Assuming a wide spread of FAROO this enables an
>>> almost complete index, updated in real time."
>
> Seems pretty easy to do with a simple Firefox extension.
>
> The difficult bit is thinking about user privacy and stopping spam. Let
> me explain what I mean:
>
> When we have a public way for people to submit, tag, and rate urls there
> are no particularly difficult issues with privacy because when you submit
> something, you are doing it publicly and if you want privacy, you'd best
> use a pseudonym to login... just like with any wiki.

Unfortunately, in the absence of identity, submissions and tags and
ratings may not mean very much... just like with any wiki. ;-)

https://stpeter.im/?p=1927

> Anyone who is
> inserting junk into the index will be quickly detected and blocked or
> rated as a spammer, and there you go.
>
> But simply browsing the web is a different matter. I would not be happy
> with having my click stream of what I am surfing made public -- even if
> I was using a pseudonym. There are simply too many ways to guess who I
> am from my click stream.

Yes, a click stream can be used as an identifying signature (it is
probably more "personal" than most passwords).

> And yet, if no one can see my click stream, then I might just be a
> spammer merrily trolling around on my own spamtastic crap site.

It would be good to define what we mean by "click stream". Is it a
history of what you viewed in what order, or "only" a list of URLs
visited (with order scrambled) that is updated once a week or whatever?

BTW, the folks at me.dium.com might have some interesting experience to
share with regard to click streams and user privacy.

> I think some clever solutions to this are possible. One would be
> that my browsing history would never be made public

Where is the data stored? Is that data store hackable? Probably.

> BUT if urls that I
> have submitted made it into the index, and people subsequently mark them
> as spam, then this fact shows up publicly in the form of a number: "This
> user has submitted X urls which were subsequently judged by the
> community to be spam." This could be said without revealing what they were.

Yes, a reputation system might be interesting. Though reputation matters
to people more if they are actively involved in the relevant community.
So building a real community here is critically important, I think.

> I am eager to think about ways that we can encourage passive
> participation by GOOD people who simply believe in our mission, would
> like to give us good data on real browsing patterns, but who rightly
> value their privacy, while at the same time preventing spammers from
> wasting too much of our time.

Passivity and privacy may be hard to reconcile. But it's worth thinking
about. We have some challenges ahead. :)

Peter

--
Peter Saint-Andre
https://stpeter.im/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 7354 bytes
Desc: S/MIME Cryptographic Signature
Url : http://lists.wikia.com/pipermail/search-l/attachments/20070730/8555476c/attachment.bin

From seth.ford at gmail.com Mon Jul 30 19:35:19 2007
From: seth.ford at gmail.com (Seth Ford)
Date: Mon, 30 Jul 2007 13:35:19 -0600
Subject: [Search-l] Privacy and sharing browsing data
In-Reply-To: <46ABDD00.6020607@wikia.com>
References: <46A95FB1.10807@wikia.com> <20070727194130.NNML6544.mta11.adelphia.net@FEZZIK.usa.net> <46ABDD00.6020607@wikia.com>
Message-ID:

That's why I think it has to be a mash-up. You have to allow people to
look to the community first and then look to the crawl, be it tab based
or inline. It seems people are more interested in participating once
they trust they can find the data they are looking for and are then
given encouragement to help organize it in a more reasonable fashion.

I have sent out some of the implementation I have done along these lines
internally where I work. It does seem like it comes down to a matter of
trust: internally it's much easier to do a community-powered search
engine built from a wiki mashed up with a crawl. Externally, how do you
hinder spam and gaming and foster the sense of identity? Maybe it's a
/.-like implementation, or simply Wikipedia is as good as it gets...?

Seth

On 7/28/07, Jimmy Wales wrote:
>
> (This was about faroo.com )
>
> jer wrote:
> > Yeah, noticed them too, completely not open source...
>
> Yup! But doing this:
>
> >> "When an user opens a page with the browser, it will be automatically
> >> inserted into the distributed index of the p2p network. The
> >> additional network load and the site submission of a traditional
> >> crawler is omitted. Assuming a wide spread of FAROO this enables an
> >> almost complete index, updated in real time."
>
> Seems pretty easy to do with a simple Firefox extension.
>
> The difficult bit is thinking about user privacy and stopping spam. Let
> me explain what I mean:
>
> When we have a public way for people to submit, tag, and rate urls there
> are no particularly difficult issues with privacy because when you submit
> something, you are doing it publicly and if you want privacy, you'd best
> use a pseudonym to login... just like with any wiki. Anyone who is
> inserting junk into the index will be quickly detected and blocked or
> rated as a spammer, and there you go.
>
> But simply browsing the web is a different matter. I would not be happy
> with having my click stream of what I am surfing made public -- even if
> I was using a pseudonym. There are simply too many ways to guess who I
> am from my click stream.
>
> And yet, if no one can see my click stream, then I might just be a
> spammer merrily trolling around on my own spamtastic crap site.
>
> I think some clever solutions to this are possible. One would be
> that my browsing history would never be made public BUT if urls that I
> have submitted made it into the index, and people subsequently mark them
> as spam, then this fact shows up publicly in the form of a number: "This
> user has submitted X urls which were subsequently judged by the
> community to be spam." This could be said without revealing what they
> were.
>
> That is just the first thought of how to go about it.
>
> I am eager to think about ways that we can encourage passive
> participation by GOOD people who simply believe in our mission, would
> like to give us good data on real browsing patterns, but who rightly
> value their privacy, while at the same time preventing spammers from
> wasting too much of our time.
>
> --Jimbo
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20070730/d04a79f5/attachment.html

From martins at bebr.ufl.edu Mon Jul 30 20:20:43 2007
From: martins at bebr.ufl.edu (Martin Smith)
Date: Mon, 30 Jul 2007 16:20:43 -0400
Subject: [Search-l] big announcement: grub is back!
In-Reply-To:
References: <46A95FB1.10807@wikia.com><20070727194130.NNML6544.mta11.adelphia.net@FEZZIK.usa.net><46ABDD00.6020607@wikia.com>
Message-ID: <2709A069CB844242A469ECC57C29D6210205F42E@kobe.bebr.ufl.edu>

Hi folks,

That's great news -- I heard the announcement at OSCON last week, and
joined the list because of it. I'm looking forward to creating a Java
crawler client that will work everywhere. I'll be keeping my eyes open
for the release of the previously-closed "latest" version for Windows,
so that I can begin.

Cheers,

Martin Smith, Systems Developer
martins at bebr.ufl.edu
Bureau of Economic and Business Research
University of Florida
(352) 392-0171 Ext. 221

On Fri, Jul 27, 2007 at 09:10:46AM -0700, jer wrote:
> I'm sending this as Jimmy is on stage announcing it in his keynote
> here at OSCON, telling how Wikia has been working with LookSmart to
> acquire the Grub project and open-source it again.
>
> A lot of work has gone into this process and I want to thank Jimmy,
> Gil, and some of the thoughtful folks at LookSmart.
>
> Grub was open a long time ago, and it is now again. We're going to
> be working hard over the coming weeks to get the codebase up on a
> repository and keep the service running in a testing mode so people
> can start to play with it, so keep an eye on grub.org and its wiki
> page, http://search.wikia.com/wiki/Grub.
>
> On the "big vision" map, this is just one project, a distributed
> crawler, and all of the contents and results it will compile will be
> fully available under an open document license (yay!).
>
> There is more coming along these lines, more projects, this looks
> like an exciting trend :)
>
> Jer

From jeremie at jabber.org Tue Jul 31 22:41:51 2007
From: jeremie at jabber.org (jer)
Date: Tue, 31 Jul 2007 17:41:51 -0500
Subject: [Search-l] Grub Update
Message-ID:

Tremendous response, and thanks to everyone for their patience as we
muddle through this :)

First, there's lots of clamoring for some source code, so here's
*something* to start with even though it's not yet in a source
repository or even stamped with the right license headers. We should
have an official source release in a week or so, but this will get
things rolling:
http://grub.org/client/grub_client2.tgz

Second, a quick status update on the grub.org service itself: there
are many hundreds of clients running already, but we're having a
problem getting the stats output working. Igor Stojanovski and Kord
Campbell (two of the original founders) have been actively helping and
putting in some serious time to get everything running smoothly again,
many thanks guys!

Third, answering the "what's it going to do" question. In the long
run within the Atlas framework, Grub could become a distributed
"Factory" and be publishing crawled/refined content to anyone for
indexing. Immediately though there's one big desire: collect all of
the crawled content into one shared repository and publish it under
the GFDL, with simple ways of accessing updates and providing quality
feedback. A compressed and lightly indexed (by time/URL) common
crawled output source is a place to start and something many will
benefit from.

More to come as this all evolves, feedback encouraged, thanks!
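
For anyone wondering what "compressed and lightly indexed (by time/URL)"
could look like in practice, here is a rough Python sketch. It is not
taken from the Grub codebase and makes no claim about the eventual
format: the CrawlStore class, its methods, the in-memory buffer standing
in for a shared file, and the example URLs and timestamps are all
assumptions made up for illustration. Each page is gzip-compressed on
its own and appended, and a small sorted index of (fetch time, URL,
offset, length) is enough to answer "what changed since T" and "give me
the latest copy of this URL".

```python
import bisect
import gzip
import io


class CrawlStore:
    """Hypothetical append-only store of gzip-compressed pages with a (time, URL) index."""

    def __init__(self):
        self._data = io.BytesIO()  # stands in for a shared file on disk
        self._index = []           # sorted list of (fetched_at, url, offset, length)

    def add(self, fetched_at, url, content):
        """Compress one crawled page, append it, and index it by time and URL."""
        blob = gzip.compress(content)
        self._data.seek(0, io.SEEK_END)
        offset = self._data.tell()
        self._data.write(blob)
        bisect.insort(self._index, (fetched_at, url, offset, len(blob)))

    def updates_since(self, fetched_at):
        """Return (time, url) pairs for everything fetched at or after fetched_at."""
        start = bisect.bisect_left(self._index, (fetched_at,))
        return [(t, u) for t, u, _, _ in self._index[start:]]

    def latest(self, url):
        """Return the most recently fetched content for url, or None if never crawled."""
        for _, u, offset, length in reversed(self._index):
            if u == url:
                self._data.seek(offset)
                return gzip.decompress(self._data.read(length))
        return None


store = CrawlStore()
store.add(1185900000, "http://example.org/", b"<html>first crawl</html>")
store.add(1185903600, "http://example.org/about", b"<html>about</html>")
store.add(1185907200, "http://example.org/", b"<html>second crawl</html>")

print(store.updates_since(1185903600))
print(store.latest("http://example.org/"))
```

Compressing each page separately costs some compression ratio, but it
means a reader can pull a single page out by offset without unpacking
the whole archive, which seems to fit the "simple ways of accessing
updates" goal.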

Jer

From jmcc at hackwatch.com Tue Jul 31 23:54:53 2007
From: jmcc at hackwatch.com (John McCormac)
Date: Wed, 01 Aug 2007 00:54:53 +0100
Subject: [Search-l] Grub Update
In-Reply-To:
References:
Message-ID: <46AFCBCD.8010502@hackwatch.com>

jer wrote:
> Tremendous response, and thanks to everyone for their patience as we
> muddle through this :)
>
> First, there's lots of clamoring for some source code, so here's
> *something* to start with even though it's not yet in a source
> repository or even stamped with the right license headers. We should
> have an official source release in a week or so, but this will get
> things rolling:
> http://grub.org/client/grub_client2.tgz

I'm a bit new to this Wikia search thing but the concept of using Grub
is a bit confusing. It is almost an implementation of the Infinite
Number of Monkeys approach to spidering the web. It still requires a
powerful backend to make sense of all the data spidered and that was
always Grub's flaw.

> Second, a quick status update on the grub.org service itself: there
> are many hundreds of clients running already, but we're having a
> problem getting the stats output working. Igor Stojanovski and Kord
> Campbell (two of the original founders) have been actively helping and
> putting in some serious time to get everything running smoothly again,
> many thanks guys!

The state of the web has changed since Grub was a player. Most of the
larger sites now block spidering by DSL and dialup connections. Some
directories block on User Agent and, from what I remember, Grub was one
string that used to get blocked a lot.

> Third, answering the "what's it going to do" question. In the long
> run within the Atlas framework, Grub could become a distributed
> "Factory" and be publishing crawled/refined content to anyone for
> indexing. Immediately though there's one big desire: collect all of
> the crawled content into one shared repository and publish it under
> the GFDL, with simple ways of accessing updates and providing quality
> feedback. A compressed and lightly indexed (by time/URL) common
> crawled output source is a place to start and something many will
> benefit from.

This is the other flaw with a Grub approach - there is no quality
assurance of the index. Many small search engines have followed the
Infinite Monkeys approach to indexing, following each URL to find more.
The problem with this approach is that it relies on the back end to
give the data context. Such engines tend to last about 18 months on
average.

> More to come as this all evolves, feedback encouraged, thanks!

It should be interesting to see how things turn out.

Regards...jmcc
--
******************************************************
John McCormac  * e-mail: jmcc at whoisireland.com
MC2            * voice: +353-51-873640
22 Viewmount   * web: http://www.whoisireland.com/
Waterford      * blog: http://blog.whoisireland.com
Ireland        * Irish Domain Stats & Market Research
******************************************************