From marcnaweb at gmail.com Mon Sep 1 18:49:32 2008
From: marcnaweb at gmail.com (Marc .)
Date: Mon, 1 Sep 2008 15:49:32 -0300
Subject: [Search-l] Fwd: Scour invite
In-Reply-To:
References: <5f2640d0808300937v10b48e26t718037ae43d990a3@mail.gmail.com> <5f2640d0808311406o3a707b2atc18c767a9c47fde@mail.gmail.com>
Message-ID: <5f2640d0809011149td4d6e87ob7c3887c1af50916@mail.gmail.com>

Hi Mark,

I agree that it "may just be a waiting game for now", with more users improving the quality. However, there are a lot of new SEs (and competition) out there, and we may be missing users (and quality) if we "only wait" for users to react -- that's why a reward system could help.

Anyway, a reward system may not be "that hard" to do. According to the mails from Mike Duffy and Oren Shani of 29/03 and 23/03 2007 (yes, i have a good memory ; ) it seems possible to have a "self-organized" system based on people's behaviour (there are detailed articles at http://search.wikia.com/wiki/Mini:Strobili and http://ishi.lanl.gov/symintel.html ).

To build the reward system, I think it should be possible to analyse people's behaviour, and when a user's behaviour "matches" the behaviour of other people, according to the "Strobili" concepts, that user receives "good points" -- I may be wrong, but maybe Mike or Oren can give us their opinion.

What do you (and the other guys) think?

Best Regards,
Marc Rosenfeld

2008/8/31 Mark (Markie) :
> reading back over my mail, maybe i was a bit ott and i apologise for that now.
>
> however wikia does have revenue, both ad and private investment, so them not having money is not a problem. the issue of rewards though to me is synonymous with greed. ie where money comes so do more people, however quality may not. managing to separate between "good" and "bad" contributions and then separating out rewards to only the "right" people would IMO be VERY hard to do, so to me isnt really a viable idea.
> however i do agree that it would be nice :-p i would also agree that an incentive to edit/contribute would be good, however looking at projects such as wikipedia they have no incentives, so it may just be a waiting game for now. if we manage to increase the use, then hopefully the quality and therefore usage would then snowball and would be a positive feedback cycle of improving quality -> more users -> improving quality etc etc. also the site stats are showing a steady and evenly increasing trend of around 2-4% so hopefully we are on the first part of the above cycle :-)
>
> regards (and apologies for the bulky mail)
>
> mark
>
> On Sun, Aug 31, 2008 at 10:06 PM, Marc . wrote:
>>
>> Hi Mark,
>>
>> The first link that appears in the mail was only www.scour .com anyway. I understand your anger and you are right, i should delete any other link (i did it now, and put a space in the scour link to avoid any misunderstandings).
>>
>> Coming to the reward system, this approach could benefit wikia if it is done in a different, "wikia original" way to engage people to collaborate in the community.
>>
>> Apparently wikia doesn't have any revenue yet, but the quality of the SE could allow it to have some; in this case the contributors could have a share (let's say, 20%) of the revenues according to their contribution -- this could be done if we "adapt" the ESP game (cited here in a mail before) to work in results pages http://en.wikipedia.org/wiki/ESP_Game (yes, i am a marketing man and think that a strong marketing plan could make wikia work better ; )
>> What do you think?
>>
>> Best Regards,
>> Marc Rosenfeld
>>
>> 2008/8/30, Mark (Markie) :
>> > ive gotta say this is a lame attempt to get yourself referral credits on scour :-( also ive had a look at it and it seems the credit system means that the actual input is lame to be honest, (try searching for cheese and look at the comments)
>> >
>> > the idea of rewards in itself isnt too bad, but finding the funds and checking it to make sure that what you're paying people for isnt indeed crap is the hard bit. mahalo seems to manage it well enough, but ive heard from various people that contributing/working for them is a pain in the....
>> >
>> > anyways ive refused to click your referral link, try harder next time
>> >
>> > regards
>> >
>> > mark
>> >
>> > On Sat, Aug 30, 2008 at 5:37 PM, Marc . wrote:
>> > >
>> > > Hi,
>> > > I received this invite: apparently this SE takes an approach similar to wikia's to get better results: the users "vote" for the best results, and can receive money for it, but cannot customize their results, like in wikia.
>> > > However, maybe wikia could make a similar "points rewards" scheme to market the SE among Internet users, what do you think?
>> > >
>> > > Best Regards,
>> > > Marc Rosenfeld
>> > >
>> > > ---------- Forwarded message ----------
>> > > From: ....
>> > > Date: 2008/8/30
>> > > Subject: Scour invite from Gabriel Toueg
>> > > To: marcnaweb at gmail.com
>> > >
>> > > Did you hear about Scour? It is the next gen search engine with Google/Yahoo/MSN results and user comments all on one page. Best of all we get rewarded for using it by collecting points with every search, comment and vote. The points are redeemable for Visa gift cards. It's like earning credit card or airline points just for searching. Hit the link below to join and we will both get points!
>> > >
>> > > I know you'll like it!
>> > >
>> > > - .....
>> > >
>> > > If you would prefer not to receive invitations from ANY Scour members please click here -
>> _______________________________________________
>> Wikia Search mailing list
>> http://re.search.wikia.com/
>> Change options or unsubscribe:
>> http://lists.wikia.com/mailman/options/search-l

From patcito at gmail.com Tue Sep 2 18:16:07 2008
From: patcito at gmail.com (Patrick Aljord)
Date: Tue, 2 Sep 2008 13:16:07 -0500
Subject: [Search-l] toolbar bug
Message-ID: <6b6419750809021116s545faa16i355b79a4437b30da@mail.gmail.com>

Hey all,

For some reason, for a couple of days or more the "Add | Rate ******" link has only appeared on youtube links when I do a google search. I'm not sure if this is intended or if it only happens on my machine (kubuntu, ff3, latest toolbar version).

Example: http://www.google.com/search?q=Stephen+Fry

From aerik at thesylvans.com Wed Sep 3 20:45:33 2008
From: aerik at thesylvans.com (Aerik Sylvan)
Date: Wed, 3 Sep 2008 13:45:33 -0700
Subject: [Search-l] Processing from mulitple indexes...
Message-ID: <355a36af0809031345t65c97dfdy9a8e8d88b861f46d@mail.gmail.com>

Crossposting this - not sure of the best home for it...

So, we have the beginnings of code and a framework that really facilitates multiple indexes, with factories and brokers and collectors... How about moving the processing of the final result to the client? In other words, meta search via javascript. You get a lot of nifty things with that architecture - one of the big bits is that it reduces the server resources needed to aggregate results.
It increases bandwidth, because you'd need to serve perhaps the top 100 results from each index you're querying, and let the client whittle them down to the top 10 combined results.

Anybody working/thinking in that direction?

(Had this thought for a while, but the news of Google Chrome's js engine being 10+ times faster makes it even more intriguing).

Best Regards,
Aerik

--
http://www.wikidweb.com - the Wiki Directory of the Web
http://tagthis.info - Hosted Tagging for your website!

From jeremie at jabber.org Thu Sep 4 16:44:06 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Thu, 4 Sep 2008 11:44:06 -0500
Subject: [Search-l] Processing from mulitple indexes...
In-Reply-To: <355a36af0809031345t65c97dfdy9a8e8d88b861f46d@mail.gmail.com>
References: <355a36af0809031345t65c97dfdy9a8e8d88b861f46d@mail.gmail.com>
Message-ID:

Heh, that's almost exactly how the JS behind the current re.search stuff works: it's pulling in and merging between two indexes currently (nutch and KT contribs, and another recent idea is to build a third index of just page summaries so we can fit more into our limited resources). One of our important principles is to try and use only open-source-built and freely-licensed indexes too :)

The "widgets" (or whatever you want to call them) discussion is very close to this also, triggering the merge of other custom results into any search, and there's some work going on now to clean up the JS to support this even better.

Jer

On Sep 3, 2008, at 3:45 PM, Aerik Sylvan wrote:
> Crossposting this - not sure of the best home for it...
>
> So, we have the beginnings of code and a framework that really facilitates multiple indexes, with factories and brokers and collectors... How about moving the processing of the final result to the client?
> In other words, meta search via javascript. You get a lot of nifty things with that architecture - one of the big bits is that it reduces the server resources needed to aggregate results.
> It increases bandwidth, because you'd need to serve perhaps the top 100 results from each index you're querying, and let the client whittle them down to the top 10 combined results.
>
> Anybody working/thinking in that direction?
>
> (Had this thought for a while, but the news of Google Chrome's js engine being 10+ times faster makes it even more intriguing).
>
> Best Regards,
> Aerik
>
> --
> http://www.wikidweb.com - the Wiki Directory of the Web
> http://tagthis.info - Hosted Tagging for your website!

From ssriram at gmail.com Thu Sep 4 17:00:02 2008
From: ssriram at gmail.com (S. Sriram)
Date: Thu, 04 Sep 2008 10:00:02 -0700
Subject: [Search-l] rest/json api for search results
Message-ID: <48C01412.2050504@gmail.com>

Hi,

I was looking for a rest/json api to access search results but couldn't seem to find it. Does it currently exist? If so, where? If not, then what are the plans for something like this?

Thanks

From aerik at thesylvans.com Thu Sep 4 19:50:14 2008
From: aerik at thesylvans.com (Aerik Sylvan)
Date: Thu, 4 Sep 2008 12:50:14 -0700
Subject: [Search-l] Processing from mulitple indexes...
In-Reply-To:
References: <355a36af0809031345t65c97dfdy9a8e8d88b861f46d@mail.gmail.com>
Message-ID: <355a36af0809041250s49f68ea7yc156331d661b3c86@mail.gmail.com>

Sweet! That's beautiful. I can picture hundreds of little brokers serving up interesting bits of information, and being pulled into meta-search results in the client, providing a richer and more up-to-date experience for the searcher... very cool.
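
A minimal sketch of the client-side merge discussed in this thread, written in Python for brevity rather than the JavaScript the thread has in mind. The thread does not say how re.search actually combines its indexes, so this swaps in reciprocal rank fusion as a stand-in: each index returns a ranked list of URLs, and a URL's fused score is the sum of 1/(k + rank) over the lists it appears in. The function name `merge_results`, the constant `k=60` and the example URLs are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_results(result_lists, top_n=10, k=60):
    """Fuse ranked URL lists from several indexes (hypothetical helper).

    Uses reciprocal rank fusion; assumes each list holds unique URLs,
    best result first.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            # A URL ranked highly in several indexes accumulates more score.
            scores[url] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by URL so output is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [url for url, _ in ranked[:top_n]]


# Made-up result lists standing in for two indexes.
nutch_hits = ["a.example/1", "b.example/2", "c.example/3"]
contrib_hits = ["b.example/2", "d.example/4", "a.example/1"]
merged = merge_results([nutch_hits, contrib_hits], top_n=3)
print(merged)
```

Rank-based fusion is used here only because scores from independently built indexes are not necessarily comparable; if the indexes exposed comparable scores, a weighted sum would be an equally plausible choice.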
Tell you what, if there's any interest, I've got small datasets at both wikidweb.com (domain level, mostly, about 40k listings, GFDL) and tagthis.info (page level, don't know how many off the top of my head, no license selected yet... gotta do that...) that I'd be happy to publish via js api call, to help feed such a thing. Getting data from larger vendors (del.icio.us, stumbleupon, whoever) would be better of course, but... :-)

Cool...

I've got this whole vision (I continue to think it's pretty close to your vision) of a much more interconnected, cross-pollinating internet, with data shared, aggregated, ranked, and shared again much more freely... Somewhat off topic, I'm working on something simple and waaay compelling: RSS type feeds for events, and then an aggregator/search engine to find events. (Think of it, garage sales, concerts, sports events, all published in an atom feed, and searchable by date and location...) - I'm working on a prototype/promotional website at http://eventfeed.org - I plan to open source the (relatively simple) code in an effort for greater adoption. (It's really simple and rough right now, but basically works.)

Best Regards,
Aerik

On Thu, Sep 4, 2008 at 9:44 AM, Jeremie Miller wrote:
> Heh, that's almost exactly how the JS behind the current re.search stuff works: it's pulling in and merging between two indexes currently (nutch and KT contribs, and another recent idea is to build a third index of just page summaries so we can fit more into our limited resources). One of our important principles is to try and use only open-source-built and freely-licensed indexes too :)
>
> The "widgets" (or whatever you want to call them) discussion is very close to this also, triggering the merge of other custom results into any search, and there's some work going on now to clean up the JS to support this even better.
>
> Jer

--
http://www.wikidweb.com - the Wiki Directory of the Web
http://tagthis.info - Hosted Tagging for your website!

From jeremie at jabber.org Thu Sep 4 21:14:57 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Thu, 4 Sep 2008 16:14:57 -0500
Subject: [Search-l] Processing from mulitple indexes...
In-Reply-To: <355a36af0809041250s49f68ea7yc156331d661b3c86@mail.gmail.com>
References: <355a36af0809031345t65c97dfdy9a8e8d88b861f46d@mail.gmail.com> <355a36af0809041250s49f68ea7yc156331d661b3c86@mail.gmail.com>
Message-ID: <37714D02-9960-4AEC-B584-AB2B8318CADB@jabber.org>

Definitely publish a simple JSON api and it'll become really easy to mash it up. I'm starting to see so many JSON RESTful APIs out there that I wonder if the real semantic web isn't happening already via this kind of growth :)

Jer

On Sep 4, 2008, at 2:50 PM, Aerik Sylvan wrote:
> Sweet! That's beautiful. I can picture hundreds of little brokers serving up interesting bits of information, and being pulled into meta-search results in the client, providing a richer and more up-to-date experience for the searcher... very cool. Tell you what, if there's any interest, I've got small datasets at both wikidweb.com (domain level, mostly, about 40k listings, GFDL) and tagthis.info (page level, don't know how many off the top of my head, no license selected yet... gotta do that...) that I'd be happy to publish via js api call, to help feed such a thing. Getting data from larger vendors (del.icio.us, stumbleupon, whoever) would be better of course, but... :-)
>
> Cool...
>
> I've got this whole vision (I continue to think it's pretty close to your vision) of a much more interconnected, cross-pollinating internet, with data shared, aggregated, ranked, and shared again much more freely... Somewhat off topic, I'm working on something simple and waaay compelling: RSS type feeds for events, and then an aggregator/search engine to find events. (Think of it, garage sales, concerts, sports events, all published in an atom feed, and searchable by date and location...) - I'm working on a prototype/promotional website at http://eventfeed.org - I plan to open source the (relatively simple) code in an effort for greater adoption.
From balinny at gmail.com Sat Sep 13 12:13:50 2008
From: balinny at gmail.com (Balinny)
Date: Sat, 13 Sep 2008 14:13:50 +0200
Subject: [Search-l] Outdated bankruptcy story sparked a $1 billion run on an airline’s stock value
Message-ID: <48CBAE7E.7050202@gmail.com>

A surfer views an old newspaper article about a company's bankruptcy during a low traffic period.
It bumps to Popular Stories.
Google News Bot catches it as a new article.
From Google News it moves to the Bloomberg financial information service.
Investors dump the stock.
Algorithms also sell.
The stock plunged from around $12 to just $3 a share.

http://technology.timesonline.co.uk/tol/news/tech_and_web/article4742147.ece

From fredbaud at fairpoint.net Sat Sep 13 14:00:51 2008
From: fredbaud at fairpoint.net (Fred Bauder)
Date: Sat, 13 Sep 2008 08:00:51 -0600 (MDT)
Subject: [Search-l] Outdated bankruptcy story sparked a $1 billion run on an airline’s stock value
In-Reply-To: <48CBAE7E.7050202@gmail.com>
References: <48CBAE7E.7050202@gmail.com>
Message-ID: <60726.66.243.196.131.1221314451.squirrel@webmail.fairpoint.net>

> A surfer views an old newspaper article about a company's bankruptcy during a low traffic period.
> It bumps to Popular Stories
> Google News Bot catches it as a new article
> From Google News it moves to Bloomberg financial information service
> Investors dump the stock
> Algorithms also sell.
> The stock plunged from around $12 to just $3 a share
>
> http://technology.timesonline.co.uk/tol/news/tech_and_web/article4742147.ece

If I were doing the suing of Google, I would point out that there was existing technology in use (on Search Wikia) which, if in place on Google, would have encouraged user input, which would have permitted user feedback, which could have prevented continued display of the error as a top hit. Of course a user would have to actually have read the story. Traders may not.

Fred

From g.lippitt at att.net Sat Sep 13 20:58:07 2008
From: g.lippitt at att.net (g.lippitt at att.net)
Date: Sat, 13 Sep 2008 20:58:07 +0000
Subject: [Search-l] Outdated bankruptcy story sparked a $1 billion run on an airline’s stock value
In-Reply-To: <60726.66.243.196.131.1221314451.squirrel@webmail.fairpoint.net>
References: <48CBAE7E.7050202@gmail.com> <60726.66.243.196.131.1221314451.squirrel@webmail.fairpoint.net>
Message-ID: <091320082058.6282.48CC295F0002A1160000188A22230680329B0A02D29B9B0EBF9B9B079F9F0704D209@att.net>

This happens fairly often with Google. This time it was a problem because the date was changed to the current date (presumably by Bloomberg).

-------------- Original message from "Fred Bauder" : --------------

> > A surfer views an old newspaper article about a company's bankruptcy during a low traffic period.
> > It bumps to Popular Stories
> > Google News Bot catches it as a new article
> > From Google News it moves to Bloomberg financial information service
> > Investors dump the stock
> > Algorithms also sell.
> > The stock plunged from around $12 to just $3 a share
> >
> > http://technology.timesonline.co.uk/tol/news/tech_and_web/article4742147.ece
>
> If I were doing the suing of Google, I would point out that there was existing technology in use (on Search Wikia) which, if in place on Google, would have encouraged user input, which would have permitted user feedback, which could have prevented continued display of the error as a top hit. Of course a user would have to actually have read the story. Traders may not.
>
> Fred

From christian.ledermann at gmail.com Tue Sep 23 10:00:41 2008
From: christian.ledermann at gmail.com (Christian Ledermann)
Date: Tue, 23 Sep 2008 13:00:41 +0300
Subject: [Search-l] Coop - Custom search engine
Message-ID: <1222164041.7620.42.camel@ubuntu>

Hi all,

Since there have been some discussions about social bookmarking, tagging and sitesearch, I'd like to add my 2 cents.

pulling all of the above together could evolve into a vertical search engine pretty much like google custom search (cse or google coop).

short description (for those who do not know cse):

You may limit the search to predefined sites, add labels (keywords) by which you may refine your search, add weights to sites ...

shortcomings of google coop: if you add more than 3 sites your search results get quaint (i.e. they do not really get sorted in the way google scores them but rather randomly - this behaviour is documented somewhere but hard to find) so it is not really usable.
Anyway lots of people use it and are doing work for google categorizing sites, and other users cannot even see what sites are used to make up those search engines.

another search engine for focused search is opencrawl.de

and windows live, rollyo or yahoo offer similar services (with other limitations like max 25 sites)

pros for wikia:
people are selfish so if they see an (immediate) benefit for them they will be much more likely to contribute links, keywords and descriptions.

to have an up to date index people would be more likely to contribute indexers for 'their sites'

pros for users:
have a usable search over multiple sites, and of course all the benefits wikia search has.

implementation thoughts:

a user registers a bookmark collection / custom searcher where he can add, tag and organize his bookmarks in categories/folders.

have a simple exchange protocol for up/downloading bookmarks like XBEL (that is why categories/folders come in handy). there are already products for firefox which do this - so no reinventing the wheel.

for integration in an external site: javascript and/or opensearch - rss/atom

--
Best Regards,

Christian Ledermann

From natanael.l at gmail.com Tue Sep 23 12:34:00 2008
From: natanael.l at gmail.com (Natanael)
Date: Tue, 23 Sep 2008 14:34:00 +0200
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <1222164041.7620.42.camel@ubuntu>
References: <1222164041.7620.42.camel@ubuntu>
Message-ID:

Definitely interesting.
Maybe "user groups" could be created where a school class could cooperate and use the same "mini-engine", and then the behaviour of everybody in the group will affect the results when the others search.

On 23/09/2008, Christian Ledermann wrote:
>
> Hi all,
>
> Since there have been some discussions about social bookmarking, tagging and sitesearch, I'd like to add my 2 cents.
>
> pulling all of the above together could evolve into a vertical search engine pretty much like google custom search (cse or google coop).
--
If everybody are thinking alike, then somebody aren't thinking || Stupidity is a renewable resource

From christian.ledermann at gmail.com Tue Sep 23 14:18:31 2008
From: christian.ledermann at gmail.com (Christian Ledermann)
Date: Tue, 23 Sep 2008 17:18:31 +0300
Subject: [Search-l] Coop - Custom search engine
In-Reply-To:
References: <1222164041.7620.42.camel@ubuntu>
Message-ID: <1222179511.7620.63.camel@ubuntu>

On Tue, 2008-09-23 at 14:34 +0200, Natanael wrote:
> Definitely interesting.
> Maybe "user groups" could be created where a school class could cooperate and use the same "mini-engine", and then the behaviour of everybody in the group will affect the results when the others search.

right i forgot about the coop!

the 'mini-engine' is owned by a user, a user can define multiple engines.

a user can assign rights to other users to contribute to the engine.

so there should / could be different levels of cooperation:

private:
only the user and collaborators can view and add links to the engine, wikia search can use the tags, description, etc to improve the search results (anonymized), only the user and collaborators can view the search results and links submitted to the engine (i.e. authorization required)
Q: is that useful? if you want something private then this might be the wrong place.
public visible:
only the user and collaborators can add links to the engine, everybody can see the links and use the engine (as opposed to google where the links are a secret)

public, suggestions welcome:
only the user and collaborators can add links to the engine, others can add suggestions, which can be approved or discarded by the owners.

public, open for all:
everybody can contribute, the owners can review and delete links. so the community of collaborators is responsible for eliminating spam.

cheers,
christian

From jeremie at jabber.org Tue Sep 23 21:28:08 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Tue, 23 Sep 2008 16:28:08 -0500
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <1222179511.7620.63.camel@ubuntu>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu>
Message-ID: <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org>

A really really exciting step is happening in the current index-building process right now: Dennis is busy adding support for external scripts in Nutch's processing of the content, for adding custom fields to the index during the MapReduce phases... what this means is that anyone can start contributing simple/small scripts in any language to extend the index (for example marking any urls matching a regex or list with X and then search filtering on X), making building tools like this far easier.

It might take some time to get all the kinks worked out, but I think we're talking weeks here, maybe a few more to get a new custom-field index rolled out, so it's definitely in the near future :)

Jer

On Sep 23, 2008, at 9:18 AM, Christian Ledermann wrote:
> On Tue, 2008-09-23 at 14:34 +0200, Natanael wrote:
>> Definitely interesting.
>> Maybe "user groups" could be created where a school class could cooperate and use the same "mini-engine", and then the behaviour of everybody in the group will affect the results when the others search.
>
> right i forgot about the coop!
> > > cheers, > christian > > > > > _______________________________________________ > Wikia Search mailing list > http://re.search.wikia.com/ > Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l > From christian.ledermann at gmail.com Wed Sep 24 07:30:56 2008 From: christian.ledermann at gmail.com (Christian Ledermann) Date: Wed, 24 Sep 2008 10:30:56 +0300 Subject: [Search-l] Coop - Custom search engine In-Reply-To: <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org> References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org> Message-ID: <1222241456.7620.71.camel@ubuntu> On Tue, 2008-09-23 at 16:28 -0500, Jeremie Miller wrote: > A really really exciting step is happening in the current index- > building process right now, Dennis is busy adding support for > external > scripts in Nutch's processing of the content for adding custom > fields > to the index during the MapReduce phases... what this means is that > anyone can start contributing simple/small scripts in any language > to > extend the index (for example marking any urls matching a regex or > list with X and then search filtering on X), making building tools > like this incredibly easier. WOW! You guys ROCK :) Best Regards, Christian Ledermann From balinny at gmail.com Thu Sep 25 20:49:37 2008 From: balinny at gmail.com (Balinny) Date: Thu, 25 Sep 2008 22:49:37 +0200 Subject: [Search-l] Coop - Custom search engine In-Reply-To: <1222179511.7620.63.camel@ubuntu> References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> Message-ID: <48DBF961.4070500@gmail.com> Christian Ledermann wrote: > right i forgot about the coop! > > the 'mini-engine' is owned by a user, a user can define multiple > engines. > That looks really good! > a user can assign rights to other users to contribute to the engine. 
>
> so there should / could be different levels of cooperation:
>
> private:
> only the user and collaborators can view and add links to the engine,
> wikia search can use the tags, description , etc to improve the search
> results (anonymized), only the user and collaborators can view the
> search results and links submitted to the engine (i.e authorization
> required) Q: is that useful? if you want something something private
> than this might be the wrong place.
>

I wouldn't call that 'private'. It must be clear that it can be provided to external users (although it will have completely different scores than in your internal one). At the very least, the search engine shall be allowed to use that data to crawl those urls for third-party users.

We don't want something like this: ;)

-You showed my private information to other users!
-Well, the search engine just answered the query providing your blog.
-It was a private search engine!
-It happened that only your site had a hit for asdfghjjj. What's wrong with showing the result?
-It was my secret diary!! And now hundreds of people are visiting it!
-If you wanted a secret diary, why did you place it in a blog?
-So I could use Wikia Search to search on it, obviously.
-....

From balinny at gmail.com Thu Sep 25 21:16:32 2008
From: balinny at gmail.com (Balinny)
Date: Thu, 25 Sep 2008 23:16:32 +0200
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org>
Message-ID: <48DBFFB0.9000201@gmail.com>

Jeremie Miller wrote:
> A really really exciting step is happening in the current index-building
> process right now, Dennis is busy adding support for external
> scripts in Nutch's processing of the content for adding custom fields
> to the index during the MapReduce phases...
what this means is that
> anyone can start contributing simple/small scripts in any language to
> extend the index (for example marking any urls matching a regex or
> list with X and then search filtering on X), making building tools
> like this incredibly easier.
>
> It might take some time to get all the kinks worked out, but I think
> we're talking weeks here, maybe a few more to get a new custom-field
> index rolled out, so it's definitely in the near future :)
>
> Jer

Could you provide an example? Looks like it will be great, but I'm not grasping the scope.
Would the script select a subset for a given query, or apply to a wide subset?
Would it score the matches?

From christian.ledermann at gmail.com Fri Sep 26 06:47:01 2008
From: christian.ledermann at gmail.com (Christian Ledermann)
Date: Fri, 26 Sep 2008 09:47:01 +0300
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <48DBF961.4070500@gmail.com>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com>
Message-ID: <1222411621.6210.25.camel@ubuntu>

On Thu, 2008-09-25 at 22:49 +0200, Balinny wrote:
> I wouldn't call that 'private'. It must be clear that it can be provided
> to external users (although it will have completely different scores
> than in your internal one). At the very least, the search engine shall
> be allowed to use that data to crawl those urls for third-party users.
>
> We don't want something like this: ;)
> -You showed my private information to other users!
> -Well, the search engine just answered the query providing your blog.
> -It was a private search engine!
> -It happened that only your site had a hit for asdfghjjj. What's wrong
> with showing the result?
> -It was my secret diary!! And now hundreds of people are visiting it!
> -If you wanted a secret diary, why did you place it in a blog?
> -So I could use Wikia Search to search on it, obviously.

No, that's not what I meant. Let me construct a use case:

So-called 'hacker tools' are outlawed in several countries. I am a security consultant in one of those countries, so I need information about what is going on in terms of exploits, scanning tools, etc., but I do not necessarily want to draw attention to the fact that I have a search engine that focuses on 'hacking and cracking'.

So my 'personal search engine' will not be available to anyone (except others I invited to collaborate), either a) to have a look at the sites included in the search or b) to use the 'personal search engine'.

Wikia Search, on the other hand, is of course allowed to crawl those sites (this is not a tool to exclude sites from search; that's what robots.txt or authentication is for) and to use the description, keywords and ratings.

Well, maybe this is not the best way to tackle this; there might be other and better ways to achieve this form of privacy. Anyway, this is not the main use case of a focused search engine, so it can easily be dropped or deferred.

--
Best Regards,

Christian Ledermann

From natanael.l at gmail.com Fri Sep 26 06:59:31 2008
From: natanael.l at gmail.com (Natanael)
Date: Fri, 26 Sep 2008 08:59:31 +0200
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <1222411621.6210.25.camel@ubuntu>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu>
Message-ID:

Makes sense to me. I might want a personal search engine that focuses on tech sites only, or security sites only, or even cars if I want. Or maybe shopping? I don't want random results about everything else.

On 26/09/2008, Christian Ledermann wrote:
>
> On Thu, 2008-09-25 at 22:49 +0200, Balinny wrote:
> > I wouldn't call that 'private'. It must be clear that it can be
> > provided
> > to external users (although
> > it will have completely different scores than in your internal one).
> > At > > the very least, the search > > engine shall be allowed to use that data to crawl those urls for > > third-party users. > > > > We don't want something like this: ;) > > -You showed my private information to other users! > > -Well, the search engine just answered the query providing your blog. > > -It was a private search engine! > > -It happened that only your web had a hit for asdfghjjj. What's wrong > > on > > showing the result? > > -It was my secret diary!! And now hundreds of people are visiting it! > > -If you wanted a secret diary, why do you placed that in a blog? > > -So i could use Wikia Search to search on it, obviously. > > No, that's not what I meant, let me construct a use case: > > So called 'hacker tools' are outlawed by several countries, I am a > security consultant in one of those countries, so i do need information > what is going on in terms of exploits, scanning tools, etc. but I do not > necessarily want to draw attention to the fact that i have a search > engine that focuses on 'hacking and cracking'. > > So my 'personal search engine' will not be available to anyone (except > others i invited to collaborate) a.) to have a look at the sites to be > included in the search or to b) use the 'personal search engine' > > wikia search otoh is allowed to crawl those sites of course( this is not > a tool to exclude sites from search, that's what robot.txt or > authentication is for) and use the description, keywords and ratings. > > > well, maybe this is not the best way to tackle this, there might be > other and better ways to achieve this form of privacy. Anyway this is > not the main use case of a focused search engine so it easily can be > dropped or deferred. 
>
>
>
> --
> Best Regards,
>
> Christian Ledermann
>

--
If everybody are thinking alike, then somebody aren't thinking || Stupidity is a renewable resource

From balinny at gmail.com Fri Sep 26 10:49:43 2008
From: balinny at gmail.com (Balinny)
Date: Fri, 26 Sep 2008 12:49:43 +0200
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <1222411621.6210.25.camel@ubuntu>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu>
Message-ID: <48DCBE47.9030300@gmail.com>

Christian Ledermann wrote:
> No, that's not what I meant, let me construct a use case:
>

I wasn't trying to show a use case, but a /misuse case/ based on how people could bizarrely understand the word "private".

I find your use case perfectly acceptable. Wikia Search would benefit from the crawling needed for your "personal search", and take into account keywords and ratings (beware of SEOs creating hundreds of personal search engines to add a keyword to their irrelevant site!).
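
The cooperation levels proposed in this thread ('private', 'public visible', 'public, suggestions welcome', 'public, open for all') could be modelled roughly as below. This is only a sketch under assumptions: the names `Level`, `MiniEngine` and the `can_*` methods are hypothetical and are not part of Wikia Search or Nutch, and the rule that collaborators have the same rights as the owner is a guess at what was intended.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    """Cooperation levels as listed in the thread (identifiers are illustrative)."""
    PRIVATE = "private"
    PUBLIC_VISIBLE = "public visible"
    SUGGESTIONS_WELCOME = "public, suggestions welcome"
    OPEN_FOR_ALL = "public, open for all"


@dataclass
class MiniEngine:
    """Hypothetical model of a user-owned 'mini-engine'."""
    owner: str
    level: Level
    collaborators: set = field(default_factory=set)

    def is_member(self, user: str) -> bool:
        # Assumption: collaborators are treated like the owner.
        return user == self.owner or user in self.collaborators

    def can_view(self, user: str) -> bool:
        # Only a private engine hides its links and results from outsiders.
        return self.level is not Level.PRIVATE or self.is_member(user)

    def can_add_link(self, user: str) -> bool:
        # Outsiders may add links directly only when the engine is open for all.
        return self.level is Level.OPEN_FOR_ALL or self.is_member(user)

    def can_suggest(self, user: str) -> bool:
        # Outsiders may suggest links from 'suggestions welcome' upwards.
        return self.is_member(user) or self.level in (
            Level.SUGGESTIONS_WELCOME,
            Level.OPEN_FOR_ALL,
        )


engine = MiniEngine("christian", Level.SUGGESTIONS_WELCOME, {"natanael"})
print(engine.can_add_link("natanael"), engine.can_add_link("visitor"))
```

Keeping the level as a single value on the engine means a change of level needs no per-link bookkeeping; whether that is fine-grained enough for real use is an open question.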

From ottk at zzz.ee Fri Sep 26 19:15:44 2008
From: ottk at zzz.ee (Ott Köstner)
Date: Fri, 26 Sep 2008 22:15:44 +0300
Subject: [Search-l] Project 10 to the 100th
In-Reply-To: <091320082058.6282.48CC295F0002A1160000188A22230680329B0A02D29B9B0EBF9B9B079F9F0704D209@att.net>
References: <48CBAE7E.7050202@gmail.com> <60726.66.243.196.131.1221314451.squirrel@webmail.fairpoint.net> <091320082058.6282.48CC295F0002A1160000188A22230680329B0A02D29B9B0EBF9B9B079F9F0704D209@att.net>
Message-ID: <48DD34E0.4060800@zzz.ee>

A little bit off topic:

What do You think about Google's 10 to the 100th project? Is it just a trick or something serious?

http://www.project10tothe100.com/how_it_works.html

How it works

Project 10^100 (pronounced "Project 10 to the 100th") is a call for ideas to change the world by helping as many people as possible. Here's how to join in.

1. Send us your idea by October 20th.
   Simply fill out the submission form giving us the gist of your idea. You can supplement your proposal with a 30-second video.

2. Voting on ideas begins on January 27th.
   We'll post a selection of one hundred ideas and ask you, the public, to choose twenty semi-finalists. Then an advisory board will select up to five final ideas.

3. We'll help bring these ideas to life.
   We're committing $10 million to implement these projects, and our goal is to help as many people as possible. So remember, money may provide a jumpstart, but the idea is the thing.

Good luck, and may those who help the most win.

Remember, the deadline is October 20th, 2008

Guidelines

Our goal is to set as few rules as possible. However, we ask that you put your idea into one of the following categories and consider the evaluation criteria below.

Categories:

* *Community:* How can we help connect people, build communities and protect unique cultures?
* *Opportunity:* How can we help people better provide for themselves and their families?
* *Energy:* How can we help move the world toward safe, clean, inexpensive energy?
* *Environment:* How can we help promote a cleaner and more sustainable global ecosystem?
* *Health:* How can we help individuals lead longer, healthier lives?
* *Education:* How can we help more people get more access to better education?
* *Shelter:* How can we help ensure that everyone has a safe place to live?
* *Everything else:* Sometimes the best ideas don't fit into any category at all.

Criteria:

* Reach: How many people would this idea affect?
* Depth: How deeply are people impacted? How urgent is the need?
* Attainability: Can this idea be implemented within a year or two?
* Efficiency: How simple and cost-effective is your idea?
* Longevity: How long will the idea's impact last?

From MGarside at oriental.com Fri Sep 26 20:08:33 2008
From: MGarside at oriental.com (Megan Garside)
Date: Fri, 26 Sep 2008 15:08:33 -0500
Subject: [Search-l] Megan Garside is out of the office.
Message-ID:

I will be out of the office starting 09/25/2008 and will not return until 09/29/2008.

I will be out of the office this afternoon and will return Monday. I will respond to your message at that time.

Thanks,
Megan

From jeremie at jabber.org Sat Sep 27 22:15:21 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Sat, 27 Sep 2008 17:15:21 -0500
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <48DBFFB0.9000201@gmail.com>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <59C64D03-F975-4767-B35F-A38A56EACD20@jabber.org> <48DBFFB0.9000201@gmail.com>
Message-ID:

> Could you provide an example? Looks like it will be great, but I'm not
> grasping the scope.
> Would the script select a subset for a given query, or apply to a
> wide subset?
> Would it score the matches?

Dennis just ran some test scripts for me yesterday with great success! Heh ;)

The scripts were very simple, map.pl:

    # Read raw HTML on STDIN; emit "URL<TAB>1" for every src=http://... found.
    while (<STDIN>) {
        while (s/src[ |\=]+[\"|\']*http\:\/\/([^\s|\"|\'|\>]+)//i) {
            print "$1\t1\n";
        }
    }

and reduce.pl:

    # Read "URL<TAB>count" lines on STDIN; print the total count per URL.
    my %keyz;
    while (<STDIN>) {
        chomp;
        my ($key, $cnt) = split;
        $keyz{$key} += $cnt;
    }
    foreach my $key (keys %keyz) {
        print "$key\t$keyz{$key}\n";
    }

The map stage gives the script the raw HTML on STDIN and uses a regex to find any src=URL. Its output is collected and then fed into the reduce stage, which sorts on the key (URL) and aggregates a total count. My example doesn't do anything useful really, but it shows just how easily anyone can now provide scripts to plug into the processing stages of Nutch and do interesting things.

Dennis is also creating a format that either of these can output, which will look something like:

    URL \t fieldname \t fieldvalue \t flags \n

The fields can have any string value, and can be stored (so they'll be returned along with that URL on any result in the JSON) and/or indexed (so they can be required in any query, like fieldname:foo). Some uses, such as having a DB of URLs that you just want to be able to filter on, could just build the above file format and wouldn't even need any map/reduce steps.
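
For the last case (a plain list of URLs to filter on), a file in the `URL \t fieldname \t fieldvalue \t flags \n` layout could be generated with a few lines of script. The sketch below is in Python purely as an illustration, since the scripts can be in any language. The function name `field_lines`, the field name `topic`, and the flag spelling `stored,indexed` are all assumptions; the format was still being worked out, and its real flag syntax is not given in this thread.

```python
import re
from typing import Iterable, Iterator


def field_lines(urls: Iterable[str], pattern: str, field_name: str,
                field_value: str, flags: str = "stored,indexed") -> Iterator[str]:
    """Yield one tab-separated line (URL, fieldname, fieldvalue, flags) per matching URL.

    The flags value is a placeholder; the real flag syntax is not known here.
    """
    matcher = re.compile(pattern)
    for url in urls:
        if matcher.search(url):
            yield "\t".join((url, field_name, field_value, flags)) + "\n"


# Hypothetical input: mark every URL under /security/ with topic=security.
sample_urls = [
    "http://example.org/security/exploits.html",
    "http://example.org/cars/index.html",
]
for line in field_lines(sample_urls, r"/security/", "topic", "security"):
    print(line, end="")
```

A query could then require the field (in the spirit of `fieldname:foo`, here `topic:security`), provided the field ends up indexed.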

Jer

From jeremie at jabber.org Sat Sep 27 22:21:29 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Sat, 27 Sep 2008 17:21:29 -0500
Subject: [Search-l] Coop - Custom search engine
In-Reply-To: <48DCBE47.9030300@gmail.com>
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu> <48DCBE47.9030300@gmail.com>
Message-ID:

Unfortunately, nothing this project is doing with its Nutch/Lucene index, the grub ARC files, or with KT (the HBase instance that stores all the user contributions) will ever really support private information, since it's all publicly available, either through open APIs or for raw access ( http://soap.grub.org/arcs/ http://re.search.wikia.com/downloads/kt/ http://search.isc.org/download/ ).

It's still possible to build personalized search tools with transparency, but not private ones.

Jer

On Sep 26, 2008, at 5:49 AM, Balinny wrote:

> Christian Ledermann wrote:
>> No, that's not what I meant, let me construct a use case:
>>
> I wasn't trying to show a use case, but a /misuse case/ based on how
> people
> could bizarrely understand the word "private".
>
> I find your use case perfectly acceptable. Wikia Search would
> benefit from
> the crawling needed for your "personal search", and take into account
> keywords and ratings (beware of SEOs creating hundreds of personal
> search
> engines to add a keyword to their irrelevant site!).

From natanael.l at gmail.com Mon Sep 29 11:20:46 2008
From: natanael.l at gmail.com (Natanael)
Date: Mon, 29 Sep 2008 13:20:46 +0200
Subject: [Search-l] Coop - Custom search engine
In-Reply-To:
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu> <48DCBE47.9030300@gmail.com>
Message-ID:

Then we'll just set up a "search bot" with its own database, so whenever somebody creates a private engine it will use Nutch and filter the results, and it will store the settings for the private engines.

Does that seem to be a sane solution?

On Sun, Sep 28, 2008 at 00:21, Jeremie Miller wrote:

> Unfortunately nothing this project is doing with its Nutch/Lucene
> index, the grub ARC files, or with KT (the HBase instance that stores
> all the user contributions) will ever really support private
> information, since it's all publicly available in either open APIs or
> for raw access ( http://soap.grub.org/arcs/
> http://re.search.wikia.com/downloads/kt/
> http://search.isc.org/download/ ).
>
> It's possible to still build personalized search tools with
> transparency, but not private ones.
>
> Jer
>
> On Sep 26, 2008, at 5:49 AM, Balinny wrote:
>
> > Christian Ledermann wrote:
> >> No, that's not what I meant, let me construct a use case:
> >>
> > I wasn't trying to show a use case, but a /misuse case/ based on how
> > people
> > could bizarrely understand the word "private".
> >
> > I find your use case perfectly acceptable. Wikia Search would
> > benefit from
> > the crawling needed for your "personal search", and take into account
> > keywords and ratings (beware of SEOs creating hundreds of personal
> > search
> > engines to add a keyword to their irrelevant site!).

--
If everybody are thinking alike, then somebody aren't thinking || Stupidity is a renewable resource

From jeremie at jabber.org Mon Sep 29 18:00:00 2008
From: jeremie at jabber.org (Jeremie Miller)
Date: Mon, 29 Sep 2008 13:00:00 -0500
Subject: [Search-l] Coop - Custom search engine
In-Reply-To:
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu> <48DCBE47.9030300@gmail.com>
Message-ID:

If someone wanted their own personal search engine they could just run their own instance of Nutch, sure. I don't think anyone has done the work to make it as friendly as it might need to be, though :)

On Sep 29, 2008, at 6:20 AM, Natanael wrote:

> Then we'll just set up a "search bot" with its own database, so
> whenever somebody creates a private engine it will use Nutch and
> filter out the results, and it will store the settings for the
> private engines.
>
> Does that seem to be a sane solution?
> > On Sun, Sep 28, 2008 at 00:21, Jeremie Miller > wrote: > Unfortunately nothing this project is doing with it's Nutch/Lucene > index, the grub ARC files, or with KT (the Hbase instance that stores > all the user contributions) will ever really support private > information, since it's all publicly available in either open APIs or > for raw access ( http://soap.grub.org/arcs/ http://re.search.wikia.com/downloads/kt/ > http://search.isc.org/download/ ). > > It's possible to still build personalized search tools with > transparency, but not private ones. > > Jer > > On Sep 26, 2008, at 5:49 AM, Balinny wrote: > > > Christian Ledermann wrote: > >> No, that's not what I meant, let me construct a use case: > >> > > I wasn't trying to show a use case, but a /misuse case/ based on how > > people > > could bizarrely understand the word "private". > > > > I find your use case perfectly acceptable. Wikia Search would > > benefit from > > the crawling needed for your "personal search", and take into > account > > keywords and ratings (beware of SEOs creating hundreds of personal > > search > > engines to add a keyword to their irrelevant site!). 
>
>
>
> --
> If everybody are thinking alike, then somebody aren't thinking ||
> Stupidity is a renewable resource

From ash at mojosupreme.com Mon Sep 29 18:41:08 2008
From: ash at mojosupreme.com (Ashkan Karbasfrooshan)
Date: Mon, 29 Sep 2008 14:41:08 -0400
Subject: [Search-l] Coop - Custom search engine
In-Reply-To:
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu> <48DCBE47.9030300@gmail.com>
Message-ID:

http://www.MetaMojo.com

Built on Nutch. Basically what you guys are trying to accomplish, I think.

*MetaMojo.com at its core is a domain-specific vertical search engine, returning best of breed publishers for a given topic*, look at the high quality results, for example, in travel or health:

Travel - Paris
http://metamojo.com/results.php?query=paris&cat=5

Health - Prostate Cancer
http://metamojo.com/results.php?query=prostate+cancer&cat=8

If anyone is interested, we can create mirror search engines, that you can customize at will. If interested, email me.

thanks Ash On Mon, Sep 29, 2008 at 2:00 PM, Jeremie Miller wrote: > If someone wanted their own personal search engine they could just run > their own instance of Nutch sure, I don't think anyone has done the > work to make it as friendly as it might need to be though :) > > On Sep 29, 2008, at 6:20 AM, Natanael wrote: > > > Then we'll just set up a "search bot" with it's own database, so > > whenever somebody creates a private engine it will use Nutch and > > filter out the results, and it will store the settings for the > > private engines. > > > > Does that seem to be a sane solution? > > > > On Sun, Sep 28, 2008 at 00:21, Jeremie Miller > > wrote: > > Unfortunately nothing this project is doing with it's Nutch/Lucene > > index, the grub ARC files, or with KT (the Hbase instance that stores > > all the user contributions) will ever really support private > > information, since it's all publicly available in either open APIs or > > for raw access ( http://soap.grub.org/arcs/ > http://re.search.wikia.com/downloads/kt/ > > http://search.isc.org/download/ ). > > > > It's possible to still build personalized search tools with > > transparency, but not private ones. > > > > Jer > > > > On Sep 26, 2008, at 5:49 AM, Balinny wrote: > > > > > Christian Ledermann wrote: > > >> No, that's not what I meant, let me construct a use case: > > >> > > > I wasn't trying to show a use case, but a /misuse case/ based on how > > > people > > > could bizarrely understand the word "private". > > > > > > I find your use case perfectly acceptable. Wikia Search would > > > benefit from > > > the crawling needed for your "personal search", and take into > > account > > > keywords and ratings (beware of SEOs creating hundreds of personal > > > search > > > engines to add a keyword to their irrelevant site!). 
> >
> >
> >
> > --
> > If everybody are thinking alike, then somebody aren't thinking ||
> > Stupidity is a renewable resource

--
Ashkan Karbasfrooshan
CEO | Mojo Supreme, WatchMojo.com
5413 St. Laurent Blvd, Suite 200
Montreal, QC, H2T 1S5, Canada
p: 1-514/448-1631
c: 1-514/827-2532
f: 1-866/868-0981
Ash at MojoSupreme.com
http://www.WatchMojo.com
http://www.MojoSupreme.com
http://www.linkedin.com/in/ashkank
http://www.hipmojo.com/

From christian.ledermann at gmail.com Tue Sep 30 11:36:12 2008
From: christian.ledermann at gmail.com (Christian Ledermann)
Date: Tue, 30 Sep 2008 14:36:12 +0300
Subject: [Search-l] Coop - Custom search engine
In-Reply-To:
References: <1222164041.7620.42.camel@ubuntu> <1222179511.7620.63.camel@ubuntu> <48DBF961.4070500@gmail.com> <1222411621.6210.25.camel@ubuntu> <48DCBE47.9030300@gmail.com>
Message-ID: <1222774572.6245.15.camel@ubuntu>

On Sat, 2008-09-27 at 17:21 -0500, Jeremie Miller wrote:
> It's possible to still build personalized search tools with
> transparency, but not private ones.

OK, let's drop the 'private search' use case. Anyway, as you guys pointed out, this causes too much confusion.

--
Best Regards,

Christian Ledermann

Africa i-Parliaments Action Plan
http://www.parliaments.info
UN/DESA Nairobi - Kenya
Mobile : +254 729495789
Telephone : +254-20-374 9892/3
Fax : +254-20-374 9894