From kubes at apache.org Fri Feb 1 00:59:22 2008
From: kubes at apache.org (Dennis Kubes)
Date: Thu, 31 Jan 2008 18:59:22 -0600
Subject: [Search-l] URL Normalization and Input
In-Reply-To: <47A25C2F.6010801@gmail.com>
References: <47A207BF.8080304@apache.org> <47A25C2F.6010801@gmail.com>
Message-ID: <47A26EEA.5000707@apache.org>

Balinny wrote:
> Dennis Kubes wrote:
>> We are currently working on URL normalization measures for the search
>> wikia crawls. URL normalization is used during crawls to change URLs
>> into standard forms. An example of this is to have www.site.com/index.html
>> and www.site.com/ resolve to the same URL for crawling and scoring purposes.
>>
>> Eventually the idea would be to allow normalizations on a per domain
>> basis and allow the community to give detailed feedback per domain.
>> Currently all normalizations are on a global basis. Our current url
>> normalizations are done through regex so I have included the current
>> expressions as well. Currently we have come up with the following
>> normalizations, is there anything else we should include, change? What
>> does everyone think?
>>
> I hope it's only done for url comparing purposes (to avoid duplicate
> results), not for crawl. The remove-index is specially dangerous.
> We should rely on the Content-Location to detect the index as the same as /

That is one of the reasons I wanted to get some feedback. Is it dangerous because the opinion is that there are many non-default pages called index.html or equivalent, or is it dangerous because of spam implications?

The intention was to use this during generation of urls to fetch and when parsing links from pages. Nutch has a dedup process that eliminates both duplicate urls and duplicate content by hash. I am more concerned about duplicate pages for crawling and more importantly scoring.
For instance if you do this search:

http://re.search.wikia.com/search#java

You will see that java.net and www.java.net are currently counted as different urls but are duplicates with differing scores.

>
> Also, i don't know how they're currently done, but the order may matter.
> I'd do it in this order:
> -Remove #...
> -Remove session ids
> -Clean &s
> -Remove ?&var
> -Trailing ?
> -Change default pages into standard
>
> Some suggestions:
> -Remove maxage and smaxage parameters for comparing.

I don't understand what these are. Just query parameters?

> -Add php5 to the extensions list (although if they're putting the
> version in the extension, it's probably NOT the default).

will do

> -The ending with ([^/]*)$ instead of $ doesn't make me feel too comfortable.

The ([^/]*) is to make sure things like wiki/index.php/Main_Page don't get changed to wiki//Main_Page

> -Use ETags

I don't know if Nutch supports ETags yet or not, if not it is definitely something that is needed. :)

>
> _______________________________________________
> Wikia Search mailing list
> http://alpha.search.wikia.com/
> Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l

From peter.burden at gmail.com Fri Feb 1 13:10:22 2008
From: peter.burden at gmail.com (Peter Burden)
Date: Fri, 1 Feb 2008 13:10:22 +0000
Subject: [Search-l] URL Normalization and Input
In-Reply-To: <47A207BF.8080304@apache.org>
References: <47A207BF.8080304@apache.org>
Message-ID:

On 31/01/2008, Dennis Kubes wrote:
>
> We are currently working on URL normalization measures for the search
> wikia crawls. URL normalization is used during crawls to change URLs
> into standard forms. An example of this is to have www.site.com/index.html
> and www.site.com/ resolve to the same URL for crawling and scoring
> purposes.
>
> Eventually the idea would be to allow normalizations on a per domain
> basis and allow the community to give detailed feedback per domain.
> Currently all normalizations are on a global basis. Our current url
> normalizations are done through regex so I have included the current
> expressions as well. Currently we have come up with the following
> normalizations, is there anything else we should include, change? What
> does everyone think?

1. Session id elimination.
Is this wise? If you eliminate the session id from a URL then the server is likely to respond with a completely different page that represents the start of a new session. This may result in the user seeing something quite different from what the search engine sees, which may be confusing.

2. Pages served for directory requests.
I have also seen "welcome.htm[l]" and "home.htm[l]" used in this context. However I think doing this sort of normalisation is unwise and agree with Balinny that you need to access the actual pages. However a typically configured Apache server will not return a "Content-Location" header, it will simply and silently return the contents of "index.html" when you request the directory. [I think, but I'm not sure, that MS IIS does return a "Content-Location" header.] So I'm afraid you'll have to fetch both "/" and "/index.html" and determine that they're the same by checksumming or content inspection.

3. Removal of fragments (the bit after the #)
Yes, of course, but remember, and the regex quoted doesn't, that this may interact with the dynamic part of the URL (the bit after the "?"). I would assume that the fragment part of the URL is terminated by the "?" if a dynamic part (or query) is present.

4. Collapse multiple ampersands (&) to a single ampersand.
Can't see why.

5. Removal of initial & after ? I.e. ?&var=.... -> ?var=....
OK if you really want to, personally I'd prefer to ensure that there was an ampersand in this position as it makes it slightly easier to parse the dynamic part.

6. Remove trailing ?
If it's all on its own, seems sensible - but check with server, it just might do something insufferably clever.
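
For anyone who wants to experiment with points 3 to 6, here is a minimal Python sketch of the purely textual rewrites, applied in the order suggested earlier in the thread (fragment first, trailing ? last). The regexes, the RULES list and the normalize function are illustrative assumptions; they are not the expressions from the original post and not Nutch's own normalizer. Session ids and default pages are deliberately left out because they are the contested cases. One caveat on point 3: under RFC 3986 the fragment follows the query and runs to the end of the URL, so the sketch simply drops everything from the first # onward.

```python
import re

# Hypothetical rewrite rules, NOT the expressions from the original post.
# Each entry is (compiled pattern, replacement); order matters.
RULES = [
    (re.compile(r"#.*$"), ""),     # 3. drop the fragment (first '#' to end)
    (re.compile(r"&{2,}"), "&"),   # 4. collapse runs of '&' into one
    (re.compile(r"\?&+"), "?"),    # 5. '?&var=1' becomes '?var=1'
    (re.compile(r"[?&]+$"), ""),   # 6. drop a trailing '?' (or dangling '&')
]


def normalize(url: str) -> str:
    """Apply each rewrite rule in turn and return the normalized URL."""
    for pattern, replacement in RULES:
        url = pattern.sub(replacement, url)
    return url


if __name__ == "__main__":
    print(normalize("http://www.site.com/page?&a=1&&b=2#top"))
    # http://www.site.com/page?a=1&b=2
    print(normalize("http://www.site.com/a?#x"))
    # http://www.site.com/a
```

The second example shows why the order matters: if the trailing-? rule ran before the fragment rule, the ? in "a?#x" would not yet be at the end of the string and would survive.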

Some extra suggestions

7. Encoded characters
Map + to space (RFC1630). Map hex encoded unreserved characters to their non-encoded equivalents. [E.g. %7e -> ~ see RFC3986]

8. Leading double dots etc.
Do something coherent with URLs that start .././ and the like. Again see RFC3986 for detailed discussion. This is associated with the process I call derelativisation, i.e. converting a relative URL to an absolute URL.

9. Care with case
If the server is Unix/Linux based then URL case must be preserved since the underlying file naming system is case sensitive. On Microsoft based servers file naming is not case sensitive, so if server signature analysis (the Server header) suggests a Microsoft based host then MYPAGE.HTM and mypage.htm can be regarded as being the same.

10. Order of dynamic parts.
In general the order of the variable settings in the dynamic part of a URL is unimportant. E.g. bigsite?&chap=23&page=11 and bigsite?&page=11&chap=23 will both refer to the same document. This requires parsing the dynamic part and comparing the settings regardless of the order in which they appear.

General point.
I think it would be better to retain all the URLs in the database and associate an arbitrary document identification with them. So 2 (or more) URLs that redirect to or refer to the same document will retain their distinctiveness but will all be associated with the same document. This mechanism can support both HTTP and meta tag redirection.

From jmcc at hackwatch.com Fri Feb 1 14:11:47 2008
From: jmcc at hackwatch.com (John McCormac)
Date: Fri, 01 Feb 2008 14:11:47 +0000
Subject: [Search-l] The Search Business Just Got A Whole Lot Harder
Message-ID: <47A328A3.6020704@hackwatch.com>

Well this will either be a major change in the search business or the equivalent of the AOL/Time Warner merger.
http://biz.yahoo.com/ap/080201/microsoft_yahoo.html Yahoo search with Microsoft management or Microsoft search with Yahoo management - which would be the greatest threat? Regards...jmcc -- ****************************************************** John McCormac * e-mail: jmcc at whoisireland.com MC2 * voice: +353-51-873640 22 Viewmount * web: http://www.whoisireland.com/ Waterford * blog: http://blog.whoisireland.com Ireland * Irish Domain Stats & Market Research ****************************************************** From newsmarkie at googlemail.com Fri Feb 1 15:06:27 2008 From: newsmarkie at googlemail.com (Mark (Markie)) Date: Fri, 1 Feb 2008 15:06:27 +0000 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia Message-ID: Right im being bold here, so sorry if anyone has problems with this. Currently there is not really any definite policy on what is acceptable in, and what should not be in mini: articles for Search Wikia. This means that as an admin on the wiki I, and other admins, have difficulty in deciding what should be allowed to stay and what should be in the articles. It also means trouble for people who are trying to contribute as there isn't really a list of what should and shouldn't be in the mini articles. So I'm suggesting the following to try and get this sorted. Go to this page :- *http://search.wikia.com/wiki/search:Mini_article/Policy_discussion* This page is to be used for the posting and discussion of ideas for the policy on Mini: articles, Search:Mini article, and what they should contain. As the main name space, (the namespace with no prefix) is an alternative location for content, the question of what the main name space is to be used for is also under consideration, see Search:Main namespace for one alternative use. Please post ideas below and then these will be discussed until *February 8, 2008*. From then until *February 15, 2008* various versions of the policy will be drawn up. 
More details are on the wiki page, please check there. Please add your inputs. Thanks all and hope i haven't upset anyone. Best wishes Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080201/e9760fea/attachment.html From peter.burden at gmail.com Fri Feb 1 23:54:46 2008 From: peter.burden at gmail.com (Peter Burden) Date: Fri, 1 Feb 2008 23:54:46 +0000 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: References: Message-ID: On 01/02/2008, Mark (Markie) wrote: > > Right im being bold here, so sorry if anyone has problems with this. > > Currently there is not really any definite policy on what is acceptable > in, and what should not be in mini: articles for Search Wikia. This means > that as an admin on the wiki I, and other admins, have difficulty in > deciding what should be allowed to stay and what should be in the articles. > It also means trouble for people who are trying to contribute as there isn't > really a list of what should and shouldn't be in the mini articles. > > So I'm suggesting the following to try and get this sorted. Go to this > page :- * > http://search.wikia.com/wiki/search:Mini_article/Policy_discussion* I've been there and, frankly, it's far from obvious how to enter into a discussion there. I suggest the discussion continue on this mailing list. So to start the ball rolling :- Before going any further I'd strongly recommend that anybody interested in this discussion spends a few minutes hitting the "random mini-article" button. This, if nothing else, will give some idea of the various possible interpretations of the idea. Pure disambiguation (a la Wikipedia) seems the most useful practical role of mini-articles and would be a very useful search engine plus subject to one important proviso. 
This is that once a user has focussed on one of the disambiguating alternatives he/she can then re-do their search, getting results relevant to just the sense they had selected. If this can be done then disambiguating mini-articles will be most useful. However I've not seen any suggestions or proposals that would result in Search Wikia having the necessary semantic mark-up in any credible fashion.

Here are some ideas on rules for mini-articles:

Mini-articles should be between 50 and 200 words long.
Mini-articles must include a tag of some sort indicating what language they are in.
Mini-articles should not include images or hyper-links.

> This page is to be used for the posting and discussion of ideas for the
> policy on Mini: articles, Search:Mini article,
> and what they should contain. As the main name space, (the namespace with no
> prefix) is an alternative location for content, the question of what the
> main name space is to be used for is also under consideration, see Search:Main
> namespace for one
> alternative use. Please post ideas below and then these will be discussed
> until *February 8, 2008*. From then until *February 15, 2008* various
> versions of the policy will be drawn up.
>
> More details are on the wiki page, please check there. Please add your
> inputs.
>
> Thanks all and hope i haven't upset anyone.
>
> Best wishes
>
> Mark
>
> _______________________________________________
> Wikia Search mailing list
> http://alpha.search.wikia.com/
> Change options or unsubscribe:
> http://lists.wikia.com/mailman/options/search-l
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080201/9ae5aae7/attachment.html From jeremie at jabber.org Sat Feb 2 00:21:44 2008 From: jeremie at jabber.org (jer) Date: Fri, 1 Feb 2008 18:21:44 -0600 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: References: Message-ID: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> >> So I'm suggesting the following to try and get this sorted. Go to >> this page :- http://search.wikia.com/wiki/search:Mini_article/ >> Policy_discussion > > I've been there and, frankly, it's far from obvious how to enter > into a discussion there. > I suggest the discussion continue on this mailing list. So to start > the ball rolling :- It is a bit confusing, a conversation in progress, but if anyone has input they should just tack it onto the end and it'll get moved/ cleaned up as the page evolves. A discussion here is of course great too, and any points should/will get recorded into the policy discussion page. > ... > Here's some ideas on rules for mini-articles > > Mini-articles should be between 50 and 200 words long. > Mini-articles must include a a tag of some sort indicating what > language they are in. Agree with both of these, I just included an i18n section on the page for ideas on how to do the localization. > Mini-articles should not include images or hyper-links. I think both are potentially useful but must have a careful policy around them, small images and good hyperlinks could be very valuable to a searcher. Jer From newsmarkie at googlemail.com Sat Feb 2 10:25:16 2008 From: newsmarkie at googlemail.com (Mark (Markie)) Date: Sat, 2 Feb 2008 10:25:16 +0000 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> Message-ID: On Feb 2, 2008 12:21 AM, jer wrote: > >> So I'm suggesting the following to try and get this sorted. 
Go to > >> this page :- http://search.wikia.com/wiki/search:Mini_article/ > >> Policy_discussion > > > > I've been there and, frankly, it's far from obvious how to enter > > into a discussion there. > > I suggest the discussion continue on this mailing list. So to start > > the ball rolling :- > > It is a bit confusing, a conversation in progress, but if anyone has > input they should just tack it onto the end and it'll get moved/ > cleaned up as the page evolves. A discussion here is of course great > too, and any points should/will get recorded into the policy > discussion page. > > > ... > > Here's some ideas on rules for mini-articles > > > > Mini-articles should be between 50 and 200 words long. > > Mini-articles must include a a tag of some sort indicating what > > language they are in. > > Agree with both of these, I just included an i18n section on the page > for ideas on how to do the localization. i agree with this also. further thoughts on localisation, we could put some .js code in the mini:edit pages that say "click me if this article is in English, Deutsch etc etc" which would then insert the correctly formatted lang code holder. > > > > Mini-articles should not include images or hyper-links. > > I think both are potentially useful but must have a careful policy > around them, small images and good hyperlinks could be very valuable > to a searcher. hmmm images maybe, if they are informative, i would say no to screenshots of the website though, links...well isnt that what search is for??... cheers mark > > > Jer > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/499bd298/attachment.html From ottk at zzz.ee Sat Feb 2 13:12:25 2008 From: ottk at zzz.ee (ottk at zzz.ee) Date: Sat, 2 Feb 2008 13:12:25 +0000 (UTC) Subject: [Search-l] NYTimes.com: Eyes on Google, Microsoft Bids $44 Billion for Yahoo Message-ID: <20080202131225.4C4D61D40A2A@ut4.sjc.wikia-inc.com> This page was sent to you by: ottk at zzz.ee. What do you think about this story? TECHNOLOGY | February 2, 2008 Eyes on Google, Microsoft Bids $44 Billion for Yahoo By MIGUEL HELFT and ANDREW ROSS SORKIN If consummated, the deal would instantly redraw the competitive landscape on the Internet and escalate the battle between Microsoft and Google. http://www.nytimes.com/2008/02/02/technology/02yahoo.html?ex=1202619600&en=43c6aeedd5c25cc1&ei=5070&emc=eta1 ---------------------------------------------------------- ABOUT THIS E-MAIL This e-mail was sent to you by a friend through NYTimes.com's E-mail This Article service. For general information about NYTimes.com, write to help at nytimes.com. NYTimes.com 620 Eighth Avenue New York, NY 10018 Copyright 2008 The New York Times Company -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/76b7295d/attachment.html From jmcc at hackwatch.com Sat Feb 2 13:34:45 2008 From: jmcc at hackwatch.com (John McCormac) Date: Sat, 02 Feb 2008 13:34:45 +0000 Subject: [Search-l] NYTimes.com: Eyes on Google, Microsoft Bids $44 Billion for Yahoo In-Reply-To: <20080202131225.4C4D61D40A2A@ut4.sjc.wikia-inc.com> References: <20080202131225.4C4D61D40A2A@ut4.sjc.wikia-inc.com> Message-ID: <47A47175.8010002@hackwatch.com> ottk at zzz.ee wrote: > > The New York Times E-mail This > *This page was sent to you by: * ottk at zzz.ee > > Message from sender: > What do you think about this story? It certainly makes the attempt to gain search marketshare a bit harder. 
Microsoft management and Yahoo search technology and expertise will be quite a threat to Google and others. Given Yahoo's concentration on the social web, the product of this merger could be quite an opponent for any new search engine venture, especially ones that involve a social element. It would also have something that a lot of new ventures have to struggle to attain - users. Regards...jmcc -- ****************************************************** John McCormac * e-mail: jmcc at whoisireland.com MC2 * voice: +353-51-873640 22 Viewmount * web: http://www.whoisireland.com/ Waterford * blog: http://blog.whoisireland.com Ireland * Irish Domain Stats & Market Research ****************************************************** From memorabilia.ggm at gmail.com Sat Feb 2 15:24:37 2008 From: memorabilia.ggm at gmail.com (Fernando Jaramillo) Date: Sat, 2 Feb 2008 10:24:37 -0500 Subject: [Search-l] (sin asunto) Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/7f1bccc8/attachment.html From bikengr at netnet.net Sat Feb 2 15:28:15 2008 From: bikengr at netnet.net (Jim Papadopoulos) Date: Sat, 2 Feb 2008 09:28:15 -0600 Subject: [Search-l] vision Message-ID: <001901c865b0$3b34e1b0$6600a8c0@PC270429458147> Hello, I am notoriously bad at lurking, these days. So I apologise if I am repeating something old, or speaking at odds with the general direction. I am speaking as a 'search' visionary, not a computer expert. I see 'search' as being more powerful than the old concept of 'library'. I have hopes for what will someday be available, and that is why I tried to make contact as soon as I heard of Wiki Search. Two or three major points, which I hope intersect with this venture in some way: Configurable ranking algorithms. 
I imagine that there must be more or less complicated ranking algorithms that 'weight' various factors such as 'type A spam', 'type B spam', 'authoritativeness among other non-spam pages', 'commercial content', 'specificity', etc. I would like to see some user choice in making those weightings. In other words, if a ranking could be simplified as 100 weighting factors, let a user fiddle with those factors. Several 'standard' weighting assignments would always be available, but the user would have the opportunity to 'interpolate between weighting vectors' or even 'create a weighting vector from scratch'. Weighting vectors could be widely shared. Probably some committed souls would do their best to create a 'weighting vector closely matching google performance'. This kind of 'open weighting source' freedom would eliminate the current one-weighting tyranny of google and others.

Commercial taint: On one hand, commerce is immensely powerful, and can fund useful things. On the other hand, it can diminish free and fair access. Even though I have always heard that Google does not accept pay for rank, my sense is that a large commercial venture can achieve high rank SOMEHOW. (By linking to its own websites, by continual SEO.) Isn't it true that Google has the possibility of political influence, for example simply not ranking a large category of pages if they don't match the prevailing trends?

I have two thoughts:

1. The first is that with an 'open source weighting system', any company (Ford?) or consortium (Better Business Bureau) can provide a search gateway with their own weighting system. If they have a way of providing extra value while pushing their own members, such an initiative can freely try to attract users.

2. The second is a 'government responsibility' idea that may not go down well with all.
Just as the Library of Congress evolved into a nationally supported gateway to information, I have this feeling that LC should also be supporting and promoting a NEUTRAL CONFIGURABLE search engine such as Wiki Search. Of course this could only work if it is not run by ideologues. Maybe it should be an international effort? I think open access to information via search is the library of tomorrow. We don't want a Coca Cola TM library, in many cases we want a Community Library.

Talk first, think later. Now I will take a look at the alpha site. Thanks for your patience.

Jim Papadopoulos

From tat.wright at googlemail.com Sat Feb 2 15:27:51 2008
From: tat.wright at googlemail.com (Tom Wright)
Date: Sat, 2 Feb 2008 15:27:51 +0000
Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia
In-Reply-To:
References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org>
Message-ID: <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com>

The motivation behind providing links within mini-articles is that they can provide:

(i) Better links.
(ii) Information describing links (This is absent from search results.)
(iii) Links that might be lower down the search order but are still important. (Example i. When looking up programming related terms one often doesn't get links to documentation because of the profusion of programming discussion forums. ii. When looking up book titles, links to the full text of the book (if it is out of copyright) are very useful - but won't, as a rule, appear in searches.)
(iv) Links to things that mightn't appear in the search results but are very important. (Examples: Javascript is also known as ECMAscript - you won't get links to ecmascript when searching for javascript - but ecmascript provides the only formal documentation.
Also searching for web related terms) (v) A way for people who have found a the necessary link after much searching to store their result somewhere - thereby preventing the inefficiencies of other people having to do exactly the same work. Now certainly you can argue that all of the above can be done by: (a) Providing suitable disambiguation. (b) Adding extra features to search results. (c) Adding the ability to search based on "semantics" rather than syntactics. and all of this should be attempted. However the question is will this work in practice? I think human generated content can always be better than algorithmically generated content - assuming that the human doesn't have mal-intent. (Because the humans use the algorithms first.) Why do you think having links within mini-articles is a bad idea? The possible risks I can perceive are: (i) Spam (ii) Biased content (iii) Detraction from the search results. (iv) Distraction due to the links being of a poor quality. The question is how much can one control these risks, and do they justify the potential benefits of having links. This is very much an empirical rather than theoretic question... so I'm not quite sure how to answer it. Tom On Feb 2, 2008 10:25 AM, Mark (Markie) wrote: > > > > On Feb 2, 2008 12:21 AM, jer wrote: > > > > >> So I'm suggesting the following to try and get this sorted. Go to > > >> this page :- http://search.wikia.com/wiki/search:Mini_article/ > > >> Policy_discussion > > > > > > I've been there and, frankly, it's far from obvious how to enter > > > into a discussion there. > > > I suggest the discussion continue on this mailing list. So to start > > > the ball rolling :- > > > > It is a bit confusing, a conversation in progress, but if anyone has > > input they should just tack it onto the end and it'll get moved/ > > cleaned up as the page evolves. A discussion here is of course great > > too, and any points should/will get recorded into the policy > > discussion page. > > > > > ... 
> > > > > Here's some ideas on rules for mini-articles > > > > > > Mini-articles should be between 50 and 200 words long. > > > Mini-articles must include a a tag of some sort indicating what > > > language they are in. > > > > Agree with both of these, I just included an i18n section on the page > > for ideas on how to do the localization. > > > i agree with this also. further thoughts on localisation, we could put some > .js code in the mini:edit pages that say "click me if this article is in > English, Deutsch etc etc" which would then insert the correctly formatted > lang code holder. > > > > > > > > > > Mini-articles should not include images or hyper-links. > > > > I think both are potentially useful but must have a careful policy > > around them, small images and good hyperlinks could be very valuable > > to a searcher. > hmmm images maybe, if they are informative, i would say no to screenshots of > the website though, links...well isnt that what search is for??... > > cheers > > mark > > > > > > > Jer > > > > > > > > > > _______________________________________________ > > Wikia Search mailing list > > http://alpha.search.wikia.com/ > > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > From fredbaud at fairpoint.net Sat Feb 2 15:44:13 2008 From: fredbaud at fairpoint.net (Fred Bauder) Date: Sat, 2 Feb 2008 10:44:13 -0500 (EST) Subject: [Search-l] vision In-Reply-To: <001901c865b0$3b34e1b0$6600a8c0@PC270429458147> References: <001901c865b0$3b34e1b0$6600a8c0@PC270429458147> Message-ID: <65231.66.243.196.131.1201967053.squirrel@webx1.neonova.net> Yes, rather like choosing music at Pandora.com, one could chose the weighting factors to be used in your search. 
That assumes that generic weighting factors have not already winnowed the results. So one could chose advertising if you're shopping, and journal articles if you're researching. Fred Bauder > Hello, I am notoriously bad at lurking, these days. So I apologise if I > am repeating something old, or speaking at odds with the general > direction. > > > > I am speaking as a 'search' visionary, not a computer expert. I see > 'search' as being more powerful than the old concept of 'library'. I > have hopes for what will someday be available, and that is why I tried > to make contact as soon as I heard of Wiki Search. > > > > Two or three major points, which I hope intersect with this venture in > some way: > > > > Configurable ranking algorithms. I imagine that there must be more or > less complicated ranking algorithms that 'weight' various factors such > as 'type A spam', 'type B spam', 'authoritativeness among other non-spam > pages', 'commercial content', 'specificity', etc. etc. I would like > to see some user choice in making those weightings. In other words, if a > ranking could be simplified as 100 weighting factors, let a user fiddle > with those factors. Several 'standard' weighting assignments would > always be available, but the user would have the opportunity to > 'interpolate between weighting vectors' or even 'create a weighting > vector from scratch'. Weighting vectors could be widely shared. Probably > some committed souls would do their best to create a 'weighting vector > closely matching google performance'. This kind of 'open weighting > source' freedom would eliminate the current one-weighting tyranny of > google and others. > > > > > > Commercial taint: On one hand, commerce is immensely powerful, and can > fund useful things. On the other hand, it can diminish free and fair > access. Even though I have always heard that Google does not accept pay > for rank, my sense is that somehow, a large commercial venture can > achieve high rank SOMEHOW. 
(By linking to its own websites, by continual > SEO.) Isn't it true that Google has the possibility of political > influence, for example simply not ranking a large category of pages if > they don't match the prevailing trends? > > > > I have two thoughts: > > > > 1. The first is that with an 'open source weighting system', any company > (Ford?) or consortium (Better Business Bureau) can provide a search > gateway with their own weighting system. If they have a way of providing > extra value while pushing their own members, such an initiative can > freely try to attract users. > > > > 2. The second is a 'government responsibility' idea that may not go down > well with all. Just as the Library of Congress evolved into a nationally > supported gateway to information, I have this feeling that LC should > also be supporting and promoting a NEUTRAL CONFIGURABLE search engine > such as Wiki Search. Of course this could only work if it is not run by > ideologues. Maybe it should be an international effort? I think open > access to information via search is the library of tomorrow. We don't > want a Coca Cola TM library, in many cases we want a Community Library. > > > > Talk first, think later. Now I will take a look at the alpha site. > Thanks for your patience. > > > > Jim Papadopoulos > > > > > > > > > > > > > > > > > > From spam.again.spam at gmail.com Sat Feb 2 17:42:56 2008 From: spam.again.spam at gmail.com (SpamAgain Spam) Date: Sat, 2 Feb 2008 09:42:56 -0800 Subject: [Search-l] Wikia Searching Message-ID: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> First off, I'm terrible at using these mailing systems, they confuse me and therefor I hate them. Anyway, I'm glad Wikia Search has been founded. I had an idea just like this (wikipedia-like search engine), and was thinking of programming it. Instead, I just found this I would like to add input (my own ideas) to be considered. 
I thought I'd share some of my original ideas, instead of letting them go
to waste:

+ Have a link that you want to have spidered? Do you consider something to
  be spam? Flag something for the adult section? And so on.
+ Discussion pages will be available to discuss this stuff, such as whether
  a site has adult images and should be flagged.
+ Link a website to its Wikipedia entry if one exists.
+ Search .mp3s, etc.

Good luck.
- Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/d561517a/attachment.html

From newsmarkie at googlemail.com Sat Feb 2 18:33:19 2008
From: newsmarkie at googlemail.com (Mark (Markie))
Date: Sat, 2 Feb 2008 18:33:19 +0000
Subject: [Search-l] Wikia Searching
In-Reply-To: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com>
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com>
Message-ID:

hmm these are all good ideas for user ranking. atm we only have star
ratings and im not too sure if these are actually used or stored currently.
what would people think about having various different rankings like these.
i think that big company (yes the one beginning with G :-) currently have
safe search, would we want that kinda thing??

others thoughts?

mark

On Feb 2, 2008 5:42 PM, SpamAgain Spam wrote:
> First off, I'm terrible at using these mailing systems, they confuse me
> and therefor I hate them. Anyway, I'm glad Wikia Search has been founded. I
> had an idea just like this (wikipedia-like search engine), and was thinking
> of programming it. Instead, I just found this I would like to add input (my
> own ideas) to be considered. I thought I'd share some of my original ideas,
> instead of let them go to waste:
>
> + Have a link that you want to have spidered? Do you consider something to
> be spam? Flag something for the adult section? And so on.
> + Discussion pages will be available to discuss this stuff, such as if the > site has adult images and should be flagged. > + Link website to a wikipedia entry if one exists. > + Search .mp3's, etc,. > > Good luck. > - Chris > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/50ee0702/attachment.html From aerik at thesylvans.com Sat Feb 2 18:56:41 2008 From: aerik at thesylvans.com (Aerik Sylvan) Date: Sat, 2 Feb 2008 10:56:41 -0800 Subject: [Search-l] Wikia Searching In-Reply-To: References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> Message-ID: <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> On Feb 2, 2008 10:33 AM, Mark (Markie) wrote: > hmm these are all good ideas for user ranking. atm we only have star > ratings and im not too sure if these are actually used or stored currently. > what would people think about having various different rankings like these. > i think that big company (yes the one beginning with G :-) currently have > safe search, would we want that kinda thing?? > > others thoughts? > > mark > Yes, I think a safe search is a good idea. And stars are good; though I still like the idea of being able to assign more meta information (tags!) to a result to help with disambiguation. Aerik -- http://www.wikidweb.com - the Wiki Directory of the Web http://tagthis.info - Hosted Tagging for your website! 
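
The star ratings, tags, and safe-search flag discussed in the messages above could be modelled roughly as follows. This is only an illustrative Python sketch: the `Result` class, its field names, and the `search_filter` helper are assumptions invented for this example, not anything taken from Wikia Search or Nutch.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Result:
    """One search hit plus the user feedback attached to it (hypothetical model)."""
    url: str
    stars: list = field(default_factory=list)  # individual star votes, e.g. 1-5
    tags: set = field(default_factory=set)     # user-assigned tags for disambiguation
    adult: bool = False                        # flagged for the adult section

    def average_stars(self) -> float:
        # Unrated results score 0.0, so they sort below rated ones.
        return mean(self.stars) if self.stars else 0.0


def search_filter(results, safe_search=True, required_tag=None):
    """Drop adult hits when safe search is on, optionally keep only one tag,
    then order the remaining hits by average star rating, best first."""
    kept = [
        r for r in results
        if not (safe_search and r.adult)
        and (required_tag is None or required_tag in r.tags)
    ]
    return sorted(kept, key=Result.average_stars, reverse=True)


if __name__ == "__main__":
    hits = [
        Result("http://example.org/kiss-band", stars=[5, 4], tags={"music"}),
        Result("http://example.org/kiss-film", stars=[3], tags={"film"}),
        Result("http://example.org/kiss-adult", stars=[5], tags={"music"}, adult=True),
    ]
    for hit in search_filter(hits, safe_search=True, required_tag="music"):
        print(hit.url, hit.average_stars())
```

Keeping the adult flag and the tags as plain fields on each result means safe search and tag-based disambiguation are just filters applied before ranking; how votes would be stored or protected against spam is left open here.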
From fredbaud at fairpoint.net Sat Feb 2 19:16:36 2008 From: fredbaud at fairpoint.net (Fred Bauder) Date: Sat, 2 Feb 2008 14:16:36 -0500 (EST) Subject: [Search-l] Mechanism for Feedback on Search Results, was Re: Wikia Searching In-Reply-To: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> Message-ID: <49968.66.243.196.131.1201979796.squirrel@webx1.neonova.net> > First off, I'm terrible at using these mailing systems, they confuse me > and therefor I hate them. Anyway, I'm glad Wikia Search has been > founded. I had an idea just like this (wikipedia-like search engine), > and was thinking of programming it. Instead, I just found this I would > like to add input (my own ideas) to be considered. I thought I'd share > some of my original ideas, instead of let them go to waste: > > + Have a link that you want to have spidered? Do you consider something > to be spam? Flag something for the adult section? And so on. > + Discussion pages will be available to discuss this stuff, such as if > the site has adult images and should be flagged. > + Link website to a wikipedia entry if one exists. > + Search .mp3's, etc,. > > Good luck. > - Chris Chris's ideas seem to assume we would have a way of generating user feedback regarding individual search results. This could be the search namespace (policy pages there would be moved to a new policy namespace). Each search which produced a hit from a url would automatically generate an entry on the search page for that url which would consist of the page which constituted the hit and what search term was used. To give an example, if I search for "Kiss" + "band" the top hit today on Google is http://en.wikipedia.org/wiki/Kiss_(band) So on page Search:en.wikipedia.en an entry would be made: http://en.wikipedia.org/wiki/Kiss_(band) Hit #1 "kiss" + "band". 
This page would be editable, allowing feedback by users regarding the
utility and appropriateness of that result. In addition to verbal comments
the result could be rated from -5 to +5, feedback which would affect
subsequent search results. Situations like dead links or redirects to porn
sites could also be reported by clicking on a "nix" link, which would then
cause the result to not be displayed. In the case of popular sites, the
page would be automatically cleared as often as needed, or perhaps archived.

Obviously we could generate more pages and entries in this manner than we
would want. Perhaps only searches by signed-in users would result in
generation of these pages. However, any user of the service, signed in or
not, could view such pages after a search and give feedback on the existing
content.

Fred

From fredbaud at fairpoint.net Sat Feb 2 19:23:18 2008
From: fredbaud at fairpoint.net (Fred Bauder)
Date: Sat, 2 Feb 2008 14:23:18 -0500 (EST)
Subject: [Search-l] Wikia Searching
In-Reply-To: <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com>
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com>
Message-ID: <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net>

> On Feb 2, 2008 10:33 AM, Mark (Markie)
> wrote:
>> hmm these are all good ideas for user ranking. atm we only have star
>> ratings and im not too sure if these are actually used or stored
>> currently. what would people think about having various different
>> rankings like these. i think that big company (yes the one beginning
>> with G :-) currently have safe search, would we want that kinda
>> thing??
>>
>> others thoughts?
>>
>> mark
>>
>
> Yes, I think a safe search is a good idea. And stars are good; though I
> still like the idea of being able to assign more meta information
> (tags!) to a result to help with disambiguation.
>
> Aerik

As I point out in another post, there has to be a place, a page, where
this could happen, although one can imagine an option, "rate this result"
next to a hit. Or there could be a popup which appears after the user goes
to the hit, "rate this result". Just brainstorming here; if we get too
annoying, no one will put up with it, let alone rate anything.

Fred Bauder

From david.trebosc at gmail.com Sat Feb 2 19:34:08 2008
From: david.trebosc at gmail.com (David TREBOSC)
Date: Sat, 2 Feb 2008 20:34:08 +0100
Subject: [Search-l] Wikia Searching
In-Reply-To: <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net>
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net>
Message-ID: <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com>

>
> >As I point out in another post, there has to be a place, a page, where
> >this could happen, although one can imagine an option, "rate this result"
> >next to a hit.

I like this, but I think I would also like a way to rate the resultS (with
an S).
Imagine you are looking for an ecologic car.
If the results are:
- Commercial web site selling cars and saying their cars are more ecologic
  than others
- Forum on ecologic cars
- Web site against ecology
- Blog on ecologic cars

Then it's good to rate them (with stars) like this:
*    - Commercial web site selling cars and saying their cars are more
       ecologic than others
**** - Forum on ecologic cars
     - Web site against ecology
**** - Blog on ecologic cars

But it could be more interesting (how... I don't know :-( ) to rate the
result as a whole. In this case, the result isn't good because two links
aren't good and the first is not a good one.
In fact, if people could rate like this (1. sort the results so the good
ones come first, 2. assign stars) it could be better =>
**** - Blog on ecologic cars
**** - Forum on ecologic cars
*    - Commercial web site selling cars and saying their cars are more
       ecologic than others
     - Web site against ecology

Do you understand what I mean?
The results are not independent; it's about rating the whole result list
and not the individual links.

(Sorry, but as I don't speak English well, I gave an example.)

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/c950d4ca/attachment.html

From fredbaud at fairpoint.net Sat Feb 2 19:37:35 2008
From: fredbaud at fairpoint.net (Fred Bauder)
Date: Sat, 2 Feb 2008 14:37:35 -0500 (EST)
Subject: [Search-l] Wikia Searching
In-Reply-To: <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com>
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com>
Message-ID: <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net>

I will make a more considered response in a minute, but if I were shopping
for an ecologic car the site I want is the commercial site.

Fred

>>
>> >As I point out in another post, there has to be a place, a page,
>> where this could happen, although one can imagine an option, "rate
>> this result" next to a hit.
>
> I like this but think that I will also like a way to rate the resultS
> (with a S).
> Imagine you are looking for ecologic car.
> If the result are :
> - Commercial web site selling car and saying their cars are more
> ecologic than other
> - Forum on ecologic car
> - Web site against ecology
> - Blog on ecologic car.
>
> Then It's good to rate (with star) like this
> *- Commercial web site selling car and saying their cars are more
> ecologic than other
> ****- Forum on ecologic car
> - Web site against ecology
> **** - Blog on ecologic car.
>
>
> But could be more interesting (how ...
i don't know :-() to rate he > result In this case, result isn't good because 2 links aren't good and > the first is not a good one. > In fact If people can rate like this (1- sorting result good are first > 2- put star) it could be better => > **** - Blog on ecologic car. > ****- Forum on ecologic car > *- Commercial web site selling car and saying their cars are more > ecologic than other > - Web site against ecology > > Do you understand what I mean ? > Reuslt are not independant and it's about rating the whole result and > not the individual links. > > (sorry but As I don't speak english, I put example). > > > David From newsmarkie at googlemail.com Sat Feb 2 20:00:58 2008 From: newsmarkie at googlemail.com (Mark (Markie)) Date: Sat, 2 Feb 2008 20:00:58 +0000 Subject: [Search-l] Mechanism for Feedback on Search Results, was Re: Wikia Searching In-Reply-To: <49968.66.243.196.131.1201979796.squirrel@webx1.neonova.net> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <49968.66.243.196.131.1201979796.squirrel@webx1.neonova.net> Message-ID: yup, im liking whats said below. great minds really do think alike :-) regards mark On Feb 2, 2008 7:16 PM, Fred Bauder wrote: > > First off, I'm terrible at using these mailing systems, they confuse me > > and therefor I hate them. Anyway, I'm glad Wikia Search has been > > founded. I had an idea just like this (wikipedia-like search engine), > > and was thinking of programming it. Instead, I just found this I would > > like to add input (my own ideas) to be considered. I thought I'd share > > some of my original ideas, instead of let them go to waste: > > > > + Have a link that you want to have spidered? Do you consider something > > to be spam? Flag something for the adult section? And so on. > > + Discussion pages will be available to discuss this stuff, such as if > > the site has adult images and should be flagged. > > + Link website to a wikipedia entry if one exists. > > + Search .mp3's, etc,. 
> > > > Good luck. > > - Chris > > Chris's ideas seem to assume we would have a way of generating user > feedback regarding individual search results. This could be the search > namespace (policy pages there would be moved to a new policy namespace). > Each search which produced a hit from a url would automatically generate > an entry on the search page for that url which would consist of the page > which constituted the hit and what search term was used. To give an > example, if I search for "Kiss" + "band" the top hit today on Google is > http://en.wikipedia.org/wiki/Kiss_(band)So on page Search: > en.wikipedia.en > an entry would be made: http://en.wikipedia.org/wiki/Kiss_(band)Hit #1 > "kiss" + "band". This page would be editable allowing feedback by users > regarding the utility and appropriateness of that result. In addition to > verbal comments the result could be rated from -5 to -5, feedback which > would affect subsequent search results. Situations like dead links or > redirects to porn sites would could also be reported by clicking on nix > link which would then cause the result to not be displayed. In the case of > popular sites, the page would be automatically cleared as often as needed, > or perhaps archived. > > Obviously we could generate more pages and entries in this manner than we > would want. Perhaps only searches by signed in users would result in > generation of these pages. However any user of the service, signed in or > not could view such pages after a search and give feedback on the existing > content. > > Fred > > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/ecda606e/attachment.html From newsmarkie at googlemail.com Sat Feb 2 20:03:58 2008 From: newsmarkie at googlemail.com (Mark (Markie)) Date: Sat, 2 Feb 2008 20:03:58 +0000 Subject: [Search-l] Wikia Searching In-Reply-To: <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> Message-ID: yeah i think a popup, or hoverover box for giving these results/scores would be nice. this is kinda already done with the stars and a nice new shiny edit thingy for the minis, so i think it would be quite nice to have an extra bit for giving these back jer: possible to get a mockup of this? if not no worries, ill carry on dreaming :-) thanks mark On Feb 2, 2008 7:23 PM, Fred Bauder wrote: > > On Feb 2, 2008 10:33 AM, Mark (Markie) > > wrote: > >> hmm these are all good ideas for user ranking. atm we only have star > >> ratings and im not too sure if these are actually used or stored > >> currently. what would people think about having various different > >> rankings like these. i think that big company (yes the one beginning > >> with G :-) currently have safe search, would we want that kinda > >> thing?? > >> > >> others thoughts? > >> > >> mark > >> > > > > Yes, I think a safe search is a good idea. And stars are good; though I > > still like the idea of being able to assign more meta information > > (tags!) to a result to help with disambiguation. > > > > Aerik > > As I point out in another post, there has to be a place, a page, where > this could happen, although one can imagine an option, "rate this result" > next to a hit. Or there could be a popup which appears after the user goes > to the hit, "rate this result". 
Just brainstorming here, if we get too > annoying, no one will put up with it, let alone rate anything. > > Fred Bauder > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/2523f9d4/attachment.html From natanael.l at gmail.com Sat Feb 2 20:39:07 2008 From: natanael.l at gmail.com (Natanael) Date: Sat, 2 Feb 2008 21:39:07 +0100 Subject: [Search-l] Wikia Searching In-Reply-To: <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> Message-ID: Then I guess that you'll search for "buy ecologic car" or something. On Feb 2, 2008 8:37 PM, Fred Bauder wrote: > I will make a more considered response in a minute, but if I were shopping > for an ecologic car the site I want is the commercial site. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/ef300d15/attachment.html

From aerik at thesylvans.com Sat Feb 2 20:51:10 2008
From: aerik at thesylvans.com (Aerik Sylvan)
Date: Sat, 2 Feb 2008 12:51:10 -0800
Subject: [Search-l] Wikia Searching
In-Reply-To:
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net>
Message-ID: <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com>

On Feb 2, 2008 12:39 PM, Natanael wrote:

> Then I guess that you'll search for "buy ecologic car" or something.
>
> On Feb 2, 2008 8:37 PM, Fred Bauder wrote:
>
> > I will make a more considered response in a minute, but if I were shopping
> > for an ecologic car the site I want is the commercial site.
> >

Yes, but that would only work if the commercial site was optimized for that
word, or if the algorithm is smart enough to search on equivalent words and
phrases. "Buy", for example. So, "buy ecologic car" on Google doesn't return
much of use for shopping, but I think the path for some human intervention
is pretty short:

- human-powered "similar words"
- search results with algorithmically generated keywords AND tags
- an algorithm that matches through the given words, then all similar
  words (perhaps in an outward spiral from the central term) against the
  ranked results having those keywords and tagwords.

The down side is that the tags (and similar words) are subject to spam, but
the project is going to have to deal with spam no matter what; it's the
nature of the beast.

Aerik

--
http://www.wikidweb.com - the Wiki Directory of the Web
http://tagthis.info - Hosted Tagging for your website!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/e7e7c492/attachment.html

From giandoscriba at tin.it Sat Feb 2 21:26:03 2008
From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO)
Date: Sat, 2 Feb 2008 22:26:03 +0100
Subject: [Search-l] Wikia Searching
References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net><845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com><50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com>
Message-ID: <005501c865e2$371a8bc0$ed01a8c0@pcuser>

http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello
------------------------------------------
----- Original Message -----
From: Aerik Sylvan
To: Mailing list for Search Wikia
Sent: Saturday, February 02, 2008 9:51 PM
Subject: Re: [Search-l] Wikia Searching

On Feb 2, 2008 12:39 PM, Natanael wrote:
> Then I guess that you'll search for "buy ecologic car" or something.
>
> On Feb 2, 2008 8:37 PM, Fred Bauder wrote:
>
> > I will make a more considered response in a minute, but if I were shopping
> > for an ecologic car the site I want is the commercial site.
> > > Yes, but that would only work of the commercial site was optimized for that word, or if the algorithm is smart enough to search on equivalent words and phrases. "Buy" for example. So, "buy ecologic car" on Google doesn't return much use for shopping, but I think the path for some human intervention is pretty short: a.. human powered "similar words" b.. search results with algorithmically generated keywords AND tags c.. and algorithm that matches through the given words, then all similar words (perhaps in an outward spiral from the central term) against the ranked results having those keywords and tagwords. The down side is the tags (and similar words) are subject to spam, but the project is going to have to deal with spam no matter what; it's the nature of the beast. Aerik -- http://www.wikidweb.com - the Wiki Directory of the Web http://tagthis.info - Hosted Tagging for your website! ------------------------------------------------------------------------------ _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/94f255e8/attachment.html From giandoscriba at tin.it Sat Feb 2 21:26:50 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:26:50 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net><845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com><50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> Message-ID: <00bc01c865e2$52becf30$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: Natanael To: fredbaud at fairpoint.net ; Mailing list for Search Wikia Sent: Saturday, February 02, 2008 9:39 PM Subject: Re: [Search-l] Wikia Searching Then I guess that you'll search for "buy ecologic car" or something. On Feb 2, 2008 8:37 PM, Fred Bauder wrote: I will make a more considered response in a minute, but if I were shopping for an ecologic car the site I want is the commercial site. ------------------------------------------------------------------------------ _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/25108076/attachment.html From giandoscriba at tin.it Sat Feb 2 21:27:17 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:27:17 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> Message-ID: <00f501c865e2$63577770$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: "Fred Bauder" To: Sent: Saturday, February 02, 2008 8:23 PM Subject: Re: [Search-l] Wikia Searching >> On Feb 2, 2008 10:33 AM, Mark (Markie) >> wrote: >>> hmm these are all good ideas for user ranking. atm we only have star >>> ratings and im not too sure if these are actually used or stored >>> currently. what would people think about having various different >>> rankings like these. i think that big company (yes the one beginning >>> with G :-) currently have safe search, would we want that kinda >>> thing?? >>> >>> others thoughts? >>> >>> mark >>> >> >> Yes, I think a safe search is a good idea. And stars are good; though I >> still like the idea of being able to assign more meta information >> (tags!) to a result to help with disambiguation. >> >> Aerik > > As I point out in another post, there has to be a place, a page, where > this could happen, although one can imagine an option, "rate this result" > next to a hit. Or there could be a popup which appears after the user goes > to the hit, "rate this result". Just brainstorming here, if we get too > annoying, no one will put up with it, let alone rate anything. 
> > Fred Bauder > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l From giandoscriba at tin.it Sat Feb 2 21:27:06 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:27:06 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net><845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> Message-ID: <00e101c865e2$5c58d900$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: "Fred Bauder" To: Sent: Saturday, February 02, 2008 8:37 PM Subject: Re: [Search-l] Wikia Searching >I will make a more considered response in a minute, but if I were shopping > for an ecologic car the site I want is the commercial site. > > Fred > >>> >>> >As I point out in another post, there has to be a place, a page, >>> where this could happen, although one can imagine an option, "rate >>> this result" next to a hit. >> >> I like this but think that I will also like a way to rate the resultS >> (with a S). >> Imagine you are looking for ecologic car. >> If the result are : >> - Commercial web site selling car and saying their cars are more >> ecologic than other >> - Forum on ecologic car >> - Web site against ecology >> - Blog on ecologic car. >> >> Then It's good to rate (with star) like this >> *- Commercial web site selling car and saying their cars are more >> ecologic than other >> ****- Forum on ecologic car >> - Web site against ecology >> **** - Blog on ecologic car. 
>> >> >> But could be more interesting (how ... i don't know :-() to rate he >> result In this case, result isn't good because 2 links aren't good and >> the first is not a good one. >> In fact If people can rate like this (1- sorting result good are first >> 2- put star) it could be better => >> **** - Blog on ecologic car. >> ****- Forum on ecologic car >> *- Commercial web site selling car and saying their cars are more >> ecologic than other >> - Web site against ecology >> >> Do you understand what I mean ? >> Reuslt are not independant and it's about rating the whole result and >> not the individual links. >> >> (sorry but As I don't speak english, I put example). >> >> >> David > > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l From giandoscriba at tin.it Sat Feb 2 21:27:11 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:27:11 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> Message-ID: <00f001c865e2$5f4327b0$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: David TREBOSC To: Mailing list for Search Wikia Sent: Saturday, February 02, 2008 8:34 PM Subject: Re: [Search-l] Wikia Searching >As I point out in another post, there has to be a place, a page, where >this could happen, although one can imagine an option, "rate this result" >next to a hit. I like this but think that I will also like a way to rate the resultS (with a S). 
Imagine you are looking for an ecologic car. If the results are:
- Commercial web site selling cars and saying their cars are more ecologic than others
- Forum on ecologic cars
- Web site against ecology
- Blog on ecologic cars

Then it's good to rate (with stars) like this:
* - Commercial web site selling cars and saying their cars are more ecologic than others
**** - Forum on ecologic cars
- Web site against ecology
**** - Blog on ecologic cars

But it could be more interesting (how ... I don't know :-() to rate the result as a whole. In this case, the result isn't good because 2 links aren't good and the first is not a good one. In fact, if people could rate like this (1- sort the results so the good ones come first, 2- put stars) it could be better =>
**** - Blog on ecologic cars
**** - Forum on ecologic cars
* - Commercial web site selling cars and saying their cars are more ecologic than others
- Web site against ecology

Do you understand what I mean? Results are not independent, and it's about rating the whole result and not the individual links.

(Sorry, but as I don't speak English, I put an example.)

David
------------------------------------------------------------------------------
_______________________________________________
Wikia Search mailing list
http://alpha.search.wikia.com/
Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/0422031d/attachment.html From giandoscriba at tin.it Sat Feb 2 21:26:34 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:26:34 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net><845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com><50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com> Message-ID: <008f01c865e2$4951efe0$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: Aerik Sylvan To: Mailing list for Search Wikia Sent: Saturday, February 02, 2008 9:51 PM Subject: Re: [Search-l] Wikia Searching On Feb 2, 2008 12:39 PM, Natanael wrote: > Then I guess that you'll search for "buy ecologic car" or something. > > > > On Feb 2, 2008 8:37 PM, Fred Bauder wrote: > > > I will make a more considered response in a minute, but if I were shopping > > for an ecologic car the site I want is the commercial site. > > > Yes, but that would only work of the commercial site was optimized for that word, or if the algorithm is smart enough to search on equivalent words and phrases. "Buy" for example. So, "buy ecologic car" on Google doesn't return much use for shopping, but I think the path for some human intervention is pretty short: a.. human powered "similar words" b.. search results with algorithmically generated keywords AND tags c.. and algorithm that matches through the given words, then all similar words (perhaps in an outward spiral from the central term) against the ranked results having those keywords and tagwords. 
The down side is the tags (and similar words) are subject to spam, but the project is going to have to deal with spam no matter what; it's the nature of the beast. Aerik -- http://www.wikidweb.com - the Wiki Directory of the Web http://tagthis.info - Hosted Tagging for your website! ------------------------------------------------------------------------------ _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/7fb3b59e/attachment.html From peter.burden at gmail.com Sat Feb 2 21:32:43 2008 From: peter.burden at gmail.com (Peter Burden) Date: Sat, 2 Feb 2008 21:32:43 +0000 Subject: [Search-l] Wikia Searching In-Reply-To: <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com> Message-ID: On 02/02/2008, Aerik Sylvan wrote: > > > > On Feb 2, 2008 12:39 PM, Natanael wrote: > > Then I guess that you'll search for "buy ecologic car" or something. > > > > > > > > On Feb 2, 2008 8:37 PM, Fred Bauder wrote: > > > > > I will make a more considered response in a minute, but if I were > shopping > > > for an ecologic car the site I want is the commercial site. > > > > > > > Yes, but that would only work of the commercial site was optimized for > that word, or if the algorithm is smart enough to search on equivalent words > and phrases. "Buy" for example. 
So, "buy ecologic car" on Google doesn't > return much use for shopping, but I think the path for some human > intervention is pretty short: > > > - human powered "similar words" > - search results with algorithmically generated keywords AND tags > - an algorithm that matches through the given words, then all > similar words (perhaps in an outward spiral from the central term) against > the ranked results having those keywords and tagwords. > > > The down side is the tags (and similar words) are subject to spam, but the > project is going to have to deal with spam no matter what; it's the nature > of the beast. One point that comes out of this is that the pages I want to see most depend on both the query I'm making and what I want to do. For example, I have a Toshiba A60 laptop (runs Fedora 8 quite nicely). If I want technical information about it, that's one set of pages; if I want to buy spares, that's a different set of pages. I suspect that pages of these two types will have measurable structural differences. The implication of this example is that a considerable amount of effort needs to be devoted to the syntactical analysis of documents; the rich set of metadata this makes available would then be useful meat for the algorithms Aerik has in mind. Does anybody know what metadata Wikia Search collects about documents? > Aerik > > -- > http://www.wikidweb.com - the Wiki Directory of the Web > http://tagthis.info - Hosted Tagging for your website! > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/6a1000ad/attachment.html From giandoscriba at tin.it Sat Feb 2 21:27:22 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:27:22 +0100 Subject: [Search-l] Mechanism for Feedback on Search Results, was Re: Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <49968.66.243.196.131.1201979796.squirrel@webx1.neonova.net> Message-ID: <00fa01c865e2$65db1510$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: "Fred Bauder" To: Sent: Saturday, February 02, 2008 8:16 PM Subject: [Search-l] Mechanism for Feedback on Search Results,was Re: Wikia Searching >> First off, I'm terrible at using these mailing systems, they confuse me >> and therefor I hate them. Anyway, I'm glad Wikia Search has been >> founded. I had an idea just like this (wikipedia-like search engine), >> and was thinking of programming it. Instead, I just found this I would >> like to add input (my own ideas) to be considered. I thought I'd share >> some of my original ideas, instead of let them go to waste: >> >> + Have a link that you want to have spidered? Do you consider something >> to be spam? Flag something for the adult section? And so on. >> + Discussion pages will be available to discuss this stuff, such as if >> the site has adult images and should be flagged. >> + Link website to a wikipedia entry if one exists. >> + Search .mp3's, etc,. >> >> Good luck. >> - Chris > > Chris's ideas seem to assume we would have a way of generating user > feedback regarding individual search results. This could be the search > namespace (policy pages there would be moved to a new policy namespace). 
> Each search which produced a hit from a url would automatically generate > an entry on the search page for that url which would consist of the page > which constituted the hit and what search term was used. To give an > example, if I search for "Kiss" + "band" the top hit today on Google is > http://en.wikipedia.org/wiki/Kiss_(band) So on page Search:en.wikipedia.en > an entry would be made: http://en.wikipedia.org/wiki/Kiss_(band) Hit #1 > "kiss" + "band". This page would be editable, allowing feedback by users > regarding the utility and appropriateness of that result. In addition to > verbal comments the result could be rated from -5 to +5, feedback which > would affect subsequent search results. Situations like dead links or > redirects to porn sites could also be reported by clicking on a nix > link which would then cause the result to not be displayed. In the case of > popular sites, the page would be automatically cleared as often as needed, > or perhaps archived. > > Obviously we could generate more pages and entries in this manner than we > would want. Perhaps only searches by signed in users would result in > generation of these pages. However, any user of the service, signed in or > not, could view such pages after a search and give feedback on the existing > content.
> > Fred > > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l From giandoscriba at tin.it Sat Feb 2 21:26:59 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:26:59 +0100 Subject: [Search-l] Mechanism for Feedback on Search Results, was Re: Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><49968.66.243.196.131.1201979796.squirrel@webx1.neonova.net> Message-ID: <00dc01c865e2$585ec800$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: Mark (Markie) To: fredbaud at fairpoint.net ; Mailing list for Search Wikia Sent: Saturday, February 02, 2008 9:00 PM Subject: Re: [Search-l] Mechanism for Feedback on Search Results,was Re: Wikia Searching yup, im liking whats said below. great minds really do think alike :-) regards mark On Feb 2, 2008 7:16 PM, Fred Bauder wrote: > First off, I'm terrible at using these mailing systems, they confuse me > and therefor I hate them. Anyway, I'm glad Wikia Search has been > founded. I had an idea just like this (wikipedia-like search engine), > and was thinking of programming it. Instead, I just found this I would > like to add input (my own ideas) to be considered. I thought I'd share > some of my original ideas, instead of let them go to waste: > > + Have a link that you want to have spidered? Do you consider something > to be spam? Flag something for the adult section? And so on. > + Discussion pages will be available to discuss this stuff, such as if > the site has adult images and should be flagged. > + Link website to a wikipedia entry if one exists. > + Search .mp3's, etc,. > > Good luck. 
> - Chris Chris's ideas seem to assume we would have a way of generating user feedback regarding individual search results. This could be the search namespace (policy pages there would be moved to a new policy namespace). Each search which produced a hit from a url would automatically generate an entry on the search page for that url which would consist of the page which constituted the hit and what search term was used. To give an example, if I search for "Kiss" + "band" the top hit today on Google is http://en.wikipedia.org/wiki/Kiss_(band) So on page Search:en.wikipedia.en an entry would be made: http://en.wikipedia.org/wiki/Kiss_(band) Hit #1 "kiss" + "band". This page would be editable, allowing feedback by users regarding the utility and appropriateness of that result. In addition to verbal comments the result could be rated from -5 to +5, feedback which would affect subsequent search results. Situations like dead links or redirects to porn sites could also be reported by clicking on a nix link which would then cause the result to not be displayed. In the case of popular sites, the page would be automatically cleared as often as needed, or perhaps archived. Obviously we could generate more pages and entries in this manner than we would want. Perhaps only searches by signed in users would result in generation of these pages. However, any user of the service, signed in or not, could view such pages after a search and give feedback on the existing content.
Fred _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l ------------------------------------------------------------------------------ _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/90f64291/attachment.html From giandoscriba at tin.it Sat Feb 2 21:26:56 2008 From: giandoscriba at tin.it (GIAN DOMENICO MAZZOCATO) Date: Sat, 2 Feb 2008 22:26:56 +0100 Subject: [Search-l] Wikia Searching References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com><355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com><49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> Message-ID: <00cd01c865e2$5654a020$ed01a8c0@pcuser> http://www.giandomenicomazzocato.it/ SITO DELLO SCRITTORE GIAN DOMENICO MAZZOCATO giandoscriba at tin.it webmaster Nicola Novello ------------------------------------------ ----- Original Message ----- From: Mark (Markie) To: fredbaud at fairpoint.net ; Mailing list for Search Wikia Sent: Saturday, February 02, 2008 9:03 PM Subject: Re: [Search-l] Wikia Searching yeah i think a popup, or hoverover box for giving these results/scores would be nice. this is kinda already done with the stars and a nice new shiny edit thingy for the minis, so i think it would be quite nice to have an extra bit for giving these back jer: possible to get a mockup of this? if not no worries, ill carry on dreaming :-) thanks mark On Feb 2, 2008 7:23 PM, Fred Bauder wrote: > On Feb 2, 2008 10:33 AM, Mark (Markie) > wrote: >> hmm these are all good ideas for user ranking. 
atm we only have star >> ratings and im not too sure if these are actually used or stored >> currently. what would people think about having various different >> rankings like these. i think that big company (yes the one beginning >> with G :-) currently have safe search, would we want that kinda >> thing?? >> >> others thoughts? >> >> mark >> > > Yes, I think a safe search is a good idea. And stars are good; though I > still like the idea of being able to assign more meta information > (tags!) to a result to help with disambiguation. > > Aerik As I point out in another post, there has to be a place, a page, where this could happen, although one can imagine an option, "rate this result" next to a hit. Or there could be a popup which appears after the user goes to the hit, "rate this result". Just brainstorming here, if we get too annoying, no one will put up with it, let alone rate anything. Fred Bauder _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l ------------------------------------------------------------------------------ _______________________________________________ Wikia Search mailing list http://alpha.search.wikia.com/ Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/b8a33bfb/attachment.html From newsmarkie at googlemail.com Sat Feb 2 21:35:46 2008 From: newsmarkie at googlemail.com (Mark (Markie)) Date: Sat, 2 Feb 2008 21:35:46 +0000 Subject: [Search-l] Wikia Searching In-Reply-To: <005501c865e2$371a8bc0$ed01a8c0@pcuser> References: <496053cd0802020942vc2c1d19ld9e4e61ad85f9325@mail.gmail.com> <355a36af0802021056j59dcee95n5d76cc7b1e12baca@mail.gmail.com> <49992.66.243.196.131.1201980198.squirrel@webx1.neonova.net> <845ce57b0802021134oe2510e4r45e2324da5163afe@mail.gmail.com> <50119.66.243.196.131.1201981055.squirrel@webx1.neonova.net> <355a36af0802021251l5c808a51ue0c6de501a5cadf3@mail.gmail.com> <005501c865e2$371a8bc0$ed01a8c0@pcuser> Message-ID: wahey, congratulations, your prize?? yes that's right ladies and gentlemen - Moderation :-) apologies all mark On Feb 2, 2008 9:26 PM, GIAN DOMENICO MAZZOCATO wrote: > > http://www.giandomenicomazzocato.it/ > SITO DELLO SCRITTORE > GIAN DOMENICO MAZZOCATO > giandoscriba at tin.it > webmaster Nicola Novello > ------------------------------------------ > > ----- Original Message ----- > *From:* Aerik Sylvan > *To:* Mailing list for Search Wikia > *Sent:* Saturday, February 02, 2008 9:51 PM > *Subject:* Re: [Search-l] Wikia Searching > > > > On Feb 2, 2008 12:39 PM, Natanael wrote: > > Then I guess that you'll search for "buy ecologic car" or something. > > > > > > > > On Feb 2, 2008 8:37 PM, Fred Bauder wrote: > > > > > I will make a more considered response in a minute, but if I were > shopping > > > for an ecologic car the site I want is the commercial site. > > > > > > > Yes, but that would only work of the commercial site was optimized for > that word, or if the algorithm is smart enough to search on equivalent words > and phrases. "Buy" for example. 
So, "buy ecologic car" on Google doesn't > return much use for shopping, but I think the path for some human > intervention is pretty short: > > > - human powered "similar words" > - search results with algorithmically generated keywords AND tags > - and algorithm that matches through the given words, then all > similar words (perhaps in an outward spiral from the central term) against > the ranked results having those keywords and tagwords. > > > The down side is the tags (and similar words) are subject to spam, but the > project is going to have to deal with spam no matter what; it's the nature > of the beast. > > Aerik > > -- > http://www.wikidweb.com - the Wiki Directory of the Web > http://tagthis.info - Hosted Tagging for your website! > > ------------------------------ > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.wikia.com/pipermail/search-l/attachments/20080202/67022ee2/attachment.html From balinny at gmail.com Sat Feb 2 23:28:04 2008 From: balinny at gmail.com (Balinny) Date: Sun, 03 Feb 2008 00:28:04 +0100 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> Message-ID: <47A4FC84.7030803@gmail.com> Tom Wright wrote: > The motivation behind providing links within mini-articles is that > they can provide: > > (i) Better links. > (ii) Information describing links (This is absent from search results.) 
> (iii) Links that might be lower down the search order but are still important. > (Example > i. when looking up programming related terms one often doesn't get > links to documentation because of the profusion of programming > discussion forums. > ii. When looking up books titles links to the full text of the book > (if it is out of copyright) are very useful - but won't, as a rule, > appear in searches. > ) > (iv) Links to things that mightn't appear in the search results but > are very important. > (Examples: Javascript is also known as ECMAscript - you won't get > links to ecmascript when searching for javascript - but ecmascript > provides the only formal documentation. Also searching for web related > terms) > (v) A way for people who have found a the necessary link after much > searching to store their result somewhere - thereby preventing the > inefficiencies of other people having to do exactly the same work. > IMHO we have a problem in that mini articles is the only way we can currently influence search results. "Better links" or "Links too low" shouldn't be on mini articles. They should be got right by the search engine. This is the only way it can scale On the other hand, v, disambiguations, etc. are suitable for miniarticles. Maybe we should allow the user to "reorder this search result". But using that isn't easy either... From tat.wright at googlemail.com Sun Feb 3 14:05:11 2008 From: tat.wright at googlemail.com (Tom Wright) Date: Sun, 3 Feb 2008 14:05:11 +0000 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: <47A4FC84.7030803@gmail.com> References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> <47A4FC84.7030803@gmail.com> Message-ID: <2813c4970802030605h71ae503gb2c31ef840192c16@mail.gmail.com> Certainly it's a good idea to add features for ranking results. 
I feel that an argument why the absence of links in mini-articles is better than their presence needs to be clearly enunciated before one removes their possible benefits.

Is your argument that presenting links in mini-articles will detract from their other purpose of disambiguation - hence the possible roles of mini-articles should be restricted to only providing disambiguation? This may well be a reasonable argument.

What is it that is wrong with having links presented in both mini-articles and the main search results? Plausible arguments might include:
(i) Irritation of having to look in two places for links.
(ii) Damaging duplication of effort between modifying search results and creating mini-article links.
(iii) Causes links to be created rather than disambiguation.
(iv) Providing a disambiguation to a suitable term will almost always provide a suitable link as the first result - requiring just one more click. Hence you can have "links without external links".
(v) Risks of spam.
(vi) Giving an individual user the power to place a link at the top of search results is excessive and likely to cause conflict.
(vii) The space of search terms is far too large to police.

But as to whether these apply, I am not sure. I don't see why other mechanisms are any more scalable than mini-articles - though they might be fundamentally better.

Sorry, I'm not trying to be argumentative, though it might seem like I'm doing a good impression...

As an example of when links provided with extra information and grouping can be useful, look at http://re.search.wikia.com/search#britney%20spears versus the search results for britney spears on google - mentally try to create an order of links that will provide the same amount of information. (Sorry, couldn't find a more cultured example.)

Tom

On Feb 2, 2008 11:28 PM, Balinny wrote:
> Tom Wright wrote:
> > The motivation behind providing links within mini-articles is that
> > they can provide:
> >
> > (i) Better links.
> > (ii) Information describing links (This is absent from search results.) > > (iii) Links that might be lower down the search order but are still important. > > (Example > > i. when looking up programming related terms one often doesn't get > > links to documentation because of the profusion of programming > > discussion forums. > > ii. When looking up books titles links to the full text of the book > > (if it is out of copyright) are very useful - but won't, as a rule, > > appear in searches. > > ) > > (iv) Links to things that mightn't appear in the search results but > > are very important. > > (Examples: Javascript is also known as ECMAscript - you won't get > > links to ecmascript when searching for javascript - but ecmascript > > provides the only formal documentation. Also searching for web related > > terms) > > (v) A way for people who have found a the necessary link after much > > searching to store their result somewhere - thereby preventing the > > inefficiencies of other people having to do exactly the same work. > > > IMHO we have a problem in that mini articles is the only way we can > currently influence search results. > "Better links" or "Links too low" shouldn't be on mini articles. They > should be got right by the search engine. > This is the only way it can scale > On the other hand, v, disambiguations, etc. are suitable for miniarticles. > Maybe we should allow the user to "reorder this search result". But > using that isn't easy either... 
>
> _______________________________________________
> Wikia Search mailing list
> http://alpha.search.wikia.com/
> Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l
>

From peter.burden at gmail.com Sun Feb 3 17:13:24 2008
From: peter.burden at gmail.com (Peter Burden)
Date: Sun, 3 Feb 2008 17:13:24 +0000
Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia
In-Reply-To: <2813c4970802030605h71ae503gb2c31ef840192c16@mail.gmail.com>
References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> <47A4FC84.7030803@gmail.com> <2813c4970802030605h71ae503gb2c31ef840192c16@mail.gmail.com>
Message-ID:

On 03/02/2008, Tom Wright wrote:
>
> Certainly it's a good idea to add features for ranking results.
>
> I feel that an argument for why the absence of links in mini-articles
> is better than their presence needs to be clearly enunciated
> before one removes their possible benefits.
>
> Is your argument that presenting links in mini-articles will detract
> from their other purpose of disambiguation - hence the possible roles
> of mini-articles should be restricted to only providing
> disambiguation? This may well be a reasonable argument.
>
> What is wrong with having links presented in both
> mini-articles and the main search results? Plausible arguments might
> include:
> (i) Irritation of having to look in two places for links.
> (ii) Damaging duplication of effort between modifying search results
> and creating mini-article links.
> (iii) Causes links to be created rather than disambiguation.
> (iv) Providing a disambiguation to a suitable term will almost always
> provide a suitable link as the first result - requiring just one more
> click. Hence you can have "links without external links".
> (v) Risks of spam.
> (vi) Giving an individual user the power to place a link at the top of
> search results is excessive and likely to cause conflict.
> (vii) The space of search terms is far too large to police. > > But as to whether these apply, I am not sure. Most of these seem plausible. The final 3 (v) - (vii) are by far the most significant. A mini-article with links would look, to many people, rather like the sponsored results that appear at the top of Google result lists and people would tend to interpret them the same way. The risks of mis-use by financially motivated members of the community are substantial and are probably best avoided. A link in a mini-article could easily be an excuse for not providing useful information. It is much quicker to read one sentence of informative text than to click on a link and wait for some media intensive irrelevancy to download. As a sole exception, I'd like to see mini-articles including a link to the relevant Wikipedia article. Brand-wise this should be acceptable. Another point I haven't really seen discussed is the relationship between mini-articles and search queries. For example if I do a search for "York" then I am NOT interested in mini-articles and pages about "New York". To resolve this problem with mini-articles I'd suggest that there should be an EXACT match between mini-article titles and queries. Incidentally the York search on Search Wikia does, indeed, give lots of pages about New York, whereas Google gets it right. Not quite sure how Google does this - conceivably it looks at my IP address and decides (correctly) that I'm in the UK and delivers results accordingly. The Search Wikia team need to think about this. These comments have been about mini-articles as information rather than disambiguation. To provide a good user experience disambiguation should provide an immediate hook to enable the user to repeat the query focussing just on pages that match the selected disambiguation. To make this work requires semantic tagging of both pages and queries. I think this is doable and would make Search Wikia stand out. 
I'll post some notes on how I think this would work soon.

> I don't see why other mechanisms are any more scalable than
> mini-articles - though they might be fundamentally better.
>
> Sorry, I'm not trying to be argumentative, though it might seem like
> I'm doing a good impression...

It's a discussion, not an argument ;-) Your ideas help me refine and expand my ideas, and I hope they help you and anybody else reading this in the same way.

> As an example of when the links provided with extra information and
> grouping can be useful look at
> http://re.search.wikia.com/search#britney%20spears versus the search
> results for britney spears on google - mentally try to create an order
> of links that will provide the same amount of information. (Sorry,
> couldn't find a more cultured example.)
>
> Tom
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080203/d60a9151/attachment.html

From balinny at gmail.com Sun Feb 3 21:00:18 2008
From: balinny at gmail.com (Balinny)
Date: Sun, 03 Feb 2008 22:00:18 +0100
Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia
In-Reply-To:
References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> <47A4FC84.7030803@gmail.com> <2813c4970802030605h71ae503gb2c31ef840192c16@mail.gmail.com>
Message-ID: <47A62B62.8050706@gmail.com>

Peter Burden wrote:
> Most of these seem plausible. The final 3 (v) - (vii) are by far the
> most significant. A mini-article with links would look, to many
> people, rather like the sponsored results that appear at the top
> of Google result lists and people would tend to interpret them the
> same way. The risks of mis-use by financially motivated members
> of the community are substantial and are probably best avoided.

Agree.

> A link in a mini-article could easily be an excuse for not providing
> useful information.
> It is much quicker to read one sentence of
> informative text than to click on a link and wait for some media
> intensive irrelevancy to download.
>
> As a sole exception, I'd like to see mini-articles including a link
> to the relevant Wikipedia article. Brand-wise this should be
> acceptable.

In that case the Wikipedia result should be placed near the top automatically, as Google does (not necessarily at the very top, only "biased to appear before most results").

> Another point I haven't really seen discussed is the relationship
> between mini-articles and search queries. For example if I do
> a search for "York" then I am NOT interested in mini-articles
> and pages about "New York". To resolve this problem with
> mini-articles I'd suggest that there should be an EXACT match
> between mini-article titles and queries.

It's currently an exact match. The problem I see is that it's too exact, even in capitalization! That needs to be relaxed, and probably for multiword searches too: something like lowercasing the terms and sorting them alphabetically before checking whether the mini-article exists.

> Incidentally the York search on Search Wikia does, indeed,
> give lots of pages about New York, whereas Google gets it
> right. Not quite sure how Google does this - conceivably it
> looks at my IP address and decides (correctly) that I'm in the
> UK and delivers results accordingly. The Search Wikia team
> need to think about this.

I doubt it's done like that. It's probably detecting "New York" as a single word instead of two terms, or something similar. Google stressed from the beginning the importance of words placed near the term.
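
A minimal sketch of the relaxed matching suggested above (lowercase the query, sort its terms, then check for a mini-article), written in Python. The names `miniarticle_key` and `find_miniarticle` and the in-memory `MINI_ARTICLES` table are hypothetical and only for illustration; nothing here reflects how Search Wikia actually stores or matches mini-articles.

```python
def miniarticle_key(query: str) -> str:
    """Build a lookup key that ignores case, term order and extra whitespace."""
    terms = query.lower().split()
    return " ".join(sorted(terms))


# Hypothetical in-memory store of mini-articles, keyed by normalised title.
MINI_ARTICLES = {
    miniarticle_key(title): title
    for title in ("York", "New York", "Britney Spears")
}


def find_miniarticle(query: str):
    """Return the stored title matching the query, or None if there is none."""
    return MINI_ARTICLES.get(miniarticle_key(query))


print(find_miniarticle("york"))            # York
print(find_miniarticle("spears BRITNEY"))  # Britney Spears
print(find_miniarticle("york new"))        # New York
```

Because the key is built from the whole query, a search for "York" still does not pick up the "New York" mini-article, so the exact-match behaviour is kept while case and word order are ignored. One trade-off worth noting: sorting the terms also conflates queries whose meaning depends on word order.
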
From jmcc at hackwatch.com Sun Feb 3 21:19:26 2008
From: jmcc at hackwatch.com (John McCormac)
Date: Sun, 03 Feb 2008 21:19:26 +0000
Subject: [Search-l] Google's comments on the MSFT/YHOO merger
Message-ID: <47A62FDE.6050008@hackwatch.com>

http://googleblog.blogspot.com/2008/02/yahoo-and-future-of-internet.html

Regards...jmcc
--
******************************************************
John McCormac  * e-mail: jmcc at whoisireland.com
MC2            * voice: +353-51-873640
22 Viewmount   * web: http://www.whoisireland.com/
Waterford      * blog: http://blog.whoisireland.com
Ireland        * Irish Domain Stats & Market Research
******************************************************

From alisirag at gmail.com Mon Feb 4 10:52:08 2008
From: alisirag at gmail.com (A. Sirag)
Date: Mon, 4 Feb 2008 11:52:08 +0100
Subject: [Search-l] Google's comments on the MSFT/YHOO merger
In-Reply-To: <47A62FDE.6050008@hackwatch.com>
References: <47A62FDE.6050008@hackwatch.com>
Message-ID:

Hi,
and this is Microsoft's response to the Google blog post.

On Feb 3, 2008 10:19 PM, John McCormac wrote:
> http://googleblog.blogspot.com/2008/02/yahoo-and-future-of-internet.html
>
> Regards...jmcc
> --
> ******************************************************
> John McCormac  * e-mail: jmcc at whoisireland.com
> MC2            * voice: +353-51-873640
> 22 Viewmount   * web: http://www.whoisireland.com/
> Waterford      * blog: http://blog.whoisireland.com
> Ireland        * Irish Domain Stats & Market Research
> ******************************************************
> _______________________________________________
> Wikia Search mailing list
> http://alpha.search.wikia.com/
> Change options or unsubscribe:
> http://lists.wikia.com/mailman/options/search-l
>

--
Ali Sirag
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080204/99f82f5d/attachment.html From peter.burden at gmail.com Mon Feb 4 11:30:32 2008 From: peter.burden at gmail.com (Peter Burden) Date: Mon, 4 Feb 2008 11:30:32 +0000 Subject: [Search-l] Google's comments on the MSFT/YHOO merger In-Reply-To: References: <47A62FDE.6050008@hackwatch.com> Message-ID: On 04/02/2008, A. Sirag wrote: It says "Microsoft is committed to openness, innovation, and the protection of privacy on the Internet." - now that's nice to know. On Feb 3, 2008 10:19 PM, John McCormac wrote: > > > http://googleblog.blogspot.com/2008/02/yahoo-and-future-of-internet.html > > > > Regards...jmcc > > -- > > ****************************************************** > > John McCormac * e-mail: jmcc at whoisireland.com > > MC2 * voice: +353-51-873640 > > 22 Viewmount * web: http://www.whoisireland.com/ > > Waterford * blog: http://blog.whoisireland.com > > Ireland * Irish Domain Stats & Market Research > > ****************************************************** > > _______________________________________________ > > Wikia Search mailing list > > http://alpha.search.wikia.com/ > > Change options or unsubscribe: > > http://lists.wikia.com/mailman/options/search-l > > > > > > -- > > Ali Sirag > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080204/fa5a3d76/attachment.html From jaime.ag at gmail.com Mon Feb 4 11:41:55 2008 From: jaime.ag at gmail.com (Jaime Agudo) Date: Mon, 4 Feb 2008 12:41:55 +0100 Subject: [Search-l] Google's comments on the MSFT/YHOO merger In-Reply-To: References: <47A62FDE.6050008@hackwatch.com> Message-ID: <3f9af6db0802040341y46b0264ej1fc46c25bc08c7a4@mail.gmail.com> it is also nice to know that a heavyweight purchaser like Google is afraid of competitor's bids. thanks for the link On 2/4/08, Peter Burden wrote: > > > On 04/02/2008, A. Sirag wrote: > > It says "Microsoft is committed to openness, innovation, and the protection > of privacy on the Internet." - now that's nice to know. > > > > > > > On Feb 3, 2008 10:19 PM, John McCormac wrote: > > > > > > http://googleblog.blogspot.com/2008/02/yahoo-and-future-of-internet.html > > > > > > Regards...jmcc > > > -- > > > ****************************************************** > > > John McCormac * e-mail: jmcc at whoisireland.com > > > MC2 * voice: +353-51-873640 > > > 22 Viewmount * web: http://www.whoisireland.com/ > > > Waterford * blog: http://blog.whoisireland.com > > > Ireland * Irish Domain Stats & Market Research > > > ****************************************************** > > > _______________________________________________ > > > Wikia Search mailing list > > > http://alpha.search.wikia.com/ > > > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > > > > > > > > > -- > > > > Ali Sirag > > _______________________________________________ > > Wikia Search mailing list > > http://alpha.search.wikia.com/ > > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > > > > > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: > http://lists.wikia.com/mailman/options/search-l > From maxime at maxime.net.ru Mon Feb 4 12:54:02 
2008
From: maxime at maxime.net.ru (Maxim Zakharov)
Date: Mon, 4 Feb 2008 15:54:02 +0300
Subject: [Search-l] Google's comments on the MSFT/YHOO merger
In-Reply-To:
References: <47A62FDE.6050008@hackwatch.com>
Message-ID: <92e334420802040454l7f912b98maefa0e244030a847@mail.gmail.com>

Hi,

Excellent: Google talks about possible MS+Yahoo domination in email and instant messaging, while MS talks about Google's domination in search and contextual ads. It looks like a chat between the deaf and the dumb.

On 2/4/08, A. Sirag wrote:
>
> Hi
> and this is Microsoft's response to the Google blog post.
>
> On Feb 3, 2008 10:19 PM, John McCormac wrote:
>
> > http://googleblog.blogspot.com/2008/02/yahoo-and-future-of-internet.html
> >
> >

--
http://www.dataparksearch.org/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.wikia.com/pipermail/search-l/attachments/20080204/6930604b/attachment.html

From nigel at turbo10.com Mon Feb 4 13:50:58 2008
From: nigel at turbo10.com (nigel at turbo10.com)
Date: Mon, 4 Feb 2008 13:50:58 +0000 (GMT)
Subject: [Search-l] Google's comments on the MSFT/YHOO merger
In-Reply-To: <92e334420802040454l7f912b98maefa0e244030a847@mail.gmail.com>
References: <47A62FDE.6050008@hackwatch.com> <92e334420802040454l7f912b98maefa0e244030a847@mail.gmail.com>
Message-ID:

> Hi,
>
> Excellent: Google talks about possible MS+Yahoo domination in email and
> instant messaging, while MS talks about Google's domination in search and
> contextual ads. It looks like a chat between the deaf and the dumb.

Agreed! After reading their respective press releases I think the word "open" is going to start to become a swear word! ;-)

Seriously though, it's good to see the tectonic plates of search shifting. While they're busy trying to come up with a new mission statement (e.g., "Don't be Really Evil" or even "Openly Not Evil") new opportunities will fall through the cracks for the smaller fish. The downside is Google may streak further ahead.
Jason, what if they brand their new engine "Mahoo?" ... they may need to buy you out as well to avoid trade mark troubles. ;-) It's stuff like this that makes the industry fun. But then again I'm still hoping for Ask.com to unfreeze Jeeves[1]. Nige CEO Trexy Blaze Search Trails http://trexy.com [1] the death of a brand is a sad thing ... and we still need an anthropomorphic question answering interface ... http://blog.trexy.com/2006/03/goodbye-jeeves-may-force-be-with-you.html From jwales at wikia.com Mon Feb 4 19:50:24 2008 From: jwales at wikia.com (Jimmy Wales) Date: Tue, 05 Feb 2008 01:20:24 +0530 Subject: [Search-l] NYTimes.com: Eyes on Google, Microsoft Bids $44 Billion for Yahoo In-Reply-To: <47A47175.8010002@hackwatch.com> References: <20080202131225.4C4D61D40A2A@ut4.sjc.wikia-inc.com> <47A47175.8010002@hackwatch.com> Message-ID: <47A76C80.9050607@wikia.com> John McCormac wrote: > It certainly makes the attempt to gain search marketshare a bit harder. > Microsoft management and Yahoo search technology and expertise will be > quite a threat to Google and others. Hmm, I would argue the opposite. I am not yet convinced that the merger has any very interesting synergies to make the combination any more interesting than the separate pieces were. > Given Yahoo's concentration on the social web, the product of this > merger could be quite an opponent for any new search engine venture, > especially ones that involve a social element. It would also have > something that a lot of new ventures have to struggle to attain - users. Yes, the existing user base of Yahoo / Microsoft is large, but it is no larger after the merger than before. I think that the tie-up may be a boon for smaller competitors, as consumers seek out alternatives. 
--Jimbo From jwales at wikia.com Tue Feb 5 08:43:39 2008 From: jwales at wikia.com (Jimmy Wales) Date: Tue, 05 Feb 2008 03:43:39 -0500 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> Message-ID: <47A821BB.8080706@wikia.com> I agree with Tom here. Take a look at Mahalo for some examples of what mini-articles could do. It would be great if future iterations of the search algorithm mean that we end up removing things from the miniarticles because the algorithm is smart enough to get it right, but in the meantime, human edited results provide an interesting "target" to evaluate how clueful the algorithm is. Tom Wright wrote: > The motivation behind providing links within mini-articles is that > they can provide: > > (i) Better links. > (ii) Information describing links (This is absent from search results.) > (iii) Links that might be lower down the search order but are still important. > (Example > i. when looking up programming related terms one often doesn't get > links to documentation because of the profusion of programming > discussion forums. > ii. When looking up books titles links to the full text of the book > (if it is out of copyright) are very useful - but won't, as a rule, > appear in searches. > ) > (iv) Links to things that mightn't appear in the search results but > are very important. > (Examples: Javascript is also known as ECMAscript - you won't get > links to ecmascript when searching for javascript - but ecmascript > provides the only formal documentation. Also searching for web related > terms) > (v) A way for people who have found a the necessary link after much > searching to store their result somewhere - thereby preventing the > inefficiencies of other people having to do exactly the same work. 
> > Now certainly you can argue that all of the above can be done by: > (a) Providing suitable disambiguation. > (b) Adding extra features to search results. > (c) Adding the ability to search based on "semantics" rather than syntactics. > and all of this should be attempted. > > However the question is will this work in practice? > I think human generated content can always be better than > algorithmically generated content - assuming that the human doesn't > have mal-intent. (Because the humans use the algorithms first.) > > Why do you think having links within mini-articles is a bad idea? > The possible risks I can perceive are: > (i) Spam > (ii) Biased content > (iii) Detraction from the search results. > (iv) Distraction due to the links being of a poor quality. > > The question is how much can one control these risks, and do they > justify the potential benefits of having links. This is very much an > empirical rather than theoretic question... so I'm not quite sure how > to answer it. > > Tom > > > On Feb 2, 2008 10:25 AM, Mark (Markie) wrote: >> >> >> On Feb 2, 2008 12:21 AM, jer wrote: >>>>> So I'm suggesting the following to try and get this sorted. Go to >>>>> this page :- http://search.wikia.com/wiki/search:Mini_article/ >>>>> Policy_discussion >>>> I've been there and, frankly, it's far from obvious how to enter >>>> into a discussion there. >>>> I suggest the discussion continue on this mailing list. So to start >>>> the ball rolling :- >>> It is a bit confusing, a conversation in progress, but if anyone has >>> input they should just tack it onto the end and it'll get moved/ >>> cleaned up as the page evolves. A discussion here is of course great >>> too, and any points should/will get recorded into the policy >>> discussion page. >>> >>>> ... >>>> Here's some ideas on rules for mini-articles >>>> >>>> Mini-articles should be between 50 and 200 words long. >>>> Mini-articles must include a a tag of some sort indicating what >>>> language they are in. 
>>> Agree with both of these, I just included an i18n section on the page >>> for ideas on how to do the localization. >> >> i agree with this also. further thoughts on localisation, we could put some >> .js code in the mini:edit pages that say "click me if this article is in >> English, Deutsch etc etc" which would then insert the correctly formatted >> lang code holder. >> >>> >>> >>>> Mini-articles should not include images or hyper-links. >>> I think both are potentially useful but must have a careful policy >>> around them, small images and good hyperlinks could be very valuable >>> to a searcher. >> hmmm images maybe, if they are informative, i would say no to screenshots of >> the website though, links...well isnt that what search is for??... >> >> cheers >> >> mark >> >>> >>> Jer >>> >>> >>> >>> >>> _______________________________________________ >>> Wikia Search mailing list >>> http://alpha.search.wikia.com/ >>> Change options or unsubscribe: >> http://lists.wikia.com/mailman/options/search-l >> >> _______________________________________________ >> Wikia Search mailing list >> http://alpha.search.wikia.com/ >> Change options or unsubscribe: >> http://lists.wikia.com/mailman/options/search-l >> > _______________________________________________ > Wikia Search mailing list > http://alpha.search.wikia.com/ > Change options or unsubscribe: http://lists.wikia.com/mailman/options/search-l > From jwales at wikia.com Tue Feb 5 08:45:21 2008 From: jwales at wikia.com (Jimmy Wales) Date: Tue, 05 Feb 2008 03:45:21 -0500 Subject: [Search-l] Input wanted on Mini and main name spaces for Search Wikia In-Reply-To: <47A4FC84.7030803@gmail.com> References: <16D653D6-EFD5-4336-9A45-EA6DC20F8CA5@jabber.org> <2813c4970802020727t3db3cf74v70dbb8b24b462b22@mail.gmail.com> <47A4FC84.7030803@gmail.com> Message-ID: <47A82221.3010300@wikia.com> Balinny wrote: > IMHO we have a problem in that mini articles is the only way we can > currently influence search results. 
Yes, but this will pass soon... From jmcc at hackwatch.com Tue Feb 5 15:24:26 2008 From: jmcc at hackwatch.com (John McCormac) Date: Tue, 05 Feb 2008 15:24:26 +0000 Subject: [Search-l] NYTimes.com: Eyes on Google, Microsoft Bids $44 Billion for Yahoo In-Reply-To: <47A76C80.9050607@wikia.com> References: <20080202131225.4C4D61D40A2A@ut4.sjc.wikia-inc.com> <47A47175.8010002@hackwatch.com> <47A76C80.9050607@wikia.com> Message-ID: <47A87FAA.3000609@hackwatch.com> Jimmy Wales wrote: > John McCormac wrote: > >>It certainly makes the attempt to gain search marketshare a bit harder. >>Microsoft management and Yahoo search technology and expertise will be >>quite a threat to Google and others. > > > Hmm, I would argue the opposite. I am not yet convinced that the merger > has any very interesting synergies to make the combination any more > interesting than the separate pieces were. That's one way of looking at it Jimbo, Yahoo and Microsoft combined are, in social search terms, data rich. And as a merged operation, they have more social data than almost any other operation. While there are privacy concerns, it provides a lot of raw data for recommendations and search quality improvement. I think that Yahoo is already integrating del.icio.us into search results. > Yes, the existing user base of Yahoo / Microsoft is large, but it is no > larger after the merger than before. There may be an integration of the advertising platforms. The search is a problem - Microsoft's own search venture sucks because they really don't do search in the way that others do. Buying in some oth