<br><br><div class="gmail_quote">On Wed, May 28, 2008 at 1:15 AM, Rainer Blome <<a href="mailto:rainer.blome@gmx.de">rainer.blome@gmx.de</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Aerik Sylvan wrote:<br>
> [...] Mahalo [...] a search application.<br>
<br>The problem is that when they do have a page, they only<br>
show what's on that dedicated page, and no more. It's like Wikia Search<br>
would only show the mini article, once there is one. Effectively, the<br>
"search" part is dropped in those cases. By design, these cases are<br>
common, because Mahalo aims to cover the common searches. </blockquote><div><br>Exactly - it's just like looking at a dmoz category, but with a "search" interface instead of a "browse" interface and keywords instead of categories. Same problem too: as with the other point I was making, the application itself favors the entrenched players! Some resources may always be "best" for the average searcher, and therefore the best result to serve until personalized searches are the norm, but many other resources may become stale over time (think technology or medical research), or they are simply the "incumbent", blocking other very good resources from being served as top results.<br>
<br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
> [...] the data being built in Wikipedia (and similar<br>
> projects) is huge and is under-utilized. [...] My favorite possibility<br>
<div class="Ih2E3d">> is category intersections. A category in Wikipedia is essentially a tag<br>
> - someone has said that this chunk of information should be associated<br>
> with this concept.<br>
<br>
</div>Wikipedia embodies a semantic network. The links are sometimes not<br>
unambiguous, but I guess that effective automated use is possible<br>
(exploiting the "human computation" done there). Some are already trying<br>
it, just search for "semantic mining wikipedia" or "wikipedia link<br>
structure". The categories make the semantic network relatively<br>
explicit and therefore easier to mine, but mining should be possible<br>
even without them. And yes, it would be swell to have a search engine<br>
which guesses Wikipedia articles and categories and directly links to them.</blockquote><div> <br></div><div>I think there are a number of such tools, but third-party tools do not fulfill the whole promise. The Semantic MediaWiki guys have a great vision, but it has technical hurdles to overcome. Category Intersections is quite doable, and Roan has written a trunk backend for it - it sounds like the interface will need tweaking, and then it needs to be set up with Lucene for Wikipedia. But the main point is this: any software design needs to consider all (likely) use cases, and all outputs. <br>
<br>I'm sure we've all bumped into software that was shortsighted in its view of the necessary outputs, and the application then cannot support them because the architecture itself cannot (the database schema, for instance). In Wikipedia, the primary use case is users browsing or searching for articles in a fairly straightforward manner, i.e. a search for "Elvis". But another very powerful use case is searching for articles on intersecting concepts - e.g. "Americans" and "Rock and Roll Stars". A more pragmatic example is my search for video games. This is *tremendously* powerful, and this use case needs to be a consideration in the ongoing design of Wikipedia to facilitate the extraction of that data.<br>
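<br>A minimal sketch of what a category intersection amounts to, assuming a hypothetical in-memory mapping from category names to article titles. The data, the name CATEGORY_INDEX, and the function intersect_categories are illustrative only; this is not how MediaWiki or Lucene actually store or query categories.<br>
<pre>
```python
# Hypothetical toy data: category name mapped to a set of article titles.
# The memberships are illustrative, not taken from real Wikipedia data.
CATEGORY_INDEX = {
    "Americans": {"Elvis Presley", "Chuck Berry", "Mark Twain"},
    "Rock and Roll Stars": {"Elvis Presley", "Chuck Berry", "Cliff Richard"},
}


def intersect_categories(index, categories):
    """Return the sorted titles that belong to every named category."""
    if not categories:
        return []
    # A category missing from the index contributes an empty set,
    # so the overall intersection is empty as well.
    member_sets = [index.get(name, set()) for name in categories]
    return sorted(set.intersection(*member_sets))


if __name__ == "__main__":
    print(intersect_categories(CATEGORY_INDEX, ["Americans", "Rock and Roll Stars"]))
```
</pre>
Treating each category as a tag, the query is just a set intersection over the tagged articles; a real backend would presumably do the same thing against an index rather than in memory.<br>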
<br>Similar methods of using human-constructed metadata can be equally powerful, and are a lot more interesting, innovative, and ultimately useful than Mahalo. I know Jimmy sees this - he has made several attempts at harnessing it, including an early version of Wikia (when it was a search engine powered by tags and star ratings - the problem was that it had a small number of contributors and seems to have been spammed to death). I think there was promise in that early vision, but that a slightly different model is needed - the community approach to Wikia Search *feels* like the right direction, and I still think that tags (as an obvious human-generated datapoint) are an important part of the puzzle. I also think that some harder controls against spam are necessary, as the ratio of data to contributors is not like Wikipedia's, and the good will of contributors alone is not enough to stop the spam.<br>
<br>Best Regards,<br>Aerik<br><br></div></div>-- <br><a href="http://www.wikidweb.com">http://www.wikidweb.com</a> - the Wiki Directory of the Web<br><a href="http://tagthis.info">http://tagthis.info</a> - Hosted Tagging for your website!