[Search-l] Related concepts, function words, content words

Linas Vepstas linasvepstas at gmail.com
Thu Aug 7 22:40:01 UTC 2008


2008/8/7 Jimmy Wales <jwales at wikia.com>:
> That's totally fascinating.
>
> This is one reason I am such a skeptic for the world seeing any major
> improvements in machine-generated search results anytime soon, whether using
> semantic technology or whatever.  Frankly, machines are still pretty stupid,
> and stupid even in cases where there is pretty obviously a HUGE amount of
> money to be made from having the machines "understand" things just a little
> bit.

Heh. Them's fightin' words. Note that there is money pouring into startups
for this (powerset, opencalais, etc.) which are then immediately snapped
up in buyouts (powerset, opencalais, etc.). The CPU demands are non-trivial:
I'm currently parsing the English Wikipedia, and it looks like it will take
2-4 CPU-years to finish, and that's just scratching the surface of the
data that can be mined from there.

Also, this area seems to be red-hot in research: one mailing list I'm on
gets posts for open academic positions roughly every week, including a
senior position at Stanford.  Perhaps I'm out of touch, but I'd never
seen anything like this before, certainly not when I was a young post-doc
looking for a job.

Of course, "anytime soon" are the operative words: 6 months, 2 years,
5 years, or 20 years are all "soon" from an appropriate perspective.

--linas


