[Search-l] Related concepts, function words, content words
Linas Vepstas
linasvepstas at gmail.com
Mon Jul 28 18:56:09 UTC 2008
I just noticed something curious about Google's "related topics" function.
I'd been reading Gmail using the web browser, and there's always a list
of ads that seem to be keyed off of keywords in the email. Today, none
of the ads were keyed off of keywords ... instead, they were keyed off of
broad sentiment.
The actual email was from my rowing coach, bitching about how people
failed to show up for practice, and how that makes everyone late on the
water and changes the planned workout, etc. The ads were all about
employee-employer relations -- how to fire employees, how to file
workplace grievances, negotiating with unions, etc. Now, nowhere
in the email did it use the words "employee", "union", "grievance",
"fire", "discharge" -- but somehow Google perceived the overall negative
tone, and that it had to do with personal relationships. Its mistake was
to assume it was job-related. And yet -- none of the ads were for marriage
counseling or spousal abuse -- so it could tell that this was a more formal
setting -- it did not mistake it for a lover showing up late for a romantic
dinner (no "buy her flowers" ads), or missing out on camping with your
buddies (no "how to make friends" ads).
Particularly of note is that it missed the obvious sports nature of the email:
content words like "rowing", "water", "workout", "practice" and "boat" were in
the email, and should have given a strong positive signal ... and yet these were
overlooked, in favour of the much more vague non-content, functional
phrases like "failing to show up".
--linas