[atlas-l] back to knuggets
jer
jeremie at jabber.org
Thu Oct 11 00:15:47 UTC 2007
>> So is a meta-token equivalent to a phrase? Are factories expected to
>> identify all phrases in a document?
>
> Right, a phrase. This is optional for a Factory, and just a great way
> to add value. A Factory only has to identify as many phrases as it
> can, or as it has the resources for.
Also I want to add that I believe there should be something along the
lines of a "roots" attribute on knuggets. This attribute would be a
tab-delimited list of strings, those being single words or phrases,
and they would also be entirely up to the Factory to decide on. Any
single root is intended to be essentially a "category" and must be
human readable/usable (Finance, Gaming, Iowa Agriculture, etc). This
attribute is mostly useful on the doc knuggets (since a single
document generally has the same roots for all of its content), but
could be used on individual knuggets if deemed appropriate by the
Factory.
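
To make the tab-delimited format concrete, here is a minimal sketch of how a
Factory might write a "roots" value and how a Collector might read it back.
The function names parse_roots and format_roots are hypothetical, and
trimming whitespace and skipping empty entries are assumptions, not part of
any agreed format.

```python
def parse_roots(value: str) -> list[str]:
    """Split a tab-delimited roots string into a list of root labels.

    Assumption: surrounding whitespace is trimmed and empty entries
    (e.g. from a trailing tab) are ignored.
    """
    roots = []
    for part in value.split("\t"):
        label = part.strip()
        if label:
            roots.append(label)
    return roots


def format_roots(roots: list[str]) -> str:
    """Join root labels into a single tab-delimited string.

    A root may contain spaces ("Iowa Agriculture") but not a tab,
    since the tab is the delimiter.
    """
    for label in roots:
        if "\t" in label:
            raise ValueError(f"root may not contain a tab: {label!r}")
    return "\t".join(roots)


if __name__ == "__main__":
    value = format_roots(["Finance", "Gaming", "Iowa Agriculture"])
    print(parse_roots(value))
```

Because the delimiter is a tab, multi-word roots such as "Iowa Agriculture"
need no quoting or escaping.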
A Collector can decide how or if it even wants to use the roots, such
as providing rank clustering based on overlapping roots. A Broker
can use this for end-user display as well.
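
As one illustration of clustering on overlapping roots, here is a rough
sketch of what a Collector could do with already-ranked results. The greedy
grouping rule, the case-insensitive comparison, and the name
cluster_by_roots are all assumptions for the sake of the example; a
Collector is free to do something entirely different, or nothing at all.

```python
def cluster_by_roots(results):
    """Group ranked results whose roots overlap.

    results: list of (doc_id, roots) tuples, already in rank order.
    Returns a list of clusters, each a list of doc_ids in rank order.

    Assumption: a result joins the first existing cluster it shares at
    least one root with (compared case-insensitively); otherwise it
    starts a new cluster. A result with no roots never overlaps anything.
    """
    clusters = []  # each entry: {"roots": set of str, "docs": list of ids}
    for doc_id, roots in results:
        root_set = {r.casefold() for r in roots}
        for cluster in clusters:
            if cluster["roots"] & root_set:
                cluster["docs"].append(doc_id)
                cluster["roots"] |= root_set
                break
        else:
            clusters.append({"roots": set(root_set), "docs": [doc_id]})
    return [c["docs"] for c in clusters]


if __name__ == "__main__":
    ranked = [
        ("doc-a", ["Finance"]),
        ("doc-b", ["Gaming"]),
        ("doc-c", ["finance", "Iowa Agriculture"]),
    ]
    print(cluster_by_roots(ranked))
```

The same root sets could just as easily be handed to a Broker unchanged for
display next to each result.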
I could see some standard classification schemes arising eventually,
but again this should be organic and not some top-down,
all-encompassing effort.
Jer