[Search-l] more than just interoperability

Aerik Sylvan aerik at thesylvans.com
Fri Jun 1 06:36:21 UTC 2007


I've been thinking about something tangential to this.  One of the things
we've discussed is collecting a large amount of proactive human data - tags,
or something like them.  It would take a very large number of tags (or
whatever) to be really useful: ideally millions of websites, each tagged by
hundreds of people with at least several tags apiece.  So a dataset of
perhaps a billion records is easy to imagine.
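
As a rough sanity check on the "billion records" figure, here is a
back-of-envelope sketch in Python.  The specific inputs (1 to 5 million
sites, 100 to 300 taggers per site, 3 tags per tagger) are my own
illustrative assumptions standing in for "millions", "hundreds" and
"several", not measured numbers, and the function name is made up for this
example.

```python
def estimate_tag_records(websites, taggers_per_site, tags_per_tagger):
    """Rough count of (site, person, tag) records; one record per tag applied."""
    return websites * taggers_per_site * tags_per_tagger

# Assumed low and high ends for "millions", "hundreds" and "several".
low = estimate_tag_records(1_000_000, 100, 3)
high = estimate_tag_records(5_000_000, 300, 3)

print(f"{low:,} to {high:,} records")
```

With those assumed ranges, a billion records falls between the low and high
estimates, so the order of magnitude seems reasonable.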

But it's not easy to accumulate or process.  Processing it is a technical
hurdle that will be fun to tackle, but accumulating the data is a whole
other matter.

So, here it is:  Getting data from existing social bookmarking services may
be an option we should consider.  Think of it - aggregating data from
del.icio.us, StumbleUpon, etc.  Now, I can't imagine how we'd get Yahoo to
give us the data from del.icio.us, but maybe there are other providers who
would be willing to do this.  Or perhaps we look at paying them for it, at
least enough to cover their bandwidth and other overhead.

Anybody got any ideas around this type of thing?

Aerik