[atlas-l] cleaned up a few bits and a super alpha collector
Jeremie Miller
jeremie at jabber.org
Mon Jul 21 21:51:25 UTC 2008
I've made a few edits to http://search.wikia.com/wiki/Atlas to clean
up the knugget definition, based on things I'm learning as I continue to
prototype the first factory and collector. One refinement I've made
is really focusing the definition of a knugget on being intrinsic:
it only describes itself and not its larger context. A knugget
is an atomic entity and is only given context by some other knugget that
references it.
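To make the "intrinsic" idea concrete, here is a minimal sketch in Python. It is not the format on the wiki page: the `Knugget` class, its `id`, `data` and `refs` fields, and the `context_for` helper are all hypothetical names chosen for illustration. The only point it shows is that a knugget carries no pointer to its own context, and that context is recovered by finding the other knuggets that reference it.

```python
from dataclasses import dataclass, field


@dataclass
class Knugget:
    """Hypothetical knugget: it describes only itself."""
    id: str
    data: dict = field(default_factory=dict)
    # Ids of other knuggets that this one references (and so gives context to).
    refs: tuple = ()


def context_for(target_id, knuggets):
    """Return the knuggets that reference target_id, i.e. its context."""
    return [k for k in knuggets if target_id in k.refs]


# A page knugget knows nothing about where it is used.
page = Knugget("k1", {"url": "http://example.org/", "title": "Example"})
# A second knugget supplies context by referencing the first.
listing = Knugget("k2", {"label": "example listing"}, refs=("k1",))

context = context_for("k1", [page, listing])
```

Here `context` holds only `listing`; `page` itself stays unchanged no matter how many other knuggets point at it.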
I also threw together a very rough Perl prototype "collector"
that parses the knuggets from the experimental factory and indexes
them using sqlite3's full text search (it was low hanging fruit).
This doesn't do anything useful yet, but I have ~7,000 URLs
indexed using it and you can kinda query it via:
http://people.swlabs.org/cgi/collector?q=java
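For readers who want to try the same idea, the following is a minimal sketch of indexing and querying with SQLite full text search. It is written in Python, not Perl, and is not the prototype's code: the `knuggets` table, its `url` and `body` columns, and the function names are assumptions made for illustration. It also assumes the local SQLite build includes at least one FTS module, and tries `fts5`, `fts4` and `fts3` in turn.

```python
import sqlite3

# FTS modules to try, newest first; which ones exist depends on the SQLite build.
FTS_MODULES = ("fts5", "fts4", "fts3")


def open_index(path=":memory:"):
    """Open a database and create a full-text table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    for module in FTS_MODULES:
        try:
            conn.execute(
                f"CREATE VIRTUAL TABLE knuggets USING {module}(url, body)"
            )
            return conn
        except sqlite3.OperationalError:
            continue  # module not compiled in; try the next one
    conn.close()
    raise RuntimeError("this SQLite build has no full-text search module")


def index_knugget(conn, url, body):
    """Add one document to the full-text index."""
    conn.execute("INSERT INTO knuggets (url, body) VALUES (?, ?)", (url, body))


def search(conn, query):
    """Return the URLs whose indexed text matches the query term."""
    rows = conn.execute(
        "SELECT url FROM knuggets WHERE knuggets MATCH ?", (query,)
    )
    return [url for (url,) in rows]


conn = open_index()
index_knugget(conn, "http://example.org/a", "Java tutorial for beginners")
index_knugget(conn, "http://example.org/b", "Perl and sqlite notes")
conn.commit()
hits = search(conn, "java")
```

The query string is passed straight to `MATCH`, so punctuation in user input can be read as FTS query syntax; a real CGI front end would need to sanitize or quote it.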
I think my next step is to make the factory process ARC files and
churn out a bunch of static knuggets, then have the test collector
index those. There's a lot of interaction between the
factory<->collector that has to be thought through, and working on
this should help get that moving.
PS: I'll be at OSCON on Wednesday. If anyone on this list happens to be
around, track me down; I'd love to talk about Atlas more in person :)
Jer