[Grub-dev] do we even really need a native client
jer
jeremie at jabber.org
Fri Jan 11 18:32:37 UTC 2008
> Your link doesn't document anything -- it's a post about "wouldn't it be
> nice if...". It has NO urls, no service definitions, and I was actually
> already subscribed to this list at the time.
>
> I asked for specific technical documentation since Jer linked me
> directly to a web service, and I have no idea of parameters, semantics,
> and authentication credentials.
It's a work in progress :)
It has probably stabilized enough by now, though, to warrant a wiki
page describing more of those details. Let's use:
http://search.wikia.com/wiki/GrubWorkUnit
> There's a huge jump from "I propose a new work unit!" to "Here is a URL
> that purportedly is the new work unit but requires auth you don't know
> about."
Sorry if you missed that thread; the auth is just the Wikia user accounts.
> As I said once before to Jer (and maybe posted here), if we're talking
> about (mostly?) scrapping Grub or the workunit idea, we might as well
> use the open source crawlers that Archive.org uses, since they produce
> the file format we want AND are open source AND are cross-platform AND
> are distributed AND already exist.
We're scrapping the old codebase, but not the model or even idea of a
"workunit", which is central to Grub.
This isn't just an "everybody run a crawler and throw results into the
same pot" setup either. There is a centralized intelligence deciding
when and who should crawl what, and the results are brought back into
a massive central hBase. It's also a community where the quality of
contributions has attribution, and those helping more should have
greater access to the resources.
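
To make the model above a bit more concrete, here is a rough sketch of
a central coordinator that hands out workunits, accepts results only
from the user a unit was assigned to, and keeps a per-user credit
count. Everything in it is an assumption for illustration: the names
(Coordinator, WorkUnit, checkout, submit), the unit size, and the plain
dict standing in for hBase are hypothetical and are not the actual Grub
service or its API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    # Hypothetical shape of a workunit: an id plus the URLs to crawl.
    unit_id: int
    urls: list
    assigned_to: str = ""


@dataclass
class Coordinator:
    """Hypothetical central scheduler that decides who crawls what."""
    pending: list = field(default_factory=list)
    in_flight: dict = field(default_factory=dict)
    credit: dict = field(default_factory=lambda: defaultdict(int))
    store: dict = field(default_factory=dict)  # stand-in for the central store
    next_id: int = 0

    def add_urls(self, urls, unit_size=2):
        # Split a URL list into fixed-size workunits and queue them.
        for i in range(0, len(urls), unit_size):
            self.pending.append(WorkUnit(self.next_id, urls[i:i + unit_size]))
            self.next_id += 1

    def checkout(self, user):
        # Hand the oldest pending workunit to this user, if any remain.
        if not self.pending:
            return None
        unit = self.pending.pop(0)
        unit.assigned_to = user
        self.in_flight[unit.unit_id] = unit
        return unit

    def submit(self, user, unit_id, results):
        # Accept results only from the user the unit was assigned to.
        unit = self.in_flight.get(unit_id)
        if unit is None or unit.assigned_to != user:
            return False
        del self.in_flight[unit_id]
        for url in unit.urls:
            if url in results:
                self.store[url] = results[url]
                self.credit[user] += 1  # attribution: one credit per page
        return True
```

A client would call checkout() for a unit, crawl its URLs with whatever
crawler it likes, and call submit() with a mapping of URL to fetched
content. The credit counts are the sort of thing that could later
decide who gets greater access.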
I would love to start building a secondary, more advanced mechanism
that highly trusted or elected members would have access to: one
where grub would give out a "discovery list" that can be fed into any
crawler, such as Archive.org's, and then ingest large numbers of ARCs
back in a completely trusted way. This feels like a second step, after
the basic workunits are working and hBase is up and running, but not
far away at all.
> Perhaps we need some better documentation of where this project is going
> if you want more F/OSS contributors -- I was definitely interested in
> helping out until the compression bug with the original grub, and now I
> feel totally lost in the discussion :)
It's going to be a little chaotic as things are changing and evolving
on the fly, so the best I can suggest is to follow the list and help
out on the wiki by collecting details as they start to stabilize.
Actually, asking questions and saying "I'm lost" is a big help too,
since it forces some summarization and questions things that may have
been assumed :)
Jer