[Grub-dev] LZMA support in upload server
Balinny
balinny at gmail.com
Wed Feb 25 22:33:27 UTC 2009
Jeremie Miller wrote:
> They're derived from an initial dump of URLs taken last year from here:
> http://index.isc.org/download/index
> And they're supposed to regularly include submissions from here (but
> it's manual right now):
> http://dispatch.grub.org/maps/
>
Where and how are they stored?
What would a workunit generator need in order to read the db?
>
> It's only getting URLs from the sitemap submissions, and your client
> is creating new sitemaps automatically from contents, so it sort of
> works... Eventually, yes, a mapreduce job should be getting them
> straight from the arcs, but right now we have more URLs than we are
> crawling, so getting even more isn't that useful yet :)
>
> Jer
Should sitemap creation be added to the other clients, then?