[Grub-dev] LZMA support in upload server

Balinny balinny at gmail.com
Wed Feb 25 22:33:27 UTC 2009


Jeremie Miller wrote:
> They're derived from an initial dump of URLs taken last year from here:
> 	http://index.isc.org/download/index
> And they're supposed to regularly include submissions from here (but
> that's manual right now):
> 	http://dispatch.grub.org/maps/
>   
Where/how are they stored?
What would a workunit generator need in order to read the db?


>
> It's only getting URLs from the sitemap submissions. Your client is
> automatically creating new sitemaps from page contents, so it sort of
> works. Eventually, yes, a mapreduce job should be getting them
> straight from the arcs, but right now we have more URLs than we're
> crawling, so getting even more isn't that useful yet :)
>
> Jer
Should sitemap creation be added to other clients, then?
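
To make the question concrete, here is a rough sketch of what sitemap creation from page contents could look like in a client. It is only an illustration under assumptions: it assumes the standard sitemaps.org `<urlset>` format is acceptable, the names `LinkCollector`, `extract_urls` and `build_sitemap` are hypothetical, and how the result would be submitted to the server is not shown because the thread does not describe it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org protocol (assumed format, not confirmed by the thread).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


class LinkCollector(HTMLParser):
    """Collects the href values of <a> tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_urls(base_url, html):
    """Return absolute http(s) URLs linked from the page, de-duplicated, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    seen, urls = set(), []
    for href in collector.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


def build_sitemap(urls):
    """Serialize the URLs as a <urlset> document with one <url><loc> entry each."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

A client would call `extract_urls(page_url, page_html)` on each crawled page and pass the result to `build_sitemap`. De-duplicating and dropping fragments on the client side keeps the submissions small, which seems worthwhile given that the server already has more URLs than it crawls.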


