Thanks to some wonderful assistance from Dennis, building on the work from the Internet Archive, the (still just 'test') ARC files of the Grub data can be imported into mainline Nutch and distributed across a Hadoop cluster. Soon we'll be building some indexes to be redistributable along with the source ARC files :)

Jer