[Grub-dev] Yet another invalid workunit entry
Jeremie Miller
jeremie at jabber.org
Wed Jul 9 19:54:07 UTC 2008
The workunit generator hasn't been touched, and since there's a master
copy of the workunits handed out, I can grep through them to track these down:
find master -type f -exec grep chiesadicristo {} \;
Host: firenze.chiesadicristo.org
Host: www.chiesadicristofe.org
Host: www.chiesadicristo.it
Host: www.chiesadicristo-padova.it
Host: firenze.chiesadicristo.org
Maybe it's something else messing it up?
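
A grep for the hostname only shows that the master copy has intact Host
lines; it won't flag a request whose headers are mangled. A minimal sketch
of a stricter check follows, assuming each workunit is a series of
blank-line-separated requests shaped like the ones quoted below (a GET
request line, a Host header, a User-Agent header). The function names and
the exact patterns are hypothetical, not taken from the grubng code.

```python
import re

# Assumed shapes, based only on the requests quoted in this thread.
REQUEST_LINE = re.compile(r"^GET \S+ HTTP/1\.[01]$")
HOST_LINE = re.compile(r"^Host: [A-Za-z0-9.-]+(:\d+)?$")


def check_request(block):
    """Return a list of problems found in one request block (empty if OK)."""
    problems = []
    lines = block.strip().splitlines()
    if not lines or not REQUEST_LINE.match(lines[0]):
        problems.append("bad request line")
    if not any(HOST_LINE.match(line) for line in lines[1:]):
        problems.append("missing or malformed Host header")
    if not any(line.startswith("User-Agent: ") for line in lines[1:]):
        problems.append("missing User-Agent header")
    return problems


def find_invalid(workunit):
    """Return (index, problems) for every request block that fails a check."""
    text = workunit.replace("\r\n", "\n")
    blocks = [b for b in re.split(r"\n\s*\n", text) if b.strip()]
    results = []
    for index, block in enumerate(blocks):
        problems = check_request(block)
        if problems:
            results.append((index, problems))
    return results
```

Running find_invalid over each file in the master copy and over a
workunit as received by a client would show on which side the damage
appears.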
Also, I checked the hack of a Perl script that makes these workunits
into grubng svn, just to make sure... with any luck it'll go away soon,
to be replaced by map-reduce jobs that work on the submitted ARCs and
sitemap files :)
Jer
On Jul 9, 2008, at 12:31 PM, Balinny wrote:
> And more invalid ones:
>
>
> GET / HTTP/1.0
> Host: www.tice.de
> User-Agent: GrubNG 20080128
>
> GET /bunrei.html HTTP/1.0
> He.chiesadicristo.org
> User-Agent: GrubNG 20080128
>
> GET /en/index.php HTTP/1.0
> Host: carmencuevas.com
> User-Agent: GrubNG 20080128
>
> -----
>
> GET /topsite.php?id=36616 HTTP/1.0
> Host: click.listinus.de
> User-Agent: GrubNG 20080128
>
> GET /reference.htm HTTP/1.0
> HostGET /u2_hime/lotus/ HTTP/1.0
> Host: www.geocities.jp
> User-Agent: GrubNG 20080128
>
> GET /wwwboard.html HTTP/1.0
> Host: www.klotz-kamelien.de
> User-Agent: GrubNG 20080128
>
> ---
>
>
> GET /region6/index.htm HTTP/1.0
> Host: www.stcregion.org
> User-Agent: GrubNG 20080128
>
> GET /story/arts/national/2006/06/27/toronto-film-cannes.ht\nUser-Agent: GrubNG 20080128
>
> GET /2000/ALLPOLITICS/stories/12/06/senateleaders.ap/ HTTP/1.0
> Host: cnn.com
> User-Agent: GrubNG 20080128
>
> In the last one there's a literal \n (ASCII 10) instead of CRLF (+the
> missing host).
>
> Has the workunit generator been changed recently?
>
> _______________________________________________
> Grub-dev mailing list
> Grub-dev at wikia.com
> http://lists.wikia.com/mailman/listinfo/grub-dev
>