[Grub-dev] Back to past - discussion about headers

Bartek Jasicki thindil2 at gmail.com
Tue Jul 15 18:56:24 UTC 2008


Hello

I propose a small resurrection of the discussion about crawler headers.

1) Accept

Maybe we could simply hard-code this header in the clients? The last proposal was:

Accept: application/xhtml+xml, text/html, text/*, application/pdf

I think that we really don't need to index bits from .mov or .mpeg
files ;)
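
A minimal sketch of what hard-coding could look like on the client side, assuming the client builds its HTTP/1.0 request as plain text. The names ACCEPT_HEADER and build_request are hypothetical and not taken from the Grub client code:

```python
# Hypothetical constant: the Accept value from the last proposal, hard-coded.
ACCEPT_HEADER = "application/xhtml+xml, text/html, text/*, application/pdf"


def build_request(path, host):
    """Return a raw HTTP/1.0 GET request that always carries the fixed Accept header."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        f"Accept: {ACCEPT_HEADER}",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_request("/index.html", "www.example.info"))
```

Note that Accept is only a hint: a server may still send back a .mov or .mpeg file, so the client would probably still want to check the Content-Type of the response.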

2) If-Modified-Since

Because our Benevolent Dictator (aka Jeremie) ;) doesn't want a revolution
in the workunit format, I have a proposal: maybe we add it to the current one:

GET /index.html HTTP/1.0 YYYY/MM/DD/hh:mm:ss\r\n
Host: www.example.info\r\n

Of course, this only works if our servers are able to use this header.
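
A rough sketch of how a client might handle such a line, assuming the trailing timestamp is the time of the previous crawl and is expressed in UTC (both are my assumptions, the proposal does not say). The function name split_workunit_request_line is hypothetical:

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def split_workunit_request_line(line):
    """Split a workunit request line into (request_line, if_modified_since_header).

    The header is None when the line carries no trailing timestamp,
    so workunits in the current format keep working.
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) == 3:
        method, path, version = parts
        header = None
    elif len(parts) == 4:
        method, path, version, stamp = parts
        # Assumption: the timestamp is UTC in the YYYY/MM/DD/hh:mm:ss layout.
        when = datetime.strptime(stamp, "%Y/%m/%d/%H:%M:%S")
        when = when.replace(tzinfo=timezone.utc)
        # HTTP dates use the RFC 1123 form, e.g. "Tue, 15 Jul 2008 18:56:24 GMT".
        header = "If-Modified-Since: " + format_datetime(when, usegmt=True)
    else:
        raise ValueError(f"unexpected workunit request line: {line!r}")
    return f"{method} {path} {version}", header


if __name__ == "__main__":
    request_line, header = split_workunit_request_line(
        "GET /index.html HTTP/1.0 2008/07/15/18:56:24\r\n"
    )
    print(request_line)
    print(header)
```

If the page has not changed since that time, the web server can answer with "304 Not Modified" and no body, which is where the bandwidth saving would come from.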

Both proposals can save some bandwidth and speed up crawling.

Bartek

