[Grub-dev] Two questions about icons and protocol
Balinny
balinny at gmail.com
Wed Jan 23 18:37:17 UTC 2008
jer wrote:
>> 2. Sometimes the server sends .pdf, .txt or even .mp3 files in a
>> workunit to crawl. How should the client report these files?
>> As a 500 error, or maybe another HTTP error code? I think the output
>> from these files is unnecessary ;)
>>
>
> It is actually necessary. A bunch of those types are important to
> index as well, so the client should just put them in the resulting
> ARC like anything else (it's binary safe, so it shouldn't be a big deal).
>
Not all of them. Today I found the process had "stopped" (i.e. it was
taking a long time on one URL) while crawling
www2.ati.com/drivers/linux/ati-driver-installer-8.42.3-x86.x86_64.run
Maybe we should skip the application/octet-stream MIME type if the
extension doesn't match a well-known one.
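
A minimal sketch of that check, in Python for illustration only (it is
not the client's actual code). The function name should_skip and the
contents of WELL_KNOWN_EXTENSIONS are assumptions; which extensions
count as "well-known" would still have to be decided.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allow-list: the extensions treated as "well-known" here
# are placeholders, not an agreed set.
WELL_KNOWN_EXTENSIONS = {".pdf", ".txt", ".mp3", ".html", ".htm"}


def should_skip(url, content_type):
    """Return True if a fetched URL should be skipped instead of archived."""
    # Drop parameters such as "; charset=..." and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime != "application/octet-stream":
        # Only generic binary responses are candidates for skipping.
        return False
    # Take the extension from the URL path, ignoring any query string.
    extension = posixpath.splitext(urlsplit(url).path)[1].lower()
    return extension not in WELL_KNOWN_EXTENSIONS
```

With this rule the .run installer above would be skipped, while a .pdf
served as application/octet-stream would still be archived. Ideally the
check would run on the response headers, before the body is downloaded.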
> Right now the URLs are just a big batch of test ones, and some of
> them are files like that. Once we get the full linkdb from the
> existing index imported and workunits coming from that, it should be
> 99% text/html for some time into the future, but it should span out
> to other content types as it grows.
>
> Jer
>