[Grub-dev] where to report bad workunit entries?

Balinny balinny at gmail.com
Fri Feb 1 19:41:26 UTC 2008


ab wrote:
> problems like private-links, intranet/lan-links and such come to my mind.
>
> if these urls or pages that collect such links also back to a company's 
> intranet site or something end up in the grub/wikia-search database 
> there could be a problem.
>   
If those private links reached the crawler, then they weren't so
private. The search engine only makes them easier to find.
If you're referring to a grub crawler running inside a company, whose
assignment includes some domain resolving to an IP in the local
network, then yes, you have a point: there could be problems. That's
why the C client will refuse to crawl those links :-)
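
To illustrate the kind of check I mean, here is a rough sketch in Python.
It is not the C client's actual code; the function names is_local_address
and should_crawl are made up for this example, and the real client may
decide differently. The idea is simply to resolve the host and skip it
when the address is private, loopback or link-local.

```python
import ipaddress
import socket


def is_local_address(ip_text):
    """Return True if ip_text is a private, loopback, or link-local address."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_private or ip.is_loopback or ip.is_link_local


def should_crawl(hostname, resolve=socket.gethostbyname):
    """Hypothetical filter: refuse hosts that fail to resolve or resolve locally.

    The resolve parameter is injectable so the check can be exercised
    without real DNS lookups.
    """
    try:
        ip_text = resolve(hostname)
    except OSError:
        # Unresolvable hosts are skipped rather than crawled.
        return False
    return not is_local_address(ip_text)
```

With something like this, a workunit entry pointing at a host that
resolves to 10.x.x.x or 192.168.x.x would be dropped before any request
is made.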

> and then again, maybe someday wikia-search/grubclients will also be 
> available as local/intranet services and so on, so then they could also 
> spider all those non-internet links too.
>
> cheers.
You could use your own wikia search for your intranet. There would be
little point in the internet search crawling internal webs. Also note
that those pages would then leak onto the internet as search engine
cached pages.
I agree that giving companies/domains a tool to crawl their own pages
and databases would be interesting. It may also provide a way to access
the invisible internet.

