[Grub-dev] do we even really need a native client
Balinny
balinny at gmail.com
Tue Jan 8 21:04:16 UTC 2008
Yousef Ourabi wrote:
> Let me start off by saying I believe the new client for Unix like
> operating system should be written in a dynamic language such as perl,
> python, ruby, or even all of the above...
There's no point in doing so. Make an API in language-agnostic C / C++.
All those languages have bindings to C libs. Then your Python app could
range from a single call to DoAllStuff() to doing each of the steps
itself at a lower level.
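
To make the layering concrete, here is a rough sketch of how it could look
from the Python side. Every name below is made up for illustration, and the
function bodies are plain-Python stand-ins; in a real binding each one would
be a thin wrapper (ctypes, a C extension, ...) around a function exported by
the C library.

```python
# Hypothetical names throughout: none of these are an existing grub API.

def get_work_list():
    """Stand-in for the C call that asks the server for URIs to crawl."""
    return ["http://example.com/", "http://example.org/"]

def crawl(uri):
    """Stand-in for the C call that fetches one URI."""
    return {"uri": uri, "body": b"<html>...</html>"}

def make_arc(records):
    """Stand-in for the C call that packs fetched records into one archive."""
    return b"".join(
        r["uri"].encode() + b"\n" + r["body"] + b"\n" for r in records
    )

def upload(arc):
    """Stand-in for the C call that sends the archive; returns bytes sent."""
    return len(arc)

def do_all_stuff():
    """The one-call version: run every step with default behaviour."""
    records = [crawl(uri) for uri in get_work_list()]
    return upload(make_arc(records))

def crawl_only_org():
    """Lower-level use: same steps, but the app decides what to skip."""
    uris = [u for u in get_work_list() if u.endswith(".org/")]
    return upload(make_arc([crawl(u) for u in uris]))
```

A simple client would only ever call do_all_stuff(); a fancier one would
mix the individual steps with its own logic, as crawl_only_org() does.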
> As I sit here, typing into this gmail inputbox via Firefox after just
> finishing my rant on why the client should be written in a dynamic
> language -- the thought occurs to me that perhaps the notion of a
> native client is very 90's and what we should really be thinking about
> are browser extensions that implement the same functionality if not more.
>
> [[ Disclaimer: this is not an original idea, I believe the Heretix
> folks are or were up to something similar with the "Monkeys" project
> if memory serves correctly ]]
>
> Think of this:
> A Firefox plugin that meets current functionally (gets list of URIs,
> crawls, creates ARC, puts to grub.org...)
> But also (with user consent / warning) registers new urls with the
> grub server
Having the work done within Firefox is a sharp knife: very nice, but you
can get cut.
You can crawl what the user browses with no overhead (just the upload,
since the page is already being downloaded). The user finds new URLs for
you.
Some things you need:
*Very strict handling of cache directives, Vary, cookies, HTTP
authentication...
*An easy button to toggle crawling on and off.
*A way to remove entries from the crawled data before sending (error in
configuration, spam site...).
*A blacklist of sites not to crawl.
*Ability to run in XULRunner.
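
As a rough idea of what the first two points could mean in code, here is a
minimal sketch of a "may this page be sent?" check. The function name and
the exact policy are my assumptions: it is deliberately conservative and is
not a complete reading of the HTTP caching rules. Header names are assumed
to be already lower-cased.

```python
def may_submit(crawling_enabled, request_headers, response_headers):
    """Return True only if the page looks safe to send to the server.

    Both header arguments are dicts keyed by lower-case header name.
    """
    # The user's toggle wins over everything else.
    if not crawling_enabled:
        return False
    # Anything fetched with credentials may be personalised.
    if "authorization" in request_headers or "cookie" in request_headers:
        return False
    if "set-cookie" in response_headers:
        return False
    # Collect Cache-Control directive names, dropping any "=value" part.
    cache_control = {
        d.strip().split("=", 1)[0].lower()
        for d in response_headers.get("cache-control", "").split(",")
        if d.strip()
    }
    if cache_control & {"private", "no-store"}:
        return False
    # A response that varies on credentials is not the same for everyone.
    vary = {
        v.strip().lower()
        for v in response_headers.get("vary", "").split(",")
    }
    if vary & {"*", "cookie", "authorization"}:
        return False
    return True
```

The point of returning False on any doubt is that a page wrongly skipped
costs nothing, while a private page wrongly uploaded cannot be taken back.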
Some sites you wouldn't want crawled under your nick:
*somepornsite.com    The typical example.
*myemployer.com      It's none of your business who I work for.
*searchingjobs.com   My employer wouldn't like this.
*search queries      Note they're usually cacheable!
*etc.
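
A blacklist covering cases like these could be sketched as follows. The
domains are just the examples from the list above, the function names are
made up, and treating every URL with a query string as a possible search
is a crude assumption of mine, chosen because search result pages are
often cacheable and so would slip past a check based on headers alone.

```python
from urllib.parse import urlsplit

# Example entries only; a real list would be edited by the user.
BLACKLIST = {"somepornsite.com", "myemployer.com", "searchingjobs.com"}

def is_blacklisted(url, blacklist=BLACKLIST):
    """True if the URL's host is a blacklisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blacklist)

def looks_like_search(url):
    """Crude assumption: any URL with a query string might be a search."""
    return bool(urlsplit(url).query)

def keep_private(url):
    """True if the URL should never be reported under the user's nick."""
    return is_blacklisted(url) or looks_like_search(url)
```

For example, keep_private("http://www.myemployer.com/intranet") and
keep_private("http://example.com/?q=new+job") are both true, while
keep_private("http://example.com/about") is false.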