[Grub-dev] tiered client, other thoughts...

Yousef Ourabi yourabi at zero-analog.com
Wed Jan 9 18:31:50 UTC 2008


As I play around with my patched (and currently less broken) babygrub, a few
ideas have come up that I wanted to share with the list to get some
feedback.

1) Have tiered client modes: The current implementation holds the clients at
arm's length by specifying that they should not follow redirects or links.
There are strong advantages to this approach, as it keeps the overall design
simple. However, it might make sense to incorporate the notion that not all
clients are equal. For example, for someone with weaker hardware on a slower
connection, fetching 250 pages might be a heavier burden, so instead there
could be a "validation" tier of clients that simply do HEAD requests and
either confirm that the URL still exists or report an error if it does not.
There could also be "super" clients that are in some way "authenticated" or
vetted by the grub server and take on more of the processing burden, such as
parsing the page for outbound links, etc.
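
A minimal sketch of what a "validation" tier client could look like, in
Python. The names head_status and validate are hypothetical and not part of
grub; how results would be reported back to the server is left out. It uses
http.client, which does not follow redirects, in line with the current
client restrictions.

```python
import http.client
from urllib.parse import urlsplit


def head_status(url, timeout=10):
    """Send a single HEAD request and return the HTTP status code.

    http.client does not follow redirects, so a 3xx is returned as-is.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


def validate(urls, head=head_status):
    """Map each URL to ("ok", status) or ("error", detail).

    Assumption: any status below 400 counts as "still exists".
    The head parameter is injectable so the logic can be tested offline.
    """
    results = {}
    for url in urls:
        try:
            status = head(url)
        except (OSError, http.client.HTTPException) as exc:
            # Connection-level failure: DNS, refused, timeout, bad response.
            results[url] = ("error", str(exc))
            continue
        results[url] = ("ok" if status < 400 else "error", status)
    return results
```

A client on a slow link would then transfer only response headers for each
URL in its work-list rather than the full page bodies.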

2) Ability to report failure -- every time a request is made to the
dispatcher, it generates a new work-list. What if the client thread is
stopped for some reason, and the admin wants to explicitly re-fetch the last
list (I'm thinking of myself in debug mode right now)? Does this
functionality exist? If not, it should.
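
To make the request concrete, here is a rough sketch of a dispatcher that
remembers the last work-list handed to each client. Everything here is
hypothetical (the Dispatcher class, the client_id key, the refetch_last
method); it is not how the grub dispatcher is actually built, only one way
the feature could behave.

```python
class Dispatcher:
    """Toy dispatcher: hands out batches of URLs and remembers the last one."""

    def __init__(self, pending, batch_size=250):
        self._pending = list(pending)   # URLs not yet handed out
        self._batch_size = batch_size
        self._last = {}                 # client_id -> last work-list issued

    def next_worklist(self, client_id):
        """Generate a new work-list, as every request does today."""
        batch = self._pending[:self._batch_size]
        del self._pending[:self._batch_size]
        self._last[client_id] = batch
        return list(batch)

    def refetch_last(self, client_id):
        """Return the previously issued list again without consuming new URLs."""
        return list(self._last.get(client_id, []))
```

A client that died mid-crawl could call refetch_last on restart and resume
the same list instead of silently abandoning it.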

3) Significance of result order -- I have somewhat mixed feelings on this.
Having the hash of the URIs in order is cool, but I'm wondering if it would
be just as effective if they were out of order, because it shouldn't make a
difference whether I crawl hosts a, b, and c in that order or c, a, b. Also,
the individual URI is a nice unit for dividing work in a multi-threaded
client, which would then place the burden of re-ordering the results on the
client side just so the hash can match... I don't know, maybe I just need to
be convinced some more on this.
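
To illustrate the trade-off, here is a small sketch comparing an
order-sensitive digest with an order-independent one. The hash grub actually
uses is not stated here, so SHA-1 and both function names are assumptions
made purely for illustration.

```python
import hashlib


def ordered_digest(uris):
    """Digest that depends on the order in which the URIs are fed in."""
    h = hashlib.sha1()
    for uri in uris:
        h.update(uri.encode("utf-8"))
        h.update(b"\n")  # separator so "ab","c" differs from "a","bc"
    return h.hexdigest()


def unordered_digest(uris):
    """Digest that is the same for any ordering of the same URIs.

    Each URI is hashed on its own, the per-URI digests are sorted,
    and the sorted digests are hashed together.
    """
    per_uri = sorted(hashlib.sha1(u.encode("utf-8")).digest() for u in uris)
    h = hashlib.sha1()
    for d in per_uri:
        h.update(d)
    return h.hexdigest()
```

With the second form, worker threads could finish URIs in any order and the
client would never need to restore the dispatcher's original ordering; the
sort happens over fixed-size digests rather than over the results themselves.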