[Grub-dev] bugreport: grubng.exe c# version 01.2950.37591 - various bugs (crawling / .arc file bloating)
ab
spam at abittner.de
Wed Jan 30 19:58:24 UTC 2008
hello there,
i have been using grubng.exe c# version 01.2950.37591 on an english
windows xp sp2 system for a little while now and have run into several
bugs and cases of buggy behaviour that i want to report:
1. summary: .arc result file gets bloated enormously when
aborting/restarting gui (crawl process).
when a crawl of the given workunit (url list) is interrupted with the
quit/exit button and grubng.exe is then restarted, it does not take the
already crawled urls into account at all.
the result file holding the web servers' answers, named
username.044d253173a6af2b2b8cb0cae40c040ac0e6f989.arc, keeps growing
every time the user restarts the grubng.exe gui, because grubng.exe
crawls the same urls again and appends the same results over and over.
for example, grubng.exe gets 250 urls to crawl.
it starts crawling urls 1 to 10, then grubng.exe is quit.
grubng.exe is restarted. it crawls urls 1 to 10 again and then
continues with urls 11 to 250.
the results from every earlier pass all land in the .arc file and
inflate it enormously, depending on the urls/answers.
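
a minimal sketch of the resume behaviour that seems to be missing here.
it is written in python rather than c# and is not taken from the grubng
source. the side file of finished urls (done_path), the fetch callback
and all function names are assumptions made up for illustration:

```python
import os


def load_done(done_path):
    """Return the set of urls already recorded as crawled (empty if no file yet)."""
    if not os.path.exists(done_path):
        return set()
    with open(done_path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def crawl_workunit(urls, fetch, arc_path, done_path):
    """Crawl only urls missing from done_path; append each result to arc_path once.

    fetch is a hypothetical callback that takes a url and returns bytes.
    """
    done = load_done(done_path)
    with open(arc_path, "ab") as arc, open(done_path, "a", encoding="utf-8") as log:
        for url in urls:
            if url in done:
                continue  # crawled before the restart, so do not append it again
            arc.write(fetch(url))
            arc.flush()
            # record the url only after its result has been written
            log.write(url + "\n")
            log.flush()
            done.add(url)
```

with something like this, quitting after urls 1 to 10 and restarting
would continue at url 11 and the .arc file would hold each result once.
a crash between the two writes could still duplicate a single result.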
2. summary: crawling stalls from time to time. never resumes after that.
grubng stops crawling from time to time: no further http connections
to any web server are made. that is how i came across the bug
mentioned above. my grubng.exe process stopped at url number 10. all
the results from urls 1 to 9 were already in the .arc result file.
grubng.exe never retries url number 10 and never moves on past that
failing/erroneous url, so it never skips to url 11 and the crawl never
continues.
i waited on the grubng.exe client for about 10 to 20 minutes. nothing
happened after url number 10. then i quit grubng.exe normally from
within the gui. after restarting, grubng.exe started all over again
with urls 1 to 250.
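
a minimal sketch of the skip-on-failure behaviour that would avoid the
stall. again this is python, not the grubng c# code. the function names,
the 30 second timeout and the single retry are assumptions chosen for
illustration. only urllib.request.urlopen and its timeout parameter are
real library calls:

```python
import http.client
import urllib.request


def fetch_with_timeout(url, timeout=30):
    """Fetch one url; the timeout applies to each blocking socket operation."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def crawl_skipping_failures(urls, fetch=fetch_with_timeout, retries=1):
    """Try each url up to retries + 1 times, then move on instead of stalling."""
    results, failed = {}, []
    for url in urls:
        for _attempt in range(retries + 1):
            try:
                results[url] = fetch(url)
                break  # success, go to the next url
            except (OSError, http.client.HTTPException) as exc:
                # URLError and socket timeouts are both subclasses of OSError
                last_error = exc
        else:
            # every attempt failed: remember the url and continue the crawl
            failed.append((url, str(last_error)))
    return results, failed
```

with a loop like this a hanging url number 10 would end up in the failed
list after its retries, and the crawl would carry on with url 11.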
regards.