[Grub-dev] Yet another invalid workunit entry
Balinny
balinny at gmail.com
Thu Jul 10 11:40:08 UTC 2008
Rainer Blome wrote:
>
> These look like symptoms of memory corruption.
> What software generated these strings?
>
Agree. But server side an interpreted language is used, so there
shouldn't be such kind of problems.
Locally downloaded with wget, a solid rock there. My copy could be
broken though, depite having
downloaded lots of big files without problem (as shown by the accompanng
hash).
The Grub C Client opens the file read-only, so nothing there either.
> The crawling system as a whole receives external input, both in the form
> of URLs being read from web pages and result files from users, if I'm
> not mistaken. Since the crawlers are intended to be run by end-users,
> they must be secured against both malicious and unintentionally
> detrimental input, from both work units and the web itself. What has
> been done in this direction?
>
So i intended :)
There're several assumptions on my code about the workunit and http
format, no embedded nulls,
only one PUT, the maximum size of an URL, the MIME length... I tried to
set reasonable values
(why would we crawl an url which some browsers wouldn't visit?) and/or
give appropiate erros.
I think the more defensive approach would be isPrivateIP() to avoid
crawling the NAT.
> Searching for "grub security" shows that it would be better if the
> crawler had a different name :-). Yes, I know that this has been
> already discussed, but still.
>
> Has Grub been run with a memory debugger?
>
Running it under valgrind, there're many errors reported... on ld-2.3.6,
by subcalls of gethostbyname :S
No problems on the Client detected.
> Has its code been reviewed?
> Looking at the Grub C Client code...
>
As far as i know, you may be the first one which bothered to read it.
> o There is no systematic documentation in the code.
>
> o Requires C99 compiler - OK, I actually like C99, but the code
> should #if __STDC_VERSION__ < 199901L
> #error "Requires C99 compiler"
> #endif
>
That breaks default compiling with gcc. Added the check for C99 *or* GCC
> o String function calls without explicit limits.
>
Can you elaborate? AFAIK the only unlimited piece was the strtok (i
think i was a
bit lazy when i added it), which i just replaced.
> o Conditionals without braces.
>
Are they so bad? Ok, i'll bracify the code a bit... :)
> o The function FetchPage...
> - is four pages long (makes it hard to understand,
> any function longer than one page should be broken up),
>
Yeah, it's a kind of beast. But things are quite enchained. Perhaps the
first 25 lines
could be moved to a validating subfunction, splitting the rest would
take a little more.
> - defines function-scope global ("static")
> variables in the middle of the function
> (do I just need to get used to this, or do others find this
> nasty too?).
>
It's on my TODO-list to make it reentrant.
> I'm not saying that there is is an error here, but these
> taken together make the code hard to validate.
>
> - contains this loop:
> (...)
>
> That "continue" is actually a NOP, since it just falls through to the
> "while". Assuming that all further "send" calls fail (return -1), and
> that send doesn't try to access *Q unless the socket is connected, the
> loop will iterate roughly 4.5 billion times, until QueryLen overflows
> (wraps around reaching zero), all the while acquiring new sockets and
> releasing them. Although the loop does finish, this may look like an
> infinite loop to the user. (Test program below)
Good catch, that continue was intended to apply to the outer for, not
the while. Fixed.
> o If the source does not fit in the buffer,
> copytoEOL leaves the target unterminated (missing '\0').
>
It's expected to return null-terminated strings. Fixed by making it a
hard error.
There's probably no point on continuing with a wrong hostname (we could
crawl
the others, but the server would end reject it). Probably copytoSP
deserves the same.
Although truncating could be acceptable for the user-agent. :-/
> I would not use the Grub C Client :-(.
>
> Rainer
>
You still have to evaluate the user support ;)
More information about the Grub-dev
mailing list