[Grub-dev] Yet another invalid workunit entry
Rainer Blome
rainer.blome at gmx.de
Wed Jul 9 22:23:25 UTC 2008
Balinny wrote:
> And more invalid ones:
>
> He.chiesadicristo.org
>
> HostGET /u2_hime/lotus/ HTTP/1.0
>
> GET /story/arts/national/2006/06/27/toronto-film-cannes.ht\nUser-Agent:
These look like symptoms of memory corruption.
What software generated these strings?
The crawling system as a whole receives external input, both in the form
of URLs being read from web pages and result files from users, if I'm
not mistaken. Since the crawlers are intended to be run by end-users,
they must be secured against both malicious and unintentionally
detrimental input, from both work units and the web itself. What has
been done in this direction?
Searching for "grub security" shows that it would be better if the
crawler had a different name :-). Yes, I know that this has been
already discussed, but still.
Has Grub been run with a memory debugger?
Has its code been reviewed?
Looking at the Grub C Client code...
o There is no systematic documentation in the code.
o Requires C99 compiler - OK, I actually like C99, but the code
should #if __STDC_VERSION__ < 199901L
#error "Requires C99 compiler"
#endif
o String function calls without explicit limits.
o Conditionals without braces.
o The function FetchPage...
- is four pages long (makes it hard to understand,
any function longer than one page should be broken up),
- defines function-scope global ("static")
variables in the middle of the function
(do I just need to get used to this, or do others find this
nasty too?).
I'm not saying that there is is an error here, but these
taken together make the code hard to validate.
- contains this loop:
// Corner case of QueryLen == 0 not handled upon entry
// Use "while do" unless you really need "do while".
do {
i = send(sockfd, Q, QueryLen, 0);
// What happens if i is always zero?
// Don't know whether this is possible, DOS?
QueryLen -= i;
Q += i;
if (i == -1) {
perror("send");
Error = SEND_ERROR;
closesocket(sockfd);
sockfd = NewSocket();
// A new socket descriptor... unconnected.
// This continues the while loop - and not
// the enclosing for loop (not shown here),
// which would actually reconnect.
// Consider restructuring the function.
continue;
// The send above will never succeed after this branch.
}
} while (QueryLen);
// After this loop, the function continues to try
// and read data from the socket -
// even if the socket is unconnected.
o If the source does not fit in the buffer,
copytoEOL leaves the target unterminated (missing '\0').
If the host name after "Host: " is longer than MAXHOSTNAMELEN,
HostName will be unterminated.
HostName is passed as the first arg of
FetchPage(char const * Server, ...) {
...
// and proceeds thus:
char* HostPortSplit = strchr(Server, ':');
// If the host name value does not contain a ':',
// the above may find a ':' anywhere in memory,
// and if it does...
if (HostPortSplit) {
// the program will then write to that location ...
*HostPortSplit = '\0';
// and try to read from the next ones ...
Port = atoi(HostPortSplit+1);
}
...
}
I would not use the Grub C Client :-(.
Rainer
More information about the Grub-dev
mailing list