From some of the URLs I got, you seem to have been extracting URLs from
wiki sources.
Please note that any trailing ] should be removed, and you should skip
any URLs containing braces: a URL with {{ in it won't be valid and thus
doesn't need to be crawled (but the URLs generated via that template do,
so the best way is to use the externallinks table).
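
If you keep extracting from the wikitext anyway, the two rules above could be applied roughly like this. It is only a sketch: the function name and the example URLs are made up for illustration, and I am assuming you already have the extracted URLs as a list of strings.

```python
def clean_extracted_urls(urls):
    """Filter URLs scraped from raw wikitext (hypothetical helper).

    - Skips any URL containing "{{", since that is an unexpanded
      template call and not a real address.
    - Strips trailing "]" characters left over from [url label] syntax.
    """
    cleaned = []
    for url in urls:
        if "{{" in url:
            # Template markup: the real URL only exists after expansion.
            continue
        cleaned.append(url.rstrip("]"))
    return cleaned


if __name__ == "__main__":
    sample = [
        "http://example.org/page]",
        "http://example.org/{{{1}}}",
        "http://example.org/ok",
    ]
    for url in clean_extracted_urls(sample):
        print(url)
```

This still misses the URLs that templates generate, which is why the externallinks table remains the better source.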
BTW: Which page parser is used?