[Search-l] call to action....
Sami M
sami2065 at gmail.com
Tue Jun 12 07:08:11 UTC 2007
I referred to the paper to clarify the scope. Once you start working on it,
you realize that it is a very compact publication for the amount of work that
was done. With the exception of PageRank, most of what they talk about is public
domain knowledge covered in CS textbooks (for example *Managing Gigabytes* by
Witten, Moffat, Bell). PageRank in its original form is not as useful
anymore (thanks to the SEOs) and Google doesn't have a patent on link graph
analysis. That was the basic premise of my work – design & implement a
system + algorithm that is more effective and more SEO-resistant than PageRank,
HITS etc. I believe Google results have gone down in quality post 2003 or
so. Most of this is due to the pollution of the ranking metrics by SEOs.
The opportunity to improve it by a great deal is still out there. However,
the problem is a lot harder to solve as well. I have some ideas in that area
that I'm in the process of evaluating. Given the right idea/technology &
team, a big shift is possible in the search area (unlike the DBMS or OS area, for
example). And it is more likely to come from someone reading this list than
from the Microsofts or Yahoos of the world. That is my opinion. I saw it happen
in front of me in 2000 when AltaVista enjoyed over 80% market share. Anything is
possible...
I apologize if this wasn't the right forum to discuss this. I am just
looking for feedback. I am a big supporter of open source software, without
which this would've been a much harder project. I am at a point where I need
help in moving forward. This is a task for a team & a great one at that.
There are several directions I can go. Here is what I am considering at the
moment:
- Grow the team & build a web-scale search engine running on a cluster of
a few hundred nodes, competing with the big guys. That would require me to go
down the VC funding path though.
- My original plan was to build this into a niche search engine tweaked and
marketed for superior subdomain performance (for example .edu, sports etc.).
The investment required is a lot smaller & it can always be scaled up
based on success.
- Open-source it! I just got this idea & that is why I thought of posting
here. However, since I didn't start off like that… what incentive do I have
for it now? If this were a product, I could go the JBoss/MySQL route.
All ideas are welcome. I can do some demos & go into technical details of my
implementation if needed. Thanks for all the feedback.
Sami