[Search-l] Search Team Update: July 15, 2008
Dan Lewis
dan at wikia-inc.com
Wed Jul 16 13:45:28 UTC 2008
Here's what the Search Team did last week:
Search Tools:
1) Finished enhanced redirect handling
2) Finished fixing redirect errors in databases and search results
3) Finished deployment and testing of new search results based on new
link rank algorithms.
4) Changed inbound link text in indexing to be untokenized.
5) Started working on new Indexer tool.
6) Started working on new Outlink parsing and analysis tools.
7) Added BOSS results as a supplement to our current results.
The new indexer and outlink parsing tools will let us analyze inbound
link text and store and index only the relevant text. The new indexer
will also let us specify per-field / per-value weights for text. For
example, these tools should let us weight text such as "Google
Homepage" higher than "Hotels" when it points to google.com, and avoid
Google Bombs such as "Miserable Failure" showing up in a search for
Michael Moore. The result should be better, more relevant, less spammy
search results.
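
To make the weighting idea above concrete, here is a rough Python
sketch. It is not the indexer's actual code: the function name, the
page_terms input, and the 0.1 penalty are all made up for illustration.
It treats each inbound anchor phrase as a single untokenized unit,
counts how often it is used, and down-weights phrases that share no
term with the target page's own text.

```python
from collections import Counter


def weight_anchor_phrases(anchor_phrases, page_terms, off_topic_penalty=0.1):
    """Return a weight per inbound anchor phrase (hypothetical helper).

    anchor_phrases: anchor texts of links pointing at one page; each
        phrase is kept whole (untokenized) as the unit being weighted.
    page_terms: terms taken from the target page itself (assumed input).
    off_topic_penalty: assumed multiplier for phrases that share no
        term with the page, e.g. spam or Google Bomb text.
    """
    page_terms = {t.lower() for t in page_terms}
    counts = Counter(p.strip().lower() for p in anchor_phrases)
    weights = {}
    for phrase, count in counts.items():
        # A phrase counts as on-topic if any of its words appears on the page.
        on_topic = any(tok in page_terms for tok in phrase.split())
        weights[phrase] = count * (1.0 if on_topic else off_topic_penalty)
    return weights


# Three relevant links versus five off-topic ones pointing at google.com.
anchors = ["Google Homepage"] * 3 + ["Hotels"] * 5
weights = weight_anchor_phrases(anchors, {"google", "search", "homepage"})
```

With these made-up inputs, "google homepage" ends up weighted above
"hotels" even though "Hotels" is the more common anchor text. A real
relevance test would need to be much smarter than simple term overlap.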
Operations:
1) Handled re-indexing
2) Started a new crawl
3) Made fixes to KT importer, adding the ability to load/populate the
new location table
4) Built a new 0.1.3 Hbase cluster, loaded it with a production data
snapshot, populated the new location table, and set up Tim's new KT
code (with new features) pointed at the new cluster with the new data
(kt.search.isc.org/ktdev/)
5) Tweaked lots of system monitoring
Other Tools:
1) Revised email notification text to include real name if available
2) Making great progress on the Wikia Search toolbar!