[Search-l] Search Update: August 13, 2008

Dan Lewis dan at wikia-inc.com
Wed Aug 13 20:17:12 UTC 2008


Here's the update from the last week (or so) of Search Team goodness:

Toolbar
* Wikia Evolution, the search toolbar, has launched.  Download it at
http://re.search.wikia.com/toolbar/download.html
* Want to tinker with the code?  Grab it from SVN at
http://svn.swlabs.org/re.search/cool/toolbar/ - there's no
documentation yet, but we're working on it.  The test .xpi
(evolution.test.xpi) is an ordinary .zip archive and can be unzipped
like any other .zip file.
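
Since an .xpi is just a .zip archive, any zip tool will open it.  For
anyone who would rather script it, here is a minimal sketch in Python
using the standard zipfile module.  The function name unpack_xpi and
the output directory name are made up for illustration; only the file
name evolution.test.xpi comes from the note above.

```python
import zipfile
from pathlib import Path


def unpack_xpi(xpi_path, dest_dir):
    """Extract an .xpi (a plain ZIP archive) into dest_dir.

    Returns the list of member names found in the archive.
    unpack_xpi is a hypothetical helper, not part of the toolbar code.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(xpi_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    return names
```

Calling unpack_xpi("evolution.test.xpi", "evolution-src") should leave
the toolbar's files under evolution-src/ (a directory name chosen here
only as an example) and return their paths for a quick look at what is
inside.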

UI Stuff
(If you're not on the search-ui mailing list, that's at
http://lists.wikia.com/mailman/listinfo/search-ui)
* More work on widget/application framework
* Working on a search engine comparison tool
* Started on a "light" fork of the results UI

Atlas
(There's a distribution list for Atlas at
http://lists.wikia.com/mailman/listinfo/atlas-l and a wiki page about
the project at http://search.wikia.com/wiki/Atlas.  Check out both.)
* Updated the Atlas protocol spec for "knuggets"
* A lot of the prototypes are starting to come together.  Atlas-l has
a discussion starting at
http://lists.wikia.com/pipermail/atlas-l/2008-July/000092.html and the
SVN is at http://svn.swlabs.org/atlas/.

Operations
* Assisted with Nutch re-index
* Started a new crawl
* Fixed the KT importer and added the ability to load/populate the new
location table
* Built a new 0.1.3 HBase cluster, loaded it with a production data
snapshot, populated the new location table, and set up the new KT code
(with new features) pointed at the new cluster and its data
(kt.search.isc.org/ktdev/)
* Tweaked lots of system monitoring
* Lots of work with the crawler, trying to find the source of very
high fetch failure rates
* Deployed and redeployed KT /ktdev/, started review of code
* Determining new hardware requirements
* BIND updates
* More work on Grub

Nutch
* Finished test rollout of new indexing and scoring systems.
* Started working on shard management servers.
* Started work on pornography and bad content identification.
* Started integration of KT input into analysis algorithms.
* Documentation, bug fixes, and unit tests for new scoring and
indexing frameworks for Nutch.  Working to get final patches submitted
and committed into the Nutch core.
* Finished new crawl; working on deployment and rollout of new
indexing and scoring systems to test.
* Finished all patches and code documentation for new scoring and
indexing systems.  Everything has been submitted to Nutch for
inclusion in the Nutch code distribution.
* Finished modifications to FieldIndexer, including a field filter
extension point and the field-basic and field-boost plugins that
integrate arbitrary boosting with the new indexing framework.

Other Stuff
* Improved contact importer for the social tools
* Working on a Facebook application


