[Search-l] Search Update: August 13, 2008
Dan Lewis
dan at wikia-inc.com
Wed Aug 13 20:17:12 UTC 2008
Here's the update from the last week (or so) of Search Team goodness:
Toolbar
* Wikia Evolution, the search toolbar, has launched. Download it at
http://re.search.wikia.com/toolbar/download.html
* Want to tinker with the code? Grab it from SVN at
http://svn.swlabs.org/re.search/cool/toolbar/ - there's no
documentation yet, but we're working on it. The test .xpi
(evolution.test.xpi) can be unzipped like any .zip file.
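
Since an .xpi is an ordinary ZIP archive, unpacking the test build can be sketched in a few lines of Python. This is only a minimal sketch: the function name unpack_xpi and the output directory name are illustrative placeholders, not part of the toolbar code, and it assumes evolution.test.xpi has already been downloaded to the current directory.

```python
import zipfile
from pathlib import Path


def unpack_xpi(xpi_path, dest_dir):
    """Extract an .xpi (a plain ZIP archive) into dest_dir.

    Returns the list of member names found in the archive.
    unpack_xpi is a hypothetical helper name used for illustration.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(xpi_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call (assumes the file exists locally):
# for name in unpack_xpi("evolution.test.xpi", "evolution-unpacked"):
#     print(name)
```

The plain unzip command-line tool works just as well; the Python version is only handy if you want to script inspection of the contents.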
UI Stuff
(If you're not on the search-ui mailing list, that's at
http://lists.wikia.com/mailman/listinfo/search-ui)
* More work on widget/application framework
* Working on a search engine comparison tool
* Started on a "light" fork of the results UI.
Atlas
(There's a distribution list for Atlas at
http://lists.wikia.com/mailman/listinfo/atlas-l and a wiki page about
the project at http://search.wikia.com/wiki/Atlas. Check out both.)
* Updated the Atlas protocol spec for "knuggets"
* A lot of the prototypes are starting to come together. Atlas-l has
a discussion starting at
http://lists.wikia.com/pipermail/atlas-l/2008-July/000092.html and the
SVN is at http://svn.swlabs.org/atlas/.
Operations
* Assisted with Nutch re-index
* Started a new crawl
* Fixes to KT importer, added the ability to load/populate the new
location table
* Built a new 0.1.3 HBase cluster, loaded it with a production data
snapshot, populated the new location table, and set up new KT code
(with new features) pointed at the new cluster and its new data
(kt.search.isc.org/ktdev/)
* Tweaked lots of system monitoring
* Lots of work with the crawler, trying to find the source of very
high fetch failure rates
* Deploy-redeploy KT /ktdev/, started review of code
* Determining new hardware requirements
* Bind updates
* More work on Grub
Nutch
* Finished test rollout of new indexing and scoring systems.
* Started working on shard management servers.
* Started work on pornography and bad content identification.
* Started integration of kt input into analysis algorithms.
* Documentation, bug fixes, and unit tests for new scoring and
indexing frameworks for Nutch. Working to get final patches submitted
and committed into the Nutch core.
* Finished new crawl, working on deployment and roll-out of new
indexing and scoring systems to test
* Finished all patches and code documentation for new scoring and
indexing systems. Everything has been submitted to Nutch for
inclusion in the Nutch code distribution.
* Finished modifications to FieldIndexer, including a field filter
extension point and the field-basic and field-boost plugins, which
integrate arbitrary boosting with the new indexing framework.
Other Stuff
* Improved contact importer for the social tools
* Working on a Facebook application