[Search-l] May 20, 2008 Update

Dan Lewis dan at wikia.com
Tue May 20 19:39:56 UTC 2008


Here is an update on what the Search team did last week.

Grub:
•	Started the process for openly accepting large dumps of URLs from
the community into Grub to be crawled.

Operations:
•	Spent a few days last week in NYC meeting with Panther about our
operational needs.
•	Ironing out some small problems, such as upgrading HBase backend
clusters and patching the search server framework to add resiliency.

Index:
•	Monitoring and deploying a "whitelist"-seeded crawl to see the
quality of search results obtained that way.
•	Working on ways to handle multilingual search results

Nutch 2:
•	Finished ant build scripts
•	Finished distributed and local unit tests
•	Completed serialization framework for easy writables
•	Refactored urlnormalizer and urlfilter to non-plugins
•	Completed Injector and CrawlBase Merger tools
•	Started working on Generator and Fetcher tools
•	Started working on improved version of Arc to Nutch Segments Tool

Social Tools:
•	A lot of bug swatting and tweaking of the social toolset.
•	Added the ability to remove your profile photo
•	Began general cleanup of some of the profile editing tools.
•	Added the ability to see if you have mutual friends with a person
who is not your friend.

Search Tools:
•	Made it so global changes will now display some of the features
Jer has been adding to the results pages.
•	Started documenting all of the features.

