Searcharoo is an open-source C#/ASP.NET implementation of a search engine that you can download and use on your website.
It crawls links, parses HTML and other document types (e.g. PDF, images), and does basic ranking and keyword highlighting in the result summaries.
Searcharoo has been around since 2004, in the form of "how to" articles posted on CodeProject. It is now hosted on CodePlex to provide more visibility into the development process and to make bug fixes available more quickly, rather than waiting 6 months for a new article to be written.
The code was written primarily as a learning exercise. It was never intended to provide search for large websites; rather, it provides code for others to adapt to their needs (such as using the crawling code to feed a Lucene.NET search implementation). Now that it's on CodePlex it may grow into a more robust tool, but I urge people to understand their searching needs before spending too long on this (or any other) product, which might not turn out to be right for them.
You can read the most recent "how this works" CodeProject article (for version 7) and all previous articles on the searcharoo.net website.
Try it out in Silverlight, using jQuery and JSON, or as plain ASPX/HTML.