"Web search technology is a huge subject, encompassing:

  • networking (spidering the web),
  • string and markup-language manipulation (parsing HTML),
  • proprietary file formats (searching Word, Excel, PDF, etc.),
  • language and text-parsing (finding words & sentences in documents, stemming and other linguistic analysis),
  • algorithms (finding matches, AND/OR queries, combining multiple word results),
  • performance (both increasing spidering speed and making large catalogs fast to search),
  • user interface (presenting search input options and results).

Searcharoo.net barely scratches the surface of any of these topics :-) but it does attempt to introduce them with an open-source C# implementation of a search engine that you can download and use on your website."

Searcharoo Home
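
To make the "algorithms" item in the list above more concrete, here is a minimal sketch of AND/OR queries that combine the results for multiple words. It is written in Python for brevity and is not Searcharoo's own C# code. The names build_index and search, the sample documents, and the whitespace-only word splitting are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids containing it.

    Hypothetical helper: words are split on whitespace only, with no
    stemming or punctuation handling.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, words, mode="AND"):
    """Combine per-word result sets: intersection for AND, union for OR."""
    sets = [index.get(w.lower(), set()) for w in words]
    if not sets:
        return set()
    if mode == "AND":
        return set.intersection(*sets)
    return set.union(*sets)


# Made-up sample catalog, keyed by a hypothetical page name.
docs = {
    "a.html": "search engine in C#",
    "b.html": "spidering the web",
    "c.html": "web search results",
}
index = build_index(docs)

print(search(index, ["web", "search"]))             # pages containing both words
print(search(index, ["web", "search"], mode="OR"))  # pages containing either word
```

Keeping a set of documents per word means an AND query is a set intersection and an OR query is a set union. A real catalog would also need ranking and the linguistic analysis mentioned in the list.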