Arch is an extension of Apache Nutch (a popular, highly scalable general-purpose search engine) for intranet search. It includes blind-test evaluation tools for comparing it with other search engines. Arch has many features critical for corporate environments, such as document-level security.
BlackRay is a relational database system designed to offer performance features commonly associated with search engines. It offers SQL support and sophisticated operational and management features, including load balancing and operational stability by means of N+1 redundancy. BlackRay is called a "Data Engine" because it combines traditional relational database features and SQL with the power and flexibility of search engines. It is a true hybrid, offering transaction support, data-versioned snapshots, and sophisticated function-based indices. Wildcard, phonetic, and fuzzy searches are also supported. BlackRay supports a subset of the SQL92 standard and provides JDBC, ODBC, and native driver options via the PostgreSQL protocol, in addition to an API-based query option. The project is released under the GPLv2, with some drivers available under BSD-style licenses. Commercial support contracts are also available.
The Ex-Crawler Project is divided into three subprojects. The main part is the Ex-Crawler daemon server, a highly configurable and flexible Web crawler written in Java. It comes with its own socket server, through which you can manage the server, users, distributed grid/volunteer computing, and much more. Crawled information is stored in a database (currently MySQL, PostgreSQL, and MSSQL are supported). The second part is a graphical (Java Swing) distributed grid/volunteer computing client, which includes user computer state detection and is based on the JADIF Project. The third part is the Web search engine, written in PHP. It comes with a Content Management System, user language detection, multi-language support, and templates using Smarty. It also includes an application framework partly forked from Joomla 1.5, so that Joomla components can be adapted quickly.
FM SiteSearch Pro is a quick and simple solution for adding professional search capability to a Web site. It comes with a relevance engine, a control panel, support for large Web sites, optional MySQL support, search/keyword statistics, advanced searches, and specialized searches, and it is fully customizable. It also comes with a setup interface.
Find What I Mean aims to provide a search library that tolerates errors in queries. It auto-corrects typos, extra letters, and similar mistakes, which is especially useful when searching for an item in a list. With traditional search methods, the query must be exact or you get zero matches.
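The idea of error-tolerant list search can be illustrated with a minimal sketch. This is not Find What I Mean's actual algorithm or API, which the description does not specify; it assumes plain Levenshtein edit distance as the matching technique, and the names `edit_distance`, `tolerant_search`, and `max_distance` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def tolerant_search(query, items, max_distance=2):
    """Return items within max_distance edits of query, closest first.

    max_distance is an assumed tolerance, not a value from the project.
    """
    q = query.lower()
    scored = [(edit_distance(q, item.lower()), item) for item in items]
    return [item for dist, item in sorted(scored) if dist <= max_distance]


if __name__ == "__main__":
    # "aple" is a typo; an exact-match search would return nothing.
    print(tolerant_search("aple", ["apple", "maple", "banana"]))
```

An exact-match search for "aple" in this list finds nothing, while the tolerant version returns the near misses and drops unrelated entries. A fixed distance threshold is the simplest choice; a real library would likely scale the tolerance with query length.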
InstaSearch is an Eclipse IDE plug-in for performing quick and advanced searches of source code files. It uses the Apache Lucene library for indexing and fast searching of files in the workspace. The search is performed instantly as you type, and the resulting files are displayed in an Eclipse view. Each file can then be previewed through a few of its most relevant matching lines. Double-clicking a match opens the file at the matching line.
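The index-then-search-as-you-type flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a small in-memory inverted index rather than Apache Lucene, treats each query term as a prefix to mimic partial input, and ranks lines by how many query terms they match. The names `tokenize`, `build_index`, `search`, and `max_lines` are hypothetical and are not part of InstaSearch.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into identifier-like tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def build_index(files):
    """Map each token to the set of (filename, line_number) where it occurs."""
    index = defaultdict(set)
    for name, content in files.items():
        for lineno, line in enumerate(content.splitlines(), start=1):
            for token in tokenize(line):
                index[token].add((name, lineno))
    return index


def search(index, query, max_lines=3):
    """Return {filename: [line numbers]}, best-matching lines first.

    Each query term matches any indexed token that starts with it, so
    results can be refreshed while the user is still typing.
    """
    scores = defaultdict(int)
    for term in tokenize(query):
        hits = set()
        for token, postings in index.items():
            if token.startswith(term):
                hits |= postings
        for location in hits:
            scores[location] += 1  # one point per matching query term
    results = defaultdict(list)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    for (name, lineno), _score in ranked:
        if len(results[name]) < max_lines:  # keep only a short preview
            results[name].append(lineno)
    return dict(results)


if __name__ == "__main__":
    files = {
        "a.py": "def load_config(path):\n    return open(path)\n",
        "b.py": "config = load_config('x')\n",
    }
    print(search(build_index(files), "load_c"))
```

The returned line numbers play the role of the preview lines, and a caller could use them to jump to the matching position in a file. Scanning every token for prefix matches is fine for a sketch but would not scale; a real index would use a sorted term dictionary or similar structure.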