mp3riot (formerly known as f2html.pl) is a command line utility that searches recursively through directories, builds a file list (with additional file information), and generates HTML files, playlists, etc. The output can be controlled, links can be corrected, and more. The script is mainly designed to create Web pages, playlists, and databases for MP3 and Ogg files, but can also be used for other purposes.
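As a rough illustration of the kind of recursive scan described above, the following Python sketch walks a directory tree and builds a plain playlist. It is not mp3riot's code (mp3riot is a Perl script); the function names `collect_audio_files` and `build_m3u` and the extension set are assumptions made for this example.

```python
import os

# Assumed set of extensions for this sketch; a real tool would make it configurable.
AUDIO_EXTENSIONS = {".mp3", ".ogg"}


def collect_audio_files(root):
    """Walk `root` recursively and return sorted paths (relative to root) of audio files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension in AUDIO_EXTENSIONS:
                full_path = os.path.join(dirpath, name)
                found.append(os.path.relpath(full_path, root))
    return sorted(found)


def build_m3u(paths):
    """Return a simple playlist body with one path per line (empty string for no files)."""
    if not paths:
        return ""
    return "\n".join(paths) + "\n"
```

Calling `build_m3u(collect_audio_files("/some/music/dir"))` would yield text that could be saved with an `.m3u` extension; the directory path here is a placeholder.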
FemFind is a crawler and search engine for SMB shares (provided by Samba/Unix or Windows) and FTP servers. The crawler maps the filesystem structure of your shares to a MySQL database. Then, the Web interface or a Windows client can be used to quickly locate any file on the network.
Greenstone is a complete digital library creation, management, and distribution package for Unix, Windows, and Mac OS X. Users create collections by gathering a set of input documents, specifying a configuration file, and running the build script. It provides full-text and fielded searching, browsable indexes, customised formatting, metadata extraction (acronyms, languages, etc.), a Z39.50 client, and many other features. It supports many input formats, the interface is configurable and multilingual, and collections can be distributed on the Web or on CD-ROM.
Harvest is a system to collect information and make it searchable using a Web interface. It can collect information using HTTP, FTP, NNTP, and local files. Supported formats include HTML, DVI, PS, fulltext, mail, man pages, news, troff, WordPerfect, C sources, and many more. Adding support for new formats is easy due to Harvest's modular design.
HPFind searches your users' directories for homepages and displays them in a table on an HTML page. The administrator can specify all output text, background colours, and file locations. It can exclude a given list of users. The program can also be used as a small search tool for filenames.
ht://Check is a link checker derived from ht://Dig. It can retrieve information through HTTP/1.1 and store it in a MySQL database so that after a "crawl", ht://Check can report broken links, anchors not found, content types, and summaries of HTTP status codes. ht://Check also performs accessibility checks in accordance with the principles of the University of Toronto's Open Accessibility Checks (OAC) project, allowing users to discover site-wide barriers like images without proper alternatives, missing titles, etc. A PHP interface lets the user query and view the results directly via the Web.
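To make one of the accessibility checks mentioned above concrete, here is a minimal Python sketch that lists images lacking an `alt` attribute in a piece of HTML. It is not how ht://Check is implemented and does not touch a database; the names `MissingAltFinder` and `images_without_alt` are hypothetical, and the rule "flag only when `alt` is absent" is an assumption chosen for this example (an empty `alt=""` is accepted as a deliberate marker for decorative images).

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> tag that has no alt attribute at all."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here by HTMLParser.
        if tag == "img":
            attr_map = dict(attrs)
            if "alt" not in attr_map:
                self.missing.append(attr_map.get("src"))


def images_without_alt(html_text):
    """Return the src values of images in html_text that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing
```

A crawler could run such a check on each fetched page and record the results alongside the link status for later reporting.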
The ht://Dig system is a complete WWW indexing and searching system for a domain or intranet. This system is not meant to replace the need for internet-wide search systems like Lycos, Infoseek, Google, and AltaVista. Instead, it is meant to cover the search needs for a single company, campus, or even a particular sub-section of a Web site.
HTML-Tree is a Perl program that recursively descends directories and creates a web-page-based graphical map of the HTML pages on a webserver. A configuration file provides control over the "root" directory for the map, map page title and header, directories to be excluded, link substitution strings, and map page background image. This mapper may be run as a cron task to provide an up-to-date roadmap of a webserver. It is primarily useful as a web site development and administration tool, since it shows all pages available to web browsers, and can identify where links are needed.
HTTrack is an easy-to-use offline browser utility. It allows you to download a Web site from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site's relative link structure. Simply open a page of the mirrored Web site in your browser, and you can browse the site from link to link, as if you were viewing it online. HTTrack can also update an existing mirrored site and resume interrupted downloads. WebHTTrack is a Web-based GUI for HTTrack.