Bayon is a simple, fast clustering tool for large-scale data sets. It is useful for surveying a large collection: it partitions the data into groups so that the overall structure is easier to understand. Bayon supports two hard-clustering methods, repeated bisection clustering and K-means clustering, so each input document is assigned to exactly one cluster in their output. With additional options, however, you can also get a list of similar clusters for each input document, much like the result of a soft-clustering method.
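To make the idea of repeated bisection concrete, here is a minimal Python sketch. It is not Bayon's implementation or interface: the function names (`repeated_bisection`, `bisect`), the seeding rule, and the choice to always split the largest cluster are assumptions made for illustration. Documents are assumed to be sparse feature-weight dictionaries compared by cosine similarity.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse vector (feature -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {k: w / norm for k, w in vec.items()} if norm else dict(vec)


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(k, 0.0) for k, w in a.items())


def centroid(vectors):
    """Unit-length sum of the given sparse vectors."""
    total = defaultdict(float)
    for vec in vectors:
        for k, w in vec.items():
            total[k] += w
    return normalize(total)


def bisect(ids, vectors, iterations=10):
    """Split one cluster in two with a small 2-means loop (hypothetical helper)."""
    # Assumed seeding rule: the first document and the one least similar to it.
    first = ids[0]
    far = min(ids[1:], key=lambda i: cosine(vectors[first], vectors[i]))
    centers = [vectors[first], vectors[far]]
    halves = None
    for _ in range(iterations):
        new_halves = ([], [])
        for i in ids:
            closer_to_first = cosine(vectors[i], centers[0]) >= cosine(vectors[i], centers[1])
            new_halves[0 if closer_to_first else 1].append(i)
        if new_halves == halves:
            break  # assignments stopped changing
        halves = new_halves
        if not halves[0] or not halves[1]:
            break  # degenerate split, e.g. all documents identical
        centers = [centroid([vectors[i] for i in half]) for half in halves]
    return halves


def repeated_bisection(docs, k):
    """Hard-cluster docs (id -> sparse vector) into at most k clusters."""
    vectors = {i: normalize(v) for i, v in docs.items()}
    clusters = [sorted(vectors)]
    while len(clusters) < k:
        clusters.sort(key=len, reverse=True)  # assumed rule: split the largest
        largest = clusters[0]
        if len(largest) < 2:
            break
        left, right = bisect(largest, vectors)
        if not left or not right:
            break
        clusters[0:1] = [left, right]
    return clusters


docs = {
    "d1": {"apple": 1, "banana": 1},
    "d2": {"apple": 1, "banana": 2},
    "d3": {"car": 1, "engine": 1},
    "d4": {"car": 2, "engine": 1},
}
clusters = repeated_bisection(docs, 2)
```

Each document ends up in exactly one cluster, which is what makes this a hard-clustering method. A soft-style view could be layered on top by ranking the cluster centroids by similarity for each document.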
Stupa is an associative search engine that finds documents related to a query quickly and with high precision. Because document data and inverted indexes are kept in memory, updates to documents are reflected in search results in real time. Stupa can also be run as a server by using Thrift.
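The following Python sketch illustrates why keeping documents and an inverted index in memory makes updates visible immediately. It is not Stupa's API: the class `TinyAssociativeIndex`, its methods, and the set-based cosine scoring are hypothetical and chosen only to show the principle.

```python
import math
from collections import defaultdict


class TinyAssociativeIndex:
    """Hypothetical in-memory index: documents are sets of features."""

    def __init__(self):
        self.docs = {}                    # doc_id -> set of features
        self.postings = defaultdict(set)  # feature -> set of doc_ids

    def add(self, doc_id, features):
        """Insert or replace a document; the index is updated in place."""
        self.remove(doc_id)
        feats = set(features)
        self.docs[doc_id] = feats
        for f in feats:
            self.postings[f].add(doc_id)

    def remove(self, doc_id):
        """Delete a document and its postings, if it exists."""
        for f in self.docs.pop(doc_id, ()):
            self.postings[f].discard(doc_id)
            if not self.postings[f]:
                del self.postings[f]

    def search(self, features, limit=10):
        """Return (doc_id, score) pairs ranked by set-based cosine similarity."""
        query = set(features)
        overlap = defaultdict(int)
        for f in query:
            # Only documents sharing at least one feature are ever touched.
            for doc_id in self.postings.get(f, ()):
                overlap[doc_id] += 1
        scored = [
            (doc_id, shared / math.sqrt(len(query) * len(self.docs[doc_id])))
            for doc_id, shared in overlap.items()
        ]
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:limit]


index = TinyAssociativeIndex()
index.add("a", ["python", "search", "index"])
index.add("b", ["python", "cooking"])
index.add("c", ["gardening"])
before = [doc_id for doc_id, _ in index.search(["python", "index"])]

# Updating a document changes the very next search; no rebuild step is needed.
index.add("c", ["python", "index"])
after = [doc_id for doc_id, _ in index.search(["python", "index"])]
```

Because `add` and `remove` modify the postings directly, there is no separate indexing phase between an update and the next query. This is the property the paragraph above describes as real-time reflection of updates.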