Release Notes: The code has been reorganized into modules. Some iteration constructs have been converted to Python iterators and generators. All text is now handled internally as Unicode. Analyzers are back as generators of tokens. The changes that make the code more Pythonic appear to trade memory for speed: preliminary tests indicate about a 5% speedup on one dataset in exchange for a 20% increase in memory usage.
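As an illustration of what "generators of tokens" means, here is a minimal sketch. It is not taken from the Lupy source: the function name `lowercase_tokens` and the regular expression are assumptions made for this example only.

```python
import re

# Hypothetical example; not part of the Lupy API.
_WORD = re.compile(r"\w+")


def lowercase_tokens(text):
    """Yield lowercased word tokens from text, one at a time."""
    for match in _WORD.finditer(text):
        yield match.group(0).lower()


tokens = lowercase_tokens("Hello Unicode World")
first = next(tokens)   # only the first token has been produced so far
rest = list(tokens)    # consume the remaining tokens
```

Because tokens are produced on demand, a caller can stop early without tokenizing the rest of the text.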
Release Notes: This version fixes a bug related to another bug fixed previously.
Release Notes: The main reason for this release is to fix a minor bug in the indexer.Index wrapper. The default mergeFactor has been changed from 9 to 20 for better performance. The example in simple.py now uses a Keyword field for the filename instead of a tokenized and stored Text field. SegmentInfos and FieldInfos have been tidied up to be more Pythonic. close() is now called on the open searcher in indexer.Index.setupIndexer.
Release Notes: This version fixes a Windows-only bug in IndexWriter, and adds setMergeFactor to the Index to allow for tuning.
Release Notes: Some minor changes were made for Python 2.3, although a couple of warnings about bit operations remain. This release breaks some code: field.Keyword() must now be used instead of field.Field.Keyword(). If you are using the indexer.Index wrapper, searches are now more accurate because the query is tokenized first.
Release Notes: This release fixes a bug in BooleanQuery and includes some other small changes and tweaks. This is a recommended upgrade.
Release Notes: This release improves performance for users of the indexer.Index wrapper. Splitter is a new Analyzer/Tokenizer derived from David Mertz's article on IBM developerWorks. It is faster than the existing Analyzer and more Pythonic, but it does not support Unicode yet. The indexer wrapper continues to use the Unicode tokenizer.
Release Notes: A bug in deletion has been fixed. There is a new, improved lupy.indexer.Index wrapper that makes the program much easier to use and also handles deletion.
Release Notes: A Unicode-related bug in LowerCaseTokenizer was fixed, and lupy.indexer.Index() was added as an easy wrapper for indexing and searching. A __str__ method was added to Hits, and a new __getitem__ method enables indexed access and loops. Examples that use the new Index wrapper to index email are included. Unicode changes in IndexReader were tweaked.
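The effect of adding __getitem__ and __str__ to a result container can be sketched as follows. This is a minimal illustration and not the Lupy implementation: the class name `HitList` and its contents are hypothetical.

```python
class HitList:
    """Hypothetical stand-in for a search result container."""

    def __init__(self, docs):
        self._docs = list(docs)

    def __len__(self):
        return len(self._docs)

    def __getitem__(self, index):
        # The underlying list raises IndexError past the end,
        # which is what lets a plain for loop terminate.
        return self._docs[index]

    def __str__(self):
        return "<HitList: %d hits>" % len(self._docs)


hits = HitList(["a.txt", "b.txt"])
names = [doc for doc in hits]   # iteration falls back to __getitem__
```

With only __getitem__ defined, Python iterates by calling it with 0, 1, 2, and so on until IndexError is raised, so no separate __iter__ method is needed for simple loops.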
Release Notes: New Unicode support.