OOoPy is a Python library for modifying OpenOffice.org documents. It provides a set of transformations on the OOo XML format using the ElementTree XML Library. Transformations included are a mail merge application and the concatenation of documents with formatting intact. The framework supports easy creation of new transformations.
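To make the idea of a transformation on the document XML more concrete, here is a minimal sketch that uses only the Python standard library (`zipfile` and `xml.etree.ElementTree`), not OOoPy's own API. It assumes the document is a zip archive whose text lives in `content.xml` and uses the OpenDocument namespaces; the helper names (`make_sample_document`, `replace_placeholder`, `paragraph_texts`) are hypothetical, and the sample archive is deliberately incomplete rather than a valid office file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# OpenDocument namespaces for the office and text vocabularies.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def make_sample_document() -> bytes:
    """Build a tiny in-memory archive holding only content.xml (illustration only)."""
    ET.register_namespace("office", OFFICE_NS)
    ET.register_namespace("text", TEXT_NS)
    root = ET.Element(f"{{{OFFICE_NS}}}document-content")
    body = ET.SubElement(root, f"{{{OFFICE_NS}}}body")
    para = ET.SubElement(body, f"{{{TEXT_NS}}}p")
    para.text = "Dear NAME,"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
    return buf.getvalue()


def replace_placeholder(doc: bytes, placeholder: str, value: str) -> bytes:
    """Hypothetical mail-merge-style step: substitute a placeholder in each paragraph."""
    with zipfile.ZipFile(io.BytesIO(doc)) as zin:
        root = ET.fromstring(zin.read("content.xml"))
        # Keep every other archive member unchanged.
        others = {n: zin.read(n) for n in zin.namelist() if n != "content.xml"}
    for para in root.iter(f"{{{TEXT_NS}}}p"):
        if para.text:
            para.text = para.text.replace(placeholder, value)
    out = io.BytesIO()
    with zipfile.ZipFile(out, "w") as zout:
        zout.writestr("content.xml", ET.tostring(root, encoding="utf-8"))
        for name, data in others.items():
            zout.writestr(name, data)
    return out.getvalue()


def paragraph_texts(doc: bytes) -> list:
    """Return the leading text of every text:p element in content.xml."""
    with zipfile.ZipFile(io.BytesIO(doc)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return [p.text for p in root.iter(f"{{{TEXT_NS}}}p")]
```

Calling `replace_placeholder(make_sample_document(), "NAME", "Ada")` returns a new archive and leaves the input bytes untouched. A real mail merge would also need to handle text nested in child elements such as spans.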
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite (2 and 3) are supported.
ICU provides a Unicode implementation, with functions for formatting numbers, dates, times, and currencies according to locale conventions, for transliteration, and for parsing text in those formats. It provides flexible patterns for formatting messages, where the pattern determines the order of the variable parts of the message and the format for each of those variables. These patterns can be stored in resource files for translation into different languages. Included are more than 100 codepage converters for interaction with non-Unicode systems.
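The point about patterns determining the order of the variable parts can be illustrated with a small sketch. It uses Python's built-in `str.format` with numbered placeholders, not ICU's API; the `PATTERNS` table and the `format_message` helper are hypothetical stand-ins for patterns loaded from per-language resource files.

```python
# Hypothetical per-language patterns; a real application would load these
# from translated resource files instead of hard-coding them.
PATTERNS = {
    "en": "{0} copied {1} files to {2}.",
    "de": "{1} Dateien wurden von {0} nach {2} kopiert.",
}


def format_message(lang: str, *args) -> str:
    """Fill the pattern for `lang`; the pattern, not the caller, fixes the order."""
    return PATTERNS[lang].format(*args)
```

The caller always passes the arguments in the same order, for example `format_message("de", "Ana", 3, "backup")`, and each translation is free to rearrange them.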
Vrapper is an Eclipse plugin that acts as a wrapper for Eclipse text editors to provide a Vim-like input scheme for moving around and editing text. Unlike other plugins that embed Vim in Eclipse, Vrapper imitates the behavior of Vim while still using whatever editor you have opened in the workbench. The goal is to offer the comfort and ease that come with the different modes, complex commands, and count/operator/motion combinations that are the key features of editing with Vim, while preserving the powerful features of the different Eclipse text editors, such as code generation and refactoring.
ExactScan is a versatile document capture application for home offices and workgroups. It is designed from the ground up for high-speed document scanners and can easily handle hundreds of images per minute, including duplex scans. Its functionality ranges from managing, sorting, and editing single pages to writing multi-page and single-page PDF files (with JPEG compression) as well as TIFF, JPEG, JPEG2000, and PNG bitmap files. ExactScan performs state-of-the-art image processing, including automatic cropping, deskewing, dynamic thresholding for clean black-and-white documents, and descreening of print rasters.
libunibreak is an implementation of the line breaking and word breaking algorithms as described in Unicode Standard Annex 14 and Unicode Standard Annex 29. It is a superset of, and supersedes, liblinebreak. It is designed to be used in a generic text renderer. FBReader is one real-world example.
FreeMarker is a template engine that was originally designed so that servlet-based applications could keep graphical design separate from application logic. The templates provide an easy and highly flexible way to generate any kind of text output (HTML, PostScript, TeX, source code, etc.) from a variety of data sources such as Java objects, Jython objects, XML object models, and more.
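As a rough illustration of keeping the template separate from application logic, the sketch below uses Python's standard `string.Template` rather than FreeMarker itself. The `PAGE` template and the `render` function are hypothetical names chosen for this example, and the placeholder syntax only superficially resembles FreeMarker's.

```python
from string import Template

# The template holds the presentation; it could be edited by a designer
# without touching the code that supplies the data.
PAGE = Template("<h1>${title}</h1>\n<p>Welcome, ${user}!</p>")


def render(model: dict) -> str:
    """Merge a data model (a plain dict here) into the template."""
    return PAGE.substitute(model)
```

For instance, `render({"title": "Home", "user": "Ada"})` produces the HTML fragment with both placeholders filled in, and a missing key raises `KeyError` instead of silently emitting an incomplete page.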