ICU provides a Unicode implementation, with functions for formatting numbers, dates, times, and currencies according to locale conventions, for transliteration, and for parsing text in those formats. It provides flexible patterns for formatting messages, where the pattern determines the order of the variable parts of the message and the format of each variable. These patterns can be stored in resource files for translation into different languages. It includes more than 100 codepage converters for interaction with non-Unicode systems.
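The idea of a pattern that controls the order of a message's variable parts can be sketched in a few lines. This is not ICU's API: Python's built-in `str.format` stands in for a message formatter, and the `PATTERNS` catalog and `render` helper are hypothetical names chosen for illustration.

```python
# Hypothetical message catalog: one pattern per language.
# In a real application these strings would live in resource files
# handed to translators, not in the source code.
PATTERNS = {
    "en": "{user} sent {count} files on {day}.",
    "de": "Am {day} hat {user} {count} Dateien gesendet.",
}


def render(lang, **values):
    """Fill the pattern for `lang` with the named values.

    The calling code always passes the same arguments; the pattern alone
    decides where each one appears in the final sentence.
    """
    return PATTERNS[lang].format(**values)


if __name__ == "__main__":
    print(render("en", user="Ana", count=3, day="Monday"))
    print(render("de", user="Ana", count=3, day="Montag"))
```

Because the placeholders are named rather than positional, a translator can reorder them freely without any change to the calling code.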
Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite (2 and 3) are supported.
Frink is a calculating tool and programming language designed to help you in the real world. It tracks units of measurement throughout all calculations and ensures that answers are correct. It converts between systems of measurement, and has a huge library of physical data. It is both a simple calculator for quick calculations and a full-fledged programming language for large tasks. It draws high-quality graphics, handles conversions between time zones, currencies, and historical values of the U.S. dollar and the British pound, translates between several languages, does date/time math, and more.
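Tracking units through a calculation can be illustrated with a small sketch. This is not Frink code and says nothing about how Frink is implemented; it is a minimal Python model, assuming only two base dimensions (length and time), and the names `Quantity`, `FOOT`, and `convert` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value in base units plus exponents for (metre, second)."""
    value: float
    dims: tuple

    def __add__(self, other):
        # Adding metres to seconds is meaningless, so refuse it.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)


METRE = Quantity(1.0, (1, 0))
SECOND = Quantity(1.0, (0, 1))
FOOT = Quantity(0.3048, (1, 0))  # one foot is defined as 0.3048 m


def convert(quantity, unit):
    """Express `quantity` as a plain number of `unit`."""
    if quantity.dims != unit.dims:
        raise TypeError("incompatible dimensions")
    return quantity.value / unit.value


if __name__ == "__main__":
    distance = Quantity(100.0, (1, 0))
    elapsed = Quantity(9.58, (0, 1))
    speed = distance / elapsed
    print(speed.dims)               # length per time
    print(convert(distance, FOOT))  # the same distance in feet
```

Carrying the dimensions alongside the value is what lets a mistake such as adding a length to a time fail immediately instead of producing a plausible-looking number.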
Enca detects the encoding of text files based on knowledge of their language. It can also convert them to other encodings, allowing you to recode files without knowing their current encoding. It supports most Central and East European languages, and detects a few Unicode variants independently of language.
GNU Source-highlight produces a document with syntax highlighting when given a source file. As input it handles many programming languages (e.g., Java, C/C++, Prolog, Perl, PHP3, Python, Flex, HTML) and other formats (e.g., ChangeLog and log files); as output it produces HTML, XHTML, DocBook, ANSI color escapes, LaTeX, and Texinfo. Input and output formats can be specified with a regular-expression-oriented syntax.
FreeMarker is a template engine that was originally designed so that servlet-based applications could keep graphical design separate from application logic. The templates provide an easy and highly flexible way to generate any kind of text output (HTML, PostScript, TeX, source code, etc.) from a variety of data sources such as Java objects, Jython objects, XML object models, and more.
cw is a non-intrusive real-time ANSI color wrapper for common Unix-based commands. It is designed to simulate the environment of the commands being executed, so that if a person types 'du', 'df', 'ping', etc. in their shell it will automatically color the output in real-time according to a definition file containing the color format desired. It has support for wildcard match coloring, tokenized coloring, headers/footers, case scenario coloring, command-line-dependent definition coloring, and includes over 50 pre-made definition files.
OOoPy is a Python library for modifying OpenOffice.org documents. It provides a set of transformations on the OOo XML format using the ElementTree XML Library. Transformations included are a mail merge application and the concatenation of documents with formatting intact. The framework supports easy creation of new transformations.
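A mail-merge style transformation over an XML tree can be sketched with the standard library's ElementTree module. This does not use OOoPy's own classes and does not operate on a real OpenOffice.org file: the `DOC` fragment, the uppercase placeholder convention, and the `mail_merge` function are all hypothetical, chosen only to show the shape of such a transformation.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document; a real OpenOffice.org file is a
# zip archive whose XML uses namespaces and a much richer structure.
DOC = "<doc><p>Dear NAME,</p><p>Your order ORDER has shipped.</p></doc>"


def mail_merge(xml_text, fields):
    """Replace placeholder words in element text with field values.

    Only the text of each element is changed; the element structure is
    left untouched, so the markup around the text is preserved.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text:
            for placeholder, value in fields.items():
                elem.text = elem.text.replace(placeholder, value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(mail_merge(DOC, {"NAME": "Ana", "ORDER": "42"}))
```

Walking the tree and rewriting only the text nodes is what keeps the surrounding markup, and with it the formatting, intact.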