Archiving for Java

Archive4J is an archive engine for large document collections, written in Java: a set of algorithmic tools and implementations for building a direct index of a document collection. For each document, the index can return basic data such as the document's length in words, the list of distinct terms it contains, and the number of occurrences of each term in the document (the count). The goals are a very high compression rate and very fast random access. To achieve both, Archive4J combines techniques typical of search engines with succinct data structures.
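
To make the idea of a direct index concrete, here is a minimal, uncompressed sketch in Java. It is not Archive4J's API or its storage format: the class DirectIndexSketch, the nested DocSummary type, and the methods add, get, and count are hypothetical names chosen for illustration, and the tokenizer (lowercasing and splitting on non-word characters) is an assumption. It only shows the three pieces of per-document data described above: length in words, distinct terms, and per-term counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Toy direct (forward) index: document id -> length, distinct terms, counts. */
public class DirectIndexSketch {

    /** Per-document summary stored as parallel arrays (hypothetical layout). */
    public static final class DocSummary {
        public final int length;      // number of words in the document
        public final String[] terms;  // distinct terms, in sorted order
        public final int[] counts;    // counts[i] = occurrences of terms[i]

        DocSummary(int length, String[] terms, int[] counts) {
            this.length = length;
            this.terms = terms;
            this.counts = counts;
        }

        /** Returns how often the term occurs in this document, or 0 if absent. */
        public int count(String term) {
            int i = Arrays.binarySearch(terms, term);
            return i >= 0 ? counts[i] : 0;
        }
    }

    private final List<DocSummary> docs = new ArrayList<>();

    /** Tokenizes the text, stores its summary, and returns the new document id. */
    public int add(String text) {
        Map<String, Integer> frequencies = new TreeMap<>(); // keeps terms sorted
        int length = 0;
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token on leading punctuation
            }
            frequencies.merge(word, 1, Integer::sum);
            length++;
        }
        String[] terms = frequencies.keySet().toArray(new String[0]);
        int[] counts = new int[terms.length];
        for (int i = 0; i < terms.length; i++) {
            counts[i] = frequencies.get(terms[i]);
        }
        docs.add(new DocSummary(length, terms, counts));
        return docs.size() - 1;
    }

    /** Random access to a document's summary by id. */
    public DocSummary get(int docId) {
        return docs.get(docId);
    }

    public static void main(String[] args) {
        DirectIndexSketch index = new DirectIndexSketch();
        int id = index.add("to be or not to be");
        DocSummary d = index.get(id);
        // prints "6 words, 4 distinct, count(to)=2"
        System.out.println(d.length + " words, " + d.terms.length
                + " distinct, count(to)=" + d.count("to"));
    }
}
```

A real archive engine would replace the plain arrays with compressed encodings and succinct structures to reach the compression and access-speed goals stated above; the sketch deliberately leaves that out.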

Recent releases

  •  26 May 2008 18:01

    No changes have been submitted for this release.
