The Graphical Models Toolkit (GMTK) is a toolkit for rapidly prototyping statistical models using dynamic graphical models (DGMs) and dynamic Bayesian networks (DBNs). It can be used for speech and language processing, bioinformatics, activity recognition, and other time-series applications. It features exact and approximate inference; many built-in factors, including dense, sparse, and deterministic conditional probability tables; native support for ARPA backoff-based factors and factored language models; parameter sharing; gamma and beta distributions; dense and sparse Gaussian factors; heterogeneous mixtures; deep neural network factors; time-inhomogeneous trellis factors; arbitrary-order embedded Markov chains; a GUI graph viewer; and much more.
Harry is a small tool for comparing strings and measuring their similarity. It implements several common distance and kernel functions for strings, as well as some exotic similarity measures. For example, Harry supports the Levenshtein (edit) distance, the Jaro-Winkler distance, and the compression distance. Harry is implemented using OpenMP, so its runtime scales linearly with the number of available CPU cores. Efficient implementations and effective caching further speed up string comparison.
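To make the Levenshtein (edit) distance mentioned above concrete, here is a minimal sketch of the textbook dynamic program in Python. It is an illustration only and does not reflect Harry's implementation or interface; the function name `levenshtein` is hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using the classic two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the stored rows as short as possible
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]
```

Calling `levenshtein("kitten", "sitting")` counts two substitutions and one insertion. The two-row layout keeps memory proportional to the shorter string, which matters when many string pairs are compared.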
KaHIP - Karlsruhe High Quality Partitioning - is a family of graph partitioning programs that tackle the balanced graph partitioning problem. It focuses on solution quality and implements flow-based methods, more-localized local searches, and several parallel and sequential meta-heuristics.
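As a rough illustration of the balanced graph partitioning problem, the sketch below evaluates a given partition: it counts the edges cut between blocks and checks a balance constraint. It assumes the common formulation in which each of the k blocks may hold at most (1 + epsilon) * ceil(n / k) nodes; the function names and the default epsilon are hypothetical and are not part of KaHIP's interface.

```python
import math

def edge_cut(edges, block_of):
    """Number of edges whose endpoints lie in different blocks."""
    return sum(1 for u, v in edges if block_of[u] != block_of[v])

def is_balanced(block_of, k, epsilon=0.03):
    """True if no block exceeds (1 + epsilon) * ceil(n / k) nodes."""
    n = len(block_of)
    limit = (1 + epsilon) * math.ceil(n / k)
    sizes = [0] * k
    for block in block_of.values():
        sizes[block] += 1
    return max(sizes) <= limit

# A 4-cycle split into two blocks of two nodes each.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
halves = {0: 0, 1: 0, 2: 1, 3: 1}
```

A partitioner searches for an assignment that minimizes `edge_cut` subject to `is_balanced`; this sketch only scores a candidate and performs no search.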
Salad (short for Letter Salad) is an efficient and flexible implementation of the well-known anomaly detection method Anagram by Wang et al. (RAID 2006). Salad is based on n-gram models, that is, data is represented as all of its substrings of length n. During training, these n-grams are stored in a Bloom filter. This enables the detector to represent a large number of n-grams in little memory while still allowing efficient access to the data. Salad extends Anagram by allowing various n-gram types, a 2-class version of the detector for classification, and various model analysis modes.
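The idea of storing n-grams in a Bloom filter can be sketched in a few lines of Python. This is a simplified illustration, not Salad's code: the class and function names, the filter size, the number of hash functions, and the scoring rule (fraction of n-grams not seen during training) are all assumptions chosen for the example.

```python
import hashlib

class BloomFilter:
    """Fixed-size bit array with several salted hash functions (illustrative sizes)."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive independent bit positions by salting BLAKE2b with the hash index.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=seed.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )

def ngrams(data: bytes, n: int):
    """All substrings of length n, in order."""
    return (data[i:i + n] for i in range(len(data) - n + 1))

def train(samples, n: int = 3) -> BloomFilter:
    """Insert every n-gram of every training sample into one filter."""
    model = BloomFilter()
    for sample in samples:
        for gram in ngrams(sample, n):
            model.add(gram)
    return model

def anomaly_score(model: BloomFilter, data: bytes, n: int = 3) -> float:
    """Fraction of n-grams in data that the model has not seen."""
    grams = list(ngrams(data, n))
    if not grams:
        return 0.0
    unseen = sum(1 for gram in grams if gram not in model)
    return unseen / len(grams)
```

A Bloom filter never reports a stored n-gram as missing, so data identical to the training set always scores zero; false positives can only make unseen n-grams look familiar, which lowers scores slightly in exchange for the compact memory footprint.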
The underling library provides simple, scalable means to manipulate MPI-parallel, three-dimensional pencil decompositions using FFTW. Pencil decompositions are a natural way to distribute O(n^3) data across O(n^2) processors and are well-suited for memory-intensive, structured spectral turbulence simulations and postprocessing codes. It may be useful in other domains as well. The library is written in C99 and may be used by C89 or C++ applications.
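The following sketch shows what a pencil decomposition means in terms of index ranges: each process in a two-dimensional process grid owns the full extent of one axis and a block of the other two. It is a plain-Python illustration with no MPI or FFTW calls, and the function names and the row-major rank layout are assumptions, not underling's API.

```python
def block_range(n: int, parts: int, index: int) -> range:
    """Indices of block `index` when n points are split into `parts` near-equal blocks."""
    base, extra = divmod(n, parts)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return range(start, stop)

def pencil_extents(shape, grid, rank, long_axis=0):
    """Per-axis index ranges owned by `rank` in a p0 x p1 process grid.

    The long axis is kept whole on every rank; the other two axes are
    split across the rows and columns of the process grid (row-major ranks).
    """
    p0, p1 = grid
    row, col = divmod(rank, p1)
    split_axes = [axis for axis in range(3) if axis != long_axis]
    extents = [None, None, None]
    extents[long_axis] = range(shape[long_axis])
    extents[split_axes[0]] = block_range(shape[split_axes[0]], p0, row)
    extents[split_axes[1]] = block_range(shape[split_axes[1]], p1, col)
    return tuple(extents)
```

For an 8 x 8 x 8 grid on a 2 x 4 process grid, every rank holds a pencil of 8 x 4 x 2 points. Because only two axes are split, the number of usable processes can grow with n^2 rather than n, which is the scaling property the description refers to.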
The Pegasus Workflow Management System encompasses a set of technologies that help workflow-based applications execute in a number of different environments, including desktops, campus clusters, grids, and clouds. It bridges the scientific domain and the execution environment by automatically mapping high-level workflow descriptions onto distributed resources. It automatically locates the input data and computational resources needed for workflow execution. It enables scientists to construct workflows in abstract terms without worrying about the details of the underlying execution environment or the particulars of the low-level specifications required by the middleware (Condor, Globus, or Amazon EC2). It bridges the current cyberinfrastructure by effectively coordinating multiple distributed resources.
LifeV is a finite element (FE) library providing implementations of state-of-the-art mathematical and numerical methods. It serves as both a research and a production library. It has already been used in medical and industrial contexts to simulate fluid-structure interaction and mass transport. LifeV is a joint collaboration among four institutions: École Polytechnique Fédérale de Lausanne (CMCS) in Switzerland, Politecnico di Milano (MOX) in Italy, INRIA (REO, ESTIME) in France, and Emory University (Sc. Comp) in the U.S.A.
The ExaScale IO (ESIO) library provides simple, high-throughput input and output of structured data sets using parallel HDF5. It is designed to support reading and writing of turbulence simulation restart files, but it may be useful in other contexts. The library is written in C99 and may be used by C89 or C++ applications. A Fortran API built atop the F2003 standard ISO_C_BINDING is also available.
Qt-based library for creating highly efficient, fully graphical applications oriented to computer vision, image processing, and scientific computation. The library features a homogeneous and well-documented object-oriented API, with wrapper methods for high-performance functionality from libraries such as OpenCV, GSL, CGAL, IPP, BLAS, LAPACK, and the Octave library.