libagf is a library of variable-bandwidth kernel estimators for statistical classification, PDF estimation, and interpolation/non-linear regression, using both Gaussian kernels and k-nearest-neighbours. Statistical classification can use a pre-trained model for considerable speed gains. Clustering algorithms are also included. The package provides command-line executables as well as easy-to-use libraries.
|Tags||Scientific/Engineering, Artificial Intelligence, Image Recognition, Mathematics, statistical classification, non-parametric statistics, kernel density estimation|
|Operating Systems||OS Independent|
I had hoped to have multi-class border classification ready by now, but the simple generalization I had envisioned won't work in all cases. The idea was to use matrix inversion to solve for the conditional probabilities, but, quite obviously in retrospect, you can solve for the class without being able to determine all of the conditional probabilities. We likely need two cases that interoperate: one in which all of the conditional probabilities can be found, and one in which only that of the retrieved class can be found. A recursive or hierarchical model seems to be the best solution here. I realize that there is literature on building multi-class classifiers from two-class ones; however, I do not currently have access to commercial journals, as I am not affiliated with an academic or research institution. It is also an enjoyable challenge to try to figure these things out for yourself, from scratch, so to speak. Likewise, I had hoped to have the optimal-bandwidth Gaussian PDF estimation ready. I made some progress on it, but the test cases were not giving consistent results, and I have not worked on it in the intervening months.
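The matrix-inversion idea described above can be illustrated with a minimal sketch. It is not libagf code: it assumes each binary classifier returns the difference between the summed conditional probabilities of its two groups of classes, and the names `solve_linear`, `multiclass_probabilities`, and `coding` are hypothetical. It covers only the fully determined case, where the number of binary classifiers plus the normalization constraint equals the number of classes.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    if any(len(row) != n for row in a) or len(a) != n:
        raise ValueError("system must be square")
    # Augmented matrix, so row operations act on both sides at once.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the largest remaining entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not all probabilities recoverable")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def multiclass_probabilities(coding, r):
    """Recover p_j from binary results r_i = sum_j coding[i][j] * p_j.

    Each row of `coding` marks every class as +1 or -1 (an assumed
    encoding). A row of ones is appended to enforce sum_j p_j = 1.
    """
    n_class = len(coding[0])
    a = [list(row) for row in coding] + [[1] * n_class]
    return solve_linear(a, list(r) + [1.0])


# Three classes, two binary partitions: {0} vs {1, 2} and {0, 1} vs {2}.
coding = [[1, -1, -1],
          [1, 1, -1]]
true_p = [0.2, 0.5, 0.3]
# Simulate ideal binary classifier outputs from known probabilities.
r = [sum(c * p for c, p in zip(row, true_p)) for row in coding]
p = multiclass_probabilities(coding, r)
winner = max(range(len(p)), key=lambda j: p[j])
```

When the system is singular or under-determined, `solve_linear` raises an error. That is the second case mentioned above, where the winning class may still be identifiable even though the full set of probabilities is not.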
Release Notes: New in this release is multi-class classification from binary classifiers using a recursive control language, and hierarchical clustering.
Release Notes: This release has a new command for generating the Receiver Operating Characteristic (ROC) curve. There is also a shell script for validating probability density function estimates. Many bugs have been found and corrected.
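For readers unfamiliar with the ROC curve, here is a minimal sketch of how one can be computed from classifier scores. It does not reproduce the libagf command or its options; the function names and the sample data are made up for illustration.

```python
def roc_curve(scores, labels):
    """Return a list of (false-positive rate, true-positive rate) points.

    scores: higher means "more likely positive".
    labels: 1 for positive samples, 0 for negative samples.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one sample of each class")
    # Sweep the decision threshold from the highest score downwards.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score.
        is_last = rank + 1 == len(order)
        if is_last or scores[order[rank + 1]] != scores[i]:
            points.append((fp / n_neg, tp / n_pos))
    return points


def area_under(points):
    """Trapezoidal area under the curve (the AUC)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up scores for six samples, three of each class.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_curve(scores, labels)
auc = area_under(curve)
```

An area of 1 corresponds to a perfect ranking of positives above negatives, and 0.5 to a ranking no better than chance.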
Release Notes: The k-nearest-neighbours routine is now based on a quicksort instead of a binary tree, and a weights calculation routine has been implemented that solves for the filter variance using the same root-finding algorithm (supernewton) as the class borders algorithm. To accommodate these changes, several options have been changed or added.
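Solving for a filter variance can be sketched as follows. This is an illustration under stated assumptions, not the library's routine: it assumes Gaussian weights of the form exp(-d²/(2σ²)) and that the variance is chosen so the weights sum to a target value, and it uses plain geometric bisection in place of supernewton. All names are hypothetical.

```python
import math


def total_weight(d2, var):
    """Sum of Gaussian weights for squared distances d2 and variance var."""
    return sum(math.exp(-x / (2.0 * var)) for x in d2)


def solve_filter_variance(d2, w_target, tol=1e-12, max_iter=200):
    """Find var such that total_weight(d2, var) is close to w_target.

    The total weight rises monotonically with var, from the number of
    zero distances (var -> 0) to len(d2) (var -> infinity), so the
    target must lie strictly between those two limits.
    """
    n_zero = sum(1 for x in d2 if x == 0.0)
    if not n_zero < w_target < len(d2):
        raise ValueError("target weight outside the attainable range")
    # Bracket the root, starting from the mean squared distance.
    lo = hi = sum(d2) / len(d2)
    while total_weight(d2, lo) > w_target:
        lo *= 0.5
    while total_weight(d2, hi) < w_target:
        hi *= 2.0
    # Geometric bisection (a stand-in for the supernewton root-finder).
    for _ in range(max_iter):
        if hi - lo <= tol * hi:
            break
        mid = math.sqrt(lo * hi)
        if total_weight(d2, mid) < w_target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


# Made-up squared distances to five neighbours; target total weight of 2.
d2 = [0.1, 0.4, 0.9, 1.6, 2.5]
var = solve_filter_variance(d2, 2.0)
weights = [math.exp(-x / (2.0 * var)) for x in d2]
```

Fixing the total weight rather than the variance lets the kernel width adapt to the local sample density: it narrows where samples are dense and widens where they are sparse.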
Release Notes: Everything except the I/O routines has been templated. With the exception of those used in external routines, variable types in the main routines are now controlled with global typedefs, with each class of variable having a different type. Different metrics are now only supported in the routines where they make sense: KNN classification and KNN interpolation. The functions now require a pointer to the desired metric. The nfold routine now supports interpolation. Note that this is still not well tested (if at all).
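Passing the metric into a KNN routine, as described above, might look like the sketch below. It is written in Python rather than the library's templated code, so a callable stands in for the pointer to the metric. The function names, the majority-vote rule, and the sample data are assumptions made for illustration.

```python
import heapq
from collections import Counter


def euclidean(a, b):
    """Euclidean (L2) distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan (L1) distance between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(train, labels, x, k, metric):
    """Majority vote among the k training points nearest to x.

    `metric` is any callable taking two points and returning a distance.
    """
    nearest = heapq.nsmallest(k, range(len(train)),
                              key=lambda i: metric(train[i], x))
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Two made-up, well-separated clusters.
train = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = [0, 0, 0, 1, 1, 1]
near_origin = knn_classify(train, labels, (0.5, 0.5), 3, euclidean)
near_far = knn_classify(train, labels, (5.2, 5.1), 3, manhattan)
```

Because the metric is an argument, swapping distance measures requires no change to the classification routine itself.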
Release Notes: This release includes a basic quickstart guide to get new users up and running, as well as a more in-depth discussion of the theory. Classification algorithms are stable and run very well.