DKPro WSD provides UIMA components which encapsulate corpus readers, linguistic annotators, lexical semantic resources, WSD algorithms, and evaluation and reporting tools. You configure the components, or write new ones, and arrange them into a data processing pipeline. Because the design is modular, components which provide the same functionality can be freely swapped: you can run the same algorithm on different data sets, or test several different algorithms on the same data set.
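The swappable-component idea described above can be illustrated with a minimal sketch in plain Java. It does not use UIMA or the DKPro WSD API: the `Disambiguator` interface, both toy algorithms, the `WsdPipelineSketch` class, and the sense IDs are all hypothetical names invented for this illustration. It only shows how two algorithms behind a common interface can be evaluated on the same data set.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a WSD algorithm component; not a DKPro WSD type. */
interface Disambiguator {
    /** Returns a sense ID for the target word, or null if no decision is made. */
    String disambiguate(String word, List<String> context);
}

/** Toy baseline: always returns the first sense listed for the word. */
final class FirstSenseBaseline implements Disambiguator {
    private final Map<String, List<String>> inventory;

    FirstSenseBaseline(Map<String, List<String>> inventory) {
        this.inventory = inventory;
    }

    @Override
    public String disambiguate(String word, List<String> context) {
        List<String> senses = inventory.get(word);
        return (senses == null || senses.isEmpty()) ? null : senses.get(0);
    }
}

/** Toy algorithm: returns the sense tied to the first cue word found in the context. */
final class CueWordDisambiguator implements Disambiguator {
    private final Map<String, String> cueToSense;

    CueWordDisambiguator(Map<String, String> cueToSense) {
        this.cueToSense = cueToSense;
    }

    @Override
    public String disambiguate(String word, List<String> context) {
        // The target word itself is ignored in this simplified version.
        for (String token : context) {
            String sense = cueToSense.get(token);
            if (sense != null) {
                return sense;
            }
        }
        return null;
    }
}

final class WsdPipelineSketch {
    /** One test instance: target word, context words, and gold-standard sense. */
    static final class Instance {
        final String word;
        final List<String> context;
        final String gold;

        Instance(String word, List<String> context, String gold) {
            this.word = word;
            this.context = context;
            this.gold = gold;
        }
    }

    /** A two-instance toy data set with made-up sense IDs. */
    static List<Instance> sampleData() {
        return Arrays.asList(
            new Instance("bank", Arrays.asList("river", "water"), "bank#shore"),
            new Instance("bank", Arrays.asList("money", "loan"), "bank#institution"));
    }

    static Disambiguator firstSense() {
        Map<String, List<String>> inventory = new HashMap<String, List<String>>();
        inventory.put("bank", Arrays.asList("bank#institution", "bank#shore"));
        return new FirstSenseBaseline(inventory);
    }

    static Disambiguator cueWord() {
        Map<String, String> cues = new HashMap<String, String>();
        cues.put("river", "bank#shore");
        cues.put("money", "bank#institution");
        return new CueWordDisambiguator(cues);
    }

    /** Fraction of instances whose predicted sense equals the gold sense. */
    static double accuracy(Disambiguator algorithm, List<Instance> data) {
        int correct = 0;
        for (Instance instance : data) {
            String predicted = algorithm.disambiguate(instance.word, instance.context);
            if (instance.gold.equals(predicted)) {
                correct++;
            }
        }
        return data.isEmpty() ? 0.0 : (double) correct / data.size();
    }

    public static void main(String[] args) {
        List<Instance> data = sampleData();
        // Same data set, two interchangeable algorithms.
        System.out.println("first-sense baseline: " + accuracy(firstSense(), data));
        System.out.println("cue-word algorithm:   " + accuracy(cueWord(), data));
    }
}
```

In the actual framework the components are UIMA annotators wired into a pipeline rather than plain objects, so this sketch mirrors only the design principle and not the real class structure.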
|Tags||NLP, computational linguistics, word sense disambiguation, WSD|
|Licenses||GPLv3, Apache 2.0|
|Operating Systems||Java Runtime Environment 6|
|Implementation||Java 6+, UIMA|
Release Notes: Evaluators now permit chaining of backoff algorithms. There are now annotators that allow for disambiguating the complete text collectively. There is now a weighted MFS baseline. The sense cluster evaluator now computes McNemar's test. The sense cluster evaluator now handles the case where there are multiple gold-standard senses, and includes undisambiguated instances in the confusion matrix. Bugs were fixed.
Release Notes: New features include support for the IMS disambiguator, a new sense inventory wrapping the GermaNet Java API, and a new wrapper module for easy disambiguation of text strings. The WebCAGe reader now works with the official release of WebCAGe. The SemCor reader optionally writes Token, Lemma, and POS annotations. Readers of XML-based data sets can now optionally ignore the DTD. The cluster evaluator's output is more verbose and informative. There are also a few bug fixes and API changes.
Release Notes: Upgraded to DKPro Core 1.5.0, uimaFIT 2.0.0, UBY 0.4.0, and TWSI 1.0.1. Adds a module for word sense induction. Moves Wikipedia-specific graph algorithms to a separate module.
Release Notes: This is the initial public release.