STATSREP-ML by cguckelsberger

STATSREP-ML is an open-source solution for automating the evaluation of machine-learning results. It calculates qualitative statistics, performs the appropriate tests, and reports the results in a comprehensive way. It relies largely, but not exclusively, on well-tested and robust statistics implementations in R, and uses the tests that the machine-learning community has largely agreed upon.

Features

Straightforward configuration, either programmatically or using an XML file.
Support for sample input from k-fold cross-validation, repeated cross-validation on one or multiple datasets, and train-test splits.
Support for two or more than two values of the independent variable, i.e. either classifiers or features. The appropriate tests for the number of groups are selected automatically.
Support for sample sets with two independent variables, by automatically splitting the data along a predefined fixed independent variable value.
Integration of both parametric and non-parametric omnibus and post-hoc tests that are commonly used for comparing machine-learning results.
Integration of specific parametric and non-parametric tests to compare multiple models against a baseline, and support for input data annotations to indicate the baseline model.
Automatic p-value correction, with several adjustment techniques that range from more to less conservative.
Testing of parametric test assumptions such as normality and sphericity, allowing an easy application of these algorithms.
Generation of both a plain-text and a better-structured LaTeX report, comprising sample tables, qualitative statistics, basic graphs and the evaluation results.
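
The automatic test selection mentioned in the list above can be pictured with a small Java sketch. This is an illustration only, not STATSREP-ML's actual API: the class TestSelectionSketch, the method omnibusTest and its boolean flag are hypothetical names. The rule simply maps the number of groups and the outcome of the assumption checks onto the omnibus tests named under "Tests and Correction Methods". McNemar's test is left out because its use depends on the kind of input data rather than on the number of groups.

```java
/** Hypothetical helper for illustration; not part of the STATSREP-ML API. */
final class TestSelectionSketch {

    private TestSelectionSketch() {
        // Static helper, no instances needed.
    }

    /**
     * Picks an omnibus test for paired samples.
     *
     * @param groups number of values of the independent variable
     *               (e.g. number of classifiers being compared)
     * @param parametricAssumptionsHold assumed result of checks such as
     *               normality and sphericity
     * @return the name of the test to run
     */
    static String omnibusTest(int groups, boolean parametricAssumptionsHold) {
        if (groups < 2) {
            throw new IllegalArgumentException("Need at least two groups to compare");
        }
        if (groups == 2) {
            return parametricAssumptionsHold
                    ? "dependent t-test"
                    : "Wilcoxon signed-rank test";
        }
        // More than two groups: use a test designed for multiple related samples.
        return parametricAssumptionsHold
                ? "repeated-measures one-way ANOVA"
                : "Friedman's test";
    }

    public static void main(String[] args) {
        // Compare three classifiers whose samples failed the normality check.
        System.out.println(omnibusTest(3, false));
    }
}
```

In a real run the flag would come from the assumption tests themselves; here it is passed in by hand to keep the sketch self-contained.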

Tests and Correction Methods

Parametric, omnibus: dependent t-test, repeated-measures one-way ANOVA
Parametric, post-hoc: Dunnett test, Tukey HSD test
Non-parametric, omnibus: Wilcoxon Signed-Rank test, Friedman's test, McNemar test
Non-parametric, post-hoc: Nemenyi test, pairwise Wilcoxon Signed-Rank test
P-value correction methods: Bonferroni, Hochberg, Holm, Hommel, Benjamini-Hochberg, Benjamini-Yekutieli
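
To make the difference between a more and a less conservative correction concrete, here is a minimal Java sketch of two of the listed methods, Bonferroni and Holm. It follows the standard textbook definitions and is not taken from the STATSREP-ML code base, which largely delegates to R. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative p-value adjustments; names are hypothetical, not STATSREP-ML API. */
final class PValueCorrectionSketch {

    private PValueCorrectionSketch() {
        // Static helper, no instances needed.
    }

    /** Bonferroni: multiply every p-value by the number of tests, capped at 1. */
    static double[] bonferroni(double[] p) {
        double[] adjusted = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            adjusted[i] = Math.min(1.0, p[i] * p.length);
        }
        return adjusted;
    }

    /**
     * Holm step-down: the k-th smallest p-value (k = 1..m) is multiplied by
     * (m - k + 1), capped at 1, and the adjusted values are forced to be
     * non-decreasing in the order of the raw p-values.
     */
    static double[] holm(double[] p) {
        int m = p.length;
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) {
            order[i] = i;
        }
        // Sort the indices by ascending raw p-value.
        Arrays.sort(order, (a, b) -> Double.compare(p[a], p[b]));

        double[] adjusted = new double[m];
        double runningMax = 0.0;
        for (int rank = 0; rank < m; rank++) {
            double scaled = Math.min(1.0, (m - rank) * p[order[rank]]);
            runningMax = Math.max(runningMax, scaled);
            // Write the result back at the original position.
            adjusted[order[rank]] = runningMax;
        }
        return adjusted;
    }

    public static void main(String[] args) {
        double[] raw = {0.01, 0.04, 0.03};
        System.out.println("Bonferroni: " + Arrays.toString(bonferroni(raw)));
        System.out.println("Holm:       " + Arrays.toString(holm(raw)));
    }
}
```

Holm never yields larger adjusted p-values than Bonferroni, which is why it counts as the less conservative of the two.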

Additional tests can easily be integrated, either by calling the corresponding R packages or by implementing them natively in Java (see the Wiki).

Getting Started

Please have a look at our Wiki for a quick introduction to get you started, and more information on setting up and extending STATSREP-ML.

How to Cite?

If you use STATSREP-ML in research, please cite the following paper:

Christian Guckelsberger, Axel Schulz (2014). STATSREP-ML: Statistical Evaluation & Reporting Framework for Machine Learning Results. Technical Report. Published by tuprints [http://tuprints.ulb.tu-darmstadt.de/id/eprint/4294].

Licence

While most STATSREP-ML modules are available under the Apache Software License (ASL) version 2, there are a few modules that depend on external libraries and are thus licensed under the GPL. The license of each individual module is specified in its LICENSE file.

Note that while the components' source code itself is licensed under the ASL or GPL, individual components might make use of third-party libraries or products that are not licensed under the ASL or GPL. Please make sure that you are aware of the third-party licenses and respect them.

This project was initiated under the auspices of Prof. Dr. Max Mühlhäuser, Telecooperation Lab (TK), Technische Universität Darmstadt.