The Gonito evaluation platform

A key step in developing machine learning systems is comparing the quality of different methods. To do this, the data is divided into three sets: training, validation and test. Models are trained on the training set, and their performance is checked during development on the validation set. The final solution is evaluated on the test set. The result is expressed by a metric, which reduces quality to a single number. The choice of metric is often not obvious and depends strongly on the specifics of the task.
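The split and the single-number metric described above can be sketched in a few lines of Python. This is only an illustration, not Gonito code: the function names `split_dataset` and `accuracy`, the 80/10/10 proportions and the choice of accuracy as the metric are all assumptions made for the example.

```python
import random


def split_dataset(examples, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle examples and split them into training, validation and test sets.

    Hypothetical helper; the fractions are illustrative defaults.
    """
    shuffled = list(examples)
    # A fixed seed makes the split repeatable.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


def accuracy(expected, predicted):
    """Return the fraction of predictions that match the expected answers."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equally long")
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)
```

Accuracy is just one possible metric; a regression or translation task would need a different one, which is why the choice is left to whoever defines the task.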

To make this process structured and repeatable, we created the Gonito platform. The authors of a machine learning challenge split the data into sets and select the metric, and the challenge participants submit their solutions along with their outputs for the test set. The expected answers for the test set are hidden from the participants, so that they have no opportunity to cheat. Gonito users can view the results of all solutions on a single screen and compare them using the chosen metric, which is calculated automatically. In addition, Gonito offers a number of tools for error analysis.
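The idea of scoring every submission against hidden answers and listing the results side by side can be sketched as follows. This is a simplified illustration under stated assumptions, not Gonito's actual implementation: `score_submissions`, `accuracy` and the `higher_is_better` flag are hypothetical names, and submissions are modelled as plain lists of outputs.

```python
def accuracy(expected, predicted):
    """Return the fraction of predictions that match the expected answers."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("expected and predicted must be non-empty and equally long")
    return sum(e == p for e, p in zip(expected, predicted)) / len(expected)


def score_submissions(hidden_expected, submissions, metric, higher_is_better=True):
    """Score each submission against the hidden answers and rank the results.

    `submissions` maps a (hypothetical) submission name to its list of
    test-set outputs. Participants only supply outputs; the expected
    answers stay on the evaluator's side.
    """
    scores = {
        name: metric(hidden_expected, outputs)
        for name, outputs in submissions.items()
    }
    # Best result first, according to the direction of the metric.
    return sorted(scores.items(), key=lambda item: item[1], reverse=higher_is_better)


# Usage: two toy submissions scored against four hidden answers.
hidden = ["a", "b", "c", "d"]
submissions = {
    "baseline": ["a", "a", "a", "a"],
    "model": ["a", "b", "c", "a"],
}
leaderboard = score_submissions(hidden, submissions, accuracy)
```

Because the metric is passed in as a function, the same ranking logic works for any metric the challenge authors pick, as long as the `higher_is_better` direction is set accordingly.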

Gonito is used in a similar way to the Kaggle platform, but differs from it by being open source and by requiring the use of git to submit solutions. Because it is open source, the platform can be freely customized for a specific challenge. Git, in turn, makes it possible to version and archive all machine learning solutions.

Gonito can be run standalone, including on-premises. For this reason, it is used by companies (commercially), universities (to support academic paper publishing and teaching), and organizers of machine learning competitions (such as PolEval).

To learn more about Gonito, read the article and visit the repository.