This tutorial shows how to test the performance of a classifier on a data set. It introduces two classes: EvaluateDataset, which tests a classifier on a data set, and PerformanceMeasure, which stores information about the performance of a classifier.
Evaluate a classifier on a dataset
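- /* Load the training data; the class label is at index 4 and fields are comma-separated */
- Dataset data = FileHandler.loadDataset(new File("devtools/data/iris.data"), 4, ",");
- /* Construct a KNN classifier with k = 5 and train it */
- Classifier knn = new KNearestNeighbors(5);
- knn.buildClassifier(data);
- /* Load the data to evaluate on; this can be a different file, here it is the same one */
- Dataset dataForClassification = FileHandler.loadDataset(new File("devtools/data/iris.data"), 4, ",");
- Map<Object, PerformanceMeasure> pm = EvaluateDataset.testDataset(knn, dataForClassification);
- for (Object o : pm.keySet())
-     System.out.println(o + ": " + pm.get(o).getAccuracy());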
The testDataset method uses the trained classifier to predict the label of every instance in the supplied data set. The performance of the classifier is returned as a map that holds one performance measure per class. A PerformanceMeasure is a wrapper around the counts of true positives, true negatives, false positives and false negatives for that class. It also provides a number of convenience methods that calculate aggregate measures such as accuracy, F-score, recall, precision, sensitivity and specificity.
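To make the per-class map more concrete, the following is a minimal, library-independent sketch of how such counts and measures can be derived from actual and predicted labels in a one-vs-rest fashion. The class ClassCounts, its methods and the toy labels are hypothetical and exist only for illustration; this is not the Java-ML implementation of PerformanceMeasure, which may compute or name things differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-class performance record; NOT Java-ML's PerformanceMeasure. */
class ClassCounts {
    double tp, tn, fp, fn;

    // Each ratio yields NaN when its denominator is zero (double division).
    double accuracy()    { return (tp + tn) / (tp + tn + fp + fn); }
    double precision()   { return tp / (tp + fp); }
    double recall()      { return tp / (tp + fn); }   // also called sensitivity
    double specificity() { return tn / (tn + fp); }
    double fMeasure() {
        double p = precision(), r = recall();
        return 2 * p * r / (p + r);
    }

    /** Builds one-vs-rest counts for every class that appears among the actual labels. */
    static Map<Object, ClassCounts> perClass(Object[] actual, Object[] predicted) {
        Map<Object, ClassCounts> out = new LinkedHashMap<>();
        for (Object label : actual) out.putIfAbsent(label, new ClassCounts());
        for (int i = 0; i < actual.length; i++) {
            for (Map.Entry<Object, ClassCounts> e : out.entrySet()) {
                boolean isActual = e.getKey().equals(actual[i]);
                boolean isPredicted = e.getKey().equals(predicted[i]);
                ClassCounts c = e.getValue();
                if (isActual && isPredicted) c.tp++;   // correctly assigned to this class
                else if (isActual) c.fn++;             // belongs to this class but was missed
                else if (isPredicted) c.fp++;          // wrongly assigned to this class
                else c.tn++;                           // correctly kept out of this class
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy labels, invented for illustration only.
        Object[] actual    = {"setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"};
        Object[] predicted = {"setosa", "setosa", "versicolor", "virginica",  "virginica", "virginica"};
        for (Map.Entry<Object, ClassCounts> e : perClass(actual, predicted).entrySet()) {
            System.out.println(e.getKey() + ": accuracy=" + e.getValue().accuracy()
                    + ", recall=" + e.getValue().recall());
        }
    }
}
```

Because every class is scored against all the others, one misclassified instance shows up twice: as a false negative for its actual class and as a false positive for the predicted class. This is why per-class accuracy values can differ even though they are computed from the same predictions.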