Creating a Dataset

This tutorial shows how to create a Dataset. Simply put, a Dataset is a collection of Instances. This tutorial assumes you know how to create an Instance; either DenseInstance or SparseInstance will work, as both are implementations of the Instance interface.

For the purpose of this tutorial we will use a method of InstanceTools to create Instances with random values for their attributes. In practice you will either create the Instances yourself or load them from a file (see the next page in this tutorial trail). The call InstanceTools.randomInstance(25) creates a DenseInstance with 25 attributes whose values all lie between zero and one.
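To make concrete what such a randomly generated instance contains, here is a minimal plain-Java sketch that does not use the library at all. The class RandomValuesSketch and the method randomValues are hypothetical stand-ins, not the implementation behind InstanceTools. They only illustrate a vector of 25 values, each drawn from the interval [0, 1).

```java
import java.util.Random;

public class RandomValuesSketch {

    // Hypothetical helper (not a library method): returns `count` values,
    // each drawn uniformly from the half-open interval [0, 1).
    static double[] randomValues(int count, Random rng) {
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = rng.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        double[] values = randomValues(25, new Random());
        System.out.println("Number of attributes: " + values.length);
    }
}
```

Whether the library's own generator includes or excludes the endpoints zero and one is not stated here, so treat the exact interval in this sketch as an assumption.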

Now on to creating a Dataset. Dataset is an interface which defines a number of operations on a data set. The default implementation of Dataset is DefaultDataset. At the moment no other implementations of Dataset are available. In the following example we will create a DefaultDataset and populate it with 10 Instances that have been randomly generated as described in the previous paragraph.

  // Create an empty data set.
  Dataset data = new DefaultDataset();
  // Add 10 randomly generated instances with 25 attributes each.
  for (int i = 0; i < 10; i++) {
      Instance tmpInstance = InstanceTools.randomInstance(25);
      data.add(tmpInstance);
  }
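
The same populate-in-a-loop pattern can be tried without the library on the classpath. The sketch below is a stand-in built on assumptions: it models a data set as a java.util.List of double[] vectors, and the names DatasetSketch and buildRandomDataset are hypothetical. It shows the shape of the loop, not how DefaultDataset works internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class DatasetSketch {

    // Hypothetical stand-in for a data set: a list of attribute vectors.
    static List<double[]> buildRandomDataset(int instances, int attributes, Random rng) {
        List<double[]> data = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            // Each "instance" is a vector of random values in [0, 1).
            double[] instance = new double[attributes];
            for (int j = 0; j < attributes; j++) {
                instance[j] = rng.nextDouble();
            }
            data.add(instance);
        }
        return data;
    }

    public static void main(String[] args) {
        List<double[]> data = buildRandomDataset(10, 25, new Random());
        System.out.println(data.size() + " instances, "
                + data.get(0).length + " attributes each");
    }
}
```

Passing the Random in as a parameter lets a fixed seed be supplied when reproducible data is needed, for example in a unit test.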

[Documented source code]