In order to activate the buttons and the menu of the form, data or models must be loaded. In the window, two listboxes will show the details of the loaded data (on the left) and of the loaded (or calculated) model (on the right).
Loading data and models
Data, class vectors and models can be loaded directly from the MATLAB workspace or from a MATLAB data file. To load a dataset, select "file->load data" in the file menu. If CP-ANNs are to be calculated, a class vector must be loaded too ("file->load class").
Models can be saved (see next paragraphs) and loaded in the graphical interface: to load a model, select "file->load models". Loaded data and models can also be deleted. When loading data, classes and models, an automatic filter displays only the allowed MATLAB variables in the listbox.
Sample and variable labels can be loaded from the "file" menu. If loaded, labels are stored in the calculated models and shown in the Kohonen top map. Details on the structure of label vectors are given here. Note that labels must be loaded before the model is calculated if you wish to visualize them.
Viewing the data
Data and class vector (if loaded) can be seen in the view menu ("view->view data matrix" and "view->view class vector").
A new variable (tmp_view) will be created in the MATLAB workspace.
"view->plot samples" will show the sample profiles, while "view->plot means" will show the profiles of the variable averages. If the class vector is loaded, samples will be coloured with the corresponding class colour, while the averages will be calculated on each class separately. Finally, "view->plot univariate stat" will open a form for plotting boxplots, histograms and biplots of variables.
How to calculate models
In order to calculate models select the "calculate model" button or the calculate menu. The "model settings" form will appear:
First of all, the model type must be chosen. If the class vector is not loaded, only Kohonen maps can be calculated (unsupervised modelling); otherwise the user can choose among Counterpropagation Artificial Neural Networks (CP-ANNs), Supervised Kohonen Networks (SKN) and XY-fused networks (XYF), which are supervised methods.
The number of neurons must be set. This is the number of neurons on each side of the map. Since the map is square, entering 7 gives a total of 7*7 = 49 neurons.
The number of epochs can be selected. This defines the number of times the objects will be presented to the net. Note that this toolbox provides a new strategy for selecting the optimal number of epochs and neurons of classification models. Read the corresponding help section here.
The topology condition ('square' or 'hexagonal') can be selected. If you select the 'hexagonal' topology together with a toroidal boundary condition, you must define an even number of neurons.
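The constraints on the map size described above can be summarised in a short sketch. It is written in Python purely for illustration and is not part of the toolbox; the function name `check_map_settings` and its arguments are hypothetical.

```python
def check_map_settings(neurons_per_side, topology, toroidal):
    """Validate the basic map settings and return the total number of neurons.

    Hypothetical helper: it only mirrors the rules stated in the text,
    not the checks actually performed by the toolbox.
    """
    if topology not in ("square", "hexagonal"):
        raise ValueError("topology must be 'square' or 'hexagonal'")
    if neurons_per_side < 1:
        raise ValueError("the number of neurons must be positive")
    # Rule from the text: hexagonal topology with a toroidal boundary
    # requires an even number of neurons.
    if topology == "hexagonal" and toroidal and neurons_per_side % 2 != 0:
        raise ValueError("hexagonal topology with toroidal boundary "
                         "requires an even number of neurons")
    # The map is a square, so the total is side * side.
    return neurons_per_side * neurons_per_side


print(check_map_settings(7, "square", toroidal=False))
print(check_map_settings(8, "hexagonal", toroidal=True))
```

With 7 neurons per side on a square map the total is 49, as in the example above. A hexagonal toroidal map with 7 neurons per side would be rejected.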
If supervised methods are selected as model type, you can also choose to cross-validate the model. Cross-validation can be performed with venetian blinds or contiguous blocks (cross-validation type). With venetian blinds and 3 cross-validation groups, the split of the first group will be [t,0,0,t,0,0,....,t,0,0], the split of the second one will be [0,t,0,0,t,0,....,0,t,0], and so on. With contiguous blocks, on the other hand, the split of the first group will be [t,t,t,t,0,0,....,0,0,0], and so on. If cross-validation is performed, the number of cross-validation groups must be defined (default value is 5).
Besides these basic settings, the user can also change the advanced settings by clicking on the "show advanced settings" button. The advanced settings are thoroughly explained in the start section. Finally, settings can be saved (as a structure) and loaded from this form. Clicking on the "calculate model" button starts the calculation, and a waiting bar appears, showing the percentage of epochs calculated.
After the model calculation, the model window in the main form is updated with the model details (number of neurons and epochs used for training the model, number of variables used to build the model, and error rates in the case of supervised methods).
It is possible to view the top map by clicking on the "view top map" button (or choosing "results->view top map" in the menu): the top map form will appear. For further details on this form (where it is possible to visualize the samples on the top map, the Kohonen weights and the output weights, and to calculate PCA on the weights), see the "How to plot the results" section of Kohonen maps.
If supervised methods (CP-ANNs, XYF, SKN) are calculated, it is possible to look at the classification performance of the model by clicking on the "classification results" button. The "view classification results" form will appear:
On the left of the form, error rate, non-error rate, specificity, sensitivity, precision and the ratio of not assigned samples (when higher than 0) are shown both in fitting (all the samples used to build the model) and in cross-validation (look here for further information on these classification parameters). On the right of the form it is possible to plot the class profiles: a figure will appear showing, for each class, the mean of the Kohonen weights for each layer (variable). The "plot ROC curves" button will open a plot with the ROC curve of each class.
The "view confusion matrix" button shows the confusion matrix (look here for further information on this classification parameter). The "view predicted class" button shows the vector with the class assignment of each sample [n x 1], while the "view class weights" button shows the output weights associated with each sample [n x c], where c is the number of classes. At the end of the training, each sample is placed in a defined neuron. Since each neuron is characterised by Kohonen weights and output weights, it is possible to link each sample to the output weights of the neuron where the sample is placed. These weights represent the class probabilities of assignment for the sample.
Confusion matrix, predicted classes and class weights can be displayed both on the fitting and cross-validation results.
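As an illustration of how the classification parameters relate to the confusion matrix, the following Python sketch (not toolbox code) derives sensitivity, specificity and precision for each class using their standard definitions. It assumes that rows are true classes, columns are assigned classes, and that every sample has been assigned. The non-error rate is taken here as the mean of the class sensitivities and the error rate as its complement; this is an assumption, and the toolbox help should be consulted for the definitions actually used.

```python
def class_metrics(confusion):
    """Per-class sensitivity, specificity and precision.

    confusion[i][j] = number of samples of true class i assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    metrics = []
    for k in range(n_classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                              # class k, assigned elsewhere
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp  # others assigned to k
        tn = total - tp - fn - fp
        metrics.append({
            "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
            "specificity": tn / (tn + fp) if tn + fp else 0.0,
            "precision": tp / (tp + fp) if tp + fp else 0.0,
        })
    return metrics


def non_error_rate(confusion):
    """Assumed definition: mean of the class sensitivities."""
    metrics = class_metrics(confusion)
    return sum(m["sensitivity"] for m in metrics) / len(metrics)


confusion = [[8, 2],
             [1, 9]]
for k, m in enumerate(class_metrics(confusion), start=1):
    print("class", k, m)
ner = non_error_rate(confusion)
print("non-error rate:", round(ner, 3), "error rate:", round(1 - ner, 3))
```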
Kohonen and CP-ANNs models can be saved from the file menu. The models are saved as structures in the MATLAB workspace (for further information on the model structure, see the "How to read the results" sections of the Kohonen maps help sections). If supervised methods (CP-ANNs, XYF, SKN) are calculated, the cross-validation results can also be saved (for further information on the cross-validation structure, see the "Cross validation of Counterpropagation ANNs" section of the CP-ANNs help section). Predictions on new samples can also be saved as structures in the MATLAB workspace (for further information on the prediction structure, see the help provided in each MATLAB m-file, e.g. pred_kohonen).
Predicting new samples
When a model is loaded or calculated, a new set of samples can be loaded, overwriting the set of samples used for the calculation. This enables the "predict sample" button (and the corresponding menu). The button (and the menu) is not active while the data used for the calculation of the model are loaded. The new set of samples can then be predicted using the previously calculated model: the new samples are projected in the model, but are not used to calculate it. Clicking the button, the "predict samples" form will appear:
If Kohonen maps are calculated, only the "view samples in top map" button is active. Clicking on this button, the top map with the new predicted samples is shown. Predicted samples are marked in red with a "P" at the beginning of their labels.
If supervised methods (CP-ANNs, XYF, SKN) are calculated, the "view predicted class" and "view class weights" buttons are active. The "view predicted class" shows the vector with the class prediction of each sample [n x 1], while the "view class weights" button shows the output weights associated to each sample [n x c], where c is the number of classes. Each sample is placed in a defined neuron. Since each neuron is characterised by Kohonen weights and output weights, it is possible to link each sample to the output weights of the neuron where the sample is placed. These weights represent the class probabilities of assignment for the sample.
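The link between a sample, its neuron and the class weights can be sketched as follows. This Python fragment is only an illustration and not the toolbox implementation. It assumes that the winning neuron is the one whose Kohonen weights have the smallest Euclidean distance from the sample, that the predicted class is the one with the largest output weight, and that neurons are stored as a flat list; unassigned samples are not handled.

```python
def winning_neuron(sample, kohonen_weights):
    """Index of the neuron closest to the sample (squared Euclidean distance)."""
    best_index, best_distance = 0, float("inf")
    for index, weights in enumerate(kohonen_weights):
        distance = sum((s - w) ** 2 for s, w in zip(sample, weights))
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index


def predict_samples(samples, kohonen_weights, output_weights):
    """Return the predicted class (1..c) and the class weights of each sample."""
    predicted_class = []   # n entries, one per sample
    class_weights = []     # n rows, c columns
    for sample in samples:
        neuron = winning_neuron(sample, kohonen_weights)
        weights = output_weights[neuron]
        class_weights.append(weights)
        # Assumption: the class with the largest output weight is assigned.
        predicted_class.append(weights.index(max(weights)) + 1)
    return predicted_class, class_weights


# Toy model with two neurons, two variables and two classes (made-up numbers).
kohonen_weights = [[0.0, 0.0], [1.0, 1.0]]
output_weights = [[0.9, 0.1], [0.2, 0.8]]
new_samples = [[0.1, 0.2], [0.9, 0.7]]

predicted, weights = predict_samples(new_samples, kohonen_weights, output_weights)
print(predicted)
print(weights)
```

The new samples only select existing neurons; the Kohonen and output weights are left unchanged, consistent with the samples being projected in the model rather than used to calculate it.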
Finally, if a class was previously loaded together with the samples to be predicted, classification parameters on the predicted samples are calculated. On the left of the form, error rate, non-error rate, specificity, sensitivity, precision and the ratio of not assigned samples (when higher than 0) are shown.
The "view confusion matrix" button shows the confusion matrix. Look here for further information on these classification parameters.
Finally, prediction results can be saved from the "file->save prediction" menu.