## Machine Learning Terminology

#### Classifier

A program or a function which maps from unlabeled instances to classes is called a classifier.

#### Confusion Matrix

A confusion matrix, also called a contingency table or error matrix, is used to visualize the performance of a classifier.

The columns of the matrix represent the instances of the predicted classes and the rows represent the instances of the actual class. (Note: It can be the other way around as well.)

In the case of binary classification the table has 2 rows and 2 columns.

Example:

| Confusion Matrix | Predicted: cat | Predicted: dog |
|---|---|---|
| Actual: cat | 42 | 8 |
| Actual: dog | 18 | 32 |

This means that the classifier correctly predicted a cat in 42 cases and wrongly predicted 8 cat instances as dog. It correctly predicted 32 instances as dog, while 18 dog instances were wrongly predicted as cat.
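As a minimal sketch, the table above can be rebuilt by counting (actual, predicted) pairs. The label lists below are not from a real classifier; they are constructed only so that they reproduce the counts of the example, and the helper name `confusion_matrix` is an assumption made for illustration.

```python
from collections import Counter

# Label lists constructed to reproduce the counts of the example table:
# 50 actual cats (42 predicted as cat, 8 as dog),
# 50 actual dogs (18 predicted as cat, 32 as dog).
actual = ["cat"] * 50 + ["dog"] * 50
predicted = ["cat"] * 42 + ["dog"] * 8 + ["cat"] * 18 + ["dog"] * 32


def confusion_matrix(actual, predicted, classes):
    """Return a nested dict: matrix[actual_class][predicted_class] -> count."""
    counts = Counter(zip(actual, predicted))
    return {a: {p: counts[(a, p)] for p in classes} for a in classes}


matrix = confusion_matrix(actual, predicted, ["cat", "dog"])
for actual_class, row in matrix.items():
    print(actual_class, row)
```

Rows correspond to the actual classes and columns to the predicted classes, matching the layout used in the table.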

#### Accuracy

Accuracy is a statistical measure defined as the number of correct predictions made by a classifier divided by the total number of predictions it made.

Accuracy: $(TN + TP)/(TN + TP + FN + FP)$

where:

- TP: true positive
- TN: true negative
- FP: false positive
- FN: false negative

Our example has no "True" or "False" classes, only "cat" and "dog". The numerator is therefore the number of cases in which the classifier predicted the actual class, i.e. the actual class is "cat" and the predicted class is "cat", or the actual class is "dog" and the predicted class is "dog". "FP" and "FN" correspond to the cases where the classifier made mistakes, i.e. the actual class is "cat" and the predicted class is "dog", or the actual class is "dog" and the predicted class is "cat".

The classifier in our previous example correctly predicted 42 cat instances and 32 dog instances. Furthermore, it erroneously predicted "dog" in 8 cases where the actual class was "cat", and erroneously predicted "cat" in 18 cases where the actual class was "dog".

Therefore, the accuracy can be calculated by:

accuracy = $(42 + 32) / (42 + 8 + 18 + 32) = 74 / 100$

which is 0.74
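The calculation above can be checked with a few lines of Python. This is only a sketch: the function name `accuracy` is an assumption, and "cat" is arbitrarily treated as the positive class so that the four counts can be mapped to TP, TN, FP and FN.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that were correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total


# Assumption for illustration: "cat" is the positive class, so
# TP = 42 (cat predicted as cat), FN = 8 (cat predicted as dog),
# FP = 18 (dog predicted as cat), TN = 32 (dog predicted as dog).
cat_dog_accuracy = accuracy(tp=42, tn=32, fp=18, fn=8)
print(cat_dog_accuracy)  # 0.74
```

Choosing "dog" as the positive class instead swaps TP with TN and FP with FN, which leaves the accuracy unchanged.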

Let's assume we have a classifier which always predicts "dog". On a test set of 50 cats and 50 dogs, it has an accuracy of 50 %:

| Confusion Matrix | Predicted: cat | Predicted: dog |
|---|---|---|
| Actual: cat | 0 | 50 |
| Actual: dog | 0 | 50 |

The following example demonstrates the so-called accuracy paradox: on imbalanced data, a classifier can reach a high accuracy without being useful.

A spam recognition classifier is described by the following confusion matrix:

| Confusion Matrix | Predicted: spam | Predicted: ham |
|---|---|---|
| Actual: spam | 4 | 1 |
| Actual: ham | 4 | 91 |

The accuracy of this classifier is (4 + 91) / 100, i.e. 95 %.

The following classifier predicts solely "ham" and has the same accuracy.

| Confusion Matrix | Predicted: spam | Predicted: ham |
|---|---|---|
| Actual: spam | 0 | 5 |
| Actual: ham | 0 | 95 |

The accuracy of this classifier is 95%, even though it is not capable of recognizing any spam at all.
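The two spam examples can be compared directly in code. The sketch below assumes each confusion matrix is stored as a nested list with rows for the actual classes and columns for the predicted classes, both ordered as spam, ham. The function name `accuracy_from_matrix` is hypothetical.

```python
def accuracy_from_matrix(matrix):
    """Accuracy of a square confusion matrix (rows: actual, columns: predicted)."""
    # Correct predictions sit on the main diagonal.
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Rows and columns are ordered as [spam, ham].
spam_filter = [[4, 1],
               [4, 91]]
always_ham = [[0, 5],
              [0, 95]]

print(accuracy_from_matrix(spam_filter))  # 0.95
print(accuracy_from_matrix(always_ham))   # 0.95
```

Both matrices yield the same accuracy, although the second classifier never puts anything in the spam column.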

#### Precision and Recall

| Confusion Matrix | Predicted: negative | Predicted: positive |
|---|---|---|
| Actual: negative | TN | FP |
| Actual: positive | FN | TP |

Accuracy: $(TN + TP)/(TN + TP + FN + FP)$

Precision: $TP / (TP + FP)$

Recall: $TP / (TP + FN)$
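To see how these measures behave, they can be applied to the two spam classifiers from the accuracy section. The sketch below assumes that "spam" is the positive class. The function names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here for illustration, not something the formulas themselves define.

```python
def precision(tp, fp):
    """Share of instances predicted positive that are actually positive."""
    predicted_positive = tp + fp
    # Assumed convention: 0.0 if the classifier never predicts "positive".
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp, fn):
    """Share of actually positive instances that were predicted positive."""
    actual_positive = tp + fn
    # Assumed convention: 0.0 if there are no positive instances at all.
    return tp / actual_positive if actual_positive else 0.0


# Spam filter with "spam" as the positive class: TP=4, FN=1, FP=4, TN=91
print(precision(tp=4, fp=4))  # 0.5
print(recall(tp=4, fn=1))     # 0.8

# Classifier that always predicts "ham": TP=0, FN=5, FP=0, TN=95
print(precision(tp=0, fp=0))  # 0.0 (by the convention above)
print(recall(tp=0, fn=5))     # 0.0
```

Both classifiers have an accuracy of 95 %, but the recall of 0.0 shows that the second one does not find a single spam message.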

#### Supervised learning

The machine learning program is given both the input data and the corresponding labels. This means that the training data has to be labelled by a human being beforehand.

#### Unsupervised learning

No labels are provided to the learning algorithm. The algorithm has to figure out a clustering of the input data on its own.

#### Reinforcement learning

A computer program dynamically interacts with its environment. This means that the program receives positive and/or negative feedback, which it uses to improve its performance.