What is a confusion matrix?


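A confusion matrix is typically used to evaluate the performance of a classification model. For example, when you run KNN on the iris data set, the confusion matrix cross-tabulates the predicted species against the actual species, so you can see where the model is right and where it confuses one class with another.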

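# Split the data into training and test sets.
library(caret)
set.seed(100)

# Get the row indices for an 80% training sample, stratified by species
trainIndex <- createDataPartition(iris$Species, p = .8,
                                  list = FALSE,
                                  times = 1)
# Generate the training and test data
train <- iris[trainIndex, ]
test <- iris[-trainIndex, ]

# knn() comes from the "class" package
library(class)
predict <- knn(train = train[, 1:4],  # predictors: the four measurements
               test = test[, 1:4],
               cl = train[, 5],       # class labels: Species
               k = 5)

# Cross-tabulate predictions (rows) against actual species (columns)
table(predict, test[, 5])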

predict      setosa versicolor virginica
  setosa         10          0         0
  versicolor      0          8         1
  virginica       0          2         9

What do you see in the confusion matrix?

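Rows are the predicted species and columns are the actual species. All 10 "setosa" predictions were correct. Of the 9 flowers predicted as "versicolor", 8 really were versicolor and 1 was actually virginica. Of the 11 flowers predicted as "virginica", 9 were correct and 2 were actually versicolor. Overall, 27 of the 30 test flowers (90%) were classified correctly.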
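
The same figures can be computed instead of read off by eye. The sketch below is only an illustration: it retypes the table printed above into a matrix by hand (the names `cm`, `accuracy`, `precision` and `recall` are made up for this example) and uses base R only, so it runs without caret or class installed.

```r
# Confusion matrix copied from the output above:
# rows = predicted species, columns = actual species.
species <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(10, 0, 0,
                0, 8, 1,
                0, 2, 9),
             nrow = 3, byrow = TRUE,
             dimnames = list(predicted = species, actual = species))

# Overall accuracy: correct predictions (the diagonal) over all predictions.
accuracy <- sum(diag(cm)) / sum(cm)   # 27 / 30 = 0.9

# Precision: of everything predicted as a class, the share that was right.
precision <- diag(cm) / rowSums(cm)

# Recall: of everything that truly belongs to a class, the share that was found.
recall <- diag(cm) / colSums(cm)

print(accuracy)
print(round(cbind(precision, recall), 3))
```

Precision is read along the rows (8 of 9 versicolor predictions were right) and recall along the columns (8 of the 10 actual versicolor flowers were found). In a real session you would pass the result of `table()` straight into these formulas; caret also provides a `confusionMatrix()` function that reports similar statistics.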
