Metrics to Evaluate ML Algorithms

Refer to Metrics to Evaluate your Machine Learning Algorithm. This note also includes extra content beyond that article.

Types of Evaluation Metrics

Classification Accuracy

Accuracy = # of correct predictions / Total # of predictions

  • Pros: Works well when there is an equal number of samples per class

  • Cons: Gives a false sense of high accuracy when classes are imbalanced

  • e.g. If samples were 90% Class A and 10% Class B, a model could predict Class A 100% of the time and still score 90% accuracy, despite never detecting Class B (see the sketch below)
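
A minimal sketch of the imbalance pitfall above (the 90/10 labels and the always-predict-A model are made up for illustration):

```python
# Class-imbalance pitfall: 90% of samples are Class A, so a model
# that always predicts A still scores 90% accuracy.
actual = ["A"] * 90 + ["B"] * 10     # 90% Class A, 10% Class B
predicted = ["A"] * 100              # degenerate model: always predicts A

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
print(accuracy)  # 0.9 -- high accuracy despite never detecting Class B
```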

Logarithmic Loss

  • Penalises false classifications, punishing confident wrong predictions most heavily

  • Pros: Good with multiclass classification

  • Prior setup:

    • Classifier must assign probability to each class for all samples

  • Equation parameters:

    • N: Total # of samples

    • M: Total # of classes

    • y_ij: 1 if sample i belongs to class j, 0 otherwise

    • p_ij: Probability of sample i belonging to class j

  • Range: [0, ∞)

    • The closer the log loss is to 0, the more accurate the classifier (see the formula and sketch below)
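
Putting those parameters together, the standard multiclass log loss is Log Loss = -(1/N) * Σ_i Σ_j y_ij * log(p_ij). A minimal sketch in Python (the labels and probabilities below are made up for illustration):

```python
import math

def log_loss(y, p, eps=1e-15):
    # -(1/N) * sum_i sum_j y_ij * log(p_ij)
    # y_ij is 1 if sample i belongs to class j, else 0;
    # p_ij is the predicted probability of sample i belonging to class j.
    total = 0.0
    for y_i, p_i in zip(y, p):
        for y_ij, p_ij in zip(y_i, p_i):
            total += y_ij * math.log(max(p_ij, eps))  # clamp to avoid log(0)
    return -total / len(y)

# Two samples, three classes (hypothetical probabilities)
y = [[1, 0, 0], [0, 0, 1]]
p = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
print(log_loss(y, p))  # ~0.434 -- confident correct predictions keep it low
```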

Confusion Matrix

  • Output: A matrix that describes the complete performance of the model

  • True Positives:

    • Predicted: YES

    • Actual: YES

  • True Negatives:

    • Predicted: NO

    • Actual: NO

  • False Positives:

    • Predicted: YES

    • Actual: NO

  • False Negatives:

    • Predicted: NO

    • Actual: YES

Accuracy of Matrix = (True Positives + True Negatives) / (Total # of Samples)
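
A minimal sketch that tallies the four cells for a binary problem and derives accuracy from them (the labels below are made up for illustration):

```python
# Tally the four confusion-matrix cells, then derive accuracy.
actual    = [1, 1, 0, 0, 1, 0, 1, 0]   # 1 = YES, 0 = NO (hypothetical)
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))  # 3
tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))  # 3
fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))  # 1
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))  # 1

accuracy = (tp + tn) / len(actual)
print(accuracy)  # 0.75
```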

Area Under Curve (AUC)

  • Use Case: Binary Classification

  • Output:

    • Area under the TPR vs. FPR (ROC) curve

    • Probability that the classifier will rank a randomly chosen positive example higher than a randomly chosen negative example

  • True Positive Rate (Sensitivity):

    • Equation: TP / (FN+TP)

    • Range: [0, 1]

    • Meaning: Proportion of actual positive data points that are correctly predicted positive

  • False Positive Rate (1 - Specificity):

    • Equation: FP / (FP+TN)

    • Range: [0, 1]

    • Meaning: Proportion of actual negative data points that are incorrectly predicted positive (see the sketch below)
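
A minimal sketch that sweeps thresholds over classifier scores to trace the ROC curve (FPR on x, TPR on y) and approximates the AUC with the trapezoidal rule (the labels and scores are made up for illustration):

```python
y_true = [0, 0, 1, 1]            # actual labels (hypothetical)
scores = [0.1, 0.4, 0.35, 0.8]   # classifier scores (hypothetical)

points = []
for t in sorted(set(scores)) + [float("inf")]:
    # At threshold t, everything scoring >= t is predicted positive.
    tp = sum(s >= t and y == 1 for s, y in zip(scores, y_true))
    fp = sum(s >= t and y == 0 for s, y in zip(scores, y_true))
    fn = sum(s < t and y == 1 for s, y in zip(scores, y_true))
    tn = sum(s < t and y == 0 for s, y in zip(scores, y_true))
    points.append((fp / (fp + tn), tp / (tp + fn)))  # (FPR, TPR)

points.sort()
# Trapezoidal rule over consecutive (FPR, TPR) points
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(auc)  # 0.75 for these scores
```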

F1 Score

  • Output: Harmonic mean of precision and recall

  • Range: [0, 1]

  • Meaning:

    • Balance between Precision and Recall

    • Precision: of the instances predicted positive, how many actually are positive (TP / (TP + FP))

    • Recall: of the actual positive instances, how many are predicted positive (TP / (TP + FN)) (see the sketch below)
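
A minimal sketch computing F1 from hypothetical confusion-matrix counts:

```python
tp, fp, fn = 3, 1, 1   # hypothetical counts

precision = tp / (tp + fp)   # 0.75: correct among predicted positives
recall    = tp / (tp + fn)   # 0.75: correct among actual positives

f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
print(f1)  # 0.75
```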
