Validation Metrics
Validation metrics evaluate a model's performance on a held-out validation dataset, indicating how well it generalizes to unseen data. Below are common validation metrics grouped by problem type.
1. Regression Problems
- Mean Absolute Error (MAE):
Measures the average absolute difference between predicted and actual values.
Formula: **MAE = (1/n) Σ |y_i - ŷ_i|**
- Mean Squared Error (MSE):
Averages the squared differences between predicted and actual values.
Formula: **MSE = (1/n) Σ (y_i - ŷ_i)²**
- Root Mean Squared Error (RMSE):
The square root of MSE, expressed in the same units as the target variable.
Formula: **RMSE = √MSE**
- R² Score (Coefficient of Determination):
Measures the proportion of variance explained by the model.
Formula: **R² = 1 - (Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)²)**
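A minimal sketch of these regression metrics in Python, assuming scikit-learn and NumPy are available; the `y_true`/`y_pred` arrays are toy values for illustration:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Toy actual vs. predicted target values (illustrative only)
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)   # (1/n) Σ |y_i - ŷ_i|
mse = mean_squared_error(y_true, y_pred)    # (1/n) Σ (y_i - ŷ_i)²
rmse = np.sqrt(mse)                         # same units as the target
r2 = r2_score(y_true, y_pred)               # 1 - SS_res / SS_tot

print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R²={r2:.3f}")
```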
2. Classification Problems
- Accuracy:
Proportion of correct predictions.
Formula: **Accuracy = (Correct Predictions / Total Predictions)**
- Precision:
Measures the proportion of true positives among predicted positives.
Formula: **Precision = TP / (TP + FP)**
- Recall (Sensitivity):
Measures the proportion of true positives identified.
Formula: **Recall = TP / (TP + FN)**
- F1 Score:
Harmonic mean of precision and recall.
Formula: **F1 = 2 × (Precision × Recall) / (Precision + Recall)**
- ROC-AUC:
Area under the ROC curve, which traces the true positive rate against the false positive rate across classification thresholds, summarizing the trade-off between them.
- Log Loss (Cross-Entropy Loss):
Measures how well predicted probabilities match the true labels, heavily penalizing confident but incorrect predictions.
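A minimal sketch of these classification metrics, assuming scikit-learn; the labels and probabilities are toy values, and hard labels are derived from an assumed 0.5 threshold:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, log_loss)

# Toy binary labels and predicted positive-class probabilities (illustrative only)
y_true = np.array([0, 1, 1, 0, 1, 0, 1, 1])
y_prob = np.array([0.2, 0.8, 0.6, 0.3, 0.9, 0.4, 0.7, 0.4])
y_pred = (y_prob >= 0.5).astype(int)  # hard labels from a 0.5 threshold

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1       :", f1_score(y_true, y_pred))
print("ROC-AUC  :", roc_auc_score(y_true, y_prob))    # uses probabilities, not hard labels
print("Log loss :", log_loss(y_true, y_prob))         # penalizes confident mistakes
```

Note that ROC-AUC and log loss are computed from the predicted probabilities, while the other metrics use the thresholded labels.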
3. Clustering Problems
- Silhouette Score:
Measures how similar each point is to its own cluster compared with other clusters; scores range from -1 to 1, with higher values indicating better-defined clusters.
- Adjusted Rand Index (ARI):
Evaluates similarity between true labels and clustering results.
- Davies-Bouldin Index:
Assesses compactness and separation of clusters.
- Inertia:
The within-cluster sum of squared distances from each point to its cluster centroid; lower values indicate tighter clusters.
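A minimal sketch of these clustering metrics, assuming scikit-learn; the data comes from `make_blobs`, which also provides ground-truth labels so ARI can be demonstrated:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, adjusted_rand_score, davies_bouldin_score

# Synthetic data with known cluster labels (illustrative only)
X, y_true = make_blobs(n_samples=300, centers=3, random_state=42)

km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
labels = km.labels_

print("Silhouette    :", silhouette_score(X, labels))         # internal: cohesion vs. separation
print("ARI           :", adjusted_rand_score(y_true, labels)) # external: requires true labels
print("Davies-Bouldin:", davies_bouldin_score(X, labels))     # internal: lower is better
print("Inertia       :", km.inertia_)                         # within-cluster sum of squares
```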
4. Time Series Problems
- Mean Absolute Percentage Error (MAPE):
Expresses prediction error as a percentage.
Formula: **MAPE = (100/n) Σ |(y_i - ŷ_i) / y_i|**
- Symmetric Mean Absolute Percentage Error (sMAPE):
A variant of MAPE that divides the error by the average magnitude of the actual and predicted values, mitigating MAPE's instability when actual values are near zero.
- Mean Squared Logarithmic Error (MSLE):
Computes MSE on log-transformed values, so errors are penalized in relative terms; under-predictions are penalized more heavily than over-predictions.
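A minimal sketch of these forecasting metrics, assuming scikit-learn; `smape` is a hypothetical helper implementing one common sMAPE variant (several definitions exist), and the values are a toy forecast:

```python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_log_error

def smape(y_true, y_pred):
    # Hypothetical helper: sMAPE using the |e| / ((|y| + |ŷ|) / 2) form
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    return np.mean(np.abs(y_true - y_pred) / denom) * 100

# Toy forecast; values must be non-negative for MSLE (illustrative only)
y_true = np.array([100.0, 120.0, 130.0, 90.0])
y_pred = np.array([110.0, 115.0, 140.0, 85.0])

print("MAPE (%):", mean_absolute_percentage_error(y_true, y_pred) * 100)  # sklearn returns a fraction
print("sMAPE(%):", smape(y_true, y_pred))
print("MSLE    :", mean_squared_log_error(y_true, y_pred))
```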
5. Ranking Problems
- Mean Reciprocal Rank (MRR):
Averages, over queries, the reciprocal of the rank of the first relevant result.
- Normalized Discounted Cumulative Gain (NDCG):
Considers the position of relevant results in a ranked list.
- Precision at k (P@k):
Measures precision for the top-k predictions.
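A minimal sketch of these ranking metrics for a single query, assuming scikit-learn for NDCG; MRR and P@k are computed by hand, and the relevance vector is a toy ranking (best-scored item first):

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Relevance (1 = relevant) of items in model-ranked order, best first (illustrative only)
ranked_relevance = np.array([0, 0, 1, 1, 0, 1])

# MRR for a single query: reciprocal of the 1-based rank of the first relevant item
first_hit_rank = np.argmax(ranked_relevance == 1) + 1
mrr = 1.0 / first_hit_rank

# P@k: fraction of the top-k ranked items that are relevant
k = 3
p_at_k = ranked_relevance[:k].mean()

# NDCG@k: sklearn expects per-item true relevances and predicted scores
true_rel = np.array([[0, 0, 1, 1, 0, 1]])
scores = np.array([[6, 5, 4, 3, 2, 1]])  # descending scores mirror the ranking above
ndcg = ndcg_score(true_rel, scores, k=k)

print(f"MRR={mrr:.3f}  P@{k}={p_at_k:.3f}  NDCG@{k}={ndcg:.3f}")
```

In practice MRR and P@k are averaged over many queries; this sketch shows the per-query computation.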
6. Multi-Label Problems
- Hamming Loss:
Proportion of misclassified labels.
Formula: **Hamming Loss = (1/(n·L)) Σᵢ Σⱼ I(y_ij ≠ ŷ_ij)**
- Subset Accuracy:
Measures the percentage of samples where all labels are correctly predicted.
- Macro/Micro Averaged Metrics:
Per-label metrics averaged equally across labels (macro) or computed from counts pooled across all labels (micro).
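A minimal sketch of these multi-label metrics, assuming scikit-learn; the label matrices are toy binary indicators (rows = samples, columns = labels):

```python
import numpy as np
from sklearn.metrics import hamming_loss, accuracy_score, f1_score

# Toy multi-label indicator matrices: 4 samples × 3 labels (illustrative only)
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1]])

print("Hamming loss   :", hamming_loss(Y_true, Y_pred))    # fraction of wrong label assignments
print("Subset accuracy:", accuracy_score(Y_true, Y_pred))  # exact-match ratio over samples
print("Macro F1       :", f1_score(Y_true, Y_pred, average="macro"))  # unweighted mean over labels
print("Micro F1       :", f1_score(Y_true, Y_pred, average="micro"))  # pools counts across labels
```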
Summary
The choice of validation metric depends on the problem type, dataset characteristics, and business goals.