Validation Metrics

Validation metrics evaluate a model's performance on a held-out validation dataset and indicate how well it generalizes to unseen data. Below are common validation metrics, grouped by problem type: regression, classification, clustering, forecasting, ranking, and multi-label classification.

  • Mean Absolute Error (MAE):

Measures the average absolute difference between predicted and actual values.

  Formula:  
  **MAE = (1/n) Σ |y_i - ŷ_i|**

  • Mean Squared Error (MSE):

Averages the squared differences between predicted and actual values.

  Formula:  
  **MSE = (1/n) Σ (y_i - ŷ_i)²**

  • Root Mean Squared Error (RMSE):

The square root of MSE, with the same units as the target variable.

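  Formula:  
  **RMSE = √MSE**
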
  • R² Score (Coefficient of Determination):

Measures the proportion of variance explained by the model.

  Formula:  
  **R² = 1 - (Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)²)**

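As a quick illustration, the regression metrics above can be computed with scikit-learn; the sample values below are made up purely for demonstration:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up actual and predicted values, for illustration only
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)          # RMSE is simply the square root of MSE
r2 = r2_score(y_true, y_pred)

print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R²={r2:.3f}")
```
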
  • Accuracy:

Proportion of correct predictions.

  Formula:  
  **Accuracy = Correct Predictions / Total Predictions**

  • Precision:

Measures the proportion of true positives among predicted positives.

  Formula:  
  **Precision = TP / (TP + FP)**

  • Recall (Sensitivity):

Measures the proportion of true positives identified.

  Formula:  
  **Recall = TP / (TP + FN)**

  • F1 Score:

Harmonic mean of precision and recall.

  Formula:  
  **F1 = 2 × (Precision × Recall) / (Precision + Recall)**

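A minimal sketch of the four classification metrics above, again with scikit-learn and made-up binary labels:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Made-up binary labels, for illustration only
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1       :", f1_score(y_true, y_pred))
```
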
  • ROC-AUC:

The area under the ROC curve, which plots the true positive rate against the false positive rate across classification thresholds; higher values indicate better ranking of positives above negatives.

  • Log Loss (Cross-Entropy Loss):

Evaluates the quality of predicted probabilities, heavily penalizing confident but wrong predictions.
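
  Formula (binary case):  
  **Log Loss = -(1/n) Σ [y_i log(p_i) + (1 - y_i) log(1 - p_i)]**

Both ROC-AUC and log loss are computed from predicted probabilities rather than hard labels; a minimal sketch with invented probabilities:

```python
from sklearn.metrics import roc_auc_score, log_loss

# Made-up probabilities for the positive class, for illustration only
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

print("ROC-AUC :", roc_auc_score(y_true, y_prob))
print("Log loss:", log_loss(y_true, y_prob))
```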

  • Silhouette Score:

Measures how similar an object is to its cluster compared to other clusters.

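  Formula:  
  **s = (b - a) / max(a, b)**, where a is the mean distance to points in the same cluster and b is the mean distance to points in the nearest other cluster.
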
  • Adjusted Rand Index (ARI):

Evaluates the similarity between true labels and cluster assignments, corrected for chance agreement.

  • Davies-Bouldin Index:

Assesses the compactness and separation of clusters; lower values indicate better clustering.

  • Inertia:

The sum of squared distances from each sample to its nearest cluster center; lower values indicate more tightly grouped clusters.
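
All four clustering metrics are easy to obtain with scikit-learn (inertia is exposed as an attribute of a fitted KMeans model); a minimal sketch on synthetic data:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (silhouette_score, adjusted_rand_score,
                             davies_bouldin_score)

# Synthetic blobs with known labels, for illustration only
X, y_true = make_blobs(n_samples=300, centers=3, random_state=42)
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

print("Silhouette    :", silhouette_score(X, km.labels_))
print("ARI           :", adjusted_rand_score(y_true, km.labels_))
print("Davies-Bouldin:", davies_bouldin_score(X, km.labels_))
print("Inertia       :", km.inertia_)
```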

  • Mean Absolute Percentage Error (MAPE):

Expresses prediction error as a percentage of the actual value; it is undefined when any actual value is zero.

  Formula:  
  **MAPE = (100/n) Σ |(y_i - ŷ_i) / y_i|**

  • Symmetric Mean Absolute Percentage Error (sMAPE):

A variant of MAPE that divides each error by the average magnitude of the actual and predicted values, keeping the metric bounded and better behaved when actual values are near zero.

  • Mean Squared Logarithmic Error (MSLE):

Mean squared error computed on log-transformed values; it measures relative rather than absolute error and penalizes under-prediction more heavily than over-prediction.
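
  Formula:  
  **MSLE = (1/n) Σ (log(1 + y_i) - log(1 + ŷ_i))²**

A short sketch of the three forecasting errors; note that sMAPE has several competing definitions, and the one used here divides by the mean of the absolute actual and predicted values:

```python
import numpy as np
from sklearn.metrics import mean_squared_log_error

# Made-up forecasts, kept positive so MAPE and MSLE are well defined
y_true = np.array([100.0, 150.0, 200.0, 250.0])
y_pred = np.array([110.0, 140.0, 180.0, 260.0])

mape = 100 * np.mean(np.abs((y_true - y_pred) / y_true))
# One common sMAPE variant: divide by the mean of |actual| and |predicted|
smape = 100 * np.mean(np.abs(y_true - y_pred)
                      / ((np.abs(y_true) + np.abs(y_pred)) / 2))
msle = mean_squared_log_error(y_true, y_pred)

print(f"MAPE={mape:.2f}%  sMAPE={smape:.2f}%  MSLE={msle:.5f}")
```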

  • Mean Reciprocal Rank (MRR):

Evaluates ranking quality based on the reciprocal of the rank of the first relevant result.

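  Formula:  
  **MRR = (1/|Q|) Σ (1/rank_i)**, where rank_i is the position of the first relevant result for query i.
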
  • Normalized Discounted Cumulative Gain (NDCG):

Scores a ranked list by discounting each result's relevance gain logarithmically with its position, then normalizing by the score of the ideal ordering.

  • Precision at k (P@k):

Measures precision for the top-k predictions.
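
MRR and P@k are simple enough to compute by hand, and NDCG is available in scikit-learn; the single-query relevance judgments below are invented for illustration:

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Relevance of results in the order the model ranked them (1 = relevant),
# made up for illustration only
ranked_relevance = np.array([0, 1, 0, 1, 1])

# MRR for a single query: reciprocal of the 1-based rank of the first hit
mrr = 1.0 / (np.argmax(ranked_relevance == 1) + 1)

# P@k: fraction of the top-k results that are relevant
k = 3
p_at_k = ranked_relevance[:k].mean()

# NDCG via scikit-learn: true gains and model scores, one row per query
true_gains = np.array([[0, 1, 0, 1, 1]])
model_scores = np.array([[5, 4, 3, 2, 1]])  # scores that produce the ranking above
ndcg = ndcg_score(true_gains, model_scores)

print(f"MRR={mrr:.3f}  P@{k}={p_at_k:.3f}  NDCG={ndcg:.3f}")
```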

  • Hamming Loss:

Proportion of misclassified labels.

  Formula:  
  **Hamming Loss = (1/nL) ΣΣ I(y_ij ≠ ŷ_ij)**, where n is the number of samples and L the number of labels.

  • Subset Accuracy:

Measures the percentage of samples where all labels are correctly predicted.

  • Macro/Micro Averaged Metrics:

Macro averaging computes the metric independently for each label and takes the unweighted mean, while micro averaging pools the counts (TP, FP, FN) across all labels, effectively weighting each label by its support.
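
A minimal multi-label sketch with scikit-learn: hamming_loss gives the Hamming loss, accuracy_score on label-indicator matrices is exactly subset accuracy, and the average parameter of f1_score switches between macro and micro averaging. The label matrices are made up for illustration:

```python
import numpy as np
from sklearn.metrics import hamming_loss, accuracy_score, f1_score

# Made-up multi-label indicator matrices: n samples × L labels
Y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
Y_pred = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])

print("Hamming loss   :", hamming_loss(Y_true, Y_pred))
print("Subset accuracy:", accuracy_score(Y_true, Y_pred))  # exact-match ratio
print("Macro F1       :", f1_score(Y_true, Y_pred, average="macro"))
print("Micro F1       :", f1_score(Y_true, Y_pred, average="micro"))
```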

The choice of validation metric depends on the problem type, dataset characteristics, and business goals.
