====== Validation Metrics ======

Validation metrics evaluate the performance of a model on a validation dataset. These metrics help assess how well a model generalizes to unseen data. Below are common validation metrics, categorized by problem type.

===== 1. Regression Problems =====

* **Mean Absolute Error (MAE):** Measures the average absolute difference between predicted and actual values. Formula: **MAE = (1/n) Σ |y_i - ŷ_i|**
* **Mean Squared Error (MSE):** Averages the squared differences between predicted and actual values. Formula: **MSE = (1/n) Σ (y_i - ŷ_i)²**
* **Root Mean Squared Error (RMSE):** The square root of MSE, expressed in the same units as the target variable.
* **R² Score (Coefficient of Determination):** Measures the proportion of variance in the target explained by the model. Formula: **R² = 1 - (Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)²)**

===== 2. Classification Problems =====

* **Accuracy:** Proportion of correct predictions. Formula: **Accuracy = Correct Predictions / Total Predictions**
* **Precision:** Measures the proportion of true positives among predicted positives. Formula: **Precision = TP / (TP + FP)**
* **Recall (Sensitivity):** Measures the proportion of actual positives correctly identified. Formula: **Recall = TP / (TP + FN)**
* **F1 Score:** Harmonic mean of precision and recall. Formula: **F1 = 2 × (Precision × Recall) / (Precision + Recall)**
* **ROC-AUC:** Measures the trade-off between true positive and false positive rates across classification thresholds.
* **Log Loss (Cross-Entropy Loss):** Evaluates the quality of predicted probabilities; lower is better.

===== 3. Clustering Problems =====

* **Silhouette Score:** Measures how similar an object is to its own cluster compared to other clusters.
* **Adjusted Rand Index (ARI):** Evaluates similarity between true labels and clustering results, corrected for chance.
* **Davies-Bouldin Index:** Assesses compactness and separation of clusters; lower is better.
* **Inertia:** Sum of squared distances of points to their assigned cluster centers; measures how tightly grouped the clusters are.
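The regression formulas above can be sketched in plain Python without any libraries; the values for ''y_true'' and ''y_pred'' below are hypothetical example data:

```python
def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R² for paired lists of actuals and predictions."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n              # (1/n) Σ |y_i - ŷ_i|
    mse = sum(e * e for e in errors) / n               # (1/n) Σ (y_i - ŷ_i)²
    rmse = mse ** 0.5                                  # same units as the target
    y_mean = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                # Σ (y_i - ŷ_i)²
    ss_tot = sum((yt - y_mean) ** 2 for yt in y_true)  # Σ (y_i - ȳ)²
    r2 = 1 - ss_res / ss_tot
    return mae, mse, rmse, r2

# Hypothetical example values
mae, mse, rmse, r2 = regression_metrics([3.0, 5.0, 2.0, 7.0], [2.5, 5.0, 3.0, 8.0])
```

Note that MSE and RMSE weight large errors more heavily than MAE, which is why RMSE is often preferred when large deviations are especially costly.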
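Similarly, the binary classification metrics can be derived directly from the TP/FP/FN counts in the formulas above; the label vectors here are hypothetical:

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp)                 # TP / (TP + FP)
    recall = tp / (tp + fn)                    # TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

# Hypothetical example labels
acc, prec, rec, f1 = classification_metrics([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1])
```

In practice a library such as scikit-learn provides these metrics (and handles edge cases like zero denominators), but the arithmetic is exactly what the formulas state.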
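Of the clustering metrics above, inertia is the simplest to illustrate: it is the sum of squared distances from each point to its assigned cluster center. A minimal 2-D sketch, with hypothetical points, centroids, and assignments:

```python
def inertia(points, centroids, labels):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for (x, y), label in zip(points, labels):
        cx, cy = centroids[label]
        total += (x - cx) ** 2 + (y - cy) ** 2
    return total

# Hypothetical data: two clusters in 2-D
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centroids = [(0.5, 0.0), (10.0, 0.0)]
labels = [0, 0, 1]  # cluster assignment for each point
val = inertia(points, centroids, labels)
```

Lower inertia means tighter clusters, but it always decreases as the number of clusters grows, so it is typically used with the elbow method rather than compared across different k in isolation.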
===== 4. Time Series Problems =====

* **Mean Absolute Percentage Error (MAPE):** Expresses prediction error as a percentage of the actual values. Formula: **MAPE = (100/n) Σ |(y_i - ŷ_i) / y_i|**
* **Symmetric Mean Absolute Percentage Error (sMAPE):** A variant of MAPE that reduces its bias toward small actual values by normalizing by the average magnitude of actual and predicted values.
* **Mean Squared Logarithmic Error (MSLE):** Computes squared error on log-transformed values, penalizing relative rather than absolute differences; under-predictions are penalized more heavily than over-predictions of the same size.

===== 5. Ranking Problems =====

* **Mean Reciprocal Rank (MRR):** Evaluates ranking quality as the average, over queries, of the reciprocal rank of the first relevant result.
* **Normalized Discounted Cumulative Gain (NDCG):** Rewards relevant results that appear higher in a ranked list, normalized by the ideal ordering.
* **Precision at k (P@k):** Measures precision over the top-k predictions.

===== 6. Multi-Label Problems =====

* **Hamming Loss:** Proportion of individual label assignments that are misclassified. Formula: **Hamming Loss = (1/(nL)) Σ_i Σ_j I(y_ij ≠ ŷ_ij)**
* **Subset Accuracy:** Measures the percentage of samples where the entire label set is predicted correctly.
* **Macro/Micro Averaged Metrics:** Macro averaging computes a metric per label and averages the results equally; micro averaging pools all label decisions before computing the metric, effectively weighting labels by support.

===== Summary =====

The choice of validation metric depends on the problem type, dataset characteristics, and business goals.
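As a worked example of the time-series error metrics in section 4 above, here is a plain-Python sketch of MAPE and sMAPE; the series values are hypothetical:

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error; assumes no actual value is zero."""
    n = len(y_true)
    return (100 / n) * sum(abs((yt - yp) / yt) for yt, yp in zip(y_true, y_pred))

def smape(y_true, y_pred):
    """Symmetric MAPE: normalizes each error by the mean of |actual| and |forecast|."""
    n = len(y_true)
    return (100 / n) * sum(
        abs(yp - yt) / ((abs(yt) + abs(yp)) / 2) for yt, yp in zip(y_true, y_pred)
    )

# Hypothetical actuals vs. forecasts
m = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
s = smape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
```

Note the assumption in ''mape'': it is undefined whenever an actual value is zero, which is one reason sMAPE or MSLE is often preferred for series that touch zero.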
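The ranking metrics in section 5 can be sketched the same way. Each query is represented here as a list of 0/1 relevance flags in ranked order; the data is hypothetical:

```python
def mean_reciprocal_rank(ranked_relevance):
    """MRR over queries; each query is a list of 0/1 relevance flags in ranked order.
    A query with no relevant result contributes 0 to the average."""
    total = 0.0
    for flags in ranked_relevance:
        for rank, rel in enumerate(flags, start=1):
            if rel:
                total += 1.0 / rank  # reciprocal rank of the first relevant result
                break
    return total / len(ranked_relevance)

def precision_at_k(flags, k):
    """Fraction of the top-k results that are relevant."""
    return sum(flags[:k]) / k

# Hypothetical queries: first relevant result at ranks 2, 1, and 3
mrr = mean_reciprocal_rank([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
p_at_2 = precision_at_k([1, 0, 1, 0], 2)
```

MRR only looks at the first relevant hit per query, which is why NDCG is preferred when multiple relevant results per query should all count.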
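Finally, the Hamming loss formula and subset accuracy from section 6 can be checked with a small sketch; the label matrices below are hypothetical, with rows as samples and columns as labels:

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label slots that are misclassified: (1/(nL)) ΣΣ I(y_ij ≠ ŷ_ij)."""
    n, L = len(Y_true), len(Y_true[0])
    wrong = sum(
        1 for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p) if t != p
    )
    return wrong / (n * L)

def subset_accuracy(Y_true, Y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    exact = sum(1 for row_t, row_p in zip(Y_true, Y_pred) if row_t == row_p)
    return exact / len(Y_true)

# Hypothetical label matrices: 3 samples × 3 labels
Y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
Y_pred = [[1, 0, 1], [0, 0, 0], [1, 1, 1]]
hl = hamming_loss(Y_true, Y_pred)
sa = subset_accuracy(Y_true, Y_pred)
```

The example shows why subset accuracy is much stricter than Hamming loss: two of nine individual labels are wrong, but two of three samples fail the all-labels-correct test.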