====== Validation Metrics ======

Validation metrics evaluate a model's performance on a validation dataset, that is, data held out from training. They indicate how well the model generalizes to unseen data. Common validation metrics are listed below by problem type.

===== 1. Regression Problems =====

  * **Mean Absolute Error (MAE):** Measures the average absolute difference between predicted and actual values. Formula: **MAE = (1/n) Σ |y_i - ŷ_i|**, where y_i is the actual value, ŷ_i is the predicted value, and n is the number of samples.
  * **Mean Squared Error (MSE):** Averages the squared differences between predicted and actual values, so large errors are penalized more heavily than in MAE. Formula: **MSE = (1/n) Σ (y_i - ŷ_i)²**
  * **Root Mean Squared Error (RMSE):** The square root of MSE, expressed in the same units as the target variable. Formula: **RMSE = √MSE**
  * **R² Score (Coefficient of Determination):** Measures the proportion of variance in the target explained by the model. Formula: **R² = 1 - (Σ(y_i - ŷ_i)² / Σ(y_i - ȳ)²)**, where ȳ is the mean of the actual values.

===== 2. Classification Problems =====

In the formulas below, TP, FP, and FN are the counts of true positives, false positives, and false negatives.

  * **Accuracy:** Proportion of correct predictions. Formula: **Accuracy = (Correct Predictions / Total Predictions)**
  * **Precision:** Measures the proportion of true positives among predicted positives. Formula: **Precision = TP / (TP + FP)**
  * **Recall (Sensitivity):** Measures the proportion of actual positives that the model identifies. Formula: **Recall = TP / (TP + FN)**
  * **F1 Score:** Harmonic mean of precision and recall. Formula: **F1 = 2 × (Precision × Recall) / (Precision + Recall)**
  * **ROC-AUC:** Area under the ROC curve, which plots the true positive rate against the false positive rate across classification thresholds.
  * **Log Loss (Cross-Entropy Loss):** Evaluates the quality of predicted probabilities, penalizing confident but wrong predictions most heavily.

===== 3. Clustering Problems =====

  * **Silhouette Score:** Measures how similar an object is to its own cluster compared to other clusters.
  * **Adjusted Rand Index (ARI):** Evaluates the similarity between true labels and clustering results, corrected for chance agreement.
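  * **Davies-Bouldin Index:** Assesses the compactness and separation of clusters; lower values indicate better-separated clusters.
  * **Inertia:** The within-cluster sum of squared distances to the cluster centers, which measures how tightly grouped the clusters are.

===== 4. Time Series Problems =====

  * **Mean Absolute Percentage Error (MAPE):** Expresses prediction error as a percentage of the actual value. Formula: **MAPE = (100/n) Σ |(y_i - ŷ_i) / y_i|**. It is undefined when any actual value y_i is zero.
  * **Symmetric Mean Absolute Percentage Error (sMAPE):** Scales each error by the magnitudes of both the actual and predicted values, which reduces the distortion MAPE shows when actual values are close to zero.
  * **Mean Squared Logarithmic Error (MSLE):** Averages the squared differences between the logarithms of predicted and actual values, so it measures relative rather than absolute error and penalizes under-prediction more than over-prediction.

===== 5. Ranking Problems =====

  * **Mean Reciprocal Rank (MRR):** Evaluates ranking quality based on the reciprocal of the rank of the first relevant result, averaged over queries.
  * **Normalized Discounted Cumulative Gain (NDCG):** Accounts for the position of relevant results in a ranked list, rewarding relevant results that appear near the top.
  * **Precision at k (P@k):** Measures the proportion of relevant results among the top-k predictions.

===== 6. Multi-Label Problems =====

  * **Hamming Loss:** Proportion of misclassified labels. Formula: **Hamming Loss = (1/nL) ΣΣ I(y_ij ≠ ŷ_ij)**, where n is the number of samples and L is the number of labels.
  * **Subset Accuracy:** Measures the percentage of samples for which all labels are correctly predicted.
  * **Macro/Micro Averaged Metrics:** Macro averaging computes the metric for each label and takes the unweighted mean, so every label counts equally. Micro averaging pools the counts (TP, FP, FN) across all labels before computing the metric, so frequent labels dominate.

===== Summary =====

The choice of validation metric depends on the problem type, dataset characteristics, and business goals.

validation_metrics.txt · Last modified: 2025/01/15 22:42 by 127.0.0.1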
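
The formulas given on this page for regression, classification, time series, and multi-label problems can be sketched in plain Python as follows. This is a minimal illustration using only the standard library, not a reference implementation. The function names, the ''positive=1'' label convention, and the choice to return 0.0 when a denominator is zero are assumptions made for this example and do not come from the page.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: (1/n) * sum(|y_i - yhat_i|)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: (1/n) * sum((y_i - yhat_i)^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of MSE."""
    return math.sqrt(mse(y_true, y_pred))


def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (undefined if all y_true are equal)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """MAPE in percent; raises ZeroDivisionError if any actual value is 0."""
    n = len(y_true)
    return 100 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def accuracy(y_true, y_pred):
    """Correct predictions divided by total predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for a binary problem.

    Assumption for this sketch: a metric with a zero denominator is
    reported as 0.0 instead of raising an error.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def hamming_loss(Y_true, Y_pred):
    """Fraction of mismatched labels over n samples with L labels each."""
    n, num_labels = len(Y_true), len(Y_true[0])
    mismatches = sum(
        t != p
        for row_t, row_p in zip(Y_true, Y_pred)
        for t, p in zip(row_t, row_p)
    )
    return mismatches / (n * num_labels)
```

For example, with actual values ''[3.0, 5.0, 2.0, 8.0]'' and predictions ''[2.0, 5.0, 4.0, 7.0]'', the errors are 1, 0, -2, and 1, so ''mae'' returns 1.0 and ''mse'' returns 1.5. In practice an established library is usually preferable to hand-written versions, since it handles edge cases such as empty inputs and multi-class averaging.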