Posted 2025-01-17 in Machine Learning

Model Evaluation

Evaluate a model and decide what to do next.

Model Evaluation

Train Test Split

Split the dataset into two parts: 70% for training and 30% for testing.
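A minimal sketch of the 70/30 split, using only the Python standard library. The helper name `train_test_split`, the `seed` parameter, and the toy data are assumptions for illustration; the post does not say how the split is performed, and shuffling before splitting is a common choice rather than something stated here.

```python
import random

def train_test_split(examples, test_fraction=0.3, seed=0):
    """Shuffle a copy of `examples` and return (train, test)."""
    shuffled = list(examples)
    # A fixed seed keeps the split reproducible (illustrative choice).
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy dataset of (x, y) pairs, made up for this example.
data = [(x, 2 * x + 1) for x in range(10)]
train, test = train_test_split(data)
print(len(train), len(test))
```

Every example ends up in exactly one of the two sets, so the test set stays unseen during training.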

Train/test Procedure for Linear Regression

Fit the parameters by minimizing the cost function $J$ (which includes a regularization term):
$$
J(\mathbf{w},b) = \frac{1}{2m_{train}} \sum\limits_{i = 1}^{m_{train}} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})^2 + \frac{\lambda}{2m_{train}} \sum_{j=1}^{n} w_j^2
$$

Compute Test Error:
$$
J_{test}(\mathbf{w},b) = \frac{1}{2m_{test}} \sum\limits_{i = 1}^{m_{test}} (f_{\mathbf{w},b}(\mathbf{x}_{test}^{(i)}) - y_{test}^{(i)})^2
$$

Compute Train Error:
$$
J_{train}(\mathbf{w},b) = \frac{1}{2m_{train}} \sum\limits_{i = 1}^{m_{train}} (f_{\mathbf{w},b}(\mathbf{x_{train}}^{(i)}) - {y_{train}}^{(i)})^2
$$

If the model overfits the training data, $J_{train}(\mathbf{w},b)$ will be low and $J_{test}(\mathbf{w},b)$ will be high.
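The three formulas above can be sketched in plain Python as follows. The function names, parameter values, and toy data are assumptions for illustration. Note that the regularization term appears only in the cost used for fitting, not in the train or test error.

```python
def predict(w, b, x):
    """Linear model f_{w,b}(x) = w . x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def squared_error_cost(w, b, X, y):
    """Unregularized cost: used for both J_train and J_test."""
    m = len(X)
    return sum((predict(w, b, xi) - yi) ** 2 for xi, yi in zip(X, y)) / (2 * m)

def regularized_cost(w, b, X, y, lam):
    """Cost minimized during fitting: squared error plus (lambda / 2m) * sum(w_j^2)."""
    m = len(X)
    return squared_error_cost(w, b, X, y) + lam / (2 * m) * sum(wj ** 2 for wj in w)

# Hypothetical fitted parameters and toy data.
w, b = [2.0], 1.0
X_train, y_train = [[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0]
X_test, y_test = [[3.0], [4.0]], [7.0, 10.0]

j_fit = regularized_cost(w, b, X_train, y_train, lam=1.0)
j_train = squared_error_cost(w, b, X_train, y_train)
j_test = squared_error_cost(w, b, X_test, y_test)
print(j_train, j_test)
```

In this toy case the model fits the training points exactly, while one test point is off by 1, so the test error is larger than the train error.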

Train/test Procedure for Logistic Regression

Fit the parameters by minimizing the cost function $J$ (which includes a regularization term):
$$
J(\vec{w},b) = -\frac{1}{m_{train}} \sum\limits_{i = 1}^{m_{train}} [y^{(i)}\log(f_{\vec{w}, b}(\vec{x}^{(i)})) + (1 - y^{(i)})\log(1 - f_{\vec{w}, b}(\vec{x}^{(i)}))] + \frac{\lambda}{2m_{train}}\sum_{j=1}^{n} w_j^2
$$

Compute Test Error:
$$
J_{test}(\vec{w},b) = -\frac{1}{m_{test}} \sum\limits_{i = 1}^{m_{test}} [y_{test}^{(i)} \log(f_{\vec{w}, b}(\vec{x}_{test}^{(i)})) + (1 - y_{test}^{(i)})\log(1 - f_{\vec{w}, b}(\vec{x}_{test}^{(i)}))]
$$

Compute Train Error:
$$
J_{train}(\vec{w},b) = -\frac{1}{m_{train}} \sum\limits_{i = 1}^{m_{train}} [y_{train}^{(i)}\log(f_{\vec{w}, b}(\vec{x}_{train}^{(i)})) + (1 - y_{train}^{(i)})\log(1 - f_{\vec{w}, b}(\vec{x}_{train}^{(i)}))]
$$
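As a rough sketch, the unregularized logistic loss used for the train and test error could be computed like this. The function names, the `eps` clipping (added here only to avoid `log(0)`), and the toy inputs are assumptions, not part of the original formulas.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, b, x):
    """f_{w,b}(x) = sigmoid(w . x + b)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def logistic_cost(w, b, X, y, eps=1e-12):
    """Average cross-entropy loss without the regularization term."""
    m = len(X)
    total = 0.0
    for xi, yi in zip(X, y):
        # Clip the probability so that log() never receives 0 (illustrative safeguard).
        p = min(max(predict_proba(w, b, xi), eps), 1.0 - eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return -total / m

# With all-zero parameters every prediction is 0.5, so the loss is log(2).
j_example = logistic_cost([0.0], 0.0, [[1.0], [2.0]], [0, 1])
print(j_example)
```

Calling `logistic_cost` on the training set gives $J_{train}$ and calling it on the test set gives $J_{test}$.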

A common alternative for classification models is to report the misclassification rate instead of the logistic loss:
$J_{test}(\vec{w}, b)$ is the fraction of the test set that has been misclassified.
$J_{train}(\vec{w}, b)$ is the fraction of the training set that has been misclassified.
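The misclassification fraction can be sketched as below. The 0.5 decision threshold is an assumption (the post does not state one), and the function name and sample values are made up for illustration.

```python
def misclassification_rate(probabilities, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label."""
    # Assumed rule: predict 1 when the model output is at least `threshold`.
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    wrong = sum(1 for y_hat, y in zip(predictions, labels) if y_hat != y)
    return wrong / len(labels)

# Hypothetical model outputs and true labels for a 4-example test set.
test_probs = [0.9, 0.2, 0.6, 0.4]
test_labels = [1, 0, 0, 1]
j_test = misclassification_rate(test_probs, test_labels)
print(j_test)
```

Here two of the four test examples are misclassified, so the reported error is 0.5.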

Model Selection

Train Cross-Validation Test Split

Split the dataset into 60% for training, 20% for cross-validation, and 20% for testing.
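A possible sketch of the 60/20/20 split with the standard library. The helper name `train_cv_test_split`, the `seed` parameter, and the shuffle step are assumptions for illustration.

```python
import random

def train_cv_test_split(examples, cv_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and return (train, cv, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_cv = round(len(shuffled) * cv_fraction)
    n_test = round(len(shuffled) * test_fraction)
    cv = shuffled[:n_cv]
    test = shuffled[n_cv:n_cv + n_test]
    train = shuffled[n_cv + n_test:]  # whatever remains, about 60%
    return train, cv, test

examples = list(range(10))  # stand-in for real data
train, cv, test = train_cv_test_split(examples)
print(len(train), len(cv), len(test))
```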

Evaluate and Choose

Train all models on the training set.

Evaluate all models on the cross-validation set and pick the one with the lowest cross-validation error.

Then use the test set to get a fair estimate of the chosen model's accuracy, since the test set played no part in fitting or selecting the model.
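The three steps above can be sketched with a deliberately small family of candidate models. This is an assumption-laden illustration: the candidates are one-feature regularized linear regressions that differ only in $\lambda$, fitted with the closed-form minimizer of the regularized squared-error cost. The post does not say which models are being compared, and all names and data below are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form minimizer of the regularized squared-error cost for one feature.

    Only w is penalized, not b. Assumes the xs are not all identical.
    """
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / (sxx + lam)
    return w, mean_y - w * mean_x

def squared_error(w, b, xs, ys):
    """Unregularized cost, used for the cross-validation and test errors."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

# Toy data, already split 60/20/20 (made up for this example).
x_train, y_train = [0, 1, 2, 3, 4, 5], [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
x_cv, y_cv = [6, 7], [13.0, 15.1]
x_test, y_test = [8, 9], [17.0, 18.9]

lambdas = [0.0, 1.0, 10.0, 100.0]  # candidate models

# 1. Train every candidate on the training set.
models = {lam: fit_ridge_1d(x_train, y_train, lam) for lam in lambdas}

# 2. Pick the candidate with the lowest cross-validation error.
cv_errors = {lam: squared_error(w, b, x_cv, y_cv) for lam, (w, b) in models.items()}
best_lam = min(cv_errors, key=cv_errors.get)

# 3. Report the test error of the chosen model only.
best_w, best_b = models[best_lam]
test_error = squared_error(best_w, best_b, x_test, y_test)
print(best_lam, test_error)
```

The test set is touched only once, after the choice has been made, which is what keeps the final number a fair estimate.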
