
Apr 28, 2026 | 797 words | 8 min read

7.1.3. Task 3

Learning Objectives

  • Evaluate machine learning models using training, validation, and testing datasets

  • Compare model performance across different hyperparameters

  • Visualize model behavior using plots

  • Interpret trends such as overfitting and underfitting

Introduction

In this task, you will evaluate and visualize the performance of both models built in the previous tasks:

  • The KNN classifier from Section 7.1.1

  • The logistic regression model from Section 7.1.2

This assignment focuses on model evaluation and interpretation rather than building new models from scratch. You will reuse functions from those tasks for loading and scaling the dataset, splitting the data, generating predictions, training logistic regression, and calculating accuracy and error.

Task Instructions

Deliverable Reminder

Create a flowchart of your evaluation workflow and save it as tp3_team_3_teamnumber.pdf. Save your program as tp3_team_3_teamnumber.py.

Step 1: Evaluate KNN Across Multiple k Values

Create a function named evaluate_knn_model.

Arguments:

  • k_values (list of int): List of k values to test.

  • X_train (2D NumPy array of floats, shape (n_train, n_features)): Training feature matrix.

  • y_train (1D NumPy array of ints, shape (n_train,)): Training labels.

  • X_val (2D NumPy array of floats, shape (n_val, n_features)): Validation feature matrix.

  • y_val (1D NumPy array of ints, shape (n_val,)): Validation labels.

Returns:

  • metrics (dict): dictionary with the following structure:

metrics = {
    "acc": {"train_acc": [], "val_acc": []},
    "error": {"train_error": [], "val_error": []}
}

For each value of k:

  1. Predict labels on the training set using your KNN model

  2. Predict labels on the validation set using your KNN model

  3. Compute the accuracy and error rate for both sets

  4. Append the results to the appropriate lists in metrics

Use your previously implemented predict_labels_knn and calculate_metrics functions.

Note

You will need the predict_labels_knn and calculate_metrics functions developed in Section 7.1.1. Copy them into your program.
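The loop described in Step 1 could be sketched as follows. This is only an illustration: the two helper functions here are simplified stand-ins, and their signatures (`predict_labels_knn(X_train, y_train, X_query, k)` and `calculate_metrics(y_true, y_pred)` returning `(accuracy, error)`) are assumptions. Your Section 7.1.1 versions may take different arguments, so adapt the calls accordingly.

```python
import numpy as np


def predict_labels_knn(X_train, y_train, X_query, k):
    """Stand-in with an ASSUMED signature: majority vote among the k nearest neighbors."""
    # Euclidean distance from every query row to every training row
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    # np.bincount requires non-negative integer labels
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])


def calculate_metrics(y_true, y_pred):
    """Stand-in with an ASSUMED signature: return (accuracy, error_rate)."""
    accuracy = float(np.mean(y_true == y_pred))
    return accuracy, 1.0 - accuracy


def evaluate_knn_model(k_values, X_train, y_train, X_val, y_val):
    metrics = {
        "acc": {"train_acc": [], "val_acc": []},
        "error": {"train_error": [], "val_error": []},
    }
    for k in k_values:
        # The training set always supplies the neighbors; only the query set changes.
        train_pred = predict_labels_knn(X_train, y_train, X_train, k)
        val_pred = predict_labels_knn(X_train, y_train, X_val, k)
        train_acc, train_err = calculate_metrics(y_train, train_pred)
        val_acc, val_err = calculate_metrics(y_val, val_pred)
        metrics["acc"]["train_acc"].append(train_acc)
        metrics["acc"]["val_acc"].append(val_acc)
        metrics["error"]["train_error"].append(train_err)
        metrics["error"]["val_error"].append(val_err)
    return metrics
```

Because each training point is its own nearest neighbor, training accuracy at k = 1 is normally 1.0000 unless the data contains duplicate points with conflicting labels. That is why the validation curve, not the training curve, is the one to use when choosing k.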

Step 2: Plot KNN Performance

Create a function named plot_knn_performance.

Arguments:

  • metrics (dict): The dictionary returned by evaluate_knn_model.

  • k_values (list of int): List of tested k values.

Returns:

  • None

This function should create one figure containing two side-by-side plots:

  1. Accuracy vs. k

  2. Error Rate vs. k

Each plot must include:

  • A training curve

  • A validation curve

  • A title

  • Labeled axes

  • A legend

Save the figure as train_val.png.
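One possible layout for this figure, sketched with Matplotlib. The figure size, markers, titles, and axis label text are illustrative choices rather than requirements from this page, and the `Agg` backend is selected only so that the sketch runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (an assumption for headless runs)
import matplotlib.pyplot as plt


def plot_knn_performance(metrics, k_values):
    # One figure, two side-by-side axes: accuracy on the left, error rate on the right
    fig, (ax_acc, ax_err) = plt.subplots(1, 2, figsize=(12, 5))

    ax_acc.plot(k_values, metrics["acc"]["train_acc"], marker="o", label="Training")
    ax_acc.plot(k_values, metrics["acc"]["val_acc"], marker="s", label="Validation")
    ax_acc.set_title("Accuracy vs. k")
    ax_acc.set_xlabel("k")
    ax_acc.set_ylabel("Accuracy")
    ax_acc.legend()

    ax_err.plot(k_values, metrics["error"]["train_error"], marker="o", label="Training")
    ax_err.plot(k_values, metrics["error"]["val_error"], marker="s", label="Validation")
    ax_err.set_title("Error Rate vs. k")
    ax_err.set_xlabel("k")
    ax_err.set_ylabel("Error rate")
    ax_err.legend()

    fig.tight_layout()
    fig.savefig("train_val.png")
    plt.close(fig)  # release the figure once it has been written to disk
```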

Step 3: Select the Best k Value

Create a function named select_best_k.

Arguments:

  • k_values (list of int): List of tested k values.

  • metrics (dict): The dictionary returned by evaluate_knn_model.

Returns:

  • best_k (int): The k value with the highest validation accuracy.

Use the validation accuracies in metrics["acc"]["val_acc"] to determine the best value of k. If there is a tie, return the smallest k.
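A minimal sketch of the selection rule, including the tie-break. It does not assume that `k_values` is sorted, and it compares accuracies with exact equality, which is reasonable when every accuracy is a count divided by the same validation-set size.

```python
def select_best_k(k_values, metrics):
    val_acc = metrics["acc"]["val_acc"]
    best_k, best_acc = None, float("-inf")
    for k, acc in zip(k_values, val_acc):
        # Strictly better accuracy wins; on a tie, the smaller k wins.
        if acc > best_acc or (acc == best_acc and k < best_k):
            best_k, best_acc = k, acc
    return best_k


# Validation accuracies taken from the Case 2 sample output on this page
k_values = [1, 3, 5, 7, 9, 15, 31, 53]
metrics = {"acc": {"val_acc": [1.0, 1.0, 1.0, 0.9815, 0.9815, 0.9722, 0.9630, 0.9722]}}
print(select_best_k(k_values, metrics))  # three-way tie at 1.0000, so k = 1 is returned
```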

Step 4: Final KNN Evaluation on the Test Set

Create a function named evaluate_knn_test_set.

Arguments:

  • X_test (2D NumPy array of floats, shape (n_test, n_features)): Test feature matrix.

  • y_test (1D NumPy array of ints, shape (n_test,)): Test labels.

  • X_train (2D NumPy array of floats, shape (n_train, n_features)): Training feature matrix.

  • y_train (1D NumPy array of ints, shape (n_train,)): Training labels.

  • best_k (int): Selected value of k.

Returns:

  • test_accuracy (float): Test-set accuracy.

  • test_error (float): Test-set error rate.

This function should predict the labels of the test set using the selected best_k value, then compute and return the accuracy and error rate.
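As a rough sketch, the final evaluation is a single prediction pass followed by a metrics call. The helper signatures below are assumptions and the helper bodies are simplified stand-ins for the Section 7.1.1 functions. Note that the neighbors still come from the training set; the test set is only queried, and only after `best_k` has been chosen on the validation set.

```python
import numpy as np


def predict_labels_knn(X_train, y_train, X_query, k):
    """Stand-in with an ASSUMED signature: majority vote among the k nearest neighbors."""
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])


def calculate_metrics(y_true, y_pred):
    """Stand-in with an ASSUMED signature: return (accuracy, error_rate)."""
    accuracy = float(np.mean(y_true == y_pred))
    return accuracy, 1.0 - accuracy


def evaluate_knn_test_set(X_test, y_test, X_train, y_train, best_k):
    # Query the test points against the training data using the selected k
    test_pred = predict_labels_knn(X_train, y_train, X_test, best_k)
    test_accuracy, test_error = calculate_metrics(y_test, test_pred)
    return test_accuracy, test_error
```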

Step 5: Evaluate Logistic Regression

Create a function named evaluate_logistic_model.

Arguments:

  • X_train (2D NumPy array of floats, shape (n_train, n_features)): Training feature matrix.

  • y_train (1D NumPy array of ints, shape (n_train,)): Training labels.

  • X_val (2D NumPy array of floats, shape (n_val, n_features)): Validation feature matrix.

  • y_val (1D NumPy array of ints, shape (n_val,)): Validation labels.

  • X_test (2D NumPy array of floats, shape (n_test, n_features)): Test feature matrix.

  • y_test (1D NumPy array of ints, shape (n_test,)): Test labels.

  • w (1D NumPy array of floats, shape (n_features,)): Trained weights.

  • b (float): Trained bias.

Returns:

  • metrics (dict): dictionary with the following structure:

metrics = {
    "train": {"accuracy": 0.0, "error": 0.0},
    "val": {"accuracy": 0.0, "error": 0.0},
    "test": {"accuracy": 0.0, "error": 0.0},
}

Use evaluate_logistic_regression to compute the accuracy and error rate for the training, validation, and test sets.

Note

You will need the evaluate_logistic_regression function developed in Section 7.1.2. Copy it into your program.
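One way to fill the dictionary is to loop over the three splits. In this sketch, `evaluate_logistic_regression(X, y, w, b)` is a stand-in whose signature, return order `(accuracy, error)`, and 0.5 decision threshold are all assumptions; replace it with your Section 7.1.2 function and adjust the call if its interface differs.

```python
import numpy as np


def evaluate_logistic_regression(X, y, w, b):
    """Stand-in with an ASSUMED signature: return (accuracy, error_rate)."""
    probs = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the linear score
    preds = (probs >= 0.5).astype(int)          # assumed 0.5 threshold
    accuracy = float(np.mean(preds == y))
    return accuracy, 1.0 - accuracy


def evaluate_logistic_model(X_train, y_train, X_val, y_val, X_test, y_test, w, b):
    splits = {
        "train": (X_train, y_train),
        "val": (X_val, y_val),
        "test": (X_test, y_test),
    }
    metrics = {}
    for name, (X, y) in splits.items():
        accuracy, error = evaluate_logistic_regression(X, y, w, b)
        metrics[name] = {"accuracy": accuracy, "error": error}
    return metrics
```

The same trained `w` and `b` are applied to all three splits; nothing is refit here.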

Step 6: Plot Logistic Regression Training Behavior

Create a function named plot_logistic_training.

Arguments:

  • loss_history (list of float): The training loss values produced during training.

  • val_loss_history (list of float): The validation loss values produced during training.

Returns:

  • None

This function should create a plot of Loss vs. Iteration. The plot must include:

  • A title

  • Labeled axes

  • A training loss curve

  • A validation loss curve

Save the figure as training_progress.png.
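A possible Matplotlib sketch for this plot. It assumes each history list holds one loss value per recorded iteration, so the list index is used as the x-axis; if your training function records losses less often, scale the x values to match. The legend is not in the required list above, but with two curves on one axes it is a sensible addition.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (an assumption for headless runs)
import matplotlib.pyplot as plt


def plot_logistic_training(loss_history, val_loss_history):
    fig, ax = plt.subplots(figsize=(8, 5))
    # ASSUMPTION: index i in each list corresponds to iteration i
    ax.plot(range(len(loss_history)), loss_history, label="Training loss")
    ax.plot(range(len(val_loss_history)), val_loss_history, label="Validation loss")
    ax.set_title("Loss vs. Iteration")
    ax.set_xlabel("Iteration")
    ax.set_ylabel("Loss")
    ax.legend()
    fig.tight_layout()
    fig.savefig("training_progress.png")
    plt.close(fig)  # release the figure once it has been written to disk
```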

Step 7: Main Function

Create a main function that collects the following inputs from the user:

  • The path of the CSV file containing the feature dataset

  • The choice for shuffling the dataset

  • The seed for shuffling the dataset and model reproducibility

The function should then:

  1. Load the dataset using load_dataset from Section 7.1.1

  2. Split the data into training, validation, and testing sets

  3. Evaluate KNN across several values of k such as [1, 3, 5, 7, 9, 15, 31, 53]

  4. Plot the KNN training and validation performance

  5. Determine the best k using validation accuracy

  6. Evaluate the final KNN model on the test set

  7. Train the logistic regression model using a learning rate of \(0.01\) and \(5000\) iterations, as described in Section 7.1.2

  8. Evaluate logistic regression on the training, validation, and test sets

  9. Plot the logistic regression training loss

Ensure that your code output matches the sample output provided below.
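The input-collection part of the main function might look like the sketch below. The prompt text is copied from the sample output on this page; the helper name `collect_inputs`, the `read` parameter (added only so the function can be exercised without a keyboard), the trailing space after each colon, and the conversion of the shuffle answer to a boolean are assumptions.

```python
def collect_inputs(read=input):
    """Hypothetical helper: gather the three main-function inputs."""
    path = read("Enter the path to the feature dataset: ").strip()
    shuffle_answer = read("Shuffle the dataset? (yes/no): ").strip().lower()
    seed = int(read("Enter a seed for loading the dataset: "))
    # Treat anything other than "yes" as "no" (an assumption about input handling)
    return path, shuffle_answer == "yes", seed


# Example: replay the Case 1 answers without typing them
answers = iter(["img_features.csv", "no", "42"])
print(collect_inputs(lambda prompt: next(answers)))
```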

Sample Output

Main-function inputs for the Team 3 model evaluation and plotting workflow. Use the values in Table 7.7 below to test your program.

Table 7.7 Test Cases

Case  dataset           shuffle  seed
1     img_features.csv  no       42
2     img_features.csv  yes      42

Ensure your program’s output matches the provided samples exactly. This includes all characters, white space, and punctuation. In the samples, user input is highlighted like this for clarity, but your program should not highlight user input in this way.
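The four-decimal values and two-decimal percentages in the samples are consistent with Python's `.4f` and `.2f` format specifiers. The sketch below shows two hypothetical formatting helpers (the names are not part of the assignment). The spacing inside each string is taken from the sample lines as they appear on this page; if the original output uses extra alignment spaces, adjust the strings to match.

```python
def format_knn_row(k, train_acc, val_acc):
    """Hypothetical helper: one line of the KNN validation curve summary."""
    return f"k = {k}: train accuracy = {train_acc:.4f}, validation accuracy = {val_acc:.4f}"


def format_split_row(name, accuracy):
    """Hypothetical helper: one line of the final logistic regression evaluation."""
    error = 1.0 - accuracy
    return (
        f"{name} - Accuracy: {accuracy:.4f} ({accuracy * 100:.2f}%), "
        f"Error Rate: {error:.4f} ({error * 100:.2f}%)"
    )


# 102 of 108 and 92 of 108 correct are inferred counts that round to the sample values
print(format_knn_row(1, 1.0, 102 / 108))
print(format_split_row("Validation Set", 92 / 108))
```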

Case 1 Sample Output

$ python3 tp3_team_3_teamnumber.py
Enter the path to the feature dataset: img_features.csv
Shuffle the dataset? (yes/no): no
Enter a seed for loading the dataset: 42

Data loaded and split into

Training set: size: 864
Validation set: size: 108
Test set: size: 108

KNN Validation Curve Summary
k = 1: train accuracy = 1.0000, validation accuracy = 0.9444
k = 3: train accuracy = 0.9977, validation accuracy = 0.9167
k = 5: train accuracy = 0.9942, validation accuracy = 0.8796
k = 7: train accuracy = 0.9942, validation accuracy = 0.8796
k = 9: train accuracy = 0.9942, validation accuracy = 0.8796
k = 15: train accuracy = 0.9931, validation accuracy = 0.8519
k = 31: train accuracy = 0.9792, validation accuracy = 0.8241
k = 53: train accuracy = 0.9711, validation accuracy = 0.8241

Based on the plots, the best k appears to be: 1

--- Final KNN Model Performance ---
Test Set Accuracy: 0.9630
Test Set Error Rate: 0.0370

Training logistic regression model with learning rate = 0.01 and num_iterations = 5000...

Iteration 0, Train Loss: 0.6802, Val Loss: 0.6716
Iteration 100, Train Loss: 0.3697, Val Loss: 0.3483
Iteration 200, Train Loss: 0.2919, Val Loss: 0.2791
Iteration 300, Train Loss: 0.2581, Val Loss: 0.2572
Iteration 400, Train Loss: 0.2388, Val Loss: 0.2511
Iteration 500, Train Loss: 0.2260, Val Loss: 0.2518
Iteration 600, Train Loss: 0.2168, Val Loss: 0.2559
Iteration 700, Train Loss: 0.2097, Val Loss: 0.2617
Iteration 800, Train Loss: 0.2042, Val Loss: 0.2683
Iteration 900, Train Loss: 0.1997, Val Loss: 0.2753
Iteration 1000, Train Loss: 0.1960, Val Loss: 0.2824
Iteration 1100, Train Loss: 0.1928, Val Loss: 0.2893
Iteration 1200, Train Loss: 0.1902, Val Loss: 0.2960
Iteration 1300, Train Loss: 0.1880, Val Loss: 0.3024
Iteration 1400, Train Loss: 0.1860, Val Loss: 0.3086
Iteration 1500, Train Loss: 0.1843, Val Loss: 0.3144
Iteration 1600, Train Loss: 0.1828, Val Loss: 0.3199
Iteration 1700, Train Loss: 0.1815, Val Loss: 0.3252
Iteration 1800, Train Loss: 0.1804, Val Loss: 0.3301
Iteration 1900, Train Loss: 0.1793, Val Loss: 0.3348
Iteration 2000, Train Loss: 0.1784, Val Loss: 0.3392
Iteration 2100, Train Loss: 0.1776, Val Loss: 0.3434
Iteration 2200, Train Loss: 0.1768, Val Loss: 0.3474
Iteration 2300, Train Loss: 0.1761, Val Loss: 0.3511
Iteration 2400, Train Loss: 0.1755, Val Loss: 0.3547
Iteration 2500, Train Loss: 0.1749, Val Loss: 0.3581
Iteration 2600, Train Loss: 0.1744, Val Loss: 0.3614
Iteration 2700, Train Loss: 0.1739, Val Loss: 0.3644
Iteration 2800, Train Loss: 0.1735, Val Loss: 0.3674
Iteration 2900, Train Loss: 0.1731, Val Loss: 0.3702
Iteration 3000, Train Loss: 0.1727, Val Loss: 0.3728
Iteration 3100, Train Loss: 0.1724, Val Loss: 0.3754
Iteration 3200, Train Loss: 0.1720, Val Loss: 0.3778
Iteration 3300, Train Loss: 0.1717, Val Loss: 0.3801
Iteration 3400, Train Loss: 0.1715, Val Loss: 0.3824
Iteration 3500, Train Loss: 0.1712, Val Loss: 0.3845
Iteration 3600, Train Loss: 0.1709, Val Loss: 0.3865
Iteration 3700, Train Loss: 0.1707, Val Loss: 0.3885
Iteration 3800, Train Loss: 0.1705, Val Loss: 0.3904
Iteration 3900, Train Loss: 0.1703, Val Loss: 0.3922
Iteration 4000, Train Loss: 0.1701, Val Loss: 0.3939
Iteration 4100, Train Loss: 0.1699, Val Loss: 0.3956
Iteration 4200, Train Loss: 0.1697, Val Loss: 0.3972
Iteration 4300, Train Loss: 0.1696, Val Loss: 0.3987
Iteration 4400, Train Loss: 0.1694, Val Loss: 0.4002
Iteration 4500, Train Loss: 0.1693, Val Loss: 0.4016
Iteration 4600, Train Loss: 0.1691, Val Loss: 0.4030
Iteration 4700, Train Loss: 0.1690, Val Loss: 0.4043
Iteration 4800, Train Loss: 0.1688, Val Loss: 0.4056
Iteration 4900, Train Loss: 0.1687, Val Loss: 0.4068

Final Logistic Regression Evaluation:
--------------------------------------------------
Training Set - Accuracy: 0.9317 (93.17%), Error Rate: 0.0683 (6.83%)
Validation Set - Accuracy: 0.8519 (85.19%), Error Rate: 0.1481 (14.81%)
Test Set - Accuracy: 0.8981 (89.81%), Error Rate: 0.1019 (10.19%)

Model Summary
Best KNN validation accuracy: 0.9444
Logistic regression validation accuracy: 0.8519
KNN test accuracy: 0.9630
Logistic regression test accuracy: 0.8981

Fig. 7.2 Case_1_train_val.png

Fig. 7.3 Case_1_training_progress.png
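The Case 1 log is a useful example for the overfitting objective: training loss keeps falling while validation loss reaches a minimum and then climbs. A small sketch for locating that turning point, using the first eleven logged values from the Case 1 output above (the variable names are illustrative):

```python
# Logged every 100 iterations in the Case 1 sample output (iterations 0 to 1000)
iterations = list(range(0, 1100, 100))
train_loss = [0.6802, 0.3697, 0.2919, 0.2581, 0.2388, 0.2260,
              0.2168, 0.2097, 0.2042, 0.1997, 0.1960]
val_loss = [0.6716, 0.3483, 0.2791, 0.2572, 0.2511, 0.2518,
            0.2559, 0.2617, 0.2683, 0.2753, 0.2824]

# Index of the smallest logged validation loss
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_iteration = iterations[best_index]

# Training loss never increases across these entries, while validation loss turns upward
train_still_falling = all(a > b for a, b in zip(train_loss, train_loss[1:]))
val_rises_after_min = all(a < b for a, b in zip(val_loss[best_index:], val_loss[best_index + 1:]))

print(f"Lowest logged validation loss at iteration {best_iteration}")
print(f"Training loss still falling: {train_still_falling}")
print(f"Validation loss rising after the minimum: {val_rises_after_min}")
```

Among the logged values, validation loss is lowest at iteration 400 and rises afterward, which is the widening-gap pattern usually read as overfitting. In Case 2 the validation loss decreases through iteration 4900, so the same check would not flag a turning point there.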

Case 2 Sample Output

$ python3 tp3_team_3_teamnumber.py
Enter the path to the feature dataset: img_features.csv
Shuffle the dataset? (yes/no): yes
Enter a seed for loading the dataset: 42

Data loaded and split into

Training set: size: 864
Validation set: size: 108
Test set: size: 108

KNN Validation Curve Summary
k = 1: train accuracy = 1.0000, validation accuracy = 1.0000
k = 3: train accuracy = 0.9977, validation accuracy = 1.0000
k = 5: train accuracy = 0.9977, validation accuracy = 1.0000
k = 7: train accuracy = 0.9942, validation accuracy = 0.9815
k = 9: train accuracy = 0.9907, validation accuracy = 0.9815
k = 15: train accuracy = 0.9884, validation accuracy = 0.9722
k = 31: train accuracy = 0.9734, validation accuracy = 0.9630
k = 53: train accuracy = 0.9514, validation accuracy = 0.9722

Based on the plots, the best k appears to be: 1

--- Final KNN Model Performance ---
Test Set Accuracy: 1.0000
Test Set Error Rate: 0.0000

Training logistic regression model with learning rate = 0.01 and num_iterations = 5000...

Iteration 0, Train Loss: 0.6802, Val Loss: 0.6705
Iteration 100, Train Loss: 0.3669, Val Loss: 0.3325
Iteration 200, Train Loss: 0.2935, Val Loss: 0.2533
Iteration 300, Train Loss: 0.2634, Val Loss: 0.2205
Iteration 400, Train Loss: 0.2470, Val Loss: 0.2026
Iteration 500, Train Loss: 0.2367, Val Loss: 0.1912
Iteration 600, Train Loss: 0.2296, Val Loss: 0.1832
Iteration 700, Train Loss: 0.2244, Val Loss: 0.1772
Iteration 800, Train Loss: 0.2203, Val Loss: 0.1725
Iteration 900, Train Loss: 0.2171, Val Loss: 0.1688
Iteration 1000, Train Loss: 0.2145, Val Loss: 0.1657
Iteration 1100, Train Loss: 0.2124, Val Loss: 0.1631
Iteration 1200, Train Loss: 0.2105, Val Loss: 0.1609
Iteration 1300, Train Loss: 0.2090, Val Loss: 0.1589
Iteration 1400, Train Loss: 0.2076, Val Loss: 0.1573
Iteration 1500, Train Loss: 0.2065, Val Loss: 0.1558
Iteration 1600, Train Loss: 0.2054, Val Loss: 0.1545
Iteration 1700, Train Loss: 0.2045, Val Loss: 0.1533
Iteration 1800, Train Loss: 0.2037, Val Loss: 0.1523
Iteration 1900, Train Loss: 0.2029, Val Loss: 0.1513
Iteration 2000, Train Loss: 0.2023, Val Loss: 0.1505
Iteration 2100, Train Loss: 0.2017, Val Loss: 0.1497
Iteration 2200, Train Loss: 0.2011, Val Loss: 0.1490
Iteration 2300, Train Loss: 0.2006, Val Loss: 0.1483
Iteration 2400, Train Loss: 0.2001, Val Loss: 0.1477
Iteration 2500, Train Loss: 0.1997, Val Loss: 0.1472
Iteration 2600, Train Loss: 0.1992, Val Loss: 0.1467
Iteration 2700, Train Loss: 0.1989, Val Loss: 0.1462
Iteration 2800, Train Loss: 0.1985, Val Loss: 0.1457
Iteration 2900, Train Loss: 0.1982, Val Loss: 0.1453
Iteration 3000, Train Loss: 0.1978, Val Loss: 0.1449
Iteration 3100, Train Loss: 0.1975, Val Loss: 0.1446
Iteration 3200, Train Loss: 0.1972, Val Loss: 0.1442
Iteration 3300, Train Loss: 0.1970, Val Loss: 0.1439
Iteration 3400, Train Loss: 0.1967, Val Loss: 0.1436
Iteration 3500, Train Loss: 0.1965, Val Loss: 0.1433
Iteration 3600, Train Loss: 0.1962, Val Loss: 0.1431
Iteration 3700, Train Loss: 0.1960, Val Loss: 0.1428
Iteration 3800, Train Loss: 0.1958, Val Loss: 0.1426
Iteration 3900, Train Loss: 0.1956, Val Loss: 0.1423
Iteration 4000, Train Loss: 0.1954, Val Loss: 0.1421
Iteration 4100, Train Loss: 0.1952, Val Loss: 0.1419
Iteration 4200, Train Loss: 0.1950, Val Loss: 0.1417
Iteration 4300, Train Loss: 0.1949, Val Loss: 0.1415
Iteration 4400, Train Loss: 0.1947, Val Loss: 0.1413
Iteration 4500, Train Loss: 0.1945, Val Loss: 0.1412
Iteration 4600, Train Loss: 0.1944, Val Loss: 0.1410
Iteration 4700, Train Loss: 0.1942, Val Loss: 0.1408
Iteration 4800, Train Loss: 0.1941, Val Loss: 0.1407
Iteration 4900, Train Loss: 0.1940, Val Loss: 0.1405

Final Logistic Regression Evaluation:
--------------------------------------------------
Training Set - Accuracy: 0.9086 (90.86%), Error Rate: 0.0914 (9.14%)
Validation Set - Accuracy: 0.9444 (94.44%), Error Rate: 0.0556 (5.56%)
Test Set - Accuracy: 0.9074 (90.74%), Error Rate: 0.0926 (9.26%)

Model Summary
Best KNN validation accuracy: 1.0000
Logistic regression validation accuracy: 0.9444
KNN test accuracy: 1.0000
Logistic regression test accuracy: 0.9074

Fig. 7.4 Case_2_train_val.png

Fig. 7.5 Case_2_training_progress.png

Table 7.8 Deliverables

Deliverables               Description
tp3_team_3_teamnumber.pdf  Flowchart(s) for this task.
tp3_team_3_teamnumber.py   Your completed Python code.