Online Evaluation (Iso Fitts)
For EMG-based control systems, it has been shown that the offline performance of a system (e.g., classification accuracy or mean absolute error) does not necessarily correlate with online usability. In this example, we introduce an Iso Fitts test for assessing the online performance of continuous EMG-based control systems. Different model types, such as regressors and classifiers, cannot easily be compared offline since different metrics are computed for each. Online tests allow us to compare these distinct model types and assess a model's ability to perform a task with a user in the loop.
Methods
This example acts as a mini experiment that you can try out on yourself or a friend where the offline and online performance of four popular classifiers (LDA, SVM, RF, and KNN (k=5)) and two regressors (LR and SVM) are compared.
The steps of this ‘mini experiment’ are as follows:
Accumulate 3 repetitions of five contractions (no movement, flexion, extension, hand open, and hand closed). These classes correspond to movements in the Iso Fitts task (do nothing, and move left, right, up, and down).
Train and evaluate four classifiers in an offline setting (LDA, SVM, KNN (k=5), and RF). For this step, the first 2 reps are used for training and the last for testing.
Perform an Iso Fitts test to evaluate the online usability of each classifier trained in step 2. These Fitts' law tests are useful for computing throughput, overshoots, and efficiency. Ultimately, these metrics provide an indication of the online usability of a model. The Iso Fitts test is well suited to myoelectric control systems as it requires changes in degrees of freedom to complete successfully.
Repeat steps 1-3 using regressors instead of classifiers. Select ‘regression’ from the radio buttons and redo data collection. You will now be shown a video of a point moving through a cartesian plane, which indicates the position along each DOF. Follow the point in real-time to provide the regressor with continuously-labelled training data (as opposed to classes in classification). This video will be repeated 3 times (i.e., 3 repetitions). Note that you can now perform simultaneous contractions (i.e., move the cursor along the diagonal) when using a regressor.
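Step 3 mentions throughput, the primary Fitts' law usability metric. As a rough illustration (the function and the use of the Shannon formulation here are our own sketch, not necessarily how the library computes it), throughput is the trial's index of difficulty divided by its movement time:

```python
import math

def fitts_throughput(distance, width, movement_time):
    # Shannon formulation of the index of difficulty (in bits)
    index_of_difficulty = math.log2(distance / width + 1)
    # throughput in bits per second
    return index_of_difficulty / movement_time

# e.g., a 300 px movement to a 60 px target completed in 1.2 s
tp = fitts_throughput(300, 60, 1.2)
```

Harder targets (farther away or smaller) raise the index of difficulty, so completing them in the same time yields higher throughput.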
Note: We have made this example to work with the Myo Armband. However, it can easily be used with any hardware by simply switching the streamer, `WINDOW_SIZE`, and `WINDOW_INCREMENT`.
Fitts Test
To create the Iso Fitts test, we leveraged `pygame`. The code for this module can be found in `libemg.environments.isofitts.py`.

To increase the speed of the cursor, we could do one of two things: (1) increase the velocity of the cursor (i.e., how many pixels it moves for each prediction), or (2) decrease the window increment so that more predictions are made in the same amount of time. Parameters like this can be modified by passing arguments to the `IsoFitts` constructor.
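To see why the two options are equivalent, note that the cursor's effective speed is pixels-per-prediction times predictions-per-second. A small sketch (the function and values are illustrative; the Myo samples at 200 Hz):

```python
def cursor_speed_px_per_s(velocity_px, sampling_rate_hz, window_increment):
    # predictions per second = sampling rate / window increment (in samples)
    predictions_per_second = sampling_rate_hz / window_increment
    return velocity_px * predictions_per_second

# Halving the increment doubles the prediction rate, and so the cursor speed
slow = cursor_speed_px_per_s(2, 200, 20)  # 20 px/s
fast = cursor_speed_px_per_s(2, 200, 10)  # 40 px/s
```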
Data Analysis
After accumulating data from the experiment, we need a way to analyze it. In `analyze_data.py`, we added the capability to evaluate each model's offline and online performance.
To evaluate each model's offline performance, we take a similar approach to setting up the online model. In this case, however, we have to split the data into training and testing sets. To do this, we first extract each of the 3 reps of data; the train/test split happens in a later step.
regex_filters = [
    RegexFilter(left_bound="classification/C_", right_bound="_R", values=["0","1","2","3","4"], description='classes'),
    RegexFilter(left_bound="R_", right_bound="_emg.csv", values=["0", "1", "2"], description='reps'),
]
clf_odh = OfflineDataHandler()
clf_odh.get_data('data/', regex_filters, delimiter=",")

regex_filters = [
    RegexFilter(left_bound='data/regression/C_0_R_', right_bound='_emg.csv', values=['0', '1', '2'], description='reps')
]
metadata_fetchers = [
    FilePackager(RegexFilter(left_bound='animation/', right_bound='.txt', values=['collection'], description='labels'), package_function=lambda x, y: True)
]
reg_odh = OfflineDataHandler()
reg_odh.get_data('./', regex_filters, metadata_fetchers=metadata_fetchers, delimiter=',')
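The classification filters above match filenames of the form `C_<class>_R_<rep>_emg.csv`. The same idea in plain Python, using hypothetical filenames that follow the collection naming scheme:

```python
import re

# Hypothetical filenames following the collection naming scheme
files = ["data/classification/C_0_R_1_emg.csv",
         "data/classification/C_3_R_2_emg.csv"]

pattern = re.compile(r"C_(\d)_R_(\d)_emg\.csv")
parsed = []
for f in files:
    match = pattern.search(f)
    # group 1 is the class index, group 2 is the rep index
    parsed.append((int(match.group(1)), int(match.group(2))))
```

Each file is tagged with its class and rep metadata, which is what later enables splitting by rep.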
Using the `isolate_data` function, we can split the data into training and testing sets. In this specific case, we split on the "reps" keyword: reps with index 0-1 are used for training and rep 2 for testing. After isolating the data, we extract windows and associated metadata for both the training and testing sets.
train_odh = odh.isolate_data(key="reps", values=[0, 1])
train_windows, train_metadata = train_odh.parse_windows(WINDOW_SIZE, WINDOW_INCREMENT, metadata_operations=metadata_operations)
test_odh = odh.isolate_data(key="reps", values=[2])
test_windows, test_metadata = test_odh.parse_windows(WINDOW_SIZE, WINDOW_INCREMENT, metadata_operations=metadata_operations)
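Under the hood, windowing slides a fixed-length window across the recording, advancing by the increment each step. A self-contained sketch of that logic (for illustration only, not libemg's implementation):

```python
import numpy as np

def extract_windows(data, window_size, window_increment):
    # data: (num_samples, num_channels)
    # returns: (num_windows, window_size, num_channels)
    num_windows = (len(data) - window_size) // window_increment + 1
    return np.stack([data[i * window_increment : i * window_increment + window_size]
                     for i in range(num_windows)])

emg = np.random.randn(1000, 8)            # 1000 samples, 8 channels (e.g., Myo)
windows = extract_windows(emg, 200, 100)  # -> shape (9, 200, 8)
```

With a 50% overlap like this, each sample (except at the edges) contributes to two windows, which is what lets a smaller increment produce more predictions from the same signal.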
Next, we create a dataset dictionary consisting of the training and testing features and labels. This dictionary is passed to the `fit` method of the `EMGClassifier` or `EMGRegressor`.
data_set = {}
data_set['testing_features'] = fe.extract_feature_group('HTD', test_windows)
data_set['training_features'] = fe.extract_feature_group('HTD', train_windows)
data_set['testing_labels'] = test_metadata[labels_key]
data_set['training_labels'] = train_metadata[labels_key]
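The 'HTD' group is Hudgins' time-domain feature set (mean absolute value, zero crossings, slope sign changes, and waveform length). For intuition, here is a sketch of two of these features computed per channel (use libemg's feature extractor in practice; these simplified versions are ours):

```python
import numpy as np

def mean_absolute_value(window):
    # window: (window_size, num_channels) -> MAV per channel
    return np.mean(np.abs(window), axis=0)

def waveform_length(window):
    # cumulative path length of the signal within the window, per channel
    return np.sum(np.abs(np.diff(window, axis=0)), axis=0)

w = np.array([[1.0, -2.0],
              [3.0, -4.0],
              [1.0, -2.0]])
mav = mean_absolute_value(w)  # [5/3, 8/3]
wl = waveform_length(w)       # [4.0, 4.0]
```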
Finally, to extract the offline performance of each model, we leverage the OfflineMetrics
module. We do this in a loop to easily evaluate a number of models. We append the metrics to a dictionary for future use.
om = OfflineMetrics()
# Normal Case - Test all different models
for model in models:
    if is_regression:
        model = EMGRegressor(model)
        model.fit(data_set.copy())
        preds = model.run(data_set['testing_features'])
    else:
        model = EMGClassifier(model)
        model.fit(data_set.copy())
        preds, _ = model.run(data_set['testing_features'])
    out_metrics = om.extract_offline_metrics(metrics, data_set['testing_labels'], preds, 2)
    offline_metrics['model'].append(model)
    offline_metrics['metrics'].append(out_metrics)
return offline_metrics
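For reference, the two headline offline metrics amount to the following (simplified versions of what `OfflineMetrics` computes; the library supports many more):

```python
def classification_accuracy(y_true, y_pred):
    # fraction of windows whose predicted class matches the label
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    # average absolute deviation between predicted and true DOF values
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

acc = classification_accuracy([0, 1, 2, 2], [0, 1, 1, 2])  # 0.75
mae = mean_absolute_error([0.0, 1.0], [0.5, 0.5])          # 0.5
```

Since classifiers and regressors report fundamentally different quantities (accuracy vs. error), these numbers cannot be compared directly, which is exactly why the online test is needed.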
Results
There are clear discrepancies between offline and online metrics. For example, RF outperforms LDA in the offline classification analysis, but it is clear in the online test that it is much worse. Similarly, RF outperforms LR in the offline regression analysis, but the usability metrics again suggest that LR outperforms RF during an online task. This highlights the need to evaluate EMG-based control systems in online settings with user-in-the-loop feedback.
These results also show that the regressors had worse usability metrics than the classifiers despite enabling simultaneous motions. The high number of overshoots indicates that the models likely struggled to stay at rest without drifting, increasing the time each trial took. This example could be expanded by adding a threshold to the regressors (see `EMGRegressor.add_deadband`), which may improve regressor performance.
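A deadband simply zeroes out predictions whose magnitude falls below a threshold, so small resting fluctuations do not drift the cursor. A sketch of the idea (the function and threshold value are illustrative; see `EMGRegressor.add_deadband` for the library's implementation):

```python
def apply_deadband(prediction, threshold=0.1):
    # suppress small outputs so the cursor stays at rest
    return 0.0 if abs(prediction) < threshold else prediction

resting = apply_deadband(0.04)  # 0.0 -> cursor does not move
active = apply_deadband(0.6)    # 0.6 -> passed through unchanged
```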
Visual Output: