ML Workflow Lab

Categories: Machine Learning, shiny, tidymodels, cross-validation, hyperparameter-tuning

Build, tune, and evaluate a supervised learning pipeline with tidymodels, from preprocessing through to held-out evaluation.

Published: April 17, 2026

Purpose

The hardest part of learning machine learning is the pipeline: data split, preprocessing recipe, model specification, resampling plan, tuning grid, and evaluation on a held-out set. The ML Workflow Lab makes every stage of a tidymodels pipeline visible and editable, so that readers can see how each choice feeds forward to the final performance estimate.
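The stages listed above map onto a standard tidymodels skeleton. The following is a minimal sketch (not the app's actual source), using the built-in `mtcars` data as a stand-in for the app's datasets and a random forest with one tunable hyperparameter:

```r
library(tidymodels)

set.seed(123)

# 1. Data split (training proportion, as in the app's split input)
split <- initial_split(mtcars, prop = 0.8)
train <- training(split)

# 2. Preprocessing recipe
rec <- recipe(mpg ~ ., data = train) |>
  step_normalize(all_numeric_predictors())

# 3. Model specification with a tunable hyperparameter
spec <- rand_forest(min_n = tune(), trees = 200) |>
  set_engine("ranger") |>
  set_mode("regression")

# 4. Workflow bundles recipe + model
wf <- workflow() |>
  add_recipe(rec) |>
  add_model(spec)

# 5. Resampling plan and tuning grid
folds <- vfold_cv(train, v = 5)
res   <- tune_grid(wf, resamples = folds, grid = 5)

# 6. Finalise the workflow and evaluate once on the held-out set
final <- finalize_workflow(wf, select_best(res, metric = "rmse")) |>
  last_fit(split)
collect_metrics(final)
```

The key structural point, which the app makes visible, is step 4: the recipe travels inside the workflow, so every resample in step 5 re-fits the preprocessing as well as the model.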

User inputs

  • Dataset (built-in classification/regression examples or user-uploaded)
  • Outcome variable and feature selection
  • Data split: proportion for training, stratification toggle
  • Preprocessing steps: imputation, normalisation, one-hot encoding, PCA, upsampling
  • Model family: logistic regression, random forest, boosted trees, SVM
  • Hyperparameter grid and resampling plan (k, repeats)
  • Primary and secondary performance metrics (accuracy, AUC, RMSE, \(R^2\), F1)
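Together, the preprocessing, grid, and resampling inputs correspond to a recipe, a `grid_regular()` call, and a `vfold_cv()` call. A hypothetical example combining them (using `iris` as a stand-in dataset; the upsampling step assumes the themis package):

```r
library(tidymodels)
library(themis)  # provides step_upsample(); an assumed dependency

train_data <- iris  # stand-in for the app's built-in or uploaded data

# Preprocessing steps matching the app's checkboxes
rec <- recipe(Species ~ ., data = train_data) |>
  step_impute_median(all_numeric_predictors()) |>       # imputation
  step_normalize(all_numeric_predictors()) |>           # normalisation
  step_dummy(all_nominal_predictors(), one_hot = TRUE) |>  # one-hot encoding
  step_pca(all_numeric_predictors(), num_comp = 2) |>   # PCA
  step_upsample(Species)                                # upsampling

# Resampling plan (k folds, repeats) and a regular hyperparameter grid
folds <- vfold_cv(train_data, v = 10, repeats = 3)
grid  <- grid_regular(trees(), min_n(), levels = 3)
```

The order of recipe steps matters: imputation must precede normalisation, and PCA only sees predictors that are already numeric and on a common scale.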

Outputs

  • The resulting workflow() object as pseudo-code in a syntax-highlighted panel
  • Tuning-result plot: performance as a function of each hyperparameter
  • Best model summary and the finalised workflow
  • Held-out performance: confusion matrix, ROC, calibration curve, SHAP values (tree models)
  • Variable-importance plot
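The held-out panels are all derived from the predictions of a `last_fit()` result. A self-contained sketch on a two-class toy dataset (`two_class_dat` from the modeldata package, with plain logistic regression; the app's models and data will differ):

```r
library(tidymodels)

set.seed(1)
data(two_class_dat, package = "modeldata")

# Stratified split, then fit once on training and evaluate on the test set
split <- initial_split(two_class_dat, strata = Class)

final <- workflow() |>
  add_formula(Class ~ .) |>
  add_model(logistic_reg()) |>
  last_fit(split)

preds <- collect_predictions(final)
conf_mat(preds, truth = Class, estimate = .pred_class)       # confusion matrix
roc_curve(preds, truth = Class, .pred_Class1) |> autoplot()  # ROC curve
```

SHAP values and variable importance require extracting the underlying fitted model (e.g. via `extract_fit_parsnip()`) and are only offered for tree-based models in the app.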

Didactic value

The app drives home a single lesson that a surprising number of ML practitioners fail to internalise: preprocessing must be inside the resampling loop, not before it, or performance estimates are optimistically biased. Seeing what happens when a “data leak” toggle is flipped on communicates this more viscerally than a warning in a manual.
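The two code paths behind such a toggle can be sketched as follows (a conceptual illustration on `mtcars`, not the app's actual source). In the leaky path, the preprocessing is fitted on all rows before resampling, so the test folds have already influenced the normalisation statistics; in the honest path, the recipe is re-fitted inside each resample:

```r
library(tidymodels)

set.seed(42)

# LEAKY: recipe prepped on the full dataset, then resampled afterwards.
leaky_data <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors()) |>
  prep() |>
  bake(new_data = NULL)
leaky_res <- workflow() |>
  add_formula(mpg ~ .) |>
  add_model(linear_reg()) |>
  fit_resamples(vfold_cv(leaky_data, v = 5))

# HONEST: recipe lives inside the workflow, so fit_resamples()
# re-estimates the normalisation on each analysis set separately.
honest_res <- workflow() |>
  add_recipe(recipe(mpg ~ ., data = mtcars) |>
               step_normalize(all_numeric_predictors())) |>
  add_model(linear_reg()) |>
  fit_resamples(vfold_cv(mtcars, v = 5))
```

With a mild step like normalisation on this small dataset the bias is modest; it becomes dramatic for data-dependent steps such as supervised feature selection or upsampling, which is exactly what the app lets readers observe.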

Embedded in

  • machine-learning/tidymodels-introduction.md
  • machine-learning/cross-validation.md
  • machine-learning/hyperparameter-tuning.md

Source code

Local: apps/13-ml-workflow-lab/

Run with:

shiny::runApp("apps/13-ml-workflow-lab")