
Table 1 Glossary

From: Development and validation of a meta-learner for combining statistical and machine learning prediction models in individuals with depression

Term

Definition

Area under the receiver operating characteristic curve (AUC)

A discrimination metric for classification problems, measuring the area under the entire receiver operating characteristic curve. AUC ranges from 0 to 1, with higher values indicating better performance; a value of 0.5 corresponds to random guessing.

Base-learner

A single, stand-alone statistical or machine learning model built for predicting a continuous or a binary outcome.

Bootstrapping

Randomly sampling the data with replacement.

Calibration-in-the-large

A method for measuring the agreement between observed outcomes and predictions for classification problems, where the average predicted probability is compared with the observed event rate. A mismatch indicates that the model over- or underestimates the risk on average.

Deep neural network

A type of machine learning model whose structure resembles how neurons in the human brain work.

Mean absolute error (MAE)

MAE measures the average magnitude of the errors, i.e., the average absolute difference between the true/observed values and their predictions. Lower MAE indicates better performance.

Meta-learner

A statistical or machine learning model that uses as input the output of other models (i.e., base-learners), to predict an outcome of interest.

Multi-layer perceptron (MLP)

The simplest deep neural network model with multiple stacked hidden layers.

Overfitting

The case when a model fits the data used to develop it (training data) too closely and, as a result, performs badly on new test data.

Permutation feature importance

A method to evaluate the importance of predictors used in machine learning models, by measuring the decrease in model performance when the predictor’s values are randomly shuffled.

Ridge regression

A statistical regression model which uses a penalized likelihood. The penalty has the effect of shrinking the estimated coefficients so that the model does not yield extreme predictions.
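
Several of the terms above (AUC, MAE, calibration-in-the-large, and bootstrapping) can be illustrated with a short computation. The following is a minimal Python sketch using only the standard library; it is not the implementation used in the article. The function names (`auc`, `mae`, `calibration_in_the_large`, `bootstrap_sample`) and the toy data are hypothetical, and calibration-in-the-large is shown here in its simplest form, as the observed event rate minus the mean predicted probability.

```python
import random


def auc(y_true, y_score):
    """AUC computed as the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mae(y_true, y_pred):
    """Mean absolute error: average absolute difference between
    observed values and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def calibration_in_the_large(y_true, y_prob):
    """Observed event rate minus mean predicted probability.
    Positive values mean the risk is underestimated on average."""
    return sum(y_true) / len(y_true) - sum(y_prob) / len(y_prob)


def bootstrap_sample(data, rng):
    """Draw len(data) items from data at random, with replacement."""
    return [rng.choice(data) for _ in data]


# Toy binary outcome with predicted probabilities (illustrative values only).
y_binary = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

# Toy continuous outcome with predictions (illustrative values only).
y_cont = [3.0, 5.0, 2.0]
y_pred = [2.0, 5.0, 4.0]

auc_value = auc(y_binary, y_prob)                         # 3 of 4 pairs ranked correctly
mae_value = mae(y_cont, y_pred)                           # (1 + 0 + 2) / 3
citl_value = calibration_in_the_large(y_binary, y_prob)   # 0.5 - 0.4125

rng = random.Random(0)  # fixed seed so the resample is reproducible
resample = bootstrap_sample(y_prob, rng)
```

In this toy example three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75, and the mean predicted probability (0.4125) is below the observed event rate (0.5), so the predictions underestimate the risk on average. In practice, a metric such as AUC or MAE would typically be recomputed on many bootstrap samples to gauge its variability.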