TRIPOD+AI Checklist
TITLE
- Identify the study as developing or evaluating the performance of a multivariable prediction model, the target population, and the outcome to be predicted.
Machine Learning for Predicting Waitlist Mortality in Pediatric Heart Transplantation
ABSTRACT
- N/A; full paper submitted rather than an abstract-only submission.
INTRODUCTION
Background
3a. Explain the healthcare context (including whether diagnostic or prognostic) and rationale for developing or evaluating the prediction model, including references to existing models.
- Section: Introduction (Page 2-3)
3b. Describe the target population and the intended purpose of the prediction model in the context of the care pathway, including its intended users (e.g., healthcare professionals, patients, public).
- Section: Introduction (Page 2-3)
3c. Describe any known health inequalities between sociodemographic groups.
- Section: Introduction (Page 3) & Discussion (Page 13-14)
Objectives
4. Specify the study objectives, including whether the study describes the development or validation of a prediction model (or both).
- Section: Introduction (Page 2-3)
- Section: Methods (Page 5-6)
METHODS
Data
5a. Describe the sources of data separately for the development and evaluation datasets (e.g., randomized trial, cohort, routine care or registry data), the rationale for using these data, and representativeness of the data.
- Section: Methods – Data Source (Page 5-6)
5b. Specify the dates of the collected participant data, including start and end of participant accrual; and, if applicable, end of follow-up.
- Section: Methods – Study Population (Page 6)
Participants
6a. Specify key elements of the study setting (e.g., primary care, secondary care, general population) including the number and location of centers.
- Section: Methods – Study Population (Page 6)
- Additional Context: Discussion (Page 13-14)
6b. Describe the eligibility criteria for study participants.
- Section: Methods – Study Population (Page 6)
6c. Give details of any treatments received, and how they were handled during model development or evaluation, if relevant.
- Section: Methods – Data Collection (Page 7)
- Section: Results (Page 9)
Data Preparation
7. Describe any data pre-processing and quality checking, including whether this was similar across relevant sociodemographic groups.
- see model-data
Outcome
8a. Clearly define the outcome that is being predicted and the time horizon, including how and when assessed, the rationale for choosing this outcome, and whether the method of outcome assessment is consistent across sociodemographic groups.
- Section: Methods – Study Population (Page 6)
- Section: Methods – Statistical Analysis/Machine Learning Modeling (Page 7-8)
- Section: Results (Page 9)
8b. If outcome assessment requires subjective interpretation, describe the qualifications and demographic characteristics of the outcome assessors.
- Section: Discussion (Page 13-14)
8c. Report any actions to blind assessment of the outcome to be predicted.
- N/A. Blinding was not needed because the outcomes (waitlist mortality/removal) are recorded objectively in the OPTN database rather than assessed subjectively.
Predictors
9a. Describe the choice of initial predictors (e.g., literature, previous models, all available predictors) and any pre-selection of predictors before model building.
- Section: Methods – Data Collection (Page 7)
9b. Clearly define all predictors, including how and when they were measured (and any actions to blind assessment of predictors for the outcome and other predictors).
- Section: Methods – Data Collection (Page 7-8)
- Section: Results – Top Predictors (Page 9)
9c. If predictor measurement requires subjective interpretation, describe the qualifications and demographic characteristics of the predictor assessors.
- N/A.
Sample Size
10. Explain how the study size was arrived at (separately for development and evaluation), and justify that the study size was sufficient to answer the research question. Include details of any sample size calculation.
- A formal sample size calculation was not performed because the dataset includes all pediatric heart transplant waitlist candidates in the U.S. recorded in the OPTN database.
Missing Data
11. Describe how missing data were handled. Provide reasons for omitting any data.
Analytical Methods
12a. Describe how the data were used (e.g., for development and evaluation of model performance) in the analysis, including whether the data were partitioned, considering any sample size requirements.
12b. Depending on the type of model, describe how predictors were handled in the analyses (functional form, rescaling, transformation, or any standardization).
12c. Specify the type of model, rationale, all model-building steps, including any hyperparameter tuning, and method for internal validation.
12d. Describe if and how any heterogeneity in estimates of model parameter values and model performance was handled and quantified across clusters (e.g., hospitals, countries).
12e. Specify all measures and plots used (and their rationale) to evaluate model performance (e.g., discrimination, calibration, clinical utility) and, if relevant, to compare multiple models.
12f. Describe any model updating (e.g., recalibration) arising from the model evaluation, either overall or for particular sociodemographic groups or settings.
12g. For model evaluation, describe how the model predictions were calculated (e.g., formula, code, object, application programming interface).
Class Imbalance
13. If class imbalance methods were used, state why and how this was done, and any subsequent methods to recalibrate the model or the model predictions.
- CatBoost provides a scale_pos_weight parameter for handling class imbalance, but we instead use a probabilistic approach with calibration plots and decision-threshold tuning to achieve the desired precision-recall trade-off in the final model. Full code can be found here:
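Below is a minimal sketch of this probabilistic approach, assuming a CatBoost classifier and a held-out test set; the variable names and the F1-based threshold choice are illustrative assumptions, not the authors' reported settings.

```python
# Minimal sketch (not the authors' released code): calibration check and
# decision-threshold tuning for a CatBoost classifier without scale_pos_weight.
# X_train, y_train, X_test, y_test are illustrative names for the train/test partitions.
import numpy as np
from catboost import CatBoostClassifier
from sklearn.calibration import calibration_curve
from sklearn.metrics import precision_recall_curve

model = CatBoostClassifier(verbose=0)  # no class-imbalance reweighting
model.fit(X_train, y_train)

# Predicted probability of the positive class (waitlist mortality/removal)
proba = model.predict_proba(X_test)[:, 1]

# Data for a calibration plot: observed event fraction vs. mean predicted probability
frac_pos, mean_pred = calibration_curve(y_test, proba, n_bins=10)

# Tune the decision threshold along the precision-recall curve (here: maximize F1;
# the actual trade-off target is a modeling choice not detailed in this checklist)
precision, recall, thresholds = precision_recall_curve(y_test, proba)
f1 = 2 * precision[:-1] * recall[:-1] / np.clip(precision[:-1] + recall[:-1], 1e-12, None)
best_threshold = thresholds[np.argmax(f1)]
y_pred = (proba >= best_threshold).astype(int)
```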
Fairness
14. Describe any approaches that were used to address model fairness and their rationale.
- Section: Discussion – Racial Disparities (Page 13-14)
- Section: Predictors – Model Inputs (Page 7-8, Page 9)
Model Output
15. Specify the output of the prediction model (e.g., probabilities, classification). Provide details and rationale for any classification and how the thresholds were identified.
Training vs. Evaluation
16. Differences in Training vs. Evaluation Data
- Page 6, Methods – Study Population
- The dataset was split into training (82%) and testing (18%); a split sketch follows this list.
- SHAP Partial Dependence Plots
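A minimal sketch of how the reported 82%/18% split could be implemented; stratification on the outcome and the random seed are assumptions rather than reported details, and df, FEATURES, and OUTCOME are illustrative names.

```python
# Minimal sketch (assumed implementation) of the reported 82%/18% train/test split.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    df[FEATURES],
    df[OUTCOME],
    test_size=0.18,          # 18% held out for evaluation
    stratify=df[OUTCOME],    # assumed: keep event rates similar across partitions
    random_state=42,         # illustrative seed
)
```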
Ethical Approval
17. Institutional Review Board Approval
- Page 5-6, Methods – Data Source
- The study was approved by the institutional review board of the University of Virginia.
- Neither participant informed consent nor an ethics committee waiver was required because this was a retrospective registry study.
OPEN SCIENCE
18a. Funding
- Page 1, Acknowledgments
- Research was funded by The Jefferson Trust and AHRQ grant 1R21HS029548-01A1.
18b. Conflicts of Interest
- Page 1, Acknowledgments
- No explicit financial disclosures were provided. The content represents the authors' own views and not those of the NIH.
18c. Protocol Availability
18d. Study Registration
- N/A; retrospective analysis of the OPTN database.
18e. Data Sharing
18f. Code Sharing
PATIENT AND PUBLIC INVOLVEMENT
19. Patient & Public Involvement
- N/A; retrospective analysis of the OPTN database without direct patient or public involvement.
RESULTS
Participants
20a. Participant Flow
20b. Participant Characteristics
- Page 6, Methods – Study Population
- Sample size (5,523 participants)
- waitlist-data2.html
20c. Comparison of Predictor Distributions
- SHAP Partial Dependence Plots
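A minimal sketch of producing a SHAP dependence plot for a single predictor, assuming a fitted CatBoost model; "model", "X_test", and the feature name are illustrative placeholders rather than the authors' code.

```python
# Minimal sketch (assumed, not the authors' code): SHAP dependence plot for one predictor.
import shap

explainer = shap.TreeExplainer(model)        # tree explainer supports CatBoost models
shap_values = explainer.shap_values(X_test)

# Dependence plot: SHAP value vs. predictor value for a single feature
shap.dependence_plot("age_at_listing", shap_values, X_test)
```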
21. Model development
Specify the number of participants and outcome events in each analysis (e.g., for model development, hyperparameter tuning, model evaluation).
22. Model specification
Provide details of the full prediction model (e.g., formula, code, object, application programming interface) to allow predictions in new individuals and to enable third-party evaluation and implementation, including any restrictions to access or re-use (e.g., freely available, proprietary).
23a. Model performance
Report model performance estimates with confidence intervals, including for any key subgroups (e.g., sociodemographic). Consider plots to aid presentation.
23b. If examined, report results of any heterogeneity in model performance across clusters. See TRIPOD Cluster for additional details.
24. Model updating
Report the results from any model updating, including the updated model and subsequent performance.
- N/A. Initial model.
DISCUSSION
25. Interpretation of Main Results
- Page 13-14, Discussion
26. Study Limitations
- Page 14-15, Discussion – Limitations
27a. Handling of Poor Data Quality
27b. User Interaction & Expertise Required
27c. Future Research and Generalizability
- Page 14-15, Discussion – Future Research
TRIPOD+AI Checklist Source: Collins GS, Moons KGM, Dhiman P, et al. BMJ 2024;385:e078378. doi:10.1136/bmj-2023-078378.