Evaluation and Benchmarking Working Group
Mission Statement
The MONAI Evaluation and Benchmarking working group aims to provide guidelines, infrastructure, and practical tools for the evaluation and benchmarking of medical image analysis methods. It focuses on leading the community toward identifying and adopting best practices for evaluation and benchmarking, and on finding practical solutions to improve reproducibility.
Highlights
Recommendations
Implementation of recommendations
Related resources
Group Leads

Dr. Annika Reinke
Deputy Head and Group Lead Validation of Intelligent Systems
German Cancer Research Center (DKFZ)
Benchmark Working Group Chair

Dr. Carole Sudre
Associate Professor
University College London
Benchmark Working Group Chair
Meeting Notes
GitHub Wiki
Ongoing Projects
Reporting Guidelines Taskforce (Lead - Olivier Colliot)
- Surveying current reporting practices and identifying areas for improvement
- Development of guidelines for reporting results, with a focus on statistical aspects
- Identification of appropriate calculation methods for various procedures (e.g., confidence intervals) across different tasks and validation metrics
- Implementation of recommended calculations for MONAI users
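
As an illustration of the kind of calculation this taskforce covers, the following is a minimal sketch of a percentile bootstrap confidence interval for the mean of per-case metric values. It is not the working group's recommended procedure and not part of the MONAI API: the function name `bootstrap_ci`, its parameters, and the example scores are all hypothetical, and the percentile bootstrap is only one of several possible interval methods.

```python
import random
from statistics import mean


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-case scores.

    Hypothetical helper for illustration only; returns
    (point_estimate, lower_bound, upper_bound).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(scores)
    # Resample cases with replacement and record the mean of each resample.
    resampled_means = sorted(
        mean(rng.choices(scores, k=n)) for _ in range(n_resamples)
    )
    # Read the interval bounds off the empirical quantiles.
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return mean(scores), resampled_means[lower_idx], resampled_means[upper_idx]


# Made-up per-case Dice scores, used only to exercise the function.
dice_scores = [0.91, 0.88, 0.79, 0.93, 0.85, 0.90, 0.72, 0.95]
point, lower, upper = bootstrap_ci(dice_scores)
print(f"mean Dice = {point:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Resampling is done at the case level here; data with repeated measurements per patient or per center would need a resampling scheme that respects that grouping.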
Benchmarking Datasets Taskforce (Lead - Michela Antonelli)
- Data quality review for MICCAI 2025 lighthouse challenges
- Identification of key characteristics for benchmarking datasets
- Encouragement to develop new datasets according to best practice
- Identification of relevant historical datasets to be used for benchmarking
- Implementation of guidelines for upcoming datasets
Collaboration Opportunities
Community Engagement
- Join our regular surveys
- Contribute to evaluation metrics testing
- Share your expertise in validation and benchmarking
- Participate in standards development