Evaluation and Benchmarking Working Group

Mission Statement

The MONAI Evaluation and Benchmarking working group aims to provide guidelines, infrastructure, and practical tools for evaluating and benchmarking medical image analysis methods. It focuses on leading the community toward identifying and adopting best practices for evaluation and benchmarking, and on finding practical solutions that improve reproducibility.

Highlights

Recommendations

Implementation of recommendations

• MONAI Evaluation Metrics
• Metrics Documentation

Related resources

Group Leads

Dr. Annika Reinke

Deputy Head and Group Lead Validation of Intelligent Systems

German Cancer Research Center (DKFZ)

Benchmark Working Group Chair

Dr. Carole Sudre

Associate Professor

University College London

Benchmark Working Group Chair

Meeting Notes

GitHub Wiki

• Access all meeting notes

Ongoing Projects

Reporting Guidelines Taskforce (Lead - Olivier Colliot)

• Surveying current reporting practices and identifying areas for improvement
• Development of guidelines for reporting results, with a focus on statistical aspects
• Identification of appropriate calculation methods for statistical procedures (e.g., confidence intervals) across different tasks and validation metrics
• Implementation of recommended calculations for MONAI users
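
As an illustration of the kind of calculation the taskforce has in scope, the sketch below computes a percentile bootstrap confidence interval for the mean of per-case Dice scores. It is a minimal example under stated assumptions, not the taskforce's recommended procedure and not part of the MONAI API: the function names `dice_score` and `bootstrap_ci`, the choice of the percentile bootstrap, the default of 2000 resamples, and the convention for empty masks are all illustrative choices made here.

```python
import random
from statistics import mean


def dice_score(pred, truth):
    """Dice coefficient between two equal-length binary masks given as flat 0/1 lists."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(pred) + sum(truth)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-case scores.

    Returns (point_estimate, lower, upper) at confidence level 1 - alpha.
    """
    if not scores:
        raise ValueError("need at least one score")
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(scores)
    # Resample cases with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(scores, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return mean(scores), lower, upper


if __name__ == "__main__":
    # Hypothetical per-case Dice scores, for demonstration only.
    per_case = [0.91, 0.84, 0.88, 0.79, 0.93, 0.86, 0.90, 0.82]
    point, lo, hi = bootstrap_ci(per_case)
    print(f"mean Dice = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Resampling is done at the case level because per-case scores are the independent units in most segmentation benchmarks. Other interval constructions may be more appropriate for small test sets or for other metrics.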

Benchmarking Datasets Taskforce (Lead - Michela Antonelli)

• Data quality review for MICCAI 2025 lighthouse challenges
• Identification of key characteristics for benchmarking datasets
• Encouraging the development of new datasets according to best practices
• Identification of relevant historical datasets to be used for benchmarking
• Implementation of guidelines for upcoming datasets

Collaboration Opportunities

Community Engagement

• Join our regular surveys
• Contribute to evaluation metrics testing
• Share your expertise in validation and benchmarking
• Participate in standards development
