Evaluation and Benchmarking Working Group

Mission Statement

The MONAI Evaluation and Benchmarking working group aims to provide guidelines, infrastructure, and practical tools for evaluating and benchmarking medical image analysis methods. It focuses on leading the community toward identifying and adopting best practices for evaluation and benchmarking, and on finding practical solutions that improve reproducibility.

Group Leads

Dr. Annika Reinke

Deputy Head and Group Lead Validation of Intelligent Systems

German Cancer Research Center (DKFZ)

Benchmark Working Group Chair

Dr. Carole Sudre

Associate Professor

University College London

Benchmark Working Group Chair

Meeting Notes

Ongoing Projects

Reporting Guidelines Taskforce (Lead - Olivier Colliot)

  • Surveying current reporting practices and identifying areas for improvement
  • Development of guidelines for results reporting, with a focus on statistical aspects
  • Identification of appropriate calculation methods for statistical procedures (e.g., confidence intervals) across different tasks and validation metrics
  • Implementation of recommended calculations for MONAI users
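
As an illustration of the kind of calculation the taskforce has in mind, the sketch below computes a percentile bootstrap confidence interval for the mean of per-case metric values. It is a minimal example under stated assumptions, not the taskforce's recommended procedure and not part of the MONAI API: the function name `bootstrap_ci`, its parameters, and the sample Dice scores are all hypothetical, and the percentile bootstrap is only one of several possible interval methods.

```python
import random
import statistics


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-case scores.

    Hypothetical helper for illustration only; not a MONAI function.
    Returns (mean, lower, upper) for a (1 - alpha) confidence level.
    """
    if not scores:
        raise ValueError("scores must contain at least one value")

    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(scores)

    # Resample the cases with replacement and record the mean each time.
    resampled_means = []
    for _ in range(n_resamples):
        sample = [scores[rng.randrange(n)] for _ in range(n)]
        resampled_means.append(statistics.fmean(sample))
    resampled_means.sort()

    # Take the alpha/2 and 1 - alpha/2 percentiles of the resampled means.
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = max(lower_idx, int((1 - alpha / 2) * n_resamples) - 1)
    return statistics.fmean(scores), resampled_means[lower_idx], resampled_means[upper_idx]


# Made-up per-case Dice scores, used purely as example input.
dice_scores = [0.91, 0.85, 0.78, 0.88, 0.93, 0.67, 0.82, 0.90, 0.86, 0.79]
mean, lower, upper = bootstrap_ci(dice_scores)
print(f"mean Dice = {mean:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

Reporting the interval alongside the mean, together with the number of cases, the resampling count, and the seed, lets readers judge how much of a difference between two methods could be due to sampling variability.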

Benchmarking Datasets Taskforce (Lead - Michela Antonelli)

  • Data quality review for MICCAI 2025 lighthouse challenges
  • Identification of key characteristics for benchmarking datasets
  • Encouraging the development of new datasets according to best practices
  • Identification of relevant historical datasets to be used for benchmarking
  • Implementation of guidelines for upcoming datasets

Collaboration Opportunities

Community Engagement

  • Join our regular surveys
  • Contribute to evaluation metrics testing
  • Share your expertise in validation and benchmarking
  • Participate in standards development