Evaluating the impact of different deface algorithms on deep learning segmentation software performance
Introduction
Data sharing is essential for advancing research in radiation oncology, particularly for training artificial intelligence (AI) models in medical imaging. However, privacy concerns necessitate de-identification of medical images, including defacing operations to remove facial features. This study evaluates the impact of defacing on AI-driven organ segmentation in head-and-neck (HN) computed tomography (CT) images.
Methods
Two defacing algorithms, DeIdentifier and mri_reface_0.3.3, were applied to 50 patient CT scans. Organs were segmented with two commercially available AI segmentation tools, INTContour and AccuContour®, and segmentation accuracy was evaluated using the Dice similarity coefficient (DSC), the 95th-percentile Hausdorff distance (HD95), and the surface Dice similarity coefficient (SDSC) with a 2 mm tolerance. Dose differences (D0.01cc) were calculated for each structure to evaluate potential clinical implications. Statistical comparisons were made using paired t-tests, with p < 0.05 considered significant.
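For readers unfamiliar with the overlap metric, the following is a minimal sketch of the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), computed on two flattened binary masks. It is illustrative only and is not the study's implementation: the function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 for two empty masks are all assumptions made for this example.

```python
from typing import Sequence


def dice_coefficient(reference: Sequence[int], test: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(reference) != len(test):
        raise ValueError("masks must contain the same number of voxels")
    ref_voxels = sum(1 for r in reference if r)
    test_voxels = sum(1 for t in test if t)
    # Voxels labelled as the structure in both masks
    overlap = sum(1 for r, t in zip(reference, test) if r and t)
    if ref_voxels + test_voxels == 0:
        # Assumed convention: two empty masks count as perfect agreement
        return 1.0
    return 2.0 * overlap / (ref_voxels + test_voxels)


# Hypothetical toy masks: one structure contoured on two versions of a scan
original_mask = [0, 1, 1, 1, 1, 0, 0, 0]
defaced_mask = [0, 0, 1, 1, 1, 1, 0, 0]

dsc = dice_coefficient(original_mask, defaced_mask)
print(f"DSC = {dsc:.2f}")  # prints "DSC = 0.75"
```

In this toy case the two contours share 3 of their 4 voxels each, giving 2 × 3 / (4 + 4) = 0.75. A real 3D mask would first be flattened, or the same counts would be taken with an array library.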
Results
Defacing significantly affected segmentation of on-face structures (e.g., oral cavity, eyes, lacrimal glands), which showed reduced DSC (<0.9) and higher HD95 (>2.5 mm), while off-face structures (e.g., brainstem, spinal cord) remained largely unaffected (DSC >0.9, HD95 <2 mm). DeIdentifier preserved Hounsfield units (HU) and anatomical consistency better than mri_reface, which introduced more variability, including HU shifts in air regions. Differences in segmentation accuracy between the two defacing algorithms were minor, with mri_reface showing slightly greater variability. AccuContour showed slightly greater segmentation variability than INTContour, particularly for small or complex structures. Dose differences were minimal (<20 cGy) for most structures; the largest were observed in the Brainstem (34 cGy), Lips_NRG (28 cGy), and Brain (25 cGy).
Conclusion
These findings suggest that while defacing alters segmentation accuracy in on-face regions, its overall impact on off-face structures and radiation therapy planning is minimal. Future work should explore domain adaptation techniques to improve model robustness across defaced and non-defaced datasets, ensuring privacy while maintaining segmentation integrity.