Generative AI model validated in chest x-ray study

A group in South Korea has validated a generative AI model that could reduce reading times and increase chest x-ray reporting accuracy, according to a study published March 11 in Radiology.

In a reader study in which five radiologists interpreted 758 chest x-rays, use of the model (AIRead, Soombit.ai) reduced average reading time by about 14 seconds per image and increased readers' sensitivity for certain findings.

“The results of our study demonstrated that preliminary reports created by a multimodal generative AI model could aid radiologists in chest radiograph interpretation in terms of reading time, report quality, and accuracy,” noted lead author Eun Kyoung Hong, MD, of Mass General Brigham in Boston.

Previous studies have explored the potential of multimodal generative AI models for drafting radiologic interpretations, yet reader studies validating whether these models actually improve reporting accuracy and efficiency remain scarce in the current literature, according to the authors. To that end, Hong and colleagues, including researchers from the Seoul-based companies Soombit.ai and Kakaocorp, evaluated how well the generative AI model fits into current workflows and how its outputs are interpreted by different radiologists.

Five radiologists interpreted the chest radiographs in two sessions: first without AI-generated reports, and then with the AI-generated reports provided as preliminary reports. Two experienced thoracic radiologists then compared reading times, report agreement, and quality scores (on a 5-point scale) between the two sessions.

Additionally, the group used a subset of 258 chest radiographs to assess the factual correctness of the reports, comparing sensitivities and specificities between the first- and second-session reports using the McNemar test.
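For readers curious about the statistics, the McNemar test compares paired binary outcomes, here, whether the same finding was reported on the same case in each session, using only the discordant pairs. Below is a minimal sketch of such a comparison, assuming the statsmodels Python library and entirely hypothetical data; it is not the study's actual code.

```python
# Minimal sketch of a McNemar comparison of paired reader detections.
# Hypothetical data; not the study's actual code or results.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

# Per-case detection outcomes (1 = finding reported, 0 = missed) for the
# same cases read without and then with the AI preliminary report.
without_ai = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
with_ai = np.array([1, 1, 1, 1, 0, 1, 1, 0, 1, 1])

# Build the 2x2 paired table: rows = outcome without AI, cols = with AI.
table = np.zeros((2, 2), dtype=int)
for a, b in zip(without_ai, with_ai):
    table[a, b] += 1

# The exact McNemar test uses only the discordant (off-diagonal) cells.
result = mcnemar(table, exact=True)
print(f"McNemar statistic = {result.statistic}, p = {result.pvalue:.3f}")
```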

An example that features an AI-generated report that identified “scattered pulmonary granulomatous calcifications.” In the first session, three of five readers specifically reported the presence of a nodule. Following the introduction of the AI-generated report, only one reader maintained the specific mention of a nodule, whereas the other four readers adapted their reports to include the AI-suggested terminology of “granulomatous calcifications.” AP = anterior-posterior. Image and caption courtesy of RSNA.

According to the analysis, the introduction of AI-generated reports reduced reading times for all readers, with the average falling from 34.2 seconds ± 20.4 without AI to 19.8 seconds ± 12.5 with AI.

Also, report agreement scores among readers held at a median of 5, with the interquartile range (IQR) narrowing from 4-5 without AI reports to 4.5-5 with AI reports (p < 0.001); report quality scores likewise held at a median of 4.5, with the IQR narrowing from 4-5 to 4.5-5 (p < 0.001), the authors reported.

Finally, in the subset analysis of factual correctness, readers' sensitivity for detecting several abnormalities increased significantly, including widened mediastinal silhouettes (84.3% to 90.8%; p < 0.001) and pleural lesions (77.7% to 87.4%; p < 0.001).

“The use of a domain-specific multimodal generative AI model increased the efficiency and quality of radiology report generation,” the group wrote.

In an accompanying editorial, Paul Babyn, MD, and Scott Adams, MD, PhD, of the University of Saskatchewan in Saskatoon, wrote that the ability to draft reports with generative AI models represents a positive step for radiology reporting.

However, they noted that the tool considers only the single chest x-ray it is presented with: it does not compare findings with prior radiographs, which would allow assessment of clinically relevant interval changes, nor does it incorporate relevant clinical context. Nonetheless, the era of generative AI in radiology holds great promise, they wrote.

“AI-generated reporting will undoubtedly continue to be an active area of research over the coming years due to its transformational potential in increasing radiologists’ efficiency and (hopefully) accuracy,” Babyn and Adams concluded.

The full study is available here.
