Holistic OR Domain Modeling with Large Vision Language Models - MedAI #120

Overview

Explore approaches to automated, holistic modeling of operating room (OR) environments with Large Vision Language Models (LVLMs) in this Stanford University lecture. Learn how 4D-OR, presented as the first open-source dataset of its kind, was created by capturing simulated surgeries with RGB-D sensors, and how it is used. Follow the neural network-based pipeline for semantic scene graph generation and its successful application to two downstream tasks: clinical role prediction and surgical phase recognition. See how this kind of OR modeling could support decision-making and patient safety during surgical procedures, and how knowledge guidance with LVLMs advances OR modeling and surgical data analysis.