Paper in IPCAI 2017 on “Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment”

June 21st, 2017 Irfan Essa Posted in Activity Recognition, Aneeq Zia, Computer Vision, Eric Sarin, Medical, MICCAI, Vinay Bettadapura, Yachna Sharma

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, and I. Essa (2017), “Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment,” in Proceedings of Information Processing in Computer-Assisted Interventions (IPCAI), 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Zia-VAMAASSA,
      author  = {A. Zia and Y. Sharma and V. Bettadapura and E. Sarin
          and I. Essa},
      booktitle  = {Proceedings of Information Processing in
          Computer-Assisted Interventions (IPCAI)},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Zia-VAMAASSA.pdf},
      title    = {Video and Accelerometer-Based Motion Analysis for
          Automated Surgical Skills Assessment},
      year    = {2017}
    }

Abstract

Purpose: Basic surgical skills of suturing and knot tying are an essential part of medical training. Having an automated system for surgical skills assessment could help save experts' time and improve training efficiency. There have been some recent attempts at automated surgical skills assessment using either video analysis or acceleration data. In this paper, we present a novel approach for automated assessment of OSATS-based surgical skills and provide an analysis of different features on multi-modal data (video and accelerometer data).
Methods: We conduct the largest study, to the best of our knowledge, of basic surgical skills assessment, on a dataset containing video and accelerometer data for suturing and knot-tying tasks. We introduce “entropy-based” features, Approximate Entropy (ApEn) and Cross-Approximate Entropy (XApEn), which quantify the predictability and regularity of fluctuations in time-series data. The proposed features are compared to the existing methods of Sequential Motion Texture (SMT), Discrete Cosine Transform (DCT), and Discrete Fourier Transform (DFT) for surgical skills assessment.
Results: We report the average performance of different features across all applicable OSATS criteria for the suturing and knot-tying tasks. Our analysis shows that the proposed entropy-based features outperform previous state-of-the-art methods using video data. For accelerometer data, our method performs better for suturing only. We also show that fusing video and acceleration features can improve overall performance, with the proposed entropy features achieving the highest accuracy.
Conclusions: Automated surgical skills assessment can be achieved with high accuracy using the proposed entropy features. Such a system can significantly improve the efficiency of surgical training in medical schools and teaching hospitals.
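The Approximate Entropy feature named in the abstract has a compact standard definition: compare how often length-m patterns in the signal repeat versus length-(m+1) patterns. The sketch below is a plain-Python illustration of that quantity, not the paper's implementation; the embedding dimension `m`, tolerance `r`, and toy signals are illustrative choices.

```python
import math
import random

def apen(series, m=2, r=0.2):
    """Approximate Entropy (ApEn): lower for regular, predictable
    signals; higher for irregular ones."""
    n = len(series)

    def phi(m):
        # All length-m windows of the series.
        windows = [series[i:i + m] for i in range(n - m + 1)]
        count = len(windows)
        total = 0.0
        for a in windows:
            # Fraction of windows within Chebyshev distance r of `a`
            # (each window matches itself, so the count is never zero).
            similar = sum(
                1 for b in windows
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(similar / count)
        return total / count

    return phi(m) - phi(m + 1)

rng = random.Random(0)
regular = [0.0, 1.0] * 50                       # perfectly periodic
irregular = [rng.random() for _ in range(100)]  # noisy
print(apen(regular), apen(irregular))
```

A perfectly periodic signal scores near zero, while a noisy one scores well above it, which is the property that makes the feature a candidate skill discriminator: smoother, more deliberate tool motion is more predictable.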

  • Presented at the 8th International Conference on Information Processing in Computer-Assisted Interventions, in Barcelona, Spain, June 20-21, 2017.
  • Aneeq Zia was awarded the “Young Investigator Travel Award,” given to young investigators (including Ph.D. and M.Sc. students and junior researchers) with accepted papers at the IPCAI conference to attend IPCAI/CARS 2017.
  • This paper was also one of the 12 papers voted by the audience for a 25-minute oral presentation and discussion session on the last day of the conference (based on five-minute short presentations given by all authors on the first day).

Paper (ACM MM 2016) “Leveraging Contextual Cues for Generating Basketball Highlights”

October 18th, 2016 Irfan Essa Posted in ACM MM, Caroline Pantofaru, Computational Photography and Video, Computer Vision, Papers, Sports Visualization, Vinay Bettadapura

Paper

  • V. Bettadapura, C. Pantofaru, and I. Essa (2016), “Leveraging Contextual Cues for Generating Basketball Highlights,” in Proceedings of ACM International Conference on Multimedia (ACM-MM), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Bettadapura-LCCGBH,
      arxiv    = {http://arxiv.org/abs/1606.08955},
      author  = {Vinay Bettadapura and Caroline Pantofaru and Irfan
          Essa},
      booktitle  = {Proceedings of ACM International Conference on
          Multimedia (ACM-MM)},
      month    = {October},
      organization  = {ACM},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Bettadapura-LCCGBH.pdf},
      title    = {Leveraging Contextual Cues for Generating
          Basketball Highlights},
      url    = {http://www.vbettadapura.com/highlights/basketball/index.htm},
      year    = {2016}
    }

Abstract


The massive growth of sports videos has resulted in a need for automatic generation of sports highlights that are comparable in quality to the hand-edited highlights produced by broadcasters such as ESPN. Unlike previous works that mostly use audio-visual cues derived from the video, we propose an approach that additionally leverages contextual cues derived from the environment in which the game is played. The contextual cues provide information about the excitement levels in the game, which can be ranked and selected to automatically produce high-quality basketball highlights. We introduce a new dataset of 25 NCAA games along with their play-by-play stats and the ground-truth excitement data for each basket. We explore the informativeness of five different cues derived from the video and from the environment through user studies. Our experiments show that for our study participants, the highlights produced by our system are comparable to the ones produced by ESPN for the same games.


Paper in IJCARS (2016) on “Automated video-based assessment of surgical skills for training and evaluation in medical schools”

September 2nd, 2016 Irfan Essa Posted in Activity Recognition, Aneeq Zia, Computer Vision, Eric Sarin, Mark Clements, Medical, MICCAI, Thomas Ploetz, Vinay Bettadapura, Yachna Sharma

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. L. Sarin, T. Ploetz, M. A. Clements, and I. Essa (2016), “Automated video-based assessment of surgical skills for training and evaluation in medical schools,” International Journal of Computer Assisted Radiology and Surgery, vol. 11, iss. 9, pp. 1623-1636, 2016. [WEBSITE] [DOI] [BIBTEX]
    @Article{    2016-Zia-AVASSTEMS,
      author  = {Zia, Aneeq and Sharma, Yachna and Bettadapura,
          Vinay and Sarin, Eric L and Ploetz, Thomas and
          Clements, Mark A and Essa, Irfan},
      doi    = {10.1007/s11548-016-1468-2},
      journal  = {International Journal of Computer Assisted
          Radiology and Surgery},
      month    = {September},
      number  = {9},
      pages    = {1623--1636},
      publisher  = {Springer Berlin Heidelberg},
      title    = {Automated video-based assessment of surgical skills
          for training and evaluation in medical schools},
      url    = {http://link.springer.com/article/10.1007/s11548-016-1468-2},
      volume  = {11},
      year    = {2016}
    }

Abstract

Sample frames from our video dataset

Purpose: Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. A supervisor has to observe each surgical trainee in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches, however, are still incredibly time-consuming and involve human bias. In this paper, we present an automated system for surgical skills assessment that analyzes video data of surgical activities.

Methods: We compare different techniques for video-based surgical skill evaluation. We use techniques that capture motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis.

Results: We successfully classified surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective at capturing the skill-relevant information in surgical videos.

Conclusion: Our evaluations show that frequency features perform better than motion texture features, which in turn perform better than symbol/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity as demonstrated by our results on two challenging video datasets.


Paper (WACV 2016) “Discovering Picturesque Highlights from Egocentric Vacation Videos”

March 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, PAMI/ICCV/CVPR/ECCV, Vinay Bettadapura

Paper

  • D. Castro, V. Bettadapura, and I. Essa (2016), “Discovering Picturesque Highlights from Egocentric Vacation Video,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Castro-DPHFEVV,
      arxiv    = {http://arxiv.org/abs/1601.04406},
      author  = {Daniel Castro and Vinay Bettadapura and Irfan
          Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      month    = {March},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Castro-DPHFEVV.pdf},
      title    = {Discovering Picturesque Highlights from Egocentric
          Vacation Video},
      url    = {http://www.cc.gatech.edu/cpl/projects/egocentrichighlights/},
      year    = {2016}
    }

Abstract

We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry, and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset, which includes 26.5 hours of videos taken over a 14-day vacation that spans many famous tourist destinations, and also provide results from a user study to assess our results.
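To make one of the aesthetic cues concrete: a color-vibrancy score can be approximated as mean pixel saturation weighted by brightness. The function below is our own hypothetical illustration of such a cue, not the paper's definition; the pixel values and weighting are assumptions.

```python
import colorsys

def color_vibrancy(pixels):
    """Mean HSV saturation weighted by value (brightness), over RGB
    pixels with channels in [0, 1]. One plausible proxy for a
    'color vibrancy' aesthetic cue; higher means more vivid."""
    score = 0.0
    for r, g, b in pixels:
        _, s, v = colorsys.rgb_to_hsv(r, g, b)
        score += s * v
    return score / len(pixels)

vivid = [(1.0, 0.1, 0.1), (0.1, 0.9, 0.2)]  # saturated red and green
drab = [(0.5, 0.5, 0.5), (0.6, 0.6, 0.6)]   # grays: zero saturation
print(color_vibrancy(vivid), color_vibrancy(drab))
```

Ranking frames by scores like this (combined with composition and symmetry cues) is the kind of per-frame quality signal the abstract describes for selecting highlight frames.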



Paper in MICCAI (2015): “Automated Assessment of Surgical Skills Using Frequency Analysis”

October 6th, 2015 Irfan Essa Posted in Activity Recognition, Aneeq Zia, Eric Sarin, Mark Clements, Medical, MICCAI, Papers, Vinay Bettadapura, Yachna Sharma

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, M. Clements, and I. Essa (2015), “Automated Assessment of Surgical Skills Using Frequency Analysis,” in International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI), 2015. [PDF] [BIBTEX]
    @InProceedings{    2015-Zia-AASSUFA,
      author  = {A. Zia and Y. Sharma and V. Bettadapura and E.
          Sarin and M. Clements and I. Essa},
      booktitle  = {International Conference on Medical Image Computing
          and Computer Assisted Interventions (MICCAI)},
      month    = {October},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Zia-AASSUFA.pdf},
      title    = {Automated Assessment of Surgical Skills Using
          Frequency Analysis},
      year    = {2015}
    }

Abstract

We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. A video analysis technique for extracting motion quality via frequency coefficients is introduced. The framework is tested in a case study involving analysis of videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between the surgeons' skill levels. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
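The intuition behind the frequency features can be sketched in a few lines: take the DCT of a motion time series and measure how much of its energy sits in the low-order coefficients. Smooth, deliberate motion concentrates there; jittery motion does not. This is a hedged pure-Python illustration of the general DCT idea, not the authors' feature pipeline; the signal length and cutoff `k=8` are arbitrary choices.

```python
import math

def dct(signal):
    """Unnormalized DCT-II of a 1-D signal."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * u / n)
            for i, x in enumerate(signal))
        for u in range(n)
    ]

def low_freq_energy(signal, k=8):
    """Fraction of total DCT energy in the first k coefficients."""
    coeffs = dct(signal)
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:k]) / total

# One slow oscillation vs. sample-to-sample jitter.
smooth = [math.cos(2 * math.pi * (i + 0.5) / 100) for i in range(100)]
jittery = [float((-1) ** i) for i in range(100)]
print(low_freq_energy(smooth), low_freq_energy(jittery))
```

The smooth signal puts essentially all of its energy into the low-order coefficients, while the alternating signal puts almost none there, so a vector of low-order DCT (or DFT) coefficients is a compact descriptor that separates the two motion styles, which is the property the skill classifier exploits.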
