Paper in IPCAI 2017 on “Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment”

June 21st, 2017 Irfan Essa Posted in Activity Recognition, Aneeq Zia, Computer Vision, Eric Sarin, Medical, MICCAI, Vinay Bettadapura, Yachna Sharma

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, and I. Essa (2017), “Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment,” in Proceedings of Information Processing in Computer-Assisted Interventions (IPCAI), 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Zia-VAMAASSA,
      author  = {A. Zia and Y. Sharma and V. Bettadapura and E. Sarin
          and I. Essa},
      booktitle  = {Proceedings of Information Processing in
          Computer-Assisted Interventions (IPCAI)},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Zia-VAMAASSA.pdf},
      title    = {Video and Accelerometer-Based Motion Analysis for
          Automated Surgical Skills Assessment},
      year    = {2017}
    }

Abstract

Purpose: Basic surgical skills of suturing and knot tying are an essential part of medical training. Having an automated system for surgical skills assessment could help save experts time and improve training efficiency. There have been some recent attempts at automated surgical skills assessment using either video analysis or acceleration data. In this paper, we present a novel approach for automated assessment of OSATS-based surgical skills and provide an analysis of different features on multi-modal data (video and accelerometer data).

Methods: We conduct the largest study, to the best of our knowledge, of basic surgical skills assessment, on a dataset containing video and accelerometer data for suturing and knot-tying tasks. We introduce entropy-based features, Approximate Entropy (ApEn) and Cross-Approximate Entropy (XApEn), which quantify the amount of predictability and regularity of fluctuations in time-series data. The proposed features are compared to existing methods, Sequential Motion Texture (SMT), Discrete Cosine Transform (DCT), and Discrete Fourier Transform (DFT), for surgical skills assessment.

Results: We report the average performance of the different features across all applicable OSATS criteria for the suturing and knot-tying tasks. Our analysis shows that the proposed entropy-based features outperform previous state-of-the-art methods using video data. For accelerometer data, our method performs better for suturing only. We also show that fusion of video and acceleration features can improve overall performance, with the proposed entropy features achieving the highest accuracy.

Conclusions: Automated surgical skills assessment can be achieved with high accuracy using the proposed entropy features. Such a system can significantly improve the efficiency of surgical training in medical schools and teaching hospitals.
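
For readers who want a concrete picture of the entropy features, below is a minimal NumPy sketch of Approximate Entropy in Pincus's standard formulation. This is an illustration, not the authors' code; XApEn is the analogous quantity computed between two time series, and the choices m=2 and r=0.2*std are common defaults, not values taken from the paper.

    import numpy as np

    def approximate_entropy(x, m=2, r=None):
        """Approximate Entropy (ApEn) of a 1-D time series.

        Lower ApEn means more regular, predictable fluctuations
        (e.g., smoother tool motion); higher ApEn means more
        irregularity. m is the template length and r the
        similarity tolerance.
        """
        x = np.asarray(x, dtype=float)
        n = len(x)
        if r is None:
            r = 0.2 * np.std(x)  # common default tolerance

        def phi(m):
            # All overlapping templates of length m.
            t = np.array([x[i:i + m] for i in range(n - m + 1)])
            # Pairwise Chebyshev distances between templates.
            d = np.max(np.abs(t[:, None, :] - t[None, :, :]), axis=2)
            # Fraction of templates within tolerance r (self-matches
            # included, so the argument of log is never zero).
            c = np.mean(d <= r, axis=1)
            return np.mean(np.log(c))

        return phi(m) - phi(m + 1)

    # Example: a regular signal scores low, noise scores high.
    rng = np.random.default_rng(0)
    print(approximate_entropy(np.sin(np.arange(500) * 0.1)))  # low: regular
    print(approximate_entropy(rng.standard_normal(500)))      # high: irregular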

  • Presented at the 8th International Conference on Information Processing in Computer-Assisted Interventions, in Barcelona, Spain, June 20-21, 2017.
  • Aneeq Zia was awarded the “Young Investigator Travel Award,” given to young investigators (including Ph.D. and MSc students and junior researchers) with accepted papers at IPCAI, to attend IPCAI/CARS 2017.
  • This paper was also one of the 12 papers voted by the audience for a 25-minute oral presentation and discussion session on the last day of the conference (based on the 5-minute short presentations given by all authors on the first day).

Paper in AAAI’s ICWSM (2017) “Selfie-Presentation in Everyday Life: A Large-Scale Characterization of Selfie Contexts on Instagram”

May 18th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, Face and Gesture, Julia Deeb-Swihart, Papers, Social Computing

Paper

  • J. Deeb-Swihart, C. Polack, E. Gilbert, and I. Essa (2017), “Selfie-Presentation in Everyday Life: A Large-Scale Characterization of Selfie Contexts on Instagram,” in Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Deeb-Swihart-SELLCSCI,
      author  = {Julia Deeb-Swihart and Christopher Polack and Eric
          Gilbert and Irfan Essa},
      booktitle  = {Proceedings of the International AAAI Conference
          on Web and Social Media (ICWSM)},
      month    = {May},
      organization  = {AAAI},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Deeb-Swihart-SELLCSCI.pdf},
      title    = {Selfie-Presentation in Everyday Life: A Large-Scale
          Characterization of Selfie Contexts on Instagram},
      year    = {2017}
    }

Abstract

Carefully managing the presentation of self via technology is a core practice on all modern social media platforms. Recently, selfies have emerged as a new, pervasive genre of identity performance. In many ways unique, selfies bring us full circle to Goffman—blending the online and offline selves together. In this paper, we take an empirical, Goffman-inspired look at the phenomenon of selfies. We report a large-scale, mixed-method analysis of the categories in which selfies appear on Instagram—an online community comprising over 400M people. Applying computer vision and network analysis techniques to 2.5M selfies, we present a typology of emergent selfie categories which represent emphasized identity statements. To the best of our knowledge, this is the first large-scale, empirical research on selfies. We conclude, contrary to common portrayals in the press, that selfies are really quite ordinary: they project identity signals such as wealth, health and physical attractiveness common to many online media, and to offline life.
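
The "network analysis" step can be pictured with a small sketch: represent each selfie by its tags, build a tag co-occurrence graph, and let graph communities suggest emergent categories. The tags below and the use of networkx modularity communities are hypothetical illustrations, not the authors' actual pipeline.

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities
    from itertools import combinations

    # Hypothetical input: each selfie described by tags (e.g., hashtags
    # or labels from a vision model).
    selfie_tags = [
        ["gym", "fitness", "health"],
        ["makeup", "beauty", "fashion"],
        ["gym", "health", "workout"],
        ["beauty", "fashion", "style"],
    ]

    # Tag co-occurrence graph: nodes are tags, edge weights count how
    # often two tags appear on the same selfie.
    G = nx.Graph()
    for tags in selfie_tags:
        for a, b in combinations(sorted(set(tags)), 2):
            if G.has_edge(a, b):
                G[a][b]["weight"] += 1
            else:
                G.add_edge(a, b, weight=1)

    # Communities in the graph suggest emergent selfie categories.
    for community in greedy_modularity_communities(G, weight="weight"):
        print(sorted(community))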


Paper in IJCNN (2017) “Towards Using Visual Attributes to Infer Image Sentiment Of Social Events”

May 18th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, Machine Learning, Papers, Unaiza Ahsan

Paper

  • U. Ahsan, M. D. Choudhury, and I. Essa (2017), “Towards Using Visual Attributes to Infer Image Sentiment Of Social Events,” in Proceedings of the International Joint Conference on Neural Networks, Anchorage, Alaska, US, 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Ahsan-TUVAIISSE,
      address  = {Anchorage, Alaska, US},
      author  = {Unaiza Ahsan and Munmun De Choudhury and Irfan
          Essa},
      booktitle  = {Proceedings of The International Joint Conference
          on Neural Networks},
      month    = {May},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Ahsan-TUVAIISSE.pdf},
      publisher  = {International Neural Network Society},
      title    = {Towards Using Visual Attributes to Infer Image
          Sentiment Of Social Events},
      year    = {2017}
    }

Abstract

Widespread and pervasive adoption of smartphones has led to instant sharing of photographs that capture events ranging from the mundane to life-altering happenings. We propose to capture the sentiment of such social event images by leveraging their visual content. Our method extracts an intermediate visual representation of social event images based on the visual attributes that occur in the images, going beyond sentiment-specific attributes. We map the top predicted attributes to sentiments and extract the dominant emotion associated with a picture of a social event. Unlike recent approaches, our method generalizes to a variety of social events, and even to unseen events that are not available at training time. We demonstrate the effectiveness of our approach on a challenging social event image dataset, where our method outperforms state-of-the-art approaches for classifying complex event images into sentiments.
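
As a rough sketch of how top predicted attributes could be mapped to a dominant sentiment, consider the toy example below; the attribute names, scores, and lexicon are invented for illustration and are not the mapping learned in the paper.

    # Hypothetical attribute-to-sentiment lexicon; the paper derives its
    # own mapping from visual attributes discovered in event images.
    ATTRIBUTE_SENTIMENT = {
        "smiling": "joy", "crowd": "excitement", "candles": "calm",
        "debris": "sadness", "protest_sign": "anger",
    }

    def dominant_sentiment(attribute_scores, top_k=3):
        """Vote the top-k predicted attributes into a dominant sentiment,
        weighting each vote by the attribute's confidence score."""
        top = sorted(attribute_scores, key=attribute_scores.get,
                     reverse=True)[:top_k]
        votes = {}
        for attr in top:
            s = ATTRIBUTE_SENTIMENT.get(attr)
            if s is not None:
                votes[s] = votes.get(s, 0.0) + attribute_scores[attr]
        return max(votes, key=votes.get) if votes else "neutral"

    # Scores as an attribute CNN might produce them (made-up values).
    scores = {"smiling": 0.91, "crowd": 0.80, "candles": 0.12,
              "debris": 0.05, "protest_sign": 0.02}
    print(dominant_sentiment(scores))  # -> joy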


Paper in IEEE WACV (2017): “Complex Event Recognition from Images with Few Training Examples”

March 27th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, PAMI/ICCV/CVPR/ECCV, Papers, Unaiza Ahsan

Paper

  • U. Ahsan, C. Sun, J. Hays, and I. Essa (2017), “Complex Event Recognition from Images with Few Training Examples,” in IEEE Winter Conference on Applications of Computer Vision (WACV), 2017. [PDF] [arXiv] [BIBTEX]
    @InProceedings{    2017-Ahsan-CERFIWTE,
      arxiv    = {https://arxiv.org/abs/1701.04769},
      author  = {Unaiza Ahsan and Chen Sun and James Hays and Irfan
          Essa},
      booktitle  = {IEEE Winter Conference on Applications of Computer
          Vision (WACV)},
      month    = {March},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Ahsan-CERFIWTE.pdf},
      title    = {Complex Event Recognition from Images with Few
          Training Examples},
      year    = {2017}
    }

Abstract

We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use them to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions, and event subtypes, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept, and we use (pre-trained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines that use deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a dataset with unseen event categories using a single training example.
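
A minimal sketch of the concept-bank idea follows, assuming CNN features have already been extracted for the web images retrieved per concept; the use of linear SVMs and the variable shapes are assumptions for illustration, not the paper's exact setup.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_concept_bank(web_feats_per_concept, neg_feats):
        """Train one linear classifier per discovered event concept.

        web_feats_per_concept: list of (n_i, d) arrays of CNN features
        for web images of each concept; neg_feats: (n_neg, d) negatives.
        """
        bank = []
        for feats in web_feats_per_concept:
            X = np.vstack([feats, neg_feats])
            y = np.r_[np.ones(len(feats)), np.zeros(len(neg_feats))]
            bank.append(LinearSVC(C=1.0).fit(X, y))
        return bank

    def concept_representation(cnn_feat, bank):
        """Describe one image by its response to every concept
        classifier; this compact vector replaces raw CNN features."""
        return np.array([clf.decision_function(cnn_feat[None])[0]
                         for clf in bank])

With only a handful of labeled event images, a simple classifier (e.g., nearest neighbor) over these concept responses can then stand in for training a deep model from scratch.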


Paper in M2CAI (workshop at MICCAI) on “Fine-tuning Deep Architectures for Surgical Tool Detection” and results of the Tool Detection Challenge

October 21st, 2016 Irfan Essa Posted in Aneeq Zia, Awards, Computer Vision, Daniel Castro, Medical, MICCAI

Paper

  • A. Zia, D. Castro, and I. Essa (2016), “Fine-tuning Deep Architectures for Surgical Tool Detection,” in Workshop and Challenges on Modeling and Monitoring of Computer Assisted Interventions (M2CAI), Held in Conjunction with International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Athens, Greece, 2016. [PDF] [WEBSITE] [BIBTEX]
    @InProceedings{    2016-Zia-FDASTD,
      address  = {Athens, Greece},
      author  = {Aneeq Zia and Daniel Castro and Irfan Essa},
      booktitle  = {Workshop and Challenges on Modeling and Monitoring
          of Computer Assisted Interventions (M2CAI), Held in
          Conjunction with International Conference on Medical
          Image Computing and Computer Assisted Intervention
          (MICCAI)},
      month    = {October},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Zia-FDASTD.pdf},
      title    = {Fine-tuning Deep Architectures for Surgical Tool
          Detection},
      url    = {http://www.cc.gatech.edu/cpl/projects/deepm2cai/},
      year    = {2016}
    }

Abstract

(Figure: visualization of some of the training videos.)

Understanding surgical workflow has been a key concern of the medical research community. One of the main advantages of surgical workflow detection is real-time operating room (OR) scheduling. For hospitals, each minute of OR time matters for reducing cost and increasing patient throughput. Traditional approaches in this field generally tackle video analysis using hand-crafted video features to facilitate tool detection. Recently, Twinanda et al. presented a CNN architecture, 'EndoNet', which outperformed previous methods for both surgical tool detection and surgical phase detection. Given the recent success of such networks, we present a study of various architectures, coupled with a submission to the M2CAI Surgical Tool Detection challenge. We achieved a top-3 result in the M2CAI competition with a mAP of 37.6.
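
A minimal PyTorch sketch of this kind of fine-tuning setup appears below; the ResNet-50 backbone and hyperparameters are stand-ins rather than the specific architectures compared in the paper. Since multiple tools can appear in a frame, tool detection is multi-label, hence a per-tool sigmoid loss; the seven-tool count follows the m2cai16-tool challenge annotations.

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_TOOLS = 7  # tools annotated in the m2cai16-tool challenge

    # Start from an ImageNet-pretrained backbone and replace the final
    # classifier with one logit per tool.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, NUM_TOOLS)

    # Tools co-occur in frames, so use independent per-tool sigmoids
    # (multi-label) rather than a softmax over tools.
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

    def train_step(frames, labels):
        """One fine-tuning step.

        frames: (B, 3, 224, 224) float tensor of video frames;
        labels: (B, NUM_TOOLS) binary tool-presence indicators.
        """
        optimizer.zero_grad()
        loss = criterion(model(frames), labels.float())
        loss.backward()
        optimizer.step()
        return loss.item()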

