Research Blog: Motion Stills – Create beautiful GIFs from Live Photos

June 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, In The News, Interesting, Matthias Grundmann, Projects

Kudos to the team from Machine Perception at Google Research that just launched the Motion Stills app, which generates novel photos on an iOS device. This work is in part aimed at combining efforts like Video Textures and Video Stabilization, and a lot more.

Today we are releasing Motion Stills, an iOS app from Google Research that acts as a virtual camera operator for your Apple Live Photos. We use our video stabilization technology to freeze the background into a still photo or create sweeping cinematic pans. The resulting looping GIFs and movies come alive, and can easily be shared via messaging or on social media.

Source: Research Blog: Motion Stills – Create beautiful GIFs from Live Photos


Paper in IEEE WACV (2015): “Finding Temporally Consistent Occlusion Boundaries using Scene Layout”

January 6th, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, S. Hussain Raza, Uncategorized


  • S. H. Raza, A. Humayun, M. Grundmann, D. Anderson, and I. Essa (2015), “Finding Temporally Consistent Occlusion Boundaries using Scene Layout,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Raza-FTCOBUSL,
      author  = {Syed Hussain Raza and Ahmad Humayun and Matthias
          Grundmann and David Anderson and Irfan Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.141},
      month    = {January},
      pdf    = {},
      publisher  = {IEEE Computer Society},
      title    = {Finding Temporally Consistent Occlusion Boundaries
          using Scene Layout},
      year    = {2015}
    }


We present an algorithm for finding temporally consistent occlusion boundaries in videos to support segmentation of dynamic scenes. We learn occlusion boundaries in a pairwise Markov random field (MRF) framework. We first estimate the probability of a spatiotemporal edge being an occlusion boundary by using appearance, flow, and geometric features. Next, we enforce occlusion boundary continuity in an MRF model by learning pairwise occlusion probabilities using a random forest. Then, we temporally smooth boundaries to remove temporal inconsistencies in occlusion boundary estimation. Our proposed framework provides an efficient approach for finding temporally consistent occlusion boundaries in video by utilizing causality, redundancy in videos, and semantic layout of the scene. We have developed a dataset with fully annotated ground-truth occlusion boundaries of over 30 videos (∼5000 frames). This dataset is used to evaluate temporal occlusion boundaries and provides a much-needed baseline for future studies. We perform experiments to demonstrate the role of scene layout, and temporal information for occlusion reasoning in video of dynamic scenes.


At ICVSS (International Computer Vision Summer School) 2013, in Calabria, Italy (July 2013)

July 11th, 2013 Irfan Essa Posted in Computational Photography, Computational Photography and Video, Daniel Castro, Matthias Grundmann, Presentations, S. Hussain Raza, Vivek Kwatra

Teaching at the ICVSS 2013, in Calabria, Italy, July 2013 (Programme)

Computational Video: Post-processing Methods for Stabilization, Retargeting and Segmentation

Irfan Essa
(This work in collaboration with
Matthias Grundmann, Daniel Castro, Vivek Kwatra, Mei Han, S. Hussain Raza).


We address a variety of challenges for analysis and enhancement of Computational Video. We present novel post-processing methods to bridge the difference between professional and casually shot videos mostly seen on online sites. Our research presents solutions to three well-defined problems: (1) Video stabilization and rolling shutter removal in casually-shot, uncalibrated videos; (2) Content-aware video retargeting; and (3) spatio-temporal video segmentation to enable efficient video annotation. We showcase several real-world applications building on these techniques.

We start by proposing a novel algorithm for video stabilization that generates stabilized videos by employing L1-optimal camera paths to remove undesirable motions. We compute camera paths that are optimally partitioned into constant, linear, and parabolic segments, mimicking the camera motions employed by professional cinematographers. To achieve this, we propose a linear programming framework that minimizes the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond conventional filtering, which only suppresses high-frequency jitter. An additional challenge in videos shot from mobile phones is rolling shutter distortion. Modern CMOS cameras capture the frame one scanline at a time, which results in non-rigid image distortions such as shear and wobble. We propose a solution based on a novel mixture model of homographies parametrized by scanline blocks to correct these rolling shutter distortions. Our method neither relies on a priori knowledge of the readout time nor requires prior camera calibration. Our novel video stabilization and calibration-free rolling shutter removal have been deployed on YouTube, where they have successfully stabilized millions of videos. We also discuss several extensions to the stabilization algorithm and present technical details behind the widely used YouTube Video Stabilizer.
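To make the objective in the paragraph above concrete, here is a minimal Python sketch that only evaluates the L1 norms of the first, second, and third differences of a one-dimensional camera path. It does not solve the linear program, it ignores crop-window constraints and rolling shutter, and the function names, weights, and sample paths are illustrative assumptions rather than anything taken from the deployed stabilizer. L1 penalties tend to drive many entries exactly to zero, which is why an optimal path breaks into segments where one of the three derivatives vanishes.

```python
def differences(values):
    """Forward differences values[t + 1] - values[t] of a 1-D sequence."""
    return [b - a for a, b in zip(values, values[1:])]


def derivative_norms(path):
    """L1 norms of the first, second, and third differences of a camera path."""
    d1 = differences(path)
    d2 = differences(d1)
    d3 = differences(d2)
    return tuple(sum(abs(x) for x in d) for d in (d1, d2, d3))


def l1_path_cost(path, w1=1.0, w2=1.0, w3=1.0):
    """Weighted sum of the three norms. The weights are illustrative assumptions."""
    n1, n2, n3 = derivative_norms(path)
    return w1 * n1 + w2 * n2 + w3 * n3


# Hypothetical 1-D paths (for example, horizontal camera translation per frame).
constant = [5.0] * 8                          # static camera: all three terms vanish
linear = [2.0 * t for t in range(8)]          # constant velocity: 2nd and 3rd terms vanish
parabolic = [0.5 * t * t for t in range(8)]   # constant acceleration: 3rd term vanishes
jittery = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # hand shake: every term is large

for name, path in [("constant", constant), ("linear", linear),
                   ("parabolic", parabolic), ("jittery", jittery)]:
    print(name, derivative_norms(path), l1_path_cost(path))
```

A full stabilizer would minimize this kind of cost over all admissible paths with a linear programming solver, subject to the constraint that the cropped frame stays inside the original frame.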

We address the challenge of changing the aspect ratio of videos by proposing algorithms that retarget videos to fit the form factor of a given device without stretching or letter-boxing. Our approaches use all of the screen's pixels while striving to deliver as much of the original video content as possible. First, we introduce a new algorithm that uses discontinuous seam-carving in both space and time for resizing videos. Our algorithm relies on a novel appearance-based temporal coherence formulation that allows for frame-by-frame processing and results in temporally discontinuous seams, as opposed to geometrically smooth and continuous seams. Second, we present a technique that builds on the above-mentioned video stabilization approach. We effectively automate classical pan-and-scan techniques by smoothly guiding a virtual crop window via saliency constraints.
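For readers unfamiliar with seam carving, the sketch below shows the classical single-image building block: a dynamic program that finds the connected vertical seam of minimum energy and removes it. This is a swapped-in, simpler technique, not the discontinuous spatio-temporal variant described above, and the energy values and function names are hypothetical.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy connected vertical seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of any seam ending at pixel (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest pixel in the last row.
    c = min(range(cols), key=cost[-1].__getitem__)
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=cost[r - 1].__getitem__)
        seam.append(c)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Drop the seam pixel from every row, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


# Hypothetical energy map with a low-energy diagonal.
energy = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 9, 1],
]
seam = find_vertical_seam(energy)
print(seam)                       # [1, 2, 3]
print(remove_seam(energy, seam))  # every row keeps its three high-energy pixels
```

In this classical form each seam pixel must be adjacent to the one in the previous row. The talk's formulation relaxes that kind of continuity requirement across frames in favor of appearance-based temporal coherence.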

Finally, we introduce an efficient and scalable technique for spatio-temporal segmentation of long video sequences using a hierarchical graph-based algorithm. We begin by over-segmenting a volumetric video graph into space-time regions grouped by appearance. We then construct a region graph over the obtained segmentation and iteratively repeat this process over multiple levels to create a tree of spatio-temporal segmentations. This hierarchical approach generates high-quality segmentations and allows subsequent applications to choose from varying levels of granularity. We demonstrate the use of spatio-temporal segmentation as users interact with the video, enabling efficient annotation of objects within the video.
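The over-segment, build a region graph, and repeat loop described above can be sketched on a toy one-dimensional chain of "voxels". This is a minimal sketch under stated assumptions: the merge rule is the graph-based criterion of Felzenszwalb and Huttenlocher (an assumption, since the abstract does not name the criterion), the region descriptor is a plain mean intensity, region sizes at higher levels are counted in regions rather than voxels, and all names and parameter values are hypothetical.

```python
def chain_edges(values):
    """Edges between neighbours in a 1-D chain, weighted by appearance difference."""
    return [(abs(values[i] - values[i + 1]), i, i + 1) for i in range(len(values) - 1)]


def segment(num_nodes, edges, k):
    """Greedy graph-based merging; returns a representative node id per node."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight inside each region so far

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for weight, u, v in sorted(edges):
        a, b = find(u), find(v)
        if a == b:
            continue
        # Merge only if the edge is no heavier than both regions' internal variation
        # plus a size-dependent slack k / |region|.
        if weight <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a
            size[a] += size[b]
            internal[a] = weight  # edges arrive in ascending order, so this is the max
    return [find(i) for i in range(num_nodes)]


def relabel(labels):
    """Map representative ids to 0, 1, 2, ... in order of first appearance."""
    ids = {}
    return [ids.setdefault(label, len(ids)) for label in labels]


def segment_hierarchy(values, ks):
    """Return one voxel labelling per level; a larger k gives coarser regions."""
    levels = []
    node_values = list(values)
    assignment = list(range(len(values)))  # voxel -> node at the current level
    for k in ks:
        labels = relabel(segment(len(node_values), chain_edges(node_values), k))
        assignment = [labels[node] for node in assignment]
        levels.append(list(assignment))
        # Region graph for the next level: one node per region, described by mean value.
        # Regions on a chain are contiguous, so chain order is the region adjacency.
        num_regions = max(labels) + 1
        sums, counts = [0.0] * num_regions, [0] * num_regions
        for voxel, region in enumerate(assignment):
            sums[region] += values[voxel]
            counts[region] += 1
        node_values = [s / c for s, c in zip(sums, counts)]
    return levels


intensities = [10, 11, 10, 30, 31, 30, 80, 81]
for level, labels in enumerate(segment_hierarchy(intensities, ks=[5, 25])):
    print(level, labels)
```

On this toy input the fine level yields three regions and the coarse level merges the two most similar ones, so an application can pick whichever granularity suits it.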

Part of this talk will show attendees how to use the Video Stabilizer on YouTube and the video segmentation system. Please find appropriate videos to test the systems.

Part of the work described above was done at Google, where Matthias Grundmann, Vivek Kwatra, and Mei Han work and where Professor Essa is a consultant. Other parts were carried out by Matthias Grundmann, Daniel Castro, and S. Hussain Raza as part of their research as students at Georgia Tech.


Paper in IEEE CVPR 2013 “Geometric Context from Videos”

June 27th, 2013 Irfan Essa Posted in Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, S. Hussain Raza

  • S. H. Raza, M. Grundmann, and I. Essa (2013), “Geometric Context from Videos,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013. [PDF] [WEBSITE] [VIDEO] [DOI] [BIBTEX]
    @InProceedings{    2013-Raza-GCFV,
      author  = {Syed Hussain Raza and Matthias Grundmann and Irfan
          Essa},
      booktitle  = {{Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)}},
      doi    = {10.1109/CVPR.2013.396},
      month    = {June},
      organization  = {IEEE Computer Society},
      pdf    = {},
      title    = {Geometric Context from Videos},
      url    = {},
      video    = {},
      year    = {2013}
    }


We present a novel algorithm for estimating the broad 3D geometric structure of outdoor video scenes. Leveraging spatio-temporal video segmentation, we decompose a dynamic scene captured by a video into geometric classes, based on predictions made by region-classifiers that are trained on appearance and motion features. By examining the homogeneity of the prediction, we combine predictions across multiple segmentation hierarchy levels alleviating the need to determine the granularity a priori. We built a novel, extensive dataset on geometric context of video to evaluate our method, consisting of over 100 ground-truth annotated outdoor videos with over 20,000 frames. To further scale beyond this dataset, we propose a semi-supervised learning framework to expand the pool of labeled data with high confidence predictions obtained from unlabeled data. Our system produces an accurate prediction of geometric context of video achieving 96% accuracy across main geometric classes.
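The semi-supervised step mentioned above, which expands the labeled pool with high-confidence predictions on unlabeled data, can be illustrated with a small pseudo-labeling sketch. The class names, the 0.9 threshold, and the function name are hypothetical placeholders, not values from the paper, and the classifier that produces the probabilities is assumed to exist elsewhere.

```python
def select_confident(predictions, threshold=0.9):
    """Pick unlabeled regions whose top class probability reaches the threshold.

    predictions: one dict per unlabeled region, mapping class name -> probability.
    Returns (region_index, predicted_label) pairs to add to the training pool.
    """
    selected = []
    for index, probabilities in enumerate(predictions):
        label, confidence = max(probabilities.items(), key=lambda item: item[1])
        if confidence >= threshold:
            selected.append((index, label))
    return selected


# Hypothetical classifier outputs for three unlabeled regions.
unlabeled_predictions = [
    {"sky": 0.95, "ground": 0.03, "vertical": 0.02},
    {"sky": 0.40, "ground": 0.35, "vertical": 0.25},
    {"sky": 0.01, "ground": 0.97, "vertical": 0.02},
]
print(select_confident(unlabeled_predictions))  # [(0, 'sky'), (2, 'ground')]
```

In a self-training loop the selected pairs would be appended to the labeled set, the region classifiers retrained, and the selection repeated. The ambiguous second region stays unlabeled.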

via IEEE Xplore – Geometric Context from Videos.


Google I/O 2013: Secrets of Video Stabilization on YouTube

May 28th, 2013 Irfan Essa Posted in Computational Photography and Video, Google, In The News, Matthias Grundmann, Presentations, Vivek Kwatra

Presentation at Google I/O 2013 by Matthias Grundmann, John Gregg, and Vivek Kwatra on our Video Stabilizer on YouTube.

Video stabilization is a key component of YouTube's video enhancement tools: all YouTube uploads are automatically checked for shakiness, and stabilization is suggested if needed. This talk will describe the technical details behind our fully automatic one-click stabilization technology, including aspects such as camera path optimization, rolling shutter detection and removal, distributed computing for real-time previews, and camera shake detection.

via Secrets of Video Stabilization on YouTube — Google I/O 2013.
