Who's Better? Who's Best?: Pairwise Deep Ranking for Skill Determination

March 2018

Hazel Doughty, Dima Damen, Walterio Mayol-Cuevas

We present a method for assessing skill of performance from video, applicable to a variety of tasks, ranging from surgery to drawing and rolling pizza dough. We formulate the problem as pairwise (who's better) and overall (who's best)ranking of video collections, using supervised deep ranking. We propose a novel loss function that learns discriminative features when a pair of videos exhibit variance in skill, and learns shared features when a pair of videos exhibit comparable skill levels. Results demonstrate our method is applicable across tasks, with the percentage of correctly ordered pairs of videos ranging from 70% to 83% for four datasets. We demonstrate the robustness of our approach via sensitivity analysis of its parameters.

We see this work as effort toward the automated and objective organisation of how-to videos and overall, generic skill determination in video.

Skill Determination Overview


Hazel Doughty, Dima Damen and Walterio Mayol-Cuevas (2018). Who's Better, Who's Best: Skill Determination in Video using Deep Ranking. CVPR. PDF arXiv


The EPIC-Skills 2018 Dataset is available here. It contains the videos for the Drawing and Chopstick Using tasks alongside the annotations for all tasks.

The Surgery videos are taken from the JIGSAWS dataset and the Dough Rolling videos are taken from the CMU-MMAC pizza making activity.