A dataset of breakfast preparation activities performed by 52 individuals in 18 kitchens, designed to reflect real-world recognition scenarios.
This dataset comprises 10 actions related to breakfast preparation. To date, it is one of the largest fully annotated datasets available, and it is intended for monitoring and analysis of daily activities under real-world conditions.
Dataset specifications: ~77 hours of video (>4 million frames), 320×240 pixel resolution (down-sampled), 15 fps, 3 to 5 uncalibrated cameras per location.
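As a quick sanity check, the quoted duration and frame rate can be compared with the quoted frame count. The sketch below is illustrative only: it assumes the ~77 hours is the total across all camera views and that every clip runs at 15 fps; the function and variable names are hypothetical and not part of any dataset tooling.

```python
# Figures quoted in the dataset specifications (approximate values).
HOURS_OF_VIDEO = 77                     # assumed to be the total over all camera views
FRAME_RATE_FPS = 15                     # frames per second
FRAME_WIDTH, FRAME_HEIGHT = 320, 240    # down-sampled resolution in pixels


def estimate_frame_count(hours: float, fps: int) -> int:
    """Return the approximate number of frames in `hours` of video at `fps`."""
    seconds = hours * 3600
    return round(seconds * fps)


total_frames = estimate_frame_count(HOURS_OF_VIDEO, FRAME_RATE_FPS)
pixels_per_frame = FRAME_WIDTH * FRAME_HEIGHT

print(total_frames)       # 4158000, consistent with ">4 million frames"
print(pixels_per_frame)   # 76800
```

Under these assumptions the estimate comes to roughly 4.16 million frames, which agrees with the stated figure of more than 4 million.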
Note: Large files are hosted externally on Dropbox or other services. Please contact the lab for access to external files.
The dataset includes predefined splits for evaluation.
Please cite the following papers when using this dataset:
Primary paper:
H. Kuehne, A. B. Arslan and T. Serre. The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities. CVPR, 2014.
Follow-up work:
H. Kuehne, J. Gall and T. Serre. An end-to-end generative framework for video segmentation and recognition. WACV, 2016.