GMU-Kitchens Scenes
===================

Contains 9 scenes {'gmu_scene_001', ..., 'gmu_scene_009'}

Each scene directory is named after the scene and has the following four subdirectories:

i)   Images: contains the RGB images of a scene
ii)  Depths: contains the depth images of a scene
iii) Annotations: contains the xml files with the bounding box annotations of just the 11 BigBird objects.
     Each annotation file has the same name as the corresponding RGB image in the 'Images' subdirectory.
     We followed the same format as the VOC2007/VOC2012 datasets.
iv)  Annotations_withExtra: similar to Annotations, but the xml files in this folder also contain
     ground truth bounding boxes for the extra objects that can be found scattered in the scenes.

Object categories and corresponding labels
==========================================

BigBird Objects:

coca_cola_glass_bottle                                         1
coffee_mate_french_vanilla                                     2
honey_bunches_of_oats_honey_roasted                            3
hunts_sauce                                                    4
mahatma_rice                                                   5
nature_valley_soft_baked_oatmeal_squares_cinnamon_brown_sugar  6
nature_valley_sweet_and_salty_nut_almond                       7
palmolive_orange                                               8
pop_secret_light_butter                                        9
pringles_bbq                                                   10
red_bull                                                       11

Extra Objects (and the scenes in which they appear):

bowl_1   12  scenes 1,2
bowl_2   13  scenes 1,2
bowl_3   14  scenes 5,6
bowl_4   15  scenes 5,6
bowl_5   16  scene 9
mug_1    17  scenes 1,2
mug_2    18  scenes 1,2
mug_3    19  scenes 5,6
mug_4    20  scenes 5,6
mug_5    21  scene 9
salt     22  scenes 1,2
cleaner  23  scenes 3,4

Train/test split:
=================

We did a 3-fold cross-validated experiment.
We did a random train/test split according to the scene names and prepared the following folds:

FOLD 1:
-------
train fold:{gmu_scene_001, gmu_scene_003, gmu_scene_006, gmu_scene_007, gmu_scene_008, gmu_scene_009}
test fold:{gmu_scene_002, gmu_scene_004, gmu_scene_005}

FOLD 2:
-------
train fold:{gmu_scene_001, gmu_scene_002, gmu_scene_004, gmu_scene_005, gmu_scene_007, gmu_scene_008}
test fold:{gmu_scene_003, gmu_scene_006, gmu_scene_009}

FOLD 3:
-------
train fold:{gmu_scene_002, gmu_scene_003, gmu_scene_004, gmu_scene_005, gmu_scene_006, gmu_scene_009}
test fold:{gmu_scene_001, gmu_scene_007, gmu_scene_008}

For each fold, we trained a model using the images of the 6 scenes in the train fold.
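
The folds and directory layout described above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the experiments: the helper names (split_scenes, annotation_files, read_boxes) are hypothetical, and read_boxes assumes the standard VOC tag layout (object / name / bndbox / xmin, ymin, xmax, ymax). Whether the name field holds the category name or the numeric label is not stated here, so the value is returned as-is.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# The 9 scene names and the test scenes of each fold, copied from the split above.
ALL_SCENES = [f"gmu_scene_{i:03d}" for i in range(1, 10)]
TEST_FOLDS = {
    1: ["gmu_scene_002", "gmu_scene_004", "gmu_scene_005"],
    2: ["gmu_scene_003", "gmu_scene_006", "gmu_scene_009"],
    3: ["gmu_scene_001", "gmu_scene_007", "gmu_scene_008"],
}


def split_scenes(fold):
    """Return (train_scenes, test_scenes) for fold 1, 2 or 3."""
    test = TEST_FOLDS[fold]
    train = [s for s in ALL_SCENES if s not in test]
    return train, test


def annotation_files(root_dir, scenes, with_extra=False):
    """Yield the xml annotation paths of the given scenes (hypothetical helper)."""
    sub = "Annotations_withExtra" if with_extra else "Annotations"
    for scene in scenes:
        yield from sorted((Path(root_dir) / scene / sub).glob("*.xml"))


def read_boxes(xml_source):
    """Parse one VOC-style annotation into (name, xmin, ymin, xmax, ymax) tuples.

    Assumes the usual VOC tag names; xml_source is a path or a file object.
    """
    root = ET.parse(xml_source).getroot()
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bb = obj.find("bndbox")
        coords = [int(float(bb.findtext(k)))
                  for k in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((name, *coords))
    return boxes
```

For example, split_scenes(1) returns the six train scenes and three test scenes of FOLD 1, and annotation_files(root, train, with_extra=True) would iterate over the Annotations_withExtra files of those scenes, where root is wherever the dataset was unpacked.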