This README describes the Daimler Multi-Cue Occluded Pedestrian Classification Benchmark introduced in the publication:
M. Enzweiler, A. Eigenstetter, B. Schiele and D. M. Gavrila,
Multi-Cue Pedestrian Classification with Partial Occlusion Handling,
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
This dataset contains a collection of pedestrian (non-occluded and partially occluded) and non-pedestrian images.
This dataset is made freely available to academic and non-academic entities for research purposes such as academic research, teaching, scientific publications, or personal experimentation. Permission is granted to use, copy, and distribute the data given that you agree:
Our training and test samples consist of manually labeled pedestrian and non-pedestrian bounding boxes in images captured from a vehicle-mounted calibrated stereo camera rig in an urban environment. For each manually labeled pedestrian, we created additional samples by geometric jittering. Non-pedestrian samples were generated by a shape detection pre-processing step with a relaxed threshold setting, so they are biased towards more difficult patterns.
Dense stereo is computed using the semi-global matching algorithm (H. Hirschmueller, Stereo processing by semi-global matching and mutual information, IEEE PAMI, 30(2):328-341, 2008). To compute dense optical flow, we use structure- and motion-adaptive regularized flow (A. Wedel et al., Structure- and motion-adaptive regularization for high accuracy optic flow, ICCV, 2009).
Training and test samples have a resolution of 48 x 96 pixels with a 12-pixel border around the pedestrians. Note that the experiments in our paper (see above) were done on 36 x 84 pixel images with a border of 6 pixels, i.e. crops of the provided dataset, with a three-component layout corresponding to head, torso, and legs. For publication of the dataset, we chose to provide images with a larger border and without a pre-defined component layout to allow for higher flexibility in the selection of components.
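To make the relation between the two resolutions concrete, here is a minimal Matlab sketch that reduces one 96 x 48 (rows x cols) sample to 84 x 36 by removing 6 pixels from every side. A centered crop is an assumption derived from the border sizes stated above (12 pixels versus 6 pixels); the variable names and the synthetic image are illustrative only.

```matlab
% Synthetic stand-in for one sample stored as a 96 x 48 (rows x cols) image.
img = reshape(1:96*48, 96, 48);

% Assumption: the 36 x 84 images are centered crops, so the 12-pixel
% border shrinks to 6 pixels by removing 6 pixels from every side.
margin = 6;
cropped = img(margin+1:end-margin, margin+1:end-margin);

disp(size(cropped));   % 84 rows, 36 columns
```

The same index ranges can be applied to any sample once it has been reshaped into a 96 x 48 image.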
Datasets are provided in Matlab .mat format. Each file contains an N (rows) x M (cols) matrix, where N is the number of samples and M is the vectorized dimension of the images (48*96 = 4608). Images were vectorized row by row (note that Matlab itself uses column-wise ordering). Samples are aligned across image cues, so that the n-th sample in the intensity data corresponds to the n-th sample in the stereo and flow data.
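Because the row-wise layout differs from Matlab's column-wise ordering, a reshape has to be combined with a transpose. The following sketch uses a synthetic vector in place of a real dataset row (the variable names are illustrative) and shows the conversion in both directions, along with the index of a given pixel.

```matlab
% Synthetic stand-in for one row of a dataset matrix (1 x 4608).
width = 48; height = 96;
v = 1:(width*height);

% Row-wise layout: the first 48 entries form the top image row.
% reshape fills column by column, so reshape to width x height
% and transpose to obtain a height x width (96 x 48) image.
img = reshape(v, width, height)';

% Inverse direction: transpose back, then flatten column-wise.
v2 = reshape(img', 1, []);

% Pixel (r, c), 1-based, is stored at index (r-1)*width + c.
r = 10; c = 5;
idx = (r-1)*width + c;
```

Here `img(r, c)` and `v(idx)` refer to the same pixel, and `v2` is identical to `v`.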
| | Pedestrians | Non-Pedestrians |
|---|---|---|
| Training Set | 52112 samples: pedTrainIntensity.mat, pedTrainStereo.mat, pedTrainFlow.mat | 32465 samples: nonpedTrainIntensity.mat, nonpedTrainStereo.mat, nonpedTrainFlow.mat |
| Test Set (Non-Occluded) | 25608 samples: pedTestIntensity.mat, pedTestStereo.mat, pedTestFlow.mat | 16235 samples: nonpedTestIntensity.mat, nonpedTestStereo.mat, nonpedTestFlow.mat |
| Test Set (Partially Occluded) | 11160 samples: pedOccludedTestIntensity.mat, pedOccludedTestStereo.mat, pedOccludedTestFlow.mat | 16235 samples: nonpedTestIntensity.mat, nonpedTestStereo.mat, nonpedTestFlow.mat |
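As an illustration of how the pedestrian and non-pedestrian files might be combined, the sketch below stacks two classes into one sample matrix with a label vector and checks that the cues have matching sample counts. It uses small synthetic matrices in place of the loaded data, and concatenating raw pixel values across cues is a hypothetical layout chosen for illustration, not the feature representation used in the paper.

```matlab
% Small synthetic stand-ins for matrices loaded from the .mat files
% (e.g. the intensity and stereo training data of both classes).
M = 48*96;
pedIntensity    = rand(5, M);   pedStereo    = rand(5, M);
nonpedIntensity = rand(3, M);   nonpedStereo = rand(3, M);

% Cues are aligned sample by sample, so the row counts must agree.
assert(size(pedIntensity, 1) == size(pedStereo, 1));
assert(size(nonpedIntensity, 1) == size(nonpedStereo, 1));

% Hypothetical layout: concatenate the cues of each sample side by side,
% then stack pedestrians on top of non-pedestrians.
X = [pedIntensity, pedStereo; nonpedIntensity, nonpedStereo];

% Labels: 1 = pedestrian, 0 = non-pedestrian.
y = [ones(size(pedIntensity, 1), 1); zeros(size(nonpedIntensity, 1), 1)];
```

With the real training files, `X` would have 52112 + 32465 rows under this layout.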
In intensity images, each pixel encodes gray-level intensity. In stereo images, each pixel encodes the estimated depth in meters. In flow images, each pixel encodes the estimated (horizontal) sub-pixel optical flow between two temporally aligned images. Note that, to prevent flow values from becoming negative, an offset of 127 has been added to the estimated flow value, i.e. a stored value of 127 corresponds to zero flow.
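Signed flow values can be recovered by subtracting the offset again. The sketch below uses a few synthetic stored values; the storage type of the real flow matrices is an assumption here, and converting to double first is harmless for floating-point data while avoiding saturation for unsigned integer data.

```matlab
% Synthetic stored flow values (assumed type; real data may differ).
flowRaw = uint8([127 130 120; 0 127 255]);

% Offset that was added before storage, as described above.
flowOffset = 127;

% Convert to double before subtracting: uint8 arithmetic in Matlab
% saturates at 0, which would destroy all negative flow values.
flowSigned = double(flowRaw) - flowOffset;

disp(flowSigned);
```

A stored value of 127 maps back to zero flow; smaller stored values give negative flow and larger ones positive flow.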
Here is Matlab sample code to load and visualize the data:
    clear all; close all;

    %% load
    load('pedOccludedTestIntensity.mat');
    load('pedOccludedTestStereo.mat');
    load('pedOccludedTestFlow.mat');

    %% visualize (the 3rd sample)
    whichSample=3;

    figure(1);
    imshow(reshape(pedOccludedTestIntensity(whichSample,:), 48,96)',[]);

    figure(2);
    imshow(reshape(pedOccludedTestStereo(whichSample,:), 48,96)',[]);
    colormap hot;

    figure(3);
    imshow(reshape(pedOccludedTestFlow(whichSample,:), 48,96)',[]);
    colormap hot;
The resulting figures show, in order, the intensity, stereo, and flow images of the selected sample.
Please direct questions regarding the dataset and benchmarking procedure to Prof. Dr. Dariu Gavrila or Markus Enzweiler.