Project: SwRI/NIST Code Sprint
I am a PhD student at the Intelligent Autonomous Systems Laboratory of the University of Padua, Italy. My work focuses on people detection, tracking and re-identification from RGB-D sensors.
Below I summarize the tracking framerates for the main approaches I tested (measured on an Intel i7-3630QM processor, 2.4-3.4 GHz, 4 cores, 8 threads). These framerates were measured while publishing input images at 50 fps.
- original approach in run_from_kinect_nodelet.launch: 23.8 fps
- approach in run_kinect_nodelet_CodeSprint.launch (with HogSvmPCL node): 33 fps
- approach in run_kinect_CodeSprint_bis.launch (with PCL’s people detector): 25 fps.
In the figure below, I report Detection Error Trade-off (DET) curves which compare the main tracking approaches contributed to the human_tracker project in terms of False Positives Per Frame (x axis) and False Rejection Rate (y axis). The curves were obtained by varying the minimum confidence parameter of the HOG+SVM detector. The ideal working point for these curves is the bottom-left corner (FRR = 0% and FPPF = 0). For visualization purposes, the curves are plotted in logarithmic scale.
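For reference, the two metrics on the axes can be computed from raw detection counts as in the sketch below. The counts and the det_point helper are illustrative only, not the evaluation code used for these curves:

```python
# Sketch of how FRR and FPPF are obtained for one point of a DET curve.
# The counts below are illustrative, not measured values.

def det_point(false_negatives, ground_truth, false_positives, num_frames):
    """Return (FRR, FPPF) for one detector operating point."""
    frr = false_negatives / ground_truth  # fraction of missed people
    fppf = false_positives / num_frames   # false alarms per frame
    return frr, fppf

# Sweeping the minimum confidence threshold traces out the curve:
# a higher threshold yields fewer false positives but more misses.
frr_low, fppf_low = det_point(false_negatives=50, ground_truth=1000,
                              false_positives=400, num_frames=500)
frr_high, fppf_high = det_point(false_negatives=200, ground_truth=1000,
                                false_positives=50, num_frames=500)

print(frr_low, fppf_low)    # 0.05 0.8
print(frr_high, fppf_high)  # 0.2 0.1
```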
Note that the approaches developed during this Code Sprint (red and green curves) achieve considerably lower FPPF than the original code (blue curve). Moreover, they are also faster than the original approach.
As the final step of the code sprint, I created a ROS node which performs people detection with the ground-based people detector available in PCL 1.7 (GroundBasedPeopleDetectionApp).
Instead of rgb and disparity images, this node takes as input an XYZRGB pointcloud, performs people detection and outputs a message containing the detected rois, like the other detection nodes of the human_tracker project. This node is called ground_based_people_detector (GPD) and it is used in run_kinect_CodeSprint_bis.launch of the System_Launch package. In this launch file, the detection cascade is composed of the GPD and HaarDispAda nodes. The GPD node produces very good detections which are persistent in time and well centered on people, while the HaarDispAda node removes some false positives. The framerate is about 25 fps, slightly lower than that of the approach in run_kinect_nodelet_CodeSprint.launch.
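At its core, ground-based detection removes the points belonging to the ground plane and then looks for clusters with a person-like height above it. The simplified numpy sketch below illustrates that idea only; it is not PCL's implementation, and the ground plane, thresholds and naive x-binning clustering are all assumptions:

```python
import numpy as np

# Simplified illustration of ground-based people detection:
# 1) drop points close to a known ground plane, 2) cluster the rest
# (here naively, by binning along x), 3) keep clusters of person-like height.
# This is NOT PCL's GroundBasedPeopleDetectionApp, only the underlying idea.

def detect_people(points, ground_z=0.0, ground_tol=0.1,
                  min_height=1.3, max_height=2.3, bin_size=0.5):
    # Keep only points sufficiently above the ground plane z = ground_z.
    above = points[points[:, 2] > ground_z + ground_tol]
    # Naive clustering: group the remaining points into bins along x.
    bins = np.floor(above[:, 0] / bin_size).astype(int)
    detections = []
    for b in np.unique(bins):
        cluster = above[bins == b]
        height = cluster[:, 2].max()  # top of the cluster above the ground
        if min_height <= height <= max_height:
            detections.append((b * bin_size, height))
    return detections

# A fake "person" 1.7 m tall at x ~ 1.1 and a low obstacle at x ~ 3.1.
person = np.column_stack([np.full(20, 1.1), np.zeros(20), np.linspace(0.2, 1.7, 20)])
box = np.column_stack([np.full(10, 3.1), np.zeros(10), np.linspace(0.2, 0.6, 10)])
cloud = np.vstack([person, box])
detections = detect_people(cloud)
print(detections)  # only the person-sized cluster survives
```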
The figure below shows an example of the output of the nodes launched by run_kinect_CodeSprint_bis.launch:
In order to better analyze detection results with the roiViewer package, I added a confidence field to the RoiRect message, which can be used to store the score computed by a detection node for every Roi.
I also modified roiViewer so that it displays this confidence when the show_confidence parameter is set to true.
To reduce the latency and increase the framerate of the detection cascade, I tried to reduce the detection nodelets to only two:
- Consistency nodelet
- HogSvm nodelet
The framerate dropped considerably (from 28.2 to 9.8 fps) because the HogSvm nodelet was too slow to process all the rois output by the Consistency nodelet.
For this reason, I implemented a new nodelet, called HogSvmPCL, which exploits the HOG descriptor and the pre-trained Support Vector Machine in PCL 1.7 for detecting whole persons in RGB image patches. This new node is much faster than HogSvm while remaining very accurate in classification. By using the reduced cascade with HogSvmPCL in place of HogSvm (Consistency + HogSvmPCL), the tracking framerate tripled with respect to the Consistency + HogSvm approach (from 9.8 to 30 fps). The FRR increased by about 4 percentage points (from 18.24% to 22.48%) with respect to the default code from SwRI, while the FPPF improved by about 35% (from 0.58 to 0.38).
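As a quick sanity check, these two changes can be recomputed directly from the raw numbers (taken verbatim from the text above):

```python
# Recomputing the reported changes (numbers taken from the text above).
frr_before, frr_after = 18.24, 22.48   # percent
fppf_before, fppf_after = 0.58, 0.38   # false positives per frame

frr_change_points = frr_after - frr_before  # absolute change, percentage points
fppf_improvement = (fppf_before - fppf_after) / fppf_before * 100  # relative reduction

print(round(frr_change_points, 2))  # 4.24 (FRR worsens by ~4 points)
print(round(fppf_improvement, 1))   # 34.5 (~35% fewer false positives per frame)
```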
I also trained an SVM to recognize upper bodies, in order to be more robust when the lower part of a person is occluded or out of the image. This approach led to a further reduction of false positives of about 15%, while maintaining the same false rejection rate and framerate. In the launch files, the whole-body or half-body classifier can be selected by setting the mode parameter and specifying the path to the classifier file with classifier_file.
To further improve accuracy, I also exploited disparity information by adding to the detection cascade the HaarDispAda nodelet, which uses Haar features on the disparity image and AdaBoost as a classifier. With respect to the cascade composed only of the Consistency and HogSvmPCL nodelets, accuracy improved considerably: in particular, the FRR decreased by 0.5% and the FPPF by 55%.
The nodes of this cascade (Consistency + HogSvmPCL + HaarDispAda + ObjectTracking) can be launched with run_kinect_nodelet_CodeSprint.launch.
A node for creating XYZRGB pointclouds from RGB and disparity images has been contributed to the human_tracker repository. This node is called pointcloudPublisher and performs the following steps:
- reading images and camera parameters from folder
- pointcloud creation
- depth to rgb registration
- pointcloud publishing to /camera/depth_registered/points topic, which is the standard pointcloud topic for OpenNI devices.
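The pointcloud-creation step amounts to back-projecting every pixel through a pinhole stereo model. The numpy sketch below illustrates this; the intrinsics and baseline are illustrative values, not the parameters read by pointcloudPublisher, and the depth-to-rgb registration step is omitted:

```python
import numpy as np

# Minimal sketch of building XYZ points from a disparity image with a
# pinhole stereo model. All camera parameters here are illustrative,
# not the ones pointcloudPublisher reads from folder.

def disparity_to_xyz(disparity, fx, fy, cx, cy, baseline):
    """Back-project a disparity image (pixels) to XYZ coordinates (meters)."""
    h, w = disparity.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = disparity > 0  # disparity 0 means no depth measurement
    z = np.where(valid, fx * baseline / np.where(valid, disparity, 1), 0.0)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack([x, y, z])  # h x w x 3; pairing with RGB gives XYZRGB

# Tiny 2x2 example with assumed fx = 500 px and baseline = 0.075 m.
disp = np.array([[50.0, 0.0],
                 [25.0, 100.0]])
xyz = disparity_to_xyz(disp, fx=500.0, fy=500.0, cx=1.0, cy=1.0, baseline=0.075)
print(xyz[0, 0, 2])  # depth for disparity 50: 500 * 0.075 / 50 = 0.75 m
```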
This node makes it possible to apply PCL's people detector, which works on pointclouds, even when only RGB and disparity images are available. Later on, I will contribute a ROS node which exploits this detector.