Honda Research Institute is sponsoring the development of 3D technologies for driver-assistance systems in PCL.
For a complete list of all the present and past PCL code sprints please visit http://www.pointclouds.org/blog.
Click on any of the links below to find out more about the PCL developers participating in this sprint:
The Honda Research Institute code sprint has been completed. All code has been committed, and the final report is attached below.
We have implemented almost everything that was planned, and now we want to present our results.
First of all, please watch this video with results on the whole Castro dataset. The road is marked in red; the left and right road borders are indicated by green and blue lines, respectively.
The main problem is the presence of holes in the road surface, caused by holes in the input disparity maps. We decided not to inpaint them, because we have no information about the scene at those points. Instead, we compute the quality of the labeling only at points with known disparity. This allows us to evaluate our method independently of the quality of the method used to generate the disparity maps.
Our method has a significant advantage in situations where one or both curbs are clearly visible (with corresponding sidewalks). You can compare the result of the previous sprint's method (left) with the result of our method (right) on the same frame, which has two clearly visible sidewalks.
Next, I am going to show the numerical results. Precision is the ratio of correctly detected road points to all pixels detected as road; recall is the percentage of actual road points that were detected. Only points with known disparity are taken into account.
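The evaluation protocol can be sketched as follows (a minimal NumPy sketch; the mask layout and function name are illustrative, not the actual evaluation code):

```python
import numpy as np

def road_precision_recall(predicted, ground_truth, known_disparity):
    """Precision/recall for road labeling, evaluated only at pixels
    with known disparity. All inputs are boolean HxW masks."""
    pred = predicted & known_disparity
    gt = ground_truth & known_disparity
    tp = np.count_nonzero(pred & gt)          # correctly detected road pixels
    precision = tp / max(np.count_nonzero(pred), 1)
    recall = tp / max(np.count_nonzero(gt), 1)
    return precision, recall
```

Restricting both masks to pixels with known disparity is what decouples the score from the disparity-map quality.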
We have implemented an algorithm that processes frames independently (i.e., without using information from previous frames). For now, we also assume that both curbs (left and right) are present in the scene.
Below you can see a projection of the labeled DEM onto the left image. Green points correspond to the left sidewalk, blue points to the right one. Red points mark the road surface. The algorithm could not find the right curb in this image, so the right side of the road was labeled incorrectly. The good news is that the left curb was detected correctly.
However, our goal is to label the road on the image, not on the DEM. So, if we mark each pixel with the label of its corresponding DEM cell, we get the following labeling of the road surface:
You can see a lot of holes in the road area. They are caused by holes in the disparity map. We decided not to fill them, because someone or something could be located there, and we simply have no information about those points.
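The DEM-to-pixel label transfer described above can be sketched like this (the grid geometry, label codes, and the convention that invalid disparity is stored as 0 are all assumptions for illustration):

```python
import numpy as np

UNKNOWN = -1  # illustrative code for pixels with no valid disparity

def project_dem_labels(disparity, dem_labels, col_step, disp_step):
    """Transfer labels from DEM cells (a column-disparity grid) back to
    image pixels; pixels without valid disparity stay UNKNOWN."""
    h, w = disparity.shape
    out = np.full((h, w), UNKNOWN, dtype=int)
    valid = disparity > 0                       # 0 encodes "no disparity" here
    cols = np.broadcast_to(np.arange(w) // col_step, (h, w))
    disps = np.zeros((h, w), dtype=int)
    disps[valid] = (disparity[valid] // disp_step).astype(int)
    out[valid] = dem_labels[cols[valid], disps[valid]]
    return out
```

Pixels falling into holes of the disparity map are exactly those that remain UNKNOWN, which is why the projected labeling shows holes in the road area.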
The disparity map for this frame is shown below. Points without disparity are marked in red.
Hello everybody. For the last few weeks I have been training an SVM for car recognition. For this purpose I used point clouds of the city of Enschede, Netherlands, which I had manually labeled earlier. The training set consists of 401 clouds of cars and 401 clouds of other objects (people, trees, signs, etc.). As the classifier, I used the Support Vector Machine from the libSVM library.
During training I used 5-fold cross-validation and a grid search to find the best values of gamma and C (the Gaussian kernel width and the soft-margin penalty, respectively). The best accuracy achieved during cross-validation was 91.2718%, with the best gamma and C found by the search.
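The grid search over (gamma, C) with k-fold splitting can be sketched as below. The scoring function here is a toy stand-in (a real one would train libSVM on four folds and test on the held-out fold); the parameter values and all names are illustrative:

```python
import itertools

def k_fold_indices(n, k=5):
    """Split sample indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]

def grid_search(param_grid, cv_score):
    """Exhaustive grid search: return the (gamma, C) pair with the
    highest cross-validation score."""
    return max(itertools.product(*param_grid), key=lambda p: cv_score(*p))

# Toy stand-in scorer with a single peak; a real cv_score would average
# fold accuracies from an actual SVM training run.
def toy_score(gamma, C):
    return -((gamma - 0.1) ** 2 + (C - 10) ** 2)

gammas = [0.01, 0.1, 1.0]
Cs = [1, 10, 100]
best_gamma, best_C = grid_search((gammas, Cs), toy_score)
```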
The model obtained after training was then used for recognition. The test set consists of 401 cars and 401 other objects; the training and test sets were taken randomly from different scanned streets. The best accuracy achieved so far on the test set is 90.7731% (728 of 802 objects correctly recognized).
As for descriptors, I used a combination of the RoPS feature and some global features such as the height and width of the oriented bounding box. The RoPS feature was calculated at the center of mass of the cloud, with a support radius large enough to include all points of the given cloud.
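The smallest support radius that turns RoPS into a whole-cloud descriptor is just the largest distance from the center of mass to any point; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def support_radius(points):
    """Radius around the center of mass that encloses every point of the
    cloud, so a 'local' descriptor computed there covers the whole cloud."""
    centroid = points.mean(axis=0)
    return np.linalg.norm(points - centroid, axis=1).max()
```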
Since RoPS is better suited for local feature extraction, I believe that using it with ISM and Hough-transform voting will result in higher accuracy.
Hello everybody. I would like to thank Yulan Guo, one of the authors of the RoPS feature, for his help. I have tested my implementation against his and got the same results. I have also checked my implementation for memory leaks with VLD, and it works fine; no memory leaks were detected. The code is now ready for commit, and as always I have written a tutorial on using it. All that is left is to decide where to place the implemented code.
Hello everybody. I have finished implementing the RoPS feature. The next step is to write to the authors and ask for some sample data and precomputed features, so that I can compare results. After that I plan to test the RoPS feature for object recognition, using the Implicit Shape Model algorithm from PCL.
We are pleased to report that labeling of the Castro dataset (6440 frames) is finished. Here are some examples of labeled images:
We also tested an algorithm developed by Alex Trevor in the previous HRI code sprint. This algorithm segments points using the distribution of their normals, which makes it very sensitive to noise.
Basically, this algorithm computes a disparity map for a stereo pair using its own dense matching method, implemented by Federico Tombari. I additionally tested it using disparity maps precomputed by HRI. Here are typical results (left: disparity computed with Federico Tombari's method; right: precomputed by HRI):
You can see that Federico Tombari's method works better with the normal-based algorithm. But it is still not good enough for describing the scene; there are a lot of false positives.
The HRI disparity maps contain some noise, and many pixels have no valid disparity; sometimes there are no segments similar to a road, and there are many frames in which no road was found at all.
This algorithm has disparity thresholds and does not mark as "road" any point that falls outside them. I did not take this into account, which makes the comparison not entirely fair to this algorithm; with that in mind, a recall of 50% would already be a very good result for it.
The goal is to find all pixels that belong to the road. The overall results are shown in the image below (precision is the ratio of correctly detected road points to all detected pixels; recall is the percentage of road points that were detected):
Hello everybody. I have finally finished the code for the simple features. I have implemented a pcl::MomentOfInertiaEstimation class, which computes descriptors based on eccentricity and moment of inertia. This class can also extract the axis-aligned and oriented bounding boxes of the cloud. Keep in mind, though, that the extracted OBB is not the minimal possible bounding box.
The idea of the feature extraction method is as follows. First, the covariance matrix of the point cloud is calculated and its eigenvalues and eigenvectors are extracted. The resultant eigenvectors are normalized and always form a right-handed coordinate system (the major eigenvector represents the X-axis and the minor eigenvector the Z-axis). Then an iterative process takes place: on each iteration the major eigenvector is rotated. The rotation order is always the same and is performed around the other eigenvectors; this provides invariance to rotation of the point cloud. We will refer to this rotated major vector as the current axis.
For every current axis, the moment of inertia is calculated. The current axis is also used for the eccentricity calculation: it is treated as the normal of a plane, the input cloud is projected onto that plane, and the eccentricity is calculated for the obtained projection.
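The two per-axis quantities can be sketched in NumPy as below. This is one plausible reading of the procedure, with illustrative names; the exact formulas used inside pcl::MomentOfInertiaEstimation may differ:

```python
import numpy as np

def axis_moment_of_inertia(points, axis):
    """Moment of inertia about a line through the centroid along `axis`:
    the sum of squared point distances to that line."""
    c = points - points.mean(axis=0)
    axis = axis / np.linalg.norm(axis)
    proj = c @ axis                          # coordinate along the axis
    dist_sq = (c ** 2).sum(axis=1) - proj ** 2
    return dist_sq.sum()

def projection_eccentricity(points, axis):
    """Project the cloud onto the plane with normal `axis` and measure the
    eccentricity of the projection from its in-plane principal variances."""
    c = points - points.mean(axis=0)
    axis = axis / np.linalg.norm(axis)
    flat = c - np.outer(c @ axis, axis)      # remove the normal component
    vals = np.linalg.eigvalsh(np.cov(flat.T))
    l_max, l_mid = vals[-1], vals[-2]        # two largest variances lie in-plane
    if l_max <= 0:
        return 0.0
    return float(np.sqrt(max(0.0, 1.0 - l_mid / l_max)))
```

A circular projection gives eccentricity 0, a degenerate (line-like) projection gives 1, matching the usual ellipse definition.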
The implemented class also provides methods for obtaining the AABB and the OBB. The oriented bounding box is computed as the AABB in the coordinate frame of the eigenvectors.
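The eigenvector-frame AABB construction can be sketched as follows (a minimal NumPy sketch, not the actual PCL implementation; note this box is not guaranteed to have minimal volume):

```python
import numpy as np

def oriented_bounding_box(points):
    """OBB computed as the AABB in the eigenvector frame of the
    covariance matrix. Returns (center, rotation, extents), where the
    columns of `rotation` are the box axes."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    _, rotation = np.linalg.eigh(np.cov(centered.T))   # columns = eigenvectors
    local = centered @ rotation                        # points in the eigenbasis
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = centroid + rotation @ ((lo + hi) / 2.0)   # back to world frame
    extents = hi - lo
    return center, rotation, extents
```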
At first, we decided to implement some simple features such as:
I have already started implementing them. After this step I will implement some more complex descriptors (e.g., 3D SURF and RoPS, Rotational Projection Statistics). Finally, I am going to use machine learning methods for object recognition.
To evaluate the developed algorithm we need ground truth, so we decided to outsource manual labeling. For this purpose a highly efficient tool was developed. A person can solve this task easily; however, there are some difficulties.
First of all, what should be done if there are several separate roads in the scene? Our solution is to mark as "road" only the pixels of the road on which the vehicle is driving. Below is an example of such a frame (left) and its labeling (right).
How should an image be labeled if two roads that were separate in previous frames merge in the current frame? We decided to mark the pixels of the first road, plus those pixels of the second road lying above the horizontal line drawn through the end of the curb that separates the roads. Here is an example for explanation:
We reduced the manual labeling time by a factor of 10 compared to our initial version, and we can now obtain enough labeled data in a reasonable time. All results will be made publicly available later.
The project “Fast 3D cluster recognition of pedestrians and cars in uncluttered scenes” has been started!
The project “Part-based 3D recognition of pedestrians and cars in cluttered scenes” has been started!
A few words about the project: the goal is to detect the drivable area (a continuously flat area bounded by a height gap such as a curb). As input we have two rectified images from cameras on the car's roof and a disparity map. An example of such images is shown below.
The point cloud that was computed from them:
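Reprojecting a disparity map from a rectified stereo pair to 3D points follows the standard relations Z = f·B/d, X = (u − cx)·Z/f, Y = (v − cy)·Z/f; a minimal sketch (the function name and the convention that invalid disparity is non-positive are assumptions):

```python
import numpy as np

def disparity_to_points(disparity, f, baseline, cx, cy):
    """Reproject a disparity map to an Nx3 array of 3D points using the
    standard rectified-stereo relations. Pixels with non-positive
    disparity are skipped."""
    v, u = np.nonzero(disparity > 0)
    d = disparity[v, u]
    z = f * baseline / d            # depth from disparity
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return np.stack([x, y, z], axis=1)
```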
To simplify the task, the point cloud is converted into a Digital Elevation Map (DEM). The DEM is a grid in column-disparity space with a height associated with each node. A projection of the DEM onto the left image is illustrated below.
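Building such a DEM amounts to binning points by image column and disparity and averaging their heights per cell; a minimal NumPy sketch (grid sizes and names are illustrative):

```python
import numpy as np

def build_dem(u, disparity, height, n_cols, n_disps, col_step, disp_step):
    """Accumulate point heights into a column-disparity grid and average;
    cells that receive no points are left as NaN."""
    ci = np.clip(u // col_step, 0, n_cols - 1).astype(int)
    di = np.clip((disparity // disp_step).astype(int), 0, n_disps - 1)
    total = np.zeros((n_cols, n_disps))
    count = np.zeros((n_cols, n_disps))
    np.add.at(total, (ci, di), height)   # unbuffered add handles repeated cells
    np.add.at(count, (ci, di), 1)
    dem = np.full((n_cols, n_disps), np.nan)
    mask = count > 0
    dem[mask] = total[mask] / count[mask]
    return dem
```

The NaN cells are exactly the red "no data" nodes in the DEM visualizations.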
On the following image you can see that, despite the low resolution of the DEM, it is still possible to distinguish the road from the sidewalk.
A front view of the DEM in 3D space (nodes with no corresponding points, i.e. nodes onto which no disparity-map points project, are marked in red):
DEM as a point cloud:
As a starting point we are going to implement the algorithm from: J. Siegemund, U. Franke, and W. Förstner, "A temporal filter approach for detection and reconstruction of curbs and road surfaces based on conditional random fields," in Proc. IEEE Intelligent Vehicles Symposium, 2011, pp. 637-642.
All source code related to the project can be found here.
The Honda Research code sprint for ground segmentation from stereo has been completed. PCL now includes tools for generating disparity images and point clouds from stereo data, courtesy of Federico Tombari, as well as tools for segmenting the ground surface from such point clouds, contributed by myself. Attached is a report detailing the additions to PCL and the results, along with a video overview of the project. A demo is available in trunk apps as pcl_stereo_ground_segmentation.