PCL Developers blog

Roman Shapovalov

This is my personal page

email: shapovalov@graphics.cs.msu.ru
project: Geometric object recognition
mentor: Aitor Aldoma (Radu B. Rusu [Zoltan-Csaba Marton, Vincent Rabaud, Nico Blodow, Michael Dixon])

About me

I’m a student at Moscow State University.

My hobbies include urban orienteering, snowboarding and editing Wikipedia. See also my homepage.

Project summary / Roadmap

In this project we would like to implement new object descriptors and test them together with the already existing descriptors. Moreover, we are interested in the pose of the objects in the scene, and ideally we would like to handle several levels of occlusion.

We would like to focus on the geometric characteristics of surfaces/objects, but we may decide to incorporate other kinds of information, such as color.

For benchmarking, we might use different sources of training data, such as CAD models and data obtained from real sensors like the Kinect. We might use existing databases, like the one presented in the Perception Challenge from Willow Garage, or create new data sets that better fit our specific needs.

Finally, one very important piece missing from PCL is a generic descriptor/feature matcher that makes it possible to exchange the type of descriptor being used, for optimal benchmarking.

Generic Trainer / Matcher

Roadmap

Here is a brief outline of my GSoC project milestones:

  • Fill out this page
  • Decide on an appropriate benchmark data set for object recognition and pose estimation.
    • CAD models.
    • Data obtained from the Kinect.
  • Implement generic trainer
    • Given training data in a certain format, it should load the data and compute the chosen descriptor on it. Finally, it should make all the information the matcher needs persistent.
    • Documentation.
    • Some descriptors might need specific subclasses to fulfill their particular needs.
  • Implement generic matcher
    • Similar to the generic trainer but for matching purposes.
    • Needs to be able to interface with the data generated during training (a rough interface sketch for the trainer and matcher follows this list).
  • Benchmarking of existing descriptors.
    • If needed, create a new training set.
    • Evaluate accuracy in object recognition and pose estimation, runtime performance, etc.
  • Implement new descriptors.
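
As a sketch of the trainer and matcher items above, the generic pair could look roughly like this in C++. None of these class or method names exist in PCL; this is only a hypothetical shape for the API:

#include <string>
#include <pcl/point_cloud.h>

// Hypothetical interface: a trainer computes descriptors of type DescriptorT
// on training clouds and persists them; a matcher loads that data and
// matches a test cloud against it.
template <typename PointT, typename DescriptorT>
class GenericTrainer
{
  public:
    virtual ~GenericTrainer () {}
    // Compute descriptors on one training cloud and add them to the model.
    virtual void addTrainingCloud (const typename pcl::PointCloud<PointT>::ConstPtr &cloud,
                                   int object_id) = 0;
    // Make everything the matcher will need persistent.
    virtual bool save (const std::string &path) const = 0;
};

template <typename PointT, typename DescriptorT>
class GenericMatcher
{
  public:
    virtual ~GenericMatcher () {}
    // Load the data generated during training.
    virtual bool load (const std::string &path) = 0;
    // Return the id of the best-matching training object (or -1 if none).
    virtual int match (const typename pcl::PointCloud<PointT>::ConstPtr &cloud) = 0;
};

Descriptor-specific subclasses would then override these hooks, keeping the benchmarking code independent of the descriptor being evaluated.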

Click here for a more detailed roadmap

Recent status updates

Debugged Harris vertical detector
Tuesday, July 12, 2011
[Image: cat-correct-dir.PNG]

Fig. 1. Psychedelic cat.

It seems I have finally debugged the detector and the dominant orientation estimation. The color shows the detector response at some scale (red is low, blue is high); the spheres are keypoints at local maxima.

Now I want to check its robustness to transformations such as rotations about the horizontal axes, holes, subsampling, etc. Unfortunately, the SHREC page does not contain ground-truth correspondences, so I emailed the organisers about that.

Sixth week
Thursday, July 07, 2011

This week I returned to work on the project. I wanted to make sure the implementation of the Harris vertical detector works correctly, and tested it on artificial data. The results did not look right, so I debugged the code and found one source of errors.

It seems that the interface changed in the new version of FLANN, and the indices returned by radius search were no longer sorted, so a lot of points were ignored during neighborhood analysis.
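
A minimal sketch of the kind of workaround this calls for, assuming PCL's KdTreeFLANN API (note that sorting only the indices breaks their pairing with the distances, which plain neighborhood analysis does not use):

#include <algorithm>
#include <vector>
#include <pcl/point_types.h>
#include <pcl/kdtree/kdtree_flann.h>

// Sort the neighbor indices ourselves instead of relying on FLANN
// to return them in any particular order.
void sortedRadiusSearch (const pcl::KdTreeFLANN<pcl::PointXYZ> &tree,
                         const pcl::PointXYZ &query, double radius,
                         std::vector<int> &indices)
{
  std::vector<float> sqr_dists;
  if (tree.radiusSearch (query, radius, indices, sqr_dists) > 0)
    std::sort (indices.begin (), indices.end ());
}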

I fixed that and the keypoints look great on the sphere:

[Image: sphere-space.PNG]

Here the colour codes the detector response. The dominant orientations, however, are still estimated incorrectly. I am going to fix that next week and then compare the detector with some existing ones.

Fifth week
Saturday, June 25, 2011

As I described earlier, I had faced the problem of normal orientation: when the orientations are random, the descriptor and detector can be computed incorrectly. So I tried to make them invariant to the individual normal orientations.

Since my implementation of the Harris detector uses only squares of the normal projections, it does not require consistent orientations. So I only needed to change the implementation of the dominant direction estimation. It attempts to estimate the direction in which most normals point (using a weighted window). I decided to vote for directions modulo pi instead of 2pi, which makes the method insensitive to normal orientations.
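
Concretely, each normal's horizontal direction can be folded into [0, pi) before voting, so that a flipped normal (n and -n) votes for the same direction. A minimal sketch (the function name is mine):

#include <cmath>

// Fold the horizontal direction of a normal (nx, ny) into [0, pi),
// making the vote insensitive to the normal's sign.
float horizontalDirection (float nx, float ny)
{
  float theta = std::atan2 (ny, nx);     // in (-pi, pi]
  if (theta < 0.0f)
    theta += static_cast<float> (M_PI);  // identify theta with theta + pi
  return theta;                          // now in [0, pi], pi identified with 0
}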

I am leaving for the UK for a summer school in a few hours, so I won't be able to work during the next week. On returning, I plan to test the repeatability of this detector on the SHREC dataset and check whether the dominant direction is estimated robustly. Then I plan to use spin images to estimate the quality of the overall voting framework.

Lolcatz, or Keypoint detection
Friday, June 17, 2011

Everybody is posting fancy pictures, so I will, too. I implemented Harris height-map keypoints; here is a visualization.

[Image: cat-keypoints.PNG]

Note that I intentionally set a high non-maximum suppression interval to make the picture visually plausible.

The directions for voting are defined as described in my previous post. The problem is that the normal map is discontinuous. For example, if all normals are directed towards the model centroid, then there are discontinuities at the points whose tangent plane passes through the centroid. In such cases the dominant orientation cannot be estimated properly.

[Image: cat-normals.PNG]

I tried to use MLS to get a consistent map of normals, but fixed-radius MLS failed on that model. Another solution is to use greedy triangulation, but that is unlikely to work with real-world scans. So I am going to ignore the orientation of the normals and vote for both possible locations.

Third week
Monday, June 13, 2011

As I noted earlier, neither of the keypoint detectors implemented in PCL is suitable for monochrome unorganized clouds, so I needed to come up with a new one.

A vertical-world prior (though it does not always hold) might help in recognition. In many applications we know that the vertical axis of an object remains constant: for example, a car is usually oriented horizontally, while a pole is usually strictly vertical. In contrast, toy car models might have any orientation. So, if the application allows it, the keypoint detector should be invariant only to horizontal rotations (rotations about the vertical axis). Scale invariance (across octaves) often does not matter either, since, e.g., car parts are usually of similar size.

Therefore, it is natural to consider the cloud as a height map. The derivatives of this function along the horizontal axes are essentially the projections of the point normals onto those axes, so the Harris operator can be computed across the scale space. The maxima of the operator give (hopefully) reliable keypoints. Intuitively, keypoints are the points where the projection of the normal onto the horizontal plane changes significantly.
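
A minimal sketch of this Harris response over a neighborhood, with the neighbors' normals, the window weights and the Harris constant k as assumed inputs (my own illustration, not the code in PCL):

#include <cstddef>
#include <vector>
#include <pcl/point_types.h>

// Harris response on a height map: treat the horizontal components
// (nx, ny) of the neighbors' normals as the image gradient and
// accumulate the usual 2x2 structure tensor [a b; b c].
float harrisResponse (const std::vector<pcl::Normal> &neighborhood,
                      const std::vector<float> &weights,  // e.g. Gaussian window
                      float k = 0.04f)
{
  float a = 0.0f, b = 0.0f, c = 0.0f;
  for (std::size_t i = 0; i < neighborhood.size (); ++i)
  {
    const float nx = neighborhood[i].normal_x;
    const float ny = neighborhood[i].normal_y;
    a += weights[i] * nx * nx;
    b += weights[i] * nx * ny;
    c += weights[i] * ny * ny;
  }
  const float det = a * c - b * b;
  const float trace = a + c;
  return det - k * trace * trace;  // classic Harris corner measure
}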

It is still unclear how to estimate the dominant orientation in this setting (the horizontal keypoint orientation may be required by the voting framework). As with images, we can take the direction of the maximum gradient, which here means the most horizontal normal (summed within a Gaussian-weighted window). For horizontal planes there is no dominant orientation, but the Harris operator value is low there, so this is consistent with the keypoint detector defined above. I implemented these ideas during the previous week and am now ready to verify them experimentally.
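
For reference, the "most horizontal normal" idea above written out as code (again a sketch under the same assumptions, not my final implementation):

#include <cmath>
#include <cstddef>
#include <vector>
#include <pcl/point_types.h>

// Dominant orientation as the direction of the weighted sum of the
// horizontal normal components within the window.
float dominantOrientation (const std::vector<pcl::Normal> &neighborhood,
                           const std::vector<float> &weights)
{
  float sx = 0.0f, sy = 0.0f;
  for (std::size_t i = 0; i < neighborhood.size (); ++i)
  {
    sx += weights[i] * neighborhood[i].normal_x;
    sy += weights[i] * neighborhood[i].normal_y;
  }
  return std::atan2 (sy, sx);  // ill-defined when both sums vanish (flat areas)
}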

Unrelated to that, we also encountered a problem with the FLANN radius search, which we need to find peaks in the ISM voting space. We declare a point structure like this:

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/kdtree/kdtree_flann.h>

struct VoteISM
{
  PCL_ADD_POINT4D;                // adds float x, y, z, padded to 4 floats for SSE
  union
  {
    struct
    {
      float votePower;            // weight of this vote in the ISM voting space
    };
    float data_c[4];              // padding that keeps the struct 16-byte aligned
  };

  EIGEN_MAKE_ALIGNED_OPERATOR_NEW
} EIGEN_ALIGN16;

POINT_CLOUD_REGISTER_POINT_STRUCT (VoteISM,           // here we assume XYZ + "votePower" (as fields)
  (float, x, x)
  (float, y, y)
  (float, z, z)
  (float, votePower, votePower)
)
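
For context, a minimal hypothetical usage of this point type when searching the vote space (the cloud contents and the radius are placeholders; uses the includes above):

// Search the vote cloud for the neighbors of one vote.
pcl::PointCloud<VoteISM>::Ptr votes (new pcl::PointCloud<VoteISM>);
// ... fill `votes` with ISM votes ...

pcl::KdTreeFLANN<VoteISM> tree;
tree.setInputCloud (votes);

std::vector<int> indices;
std::vector<float> sqr_dists;
tree.radiusSearch (votes->points[0], 0.1, indices, sqr_dists);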

pcl::KdTreeFLANN<VoteISM>::radiusSearch() works correctly only if the sorted flag is false. If the results are requested sorted, repeated indices appear in the output. Obviously, the sorting does not work correctly:

// (FLANN internals, quoted for reference)
int n = 0;
int* indices_ptr = NULL;
DistanceType* dists_ptr = NULL;
if (indices.cols > 0) {
    n = indices.cols;             // capacity of the caller-provided buffers
    indices_ptr = indices[0];
    dists_ptr = dists[0];
}
RadiusResultSet<DistanceType> result_set(radius, indices_ptr, dists_ptr, n);
nnIndex->findNeighbors(result_set, query[0], searchParams);
size_t cnt = result_set.size();   // number of neighbors actually found
if (n > 0 && searchParams.sorted) {
    // sorts (distance, index) pairs in place via a pair iterator;
    // this sort is where the repeated indices seem to come from
    std::sort(make_pair_iterator(dists_ptr, indices_ptr),
              make_pair_iterator(dists_ptr + cnt, indices_ptr + cnt),
              pair_iterator_compare<DistanceType*, int*>());
}

I am not sure I defined the point type correctly, so maybe those errors are due to alignment violations or something similar. I am going to analyze the problem further and then submit a bug report if needed.