Tomorrow I am going on vacation for two weeks; I just hope I will be able to find the time to finish the API for MeanShift and get it up and running before the deadline. In my spare time I am checking out some other existing implementations of MeanShift, namely the Edison library and the one in OpenCV, as well as some existing Matlab implementations. I have also written a first script doing segmentation based on color, but the results were not exactly what I was hoping for.

I have started working on implementing mean shift. I am going to try to keep the API as modular as possible, keeping in mind that mean shift can be used for a variety of tasks, of which segmentation is only one. First I am writing a script-like program to check that the algorithm works, and then I will implement it in the API. (As a first try, segmentation will be based on the color space and on normals.)
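To make sure I understand the core of the procedure before touching the API, here is the basic iteration in one dimension with a flat kernel (a toy sketch to check my understanding, not the planned PCL interface; all names here are made up):

```cpp
#include <cmath>
#include <vector>

// One mean-shift step with a flat (uniform) kernel: move x to the mean of
// all samples within the bandwidth h. Iterating this climbs to a local
// mode of the sample density.
double meanShiftStep(const std::vector<double>& samples, double x, double h)
{
    double sum = 0.0;
    int count = 0;
    for (double s : samples)
        if (std::fabs(s - x) <= h) { sum += s; ++count; }
    return count > 0 ? sum / count : x;
}

// Iterate until the shift becomes negligible (or a step limit is hit).
double findMode(const std::vector<double>& samples, double x, double h)
{
    for (int i = 0; i < 100; ++i)
    {
        double next = meanShiftStep(samples, x, h);
        if (std::fabs(next - x) < 1e-6)
            return next;
        x = next;
    }
    return x;
}
```

For segmentation the same iteration runs in a joint color/spatial (or normal) space and points converging to the same mode form one segment; the 1D version just makes the mechanics obvious.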

I managed to fix the region growing, so now the method does not fail if the seed point is on a plane parallel to the table top. Below you can see some screenshots of the results.

This past week I have been fine-tuning and rewriting the region growing part of the segmentation method in order to fix the erroneous segmentation shown in my last post.

I’ve finally managed to add the normals to the region growing algorithm, and the results are promising. I conditioned the adding of points to the region on the angle between the seed point’s normal and the current point’s normal. The results of doing this are shown in the screenshot below.

Adding this condition both helped and hurt. Although growing now stops when we reach the table top, parts of the object that are parallel to it don’t get added either. This was expected, of course. Still, it’s better than the first try.
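In code, the acceptance test is just a thresholded angle between unit normals (a minimal sketch; `Vec3`, `normalsAgree`, and the threshold are illustrative names, not part of the PCL API):

```cpp
#include <algorithm>
#include <cmath>

// Minimal sketch of the normal-based growing condition: a candidate point
// is accepted only if the angle between its unit normal and the seed
// point's unit normal stays below a threshold.
struct Vec3 { double x, y, z; };

static double dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

bool normalsAgree(const Vec3& seedNormal, const Vec3& candNormal, double maxAngleRad)
{
    double c = dot(seedNormal, candNormal);
    // Clamp to protect acos from rounding just outside [-1, 1].
    c = std::max(-1.0, std::min(1.0, c));
    return std::acos(c) <= maxAngleRad;
}
```

This is exactly why parts of the object parallel to the table fail: their normals agree with the table’s normal, not with the seed’s, so the condition alone cannot tell them apart.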

Thanks to a friend of mine I found out about this theory:

The interesting part for me is that, according to this theory, every object can be broken down into primitive pieces; there are only 32 kinds of these primitive shapes, and all of them are convex, meaning that complex objects can be separated into these parts at their concavities.

The following is an extract from the book *From Fragments to Objects: Segmentation and Grouping in Vision*, T.F. Shipley and P.J. Kellman (Editors), 2001, Elsevier Science B.V. All rights reserved.

*“The simplest shapes are convex shapes—whose outlines have positive curvature throughout (see, e.g., Rosin, 2000, for the role of convexity in parsing). If the outline of a shape has regions of negative curvature, especially if these regions contain salient negative minima of curvature, this usually indicates that the shape can be further parsed to give simpler subshapes. [...] Three main geometrical factors determine the perceptual salience of a part (Hoffman & Singh, 1997): (1) its protrusion, (2) its relative area, and (3) the strength of its boundaries. Salience of a part increases as its protrusion, relative area, or boundary strength increases. In this section we briefly consider protrusion and relative area, and then discuss in more detail the strength of part boundaries. We restrict attention to 2D shapes; the theory for 3D shapes is more complex and discussed elsewhere (Hoffman & Singh, 1997).”*

*“Hypothesis of normalized curvature: The salience of a part boundary increases as the magnitude of normalized curvature at the boundary increases.”*

*“Hypothesis of Turning Angle: The salience of a negative-minimum boundary increases as the magnitude of the turning angle around the boundary increases.”*

Based on these theories I introduced another constraint to my region growing: when adding a point, the angle between the line connecting that point to the fixation point and the point’s normal is checked. If this angle indicates a concave configuration, the point currently being verified belongs to the object. Doing so produced the following results (blue points are the ones segmented out, and the green patch on the objects is the neighborhood of the fixation point):

As can be observed, introducing that extra condition improved the results a lot. The last scene is not a good result; my next step will be to investigate the cause of that erroneous segmentation. As a final observation, it has been clear to me from the beginning that the whole of this method depends on having a good fixation point. If that is not the case, the algorithm currently does not work. One of my next steps will be to investigate the method recently proposed by the authors of the original paper for automatic fixation point estimation.
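My reading of this constraint can be sketched as a simple sign test: with normals oriented toward the viewpoint, every point of a convex surface patch has the fixation point on or below its tangent plane, while a table point has it above. All names and the sign convention below are my assumptions for illustration (the sign flips if your normals are oriented the other way):

```cpp
// Hypothetical sketch of the convex-vs-concave test. With normals pointing
// toward the viewpoint, the fixation point f lies on or below the tangent
// plane of every point p on the same convex surface: dot(f - p, n_p) <= 0.
// For a supporting-table point, f sits above its tangent plane, so the
// point is rejected.
struct Pt { double x, y, z; };

static double dot3(const Pt& a, const Pt& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

bool convexlyConnected(const Pt& p, const Pt& normal, const Pt& fixation)
{
    Pt toFix{fixation.x - p.x, fixation.y - p.y, fixation.z - p.z};
    return dot3(toFix, normal) <= 0.0;
}
```

For a box with the fixation point at the top center, a side point with a sideways normal passes the test and a table point with an upward normal is rejected, which matches the behavior described above, including the top-face points that the normal-angle condition alone was dropping.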

I would like to take this opportunity to thank Zoltan-Cs. Marton for coming up with the idea of using convex vs. concave angles; it helped me a lot :). THX m8 :)

Since I was a little swamped with work this week, I did not yet have time to try out how adding the normals changes my region growing, so in my spare time I did some reading on mean shift segmentation and its existing implementations. Since the original papers on mean shift are not the easiest to understand, and to clarify for myself the steps involved in segmenting, I started searching for other articles on the topic. I found this article to be the most helpful:

For the approximation of the boundaries I used the class implemented by our fellow GSOC student Changhyun Choi.

Because the active segmentation method is most likely to be used in robotic applications, the scenes involved are, as described in the paper, table-top scenes containing several objects. I chose the scene below for testing because there are multiple objects on the table, some occluding others; but since this is the first scene I will be running the algorithm on, I did not want it to be too complex (e.g., an extremely cluttered scene).

After implementing and testing the first version of the region growing, I got the results shown in the screenshot below. Just to clarify: everything that is blue gets segmented out, and the green points on the large box are the points near the “fixation point”. This version of region growing is based only on the borders, and since there are no borders where the box touches the table, growing does not stop there. To get around this problem I will also take the normals of the points into consideration when growing.
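The border-based growing itself is a plain flood fill that refuses to cross boundary cells, which is why a gap in the boundary map lets it leak onto the table. A toy 2D version (illustrative names, nested vectors instead of a point cloud):

```cpp
#include <queue>
#include <utility>
#include <vector>

// Grow a 4-connected region from the seed cell, never crossing a cell
// marked as boundary. If the boundary map has a gap (e.g. no border at
// the object/table contact), the region leaks through it.
std::vector<std::vector<bool>> growRegion(
    const std::vector<std::vector<bool>>& boundary, int seedRow, int seedCol)
{
    int rows = (int)boundary.size();
    int cols = (int)boundary[0].size();
    std::vector<std::vector<bool>> region(rows, std::vector<bool>(cols, false));
    std::queue<std::pair<int, int>> q;
    region[seedRow][seedCol] = true;
    q.push({seedRow, seedCol});
    const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
    while (!q.empty())
    {
        int r = q.front().first, c = q.front().second;
        q.pop();
        for (int k = 0; k < 4; ++k)
        {
            int nr = r + dr[k], nc = c + dc[k];
            if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
            if (region[nr][nc] || boundary[nr][nc]) continue;  // stop at borders
            region[nr][nc] = true;
            q.push({nr, nc});
        }
    }
    return region;
}
```

In the real method the neighborhood comes from the cloud structure or a spatial search, and the per-point conditions (normals, later the convexity test) are checked in the same place as the `boundary[nr][nc]` test here.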

As promised in a previous post, I took the time and created a basic flow chart of the steps involved in segmenting around a fixation point. The upper part of the chart (getting a boundary map, setting the input cloud, etc.) contains steps that need to be implemented by whoever wants to use the *ActiveSegmentation* class (an example of this will shortly be available in *pcl/examples*); the steps in the lower part are implemented in the class. It is left to the user’s discretion to choose an appropriate boundary map. I chose to do it this way to allow greater flexibility, and because many others are working on edge/boundary detection at the moment. My tests will be based on the already implemented boundary detection (which works for both ordered and unordered point clouds), and on trying out 2D edge detections mapped to the 3D cloud.

**Flow Chart Active Segmentation**

In my next post I will share some preliminary results as well.

After thinking and discussing about it, I have decided to go ahead and implement a region growing method based on a boundary map and the fixation point. Calculating and setting the boundary map will be left up to the user; the method then segments the region around the fixation point that is enclosed by a boundary. I will also shortly add an example of how this is done in pcl/examples.

I’m pushing harder this week to get a working implementation of the active segmentation approach. I started inserting my already existing code into the PCL API, created the base classes, and added the basic functionality to them. I’m planning on creating a flowchart to illustrate how things work, but life keeps getting in the way :). Anyway, I am having a little trouble deciding which way to go once I have the fixation point and the boundary map. I could do it the way it was done in the original publication and implement a mapping between the 2D image and the point cloud, or I could try implementing it using region growing with the fixation point as the seed. I will have to decide on this, or maybe try out both and see where that leads me.

As I mentioned in my first entry, I have decided to implement/adapt the Active Segmentation with Fixation approach first, developed by A. Mishra. As a first step I thoroughly studied the works of the aforementioned author and found that there are several improvements over the first approach. Details on this can be found on the author’s home page.

For those who are not familiar with how active segmentation based on fixation works and don’t want, or don’t have the time, to read the initial paper, here is a short description. The reasoning behind segmenting based on a fixation point originates from the way humans perceive objects in their surroundings. Rather than segmenting the whole image into several regions and then reasoning about what those regions might be, this method segments only the region which contains our fixation point. In short, segmenting an object from the scene means finding a “fixation point” and then finding the contour of the object that encloses this point of interest. The main steps of the segmentation algorithm are (these steps are for RGB images using monocular cues):

- Obtain the boundary map.
- Improve the boundary map using monocular cues.
- Set the fixation points (the points chosen must be part of the object we want to segment out).
- Convert to polar coordinates around the fixation point.
- Search for the shortest path / apply a graph-cut algorithm to find the contour of the object.
- Convert back to Cartesian space; points found on the “left” of the shortest path in polar space belong to our object of interest.
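The polar conversion is what turns the closed contour around the fixation point into a curve that crosses every angle exactly once, so the contour search becomes a shortest-path problem from one edge of the polar image to the other. It amounts to this (a tiny sketch; the struct and function names are mine):

```cpp
#include <cmath>

// Represent a pixel by its distance r and angle theta relative to the
// fixation point (fx, fy). In (theta, r) space the object contour that
// encloses the fixation point becomes a left-to-right path.
struct Polar { double r, theta; };

Polar toPolar(double x, double y, double fx, double fy)
{
    double dx = x - fx, dy = y - fy;
    return { std::hypot(dx, dy), std::atan2(dy, dx) };
}
```

Converting back is the inverse map, and the “left of the path” test in polar space is simply r smaller than the path’s r at that theta.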

Later works improve on this method, among other things, by adding an automated fixation strategy for finding interest points and by using depth information. As stated in the work of A. Mishra, we can find object boundaries by checking for depth discontinuities, but these boundary points often do not correspond to the true boundaries of the objects; in order to find the true boundaries, color and texture cues are recommended. While the purpose of my project is the implementation of the segmentation algorithm, if I find the time for it I will pursue the implementation of a fixation strategy as well.
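As a concrete illustration of why depth cues alone are not enough: a naive depth-discontinuity boundary map only fires where depth changes abruptly, so a smooth object/table contact produces no boundary at all (a toy sketch over a depth image stored as nested vectors; everything here is illustrative, not PCL code):

```cpp
#include <cmath>
#include <vector>

// Mark a pixel as a boundary candidate if the depth jump to any of its
// 4-neighbors exceeds a threshold. Smooth transitions (object resting on
// its support) produce no jump and therefore no boundary, which is where
// color and texture cues have to take over.
std::vector<std::vector<bool>> depthBoundaries(
    const std::vector<std::vector<double>>& depth, double jump)
{
    int rows = (int)depth.size();
    int cols = (int)depth[0].size();
    std::vector<std::vector<bool>> b(rows, std::vector<bool>(cols, false));
    const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            for (int k = 0; k < 4; ++k)
            {
                int nr = r + dr[k], nc = c + dc[k];
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                if (std::fabs(depth[r][c] - depth[nr][nc]) > jump)
                    b[r][c] = true;
            }
    return b;
}
```

On a depth row like 1.0, 1.0, 2.0, 2.0 only the two pixels at the step are flagged; a ramp with small per-pixel differences would be flagged nowhere.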

Implementation-wise, I have created the basic API of the segmentation class based on the other segmentation algorithms already implemented in PCL. I played around with existing implementations of boundary detection, and I realized that I will probably be able to use the code of my fellow GSOC developer Changhyun Choi. I am looking forward to seeing how his different implementations of edge detection influence the performance of the segmentation.

I will be back with more information soon (mostly on implementation issues).

These past weeks I have been reading up on the two segmentation methods described in the papers (Mean-shift and Active Segmentation in Robotics) and in the meantime checking out the segmentation API and the available tutorials in PCL. I worked on setting up my environment and making everything work. My next steps will involve some research on adapting Active Segmentation to 3D data, as well as creating some draft classes and outlining the main steps of the method.