How to find all connected components in a binary image in Matlab?

I have been trying to find all connected components using 8 neighbors in a binary image, without using the function "bwlabel".

For example, my input matrix is:

a =

     1     1     0     0     0     0     0
     1     1     0     0     1     1     0
     1     1     0     0     0     1     0
     1     1     0     0     0     0     0
     0     0     0     0     0     1     0
     0     0     0     0     0     0     0

I would to have something like this:

a =

     1     1     0     0     0     0     0
     1     1     0     0     2     2     0
     1     1     0     0     0     2     0
     1     1     0     0     0     0     0
     0     0     0     0     0     3     0
     0     0     0     0     0     0     0

There are 3 connected objects in this image.


Solution 1:

This is a common problem in image processing. There are many variations, such as flood filling a region in an image, or finding what pixels belong to the same region. One common approach is to use depth first search. The idea is that you traverse your image from left to right and top to bottom and for any pixels encountered that are equal to 1, you add them to a stack. For each pixel in your stack, you pop off the stack, then look at the neighbouring pixels that are surrounding this pixel. Any pixels that are 1 you add to the stack. You need to keep an additional variable where any pixels you have already visited, you don't add these to the stack. When the stack is empty, we have found those pixels that are an entire region, so you mark these with a unique ID. You then repeat this procedure until you run out of regions in your image.

As such, given that your matrix is stored in A, this is the basic algorithm:

  1. Initialize an array that's the same size as A that is logical. This will record which pixels we have examined or visited. Also initialize an output array B to all zeroes that gives you all of the connected components that you are seeking. Any locations that are zero in the end don't belong to any connected components. Also initialize an ID counter that keeps track of what connected component label each of these will have.

  2. For each location that's in our matrix:

    a. If the location is 0, mark this location as visited and continue.

    b. If we have already visited this location, then continue.

    c. If we have not visited this location... go to Step #3.

  3. Add this unvisited location to a stack.

    a. While this stack is not empty...

    b. Pop this location off of the stack

    c. If we have visited this location, then continue.

    d. Else, mark this location as visited and mark this location with the connected components ID.

    e. Given this location, look at the 8 neighbouring pixels.

    f. Remove those pixels in this list that have been visited, not equal to 1 or out of bounds of the matrix

    g. Whatever locations are remaining, add these to the stack.

  4. Once the stack is empty, increment the counter, then go back to Step #2.

  5. Keep going until we have visited all of the locations in our array.

Without further ado, here's the code.


%// Step #1
visited = false(size(A));
[rows,cols] = size(A);
B = zeros(rows,cols);
ID_counter = 1;

%// Step 2
%// For each location in your matrix...
for row = 1 : rows
    for col = 1 : cols
        %// Step 2a
        %// If this location is not 1, mark as visited and continue
        if A(row,col) == 0
            visited(row,col) = true;

        %// Step 2b
        %// If we have visited, then continue
        elseif visited(row,col)
            continue;

        %// Step 2c
        %// Else...
        else
            %// Step 3
            %// Initialize your stack with this location
            stack = [row col];

            %// Step 3a
            %// While your stack isn't empty...
            while ~isempty(stack)
                %// Step 3b
                %// Pop off the stack
                loc = stack(1,:);
                stack(1,:) = [];

                %// Step 3c
                %// If we have visited this location, continue
                if visited(loc(1),loc(2))
                    continue;
                end

                %// Step 3d
                %// Mark location as true and mark this location to be
                %// its unique ID
                visited(loc(1),loc(2)) = true;
                B(loc(1),loc(2)) = ID_counter;

                %// Step 3e
                %// Look at the 8 neighbouring locations
                [locs_y, locs_x] = meshgrid(loc(2)-1:loc(2)+1, loc(1)-1:loc(1)+1);
                locs_y = locs_y(:);
                locs_x = locs_x(:);

                 %%%% USE BELOW IF YOU WANT 4-CONNECTEDNESS
                 % See bottom of answer for explanation
                 %// Look at the 4 neighbouring locations
                 % locs_y = [loc(2)-1; loc(2)+1; loc(2); loc(2)];
                 % locs_x = [loc(1); loc(1); loc(1)-1; loc(1)+1];

                %// Get rid of those locations out of bounds
                out_of_bounds = locs_x < 1 | locs_x > rows | locs_y < 1 | locs_y > cols;

                locs_y(out_of_bounds) = [];
                locs_x(out_of_bounds) = [];

                %// Step 3f
                %// Get rid of those locations already visited
                is_visited = visited(sub2ind([rows cols], locs_x, locs_y));

                locs_y(is_visited) = [];
                locs_x(is_visited) = [];

                %// Get rid of those locations that are zero.
                is_1 = A(sub2ind([rows cols], locs_x, locs_y));
                locs_y(~is_1) = [];
                locs_x(~is_1) = [];

                %// Step 3g
                %// Add remaining locations to the stack
                stack = [stack; [locs_x locs_y]];
            end

            %// Step 4
            %// Increment counter once complete region has been examined
            ID_counter = ID_counter + 1;
        end
    end %// Step 5
 end   

With your example matrix, this is what I get for B:

B =

     1     1     0     0     0     0     0
     1     1     0     0     2     2     0
     1     1     0     0     0     2     0
     1     1     0     0     0     0     0
     0     0     0     0     0     3     0
     0     0     0     0     0     0     0

To search in a 4-connected neighbourhood

To modify the code to search in a 4-connected region, that is only North, East, West and South, the section where you see %// Look at the 8 neighbouring locations, that is:

 %// Look at the 8 neighbouring locations
 [locs_y, locs_x] = meshgrid(loc(2)-1:loc(2)+1, loc(1)-1:loc(1)+1);
 locs_y = locs_y(:);
 locs_x = locs_x(:);

To search in a 4-connected fashion, you simply have to modify this code to only give those cardinal directions:

 %// Look at the 4 neighbouring locations
locs_y = [loc(2)-1; loc(2)+1; loc(2); loc(2)];
locs_x = [loc(1); loc(1); loc(1)-1; loc(1)+1];

The rest of the code remains untouched.

To match MATLAB's bwlabel function

If you want to match the output of MATLAB's bwlabel function, bwlabel searches for connected components in column major or FORTRAN order. The above code searches in row major or C order. Therefore you simply have to search along columns first rather than rows as what the above code is doing and you do this by swapping the order of the two for loops.

Specifically, instead of doing:

for row = 1 : rows
    for col = 1 : cols
        ....
        ....

You would do:

for col = 1 : cols
    for row = 1 : rows
        ....
        ....

This should now replicate the output of bwlabel.