Splitting image by whitespace

I have an image that I am trying to split into its separate components. I have successfully created a mask of the objects in the image using k-means clustering. (I have included the results and mask below.)

I am now trying to crop each individual part of the original image and save it as a new image. Is this possible?

import numpy as np
import cv2

path = 'software (1).jpg'
img = cv2.imread(path)

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
twoDimage = img.reshape((-1,3))
twoDimage = np.float32(twoDimage)

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
attempts=10

ret,label,center = cv2.kmeans(twoDimage,K,None,criteria,attempts,cv2.KMEANS_PP_CENTERS)
center = np.uint8(center)
res = center[label.flatten()]
result_image = res.reshape((img.shape))


cv2.imwrite('result.jpg',result_image)

Original image

Result of k-means


My solution involves creating a binary object mask where all the objects are colored in white and the background in black. I then extract each object based on area, from largest to smallest. I use this "isolated object" mask to segment each object in the original image. I then write the result to disk. These are the steps:

  1. Resize the image (your original input is gigantic)
  2. Convert to grayscale and threshold to get the binary object mask
  3. Extract each object based on area from largest to smallest
  4. Create a binary mask of the isolated object
  5. Apply a little bit of morphology to enhance the mask
  6. Mask the original BGR image with the binary mask
  7. Apply flood-fill to color the background with white
  8. Save image to disk
  9. Repeat the process for all the objects in the image

Let's see the code. Throughout the script I use two helper functions: writeImage and findBiggestBlob. The first function is pretty self-explanatory. The second function creates a binary mask of the biggest blob in a binary input image. Both functions are presented here:

# Writes a PNG image:
def writeImage(imagePath, inputImage):
    imagePath = imagePath + ".png"
    cv2.imwrite(imagePath, inputImage, [cv2.IMWRITE_PNG_COMPRESSION, 0])
    print("Wrote Image: " + imagePath)


def findBiggestBlob(inputImage):
    # Store a copy of the input image:
    biggestBlob = inputImage.copy()
    # Set initial values for the
    # largest contour:
    largestArea = 0
    largestContourIndex = 0

    # Find the contours on the binary image:
    contours, hierarchy = cv2.findContours(inputImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

    # Get the largest contour in the contours list:
    for i, cc in enumerate(contours):
        # Find the area of the contour:
        area = cv2.contourArea(cc)
        # Store the index of the largest contour:
        if area > largestArea:
            largestArea = area
            largestContourIndex = i

    # Once we get the biggest blob, paint it black:
    tempMat = inputImage.copy()
    cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
    # Erase smaller blobs:
    biggestBlob = biggestBlob - tempMat

    return biggestBlob

Now, let's check out the main script. Let's read the image and get the initial binary mask:

# Imports
import cv2
import numpy as np

# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath + "L85Bu.jpg")

# Get image dimensions
originalImageHeight, originalImageWidth = inputImage.shape[:2]

# Resize at a fixed scale:
resizePercent = 30
resizedWidth = int(originalImageWidth * resizePercent / 100)
resizedHeight = int(originalImageHeight * resizePercent / 100)

# resize image
inputImage = cv2.resize(inputImage, (resizedWidth, resizedHeight), interpolation=cv2.INTER_LINEAR)
writeImage(imagePath+"objectInput", inputImage)

# Deep BGR copy:
colorCopy = inputImage.copy()

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Fixed inverted threshold at 250 (not Otsu):
_, binaryImage = cv2.threshold(grayscaleImage, 250, 255, cv2.THRESH_BINARY_INV)

This is the input resized to 30% of its original size according to resizePercent:

And this is the binary mask created with a fixed threshold of 250:

Now, I'm gonna run this mask through a while loop. With each iteration I'll extract the biggest blob until there's no blobs left. Each step will create a new binary mask where the only thing present is one object at a time. This will be the key to isolating the objects in the original (resized) BGR image:

# Image counter to write pngs to disk:
imageCounter = 0

# Segmentation flag to stop the processing loop:
segmentObjects = True

while (segmentObjects):

    # Get biggest object on the mask:
    currentBiggest = findBiggestBlob(binaryImage)

    # Use a little bit of morphology to "widen" the mask:
    kernelSize = 3
    opIterations = 2
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Dilate:
    binaryMask = cv2.morphologyEx(currentBiggest, cv2.MORPH_DILATE, morphKernel, None, None, opIterations,cv2.BORDER_REFLECT101)

    # Mask the original BGR (resized) image:
    blobMask = cv2.bitwise_and(colorCopy, colorCopy, mask=binaryMask)

    # Flood-fill at the top left corner:
    fillPosition = (0, 0)
    # Use white color:
    fillColor = (255, 255, 255)
    colorTolerance = (0,0,0)
    cv2.floodFill(blobMask, None, fillPosition, fillColor, colorTolerance, colorTolerance)

    # Write file to disk:
    writeImage(imagePath+"object-"+str(imageCounter), blobMask)
    imageCounter+=1

    # Subtract the current biggest blob from
    # the binary mask:
    binaryImage = binaryImage - currentBiggest

    # Check for stop condition - all pixels
    # in the binary mask should be black:
    whitePixels = cv2.countNonZero(binaryImage)

    # Compare against a threshold - 1% of
    # the resized image's pixel count:
    whitePixelThreshold = 0.01 * (resizedWidth * resizedHeight)
    if (whitePixels < whitePixelThreshold):
        segmentObjects = False

There are some things worth noting here. This is the first isolated mask created for the first object:

Nice. Masking the BGR image with this will do. However, I can improve the quality of the mask if I apply a dilate morphological operation. This will "widen" the blob, covering the original outline by a few pixels. (The operation actually takes the maximum intensity pixel within a neighborhood of pixels.) Next, the masking will produce a BGR image where there's only the object blob and a black background. I don't want that black background; I want it white. I flood-fill at the top left corner to get the first BGR mask:

I save each mask as a new file on disk. Very cool. Now, the condition to break from the loop is pretty simple: stop when all the blobs have been processed. To achieve this I subtract the current biggest blob from the binary mask and count the number of white pixels that remain. When the count is below a certain threshold (in this case 1% of the resized image's pixel count, matching the 0.01 factor in the code), stop the loop.

Check out this gif of every object isolated. Each frame is saved to disk as a png file: