On this page you can learn about robot vision, with some simple examples of how to implement it in Python.

Digital image

In OpenCV:

import cv2

# Load a single color image:
image = cv2.imread('img/ball01.jpg')

# numpy array:
image.shape
# (667, 500, 3)
# height(px), width(px), layers

cv2.imshow("Original", image)
cv2.waitKey(0)
  • Digital images are typically stored in RGB format
  • In OpenCV the default order is not RGB but BGR !!

Plotting the layers separately:

cv2.imshow("B", image[:,:,0])
cv2.imshow("G", image[:,:,1])
cv2.imshow("R", image[:,:,2])
cv2.waitKey(0)

Convert color space

  • In OpenCV we can easily convert the color space of an image:

    hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    lab_image = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)
    
    image = cv2.cvtColor(image, cv2.COLOR_LAB2BGR)
    
  • Layers in CIELAB:
    cv2.imshow("L", lab_image[:,:,0])
    cv2.imshow("A", lab_image[:,:,1])
    cv2.imshow("B", lab_image[:,:,2])
    cv2.waitKey(0)
    
  • Layers in HSV:
    cv2.imshow("H", hsv_image[:,:,0])
    cv2.imshow("S", hsv_image[:,:,1])
    cv2.imshow("V", hsv_image[:,:,2])
    cv2.waitKey(0)
    
  • The color space we are working in matters, since we perform all our algorithms on the layers of the digital image.
  • Some features of the image may be much more characteristic in one color space than in another.
  • We will see this clearly when implementing color blob detection.

Image Processing

  • Image processing is a computational process that transforms one or more input images into an output image
  • Frequently used to make images more appealing
  • Enhancement of imperfect images
  • In robotics and computer vision, image processing is often the basis for feature extraction

  • Some algorithm categories
    • Pixel-wise on single image: monadic operations
    • Pixel-wise on a pair of images: dyadic operations
    • On local groups of pixels: spatial operations
    • Shape changing operations

Pixel Value Distribution

  • Min, max, mean, median, standard deviation of pixel values
    image.mean()
    160.36176011994004
    image[:,:,0].mean()
    152.46574812593704
    image[:,:,1].mean()
    161.83708245877062
    image[:,:,2].mean()
    166.78244977511244
    image[:,:,1].max()
    214
    image[:,:,1].min()
    27
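
The median and standard deviation mentioned above come from NumPy rather than OpenCV; a sketch on a small synthetic array with known values:

```python
import numpy as np

values = np.array([27, 100, 150, 200, 214], dtype=np.uint8)

print(values.mean())                # 138.2
print(np.median(values))            # 150.0
print(values.std())                 # population standard deviation
print(values.min(), values.max())   # 27 214
```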
    

calcHist

from matplotlib import pyplot as plt

histr = cv2.calcHist([image], [0], None, [256], [0, 256])
plt.plot(histr)
plt.show()

Get an impression of the pixel value distribution in an image by plotting it in a histogram showing the frequency of pixel values. For color images: one histogram per layer.

cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])

images: the source image, of type uint8 or float32. It should be given in square brackets, i.e. “[img]”.

channels: also given in square brackets. It is the index of the channel for which we calculate the histogram. For a grayscale image its value is [0]. For a color image, you can pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel respectively.

mask: mask image. To find the histogram of the full image, it is given as “None”. But if you want the histogram of a particular region of the image, you have to create a mask image for that region and give it as the mask.

histSize: this represents our BIN count. It needs to be given in square brackets. For full scale, we pass [256].

ranges: this is our RANGE. Normally, it is [0,256].

Monadic Operations

  • Very common: Thresholding
    _, threshold_image = cv2.threshold(image[:,:,0], 110, 255, cv2.THRESH_BINARY)
    

    All pixels where the intensity in the blue channel is greater than 110 become True (255) and all others become False (0). The output is a binary image.

  • To threshold on all channels simultaneously:
    threshold_image = cv2.inRange(image, np.array([5, 20, 80]), np.array([60, 130, 190]))
    

    All pixels where the intensities of all channels fall between the lower and upper bound become True (255); all others become False (0).

Threshold

cv2.threshold(src, thresh, maxval, type[, dst])

src: This is the source image, which should be a grayscale image

thresh: This is the threshold value which is used to classify the pixel intensities in the grayscale image.

maxval: This is the value assigned if the pixel value is greater than (sometimes less than, depending on the type of thresholding) the threshold value.

type: This is the type of threshold to be applied. Some types are:

  • cv2.THRESH_BINARY
  • cv2.THRESH_BINARY_INV
  • cv2.THRESH_TRUNC
  • cv2.THRESH_TOZERO
  • cv2.THRESH_TOZERO_INV

This function returns two outputs:

  • retval: This is the threshold that was used.
  • dst: This is the thresholded image.

inRange

cv2.inRange(src, lowerb, upperb[, dst])
  • src: This is the source array or image.

  • lowerb: This is the inclusive lower boundary array or a scalar.

  • upperb: This is the inclusive upper boundary array or a scalar.

The function checks whether each element of the src array lies between lowerb and upperb (inclusive). If it does, the function sets the corresponding element in the output array to 255 (white); if it doesn’t, it sets the element to 0 (black).

Dyadic Operations

Common examples: arithmetic operators such as addition, subtraction, element-wise multiplication. Also logic operations such as AND, OR, XOR. Here we apply a mask to an image with the bitwise AND operator:

thresholded_image = cv2.inRange(image, np.array([5, 20, 80]), np.array([60, 130, 190]))
res = cv2.bitwise_and(image, image, mask=thresholded_image)
cv2.imshow("Result", res)

Blob detection

OpenCV comes with a blob detector that is easy to use:

  • We first define the filter parameters

    params = cv2.SimpleBlobDetector_Params()
    
    # Filter by Area.
    params.filterByArea = True
    params.minArea = 150
    params.maxArea = 6000
    
    # Filter by Circularity
    params.filterByCircularity = False
    params.minCircularity = 0.1
    
    # Filter by Convexity
    params.filterByConvexity = False
    params.minConvexity = 0.87
    
    # Filter by Inertia
    params.filterByInertia = False
    params.minInertiaRatio = 0.01
    
  • With the parameters object we create the blob detector

    # Create a detector with the parameters
    detector = cv2.SimpleBlobDetector_create(params)
    
  • Detect blobs and visualize them

    # Detect blobs (inverted with ~ because the detector
    # looks for dark blobs on a light background by default)
    keypoints = detector.detect(~thresholded_image)
    
    im_with_keypoints = cv2.drawKeypoints(image, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    
    cv2.imshow("Blobs", im_with_keypoints)
    cv2.waitKey(0)
    
  • Keypoints: The blob detector returns a list of keypoints. Each keypoint is a structure which contains information about the detected feature. Here’s a brief overview of the properties of a keypoint:

    • pt: The coordinates of the detected feature in the format (x, y).
    • size: The diameter of the meaningful keypoint neighborhood.
    • angle: Computed orientation of the keypoint (-1 if not applicable); it’s the angle that the keypoint vector is pointing in.
    • response: The response by which the strongest keypoints have been selected. Can be used for further sorting or subsampling.
    • octave: The octave (pyramid layer) from which the keypoint has been extracted.
    • class_id: Can be used to cluster keypoints by the object they belong to.