DAT160 Robot Vision
On this page you can learn about robot vision, with some simple examples of how to implement it in Python.
Digital image
In OpenCV:
import cv2

# Load a single color image:
image = cv2.imread('img/ball01.jpg')
# The result is a NumPy array:
image.shape
# (667, 500, 3)
# height (px), width (px), channels
cv2.imshow("Original", image)
cv2.waitKey(0)
- Digital images are typically stored in RGB format.
- In OpenCV the default channel order is not RGB but BGR!
Plotting the layers separately:
cv2.imshow("B", image[:,:,0])
cv2.imshow("G", image[:,:,1])
cv2.imshow("R", image[:,:,2])
cv2.waitKey(0)
Convert color space
- In OpenCV we can easily convert the color space of an image:
hsv_image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lab_image = cv2.cvtColor(image, cv2.COLOR_BGR2LAB)
image = cv2.cvtColor(lab_image, cv2.COLOR_LAB2BGR)
- Layers in CIELAB:
cv2.imshow("L", lab_image[:,:,0])
cv2.imshow("A", lab_image[:,:,1])
cv2.imshow("B", lab_image[:,:,2])
cv2.waitKey(0)
- Layers in HSV:
cv2.imshow("H", hsv_image[:,:,0])
cv2.imshow("S", hsv_image[:,:,1])
cv2.imshow("V", hsv_image[:,:,2])
cv2.waitKey(0)
- The color space we are working in matters, since we perform all our algorithms on the layers of the digital image.
- Some features of the image may be much more characteristic in one color space than in another.
- We will see this clearly when implementing color blob detection.
Image Processing
- Image processing is a computational process that transforms one or more input images into an output image
- Frequently used to make images more appealing
- Enhancing imperfect images
- In robotics and computer vision, image processing is often the basis for feature extraction
- Some algorithm categories
- Pixel-wise on single image: monadic operations
- Pixel-wise on a pair of images: dyadic operations
- On local groups of pixels: spatial operations
- Shape changing operations
Pixel Value Distribution
- Min, max, mean, median, standard deviation of pixel values
image.mean()
# 160.36176011994004
image[:,:,0].mean()
# 152.46574812593704
image[:,:,1].mean()
# 161.83708245877062
image[:,:,2].mean()
# 166.78244977511244
image[:,:,1].max()
# 214
image[:,:,1].min()
# 27
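The median and standard deviation can be obtained in the same way. The following sketch collects all five statistics per layer for a tiny synthetic image with known values, so the results are easy to check by hand. The `stats` dictionary is only an illustrative structure.

```python
import numpy as np

# Synthetic 4x4 BGR image with known values per layer.
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:, :, 0] = 10    # blue layer: constant 10
demo[:, :, 1] = 20    # green layer: constant 20
demo[:2, :, 2] = 100  # red layer: top half 100, bottom half 0

stats = {}
for index, name in enumerate(("B", "G", "R")):
    channel = demo[:, :, index]
    stats[name] = {
        "min": int(channel.min()),
        "max": int(channel.max()),
        "mean": float(channel.mean()),
        "median": float(np.median(channel)),
        "std": float(channel.std()),
    }

print(stats["R"])
# {'min': 0, 'max': 100, 'mean': 50.0, 'median': 50.0, 'std': 50.0}
```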
calcHist
from matplotlib import pyplot as plt

histr = cv2.calcHist([image], [0], None, [256], [0, 256])
plt.plot(histr)
plt.show()
Get an impression of the pixel value distribution in an image by plotting it in a histogram showing the frequency of pixel values. For color images: one histogram per layer.
cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])
images
: The source image, of type uint8 or float32. It must be given in square brackets, i.e. “[img]”.
channels
: The index of the channel for which the histogram is calculated, also given in square brackets. For a grayscale image the value is [0]. For a color image, pass [0], [1] or [2] to calculate the histogram of the blue, green or red channel respectively.
mask
: The mask image. To compute the histogram of the full image, pass None. To compute the histogram of a particular region of the image, create a mask image for that region and pass it here.
histSize
: The number of bins, given in square brackets. For full scale, pass [256].
ranges
: The range of pixel values to count. Normally this is [0, 256].
Monadic Operations
- Very common: Thresholding
retval, threshold_image = cv2.threshold(image[:,:,0], 110, 255, cv2.THRESH_BINARY)
All pixels whose intensity in the blue channel is greater than 110 become True (255), and all others become False (0). The output is a binary image.
- To threshold on all channels simultaneously:
import numpy as np
threshold_image = cv2.inRange(image, np.array([5, 20, 80]), np.array([60, 130, 190]))
All pixels where the intensities of all channels fall between the lower and upper bounds become True (255); all others become False (0).
Threshold
cv2.threshold(src, thresh, maxval, type[, dst])
src
: The source image, which should be a grayscale image.
thresh
: The threshold value which is used to classify the pixel intensities in the grayscale image.
maxval
: The value assigned to a pixel if its value is more than (or, depending on the type of thresholding, less than) the threshold value. It is only used by THRESH_BINARY and THRESH_BINARY_INV.
type
: The type of thresholding to apply. Some types are:
cv2.THRESH_BINARY
cv2.THRESH_BINARY_INV
cv2.THRESH_TRUNC
cv2.THRESH_TOZERO
cv2.THRESH_TOZERO_INV
This function returns two outputs:
retval
: The threshold that was used.
dst
: The thresholded image.
inRange
cv2.inRange(src, lowerb, upperb[, dst])
src
: The source array or image.
lowerb
: The inclusive lower boundary, as an array or a scalar.
upperb
: The inclusive upper boundary, as an array or a scalar.
The function checks whether the elements of the src array lie between lowerb and upperb (inclusive). If they do, the function sets the corresponding element in the output array to 255 (white). If they don’t, the function sets the corresponding element in the output array to 0 (black).
Dyadic Operations
Common examples: arithmetic operators such as addition, subtraction, element-wise multiplication. Also logic operations such as AND, OR, XOR. Here we apply a mask to an image with the bitwise AND operator:
thresholded_image = cv2.inRange(image, np.array([5, 20, 80]), np.array([60, 130, 190]))
res = cv2.bitwise_and(image, image, mask=thresholded_image)
cv2.imshow("Result", res)
cv2.waitKey(0)
Blob detection
OpenCV comes with a blob detector that is easy to use:
- We first define the filter parameters
params = cv2.SimpleBlobDetector_Params()
# Filter by Area.
params.filterByArea = True
params.minArea = 150
params.maxArea = 6000
# Filter by Circularity
params.filterByCircularity = False
params.minCircularity = 0.1
# Filter by Convexity
params.filterByConvexity = False
params.minConvexity = 0.87
# Filter by Inertia
params.filterByInertia = False
params.minInertiaRatio = 0.01
- With the parameters object we create the blob detector
# Create a detector with the parameters
detector = cv2.SimpleBlobDetector_create(params)
- Detect blobs and visualize them
# Detect blobs
keypoints = detector.detect(~thresholded_image)
im_with_keypoints = cv2.drawKeypoints(image, keypoints, np.array([]), (0, 0, 255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
cv2.imshow("Blobs", im_with_keypoints)
cv2.waitKey(0)
- Keypoints: The blob detector returns a list of keypoints. Each keypoint is a special structure which contains information about the detected feature. Here’s a brief overview of the properties of a keypoint:
pt
: The coordinates of the detected feature in the format (x, y).
size
: The diameter of the meaningful keypoint neighborhood.
angle
: The computed orientation of the keypoint (-1 if not applicable); it’s the angle that the keypoint vector is pointing in.
response
: The response by which the strongest keypoints have been selected. Can be used for further sorting or subsampling.
octave
: The octave (pyramid layer) from which the keypoint has been extracted.
class_id
: Can be used to cluster keypoints by the object they belong to.