From the course: Computer Vision on the Raspberry Pi 4
Introducing OpenCV
- [Instructor] Earlier, I explained how to install OpenCV and use some of its basic features. Now, I'd like to take a step back and provide a proper introduction. OpenCV's full name is the Open Source Computer Vision Library, and Intel released the alpha version in 2000. Their goal was to advance vision research by providing a common infrastructure that developers could build on. Since then, developers have added many new capabilities to OpenCV. These include stereoscopic vision, video stabilization, and facial analysis. In this course, we'll focus on using OpenCV for object detection. Object detection determines if an object is present in an image, and if so, where it's located.
Most of the OpenCV applications in this course perform the same set of operations. First, the application reads an image by calling imread or by calling the capture function of the PiCamera package. Next, the application examines the image using OpenCV's analysis capabilities. These include support vector machines, neural networks, and decision trees. When the analysis is complete, the application can display the results by drawing graphics on the image, such as a rectangle around a detected object. Then, the resulting image can be saved to a file by calling imwrite.
Toward the end of the course, I'll explain how to capture images from a camera. Until then, we'll read images from files by calling imread. This function accepts two arguments: the path of the file containing the image and a flag that identifies the image's type. All of the images in this course will be color, so we'll set the type to IMREAD_COLOR. For example, if you want to read a color image from smiley.jpeg, you'd call imread with the first argument set to the full path of smiley.jpeg and the second argument set to IMREAD_COLOR. When an image is read from a file, OpenCV returns the image's data in a NumPy ndarray. The name ndarray stands for n-dimensional array.
For a color image, imread will return an array with three dimensions: one for the image's height, one for its width, and one for the color channels of its pixels. This slide shows what happens when imread reads the color image on the left. The array has three dimensions, but you can think of it as three two-dimensional arrays, one for each color channel in the image's pixels. Most programs access color in RGB order with red first, green second, and blue last, but OpenCV stores colors in BGR order with blue first.
After an application analyzes an image, it can use OpenCV's drawing functions to draw graphics. These graphics may include boundary lines, bounding boxes, or text. The three main drawing functions are line, rectangle, and putText. Each is straightforward to understand, and I'll demonstrate how they're used later on. I like to use putText when I need to figure out why my application isn't analyzing the right region of the image. Unfortunately, the font selection is limited. Two options are FONT_HERSHEY_SIMPLEX, which is normal-sized sans-serif, and FONT_HERSHEY_COMPLEX, which is normal-sized serif.
After an application analyzes an image and draws graphics, it can save the image to a file by calling imwrite. The first argument identifies the name of the file, and OpenCV can create several types of images, including JPEGs, PNGs, TIFFs, and Windows bitmaps. The second argument identifies the ndarray containing the image's data. This example code shows how you can save the data from image_array to an image file named out_image.png.
OpenCV is a powerful toolset that provides a wealth of capabilities for computer vision. This video has presented many of the simple functions, and later videos will present more advanced features.