Let’s get started!
So what is it we want to do?
When we want to analyze an image, there are three major tasks we need to perform:
1 – The Asks:
That is, deciding what we want to find in the image and what we want to know about it. In Vision terminology, these are called requests.
2 – The Machinery:
In the next step, the Vision machinery will act as a request handler and provide us with a result.
3 – The Result:
The result will be called an observation – what Vision observed in this image. These observations depend on what we asked Vision to do in the first place.
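Before we dive into the project, here is a rough preview of how those three pieces map to code. This is a minimal sketch only: the face detection request (VNDetectFaceRectanglesRequest) and the CGImage input are assumptions chosen for illustration, not necessarily what we will build in this tutorial.

```swift
import CoreGraphics
import Vision

// A minimal sketch of the three pieces: the Ask (a request),
// the Machinery (a request handler) and the Result (observations).
func analyze(_ image: CGImage) {
    // 1 – The Ask: a request describing what we want to know about the image.
    // VNDetectFaceRectanglesRequest is just one example of a request type.
    let request = VNDetectFaceRectanglesRequest { request, error in
        // 3 – The Result: observations matching what we asked for.
        guard let faces = request.results as? [VNFaceObservation] else { return }
        for face in faces {
            print("Found a face at \(face.boundingBox)")
        }
    }

    // 2 – The Machinery: a handler that performs our requests on the image.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    do {
        try handler.perform([request])
    } catch {
        print("Vision request failed: \(error)")
    }
}
```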
So, how do we approach this challenge? How do we make Vision see what we want it to see? Here’s where the coding comes in!
Step 1: Capturing video frames from the camera
Before we can do any Vision magic, we need to obtain the image we want to run the Asks on. We will get the necessary frames from our camera using AVFoundation:
Create a new Single View Application
Navigate to the ViewController.swift file and import AVFoundation and Vision:
import AVFoundation
import Vision
In viewDidLoad, add a call to the method setupVideoCaptureSession(), which will create our AVCaptureSession object. This object handles capture activity and manages the data flow between input devices (such as the camera) and outputs. A sketch of the method follows below.
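To make that step concrete, here is a minimal sketch of what setupVideoCaptureSession() could look like. The session preset, the choice of the back wide-angle camera, the preview layer and the delegate queue name are assumptions for illustration; the rest of this tutorial may set things up slightly differently.

```swift
import UIKit
import AVFoundation
import Vision

class ViewController: UIViewController {

    let captureSession = AVCaptureSession()

    override func viewDidLoad() {
        super.viewDidLoad()
        setupVideoCaptureSession()
    }

    func setupVideoCaptureSession() {
        // Configure the session that manages the flow from camera to output.
        captureSession.sessionPreset = .high

        // Input: the back wide-angle camera (assumed; pick the device you need).
        guard let camera = AVCaptureDevice.default(.builtInWideAngleCamera,
                                                   for: .video,
                                                   position: .back),
              let input = try? AVCaptureDeviceInput(device: camera),
              captureSession.canAddInput(input) else { return }
        captureSession.addInput(input)

        // Output: raw video frames that we will later hand to Vision.
        let videoOutput = AVCaptureVideoDataOutput()
        videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
        if captureSession.canAddOutput(videoOutput) {
            captureSession.addOutput(videoOutput)
        }

        // Show the camera feed on screen (optional).
        let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
        previewLayer.frame = view.bounds
        previewLayer.videoGravity = .resizeAspectFill
        view.layer.addSublayer(previewLayer)

        // Start the flow of data from the camera.
        captureSession.startRunning()
    }
}

extension ViewController: AVCaptureVideoDataOutputSampleBufferDelegate {
    // Each captured frame arrives here; this is where Vision will come in later.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Handle the frame (e.g. pass its pixel buffer to a Vision request handler).
    }
}
```

Keep in mind that the app also needs an NSCameraUsageDescription entry in its Info.plist, otherwise iOS will not grant access to the camera.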