AI in imaging goes beyond consumer products; it is arguably even more important in commercial and scientific applications across many industries. In computational photography, AI makes a difference because it is implemented in software: whatever the camera captures can be enhanced and processed in real time with the aid of hardware (an image signal processing circuit) and software (the imaging application). The results not only improve image quality and resolution, but also extract information through image analysis and prediction, combining machine learning techniques from the field of computer vision.
One application gaining widespread use is face recognition, or facial recognition. Features of a face are extracted by subtracting the pixel sums of rectangular regions from those of their surrounding regions (Haar-like features). The AdaBoost algorithm is then applied to feature detection by running it against a classifier trained on labeled data sets. The resulting output determines whether an object detected in an image is a face or not. This step is called face detection, and it is the first part of a facial recognition system.
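As a rough sketch of the idea (not any vendor's production code), a Viola-Jones-style detector computes Haar-like features as differences between rectangle sums, made cheap by an integral image. The function names below are hypothetical illustrations:

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns, so that the sum of any
    rectangle can later be computed with just four lookups."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, h, w):
    """Sum of pixels in the h x w rectangle at (top, left), using the
    integral image ii instead of re-summing the pixels."""
    total = ii[top + h - 1, left + w - 1]
    if top > 0:
        total -= ii[top - 1, left + w - 1]
    if left > 0:
        total -= ii[top + h - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def two_rect_feature(ii, top, left, h, w):
    """A two-rectangle Haar-like feature: one region's sum minus the
    sum of the region directly below it (e.g. bright forehead band
    versus darker eye band)."""
    return rect_sum(ii, top, left, h, w) - rect_sum(ii, top + h, left, h, w)
```

In a real detector, AdaBoost selects and weights thousands of such features to build the face/non-face classifier.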
After a face has been detected, it must be identified; this is where the actual face recognition takes place. Systems such as Amazon Rekognition and the Google Cloud Vision API can be used to identify a person. This requires a connection to a database that holds identity information. When a face is detected, the software compares it against the faces in the database; if a match is found, it returns the identity of the detected face.
Practical applications of facial recognition have been criticized as unethical. For example, there is controversy over its potential use for mass surveillance and intrusion on people’s privacy. It does have benefits, however, in public ID systems such as airports, government agencies, and office buildings, where a person’s ID is their face.
For this example, the Google Vision API will be used to show how labels can be detected in an image. The sample code is in Python 3.x, courtesy of Google.
"""Detects labels in the file."""
import io

from google.cloud import vision

client = vision.ImageAnnotatorClient()

# `path` is the path to the local image file to analyze.
with io.open(path, 'rb') as image_file:
    content = image_file.read()

image = vision.types.Image(content=content)
response = client.label_detection(image=image)
labels = response.label_annotations

for label in labels:
    print(label.description, label.score)
Running this requires installing the client library and enabling the API, which in turn requires a valid Google account. More instructions are available in Google’s documentation.
When used with imaging devices, such as a smartphone camera, the user can quickly identify labels in an image. In this example from the Google Vision API, labels are given a score between 0 and 1; the higher the score, the higher the probability that the detected label is correct.
As we can see from the label detection, ‘Street’ has a score of 0.872. Since the score is 0.872, or 87.2%, the label ‘Street’ is highly probable, i.e. “the image was taken in a street”.
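In practice, an application would filter the returned labels by score before acting on them. A minimal sketch, reusing the ‘Street’ score from above alongside hypothetical scores for the other labels and an arbitrary 0.7 cutoff:

```python
# (description, score) pairs in the shape the Vision API returns.
# 'Street' matches the example above; the other scores are made up.
labels = [("Street", 0.872), ("Asphalt", 0.654), ("Town", 0.412)]

# Keep only labels we are reasonably confident about.
THRESHOLD = 0.7
confident = [(name, score) for name, score in labels if score >= THRESHOLD]
```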
Google Lens is another application that uses label detection. However, it provides an additional feature that allows it to identify products and brands within the image. Google Assistant can then link to a website with more information about the product.
Apple’s Portrait Mode, available on the iPhone, brings AI to computational imaging. It offers features that allow users to capture impressive images without any knowledge of photography. What makes this possible is the Neural Engine in the iPhone’s A-series processor (e.g. the A12 and A13 Bionic). Using machine learning techniques, the camera software can adjust the lighting and apply image stabilization to capture the sharpest image. The iPhone can also apply filters to smooth the image and make it look better for selfies and portraits. The iPhone uses Portrait Mode to take extraordinary images, but users can disable it.
There has been some controversy over the aggressively retouched Portrait Mode selfies on the iPhone XS and XS Max, a.k.a. “Beautygate”. The result was an almost perfect selfie, in which blemishes and other imperfections on the face were removed. Apple confirmed it was caused by a bug in the Smart HDR computational photography algorithm. The iPhone buffers four frames at different exposures, which are then combined to take the best features from each. iOS was favoring heavier noise reduction, so images looked very smooth but lost detail. Because all of this happens in software, it could be corrected. This example shows how far automatic retouching can go, and why AI has to be careful with how much of it is applied.
Other smartphones, like Samsung’s Galaxy Note and the Huawei Mate, have their own implementations of portrait modes that focus on beauty. Applications like these let a non-photographer accomplish two things with one click: a stunning image capture, already retouched. In the past, getting that result meant hiring a professional photographer to capture the image and a retoucher to edit it. Now smartphones are intelligent enough to do both. With AI, neural chips in the smartphone’s processor can remember certain settings and apply them during image processing.
Google’s Pixel smartphone camera is also ahead of the curve in computational imaging. Instead of using multiple cameras with complicated optics, the original Pixel used just a single camera and relied on AI processing with dual-pixel technology. The results are still stunning even though the Pixel captures the image with one camera rather than several.
Image superscaling is a way to increase image resolution while preserving as much detail and quality as possible. Using FSRCNN (Fast Super-Resolution Convolutional Neural Network), images can be upscaled without losing too much quality. This allows users to upscale lower-resolution images shot with older digital cameras.
FSRCNN involves the following steps:
- Feature extraction — Replaces bicubic interpolation with 5×5 convolutions.
- Shrinking — Reduction in feature maps.
- Non-Linear Mapping — Multiple 3×3 convolution layers are applied.
- Expanding — The feature map is now increased by 1×1 convolutions.
- Deconvolution — High resolution image is reconstructed using 9×9 filter.
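To make the steps above concrete, here is a toy single-channel sketch of the two operations FSRCNN is built from: plain convolution, and the final deconvolution (transposed convolution) that performs the upscaling. This is pure NumPy with random, untrained weights; it only illustrates how the shapes flow through the network, not a working super-resolver:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D convolution of a single-channel image x with kernel k."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def deconv2d(x, k, stride):
    """Transposed convolution: every input pixel stamps a scaled copy
    of the kernel into a larger output grid, upscaling by `stride`."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.zeros((H * stride + kh - stride, W * stride + kw - stride))
    for i in range(H):
        for j in range(W):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * k
    return out

rng = np.random.default_rng(0)
lowres = rng.random((16, 16))
feat = conv2d(lowres, rng.random((5, 5)))             # feature extraction, 5x5
feat = conv2d(feat, rng.random((3, 3)))               # non-linear mapping, 3x3
hires = deconv2d(feat, rng.random((9, 9)), stride=2)  # deconvolution, 9x9
```

The real network also uses the 1×1 shrinking/expanding layers and learns all of the kernel weights from data; only the deconvolution layer changes the spatial resolution.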
The research paper “Accelerating the Super-Resolution Convolutional Neural Network” (Chao Dong, Chen Change Loy, Xiaoou Tang) contains more information about FSRCNN.
Popular image editing software from companies like Adobe now incorporates AI. We see this in the Adobe Creative Cloud suite with the Enhance Details feature, powered by Sensei AI, in Lightroom. Retouchers and photo editors working with raster images can surely benefit from these new features in the software they use. The image processing takes place at the demosaicing stage to achieve crisper details and better color accuracy.
AI is making image editing revolutionary in other ways as well. Researchers at Intel and the University of Illinois at Urbana-Champaign have come up with an algorithm to process low-light images. It applies post-processing enhancements that can bring out details from images shot in low light. Normally, photographers would not even bother editing poorly lit images because they are too underexposed. This technique can help cameras “see in the dark”.
Not surprisingly, the researchers called the system See-in-the-Dark (SID). It learns by comparing photos shot in low light against a dataset of the same scenes shot with slower shutter speeds (long exposures). Imagine being able to recover details from a poorly lit image: it hardly matters how dark the RAW image is when the algorithm can be applied to enhance it.
The medical field is a promising area for applying AI to diagnostic imaging systems. Radiologists benefit from intelligent software that uses AI to assist in analyzing MRI and CT scans. Human error, by the technician or the radiologist, can contribute to a misdiagnosis that can lead to patient fatalities. The use of AI helps correct or prevent these problems.
Medical imaging combines machine learning, pattern recognition, and computer vision. Since AI is data driven, the more datasets the software learns from, the greater the probability of a correct prediction. This helps radiologists make faster yet accurate diagnoses, and aims to make the process more efficient. It is not meant to remove the doctor from the system entirely, but to assist them in making better diagnoses.
Deep learning techniques are also being applied to medical diagnostic imaging. A training set with identified labels is used to make classifications; in this way, the algorithm can detect findings such as tumors in a patient’s diagnostic images to assist doctors. According to a study in The Lancet Digital Health, researchers found that deep learning systems correctly detected a disease 87% of the time, compared with 86% for healthcare professionals. This shows that AI systems are close in accuracy to their human counterparts.
Computer vision, combined with sensors, is being applied to self-driving cars. Tesla is a testament to this approach, favoring cameras over LiDAR. Using software to analyze the images its cameras capture, a self-driving car can detect pedestrians, roadblocks, and other cars on the street. The problem this system faces is still accuracy: public safety is the biggest concern with self-driving cars because, for the most part, they have not yet reached full autonomy.
Cameras perform as the “eyes” of self-driving cars. They capture the surroundings, and the images are processed with detection and classification techniques: objects are identified and classified by the software. The data is then merged using sensor fusion to produce the information used in decision making (e.g. lane changes, a full stop at a red light, etc.).
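A naive illustration of that last fusion step (hypothetical names, scores, and thresholds, not any vendor’s actual stack): detections from different sensors are pooled per object label, and the system only acts on labels whose combined confidence clears a threshold.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "pedestrian"
    confidence: float  # 0..1 score from that sensor's detector
    source: str        # which sensor produced the detection

def fuse(detections, threshold=0.8):
    """Average per-label confidence across sensors and keep only
    the labels confident enough to act on."""
    by_label = {}
    for d in detections:
        by_label.setdefault(d.label, []).append(d.confidence)
    return {label: sum(scores) / len(scores)
            for label, scores in by_label.items()
            if sum(scores) / len(scores) >= threshold}

frame = [
    Detection("pedestrian", 0.90, "camera"),
    Detection("pedestrian", 0.84, "radar"),
    Detection("plastic bag", 0.40, "camera"),
]
fused = fuse(frame)  # only "pedestrian" clears the 0.8 threshold
```

Real systems fuse at the level of tracked objects with positions and velocities, and weight sensors by their known failure modes, but the gating idea is the same.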
Just like self-driving cars, robots used in manufacturing and industrial applications rely on computer vision. Amazon uses a combination of human workers and robots in its warehouses. Cameras serve as vision systems that let the robots read the package codes needed to track and record deliveries. The robots help speed up fulfillment and perform tasks that would otherwise require a human operator, such as driving carts of packages to the delivery truck.
More importantly, there are industrial robots that benefit from vision systems that make their work more precise and accurate. One example is VGRS (Vision Guided Robotic Systems). For automated systems that use robots, vision can increase accuracy and efficiency. These robots also use sensor fusion, combining vision with other types of sensors.
Other Use Cases
Any system that requires computer vision will benefit from computational imaging and AI. The list is endless, because new innovation brings new products and services. Some companies are exploring imaging systems that analyze satellite images with machine learning, a technique that lets scientists gain more knowledge about terrain and geography. Other projects include AI-enabled cameras that monitor crime, virtual makeup for beauty applications using AR (Augmented Reality) technology, and real estate and tourism guides built on mixed reality technology that combines AR and VR.
Great Outlook For Imaging
The use of AI can assist computational imaging without necessarily replacing humans. The photographer will not be replaced as long as the work requires a skill set that AI lacks; a very good director will still need a director of photography until an expert system can replace that profession. Job security is a matter of demand, not just obsolescence. In computational imaging, AI is best implemented as expert system software performing specific tasks. It is most likely to be used to assist with tasks in ways that further improve efficiency, or where the task cannot be performed by conventional systems. Many of the applications discussed here can be put into a single device, which is innovative enough to warrant some praise. While traditional imaging will always rely on optics, the future of digital imaging looks computational, and AI will be there to support it.