There’s a famous trope in crime TV shows: The characters are peering anxiously at a grainy surveillance camera, when suddenly they see their suspect — in a blurry image that’s only visible for a second.
“Wait a second,” someone says. “Zoom in … enhance.”
And suddenly they’re looking at a crystal-clear, perfect image of their suspect.
The whole concept is, of course, silly (and the trope has come in for mockery). If the camera only captured so many pixels in the first place, a button to retrieve a clearer image would have to be magic. Even in the distant, Star Trek future, after all, you can’t create information out of thin air.
Except … the enhance button might finally be here.
In the latest cool advance for artificial intelligence and machine learning, researchers have written code that can reconstruct blurry, low-resolution images of faces into clear, higher-resolution versions that come very close to what the actual faces look like. This development comes in an area of machine-learning research called “face super-resolution,” which focuses on reconstructing faces from distorted or low-resolution images.
In a new paper recently accepted to a machine-learning conference, “Progressive Face Super-Resolution via Attention to Facial Landmark,” by researchers at the Korea Advanced Institute of Science and Technology, detailed faces are reconstructed from 16-by-16, highly pixelated images. Here are some examples from the paper:
Jonathan Fly, a machine-learning enthusiast who replicates (and goofily riffs on) recent machine-learning advances at his blog I Forced a Bot, put together this impressive collection of images, where the approach laid out in the new paper is able to do, on the whole, a shockingly good job predicting faces from pixelated images:
Okay, so the predicted faces are … a bit off. A weird number of them have mustaches, and even the ones that are almost right look kind of creepy, like they were assembled by a robot with no comprehension of what a human is but tons of examples of what we look like. (That’s because that’s exactly how they were assembled.)
But on the whole, they’re pretty good — it’d be way easier to identify a person from the AI-generated images than from the pixelated starting images.
To learn more about how this worked, I asked Fly to give it a shot with a picture of me. The results show some of the ways this can be impressive, and some of the ways it can be horrifying.
On the far left, top row, you have the blurry, 16-by-16-pixel image that the AI starts with. (In the bottom row, it’s altered to have higher contrast.) Then you have some guesses from the bot — a “best case” guess where it did unusually well, a “more realistic” case with output that’s pretty typical, and some horrifying images showing how confused the computer gets if the facial features aren’t where the computer expects them.
My conclusion, after running these funhouse mirror images by some friends to ask whether they’d recognize me if they saw these pics on the evening news: The “enhance” button may not be good enough for use by law enforcement yet, but it’s getting quite close.
As we get better at using computers to fill in the blanks in low-quality images, law enforcement very well might start using technology like this to turn a blurry surveillance image into a reasonable sketch of the person they’re looking for. Photo editing programs might offer “super-resolution” filters that let you get an image that appears higher-resolution than it started out as.
And in the not-too-distant future, the “enhance” button, a staple of science fiction and police procedurals, might just be available for anyone who wants it. (You can download the code and try it yourself.)
How is this possible?
Super-resolution and how it works
Machine-learning research focused on computer vision has seen all sorts of cool triumphs lately, from generating faces that don’t exist to making fake videos in which the Mona Lisa talks.
AI enhancement of blurry images is just the latest leap, and one of the more unexpected ones. The argument against the existence of the “enhance” button is simple — you can’t get information out of nowhere. If an image is blurry because the camera didn’t catch enough of a person’s face, then there’s no button that will fill in those blanks.
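The "no information out of nowhere" argument can be made concrete with a toy sketch. It assumes a camera that shrinks an image by averaging each block of pixels; the two 4-by-4 patterns and the downsample helper are made up for illustration and have nothing to do with the paper's code. Two clearly different pictures collapse to the identical low-resolution image, so no button could tell from the small image alone which one was in front of the lens.

```python
def downsample(image, factor):
    """Shrink a square grayscale image by averaging each factor-by-factor block."""
    size = len(image) // factor
    out = []
    for by in range(size):
        row = []
        for bx in range(size):
            block = [
                image[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Two different hypothetical 4x4 "photos" (0 = black, 255 = white).
checkerboard = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]
stripes = [
    [0, 255, 0, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 255],
    [0, 255, 0, 255],
]

low_a = downsample(checkerboard, 2)
low_b = downsample(stripes, 2)

print(checkerboard == stripes)  # False: the originals differ
print(low_a == low_b)           # True: the 2x2 versions are identical
```

Every 2-by-2 block in both patterns averages to the same mid-gray, which is why the detail that distinguished them is gone for good.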
What you can do, though, and what turns out to be enough to recreate images with pretty high accuracy, is use all of your other knowledge about how human faces look to add information to the little that you started with. For example, we already know that human faces have a nose and cheeks and eyes. Therefore, all we need to do with the few pixels we have is guess which nose, out of the common human noses, best matches our few data points.
The thing the AI is doing to generate the above images is a lot more complicated — but that’s the general idea. The way to generate information out of nothingness is to start with reasonable assumptions about what you’re looking at, and to use the information you have about facial features in general to predict these facial features in particular. And it turns out it works pretty well.
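That general idea, picking the known pattern that best explains the few pixels you have, can be sketched in a few lines. This is a deliberately crude stand-in and not the method from the paper, which uses a trained neural network: here the "knowledge about faces" is just a tiny, made-up library of high-resolution patterns, and the names known_patterns, distance and enhance are hypothetical.

```python
def downsample(image, factor):
    """Shrink a square grayscale image by averaging each factor-by-factor block."""
    size = len(image) // factor
    return [
        [
            sum(
                image[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / (factor * factor)
            for bx in range(size)
        ]
        for by in range(size)
    ]


def distance(a, b):
    """Sum of squared pixel differences between two same-sized images."""
    return sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def enhance(low_res, known_patterns, factor):
    """Return the name of the known pattern whose shrunken version is closest."""
    return min(
        known_patterns,
        key=lambda name: distance(downsample(known_patterns[name], factor), low_res),
    )


# Hypothetical prior knowledge: two high-resolution 4x4 patterns.
known_patterns = {
    "bright_top": [
        [200, 220, 210, 230],
        [190, 210, 220, 200],
        [40, 60, 50, 30],
        [50, 40, 60, 50],
    ],
    "bright_left": [
        [200, 210, 40, 50],
        [220, 190, 60, 40],
        [210, 220, 50, 60],
        [230, 200, 30, 50],
    ],
}

# A noisy 2x2 observation, standing in for the blurry surveillance frame.
observed = [[204, 216], [46, 49]]

best = enhance(observed, known_patterns, 2)
print(best)
```

The output is only ever as good as the library of assumptions behind it. If the true image is not something the system has seen examples of, the best match can still be badly wrong, which fits the mustaches and funhouse faces above.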
In the past 10 years, breakthrough after breakthrough in machine learning has challenged our conception of what it’s possible for computers to do. We’ve learned how to create algorithms that generate images of people who never existed, that write poetry, and that animate the Mona Lisa. The new capabilities of AI are forcing us to reconsider what’s possible — and the “enhance button,” escaped from the realm of TV, is just the latest example.