Challenges are no longer on the horizon — they are right here
Technology can create virtually identical copies of humans.
Sounds familiar, huh? That’s right, Blade Runner.
The thing is, I wasn’t talking about a movie. Something very similar is happening for real, right now, in our world.
In the movie, the replicants have their own physical bodies and consciousness. In our 2019, we can only create perfect portraits of men, women and children. As many as you like.
Just take a look at the website thispersondoesnotexist.com and keep refreshing the page.
As you can probably guess from the domain name, none of those people really exist in our world.
How is it possible? It all begins with a massive download of HD portraits from Flickr: about 70,000 images of people of every age, sex and ethnicity. In a first step, these images are automatically adjusted and cropped with an open source toolkit. In a second step, pictures containing statues, paintings and photos of photos are removed. Then comes the magic. Basically, a machine learning algorithm reads different characteristics in each photo and, starting from these, automatically creates a new image: a new person, a fake one.
The website's creator, software developer Philip Wang, said that it is a great way to show the power of AI to the public. I can’t help but agree with him: technically, it’s an incredible project with huge potential.
On the other hand, I think this is also a great example of how AI should not work, now and in the future.
On closer inspection, in fact, it reveals potentially serious legal violations underneath.
First, let’s talk about personal data protection.
Have you ever asked a person who can draw very well, “How did you learn to draw like that?”
The same question goes for our case: how did StyleGAN learn to create those perfectly fake human faces?
AI, however technically sophisticated, is a collection of algorithms. It does not know our world. To be useful, it usually needs to be trained with data. In this case, the training set was composed of 70,000 cropped images from Flickr. Fed with all those portraits, StyleGAN learned what a human face looks like. The authors of the project probably believed there was nothing wrong with that: on their GitHub page, they wrote that “the individual images were published in Flickr by their respective authors” under permissive licenses, so basically they felt they could do anything with those pictures.
But wait a second… because there is something wrong here.
Let’s take a closer look.
StyleGAN was not fed 70,000 images in order to understand what an “image with a permissive license” is.
It was fed with 70,000 portraits of human faces, so it could learn what they look like in our world.
And under the GDPR here in Europe, gathering that kind of data on European people requires more care than other personal data. In legal language, this is “processing of special categories of personal data”, and yes, a human face is considered special personal data.
Are there European faces in the training set? Sure. The GitHub page provides some statistics.
The country is unknown for 85% of the images, and 5.7% of them fall into the “other country” category. More than 1,100 images come from the UK (still in the EU, so far), more than 400 people in the training set are from Italy, and the same goes for Spain.
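To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above and rests on two assumptions of mine: that the percentages refer to all 70,000 images, and that the 15% of images that are not “unknown” all carry a country label. The function name `country_shares` is just an illustrative helper, not something from the project.

```python
# Back-of-the-envelope check using only the figures quoted in this article.
# The per-country counts are lower bounds ("more than"), so the results
# are lower bounds too.

TOTAL_IMAGES = 70_000       # size of the training set
UNKNOWN_COUNTRY_PCT = 85    # share of images with no country information

# Lower bounds for the three countries named above.
named_countries = {"UK": 1_100, "Italy": 400, "Spain": 400}


def country_shares(total, unknown_pct, counts):
    """Return (named, known, share_of_total, share_of_known)."""
    named = sum(counts.values())
    # Assumption: every image that is not "unknown" carries a country label.
    known = total * (100 - unknown_pct) // 100
    return named, known, named / total, named / known


named, known, share_total, share_known = country_shares(
    TOTAL_IMAGES, UNKNOWN_COUNTRY_PCT, named_countries
)

print(f"Images from the UK, Italy and Spain: at least {named}")
print(f"Images with a known country: about {known}")
print(f"Share of the whole training set: at least {share_total:.1%}")
print(f"Share of images with a known country: at least {share_known:.1%}")
```

Under these assumptions, the three countries alone account for at least 1,900 images: a small slice of the whole set (about 2.7%), but roughly 18% of the images whose country is actually known.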
In case you are wondering, yes: there are even underage European children. Below (file “21695.png”) you can see an example from a local children's festival in Italy called “La notte delle streghe e dei folletti” (“The night of witches and elves”). Here is the Flickr link.
So is consent required for processing special categories of personal data? Generally speaking, yes, absolutely! But there are some exceptions, and this could be one of them.
In fact, the GDPR allows derogations (Art. 89) for data processing for scientific research purposes. Even though it is not always clear whether a project pursues that objective, the project's paper is hosted on a Cornell University page, so I guess it does.
Still, can researchers do whatever they want on the basis of that kind of project? No, the GDPR is not that Machiavellian.
Concerned about possible misuse, the European Data Protection Supervisor (EDPS), and in some cases even member states such as Italy, have set up legal boundaries and ethical rules for processing personal data for statistical purposes or scientific research. One example is the requirement to carry out research in compliance with the GDPR principles and to specify the measures taken when processing personal data.
I asked the (Tyrell Corporation) StyleGAN team whether their project complied with these ethical rules.
Nobody answered me.