The generation of facial images has important applications in industries such as criminal investigation, character design, and education and training. However, drawing a realistic portrait of a human face takes a professional painter at least several hours, and the task is even harder for a novice who has never painted before. Face sketches drawn by novices are often simple and abstract, even uneven and incomplete. A smart face drawing board, however, can make the process dramatically easier.
Recent deep image-to-image translation techniques have made it possible to quickly generate face images from sketches, but these methods are heavily dependent on the quality of their input: they produce realistic results only when the original sketches or edge maps are of high quality.
A group of researchers from the Chinese Academy of Sciences and the City University of Hong Kong introduced a local-to-global approach that can generate realistic portraits from relatively simple sketches.
Most deep-learning-based sketch-to-image translation solutions treat the input sketch as a fixed hard constraint and then attempt to reconstruct the texture or shading information missing between the strokes; the new method takes a different route.
The key idea behind the new method is to implicitly learn a space of plausible face sketches from real face sketch images and to find the point in this space closest to the input sketch. Because the method treats the input sketch as a soft constraint guiding image synthesis, it can generate high-quality, realistic face images even from rough and/or incomplete input.
The system consists of three main modules:
- Component Embedding (CE)
- Feature Mapping (FM)
- Image Synthesis (IS)
The CE module uses an auto-encoder architecture and learns five feature descriptors from the face sketch data: left eye, right eye, nose, mouth, and the remainder of the face. Each descriptor is projected into a locally linear manifold space composed of feature vectors encoded from a large number of database samples. The feature vector of a hand-drawn input sketch is projected into this space as a point, its nearest neighbors are found, and the sketch input is refined through a linear-combination reconstruction, as shown in the figure below.
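The projection step above can be sketched in a few lines of NumPy. This is a minimal illustration under assumed shapes, not the paper's implementation: `project_to_manifold` and the least-squares weighting are illustrative stand-ins for the nearest-neighbor linear-combination reconstruction described in the text.

```python
import numpy as np

def project_to_manifold(query, bank, k=5):
    """Refine a component feature vector by projecting it onto the
    local linear manifold spanned by its k nearest neighbors.

    query: (d,) encoded feature of one input-sketch component
    bank:  (n, d) features encoded from the training sketches
    Returns the reconstruction as a least-squares linear
    combination of the k nearest neighbors.
    """
    # Euclidean distance from the query to every sample in the bank
    dists = np.linalg.norm(bank - query, axis=1)
    nn = bank[np.argsort(dists)[:k]]            # (k, d) nearest neighbors

    # Solve min_w ||nn.T @ w - query||^2 for the combination weights
    w, *_ = np.linalg.lstsq(nn.T, query, rcond=None)
    return nn.T @ w                             # refined feature vector
```

Because the output lies in the span of real training samples, a rough or incomplete input component is pulled toward a plausible one, which is exactly the "soft constraint" behavior described above.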
The FM and IS modules together form another deep-learning subnetwork for conditional image generation, mapping component feature vectors to real images. The decoding part of the FM module is similar to that of the CE module, except that the FM module maps each feature vector to a 32-channel feature-map space instead of a 1-channel sketch.
Because a sketch has only one channel, a direct sketch-to-image network struggles to resolve incompatibilities between adjacent components in their overlapping regions. The mapping used in the FM module improves the information flow, providing greater flexibility to fuse the individual face parts and yielding higher-quality synthesis results.
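The shape of this mapping can be illustrated with a small PyTorch module. All layer sizes here (`feat_dim=512`, an 8×8 starting grid, three upsampling steps) are assumptions for illustration, not the paper's architecture; the only point carried over from the text is that the decoder emits 32 channels rather than a 1-channel sketch.

```python
import torch
import torch.nn as nn

class FeatureMapDecoder(nn.Module):
    """Illustrative stand-in for the FM decoder: turn a component
    feature vector into a 32-channel spatial feature map."""
    def __init__(self, feat_dim=512, channels=32):
        super().__init__()
        # Expand the vector into a coarse 8x8 spatial grid
        self.fc = nn.Linear(feat_dim, channels * 8 * 8)
        # Upsample 8x8 -> 16x16 -> 32x32 -> 64x64, keeping 32 channels
        self.up = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(channels, channels, 4, stride=2, padding=1),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 32, 8, 8)
        return self.up(x)          # (N, 32, 64, 64) feature maps
```

The 32-channel output of each component decoder can then be pasted into a shared feature canvas, where overlapping regions blend in feature space rather than clashing in pixel space.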
Once the overall framework is built, training it well naturally requires a large number of paired faces and sketches. To obtain abstract representations of human faces with sparse lines, the authors started from the CelebAMask-HQ face image database, screened for unobstructed facial images, extracted sketches using Photoshop together with a sketch-simplification method, and built a new dataset containing 17K pairs of face images and corresponding sketches.
With the dataset constructed, the network is trained in two stages. In the first stage, the component embedding module is trained with an MSE loss between each sketch and its reconstruction from the encoded features. In the second stage, the parameters of the component embedding module are fixed, and the feature mapping and image synthesis modules are trained in an end-to-end manner.
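The two-stage schedule can be sketched with toy stand-in modules. Everything below (linear layers in place of the real networks, random tensors in place of data, learning rates, step counts) is an assumption made purely to show the freeze-then-train pattern.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real networks (shapes are illustrative)
encoder = nn.Linear(64, 16)    # CE encoder
decoder = nn.Linear(16, 64)    # CE decoder
fm_is   = nn.Linear(16, 64)    # FM + IS subnet, trained end-to-end

sketches = torch.randn(8, 64)  # fake sketch batch
photos   = torch.randn(8, 64)  # fake corresponding face images
mse = nn.MSELoss()

# Stage 1: train the component-embedding auto-encoder with an MSE
# loss between the input sketch and its reconstruction.
opt1 = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)
for _ in range(50):
    opt1.zero_grad()
    loss = mse(decoder(encoder(sketches)), sketches)
    loss.backward()
    opt1.step()

# Stage 2: freeze the embedding module, then train FM + IS end-to-end.
for p in encoder.parameters():
    p.requires_grad_(False)
opt2 = torch.optim.Adam(fm_is.parameters(), lr=1e-2)
for _ in range(50):
    opt2.zero_grad()
    loss = mse(fm_is(encoder(sketches)), photos)
    loss.backward()
    opt2.step()
```

Freezing the embedding in stage two keeps the learned sketch manifold stable while the later modules learn to map its feature vectors to images.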
According to the researchers, both qualitative and quantitative evaluations show that the method produces more visually pleasing face images, and the system's usability and expressiveness were confirmed in user studies. The researchers say their tool is easy to use even for non-artists, while still supporting fine-grained control over shape details. They are working to release the source code as soon as possible.
- Shu-Yu Chen, Wanchao Su, Lin Gao, Shihong Xia, Hongbo Fu. DeepFaceDrawing: Deep Generation of Face Images from Sketches. ACM Transactions on Graphics (SIGGRAPH 2020).