The author of Twin-GAN is Jerry Li. He was interested in anime but was never satisfied with his attempts to draw his favorite characters, so he turned to machine learning to transform human portraits into original anime characters.
But first, let’s review the previous attempts at teaching AI how to draw.
Neural Style Transfer
· In this approach, the style of one image is applied to another, as you can see below.
· The important caveat is that the style transfer method relies on a network pre-trained for object recognition.
· Most such networks are trained on real-life objects.
· So, this solution is unlikely to help with anime style unless we create a new dataset manually, and that would cost a lot of money.
Generative Adversarial Network (GAN)
· GANs offer another route into the anime world.
· A GAN consists of a pair of competing neural networks that can mimic any data distribution given enough samples, a sufficiently powerful network, and enough training time; a minimal sketch of this adversarial game appears after this list.
· Below we can see incredibly realistic faces generated using Progressive Growing of GANs (PGGAN).
· Besides generating fairly high-quality images, GANs are also capable of translating one type of image into another.
· However, this approach typically requires paired data (corresponding images from each domain), and unfortunately there is no paired dataset of human and anime portraits.
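To make the “two competing networks” idea concrete, here is a minimal, self-contained sketch of the adversarial game. It is written in PyTorch purely for illustration (TwinGAN itself is a TensorFlow project), and the toy generator here only learns to mimic a 1-D Gaussian:

```python
# Minimal sketch of the adversarial game behind a GAN (illustration only,
# not TwinGAN's code). The generator learns to mimic a 1-D Gaussian from noise.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
discriminator = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0        # "real" samples from the target distribution
    fake = generator(torch.randn(64, 8))         # samples produced from random noise

    # Discriminator step: learn to tell real from fake.
    d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: learn to fool the discriminator.
    g_loss = bce(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

Given enough capacity and training time, the same game scales from this toy example up to faces, which is exactly what PGGAN demonstrates.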
CycleGAN
· So, before creating Twin-GAN, Jerry Li tried to use CycleGAN to translate human portraits into anime characters.
· He took 200K images from the CelebA dataset as human portraits and around 30K anime figures from the Getchu website. After two days of training he got the results depicted below.
· The results are not bad, but they reveal some limitations of CycleGAN; we won’t go too deep into those limitations here. The sketch below shows the cycle-consistency idea that lets CycleGAN learn from unpaired data.
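CycleGAN gets around the missing paired data by training two translators at once and penalizing them whenever a round trip fails to reproduce the original image. Below is a hedged, illustrative PyTorch sketch of that extra loss term (not Jerry Li’s code); `G` and `F` stand for the human→anime and anime→human generators:

```python
# Sketch of CycleGAN's cycle-consistency term (illustrative, not the original code).
# G maps human -> anime, F maps anime -> human; both are also trained with the
# usual adversarial losses, which are omitted here.
import torch.nn.functional as F_loss

def cycle_consistency_loss(G, F, human_batch, anime_batch, weight=10.0):
    """L1 penalty that keeps the two translators (approximately) inverse to each other."""
    forward_cycle = F_loss.l1_loss(F(G(human_batch)), human_batch)    # human -> anime -> human
    backward_cycle = F_loss.l1_loss(G(F(anime_batch)), anime_batch)   # anime -> human -> anime
    return weight * (forward_cycle + backward_cycle)
```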
Twin-GAN Model
· To solve the issues of the previous models, the Twin-GAN architecture was finally created.
· PGGAN was chosen as the generator. This network normally takes a high-dimensional vector as its input, but in our case the input is an image.
· The researcher therefore used an encoder whose structure is symmetric to PGGAN to encode the image into the high-dimensional vector.
· In order to keep the details of the input image, he used a UNet structure to connect the convolutional layers in the encoder with the corresponding layers in the generator.
The input and output fall into the following three categories:
1. Human Portrait->Encoder->High Dimensional Vector->PGGAN Generator + human-specific-batch-norm->Human Portrait
2. Anime Portrait->Encoder->High Dimensional Vector->PGGAN Generator + anime-specific-batch-norm->Anime Portrait
3. Human Portrait->Encoder->High Dimensional Vector->PGGAN Generator + anime-specific-batch-norm->Anime Portrait
· The idea behind this structure is that letting the human and anime portraits share the same network helps the network realize that, although they look different, both human and anime portraits describe a face. This is crucial to image translation; a simplified sketch of this weight-sharing idea follows after this list.
· Here are the results of translating human portraits into anime characters using Twin-GAN.
· Twin-GAN can turn a human portrait into an original anime character, a cat face, or any character supplied by the user, and the algorithm performs quite well on these tasks.
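The three routes above can be pictured as one shared encoder and generator in which only the batch-norm layers know which domain is being produced. Below is a heavily simplified, hedged PyTorch sketch of that weight-sharing idea; it is illustrative only and far smaller than the actual PGGAN-based TensorFlow implementation:

```python
# Illustrative sketch of Twin-GAN's weight sharing (not the author's TensorFlow code).
# One encoder and one generator are shared by both domains; only the batch-norm
# layers are domain-specific, selected by a `domain` flag ("human" or "anime").
import torch
import torch.nn as nn

class DomainBatchNorm(nn.Module):
    """One BatchNorm per domain; every other weight in the model is shared."""
    def __init__(self, channels):
        super().__init__()
        self.norms = nn.ModuleDict({"human": nn.BatchNorm2d(channels),
                                    "anime": nn.BatchNorm2d(channels)})

    def forward(self, x, domain):
        return self.norms[domain](x)

class TwinTranslator(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # Encoder whose structure mirrors the decoder, as described above.
        self.enc1 = nn.Conv2d(3, ch, 4, stride=2, padding=1)
        self.enc2 = nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1)
        # Shared decoder/generator with domain-specific batch norm.
        self.dec2 = nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1)
        self.bn2 = DomainBatchNorm(ch)
        self.dec1 = nn.ConvTranspose2d(ch * 2, 3, 4, stride=2, padding=1)  # ch*2: UNet skip

    def forward(self, image, domain):
        e1 = torch.relu(self.enc1(image))
        e2 = torch.relu(self.enc2(e1))                    # "high-dimensional" latent
        d2 = torch.relu(self.bn2(self.dec2(e2), domain))
        out = self.dec1(torch.cat([d2, e1], dim=1))       # UNet skip keeps input detail
        return torch.tanh(out)

# Same shared weights, different batch-norm branch per output domain:
model = TwinTranslator()
photo = torch.randn(1, 3, 64, 64)                     # stand-in for a human portrait
reconstructed_human = model(photo, domain="human")    # route 1
anime_version = model(photo, domain="anime")          # route 3
```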
Limitations of Twin-GAN
· It is prone to some errors, such as mistaking the background color for hair color, and ignoring or misrecognizing important features.
· For anime character generation, another problem is the lack of a well-balanced dataset.
· Most of the anime faces collected by the researcher are female, so the network tends to translate male human portraits into female anime characters, as in the image below.
Now let’s walk through the code:
Code
Step 1: First, clone the TwinGAN repository.
Then install the other dependencies.
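Assuming a Colab/Jupyter-style notebook (the `!` prefix runs shell commands), this step might look like the cell below; the repository URL points at the public TwinGAN project and the dependency list is an assumption, not copied from the original post:

```python
# Clone the public TwinGAN repository (assumed URL).
!git clone https://github.com/jerryli27/TwinGAN.git

# TwinGAN is a TensorFlow project; Pillow is used later to display the results.
!pip install tensorflow pillow
```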
Step 2: Download and extract the frozen graph of the pre-trained model.
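The pre-trained frozen graph is distributed as an archive linked from the TwinGAN README; the URL below is only a placeholder, so substitute the real download link:

```python
# Placeholder URL -- replace it with the pre-trained model link from the README.
!wget -O pretrained.zip "https://example.com/twingan_pretrained_256.zip"
!unzip -o pretrained.zip -d pretrained/
```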
Step 3: Now change the current working directory to TwinGAN.
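In Python this is a one-liner:

```python
import os

# Work from inside the cloned repository so its relative paths resolve.
os.chdir('TwinGAN')
```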
Two pre-trained models are available:
· Human to Anime
· Human to Cats
Step 4: Choose the model you want to use.
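For example, point a variable at whichever model you extracted in Step 2 (the folder names below are assumptions; use the directories from your own download):

```python
# Choose one of the two pre-trained models; adjust the paths to match the
# folders actually extracted in Step 2 (these names are assumptions).
MODEL_PATH = '../pretrained/human_to_anime_256/'
# MODEL_PATH = '../pretrained/human_to_cats_256/'
```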
Step 5: Now set the paths for the input image we want to translate and for the output we will display later.
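In practice this just fixes where the input image will live and where the translated result should be written; the directory names below are assumptions, and any two folders will do:

```python
# Input/output locations used by the inference script in Step 7 (assumed names).
INPUT_DIR = './demo/inference_input/'
OUTPUT_DIR = './demo/inference_output/'
os.makedirs(INPUT_DIR, exist_ok=True)
os.makedirs(OUTPUT_DIR, exist_ok=True)
```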
Step 6: Choose a picture from the local file system.
Then upload the file.
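If you are running in Google Colab (assumed here), the upload widget looks like this; on a local machine you can simply copy the image into `INPUT_DIR` instead:

```python
# Google Colab file-upload widget (assumed environment).
from google.colab import files

uploaded = files.upload()                              # opens a browser file picker
for name in uploaded:
    os.replace(name, os.path.join(INPUT_DIR, name))    # move the file into the input folder
```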
Step 7: Run the following command to translate the demo inputs.
The input_image_path can be either a single image or a directory containing images.
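The command below follows the inference example in the public TwinGAN README; the script name and flag values are taken from that README and should be treated as assumptions if your checkout differs:

```python
# Inference command as in the public TwinGAN README (flag names are assumptions
# if your version of the repository differs). Runs the frozen model over INPUT_DIR.
!python inference/image_translation_infer.py \
    --model_path="{MODEL_PATH}" \
    --image_hw=256 \
    --input_tensor_name="sources_ph" \
    --output_tensor_name="custom_generated_t_style_source:0" \
    --input_image_path="{INPUT_DIR}" \
    --output_image_path="{OUTPUT_DIR}"
```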
Step 8: Now, in Python, import the image library.
Then use Image.open() to display the resulting image.
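A minimal sketch, assuming the inference script wrote its results into `OUTPUT_DIR`:

```python
from PIL import Image
from IPython.display import display

# Open and show the first translated image produced by the inference script.
result_name = sorted(os.listdir(OUTPUT_DIR))[0]
result = Image.open(os.path.join(OUTPUT_DIR, result_name))
display(result)
```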
Conclusion
To sum up, this is a great start, but some more work needs to be done to improve the performance of this budding approach.