Super-resolution for natural images is a classic and difficult problem in the field of image and video processing.
Super Resolution
Super-resolution is the process of upscaling and/or improving the details within an image.
Often a low-resolution image is taken as input and upscaled to a higher-resolution output.
The details in the high-resolution output are filled in where they are essentially unknown from the input alone.
Below is an example of a low-resolution image with super-resolution applied to improve it:
Left: the low-resolution image. Right: super-resolution of the low-resolution image using the model trained here.
In the set of images below there are five images:
The lower resolution input image to be upscaled
The input image upscaled by nearest-neighbor interpolation
The input image upscaled by bilinear interpolation, which is what your Internet browser would typically use
The input image upscaled and improved by this model’s prediction
The target image or ground truth, which was downscaled to create the lower resolution input.
Image repair and inpainting
Models that are trained for super-resolution should also be useful for repairing defects in an image (JPEG compression artifacts, tears, folds, and other damage).
Image inpainting is the process of retouching an image to remove unwanted elements in the image, such as a wire fence.
For training, it is common to cut out sections of the image and train the model to replace the missing parts based on prior knowledge of what should be there.
Image inpainting is usually a very slow process when carried out manually by a skilled human.
GANs for Super-resolution
Most deep learning-based super-resolution models are trained using Generative Adversarial Networks (GANs).
One of the limitations of GANs is that they are effectively a lazy approach: their loss function, the critic, is trained as part of the process rather than specifically engineered for the task. This could be one of the reasons many models are only good at super-resolution and not at image repair.
Universal application
Many deep learning super-resolution methods can't be applied universally to all types of images, and almost all have their weaknesses. For example, a model trained for the super-resolution of animals may not be good for the super-resolution of human faces.
The model trained with the methods detailed here seemed to perform well across varied datasets, including human features, indicating that a universal model effective at upscaling any category of image may be possible.
Let's see an example:
Examples of X2 super-resolution (doubling the image size) from the same model trained on the Div2K dataset, 800 high-resolution images of a variety of subject matter categories.
From a model trained on varied categories of image. The model has added detail to the trees, the roof, and the building windows. These are impressive results.
From a model trained on varied categories of image. While training models on different datasets, I found that human faces gave the least pleasing results; however, the model here, trained on varied categories of images, has managed to improve the details in the face. Look at the detail added to the hair; this is very impressive.
This model’s predictions after performing super-resolution
Both the images above were improvements made on validation image sets during or at the end of the training.
The trained model has been used to create upscaled images of over 1 megapixel. Below are some examples:
In this example, a 512-pixel image saved at low JPEG quality (30) is fed into the model, which upscales it to a 1024-pixel square image, performing X2 super-resolution on a lower-quality source image. Here, I believe the model’s prediction looks better than the target ground-truth image, which is amazing:
In very basic terms, this model:
Takes in an image as an input
Passes it through a trained mathematical function which is a type of neural network
Outputs an image of the same size or larger that is an improvement over the input.
TecoGAN
TecoGAN (Temporally Coherent GAN) is a GAN-based model for video super-resolution. Let's start with the code.
Code
Step 1: First we need to clone the TecoGAN repository.
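A minimal sketch of this step, assuming the official repository at github.com/thunil/TecoGAN:
# Clone the TecoGAN repository and move into it
git clone https://github.com/thunil/TecoGAN.git
cd TecoGAN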
Step 2: Now we need to install the required dependencies.
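A sketch of the install, assuming the repository ships a requirements.txt (check it for the TensorFlow version it expects):
# Install the Python dependencies listed by the repository
pip3 install -r requirements.txt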
Step 3: For further explanations of the parameters, take a look at the runGan.py file.
Step 4: There are several runnable examples; they are run with python3 runGan.py 1, where the last number is the run case number.
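For example, run case 1 performs inference with the pre-trained model on the sample data:
# The trailing number selects the run case defined in runGan.py
python3 runGan.py 1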
Step 5: The training and validation dataset can be downloaded with the following commands into a chosen directory, TrainingDataPath.
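A sketch of the download, assuming the repository's dataPrepare.py script is used; the flag names and values below are illustrative assumptions, so check the ReadMe.md for the exact command:
# Illustrative only: download the training/validation videos into TrainingDataPath
# (flag names and values are assumptions; see the repository's ReadMe.md)
python3 dataPrepare.py --start_id 2000 --duration 120 --disk_path TrainingDataPath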
Step 6: Now we run these test cases one by one
Step 7: We calculate all the metrics and save them as CSV files; the images used for this evaluation should be PNG files.
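Assuming the metric evaluation is exposed as run case 2, as in the official runGan.py, this step is:
# Compare the outputs against the ground truth and save the metrics as CSV files
# (the frames being compared should be PNG images)
python3 runGan.py 2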
Step 8: In order to use VGG as a perceptual loss, we need to download it from the TensorFlow-Slim image classification model library.
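A sketch of the download, assuming the standard VGG-19 checkpoint from the TF-Slim model library is the one required (the destination folder here is an assumption):
# Download and unpack the TF-Slim VGG-19 checkpoint used for the perceptual loss
mkdir -p model
wget http://download.tensorflow.org/models/vgg_19_2016_08_28.tar.gz
tar -xvf vgg_19_2016_08_28.tar.gz -C model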
Step 9: Download the pre-trained FRVSR model. If you want to train one yourself, try run case 4 and update this path accordingly: FRVSRModel = "ex_FRVSRmm-dd-hh/model-500000"
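A sketch of this step; the download URL is a placeholder (the real link is in the repository's ReadMe.md), and the FRVSRModel path is the one quoted above:
# Placeholder URL: replace with the pre-trained FRVSR link from the ReadMe.md
wget <pre-trained-FRVSR-url> -O FRVSR.zip
unzip FRVSR.zip
# If you train your own FRVSR model instead (run case 4), point FRVSRModel in runGan.py
# at the resulting checkpoint, e.g. FRVSRModel = "ex_FRVSRmm-dd-hh/model-500000"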
Once ready, please update the parameter TrainingDataPath in runGan.py (for case 3 and case 4), and then you can start training with the downloaded data.
Step 10: Now we prepare the training folder.
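A minimal sketch, assuming the folder that TrainingDataPath points to simply needs to exist before the download and training scripts write into it (the training runs appear to create their own timestamped ex_FRVSRmm-dd-hh style output folders):
# Create the directory that TrainingDataPath will point to
mkdir -p TrainingDataPath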
Step 11: Now, for video training data, we update the TrainingDataPath according to the ReadMe.md.
The parameters for the FRVSR model and the parameters for GAN training are both set in runGan.py: the FRVSR model is trained without the adversarial (GAN) losses, while the GAN training parameters add the discriminator and the perceptual (VGG) loss. Here, the fading is disabled, and the other losses are configured alongside it.
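Assuming FRVSR pre-training is wired up as run case 4 in runGan.py, as the steps above indicate, it is launched with:
# FRVSR pre-training: run case 4 trains the recurrent generator without the GAN losses
python3 runGan.py 4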
Step 12: Press Ctrl + C to stop the current training and try to save the last model.
Step 13: Video training data; this is the same as run case 3 described above.
Step 14: This step gives the command to train a new TecoGAN model. The details and additional parameters can be found in the runGan.py file.
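Assuming TecoGAN training is run case 3, as the earlier steps indicate, the command is:
# Train a new TecoGAN model (run case 3); set TrainingDataPath and the loss parameters
# in runGan.py before launching
python3 runGan.py 3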
Conclusion
Super-resolution models trained using loss functions such as these can perform very well on a range of tasks, including:
Upscaling low-resolution images to higher resolution images
Improving the quality of an image maintaining the resolution
Removing watermarks
Removing damage from images
Removing JPEG and other compression artifacts
Images and code source: Tecogan.ipynb
Image source: Google Images