(To be updated with more pictures)
In the summer of 2017, I joined the Visual Computing group at the Institute for Infocomm Research to work on Generative Adversarial Networks (GANs) in TensorFlow. Even today, I still do not fully understand the mathematics behind GANs. Yet the idea behind them is really simple and interesting.
There are two components of GANs: the generator and the discriminator. The generator takes random noise and aims to generate realistic images that fool the discriminator. The discriminator needs to tell which images are real (from the dataset) and which are fake (from the generator). The networks are trained in a way that neither network dominates the other.
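To make the tug-of-war between the two networks concrete, here is a minimal sketch of the two losses in plain Python. It assumes the standard binary cross-entropy formulation with the non-saturating generator loss, which is not necessarily the exact objective we used; the function names `bce`, `discriminator_loss` and `generator_loss` are made up for illustration, and the inputs are simply the discriminator's "probability this image is real" outputs.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """d_real / d_fake: discriminator outputs on real and generated images.

    The discriminator wants d_real close to 1 and d_fake close to 0.
    """
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss (an assumption for this sketch).

    The generator wants the discriminator to output values close to 1
    on generated images, i.e. to be fooled.
    """
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


# A confident, correct discriminator has a low loss; the generator's loss is then high.
confident_d = discriminator_loss([0.9, 0.95], [0.1, 0.05])
fooled_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

When the discriminator cannot tell real from fake and outputs 0.5 everywhere, its loss sits at 2·ln 2, which is roughly the balance point where neither network dominates.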
Our team incorporated a classifier into the generator and discriminator networks. We wanted the networks to identify which class a given image belonged to. The second aim was to reduce the proportion of labelled data required, since labels can be expensive to obtain in the real world.
To achieve this, we implemented a five-layer ResNet with some tricks described in this paper by OpenAI. The classifier and discriminator shared the first four layers as they had similar functions. My job was mainly to tune the hyperparameters for our experiments. While it was really cool to work on GANs, tuning hyperparameters was annoying for the following reasons:
GANs are sensitive and most of the time there isn’t a pattern. For example, I found that setting high objectives (90%) for the classifier, generator and discriminator generally helped to improve the accuracy. Yet my colleague Emily set the objectives to 70% and achieved comparable results.
I had no clue why a certain set of hyperparameters worked particularly well and why another set worked badly. For this reason, I felt research in deep learning was more like cooking than an intellectual challenge.
Fortunately, after two months of training, we were able to achieve an accuracy of 72.6% with 10% labelled data. Furthermore, our generator was able to generate quality images, especially for the car and horse classes (or are human beings just better at recognising cars and horses than frogs?).
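For readers wondering what "10% labelled data" looks like in practice, here is a rough sketch of one way to hide most of the labels in a dataset. It assumes the labelled subset is drawn evenly from each class, which may not match what we actually did; `pick_labelled_subset` and `mask_labels` are hypothetical helper names, and the toy labels are made up.

```python
import random
from collections import defaultdict


def pick_labelled_subset(labels, fraction=0.1, seed=0):
    """Return the set of indices whose labels are kept.

    Assumption: sample the same fraction from every class so that
    no class ends up without labelled examples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    keep = set()
    for indices in by_class.values():
        n_keep = max(1, round(len(indices) * fraction))
        keep.update(rng.sample(indices, n_keep))
    return keep


def mask_labels(labels, keep):
    """Replace every label outside `keep` with None (treated as unlabelled)."""
    return [label if i in keep else None for i, label in enumerate(labels)]


# Toy dataset: 50 images per class, labels only.
toy_labels = ["car"] * 50 + ["horse"] * 50 + ["frog"] * 50
kept = pick_labelled_subset(toy_labels, fraction=0.1)
masked = mask_labels(toy_labels, kept)
```

The unlabelled images still go through the discriminator as "real" examples; only the classifier's supervised loss is restricted to the kept indices.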
Here are some resources about GANs which I found useful: