Keshigeyan Chandrasegaran /
Ngoc‑Trung Tran /
Ngai‑Man Cheung
Singapore University of Technology and Design (SUTD)
CVPR 2021 (Oral)
CNN-based generative modelling has evolved to produce synthetic images indistinguishable from real images in the RGB pixel space. Recent works have observed that CNN-generated images share a systematic shortcoming in replicating high frequency Fourier spectrum decay attributes. Furthermore, these works have successfully exploited this systematic shortcoming to detect CNN-generated images, reporting up to 99% accuracy across multiple state-of-the-art GAN models. In this work, we investigate the validity of assertions claiming that CNN-generated images are unable to achieve high frequency spectral decay consistency. We meticulously construct a counterexample space of high frequency spectral decay consistent CNN-generated images emerging from our handcrafted experiments using DCGAN, LSGAN, WGAN-GP and StarGAN, where we empirically show that this frequency discrepancy can be avoided by a minor architecture change in the last upsampling operation. We subsequently use images from this counterexample space to successfully bypass the recently proposed forensics detector which leverages high frequency Fourier spectrum decay attributes for CNN-generated image detection. Through this study, we show that high frequency Fourier spectrum decay discrepancies are not inherent characteristics of existing CNN-based generative models (contrary to the belief of some existing work), and that such features are not robust enough for synthetic image detection. Our results prompt re-thinking of using high frequency Fourier spectrum decay attributes for CNN-generated image detection.
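The high frequency decay referred to above is commonly inspected through the azimuthally averaged power spectrum of an image. The following is a minimal NumPy sketch of that measurement, not the authors' code; the function name `radial_power_spectrum` and the integer ring binning are assumptions made for illustration.

```python
import numpy as np

def radial_power_spectrum(image):
    """Azimuthally averaged power spectrum of a 2-D grayscale image.

    Hypothetical helper: returns the mean spectral power in integer-radius
    rings around the zero frequency, up to the Nyquist radius.
    """
    image = np.asarray(image, dtype=np.float64)
    # Centered 2-D power spectrum (zero frequency moved to the middle).
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    h, w = power.shape
    y, x = np.indices((h, w))
    # Integer radial distance of every frequency bin from the center.
    radius = np.hypot(y - h // 2, x - w // 2).astype(int)
    # Average the power inside each ring.
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    profile = sums / np.maximum(counts, 1)
    # Keep radii from 0 (DC) up to the Nyquist radius.
    return profile[: min(h, w) // 2 + 1]

# Sanity check on a constant image: all power sits at zero frequency.
flat_profile = radial_power_spectrum(np.ones((8, 8)))
```

Plotting the tail of such a profile for real and generated images is one way to visualize the decay behaviour discussed here.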
In this study, we investigated the validity of contemporary beliefs that CNN-based generative models are unable to reproduce high frequency decay attributes of real images. We employed a systematic study to design counterexamples that challenge these beliefs. With the maximum frequency bounded by the spatial resolution, and Fourier discrepancies reported at the highest frequencies, we hypothesized that the last upsampling operation is mostly related to this shortcoming. With carefully designed experiments spanning multiple GAN architectures, loss functions, datasets and resolutions, we observe that high frequency spectral decay discrepancies can be avoided by replacing the zero insertion based scaling used by transpose convolutions with nearest or bilinear interpolation at the last step. Note that we do not claim that modifying the last feature map scaling method will always fix spectral decay discrepancies in every situation; rather, the goal of our study is to provide counterexamples to argue that high frequency spectral decay discrepancies are not inherent characteristics of CNN-generated images. Further, we easily bypass the recently proposed synthetic image detector that exploits this discrepancy information to detect CNN-generated images, indicating that such features are not robust for the purposes of synthetic image detection. In the Supplementary material, we provide more GAN models with no high frequency decay discrepancies. We also investigate whether such high frequency decay discrepancies are found in other types of computational image synthesis methods (synthesis using the Unity game engine). To conclude, through this work we hope to help image forensics research manoeuvre in more plausible directions to combat CNN-synthesized visual disinformation.
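To illustrate why the last scaling step matters, the sketch below upsamples a smooth toy feature map by zero insertion and by nearest-neighbour repetition, then compares how much spectral power each result carries at high frequencies. It is a simplified illustration under stated assumptions, not the experiments of the paper: the function names, the toy feature map and the 0.25 cycles/sample threshold are all hypothetical, and the learned convolution that follows the scaling step in a real generator is deliberately left out.

```python
import numpy as np

def upsample_zero_insertion(x):
    """2x scaling by inserting zeros between samples, as done inside a
    stride-2 transpose convolution before its kernel is applied."""
    h, w = x.shape
    out = np.zeros((2 * h, 2 * w), dtype=x.dtype)
    out[::2, ::2] = x
    return out

def upsample_nearest(x):
    """2x scaling by repeating every sample along both axes."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def high_freq_fraction(img, threshold=0.25):
    """Fraction of spectral power at frequencies above `threshold`
    cycles/sample along either axis (0.5 is the Nyquist frequency)."""
    power = np.abs(np.fft.fft2(img)) ** 2
    fy = np.abs(np.fft.fftfreq(img.shape[0]))[:, None]
    fx = np.abs(np.fft.fftfreq(img.shape[1]))[None, :]
    mask = np.maximum(fy, fx) > threshold
    return power[mask].sum() / power.sum()

# Toy 16x16 feature map: a constant plus one low-frequency cosine pattern.
n = 16
i, j = np.indices((n, n))
feature = 1.0 + 0.5 * np.cos(2 * np.pi * i / n) * np.cos(2 * np.pi * j / n)

frac_zero = high_freq_fraction(upsample_zero_insertion(feature))
frac_nearest = high_freq_fraction(upsample_nearest(feature))
print(f"zero insertion: {frac_zero:.4f}, nearest: {frac_nearest:.4f}")
```

Zero insertion replicates the low-frequency spectrum around the Nyquist frequency, so for this toy map three quarters of the power lands in the high frequency band, whereas nearest-neighbour repetition acts as a small box filter that suppresses those replicas. In a trained generator the subsequent convolution may or may not attenuate the replicas, which is consistent with the caveat above that changing the last scaling method is a counterexample rather than a guaranteed fix.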
@InProceedings{Chandrasegaran_2021_CVPR,
author = {Chandrasegaran, Keshigeyan and Tran, Ngoc-Trung and Cheung, Ngai-Man},
title = {A Closer Look at Fourier Spectrum Discrepancies for CNN-Generated Images Detection},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2021},
pages = {7200-7209}}
This project was supported by SUTD project PIE-SGP-AI-2018-01. This research was also supported by the National Research Foundation Singapore under its AI Singapore Programme [Award Number: AISG-100E2018-005]. This work was also supported by ST Electronics and the National Research Foundation (NRF), Prime Minister’s Office, Singapore under Corporate Laboratory @ University Scheme (Programme Title: STEE Infosec - SUTD Corporate Laboratory). We also gratefully acknowledge the support of NVIDIA AI Technology Center (NVAITC) for our research.