DRM for Computer Vision Datasets

By New York Tech Editorial Team
January 4, 2022, in AI & Robotics

History suggests that the ‘open’ age of computer vision research, in which reproducibility and favorable peer review are central to the development of a new initiative, must eventually give way to a new era of IP protection, in which closed mechanisms and walled platforms prevent competitors from undermining high dataset development costs, or from using a costly project as a mere stepping-stone to their own (perhaps superior) version.

Currently, the growing trend towards protectionism mostly takes the form of fencing proprietary frameworks behind API access, where users send sparse tokens or requests in, and where the transformational processes that make the framework’s responses valuable remain entirely hidden.

In other cases, the final model itself may be released, but without the central information that makes it valuable, such as the pre-trained weights that may have cost millions to generate, a proprietary dataset, or the exact details of how a training subset was produced from a range of open datasets. In the case of OpenAI’s transformative natural language model GPT-3, both protection measures are currently in use, leaving the model’s imitators, such as GPT-Neo, to cobble together an approximation of the product as best they can.

Copy-Protecting Image Datasets

However, interest is growing in methods by which a ‘protected’ machine learning framework could regain some level of portability, by ensuring that only authorized users (for instance, paid users) could profitably use the system in question. This usually involves encrypting the dataset in some programmatic way, so that it is read ‘clean’ by the AI framework at training time, but is compromised or in some way unusable in any other context.
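
As a rough illustration of the general idea (and not of any specific published method), the toy snippet below ‘protects’ an image with a token-seeded perturbation that only a holder of the same token can regenerate and subtract exactly; the research discussed here replaces such naive noise with adversarial perturbations and a far more sophisticated, keyed recovery mechanism.

    import numpy as np

    def protect(image, token):
        # Token-seeded perturbation: an exactly reproducible corruption of the pixels.
        rng = np.random.default_rng(token)
        return image + rng.uniform(-0.2, 0.2, size=image.shape)

    def recover(protected, token):
        # Regenerate the identical perturbation from the token and subtract it.
        rng = np.random.default_rng(token)
        return protected - rng.uniform(-0.2, 0.2, size=protected.shape)

    image = np.random.rand(64, 64, 3)
    assert np.allclose(recover(protect(image, token=1234), token=1234), image)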

Such a system has just been proposed by researchers at the University of Science and Technology of China in Anhui and Fudan University in Shanghai. Titled Invertible Image Dataset Protection, the paper offers a pipeline that automatically adds adversarial example perturbation to an image dataset, so that it cannot usefully be used for training if pirated, but where the protection is entirely filtered out by an authorized system containing a secret token.

From the paper: a ‘valuable’ source image is rendered effectively untrainable with adversarial example techniques, with the perturbations removed systematically and entirely automatically for an ‘authorized’ user. Source: https://arxiv.org/pdf/2112.14420.pdf

The mechanism that enables the protection is called a reversible adversarial example generator (RAEG), and effectively amounts to encrypting the images’ usability for classification purposes, using reversible data hiding (RDH). The authors state:

‘The method first generates the adversarial image using existing AE methods, then embeds the adversarial perturbation into the adversarial image, and generates the stego image using RDH. Due to the characteristic of reversibility, the adversarial perturbation and the original image can be recovered.’
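
A loose, runnable sketch of that pipeline is given below, with FGSM standing in for the ‘existing AE methods’, and with the perturbation handed back alongside the image rather than hidden inside it by RDH, as the paper actually does; the helper names are illustrative, not the authors’ code.

    import torch
    import torch.nn.functional as F

    def fgsm_perturbation(model, x, labels, eps=0.03):
        # One-step adversarial perturbation (FGSM) that degrades classification of x.
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), labels).backward()
        return eps * x.grad.sign()

    def protect(model, x, labels):
        delta = fgsm_perturbation(model, x, labels)
        stego = (x + delta).clamp(0, 1)   # the distributable, classifier-hostile image
        # In the paper the perturbation is not shipped separately: RDH embeds it
        # losslessly inside the stego image itself, recoverable only with the token.
        return stego, delta

    def recover(stego, delta):
        return (stego - delta).clamp(0, 1)   # exact inversion wherever no clamping occurred

    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
    x, labels = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
    stego, delta = protect(model, x, labels)
    restored = recover(stego, delta)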

The original images from the dataset are fed into a U-shaped invertible neural network (INN) in order to produce adversarially affected images that are crafted to deceive classification systems. This means that typical feature extraction will be undermined, making it difficult to classify traits such as gender and other face-based features (though the architecture supports a range of domains, rather than just face-based material).
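
The paper’s U-shaped INN is not reproduced here, but the property it relies on can be illustrated with a minimal additive coupling block, whose forward pass (which injects a learned change into half the channels) is undone exactly by running the same block in reverse:

    import torch
    import torch.nn as nn

    class AdditiveCoupling(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(channels // 2, 64, 3, padding=1), nn.ReLU(),
                nn.Conv2d(64, channels // 2, 3, padding=1),
            )

        def forward(self, x):
            x1, x2 = x.chunk(2, dim=1)
            # Perturb one half of the channels, conditioned on the other half.
            return torch.cat([x1, x2 + self.net(x1)], dim=1)

        def inverse(self, y):
            y1, y2 = y.chunk(2, dim=1)
            # Subtract the identical conditioned term: exact reconstruction.
            return torch.cat([y1, y2 - self.net(y1)], dim=1)

    block = AdditiveCoupling(channels=4)
    x = torch.rand(1, 4, 64, 64)
    assert torch.allclose(block.inverse(block(x)), x, atol=1e-6)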

An inversion test of RAEG, where different sorts of attack are performed on the images prior to reconstruction. Attack methods include Gaussian Blur and JPEG artefacts.

Thus, if attempting to use the ‘corrupted’ or ‘encrypted’ dataset in a framework designed for GAN-based face generation, or for facial recognition purposes, the resulting model will be less effective than it would have been if it had been trained on unperturbed images.

Locking the Images

However, that’s just a side-effect of the general applicability of popular perturbation methods. In the use case envisioned, the data remains crippled except under authorized access to the target framework, since the central ‘key’ to the clean data is a secret token held within the target architecture.
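
In practice, the authorized path might resemble the dataset wrapper sketched below, in which ‘recover’ stands in for the keyed de-perturbation routine and ‘token’ for the secret held by the licensed framework; an unauthorized user, lacking both, would be training directly on the perturbed pixels.

    from torch.utils.data import Dataset

    class AuthorizedDataset(Dataset):
        # Wraps a protected (perturbed) dataset and strips the perturbation on the fly,
        # so the downstream training loop only ever sees clean images.
        def __init__(self, protected_dataset, recover, token):
            self.protected = protected_dataset   # yields (stego_image, label) pairs
            self.recover = recover               # keyed de-perturbation routine (assumed)
            self.token = token                   # the secret held only by authorized users

        def __len__(self):
            return len(self.protected)

        def __getitem__(self, idx):
            stego, label = self.protected[idx]
            return self.recover(stego, self.token), label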

This encryption does come at a price; the researchers characterize the loss of original image quality as ‘slight distortion’, and state ‘[The] proposed method can almost perfectly restore the original image, while the previous methods can only restore a blurry version.’

The previous methods in question are from the November 2018 paper Unauthorized AI cannot Recognize Me: Reversible Adversarial Example, a collaboration between two Chinese universities and the RIKEN Center for Advanced Intelligence Project (AIP); and Reversible Adversarial Attack based on Reversible Image Transformation, a 2019 paper also from the Chinese academic research sector.

The researchers of the new paper claim to have made notable improvements in the usability of restored images, in comparison to these prior approaches, observing that the first approach is too sensitive to intermediary interference, and too easy to circumvent, while the second causes excessive degradation of the original images at (authorized) training time, undermining the applicability of the system.

Architecture, Data, and Tests

The new system consists of a generator, an attack layer that applies perturbation, pre-trained target classifiers, and a discriminator element.
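
A hedged sketch of how those components might interact in a single optimization step follows; the loss terms and weights are illustrative assumptions rather than the paper’s published objective, and generator.inverse denotes the INN’s reverse pass.

    import torch
    import torch.nn.functional as F

    def raeg_step(generator, attack_layer, classifier, discriminator, x, labels,
                  w_rec=1.0, w_adv=0.1, w_gan=0.01):
        stego = generator(x)                                # perturbed, distributable image
        recovered = generator.inverse(attack_layer(stego))  # recovery after simulated attacks (blur, JPEG)

        # Stay visually close to the source, and restore it almost perfectly.
        loss_rec = F.l1_loss(stego, x) + F.l1_loss(recovered, x)

        # Push the pre-trained (frozen) target classifier towards misclassification.
        loss_adv = -F.cross_entropy(classifier(stego), labels)

        # Keep the stego image looking like a natural image to the discriminator.
        d_out = discriminator(stego)
        loss_gan = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))

        return w_rec * loss_rec + w_adv * loss_adv + w_gan * loss_gan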

The architecture of RAEG. Left-middle, we see the secret token ‘Iprt‘, which will allow de-perturbation of the image at training time, by identifying the perturbed features baked into the source images and discounting them.

Below are the results of a test comparison with the two prior approaches, using three datasets: CelebA-100; Caltech-101; and Mini-ImageNet.

Target classification networks for the three datasets were trained for 50 epochs with a batch size of 32, on an NVIDIA RTX 3090, over the course of a week.
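
Expressed as a minimal configuration, the reported setup amounts to the following (only the values stated above are taken from the tests; unreported details such as the choice of optimizer are left out rather than guessed):

    config = {
        "datasets": ["CelebA-100", "Caltech-101", "Mini-ImageNet"],
        "batch_size": 32,
        "epochs": 50,
        "device": "cuda:0",   # a single NVIDIA RTX 3090 in the reported runs
    }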

The authors claim that RAEG is the first work to offer an invertible neural network that can actively generate adversarial examples.

First published 4th January 2022.