IPGuard: Protecting the Intellectual Property of Deep Neural Networks via Fingerprinting the Classification Boundary
A deep neural network (DNN) classifier represents a model owner's intellectual property as training a DNN classifier often requires lots of resource. Watermarking was recently proposed to protect the intellectual property of DNN classifiers. However, watermarking suffers from a key limitation: it sacrifices the utility/accuracy of the model owner's classifier because it tampers the classifier's training or fine-tuning process. In this work, we propose IPGuard, the first method to protect intellectual property of DNN classifiers that provably incurs no accuracy loss for the classifiers. Our key observation is that a DNN classifier can be uniquely represented by its classification boundary. Based on this observation, IPGuard extracts some data points near the classification boundary of the model owner's classifier and uses them to fingerprint the classifier. A DNN classifier is said to be a pirated version of the model owner's classifier if they predict the same labels for most fingerprinting data points. IPGuard is qualitatively different from watermarking. Specifically, IPGuard extracts fingerprinting data points near the classification boundary of a classifier that is already trained, while watermarking embeds watermarks into a classifier during its training or fine-tuning process. We extensively evaluate IPGuard on CIFAR-10, CIFAR-100, and ImageNet datasets. Our results show that IPGuard can robustly identify post-processed versions of the model owner's classifier as pirated versions of the classifier, and IPGuard can identify classifiers, which are not the model owner's classifier nor its post-processed versions, as non-pirated versions of the classifier.
READ FULL TEXT