Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing
Abstract
Certified_Watermarks is the first method to provide a watermark certificate against ℓ2-norm watermark removal attacks, by leveraging randomized smoothing techniques developed for certified robustness to adversarial attacks. However, randomized smoothing struggles to certify robustness in high-dimensional spaces against ℓp-norm attacks for large p (p>2). The certified watermark method based on randomized smoothing is no exception: it fails to provide meaningful certificates in high-dimensional spaces against ℓp-norm watermark removal attacks (p>2). By leveraging mollifier theory, this paper proposes a mollifier smoothing method whose smooth classifier has a dimension-independent certified radius, addressing the certified watermark problem against ℓp-norm watermark removal attacks (1≤p≤∞) for high parameter dimension d. Based on partial differential equation (PDE) theory, an approximation of mollifier smoothing is developed to alleviate the inefficiency of sampling and prediction in randomized smoothing, as well as of numerical integration in mollifier smoothing, while maintaining the certified watermark against ℓp-norm watermark removal attacks (1≤p≤∞).
Venue
ICML 2023
Date
2023