Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing

Abstract

Certified_Watermarks is the first method to provide a watermark certificate against 𝑙2-norm watermark removal attacks, which it does by leveraging randomized smoothing techniques developed for certified robustness to adversarial attacks. However, randomized smoothing struggles to certify robustness in high-dimensional spaces against 𝑙𝑝-norm attacks with large 𝑝 (𝑝>2). Certified watermark methods built on randomized smoothing are no exception: they fail to provide meaningful certificates in high-dimensional spaces against 𝑙𝑝-norm watermark removal attacks (𝑝>2). Leveraging mollifier theory, this paper proposes a mollifier smoothing method whose smooth classifier has a dimension-independent certified radius, and applies it to the certified watermark problem against 𝑙𝑝-norm watermark removal attacks (1≤𝑝≤∞) in high parameter dimension 𝑑. Based on partial differential equation (PDE) theory, an approximation of mollifier smoothing is also developed to alleviate the inefficiency of sampling and prediction in randomized smoothing, as well as of numerical integration in mollifier smoothing, while maintaining the certified watermark against 𝑙𝑝-norm watermark removal attacks (1≤𝑝≤∞).
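
For readers unfamiliar with mollifiers, the following minimal sketch illustrates the textbook idea in one dimension: a compactly supported bump function is rescaled to width eps and convolved with a discontinuous function, here by plain numerical integration. It is an illustration under stated assumptions only, not the paper's 𝑑-dimensional smooth classifier or its PDE-based approximation; the names bump, mollify, step, and the parameters eps and n are hypothetical and chosen for this example.

```python
import numpy as np

def bump(x):
    """Standard unnormalized mollifier: exp(-1 / (1 - x^2)) for |x| < 1, else 0."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def mollify(f, x, eps=0.5, n=2001):
    """Approximate the convolution (f * phi_eps)(x) by a Riemann sum on [-eps, eps].

    phi_eps is the bump rescaled to support [-eps, eps]; the discrete weights
    are normalized to sum to 1, standing in for the unit-mass normalization.
    """
    t = np.linspace(-eps, eps, n)
    w = bump(t / eps)
    w /= w.sum()
    return float(np.sum(w * f(x - t)))

def step(z):
    """Illustrative discontinuous function: 1 where z >= 0, else 0."""
    return (np.asarray(z) >= 0).astype(float)

if __name__ == "__main__":
    # The smoothed step rises gradually from 0 to 1 across [-eps, eps].
    for x in (-1.0, -0.25, 0.0, 0.25, 1.0):
        print(x, mollify(step, x))
```

The grid-based integration used here is exactly the kind of step that becomes impractical as the dimension grows, which is the inefficiency the abstract says the PDE-based approximation is meant to alleviate.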

