Lingjuan is a senior research scientist and leader of the Privacy-Preserving Machine Learning (PPML) team at Sony AI. Prior to joining Sony AI, she spent more than six years working in academia and at industry research organizations. Lingjuan received her Ph.D. from the University of Melbourne. She received the prestigious IBM PhD Fellowship Award in 2017 and has contributed to various professional activities, including ICML, NeurIPS, AAAI, IJCAI, and others. Lingjuan’s current research interests include federated learning, AI privacy and security, fairness, and edge intelligence, among others. She has published more than 50 papers in top conferences and journals, including NeurIPS, ICML, ICLR, Nature, AAAI, and IJCAI. Her papers have won numerous awards and have been selected as oral presentations at top conferences.


“The Sony AI Privacy-Preserving Machine Learning (PPML) team conducts cutting-edge research on trustworthy AI. Our team aims to integrate more privacy-preserving and robust AI solutions across Sony products. In the long term, I hope that we can make industrial AI systems privacy-compliant and robust for social good.”


MocoSFL: enabling cross-client collaborative self-supervised learning

ICLR, 2023
Jingtao Li, Lingjuan Lyu, Daisuke Iso, Chaitali Chakrabarti*, Michael Spranger

Existing collaborative self-supervised learning (SSL) schemes are not suitable for cross-client applications because of their expensive computation and large local data requirements. To address these issues, we propose MocoSFL, a collaborative SSL framework based on Split Fe…

IDEAL: Query-Efficient Data-Free Learning from Black-Box Models

ICLR, 2023
Jie Zhang, Chen Chen, Lingjuan Lyu

Knowledge Distillation (KD) is a typical method for training a lightweight student model with the help of a well-trained teacher model. However, most KD methods require access to either the teacher's training data or model parameter, which is unrealistic. To tackle this prob…

Twofer: Tackling Continual Domain Shift with Simultaneous Domain Generalization and Adaptation

ICLR, 2023
Chenxi Liu*, Lixu Wang, Lingjuan Lyu, Chen Sun*, Xiao Wang*, Qi Zhu*

In real-world applications, deep learning models often run in non-stationary environments where the target data distribution continually shifts over time. There have been numerous domain adaptation (DA) methods in both online and offline modes to improve cross-domain adaptat…

MECTA: Memory-Economic Continual Test-Time Model Adaptation

ICLR, 2023
Junyuan Hong, Lingjuan Lyu, Jiayu Zhou*, Michael Spranger

Continual Test-time Adaptation (CTA) is a promising art to secure accuracy gains in continually-changing environments. The state-of-the-art adaptations improve out-of-distribution model accuracy via computation-efficient online test-time gradient descents but meanwhile cost …

Towards Robustness Certification Against Universal Perturbations

ICLR, 2023
Yi Zeng, Zhouxing Shi*, Ming Jin*, Feiyang Kang*, Lingjuan Lyu, Cho-Jui Hsieh*, Ruoxi Jia*

In this paper, we investigate the problem of certifying neural network robustness against universal perturbations (UPs), which have been widely used in universal adversarial attacks and backdoor attacks. Existing robustness certification methods aim to provide robustness gua…

Minimum Topology Attacks for Graph Neural Networks

WWW, 2023
Mengmei Zhang*, Xiao Wang*, Chuan Shi*, Lingjuan Lyu, Tianchi Yang*, Junping Du*

With the great popularity of Graph Neural Networks (GNNs), their robustness to adversarial topology attacks has received increasing attention. Although many attack methods have been proposed, they mainly focus on fixed-budget attacks, aiming at finding the most adversarial p…

Delving into the Adversarial Robustness of Federated Learning

AAAI, 2023
Zijie Zhang*, Bo Li*, Chen Chen, Lingjuan Lyu, Shuang Wu*, Shouhong Ding*, Chao Wu*

In Federated Learning (FL), models are as fragile as centrally trained models against adversarial examples. However, the adversarial robustness of federated learning remains largely unexplored. This paper casts light on the challenge of adversarial robustness of federated le…

Defending Against Backdoor Attacks in Natural Language Generation

AAAI, 2023
Xiaofei Sun*, Xiaoya Li*, Yuxian Meng*, Xiang Ao*, Lingjuan Lyu, Jiwei Li*, Tianwei Zhang*

The frustratingly fragile nature of neural network models makes current natural language generation (NLG) systems prone to backdoor attacks and to generating malicious sequences that could be sexist or offensive. Unfortunately, little effort has been invested in how backdoor attac…

MocoSFL: enabling cross-client collaborative self-supervised learning

NeurIPS, 2022
Jingtao Li, Lingjuan Lyu, Daisuke Iso, Chaitali Chakrabarti*, Michael Spranger

Existing collaborative self-supervised learning (SSL) schemes are not suitable for cross-client applications because of their expensive computation and large local data requirements. To address these issues, we propose MocoSFL, a collaborative SSL framework based on Split Fe…

Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling

NeurIPS, 2022
Junyuan Hong, Lingjuan Lyu, Jiayu Zhou*, Michael Spranger

As deep learning blooms with growing demand for computation and data resources, outsourcing model training to a powerful cloud server becomes an attractive alternative to training at a low-power and cost-effective end device. Traditional outsourcing requires uploading device…

Calibrated Federated Adversarial Training with Label Skewness

NeurIPS, 2022
Chen Chen, Yuchen Liu*, Xingjun Ma*, Lingjuan Lyu

Recent studies have shown that, like traditional machine learning, federated learning (FL) is also vulnerable to adversarial attacks. To improve the adversarial robustness of FL, few federated adversarial training (FAT) methods have been proposed to apply adversarial training…

DENSE: Data-Free One-Shot Federated Learning

NeurIPS, 2022
Jie Zhang, Chen Chen, Bo Li*, Lingjuan Lyu, Shuang Wu*, Shouhong Ding*, Chunhua Shen*, Chao Wu*

One-shot Federated Learning (FL) has recently emerged as a promising approach, which allows the central server to learn a model in a single communication round. Despite the low communication cost, existing one-shot FL methods are mostly impractical or face inherent limitatio…

CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks

NeurIPS, 2022
Xuanli He*, Qiongkai Xu*, Yi Zeng, Lingjuan Lyu, Fangzhao Wu*, Jiwei Li*, Ruoxi Jia*

Previous works have validated that text generation APIs can be stolen through imitation attacks, causing IP violations. In order to protect the IP of text generation APIs, a recent work has introduced a watermarking algorithm and utilized the null-hypothesis test as a post-h…

Prompt Certified Machine Unlearning with Randomized Gradient Smoothing and Quantization

NeurIPS, 2022
Zijie Zhang*, Xin Zhao*, Tianshi Che*, Yang Zhou*, Lingjuan Lyu

The right to be forgotten calls for efficient machine unlearning techniques that make trained machine learning models forget a cohort of data. The combination of training and unlearning operations in traditional machine unlearning methods often leads to the expensive computa…

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

NeurIPS, 2022
Tao Qi*, Fangzhao Wu*, Chuhan Wu*, Lingjuan Lyu, Tong Xu*, Hao Liao*, Zhongliang Yang*, Yongfeng Huang*, Xing Xie*

Vertical federated learning (VFL) is a privacy-preserving machine learning paradigm that can learn models from features distributed on different platforms in a privacy-preserving way. Since in real-world applications the data may contain bias on fairness-sensitive features (…

FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation

ICDM, 2022
Ziqing Fan*, Yanfeng Wang*, Jiangchao Yao*, Lingjuan Lyu, Ya Zhang*, Qi Tian*

The statistical heterogeneity of the non-independent and identically distributed (non-IID) data in local clients significantly limits the performance of federated learning. Previous attempts like FedProx, SCAFFOLD, MOON, FedNova and FedDyn resort to an optimization perspecti…

Privacy and Robustness in Federated Learning: Attacks and Defenses

TNNLS, 2022
Lingjuan Lyu, Han Yu*, Xingjun Ma*, Chen Chen, Lichao Sun*, Jun Zhao*, Qiang Yang*, Philip S. Yu*

As data are increasingly being stored in different silos and societies become more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has …

Cross-Network Social User Embedding with Hybrid Differential Privacy Guarantees

CIKM, 2022
Jiaqian Ren*, Lei Jiang*, Hao Peng*, Lingjuan Lyu, Zhiwei Liu*, Chaochao Chen*, Jia Wu*, Xu Bai*, Philip S. Yu*

Integrating multiple online social networks (OSNs) has important implications for many downstream social mining tasks, such as user preference modelling, recommendation, and link prediction. However, it is unfortunately accompanied by growing privacy concerns about leaking s…

Beyond Model Extraction: Imitation Attack for Black-Box NLP APIs

COLING, 2022
Qiongkai Xu*, Xuanli He*, Lingjuan Lyu, Lizhen Qu*, Gholamreza Haffari*

Machine-learning-as-a-service (MLaaS) has attracted millions of users to their splendid large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works have demonstrate…

Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models

EMNLP, 2022
Zhiyuan Zhang*, Lingjuan Lyu, Xingjun Ma*, Chenguang Wang*, Xu Sun*

Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks. In Natural Language Processing (NLP), DNNs are often backdoored during the fine-tuning process of a large-scale Pre-trained Language Model (PLM) with poisoned samples. Although the clean weights of P…

Extracted BERT Model Leaks More Information than You Think!

EMNLP, 2022
Xuanli He*, Chen Chen, Lingjuan Lyu, Qiongkai Xu*

The collection and availability of big data, combined with advances in pre-trained models (e.g. BERT), have revolutionized the predictive performance of natural language processing tasks. This allows corporations to provide machine learning as a service (MLaaS) by encapsulat…

Privacy for Free: How does Dataset Condensation Help Privacy?

ICML, 2022
Tian Dong, Bo Zhao*, Lingjuan Lyu

To prevent unintentional data leakage, the research community has resorted to data generators that can produce differentially private data for model training. However, for the sake of data privacy, existing solutions suffer from either expensive training cost or poor general…

Accelerated Federated Learning with Decoupled Adaptive Optimization

ICML, 2022
Jiayin Jin*, Jiaxiang Ren*, Yang Zhou*, Lingjuan Lyu, Ji Liu*, Dejing Dou*

The federated learning (FL) framework enables edge clients to collaboratively learn a shared inference model while keeping privacy of training data on clients. Recently, many heuristics efforts have been made to generalize centralized adaptive optimization methods, such as S…

A Federated Graph Neural Network Framework for Privacy-Preserving Personalization

Nature Communications, 2022
Yongfeng Huang*, Chuhan Wu*, Fangzhao Wu*, Lingjuan Lyu, Tao Qi*, Xing Xie*

Graph neural network (GNN) is effective in modeling high-order interactions and has been widely used in various personalized applications such as recommendation. However, mainstream personalization methods rely on centralized GNN learning on global graphs, which have conside…

Heterogeneous Graph Node Classification with Multi-Hops Relation Features

ICASSP, 2022
Xiaolong Xu*, Lingjuan Lyu, Hong Jin*, Weiqiang Wang*, Shuo Jia*

In recent years, knowledge graph (KG) has obtained many achievements in both research and industrial fields. However, most KG algorithms consider node embedding with only structure and node features, but not relation features. In this paper, we propose a novel Heterogeneous …

How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data

ICLR, 2022
Zhiyuan Zhang*, Lingjuan Lyu, Weiqiang Wang*, Lichao Sun*, Xu Sun*

Since training a large-scale backdoored model from scratch requires a large training dataset, several recent attacks have considered to inject backdoors into a trained clean model without altering model behaviors on the clean data. Previous work finds that backdoors can be i…

Vertically Federated Graph Neural Network for Privacy-Preserving Node Classification

IJCAI, 2022
Chaochao Chen*, Longfei Zheng*, Huiwen Wu*, Lingjuan Lyu, Jun Zhou*, Jia Wu*, Bingzhe Wu*, Ziqi Liu*, Li Wang*, Xiaolin Zheng*

Graph Neural Network (GNN) has achieved remarkable progresses in various real-world tasks on graph data. High-performance GNN models always depend on both rich features and complete edge information in graph. However, such information could possibly be isolated by different …

Data-Free Adversarial Knowledge Distillation for Graph Neural Networks

IJCAI, 2022
Yuanxin Zhuang*, Lingjuan Lyu, Chuan Shi*, Carl Yang*, Lichao Sun*

Graph neural networks (GNNs) have been widely used in modeling graph structured data, owing to their impressive performance in a wide range of practical applications. Recently, knowledge distillation (KD) for GNNs has enabled remarkable progress in graph model compression and …

Decision Boundary-aware Data Augmentation for Adversarial Training

TDSC, 2022
Chen Chen, Jingfeng Zhang*, Xilie Xu*, Lingjuan Lyu, Chaochao Chen*, Tianlei Hu*, Gang Chen*

Adversarial training (AT) is a typical method to learn adversarially robust deep neural networks via training on the adversarial variants generated by their natural examples. However, as training progresses, the training data becomes less attackable, which may undermine the …

Communication-Efficient Federated Learning via Knowledge Distillation

Nature Communications, 2022
Yongfeng Huang*, Chuhan Wu*, Fangzhao Wu*, Lingjuan Lyu, Xing Xie*

Federated learning is a privacy-preserving machine learning technique to train intelligent models from decentralized data, which enables exploiting private data by communicating local model updates in each iteration of model learning rather than the raw data. However, model …

Practical Attribute Reconstruction Attack Against Federated Learning

IEEE Transactions on Big Data, 2022
Chen Chen, Lingjuan Lyu, Han Yu*, Gang Chen*

Existing federated learning (FL) designs have been shown to exhibit vulnerabilities which can be exploited by adversaries to compromise data privacy. However, most current works conduct attacks by leveraging gradients calculated on a small batch of data. This setting is not …

Traffic Anomaly Prediction Based on Joint Static-Dynamic Spatio-Temporal Evolutionary Learning

TKDE, 2022
Xiaoming Liu*, Zhanwei Zhang*, Lingjuan Lyu, Zhaohan Zhang*, Shuai Xiao*, Chao Shen*, Philip Yu*

Accurate traffic anomaly prediction offers an opportunity to save the wounded at the right location in time. However, the complex process of traffic anomaly is affected by both various static factors and dynamic interactions. The recent evolving representation learning provi…

FedCTR: Federated Native Ad CTR Prediction with Cross Platform User Behavior Data

ACM TIST, 2022
Chuhan Wu*, Fangzhao Wu*, Lingjuan Lyu, Yongfeng Huang*, Xing Xie*

Native ad is a popular type of online advertisement which has similar forms with the native content displayed on websites. Native ad CTR prediction is useful for improving user experience and platform revenue. However, it is challenging due to the lack of explicit user inten…

FedBERT: When Federated Learning Meets Pre-Training

ACM TIST, 2022
Yuanyishu Tian*, Yao Wan*, Lingjuan Lyu, Dezhong Yao*, Hai Jin*, Lichao Sun*

The fast growth of pre-trained models (PTMs) has brought natural language processing to a new era, which becomes a dominant technique for various natural language processing (NLP) applications. Every user can download weights of PTMs, then fine-tune the weights on a task on …

Byzantine-resilient Federated Learning via Gradient Memorization

AAAI, 2022
Chen Chen, Lingjuan Lyu, Yuchen Liu*, Fangzhao Wu*, Chaochao Chen*, Gang Chen*

Federated learning (FL) provides a privacy-aware learning framework by enabling a multitude of participants to jointly construct models without collecting their private training data. However, federated learning has exhibited vulnerabilities to Byzantine attacks. Many existi…

GEAR: A Margin-based Federated Adversarial Training Approach

AAAI, 2022
Chen Chen, Jie Zhang, Lingjuan Lyu

Previous studies have shown that federated learning (FL) is vulnerable to well-crafted adversarial examples. Some recent efforts tried to combine adversarial training with FL, i.e., federated adversarial training (FAT), in order to achieve adversarial robustness in FL. Howev…

Differential Private Knowledge Transfer for Privacy-Preserving Cross-Domain Recommendation

WWW, 2022
Chaochao Chen*, Huiwen Wu*, Jiajie Su*, Lingjuan Lyu, Xiaolin Zheng*, Li Wang*

Cross Domain Recommendation (CDR) has been popularly studied to alleviate the cold-start and data sparsity problem commonly existed in recommender systems. CDR models can improve the recommendation performance of a target domain by leveraging the data of other source domains…

DADFNet: Dual Attention and Dual Frequency-Guided Dehazing Network for Video-Empowered Intelligent Transportation

AAAI, 2022
Yu Guo*, Wen Liu*, Jiangtian Nie*, Lingjuan Lyu, Zehui Xiong*, Jiawen Kang*, Han Yu*, Dusit Niyato*

Visual surveillance technology is an indispensable functional component of advanced traffic management systems. It has been applied to perform traffic supervision tasks, such as object detection, tracking and recognition. However, adverse weather conditions, e.g., fog, haze …

Protecting Intellectual Property of Language Generation APIs with Lexical Watermark

AAAI, 2022
Xuanli He*, Qiongkai Xu*, Lingjuan Lyu, Fangzhao Wu*, Chenguang Wang*

Nowadays, due to the breakthrough in natural language generation (NLG), including machine translation, document summarization, image captioning, etc., NLG models have been encapsulated in cloud APIs to serve over half a billion people worldwide and process over one hundred bil…

Exploiting Data Sparsity in Secure Cross-Platform Social Recommendation

NeurIPS, 2021
Jamie Cui*, Chaochao Chen*, Lingjuan Lyu, Carl Yang*, Li Wang*

Social recommendation has shown promising improvements over traditional systems since it leverages social correlation data as an additional input. Most existing works assume that all data are available to the recommendation platform. However, in practice, user-item interacti…

Anti-Backdoor Learning: Training Clean Models on Poisoned Data

NeurIPS, 2021
Yige Li*, Xixiang Lyu*, Nodens Koren*, Lingjuan Lyu, Bo Li*, Xingjun Ma*

Backdoor attack has emerged as a major security threat to deep neural networks (DNNs). While existing defense methods have demonstrated promising results on detecting and erasing backdoor triggers, it is still not clear if measures can be taken to avoid the triggers from bein…

Gradient Driven Rewards to Guarantee Fairness in Collaborative Machine Learning

NeurIPS, 2021
Xu Xinyi*, Lingjuan Lyu, Xingjun Ma*, Chenglin Miao*, Chuan-Sheng Foo*, Bryan Kian Hsiang Low*

Collaborative machine learning provides a promising framework for different agents to pool their resources (e.g., data) for a common learning task. In realistic settings where agents are self-interested and not altruistic, they may be unwilling to share data or model without…

Data Poisoning Attacks on Federated Machine Learning

IEEE IoT-J, 2021
Gan Sun*, Yang Cong*, Jiahua Dong*, Qiang Wang*, Lingjuan Lyu, Ji Liu*

Federated machine learning which enables resource-constrained node devices (e.g., Internet of Things (IoT) devices, smartphones) to establish a knowledge-shared model while keeping the raw data local, could provide privacy preservation and economic benefit by designing an ef…

Joint Stance and Rumor Detection in Hierarchical Heterogeneous Graph

Chen Li*, Hao Peng*, Jianxin Li*, Lichao Sun*, Lingjuan Lyu, Lihong Wang*, Philip Yu*, Lifang He*

Recently, large volumes of false or unverified information (e.g., fake news and rumors) appear frequently in emerging social media, which are often discussed on a large scale and widely disseminated, causing bad consequences. Many studies on rumor detection indicate that the…

FLEAM: A Federated Learning Empowered Architecture to Mitigate DDoS in Industrial IoT

IEEE TII, 2021
Jianhua Li*, Lingjuan Lyu, Ximeng Liu*, Xuyun Zhang*, Xixiang Lyu*

A Novel Attribute Reconstruction Attack in Federated Learning

IJCAI, 2021
Lingjuan Lyu, Chen Chen

Federated learning (FL) emerged as a promising learning paradigm to enable a multitude of participants to construct a joint ML model without exposing their private training data. Existing FL designs have been shown to exhibit vulnerabilities which can be exploited by adv…


November 29, 2021 | Sony AI

Meet the Team #2: Lingjuan, Jerone and Roberto

What do privacy, pattern recognition, and percussion all have in common? They are concepts and creative endeavors that have inspired Sony AI team members Lingjuan, Jerone and Roberto. Read on to learn more about these three Sony…


