Building a federated NIDS that withstands sophisticated backdoor attacks in Non-IID, privacy-constrained network environments: introducing SENTINEL, a novel defence combining multi-signal anomaly filtering with coordinate-wise trimmed median aggregation.
Network Intrusion Detection Systems (NIDS) are a critical line of defence against cyberattacks. Federated Learning (FL) enables collaborative model training across distributed organisations without sharing raw network traffic — making it an attractive paradigm for privacy-preserving NIDS.
However, FL is inherently vulnerable to backdoor attacks, where malicious clients poison the global model to misclassify specific traffic patterns as benign. Our experiments demonstrate that a model achieving 93% main task accuracy can simultaneously suffer a devastating 99.95% Attack Success Rate (ASR), proving that accuracy alone is a dangerously insufficient security metric.
This research investigates how sophisticated backdoor attacks evolve to evade state-of-the-art defences (FLAME, Multi-Krum) and proposes SENTINEL — a novel backdoor-resilient aggregation algorithm combining multi-signal anomaly filtering (L2 norm + Sybil similarity) with coordinate-wise trimmed median aggregation and optional Differential Privacy.
A backdoor-resilient federated aggregation algorithm that fuses multiple anomaly detection signals to identify and filter malicious clients before performing robust coordinate-wise trimmed median aggregation.
Compute per-client model weight deltas (update − global model).
Calculate L2 norm anomaly score and Sybil similarity score (cosine similarity between clients). Normalize using Median Absolute Deviation (MAD).
Combine signals into a unified anomaly score. Sort clients, reject outliers using IQR threshold. Ensure minimum benign clients remain.
Stack deltas per coordinate, sort and trim top/bottom f values, compute coordinate-wise trimmed median, update global model.
Add calibrated Gaussian noise to the aggregated update to provide formal privacy guarantees.
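The five steps above might be sketched as follows. This is a minimal illustration, not the project's actual implementation; the function name `sentinel_aggregate` and the parameters `trim_f`, `iqr_k`, `min_benign`, and `dp_sigma` are all assumptions made for the sketch:

```python
import numpy as np

def sentinel_aggregate(global_w, client_ws, trim_f=2, iqr_k=1.5,
                       min_benign=3, dp_sigma=0.0, rng=None):
    """Sketch of the SENTINEL pipeline on flat weight vectors.

    global_w  : np.ndarray, current global model weights (flattened)
    client_ws : list of np.ndarray, per-client model updates (flattened)
    """
    # Step 1: per-client deltas (update - global model)
    deltas = np.stack([w - global_w for w in client_ws])

    # MAD normalisation helper for anomaly signals
    def mad_z(x):
        med = np.median(x)
        mad = np.median(np.abs(x - med)) + 1e-12
        return np.abs(x - med) / mad

    # Step 2a: L2-norm anomaly score
    norms = np.linalg.norm(deltas, axis=1)
    norm_score = mad_z(norms)

    # Step 2b: Sybil similarity score (mean pairwise cosine similarity)
    unit = deltas / (norms[:, None] + 1e-12)
    cos = unit @ unit.T
    np.fill_diagonal(cos, 0.0)
    sybil_score = mad_z(cos.sum(axis=1) / (len(deltas) - 1))

    # Step 3: fuse signals, reject IQR outliers, keep a minimum of clients
    score = norm_score + sybil_score
    q1, q3 = np.percentile(score, [25, 75])
    keep = np.where(score <= q3 + iqr_k * (q3 - q1))[0]
    if len(keep) < min_benign:
        keep = np.argsort(score)[:min_benign]   # fall back to lowest scores

    # Step 4: coordinate-wise trimmed median over surviving deltas
    kept = np.sort(deltas[keep], axis=0)
    f = min(trim_f, (len(keep) - 1) // 2)
    trimmed = kept[f:len(keep) - f] if f > 0 else kept
    agg = np.median(trimmed, axis=0)

    # Step 5 (optional DP): calibrated Gaussian noise on the aggregate
    if dp_sigma > 0:
        agg = agg + (rng or np.random.default_rng()).normal(0, dp_sigma, agg.shape)
    return global_w + agg
```

With six identical benign updates and one wildly scaled malicious one, the malicious client's norm score dominates, it is rejected by the IQR filter, and the trimmed median recovers the benign direction.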
We study three generations of backdoor attacks — each designed to evade the defences that defeated the previous generation.
Train a backdoored model locally, compute the model update, and scale it by a large factor (λ) so it replaces the global model in a single round. Easily detected by its abnormally large update norm.
"Ninja logic" — the attacker measures the magnitude of honest updates and caps the malicious update norm at the median of honest clients using L2 projection. Bypasses norm-based filters while maintaining high ASR.
The most sophisticated threat model. Uses Shadow Training to simulate clean update directions, then applies a Gradient Alignment Penalty in the loss function — forcing malicious updates to mimic benign ones. Bypasses FLAME (SOTA).
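The first two generations can be sketched in a few lines; `scale_attack` and `norm_stealth` are hypothetical names, and λ = 10 is an arbitrary illustrative factor:

```python
import numpy as np

# Phase 01: naive model replacement. Scale the backdoored delta by a
# large factor λ so it dominates the average in one round. Easily
# flagged: the update's L2 norm is λ× larger than honest updates.
def scale_attack(mal_delta, lam=10.0):
    return lam * mal_delta

# Phase 02 "ninja logic": cap the malicious norm at the median honest
# norm via L2 projection, so norm-based filters see nothing unusual.
def norm_stealth(mal_delta, honest_norms):
    cap = np.median(honest_norms)
    n = np.linalg.norm(mal_delta)
    return mal_delta if n <= cap else mal_delta * (cap / n)
```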
Phase 03 introduces PFedBA (Proxy Federated Backdoor Attack) — the most sophisticated attacker in our evaluation, combining Shadow Training and Gradient Alignment to bypass FLAME and other SOTA defences.
The attacker maintains a shadow model trained on a proxy dataset that approximates the global data distribution. This lets them predict clean update directions — disguising the malicious update to mimic benign gradient behaviour.
A custom loss term minimises cosine distance between the malicious and simulated clean gradient — forcing the malicious update to align directionally with honest clients, evading cosine-similarity filters like FLAME.
The attacker caps their update's L2 norm to match the median of honest clients. Combined with gradient alignment, PFedBA updates are statistically indistinguishable from benign ones in both magnitude and direction.
FLAME uses HDBSCAN on cosine similarities to detect malicious clients. PFedBA's gradient alignment forces the attacker into the same cluster as honest clients, completely neutralising FLAME. Under Non-IID (α=0.1), FLAME reaches 90.3% ASR.
PFedBA attack pipeline: Shadow Training → Gradient Alignment → Norm Scaling → Aggregation Evasion
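The Gradient Alignment step of the pipeline might look like the following sketch, which assumes flattened gradient vectors; the function name `alignment_penalty` and the weighting β are illustrative, not taken from the paper:

```python
import numpy as np

def alignment_penalty(mal_grad, clean_grad, beta=1.0):
    """PFedBA-style alignment term (sketch): beta * (1 - cos θ) between
    the malicious gradient and the shadow model's simulated clean
    gradient. Adding this to the attacker's loss and minimising it
    forces the poisoned update to point the same way as honest ones,
    which is what defeats cosine-similarity filters such as FLAME."""
    cos = mal_grad @ clean_grad / (
        np.linalg.norm(mal_grad) * np.linalg.norm(clean_grad) + 1e-12)
    return beta * (1.0 - cos)
```

The penalty is zero when the malicious gradient already points along the clean direction and grows as the two diverge, so gradient descent on the combined loss trades backdoor strength against directional stealth.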
PFedBA represents the convergence of norm stealth (Phase 02) and directional stealth (Phase 03). No existing SOTA defence, including FLAME, counters both simultaneously. SENTINEL was built against this attacker, achieving ≈0% ASR (IID) and 10.5% ASR (Non-IID, α=0.5), versus FLAME's 7.7% and 60.4% respectively.
A model at 93% Main Task Accuracy still suffered 99.95% Attack Success Rate under PFedBA — shattering the assumption that accuracy implies safety.
SENTINEL reduces ASR to ≈0% in IID settings and 10.5% under Non-IID (α=0.5) — nearly a 6× reduction relative to FLAME (SOTA) at 60.4%.
At α=0.1, even SENTINEL struggles against PFedBA (ASR≈30%), exposing a critical open problem in highly heterogeneous FL-NIDS environments.
* All results on the UNSW-NB15 dataset with 10 clients (3 malicious). Lower ASR = stronger defence.
This research demonstrates a critical gap in federated NIDS security: existing state-of-the-art defences (including FLAME) are fundamentally broken against sophisticated adaptive attackers like PFedBA. Our proposed SENTINEL algorithm significantly outperforms all baselines in IID and moderate Non-IID settings, reducing ASR from 99.5% (FedAvg) to 10.5% (Non-IID α=0.5) — while preserving high main-task accuracy.
A novel backdoor-resilient federated aggregation algorithm combining L2 norm anomaly scoring, Sybil similarity detection, IQR-based filtering, and coordinate-wise trimmed median aggregation — with optional Differential Privacy.
We formally demonstrated that a model with 93% main-task accuracy can simultaneously achieve 99.95% Attack Success Rate — proving accuracy is a dangerously insufficient metric for evaluating FL-NIDS security.
A systematic evaluation of 7 aggregation strategies (FedAvg, Median, Trimmed Mean, Krum, Multi-Krum, FLAME, SENTINEL) against 3 attack generations across IID and Non-IID data distributions.
Current experiments use a single-feature trigger; real-world attacks may use multi-feature or adaptive triggers.
Tested with 10 clients. Behaviour in larger federations (100+ nodes) may differ.
At α=0.1, SENTINEL still struggles against PFedBA — a critical open challenge for highly Non-IID environments.
The current threat model uses a static trigger pattern; adaptive trigger-based attacks remain a future challenge.





Full source code, notebooks, and experiment scripts. GitHub →
This GitHub Pages site for the research project. Visit →
Faculty of Engineering, University of Peradeniya. Website →