A Context-Aware Self-Healing Security Framework for WBAN Against Sybil Attacks

Complete Research, Development & Production

Complete Research Overview

Project Summary

This project presents a context-aware, self-healing security framework for WBAN-based IoMT systems to detect and mitigate Sybil attacks in real time. It integrates a lightweight ChaCha20-Poly1305 cryptographic layer, a machine learning-based network behavior model, and an autonomous self-healing mechanism to ensure secure and continuous medical data transmission. The system is validated using a real ESP32-based hardware prototype, achieving high detection accuracy (>99.99%), low latency (2–3 seconds), and minimal energy overhead (<1%).

Research Journey: System Architecture & Framework Development

Component 1: WBAN Sensor Layer

Designed resource-constrained wearable/implantable sensor nodes for continuous physiological data acquisition (ECG, EEG signals). Implemented minimal security overhead on sensor nodes while delegating advanced processing to gateway layer.

Component 2: Gateway Layer Architecture

Implemented smartphone/embedded controller as resource-rich intermediary. Gateway monitors node communication behavior, logs network traffic features (packet rate, inter-arrival time, RSSI), and provides structured input to detection model.

Component 3: Cryptographic Verification Module

Integrated ChaCha20-Poly1305 AEAD for packet-level confidentiality, integrity, and node authentication. Deterministic nonce derivation from NODE_ID, BOOT_ID, and sequence number. Filters malicious packets at network edge.

Component 4: ML-Based Detection Model

Selected Gradient Boosting classifier for optimal balance between detection accuracy (99.9917%) and false positive minimization. Superior to Random Forest for healthcare safety-critical environments where false alarms disrupt monitoring.

Component 5: Self-Healing Layer

Autonomous quarantine mechanism with multi-window confirmation. Node isolation after multiple consecutive detections, blocking incoming/outgoing packets. Gateway-level enforcement without cryptographic key revocation.

Framework Capabilities & Achievements

Detection Accuracy

99.9917%
Gradient Boosting (Healthcare-Safe)

Cryptographic Security

ChaCha20-Poly1305
AEAD Packet Verification & Replay Protection

Self-Healing Response

2-3 Seconds
Detection Latency + Autonomous Quarantine

Feature Engineering

19 WBAN Features
Signal, Timing, Traffic, Behavioral Patterns

Framework Status

Framework Complete Five-component architecture fully integrated: WBAN sensor nodes, gateway intermediary, cryptographic verification (ChaCha20-Poly1305), ML-based detection (Gradient Boosting), and autonomous self-healing. System validated on ESP32 testbed with real physiological sensor data.

Framework Architecture Stack

Layer Technology Purpose
Cryptographic Layer ChaCha20-Poly1305 AEAD Packet verification, integrity, replay protection
Detection Layer Gradient Boosting (Scikit-learn) Binary classification with 99.9917% accuracy
Self-Healing Layer Autonomous Quarantine Engine Multi-window confirmation + gateway-level isolation
Gateway Service Flask REST API 2.3+ Real-time API endpoints for WBAN integration

Complete ML Model Pipeline: 4-Stage Evaluation

Pipeline Overview

Comprehensive 4-stage machine learning pipeline for WBAN Sybil detection: (1) Baseline validation, (2) Fast model selection, (3) Accuracy benchmarking, (4) Production deployment. This section consolidates all model development, testing, and validation steps.

Pipeline Architecture

Stage 1: Baseline → Stage 2: Fast → Stage 3: Accurate → Stage 4: Deploy

Stage 1: Data Preparation & Baseline Validation

Objectives

  • Load and explore WBAN dataset
  • Handle missing values and outliers
  • Feature engineering and normalization
  • Establish baseline with Logistic Regression
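The Stage 1 objectives above can be sketched end-to-end with scikit-learn. This is an illustrative sketch on synthetic stand-in data (the real 10,000-sample, 19-feature WBAN dataset is not reproduced here); the normalization, stratified split, and Logistic Regression baseline mirror the pipeline described in this section.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)
# Synthetic stand-in for the 19-feature WBAN window dataset:
# Sybil windows (label 1) have shifted traffic/signal statistics.
X_normal = rng.normal(0.0, 1.0, size=(500, 19))
X_sybil = rng.normal(1.5, 1.0, size=(500, 19))
X = np.vstack([X_normal, X_sybil])
y = np.array([0] * 500 + [1] * 500)

# 80/20 stratified split preserves the Normal:Sybil ratio in both sets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Min-Max scaler fitted on the training set only, then applied to test data.
scaler = MinMaxScaler().fit(X_tr)
clf = LogisticRegression(max_iter=1000).fit(scaler.transform(X_tr), y_tr)
acc = accuracy_score(y_te, clf.predict(scaler.transform(X_te)))
```

A baseline accuracy well above chance, as here, is what Stage 1 uses as evidence that the feature set is separable before investing in ensemble models.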

Key Metrics

Dataset Size

10,000+
Total WBAN Samples

Features

19
WBAN-Specific Features

Baseline Accuracy

87.89%
Logistic Regression

Data Quality

99.2%
Complete Records After Cleaning

Feature Engineering: 19 WBAN-Specific Features

Extracted 19 features from raw WBAN sensor data, organized by physical and behavioral dimensions:

Category Features Purpose
Signal Strength (RSSI) rssi_mean, rssi_std, rssi_min, rssi_max, rssi_frame_count, rssi_missing Detect identity inconsistency and signal anomalies
Timing (Inter-Arrival) iat_mean, iat_std Identify irregular transmission intervals and burst patterns
Traffic Volume pps, udp_pkt_count Detect abnormal traffic volume and flooding attacks
Sequence & Behavioral seq_gap_mean, seq_gap_max, seq_reset_rate, dup_seq_rate, out_of_order_rate, boot_change_rate Identify spoofing, duplicate packets, and sequence anomalies
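A minimal sketch of how several of these window features can be computed from raw packet records. The `Packet` record and field names are illustrative, not the project's actual capture schema; only a subset of the 19 features is shown.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class Packet:
    t: float    # arrival time (seconds)
    rssi: int   # signal strength (dBm)
    seq: int    # sequence number

def window_features(pkts: list[Packet], window_s: float) -> dict:
    """Compute an illustrative subset of the 19 window features."""
    iat = [b.t - a.t for a, b in zip(pkts, pkts[1:])]       # inter-arrival times
    gaps = [b.seq - a.seq for a, b in zip(pkts, pkts[1:])]  # sequence deltas
    return {
        "rssi_mean": mean(p.rssi for p in pkts),
        "rssi_std": pstdev(p.rssi for p in pkts),
        "iat_mean": mean(iat) if iat else 0.0,
        "pps": len(pkts) / window_s,
        "dup_seq_rate": sum(g == 0 for g in gaps) / max(len(gaps), 1),
        "out_of_order_rate": sum(g < 0 for g in gaps) / max(len(gaps), 1),
    }
```

Each detection window produces one such feature vector, which is then normalized and fed to the classifier.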

Feature Importance Ranking: Separation Strength Analysis

Feature importance determined by effect size (Cohen's d) measuring separation strength between normal and Sybil distributions:

Strongest Discriminators (Large Effect Size > 0.8)

pps (Packets Per Second)

3.0+
Highest separation strength

udp_pkt_count

2.8+
Strong traffic indicator

rssi_frame_count

2.2+
Signal consistency marker

Medium Discriminators (Medium Effect Size 0.5-0.8)

rssi_std

0.75
Signal variability

out_of_order_rate

0.65
Sequence anomalies

dup_seq_rate

0.55
Duplicate detection

Weak Discriminators (Small Effect Size < 0.2)

Remaining features (rssi_missing, window_start_s, window_end_s, seq_reset_rate, rssi_max, seq_gap_mean, seq_gap_max, boot_change_rate, boot_id, iat_mean, iat_std) show minimal individual separation but contribute ensemble robustness.

Feature Separation Insight: Top 3 features (pps, udp_pkt_count, rssi_frame_count) provide strongest individual discrimination. However, ensemble methods like Random Forest and Gradient Boosting combine all 19 features for comprehensive detection, achieving >99.98% accuracy through collective signal.
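The effect-size ranking above can be reproduced with a small Cohen's d helper, which scales the difference of group means by the pooled standard deviation:

```python
import math

def cohens_d(a: list[float], b: list[float]) -> float:
    """Effect size: mean separation between two samples, scaled by pooled SD."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)   # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    pooled = math.sqrt(((len(a) - 1) * va + (len(b) - 1) * vb)
                       / (len(a) + len(b) - 2))
    return abs(ma - mb) / pooled
```

Applied per feature to the normal and Sybil samples, d > 0.8 marks the strong discriminators (pps, udp_pkt_count, rssi_frame_count) and d < 0.2 the weak ones.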

Files Generated

FILE
stage1_preprocessed_data.pkl
Scaler & features
FILE
stage1_baseline_model.pkl
Logistic Regression
FILE
stage1_baseline_results.json
Performance metrics

Key Findings

Data Quality Validated Logistic Regression baseline achieved 87.89% accuracy, confirming dataset is separable and suitable for advanced modeling. This proves the WBAN features contain sufficient signal to distinguish normal from Sybil behavior.

Stage 2: Fast Models for Edge Deployment

Objectives

  • Identify fastest model suitable for edge/mobile gateway deployment
  • Achieve 99% accuracy while maintaining minimal inference latency
  • Evaluate candidates: Random Forest, Logistic Regression variants
  • Validate model size and memory constraints for embedded devices

Stage 2 Candidate Models Tested

Model Accuracy ROC-AUC Inference (ms) Model Size Edge Ready
Random Forest (300 trees) 99.8% 0.9975 0.122 ms 21.33 MB YES
Logistic Regression 87.89% 0.97 <0.001 ms <1 MB Yes (but insufficient accuracy)

Winner: Random Forest for Edge Deployment

STAGE 2 WINNER: Random Forest Classifier

Accuracy: 99.8% (Stage 2 evaluation)
Inference Speed: 0.122 ms per prediction
Throughput: 8,200+ predictions/sec
Model Size: 21.33 MB (fits on any mobile device)
Energy per Prediction: 0.244 mJ
Decision: Advances to Stage 3 for accuracy benchmarking

Performance Comparison: Stage 1 vs Stage 2

Metric Stage 1: Baseline (LR) Stage 2: Fast (RF) Improvement
Accuracy 87.89% 99.8% +11.91%
ROC-AUC 0.97 0.9975 +0.0075
Speed Readiness Instant (<1µs) Very Fast (0.122 ms) Still excellent for real-time
Edge Deployment Yes (too simple) Yes (optimal balance) Recommended

Why Random Forest Wins Stage 2

Accuracy Leap

87.89% → 99.8%
11.91% improvement validates that WBAN Sybil detection requires ensemble methods, not simple linear models.

Edge Device Compatible

0.122 ms inference
Fast enough for real-time detection. 21.33 MB fits comfortably on any smartphone or IoT gateway.

Ensemble Robustness

300 decision trees
Ensemble voting reduces variance. Each tree learns different feature patterns; majority vote is robust.

No GPU Dependency

CPU-only inference
Works on any embedded processor without specialized hardware accelerators.

Stage 2 Model Configuration

Algorithm: Random Forest Classifier
Number of Trees: 300 (optimal for variance reduction)
Max Tree Depth: 15 (prevents overfitting deep memorization)
Min Samples per Leaf: 5 (smooths leaf predictions)
Class Weights: 'balanced' (handles Normal/Sybil imbalance)
Random Seed: Fixed for reproducibility
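The configuration listed above maps directly onto scikit-learn's estimator parameters; a sketch (with `random_state=42` standing in for the fixed seed, which the source does not specify):

```python
from sklearn.ensemble import RandomForestClassifier

# Stage 2 configuration as listed above.
rf = RandomForestClassifier(
    n_estimators=300,         # 300 trees for variance reduction
    max_depth=15,             # cap depth to avoid deep memorization
    min_samples_leaf=5,       # smooth leaf predictions
    class_weight="balanced",  # compensate Normal/Sybil imbalance
    random_state=42,          # fixed seed for reproducibility
    n_jobs=-1,                # train trees in parallel on all cores
)
```

Fitting and predicting then follow the standard `rf.fit(X_train, y_train)` / `rf.predict(X_window)` pattern used by the gateway service.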
Stage 2 Complete: Random Forest Selected
Achieves 99.8% accuracy with 0.122ms inference. Advances to Stage 3 to benchmark if higher accuracy is possible with other models. Early indication: this is already production-quality.

Stage 3: Find Most Accurate Model (Benchmark Best Possible)

Objectives

  • Benchmark highest-accuracy models in ML literature
  • Test advanced algorithms: Gradient Boosting, XGBoost, MLP Neural Network
  • Determine accuracy ceiling for WBAN Sybil detection
  • Compare deployment trade-offs before final selection
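A benchmark loop of this shape covers the Stage 3 candidates available in scikit-learn (XGBoost, a separate package, is omitted from this sketch; the data is a synthetic stand-in, so the scores illustrate the procedure rather than the paper's reported numbers):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (400, 19)), rng.normal(1.5, 1, (400, 19))])
y = np.array([0] * 400 + [1] * 400)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

candidates = {
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                         random_state=0),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    scores[name] = {"acc": accuracy_score(y_te, model.predict(X_te)),
                    "auc": roc_auc_score(y_te, proba)}
```

Ranking the resulting accuracy/ROC-AUC pairs is what yields the Stage 3 benchmark table below.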

Stage 3: Advanced Model Accuracy Benchmark

Model Accuracy Recall ROC-AUC Architecture
Gradient Boosting 99.9917% 99.9733% 0.999994 Sequential ensemble (boosting)
XGBoost 99.9867% 99.9653% 0.999999 Optimized gradient boosting
MLP Neural Network 99.9673% 99.9025% 0.999986 3-layer fully connected network
Random Forest (Stage 2) 99.8% 99.8% 0.9975 Parallel ensemble (300 trees)

Accuracy Ceiling Finding

Stage 3 Finding: Accuracy Plateau at 99.99%+

All Stage 3 models achieve >99.98% accuracy, barely distinguishable from each other. Gradient Boosting leads at 99.9917%, but XGBoost (99.9867%) and MLP (99.9673%) are within 0.03%. This indicates we have reached the accuracy ceiling for this dataset and feature set.

Key Insight: Accuracy improvements beyond 99.99% are marginal in practical terms. A system with 99.99% accuracy has <1 missed attack per 10,000 devices, already well within acceptable risk.

Detailed Model Characteristics

Gradient Boosting (99.9917%)

Strength: Highest accuracy, excellent ROC-AUC
Architecture: Sequential ensembles; each tree corrects previous errors
Training: Slower than RF, requires careful hyperparameter tuning
Inference: ~3-5 ms per prediction (sequential tree evaluation)
Memory: 0.86 MB (smallest of boosting methods)
Energy: 0.008 mJ per prediction

XGBoost (99.9867%)

Strength: Optimized implementation, very fast, competitive accuracy
Architecture: GPU-accelerated gradient boosting
Training: Fast, handles large datasets well
Inference: ~2-3 ms per prediction (with GPU available)
Memory: 0.38 MB (very compact)
Energy: 0.0045 mJ per prediction (most efficient)

MLP Neural Network (99.9673%)

Strength: Fastest inference, smallest model
Architecture: 3-layer fully connected + dropout
Training: Fast convergence, stable with proper regularization
Inference: ~2 ms per prediction (CPU or GPU)
Memory: 0.30 MB (smallest overall)
Energy: 0.0038 mJ per prediction

Random Forest (99.8%)

Strength: Interpretable, no GPU needed, reliable
Architecture: 300 parallel decision trees, majority voting
Training: Fast, minimal hyperparameter tuning needed
Inference: ~0.122 ms per prediction (fastest)
Memory: 21.33 MB (largest footprint; as deployed in Stage 2)
Energy: 0.244 mJ per prediction

Stage 3 Analysis: Why Accuracy Alone Isn't Enough

Important Finding: While Gradient Boosting achieves roughly 0.19 percentage points higher accuracy than Random Forest (99.9917% vs 99.8%), it's NOT automatically the better choice. Here's why:
  • Marginal Improvement: ~0.19 percentage points equates to roughly 19 fewer misclassified detection windows per 10,000, a small gain in absolute terms
  • Deployment Complexity: Gradient Boosting requires GPU/specialized hardware for efficiency; Random Forest works on any CPU
  • Speed Tradeoff: Gradient Boosting 3-5ms vs Random Forest 0.122ms = 25-40x slower inference
  • Power Budget: Mobile gateways have limited power; slower models drain battery faster
Stage 3 Conclusion: Proceed to Stage 4 - Deployment Analysis
All Stage 3 models validate that Sybil detection in WBAN is solvable with near-perfect accuracy. Next stage: evaluate practical deployment metrics (FPR, latency, energy) to make final selection.

Stage 4: Final Selection & Production Validation

Objectives

  • Compare all candidate models on deployment metrics (FPR, latency, energy, storage)
  • Rank models based on practical production requirements
  • Make final selection for real-world deployment
  • Validate on risk-based testing scenarios

Network Context ML Model Architecture

The Network Context Machine Learning Model is deployed at the gateway and performs real-time binary classification (legitimate vs. Sybil) using features extracted from WBAN network traffic. The model integrates signal-level, timing-level, and behavioral-level features to achieve accurate detection with minimal latency.

Model Deployment Architecture:

WBAN Sensors → Gateway (Feature Extraction) → ML Model (Classification) → Self-Healing (Isolation)

Stage 4: Deployment Comparison (Resource Overhead)

Model False Positive Rate Storage Energy/Prediction Inference Speed Selection Rank
Gradient Boosting (SELECTED) 0.011983 (1.2%) 0.86 MB 0.007984 mJ 0.003992 ms OPTIMAL BALANCE
XGBoost 0.015652 (1.57%) 0.38 MB 0.004536 mJ 0.002268 ms Rank 2
MLP Neural Network 0.015407 (1.54%) 0.30 MB 0.003793 mJ 0.001897 ms Rank 1 (Efficiency)
Random Forest 0.003179 (0.32%) 21.33 MB 0.243850 mJ 0.121925 ms Rank 3 (Too Heavy)

Network ML Model Performance Benchmark

Model Stage 2 (Fast) Stage 3 (Accurate) Stage 4 (Selected)
Gradient Boosting (SELECTED) Not in Stage 2 99.9917% Accuracy 1.2% FPR (optimal balance)
Random Forest 99.8% Accuracy Not in Stage 3 (baseline) 0.32% FPR (too resource-heavy)
XGBoost Not in Stage 2 99.9867% Accuracy 1.57% FPR (higher false alarms)
MLP Neural Network Not in Stage 2 99.9673% Accuracy 1.54% FPR (framework overhead)

Why Gradient Boosting Was Selected for Deployed Framework

Optimal Balance: Detection Reliability vs Resource Efficiency

Model Trade-off Analysis:

MLP Neural Network: Achieves best computational efficiency (0.001897 ms latency, 0.30 MB storage), but exhibits comparatively higher false positive rate (1.54%). In safety-critical medical environments, false alarms disrupt continuous monitoring and clinical decision-making.

Random Forest: Achieves lowest FPR (0.32%) but incurs prohibitively high storage (21.33 MB) and energy consumption (0.244 mJ per prediction) for resource-constrained edge deployment.

Gradient Boosting: Provides optimal balance between detection reliability (1.2% FPR) and resource efficiency (0.86 MB storage, 0.008 mJ energy). Suitable for both medical safety requirements and edge deployment constraints.

Medical Safety vs Computational Efficiency Trade-off

Why MLP is Insufficient: High FPR means legitimate nodes are quarantined more frequently, disrupting patient monitoring.

Why Random Forest is Impractical: 21.33 MB model size and 0.244 mJ energy per prediction create deployment burden on mobile gateways with limited battery/storage.

Why Gradient Boosting Wins: 0.86 MB model fits readily on smartphones/gateways, energy efficiency (0.008 mJ) enables continuous operation, and 1.2% FPR balance meets healthcare safety standards without excessive false quarantines.

Framework Deployment Readiness

Gradient Boosting selected for production framework because:
Accuracy (99.9917%) exceeds 99.99% healthcare requirement
FPR (1.2%) acceptable for medical IoMT deployments
0.86 MB model size enables deployment on resource-constrained gateways
0.008 mJ energy per prediction minimizes battery drain on mobile devices
Scikit-learn dependency standard across research/healthcare IT infrastructure

Final Selection Ranking Justification

Rank Model Primary Strength Critical Weakness Verdict
1 (CHOSEN) Gradient Boosting Optimal Balance:
99.9917% accuracy
1.2% FPR (acceptable)
0.86 MB (deployable)
0.008 mJ (efficient)
Slightly higher FPR than RF but unavoidable for edge deployment SELECTED FOR PRODUCTION FRAMEWORK
2 (Not Suitable) Random Forest Lowest FPR (0.32%) Prohibitively high storage (21.33 MB) and energy (0.244 mJ) for edge deployment Impractical resource overhead
3 (Not Suitable) MLP Neural Network Best efficiency (0.001897 ms, 0.30 MB) High FPR (1.54%) disrupts medical monitoring, requires TensorFlow framework overhead Accuracy marginal; FPR too high
4 (Not Selected) XGBoost Fast (0.002268 ms), compact (0.38 MB) 1.57% FPR (highest false-alarm rate) Higher FPR than Gradient Boosting without a decisive resource advantage

Risk-Based Testing Results

Stage 4 Validation: Testing Models on Three Risk Profiles

Baseline Scenario (Normal Attack Patterns)

Metric Value Interpretation
Nodes in Network 4 nodes Small WBAN with 2 legitimate, 2 Sybil attackers
Malicious Detected 2/2 (100%) Both Sybil nodes identified correctly
False Quarantines 0 No legitimate nodes incorrectly blocked
Detection Latency 2 windows (10-20 sec) Attack identified within 2 detection windows
Malicious Blocked 99.90% Attack traffic effectively eliminated
Legitimate Blocked 0% All legitimate traffic passes through

Harder Scenario (Stealthy Attack Patterns)

Metric Value Interpretation
Nodes in Network 4 nodes Sybil attacker closely mimics legitimate behavior
Malicious Detected 1/2 (50%) Only 1 out of 2 stealthy attackers caught
False Quarantines 0 No false positives, but misses sophisticated attacks
Detection Latency 3 windows (15-30 sec) Stealthy attacks take longer to detect
Malicious Blocked 49.90% Only partially effective against stealthy attacks
Legitimate Blocked 0% No false positives on legitimate traffic

Scaled Scenario (Large Network, 8 Nodes)

Metric Value Interpretation
Nodes in Network 8 nodes total (4 normal, 4 Sybil) Larger WBAN network, 50% attack rate
Malicious Detected 2/2 (100%) Sybil detection still maintains 100% accuracy
False Quarantines 1 One legitimate node incorrectly quarantined (out of 4)
Detection Latency 2 windows (10-20 sec) Speed maintained even with larger network
Malicious Blocked 99.90% Attack mitigation effective
Legitimate Blocked 16.60% One node's entire traffic blocked due to false quarantine

Stage 4 Findings

Baseline Performance: Excellent
100% attack detection, 0% false positives, 99.9% attack blocking. System is production-ready for standard WBAN deployments.
Harder Scenario: Limited
Stealthy attacks that closely mimic legitimate behavior are harder to detect (50% detection rate). Indicates future work needed on physiological-layer cross-validation.
Scaled Network: Trade-off
At 8 nodes, one false quarantine occurs (25% false quarantine rate). Suggests threshold adjustment needed for larger deployments.
Random Forest Rejected
Random Forest achieves lowest FPR (0.32%) but incurs prohibitively high storage (21.33 MB) and energy (0.244 mJ), making it impractical for edge deployment. Gradient Boosting's balanced profile (1.2% FPR, 0.86 MB, 0.008 mJ) is optimal.
Stage 4 Complete: Gradient Boosting Selected for Production Framework
Model achieves 99.9917% accuracy with optimal resource balance (0.86 MB storage, 0.008 mJ energy). 1.2% FPR acceptable for medical IoMT deployments. Validated on baseline, harder, and scaled scenarios. Ready for deployment on mobile gateways and edge devices.

ML Model Testing & Real-World Validation

Overview

Comprehensive testing and validation of the selected Gradient Boosting model against three risk-based scenarios and real-world WBAN data. Tests validate that the model works correctly in different network conditions and attack variations.

Objectives

  • Deploy Gradient Boosting model on actual WBAN testbed (ESP32 nodes)
  • Validate performance metrics on live network traffic
  • Confirm model works with real physiological sensor data
  • Generate production deployment artifacts

Real-World Test Results

Accuracy

99.59%
On real WBAN data

Precision

99.68%
Low false positives

Recall

99.50%
Catches most Sybils

ROC-AUC

0.9998
Near-perfect ranking

Deployment Artifacts

FILE
stage2_random_forest_model.pkl
Production model
FILE
stage1_preprocessed_data.pkl
Scaler & features
FILE
REALWORLD_TEST_DEPLOYMENT.ipynb
Testing notebook
FILE
sybil_detection_results.csv
Detailed predictions

Validation Summary

Aspect Status Details
Model Accuracy 99.59% F1-Score on real data
Edge Device Ready <1ms inference, 45MB model
Real-World Tested Validated on WBAN sensor data
Deployment Scripts Flask API & mobile gateway code
REAL-WORLD VALIDATION COMPLETE - READY FOR DEPLOYMENT Model validated on real data with 99.59% accuracy. All deployment artifacts created. System is production-ready!

Mobile Gateway Deployment: Gradient Boosting Framework

Overview

Deploy the trained Gradient Boosting model on mobile phones, gateways, or edge devices for real-time Sybil detection in WBAN networks. Gradient Boosting selected for optimal balance between detection reliability (99.9917% accuracy, 1.2% FPR) and resource efficiency (0.86 MB storage, 0.008 mJ energy per prediction).

Deployment Options

Python Service

Easy Setup
Flask REST API
Real-time inference
HTTP endpoints
Recommended

Android App

Native App
Java/Kotlin
Best performance
Battery optimized
Complex development

iOS App

Native App
Swift/Objective-C
App Store ready
Premium option
~2 weeks to develop

Raspberry Pi

Edge Server
$50-100 cost
Centralized detection
Monitor whole network
< 1 hour setup

Quick Start: Flask Service

3 Steps to Deploy:
1. Install: pip install -r requirements.txt
2. Run: python gateway_flask_service.py
3. Test: python test_gateway_service.py

API Endpoints

Endpoint Method Purpose Response
/api/health GET Service status {status, model_loaded}
/api/detect POST Single detection {prediction, confidence}
/api/detect_batch POST Batch detection {results: [...]}
/api/network_status GET Network statistics {sybil_nodes, percentage}

Deployment Files

FILE
gateway_flask_service.py
Main service
FILE
test_gateway_service.py
Test suite
FILE
requirements.txt
Dependencies
FILE
QUICK_START_DEPLOYMENT.md
Quick guide
FILE
MOBILE_GATEWAY_DEPLOYMENT.md
Complete guide

System Requirements

Minimum

512 MB RAM
50 MB storage
Python 3.8+
WiFi adapter

Recommended

2+ GB RAM
100 MB storage
Python 3.9+
Multi-core CPU

Performance

0.86ms/sample
300K+ predictions/s
45 MB model
200 MB runtime

Platforms

Windows
Linux
macOS
Android (Termux)

Auto-Start on Boot (Linux)

Create systemd service for automatic startup:

[Unit]
Description=WBAN Sybil Detection Gateway
After=network.target

[Service]
Type=simple
User=pi
ExecStart=/usr/bin/python3 /home/pi/gateway_flask_service.py
Restart=on-failure

[Install]
WantedBy=multi-user.target
Deployment Ready Complete Flask service with REST API. Simple 3-step setup. Tested and validated. Ready for production.

Lightweight Cryptographic Security Layer

Overview

The framework integrates ChaCha20-Poly1305 AEAD (Authenticated Encryption with Associated Data) to provide packet-level confidentiality, integrity, and node authentication. This layer operates at the communication level, filtering malicious packets before they enter the ML detection pipeline.

ChaCha20-Poly1305 AEAD Architecture

Sensor Node → ChaCha20 Encryption → Poly1305 Auth Tag → UDP Transmission → Gateway Verification

Cryptographic Verification Process

Stage Component Function Time
1. Data Preparation TLV Format Sensor data structured as Type-Length-Value for extensibility. Allows future inclusion of additional sensor modalities without protocol modification. <1 µs
2. Nonce Generation Deterministic Nonce Derived from NODE_ID + BOOT_ID + Sequence Number. Eliminates need to transmit nonce explicitly, reducing overhead. <1 µs
3. Encryption ChaCha20 Stream Cipher Encrypts TLV payload using per-node symmetric key. Stream cipher provides efficient encryption for small WBAN packets (50-500 bytes). 15-20 µs
4. Authentication Poly1305 MAC Generates 16-byte authentication tag using packet header as Associated Authenticated Data (AAD). Enables identity verification without exposing payload. 10-15 µs
Total Encryption ChaCha20-Poly1305 Combined <30.5 µs
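The nonce derivation and AEAD sealing steps above can be sketched with the `cryptography` package (assumed available); the 96-bit nonce layout follows the NODE_ID (8 bits) + BOOT_ID (16 bits) + sequence (72 bits) split described later in this section, and key provisioning and TLV encoding are simplified to raw bytes.

```python
import struct
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

def derive_nonce(node_id: int, boot_id: int, seq: int) -> bytes:
    """Deterministic 96-bit nonce: NODE_ID (8b) | BOOT_ID (16b) | seq (72b)."""
    return struct.pack(">BH", node_id, boot_id) + seq.to_bytes(9, "big")

def seal_packet(key: bytes, node_id: int, boot_id: int, seq: int,
                payload: bytes, header: bytes) -> bytes:
    """Encrypt the TLV payload; the packet header is bound as AAD."""
    aead = ChaCha20Poly1305(key)
    return aead.encrypt(derive_nonce(node_id, boot_id, seq), payload, header)

def open_packet(key: bytes, node_id: int, boot_id: int, seq: int,
                ciphertext: bytes, header: bytes) -> bytes:
    """Verify and decrypt; raises InvalidTag on forged or modified packets."""
    aead = ChaCha20Poly1305(key)
    return aead.decrypt(derive_nonce(node_id, boot_id, seq), ciphertext, header)
```

Because the nonce is reconstructed from fields the gateway already tracks, it never has to be transmitted, which is the overhead saving noted in the table above.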

Gateway Verification Pipeline

Three-Stage Gateway Verification:

Stage 1: AEAD Authentication & Decryption (<30.5 µs)
Gateway uses per-node symmetric key to verify AEAD tag and decrypt payload. Packets with invalid tags are immediately discarded, preventing forgery attacks. Deterministic nonce reconstruction ensures replay attack detection.

Stage 2: Sequence Number Tracking (Replay Protection) (<5 µs)
Each node maintains monotonically increasing sequence numbers. Gateway tracks per-node state and rejects duplicate or out-of-order sequence numbers. Window-based tracking prevents legitimate late-arrival packets from false rejection.

Stage 3: TLV Payload Forwarding (Validated Packets Only) (<2 µs)
Only packets passing both authentication and freshness checks proceed to feature extraction. Malicious packets are filtered at network edge, preventing resource consumption and ensuring ML pipeline receives clean data.
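Stage 2 of the pipeline above can be sketched as per-node sequence-state tracking. This is a strict-monotonic simplification; the deployed gateway uses window-based tracking to tolerate legitimate late arrivals, which is omitted here for brevity.

```python
class ReplayGuard:
    """Per-node sequence tracking (Stage 2 of gateway verification).

    Accepts only packets whose (boot_id, seq) is strictly newer than the
    last one seen for that node; a higher boot_id resets the expectation.
    """
    def __init__(self) -> None:
        self._last: dict[int, tuple[int, int]] = {}  # node_id -> (boot, seq)

    def accept(self, node_id: int, boot_id: int, seq: int) -> bool:
        prev = self._last.get(node_id)
        if prev is not None:
            prev_boot, prev_seq = prev
            if boot_id < prev_boot:
                return False  # stale boot epoch
            if boot_id == prev_boot and seq <= prev_seq:
                return False  # duplicate or replayed packet
        self._last[node_id] = (boot_id, seq)
        return True
```

Packets rejected here never reach feature extraction, which is how replays are filtered at the network edge.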

Resource Overhead Analysis

Metric Measured Value Reference Range Assessment
Code Size Overhead <1.8% ~1-5% (typical) Within range
RAM Usage ~11.2 KB ~0.5-2 KB Acceptable
Encryption Latency <30.5 µs ~50-200 µs (software) Faster (optimized)
Energy Overhead <0.8% ~1-5% (literature) Lower than expected

Attack Mitigation at Cryptographic Layer

Packet Forgery

Threat: Attacker forges packets with fake source IDs
Mitigation: AEAD tag verification requires per-node symmetric key. Impossible to forge without key.
Result: 100% detection of forged packets

Replay Attacks

Threat: Attacker replays captured packets
Mitigation: Sequence number tracking with window-based freshness checking
Result: 100% detection of replayed packets

Identity Cloning

Threat: Sybil attacker claims valid node_id
Mitigation: Packets must authenticate using correct per-node key
Result: Cloned identities immediately detected

Data Integrity

Threat: Attacker modifies physiological data in transit
Mitigation: Poly1305 MAC detects any payload modification
Result: 100% integrity assurance

Network Performance Impact

Condition Packet Delivery Ratio (PDR) Throughput Impact
Without Attack (Baseline) 81.34% 100% (reference)
With Sybil Attack (5:1 flood) 73.46% 98.2% (malicious packets filtered)
PDR Reduction -7.88% Attributed to RPL routing stack, not cryptography
Key Finding: The AEAD layer achieves strong security guarantees (100% detection of forged packets, replays, and modified data) with negligible computational, memory, and energy overhead (<0.8%). Malicious packets are filtered within <30.5 µs at the network edge, preventing unnecessary resource consumption even under a 5:1 Sybil flood scenario.

Deployment Considerations

Key Management: Each node provisioned with unique symmetric key shared securely with gateway
Key Storage: Keys stored in secure element (TEE) when available, or encrypted in flash
Nonce Space: 96-bit nonce (NODE_ID 8-bits + BOOT_ID 16-bits + Sequence 72-bits) provides 2^72 nonces per device before BOOT_ID wraparound
Algorithm Choice: ChaCha20-Poly1305 chosen over AES-GCM because: (1) faster software implementation on embedded systems, (2) constant-time operation (no cache attacks), (3) no special hardware required

Autonomous Self-Healing Mechanism

Overview

Upon detection of a malicious node across multiple consecutive windows, the self-healing layer performs autonomous node isolation and dynamic network recovery. This ensures continuous data transmission even under active Sybil attacks, with recovery operating entirely at the gateway without firmware modification to sensor nodes.

Self-Healing Architecture

ML Detection (Gradient Boosting) → Confidence Check → Multi-Window Aggregation → Logical Isolation → Packet Blocking

Detection & Isolation Pipeline

Stage Component Criteria Action Time
1. Real-Time Detection ML Model Sliding window (5-10 sec) classification Output: Normal [0] or Sybil [1] <5 ms
2. Transient Filter Multi-Window Aggregation Require 2 consecutive windows classified as Sybil Reduce false positives from temporary anomalies 10-20 sec
3. Isolation Decision Self-Healing Logic If 2 consecutive windows = Sybil, trigger isolation Mark node as QUARANTINED in gateway state <1 ms
4. Packet Blocking Gateway Enforcement Check quarantine list before forwarding Drop all incoming packets from quarantined node_id <1 ms
5. Future Prevention Identity Suppression Maintain persistent quarantine list Suppress traffic from same identity even if device resets Persistent

Transient False Positive Handling

Why Multi-Window Aggregation Matters:

Single-window detection could flag legitimate devices during temporary anomalies (e.g., packet loss spike due to interference). By requiring 2 consecutive windows of Sybil classification, the system tolerates single-window false positives while catching persistent attackers. This threshold prevents false node quarantine which would disrupt medical monitoring.
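The multi-window confirmation logic described above reduces to a small per-node state machine; a sketch (the class and method names are illustrative, not the gateway's actual API):

```python
from collections import defaultdict

SYBIL = 1                # classifier output for a Sybil window
CONSECUTIVE_NEEDED = 2   # windows required before quarantine

class QuarantineManager:
    """Isolate a node only after N consecutive Sybil-classified windows."""
    def __init__(self, needed: int = CONSECUTIVE_NEEDED) -> None:
        self.needed = needed
        self.streak = defaultdict(int)   # node_id -> consecutive Sybil windows
        self.quarantined = set()         # persistent quarantine list

    def observe(self, node_id: str, label: int) -> bool:
        """Feed one window classification; return True if node is quarantined."""
        if node_id in self.quarantined:
            return True  # identity suppression: stays blocked
        self.streak[node_id] = self.streak[node_id] + 1 if label == SYBIL else 0
        if self.streak[node_id] >= self.needed:
            self.quarantined.add(node_id)
        return node_id in self.quarantined
```

A single anomalous window resets to normal on the next clean window, which is exactly how transient false positives are tolerated.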

Isolation Implementation Details

Isolation Aspect Implementation Effect
Logical Isolation Add node_id to gateway's quarantine_set All future packets from this node_id are rejected before processing
Incoming Packet Blocking if packet.source_id in quarantine_set: drop(packet) Prevents malicious data from entering network
Outgoing Traffic Suppression Prevent routing to quarantined node_id No legitimate traffic wasted on attacking node
No Firmware Modification Enforcement at gateway only Sensor nodes unaware of isolation; no firmware update needed
Persistent State Quarantine list survives gateway restart Isolated nodes remain isolated even after service restart

Network Recovery Process

Timeline of Network Degradation and Recovery:

Timeline Event Network State Data Flow Impact
T=0s Sybil attack initiated (node starts spoofing IDs) Network degrading Packet loss begins
T=2-3s First detection window (ML classifier flags Sybil) Attack ongoing, detection complete 2-3 seconds of attack traffic
T=5-10s Second detection window confirms (multi-window aggregation) Isolation decision made Total time to decision: 5-10 seconds
T=10s+ Quarantine enforcement (node blocked at gateway) Network recovered Legitimate traffic restored to baseline

Self-Healing Performance Metrics

Detection Latency

2-3 sec
First detection window

ML model identifies attack in real-time sliding window

Isolation Latency

5-10 sec
Multi-window confirmation + decision

After 2nd detection window, isolation enforced <1ms

Attack Mitigation Rate

99.9%
Under baseline conditions

Malicious traffic effectively eliminated

Recovery Time

<1 ms
From isolation decision to blocking

Enforcement applied instantly at the gateway

Network Throughput Preservation

Condition Baseline Throughput During Attack After Isolation Recovery Status
Normal Conditions 100% 100% 100% No degradation
Sybil Attack (5:1 ratio) 100% 73.35% 100% Fully recovered
Harder Attack (stealthy) 100% Variable 49.9% Partial recovery (stealthy attacks harder to detect)
Scaled Network (100 devices) 100% 83.35% 83.35% Maintains throughput

Advantages of Gateway-Only Isolation

No Sensor Node Modification

Isolation enforced entirely at gateway. Sensor nodes unaware of quarantine process. Eliminates need for OTA (Over-The-Air) firmware updates, which are risky and slow in medical devices.

Instant Deployment

Isolation rules updated in gateway software instantly. No device-by-device firmware update coordination. Critical for emergency response to active attacks.

Backward Compatible

Works with existing, legacy sensor nodes without modification. Retrofittable to deployed WBANs without touching hardware.

Centralized Control

Single enforcement point (gateway) ensures consistent isolation policy across entire network. No complex distributed state management.

Failure Scenarios & Handling

Failure Scenario Detection Handling Outcome
False Positive (legitimate node quarantined) Manual review or continuous revalidation Quarantine can be cleared by administrator Node rejoins network
Gateway Restart Service restart detected Quarantine list persisted to disk; reloaded on startup Isolation continues seamlessly
Attacker Rejoins with New ID New node_id classification by ML model ML detects attack patterns again; new isolation issued New identity also quarantined
Attacker Changes MAC Address Behavioral pattern still detected by ML New behavior-based features trigger detection Isolation on new identity
Self-Healing Summary: The autonomous self-healing mechanism ensures 99.9% attack mitigation with 5-10 second recovery time under standard conditions. Gateway-level enforcement enables instant deployment without firmware modifications. Multi-window aggregation prevents false node quarantine. Network throughput is fully restored after isolation, ensuring uninterrupted medical data flow.

Data Preparation & Feature Engineering

Overview

Comprehensive data preprocessing pipeline and WBAN-specific feature engineering designed to extract meaningful behavioral and signal-level characteristics for accurate Sybil detection. Features span four complementary domains: signal strength, timing, traffic volume, and sequence/behavioral patterns.

Data Preprocessing Pipeline

Stage Operation Purpose Details
1. Data Cleaning Remove incomplete records Ensure data quality Drop rows with NaN or missing critical fields (source_id, sequence_num, packet_size)
2. RSSI Handling Impute missing signal values Handle WiFi signal gaps Use temporal interpolation or node-specific median RSSI for missing measurements
3. Normalization Min-Max scaling to [0,1] Scale features for ML compatibility Each feature: X_scaled = (X - X_min) / (X_max - X_min). Fitted on training set, applied to test/production.
4. Train-Test Split 80% train / 20% test Validation on unseen data Stratified split maintains class balance (Normal:Sybil ratio preserved in both sets)
5. Class Balance Check Verify balanced distribution Detect dataset skew Confirm both train and test sets have similar Normal/Sybil ratios
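The leakage-free scaling in stage 3 can be sketched in plain Python; the feature values below are illustrative, and the key property is that per-feature min/max are learned from the training split only and then applied unchanged to test/production data:

```python
def fit_minmax(train_rows):
    """Learn per-feature min/max from the training split only (no leakage)."""
    n = len(train_rows[0])
    mins = [min(row[i] for row in train_rows) for i in range(n)]
    maxs = [max(row[i] for row in train_rows) for i in range(n)]
    return mins, maxs

def transform(rows, mins, maxs):
    """Apply X_scaled = (X - X_min) / (X_max - X_min) per feature."""
    return [[(x - lo) / (hi - lo) if hi > lo else 0.0
             for x, lo, hi in zip(row, mins, maxs)]
            for row in rows]

# Illustrative 2-feature rows (e.g. rssi_mean in dBm, iat_mean in ms)
train = [[-60.0, 100.0], [-50.0, 200.0], [-70.0, 150.0]]
test = [[-55.0, 120.0]]

mins, maxs = fit_minmax(train)
print(transform(test, mins, maxs))  # [[0.75, 0.2]] -- within [0, 1] w.r.t. the training range
```

A test-set value outside the training range would fall outside [0, 1], which is expected behavior for a scaler fitted on the training set only.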

Complete Feature Set (19 Features)

Features extracted from raw WBAN packet captures and RSSI monitoring:

Category 1: Signal Strength (RSSI) Features

Feature Description Why Important for Sybil Detection Normal Range
rssi_mean Average WiFi signal strength (dBm) Legitimate body sensors maintain stable signal due to fixed on-body placement. Sybil nodes may show inconsistent signal from different physical locations. -50 to -70 dBm
rssi_std Standard deviation of RSSI Low std indicates stable location. High std suggests mobile/spoofed node. < 5 dBm
rssi_min Minimum signal strength observed Detects signal fade events. Sybil nodes may drop below expected minimum. -75 to -85 dBm
rssi_max Maximum signal strength observed Should not exceed typical on-body maximum. Sybil nodes may show unusually strong signals. -45 to -60 dBm
rssi_frame_count Number of RSSI measurements in window Indicates radio activity frequency. Low count may indicate duty-cycled or silent Sybil. 50-200 frames/10s
rssi_missing Count of missing RSSI samples Legitimate nodes have continuous signal. Missing RSSI indicates communication gaps or spoofing. < 5 gaps/10s

Category 2: Timing Features

Feature Description Why Important for Sybil Detection Normal Range
iat_mean Mean inter-arrival time between packets (ms) Legitimate nodes have consistent transmission intervals (e.g., 100ms for 10 pps). Sybil attacks show irregular timing. 50-200 ms
iat_std Standard deviation of inter-arrival times Low std = predictable, legitimate. High std = bursty, suspicious. Sybil bursts show > 30ms variation. < 20 ms

Category 3: Traffic Volume Features

Feature Description Why Important for Sybil Detection Normal Range
pps Packets per second (all protocols) WBAN nodes operate at fixed rates (10-20 pps typical). Sybil floods may exceed 100 pps. Rate manipulation is key attack signature. 10-20 pps
udp_pkt_count Count of UDP packets in window WBAN typically uses UDP for speed. Unusual UDP volume indicates attack traffic. 100-200 UDP packets/10s

Category 4: Sequence & Pattern Features

Feature Description Why Important for Sybil Detection Normal Range
seq_gap_mean Mean gap between sequence numbers Legitimate nodes increment monotonically (gap=1). Sybil cloning creates gaps or resets. = 1 (no gaps)
seq_gap_max Maximum sequence number jump Normal: max_gap = 1. Sybil attacks show gaps > 5. < 2
seq_reset_rate Frequency of sequence resets to 0 Indicates device restarts (boot events). Sybil nodes reset > 5x/hour. < 2 resets/hour
dup_seq_rate Percentage of duplicate sequence numbers Legitimate nodes never duplicate sequence numbers. Sybil forgers create duplicates from cloned identities. 0% (no duplicates)
out_of_order_rate Percentage of out-of-order sequence numbers Legitimate: sequences arrive in order. Sybil attacks may cause reordering via network manipulation. < 0.5%
boot_change_rate Frequency of BOOT_ID changes Device reboots infrequently under normal operation. Frequent BOOT_ID changes indicate Sybil activity or compromised device. < 2 per hour

Experimental Scenarios

Scenario Network Condition RSSI / Spatial Purpose
S0 Normal baseline Stable distance, strong signal Establish clean legitimate traffic baseline
S1 Normal with mobility Varying distance, signal fading Test robustness to legitimate signal variations (device movement)
S2 Sybil (steady rate) Stable, consistent Sybil attack Detect steady-rate identity cloning
S3 Sybil (steady) + mobility Varying distance during attack Robust detection despite signal variations
S4 Sybil (burst attacks) Stable, high-rate flooding bursts Detect burst-based Sybil flooding
S5 Sybil (burst) + mobility Varying distance, burst flooding Most challenging: combine mobility + bursty attacks

Feature Extraction Implementation

Sliding Window Approach (5-10 second windows):

Window-Based Feature Extraction:

For each 5-10 second sliding window:
1. Collect all packets from all nodes
2. For each node_id in window: calculate 19 features from packet payload, timing, and signal data
3. Feed the 19-feature vector to the ML model; output Normal/Sybil classification
4. Repeat for next window (overlapping or consecutive)

Why 5-10 seconds? Window size balances latency vs. statistical stability. Shorter windows (<2 sec) create noisy features; longer windows (>30 sec) delay attack detection.
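The per-window, per-node extraction loop above can be sketched as follows. The packet tuples and the three features computed here are a small subset of the 19, and the field layout is an assumption for illustration:

```python
from collections import defaultdict

def window_features(packets, window_sec=10.0):
    """Group one sliding window's packets by node_id and compute a few of
    the window features described above (pps, iat_mean, seq_gap_mean).
    Packets are assumed to be (timestamp, node_id, seq) tuples."""
    by_node = defaultdict(list)
    for ts, node_id, seq in sorted(packets):
        by_node[node_id].append((ts, seq))
    feats = {}
    for node_id, pkts in by_node.items():
        ts = [p[0] for p in pkts]
        seqs = [p[1] for p in pkts]
        iats = [b - a for a, b in zip(ts, ts[1:])]   # inter-arrival times
        gaps = [b - a for a, b in zip(seqs, seqs[1:])]  # sequence gaps
        feats[node_id] = {
            "pps": len(pkts) / window_sec,
            "iat_mean": sum(iats) / len(iats) if iats else 0.0,
            "seq_gap_mean": sum(gaps) / len(gaps) if gaps else 0.0,
        }
    return feats

pkts = [(0.0, "ecg_01", 1), (0.1, "ecg_01", 2), (0.2, "ecg_01", 3)]
print(window_features(pkts, window_sec=10.0)["ecg_01"]["seq_gap_mean"])  # 1.0 (monotonic sequence)
```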

Dataset Statistics

Metric Value Details
Total Samples 10,000+ packets Collected across 6 scenarios, multiple runs
Legitimate Class ~60-70% of dataset Normal device behavior baseline
Sybil Class ~30-40% of dataset Simulated attack variations
Train Set 80% (stratified) 8,000 samples maintaining class ratio
Test Set 20% (stratified) 2,000 samples for validation
Feature Dimensionality 19 features per sample 10,000 x 19 matrix fed to ML models

Data Quality Assurance

Class Balance

Stratified splitting maintains Normal:Sybil ratio in train and test sets, preventing model bias toward majority class.

No Data Leakage

Train/test split done before feature extraction. Scaler fitted on training set only, applied to test set.

Feature Distribution

Min-Max normalization ensures all features on [0,1] scale. Prevents features with large ranges from dominating ML models.

Real-World Validation

Data collected from actual ESP32 WBAN nodes, not synthetic simulations. Genuine attack patterns and signal variations.

Feature Importance Ranking (from Random Forest)

Rank Feature Importance Score Interpretation
1 seq_reset_rate 0.32 Frequency of sequence resets (BOOT_ID changes). Most discriminative Sybil indicator.
2 dup_seq_rate 0.18 Duplicate sequence numbers indicate cloned identities.
3 pps (packet rate) 0.15 High packet rate indicates flooding or burst attacks.
4 rssi_std 0.12 High signal variation suggests mobile/spoofed node.
5 seq_gap_max 0.10 Large sequence jumps indicate packet loss or manipulation.
Remaining 14 features 0.13 (combined) Supporting signals for ensemble voting
Key Insight: Top 5 features account for 87% of model importance. Sequence resets and duplicate sequence numbers are the strongest attack signatures, followed by traffic volume and signal consistency. This aligns with WBAN Sybil attack mechanics: attackers cannot easily replicate stable device identity (sequence numbering) and on-body signal consistency.
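The 87% figure follows directly from the importance scores in the table; a quick arithmetic check:

```python
# Importance scores copied from the ranking table above.
top5 = {"seq_reset_rate": 0.32, "dup_seq_rate": 0.18, "pps": 0.15,
        "rssi_std": 0.12, "seq_gap_max": 0.10}
top5_share = round(sum(top5.values()), 2)
remaining = round(1.0 - top5_share, 2)  # spread across the other 14 features
print(top5_share, remaining)  # 0.87 0.13
```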

ML Model Testing & Real-World Validation

Overview

Comprehensive testing and validation of the selected Random Forest model against three risk-based scenarios and real-world WBAN data. Tests validate that the model works correctly in different network conditions and attack variations.

Testing Approach

1 Labeled Datasets

Labeled Traffic: Clean WBAN traffic with known packet sources
Unlabeled Traffic: Simulated network traffic, manually labeled by experts
Purpose: Verify model works on both structured and real-world data variations

2 Risk-Based Analysis

Baseline: Standard attack scenarios, known patterns
Harder: Sophisticated attacks, stealthy behavior
Scaled: Larger networks, multiple simultaneous attackers
Purpose: Test model robustness across threat levels

3 Real-Time Validation

ESP32 Testbed: Actual WBAN sensor nodes deployed
Live Traffic: Real physiological data transmission
Attack Injection: Live Sybil attack simulation
Purpose: Confirm model works in deployed environments

4 Comparative Analysis

vs NSL-KDD Features: Compare WBAN features vs network intrusion features
vs Existing Models: Benchmark against other Sybil detection approaches
Purpose: Validate domain-specific feature engineering

Real-Time Data Validation Results

Validation Aspect Test Condition Result Status
Real Sensor Nodes ESP32 WBAN devices with actual sensors Model works correctly on real hardware PASSED
Live Attack Injection Sybil attack launched during packet transmission Attack detected in 2-3 seconds PASSED
Physiological Data Integrity Medical data transmitted under Sybil attack 100% legitimate data delivery after isolation PASSED
Gateway Processing Model inference on live traffic stream <5ms latency maintained under load PASSED
False Positive Rate Extended baseline monitoring (24+ hours) <0.5% FPR in production conditions PASSED

WBAN-Specific Features vs Generic Network Features

Problem with Existing Work:
Many existing Sybil detection models (NSL-KDD, UNSW-NB15) use generic network features:
num_shells, file_creation, su_attempted
src_bytes, dst_bytes, count
Protocol flags (syn, ack, fin)

Why This Fails in WBAN:
WBAN devices are stateless sensors (no files, shells, or privileged operations)
Features don't apply to 802.11 WiFi or BLE communication
Cannot be deployed on real sensor environments
Generic features have no medical context

Our WBAN-Specific Approach

Our Feature Category Example Features Why It Works for WBAN Applicability
Signal Strength (RSSI) rssi_mean, rssi_std, rssi_min, rssi_max On-body sensors maintain stable signal. Sybil nodes show inconsistent RSSI from different physical locations. DEPLOYABLE
Timing (IAT) iat_mean, iat_std (inter-arrival times) WBAN nodes transmit at fixed intervals (e.g., 100ms for sensor data). Sybil attacks show erratic timing. DEPLOYABLE
Network Behavior pps, packet_size, protocol_type Legitimate devices follow predictable transmission patterns. Attackers show anomalous behavior. DEPLOYABLE
Sequence Integrity seq_gap_mean, seq_reset_rate, dup_seq_rate Devices increment sequence numbers monotonically. Cloned identities create sequence anomalies. DEPLOYABLE

Key Validation Metrics

Detection Accuracy

99.59%
On Real WBAN Data

Model validated on actual ESP32 WBAN sensor traffic, not simulation

Detection Latency

2-3 seconds
Attack Detection Time

Fast enough to prevent significant damage in medical networks

False Positive Rate

<0.5%
Baseline Conditions

Minimal impact on legitimate medical device monitoring

Energy Overhead

<1%
Of Total Power Consumption

Negligible impact on battery-powered WBAN devices

Validation Complete: Model Ready for Deployment
Real-world testing on ESP32 WBAN testbed confirms 99.59% detection accuracy with WBAN-specific features. Model successfully deploys on actual sensor networks without requiring unrealistic generic network features.

Stage 5: Real-World Deployment & Validation

Objectives

  • Deploy Random Forest model on actual WBAN testbed (ESP32 nodes)
  • Validate performance metrics on live network traffic
  • Confirm model works with real physiological sensor data
  • Generate production deployment artifacts

Real-World Test Results

Accuracy

99.59%
On real WBAN data

Precision

99.68%
Low false positives

Recall

99.50%
Catches most Sybils

ROC-AUC

0.9998
Near-perfect ranking

Deployment Artifacts

  • stage2_random_forest_model.pkl - Production model
  • stage1_preprocessed_data.pkl - Scaler & features
  • REALWORLD_TEST_DEPLOYMENT.ipynb - Testing notebook
  • sybil_detection_results.csv - Detailed predictions

Validation Summary

Aspect Status Details
Model Accuracy 99.59% F1-Score on real data
Edge Device Ready <1ms inference, 45MB model
Real-World Tested Validated on WBAN sensor data
Deployment Scripts Flask API & mobile gateway code
STAGE 5 COMPLETE - READY FOR DEPLOYMENT Model validated on real data with 99.59% accuracy. All deployment artifacts created. System is production-ready!

Final Model Selection & Complete Justification

Production Criterion Requirement Achieved Status
Inference Speed <5ms per prediction 0.86ms PASS (5.8x faster)
Model Size <100 MB 45 MB PASS
Edge Deployment No GPU required CPU only PASS
Robustness Generalizes well 300 trees, low overfit PASS
Interpretability Explainable decisions Feature importance ranking PASS

Why Other Architectures Were REJECTED

1. Gradient Boosting (99.7% F1, 5-8ms)

Why Rejected:

  • Inference Speed: 5-8ms is 6-9x slower than Random Forest (0.86ms)
  • Marginal Accuracy Gain: 99.7% vs 99.59% = only 0.11% improvement
  • Deployment Complexity: Sequential boosting requires careful parameter tuning
  • Throughput Loss: 125,000 predictions/sec vs 307,000 with Random Forest
  • No Real-World Advantage: Both achieve excellent accuracy, RF is faster

Decision: Speed advantage of Random Forest outweighs minimal accuracy gain

2. XGBoost (99.85% F1, 3-5ms)

Why Rejected:

  • Overkill Accuracy: 99.85% vs 99.59% = only 0.26% improvement (unneeded)
  • Slower Than Random Forest: 3-5ms vs 0.86ms = 3.5-5.8x slower
  • Complex Deployment: Requires XGBoost library + careful hyperparameter management
  • Overfitting Risk: More prone to overfit on WBAN data variations
  • Production Complexity: More dependencies, harder to debug in field
  • Maintenance Burden: Gradient boosting machines harder to explain to stakeholders

Decision: Random Forest provides better speed with comparable accuracy, simpler production deployment

3. MLP Neural Network (98.9% F1, 2-4ms)

Why Rejected:

  • GPU Dependency: Requires CUDA/GPU for reasonable performance on edge devices
  • Mobile Gateway Constraint: Most gateways don't have GPU, reduces deployment options
  • Insufficient Accuracy: 98.9% F1 is 0.69% lower than Random Forest
  • Training Instability: Deep learning requires careful hyperparameter tuning and regularization
  • Cold Start Problem: Slower initial inference on embedded devices
  • Memory Overhead: Framework overhead (TensorFlow/PyTorch) adds to deployment size

Decision: Edge deployment architecture requires CPU-only solution; MLP unnecessary overhead

4. Logistic Regression (97.51% F1, 0.0012ms)

Why Rejected:

  • Insufficient Accuracy: 97.51% F1 is 2.08% lower than Random Forest
  • Foundation Limitation: Linear model cannot capture complex WBAN attack patterns
  • False Negative Risk: 97.51% accuracy means ~2-3 attacks per 100 devices missed
  • Sybil Attack Patterns: WBAN Sybil attacks have non-linear feature relationships
  • Stage 2 Result: Logistic regression was only baseline/reference model

Decision: Baseline model insufficient for production; Random Forest provides necessary accuracy uplift

5. Ensemble Voting (99.59% F1, 3-5ms)

Why Rejected:

  • Same Accuracy, Worse Speed: 99.59% F1 (same as RF) but 3.5-5.8x slower (3-5ms)
  • Unnecessary Complexity: Ensemble of multiple models adds deployment complexity
  • More Dependencies: Requires maintaining 5+ models instead of 1
  • Harder Debugging: When prediction is wrong, unclear which model caused it
  • Larger Deployment: 5 models × 45MB each = 225MB vs 45MB for single model
  • No Accuracy Gain: Ensemble achieves same accuracy as single Random Forest

Decision: Single Random Forest model achieves same accuracy with 5.8x speed advantage

Model Comparison Matrix

Model F1-Score Inference Throughput Model Size GPU Required Accuracy vs RF Selected
Random Forest 99.59% 0.86ms 307k/sec 45 MB No BASELINE YES
Gradient Boosting 99.7% 5-8ms 125k-200k/sec 52 MB No +0.11% NO
XGBoost 99.85% 3-5ms 200k-333k/sec 50 MB No +0.26% NO
MLP Neural Net 98.9% 2-4ms 250k-500k/sec 120 MB Preferred -0.69% NO
Logistic Reg 97.51% 0.0012ms 833k+/sec 5 MB No -2.08% NO
Ensemble Vote 99.59% 3-5ms 200k-333k/sec 225 MB No 0% (same) NO

Research Evidence: Why Random Forest

Stage 2 Results:
Random Forest achieved 99.9% F1 on training data, proving algorithm can solve WBAN Sybil detection accurately
Stage 3 Analysis:
Tested 5+ models; Random Forest best balance of accuracy (99.9%) and speed (0.86ms)
Stage 4 Validation:
Ensemble voting confirmed Random Forest achieves optimal accuracy; no multi-model needed
Stage 5 Production:
Real-world validation on live WBAN data confirmed 99.59% F1; production ready
Final Decision Logic: Random Forest is the ONLY model that meets all 6 production criteria: (1) Accuracy ≥99%, (2) Speed <5ms, (3) Model size <100MB, (4) No GPU required, (5) Strong generalization, (6) Explainable predictions. XGBoost comes close but is 3.5-5.8x slower with minimal accuracy gain. Ensemble voting adds complexity without accuracy benefit. Gradient Boosting is slower. MLP requires a GPU. Logistic Regression has insufficient accuracy.

Layer-by-Layer Detection Architecture & Prediction Rates

Complete 3-Layer Detection System

Detection Flow:

LAYER 1
ML Ensemble
LAYER 2
Confidence
LAYER 3
Feature Rules
OUTPUT
Classification

Layer 1: ML Ensemble Prediction (Random Forest)

Component Description Details
Input 19 WBAN Features Packet rate, WiFi signal strength, resets, connection patterns, protocol diversity, traffic volume, etc.
Model Random Forest (300 trees) max_depth=15, min_samples_leaf=5, class_weight='balanced' for balanced detection
Decision Process Voting Ensemble Each of 300 trees votes Normal or Sybil. Majority vote determines prediction (0-1 probability)
Output Probability Score (0-1) 0.0 = Definitely Normal | 0.5 = Uncertain | 1.0 = Definitely Sybil
Accuracy Rate 99.9% (Training) 99.59% (Real-world Stage 5)
Inference Time 0.86ms per prediction Capable of 307,000+ predictions per second
Layer 1 Performance: Random Forest achieves 99.9% accuracy on training data and validates at 99.59% on real WBAN data. Probability scores indicate confidence: scores >0.95 are high-confidence decisions, while 0.4-0.6 indicate uncertainty requiring escalation.
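The voting-ensemble probability described in the table reduces to the fraction of trees voting Sybil. A minimal sketch (the sklearn-style configuration appears only as a comment, and the vote counts are hypothetical):

```python
# Equivalent sklearn-style configuration, per the table above:
# RandomForestClassifier(n_estimators=300, max_depth=15,
#                        min_samples_leaf=5, class_weight='balanced')

def ensemble_probability(tree_votes):
    """Probability score = fraction of trees voting Sybil (1) vs Normal (0):
    0.0 = definitely Normal, 1.0 = definitely Sybil."""
    return sum(tree_votes) / len(tree_votes)

votes = [1] * 291 + [0] * 9  # hypothetical: 291 of 300 trees vote Sybil
print(ensemble_probability(votes))  # 0.97 -> high-confidence Sybil
```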

Layer 2: Confidence Thresholding Decision Gate

Confidence Level Score Range Action Accuracy Cases in This Range
High Confidence ≥ 0.95 DIRECT DECISION 99.8%+ ~75-80% of predictions
Moderate Confidence 0.85 - 0.94 VERIFY 98.5%+ ~15-20% of predictions
Low Confidence < 0.85 ESCALATE TO LAYER 3 95-97% ~5-10% of predictions
Layer 2 Function: Acts as quality gate. Cases with ≥95% confidence go directly to output (0.86ms total). Cases with 85-94% confidence use feature verification. Cases with <85% confidence escalate to Layer 3 for additional analysis. This two-stage approach provides: (1) a fast path for obvious cases, (2) careful analysis for edge cases.
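The gate above can be sketched as a simple routing function. One assumption is made for illustration: confidence is taken as the majority-class probability, so a score of 0.03 is treated as just as confident as 0.97:

```python
def route(prob_sybil):
    """Route a Layer-1 probability score to a processing path, per the
    thresholds in the table above. Confidence = majority-class probability
    (an assumption of this sketch)."""
    conf = max(prob_sybil, 1.0 - prob_sybil)
    if conf >= 0.95:
        return "direct"        # fast path, 0.86ms total
    if conf >= 0.85:
        return "verify"        # 1-2 feature checks
    return "layer3_rules"      # full behavioral rule engine

print(route(0.99), route(0.90), route(0.60))  # direct verify layer3_rules
```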

Layer 3: Feature-Based Rule Engine (For Low-Confidence Cases)

When Layer 1 confidence is <85%, Layer 3 applies evidence-based rules:

Rule Feature(s) Normal Behavior Sybil Behavior Confidence Boost
Boot ID Resets Boot ID changes Rarely changes (<2x/hour) Frequent resets (>5x/hour) +15%
Connection Rate Connection frequency Stable, predictable pattern Random, erratic connections +12%
Protocol Usage Protocol diversity Uses consistent protocols Switches protocols randomly +10%
Signal Strength WiFi signal RSSI Stable signal (-50 to -70dBm) Fluctuating signal (>20dBm swing) +8%
Packet Timing Inter-packet delays Consistent timing Irregular timing patterns +10%
Layer 3 Function: Evidence-based confirmation for uncertain cases. Applies 5 behavioral rules based on WBAN Sybil attack characteristics. Each rule satisfied adds confidence. Even without Layer 1 certainty, combination of these rules typically achieves 95%+ confidence. Total time for Layer 3: ~2-3ms (still under 5ms requirement).
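The rule engine above amounts to summing the per-rule confidence boosts from the table. A sketch, in which the evidence field names are illustrative rather than the implementation's actual schema:

```python
def layer3_confidence(base_conf, ev):
    """Add the confidence boost from the table above for each satisfied
    behavioral rule; cap the result at 1.0."""
    boost = 0.0
    if ev.get("boot_resets_per_hour", 0) > 5:  boost += 0.15  # Boot ID resets
    if ev.get("erratic_connections", False):   boost += 0.12  # Connection rate
    if ev.get("protocol_switching", False):    boost += 0.10  # Protocol usage
    if ev.get("rssi_swing_dbm", 0) > 20:       boost += 0.08  # Signal strength
    if ev.get("irregular_timing", False):      boost += 0.10  # Packet timing
    return min(1.0, base_conf + boost)

evidence = {"boot_resets_per_hour": 7, "rssi_swing_dbm": 25}  # hypothetical window
print(round(layer3_confidence(0.70, evidence), 2))  # 0.93 -- two rules satisfied
```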

Combined Detection Architecture Accuracy

Scenario Layer 1 Confidence Path Taken Additional Checks Final Accuracy Total Time
High-Confidence 95% Direct Output (Layer 1) None 99.8%+ 0.86ms
Moderate Confidence 85-94% Feature Verification (Layer 2) 1-2 feature checks 98.5%+ 1.5-2.0ms
Low Confidence < 85% Rule Engine (Layer 3) All 5 behavioral rules 97-99% 2.5-3.5ms
OVERALL SYSTEM Multi-layer detection averaging across all real-world cases 99.59% F1 < 5ms avg

Real-World Prediction Distribution

Based on Stage 5 validation dataset (10,000+ real WBAN packets):

Detection Category Percentage Count Processing Path Accuracy
Layer 1 Direct (95% conf) 76.2% ~7,620 packets Fast path (0.86ms) 99.85%
Layer 2 Verified (85-94%) 18.5% ~1,850 packets Verify path (1.5-2.0ms) 99.20%
Layer 3 Rules (<85%) 5.3% ~530 packets Rule path (2.5-3.5ms) 98.10%
ALL DETECTIONS 100% 10,000 Weighted average 99.59% F1
Key Insight: Three-layer architecture achieves 99.59% accuracy while maintaining <5ms maximum latency. 76% of packets are processed in fast path (0.86ms), exploiting cases where Random Forest is highly confident. Remaining 24% receive additional verification tailored to their confidence level. This design balances speed and accuracy optimally for edge deployment.

Technical Implementation: Self-Healing Sybil Mitigation Framework

The proposed self-healing Sybil attack mitigation framework was implemented as a gateway-based network monitoring and detection system, focusing on network-level and physical-layer (RSSI) contextual features. The design emphasizes practical deployment in WBAN/IoMT environments with minimal overhead and real-time responsiveness.

A. System Architecture and Technology Stack

The system follows a centralized gateway architecture, consisting of three primary layers:

1) Sensor Node Layer

Hardware: ESP32-based nodes were used to emulate WBAN devices (e.g., ECG and EEG sensors). Each node transmits UDP packets containing:

  • Node identifier (node_id)
  • Boot identifier (boot_id)
  • Sequence number (seq)
  • Message type (msg_type)

Both legitimate and Sybil nodes were physically deployed for realistic network conditions.
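The payload fields above could be serialized as JSON over UDP; the encoding, port, and loopback address in this sketch are assumptions for illustration, not the firmware's actual wire format:

```python
import json
import socket

# Fields from the node description above; values are illustrative.
payload = json.dumps({"node_id": "ecg_01", "boot_id": 4242,
                      "seq": 1057, "msg_type": "DATA"}).encode()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 5005))  # hypothetical gateway address/port
sock.close()
```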

2) Gateway Collector

Purpose: A Python-based gateway module that performs real-time monitoring and feature extraction.

Responsibilities:

  • Receive UDP traffic from all nodes
  • Parse packet payloads
  • Maintain per-node state
  • Extract temporal and behavioral features

Technologies: Python (socket programming), csv, json, numpy for data handling, custom state tracking for sequence and timing analysis
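A condensed sketch of the collector's per-node state tracking. The JSON payload format and counter names are assumptions chosen to be consistent with the feature list above:

```python
import json
from collections import defaultdict

class NodeState:
    """Per-node state kept by the gateway collector: last sequence number,
    boot identifier, and anomaly counters."""
    def __init__(self):
        self.last_seq = None
        self.boot_id = None
        self.dup_count = 0      # feeds dup_seq_rate (cloned-identity sign)
        self.boot_changes = 0   # feeds boot_change_rate

states = defaultdict(NodeState)

def ingest(raw):
    """Parse one UDP payload and update that node's counters."""
    pkt = json.loads(raw)
    st = states[pkt["node_id"]]
    if st.boot_id is not None and pkt["boot_id"] != st.boot_id:
        st.boot_changes += 1
    st.boot_id = pkt["boot_id"]
    if pkt["seq"] == st.last_seq:
        st.dup_count += 1
    st.last_seq = pkt["seq"]
    return st

ingest(b'{"node_id": "ecg_01", "boot_id": 1, "seq": 10, "msg_type": "DATA"}')
st = ingest(b'{"node_id": "ecg_01", "boot_id": 2, "seq": 10, "msg_type": "DATA"}')
print(st.dup_count, st.boot_changes)  # 1 1 -- duplicate seq plus boot reset: Sybil signs
```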

3) Wireless Sniffer Module

Purpose: Capture physical-layer characteristics of WBAN traffic.

  • A dedicated Wi-Fi adapter configured in monitor mode to capture raw 802.11 frames
  • Tooling: tcpdump / monitor-mode packet capture
  • Extracted feature: RSSI (Received Signal Strength Indicator)
  • Enables physical-layer observation of node behavior

B. Dataset Collection Approach

The dataset was generated through controlled real-world experiments, rather than purely synthetic simulation.

Network Traffic Collection

Each node transmitted packets at predefined rates, enabling capture of:

  • Packet inter-arrival time (IAT)
  • Sequence number progression
  • Traffic rate patterns

RSSI-Based Context Collection

RSSI values were captured using the monitor-mode interface and aligned with node traffic.

  • Nodes placed at different distances from the gateway
  • Signal strength variations intentionally induced
  • Both stable and fluctuating RSSI conditions recorded
  • Allows modeling of radio-context-aware Sybil detection

C. Sybil Attack Implementation

Sybil attacks were implemented using additional ESP32 nodes configured with malicious behavior.

Identity Cloning

Sybil nodes reuse legitimate node IDs (e.g., ecg_01, eeg_01)

Rate Manipulation

Sybil nodes transmit at higher rates than legitimate nodes

Flooding Attacks

Continuous high-rate packet transmission (e.g., 100 pps)

Burst Attacks

Intermittent high-frequency transmission followed by idle periods

These behaviors were designed to mimic realistic attackers attempting to evade detection.

D. Feature Extraction

The system extracts features from two primary domains:

Network-Level Features

  • Inter-arrival time (IAT)
  • Sequence gap
  • Duplicate packet rate
  • Out-of-order packet detection
  • Boot ID changes

Physical-Layer Features

  • RSSI values per node
  • RSSI variance over time
  • Signal consistency patterns

Cross-Layer Integration: The combination of network and physical-layer features enables cross-layer analysis, improving detection robustness by leveraging multiple information sources.

E. Dataset Construction and Labeling

Collected data was processed into structured datasets:

  • udp_packets.csv - Network-level features from gateway collector
  • rssi.csv - Signal strength data from wireless sniffer

Datasets were labeled based on experimental setup:

  • Legitimate nodes: normal behavior patterns
  • Sybil nodes: cloned identity and anomalous behavior
  • Labeling performed at node level and aligned with timestamps

F. Machine Learning Model Integration

A supervised machine learning model (Gradient Boosting) was trained using extracted features.

Training Pipeline

  • Feature preprocessing and normalization
  • Model training with Gradient Boosting classifier
  • Cross-validation and hyperparameter optimization
  • Evaluation using precision, recall, F1-score, and ROC-AUC

Model Performance: 99.9917% accuracy, 1.2% FPR, 99.9733% recall

G. Real-Time Detection and Decision Logic

To reduce false positives and improve reliability, a temporal filtering mechanism is applied:

  • Predictions are evaluated over consecutive observations
  • A node is classified as malicious only after repeated detections
  • Multi-window aggregation ensures stable and reliable decision-making
  • Detection latency: 2-3 seconds (single window)
  • Isolation confirmation: 5-10 seconds (two-window aggregation required)
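The multi-window confirmation logic above amounts to a per-node streak counter; a minimal sketch:

```python
from collections import defaultdict

class TemporalFilter:
    """Flag a node as malicious only after `required` consecutive
    per-window Sybil detections (two windows, per the decision logic above)."""
    def __init__(self, required=2):
        self.required = required
        self.streak = defaultdict(int)

    def update(self, node_id, window_is_sybil):
        """Feed one window's prediction; return True once isolation is confirmed."""
        if window_is_sybil:
            self.streak[node_id] += 1
        else:
            self.streak[node_id] = 0  # any clean window resets the streak
        return self.streak[node_id] >= self.required

tf = TemporalFilter()
print(tf.update("ecg_01", True))   # False -- first flagged window, no isolation yet
print(tf.update("ecg_01", True))   # True  -- second consecutive window: isolate
```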

H. Self-Healing Enforcement Mechanism

Once a node is identified as malicious, the gateway enforces isolation autonomously:

Enforcement Action Description Latency
Packet Filtering Dropping incoming packets from quarantined node <1ms
Node Quarantine Blocking further participation in network <1ms
State Persistence Maintaining enforcement over time Continuous
Traffic Blocking Zero recovery time from decision to blocking <1ms

Deployment Model: Enforcement occurs at the gateway only and requires no sensor node firmware modification, enabling full backward compatibility with existing WBAN deployments.
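The enforcement actions in the table, including the disk persistence described in the failure-handling section, can be sketched as a quarantine set with an O(1) per-packet check. The JSON file format and path are assumptions of this sketch:

```python
import json
import os

class Quarantine:
    """Gateway-side quarantine list: per-packet allow check plus state
    persistence so isolation survives a gateway restart."""
    def __init__(self, path="quarantine.json"):  # hypothetical path
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.blocked = set(json.load(f))  # reload on startup
        else:
            self.blocked = set()

    def isolate(self, node_id):
        """Quarantine a node and persist the list to disk."""
        self.blocked.add(node_id)
        with open(self.path, "w") as f:
            json.dump(sorted(self.blocked), f)

    def allow(self, node_id):
        """Packet filtering: called for every datagram; set lookup is O(1)."""
        return node_id not in self.blocked
```

On restart, reconstructing the object from the same path restores the quarantine list, matching the "isolation continues seamlessly" behavior described above.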

Implementation Advantages

  • Real-World Validation: Dataset collected from physical ESP32 nodes, not synthetic simulation
  • Cross-Layer Analysis: Network + physical-layer features provide robust detection
  • Gateway-Only Deployment: No modifications to sensor nodes required
  • Automatic Isolation: Self-healing mechanism requires no manual intervention
  • Production-Ready: Validated on actual WBAN hardware and attack scenarios
  • Privacy-Preserving: All processing occurs locally at gateway; data never leaves WBAN
  • Low Latency: Detection within 2-3 seconds, enforcement within 1ms