Hierarchical Hybrid Framework for Intrusion Detection

Team

Jegatheesan S.

E/19/174

e19174@eng.pdn.ac.lk

Eniyavan T.

E/20/099

e20099@eng.pdn.ac.lk

Vithushan E.T.L.

E/20/416

e20416@eng.pdn.ac.lk

Supervisor

Mr. Biswajith A.K. Dissanayake

biswajithd@eng.pdn.ac.lk

Abstract

The rapid evolution of cyber-attacks, from zero-day exploits and enhanced DDoS attacks to sophisticated application-layer attacks such as SQL injection, XSS, and command injection, exposes considerable drawbacks of traditional intrusion detection systems. In this work, we propose a Hierarchical Hybrid Framework for server-side intrusion detection that addresses these issues with a fully parallel two-stream approach. Our approach analyses 22 network-layer features and 19 application-layer features (41 total), selected using Random Forest importance, SHAP values, mutual information, and correlation. The framework includes a three-layer cascade:

L1 Binary Classifier

LightGBM for Network Stream (98% accuracy), XGBoost for Application Stream (84.55% accuracy)

L2 Known/Unknown

Threshold-based classifier (confidence ≥ 0.7, F1 = 0.87, false-alarm rate < 5%)

L3 Multi-Class

XGBoost with 97% accuracy (Network) and 74.23% (Application)

Evaluated on the balanced CSE-CIC-IDS2018 dataset (4,122,352 benign and 2,748,235 malicious packets), the pipeline achieved an overall accuracy of 97.8% with a macro F1-Score of 0.91. In real-time tests the model achieved 100% attack detection with a false positive rate below 4%.

Keywords: Intrusion Detection, Zero-Day Detection, Hybrid ML, Parallel Processing, Anomaly Detection, Enterprise Security

Problem Statement

Despite major improvements in machine-learning-based IDS, most present-day solutions handle network and application layers sequentially or in isolation. This enables advanced application-layer attacks (obfuscated SQL Injection, Cross-Site Scripting, Command Injection, and Business Logic Abuse) to go undetected because they appear perfectly normal at the network level.

Sequential Processing

Network and application layers analysed in isolation, allowing cross-layer attacks to evade detection.

Poor Zero-Day Handling

Supervised models fail against unseen threats; unsupervised ones suffer from high false positives.

Lack of Cross-Layer Fusion

No robust meta-learner to combine outputs from network and application streams.

High Computational Overhead

Advanced hybrid architectures unsuitable for resource-constrained environments like IoT gateways.

Methodology

The proposed framework operates through five interconnected components built on a fully parallel dual-stream architecture. Two separate streams handle network-layer and application-layer inputs simultaneously, enabling real-time cross-layer intrusion detection.

01 Parallel Dual-Stream Input Processing

Two separate streams process network (22 features) and application (19 features) data simultaneously, reducing latency while spotting cross-layer threats.
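The parallel hand-off described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `network_stream`, `application_stream`, and `analyse_flow` are hypothetical names, and the stream bodies are placeholders for the real Layer 1 to Layer 3 models.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two stream pipelines. In the real system each
# would run its own Layer 1-3 models on the features it receives.
def network_stream(features):
    return {"stream": "network", "n_features": len(features)}

def application_stream(features):
    return {"stream": "application", "n_features": len(features)}

def analyse_flow(net_features, app_features):
    """Submit both streams at once and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        net_future = pool.submit(network_stream, net_features)
        app_future = pool.submit(application_stream, app_features)
        return net_future.result(), app_future.result()

# One flow, split into 22 network and 19 application feature values.
net_result, app_result = analyse_flow([0.0] * 22, [0.0] * 19)
```

Threads are used here only for brevity; whether threads or separate processes suit the actual models better depends on how their inference code is implemented.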

02 Layer 1: Anomaly Detector (Binary Classifier)

A supervised binary classifier that distinguishes benign from attack traffic. LightGBM is used for the Network Stream (98% accuracy) and XGBoost for the Application Stream (84.55% accuracy). An initial unsupervised approach (Isolation Forest, Autoencoder, One-Class SVM) was abandoned due to unacceptably poor results (accuracy as low as 3%).

03 Layer 2: Known/Unknown Identifier

A threshold-based decision mechanism that reuses Layer 1's confidence scores and therefore adds negligible overhead. Flows with high confidence (≥ 0.7) are routed to Layer 3 as "known attacks"; flows with low confidence are flagged as potential zero-day threats and raise an immediate alert. This mechanism replaced unsupervised models whose precision was critically low (0.26 to 0.27).
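The routing rule can be sketched in a few lines. The 0.7 threshold comes from the text; the function name `route_flow`, the label strings, and the assumption that benign flows bypass Layer 2 entirely are illustrative choices, not details confirmed by the project.

```python
CONFIDENCE_THRESHOLD = 0.7  # threshold value reported in the text

def route_flow(is_attack, confidence, threshold=CONFIDENCE_THRESHOLD):
    """Route one flow using Layer 1's output.

    is_attack:  Layer 1's binary decision for the flow.
    confidence: Layer 1's confidence in that decision, assumed to lie in [0, 1].
    """
    if not is_attack:
        return "benign"              # assumption: benign flows skip Layers 2 and 3
    if confidence >= threshold:
        return "known_attack"        # forwarded to the Layer 3 classifier
    return "potential_zero_day"      # raised as an immediate alert

decisions = [
    route_flow(True, 0.93),
    route_flow(True, 0.55),
    route_flow(False, 0.99),
]
```

Because the rule only compares a score that Layer 1 has already produced, it needs no extra model and no extra training.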

04 Layer 3: Multi-Class Classifier

XGBoost classifies known attacks into 14 distinct attack families, achieving 97% accuracy and a 0.92 macro F1-score on the Network Stream. Macro F1 averages the per-class F1-scores with equal weight, so minority attack classes such as Slowloris and SQL Injection count as much as high-volume ones.
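The difference between macro and weighted F1 is easiest to see with a small worked example. The per-class counts below are hypothetical and chosen only to show the effect; they are not taken from the project's results.

```python
def f1(tp, fp, fn):
    """F1-score from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical (tp, fp, fn) counts: one large class, one small class.
counts = {
    "DDoS": (9500, 100, 100),
    "Slowloris": (30, 20, 20),
}

per_class = {name: f1(*c) for name, c in counts.items()}
support = {name: tp + fn for name, (tp, fp, fn) in counts.items()}

# Macro F1: every class counts equally.
macro_f1 = sum(per_class.values()) / len(per_class)
# Weighted F1: classes count in proportion to their number of true samples.
weighted_f1 = sum(per_class[n] * support[n] for n in counts) / sum(support.values())
```

In this example the weak minority class pulls macro F1 well below weighted F1, which is why macro F1 is the more demanding metric when attack families are unevenly represented.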

05 Meta-Learner Fusion

Results from both streams are combined using a meta-learner that resolves conflicting decisions between network and application streams, producing final alerts.

Feature Selection

41 features were selected using a rank aggregation of four methods: Correlation Coefficients, Mutual Information, Random Forest importance, and Recursive Feature Elimination (SVM). An initial 20-feature set resulted in poor performance (F1 = 0.08), prompting expansion to 41 features that improved performance by 10 to 15%.
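One simple way to aggregate rankings is to average each feature's rank across the methods. The sketch below assumes mean-rank aggregation, which the source does not confirm as the exact scheme used; the function names and all scores are hypothetical.

```python
def ranks(scores):
    """Map feature -> rank (1 = best) for one method, where higher score = better."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}

def aggregate(methods, top_k):
    """Average each feature's rank across methods and keep the top_k features."""
    per_method = [ranks(scores) for scores in methods.values()]
    features = list(per_method[0])
    mean_rank = {f: sum(r[f] for r in per_method) / len(per_method) for f in features}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(features, key=lambda f: (mean_rank[f], f))[:top_k]

# Hypothetical scores for illustration only. Correlation is assumed to be an
# absolute value, and the RFE ranking is assumed inverted so higher = better.
methods = {
    "correlation": {"Flow Duration": 0.62, "Dst Port": 0.40, "Protocol": 0.10},
    "mutual_information": {"Flow Duration": 0.30, "Dst Port": 0.45, "Protocol": 0.05},
    "random_forest": {"Flow Duration": 0.50, "Dst Port": 0.35, "Protocol": 0.15},
    "rfe_svm": {"Flow Duration": 3.0, "Dst Port": 2.0, "Protocol": 1.0},
}
selected = aggregate(methods, top_k=2)
```

Working on ranks rather than raw scores avoids having to put the four methods on a common scale.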

Network Layer (22 features)

Dst Port, Protocol, Flow Duration, Flow IAT Mean/Max/Min, Fwd IAT Max/Min/Tot, ECE Flag Cnt, Flow Pkts/s, Fwd/Bwd Pkts/s, Tot Fwd Pkts, Subflow Fwd Pkts, Fwd/Bwd Header Len, Init Fwd/Bwd Win Byts, PSH/RST/ACK Flag Cnt

Application Layer (19 features)

Pkt Size Avg, Pkt Len Mean/Max/Min/Std/Var, Fwd Pkt Len Max/Min/Mean/Std, Bwd Pkt Len Max/Min/Mean/Std, Fwd/Bwd Seg Size Min/Avg, TotLen Fwd Pkts, Subflow Fwd Byts

Feature Validation

Log-transformed box plots of the five most discriminative features confirmed statistically significant separation between benign traffic and every attack category.

Results and Analysis

NET Layer 1: Anomaly Detector

After the unsupervised approach failed catastrophically, Layer 1 was re-engineered as a supervised binary classifier. Nine algorithms were evaluated; LightGBM emerged as the best performer.

Model	Precision	Recall	F1-Score	Accuracy	ROC-AUC
Logistic Regression	0.81	0.94	0.87	0.89	0.9228
Decision Tree	0.96	0.95	0.96	0.97	0.9701
Random Forest	0.98	0.95	0.96	0.97	0.9870
AdaBoost	0.97	0.94	0.95	0.96	0.9759
SVM	0.88	0.94	0.91	0.92	0.9627
Gradient Boosting	0.99	0.94	0.97	0.98	0.9809
MLP	1.00	0.94	0.97	0.98	0.9852
XGBoost	1.00	0.94	0.97	0.98	0.9900
LightGBM (Selected)	1.00	0.94	0.97	0.98	0.9905

NET Layer 2: Known/Unknown Identifier

The threshold-based approach using Layer 1's confidence scores dramatically outperformed unsupervised methods.

Metric	Isolation Forest	One-Class SVM	Threshold-Based (Final)
Unknown Recall	0.73	0.78	0.91
Unknown Precision	0.26	0.27	0.84
F1-Score (Unknown)	0.38	0.40	0.87
False Alarm Rate	Very High	Very High	< 5%
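The F1 row follows directly from the precision and recall rows, since F1 is their harmonic mean. The short check below recomputes it from the reported values; only the helper name `f1_score` is introduced here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) for the unknown class, as reported in the table.
reported = {
    "Isolation Forest": (0.26, 0.73),
    "One-Class SVM": (0.27, 0.78),
    "Threshold-Based": (0.84, 0.91),
}
f1 = {name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()}
```

The recomputed values match the table to two decimal places, which shows that the low F1 of the unsupervised methods is driven by their precision rather than their recall.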

NET Layer 3: Multi-Class Classifier

XGBoost achieved the highest macro F1-score (0.92) across all 14 attack families.

Model	Accuracy	Macro F1	Weighted F1
Logistic Regression	0.94	0.70	0.94
Decision Tree	0.97	0.91	0.97
Random Forest	0.97	0.91	0.97
XGBoost (Selected)	0.97	0.92	0.97
MLP	0.96	0.78	0.96
LightGBM	0.28	0.19	0.22

APP Application Stream Results

Layer	Best Model	Accuracy	Recall	F1-Score
Layer 1 (Anomaly)	XGBoost	84.55%	0.962	0.833
Layer 2 (Known/Unknown)	Threshold-Based	83.38%	0.588 (unknown)	0.909
Layer 3 (Multi-Class)	XGBoost	74.23%	-	-

LIVE Real-Time Evaluation

A live test environment was created using attacker and victim VMs. The complete pipeline was tested with real attack traffic generated using Slowloris, LOIC/HOIC, SQLMap, Hydra, and other tools.

Metric	Value
Known Attack Detection	97%
Zero-Day Recall	91%
Avg Latency per Flow	68 ms
False Positive Rate	< 4%

Conclusion

This study introduces a Hierarchical Hybrid Framework that combines threshold-driven and supervised techniques in a layered design: the two streams run in parallel, while the layers within each stream run sequentially. The Network Stream achieves strong results: 98% accuracy at Layer 1, reliable zero-day detection at Layer 2 (F1 = 0.87), and 97% multi-class classification accuracy at Layer 3. In real-time testing, the system detected 100% of known attacks and 91% of simulated zero-day threats, with latency averaging 68 ms per flow.

The Application Stream shows room for improvement (~84% anomaly accuracy, ~74% multi-class accuracy) due to the greater complexity of payload-related features. Future work includes addressing class imbalance in application flows, incorporating continuous learning, testing under uneven real-world loads, and exploring advanced sequence-based networks for payload content analysis.

Links

Project Repository

GitHub

Project Page

GitHub Pages

Department of Computer Engineering

University of Peradeniya

Faculty of Engineering
