Team
Supervisor
Mr. Biswajith A.K. Dissanayake
biswajithd@eng.pdn.ac.lkContents
Abstract
The rapid evolution of cyber-attacks such as zero-day exploits and sophisticated application-layer attacks like SQL injection, XSS, command injection, and enhanced DDoS attacks exposes considerable drawbacks of traditional intrusion detection systems. In this work, we propose a Hierarchical Hybrid Framework for server-side intrusion detection that overcomes these issues with a fully parallel two-stream approach. Our approach analyses 22 network-layer features and 19 application-layer features (41 total), selected using Random Forest importance, SHAP values, mutual information, and correlation. The framework includes a three-layer cascade:
LightGBM for Network Stream (98% accuracy), XGBoost for Application Stream (84.55% accuracy)
Threshold-based classifier (confidence ≥ 0.7, F1 = 0.87, false-alarm rate < 5%)
XGBoost with 97% accuracy (Network) and 74.23% (Application)
Evaluated on the balanced CSE-CIC-IDS2018 dataset (4,122,352 benign and 2,748,235 malicious packets), the pipeline achieved an overall accuracy of 97.8% with a macro F1-Score of 0.91. In real-time tests the model achieved 100% attack detection with a false positive rate below 4%.
Problem Statement
Despite major improvements in machine-learning-based IDS, most present-day solutions handle network and application layers sequentially or in isolation. This enables advanced application-layer attacks (obfuscated SQL Injection, Cross-Site Scripting, Command Injection, and Business Logic Abuse) to go undetected because they appear perfectly normal at the network level.
Sequential Processing
Network and application layers analysed in isolation, allowing cross-layer attacks to evade detection.
Poor Zero-Day Handling
Supervised models fail against unseen threats; unsupervised ones suffer from high false positives.
Lack of Cross-Layer Fusion
No robust meta-learner to combine outputs from network and application streams.
High Computational Overhead
Advanced hybrid architectures unsuitable for resource-constrained environments like IoT gateways.
Methodology
The proposed framework operates through four interconnected components using a fully parallel dual-stream architecture. Two separate streams handle network-layer and application-layer inputs simultaneously, enabling real-time cross-layer intrusion detection.
01 Parallel Dual-Stream Input Processing
Two separate streams process network (22 features) and application (19 features) data simultaneously, reducing latency while spotting cross-layer threats.
02 Layer 1: Anomaly Detector (Binary Classifier)
A supervised binary classifier that distinguishes benign from attack traffic. LightGBM is used for the Network Stream (98% accuracy) and XGBoost for the Application Stream (84.55% accuracy). An initial unsupervised approach (Isolation Forest, Autoencoder, One-Class SVM) was abandoned due to unacceptably poor results (accuracy as low as 3%).
03 Layer 2: Known/Unknown Identifier
A zero-overhead threshold-based decision mechanism using Layer 1's confidence scores. High confidence (≥ 0.7) routes flows as "known attacks" to Layer 3; low confidence flags them as potential zero-day threats for immediate alert. This replaced unsupervised models that had critically low precision (0.26 to 0.27).
04 Layer 3: Multi-Class Classifier
XGBoost classifies known attacks into 14 distinct attack families, achieving 97% accuracy and 0.92 macro F1-score on the Network Stream. The macro F1-score ensures fair treatment of minority attack classes like Slowloris and SQL Injection.
05 Meta-Learner Fusion
Results from both streams are combined using a meta-learner that resolves conflicting decisions between network and application streams, producing final alerts.
Feature Selection
41 features were selected using a rank aggregation of four methods: Correlation Coefficients, Mutual Information, Random Forest importance, and Recursive Feature Elimination (SVM). An initial 20-feature set resulted in poor performance (F1 = 0.08), prompting expansion to 41 features that improved performance by 10 to 15%.
Network Layer (22 features)
Dst Port, Protocol, Flow Duration, Flow IAT Mean/Max/Min, Fwd IAT Max/Min/Tot, ECE Flag Cnt, Flow Pkts/s, Fwd/Bwd Pkts/s, Tot Fwd Pkts, Subflow Fwd Pkts, Fwd/Bwd Header Len, Init Fwd/Bwd Win Byts, PSH/RST/ACK Flag Cnt
Application Layer (19 features)
Pkt Size Avg, Pkt Len Mean/Max/Min/Std/Var, Fwd Pkt Len Max/Min/Mean/Std, Bwd Pkt Len Max/Min/Mean/Std, Fwd/Bwd Seg Size Min/Avg, TotLen Fwd Pkts, Subflow Fwd Byts
Feature Validation
Log-transformed box plots of the five most discriminative features confirmed statistically significant separation between benign traffic and every attack category.
Results and Analysis
NET Layer 1: Anomaly Detector
After the unsupervised approach failed catastrophically, Layer 1 was re-engineered as a supervised binary classifier. Nine algorithms were evaluated; LightGBM emerged as the best performer.
| Model | Precision | Recall | F1-Score | Accuracy | ROC-AUC |
|---|---|---|---|---|---|
| Logistic Regression | 0.81 | 0.94 | 0.87 | 0.89 | 0.9228 |
| Decision Tree | 0.96 | 0.95 | 0.96 | 0.97 | 0.9701 |
| Random Forest | 0.98 | 0.95 | 0.96 | 0.97 | 0.9870 |
| AdaBoost | 0.97 | 0.94 | 0.95 | 0.96 | 0.9759 |
| SVM | 0.88 | 0.94 | 0.91 | 0.92 | 0.9627 |
| Gradient Boosting | 0.99 | 0.94 | 0.97 | 0.98 | 0.9809 |
| MLP | 1.00 | 0.94 | 0.97 | 0.98 | 0.9852 |
| XGBoost | 1.00 | 0.94 | 0.97 | 0.98 | 0.9900 |
| LightGBM (Selected) | 1.00 | 0.94 | 0.97 | 0.98 | 0.9905 |
NET Layer 2: Known/Unknown Identifier
The threshold-based approach using Layer 1's confidence scores dramatically outperformed unsupervised methods.
| Metric | Isolation Forest | One-Class SVM | Threshold-Based (Final) |
|---|---|---|---|
| Unknown Recall | 0.73 | 0.78 | 0.91 |
| Unknown Precision | 0.26 | 0.27 | 0.84 |
| F1-Score (Unknown) | 0.38 | 0.40 | 0.87 |
| False Alarm Rate | Very High | Very High | < 5% |
NET Layer 3: Multi-Class Classifier
XGBoost achieved the highest macro F1-score (0.92) across all 14 attack families.
| Model | Accuracy | Macro F1 | Weighted F1 |
|---|---|---|---|
| Logistic Regression | 0.94 | 0.70 | 0.94 |
| Decision Tree | 0.97 | 0.91 | 0.97 |
| Random Forest | 0.97 | 0.91 | 0.97 |
| XGBoost (Selected) | 0.97 | 0.92 | 0.97 |
| MLP | 0.96 | 0.78 | 0.96 |
| LightGBM | 0.28 | 0.19 | 0.22 |
APP Application Stream Results
| Layer | Best Model | Accuracy | Recall | F1-Score |
|---|---|---|---|---|
| Layer 1 (Anomaly) | XGBoost | 84.55% | 0.962 | 0.833 |
| Layer 2 (Known/Unknown) | Threshold-Based | 83.38% | 0.588 (unknown) | 0.909 |
| Layer 3 (Multi-Class) | XGBoost | 74.23% | - | - |
LIVE Real-Time Evaluation
A live test environment was created using attacker and victim VMs. The complete pipeline was tested with real attack traffic generated using Slowloris, LOIC/HOIC, SQLMap, Hydra, and other tools.
97%
Known Attack Detection
91%
Zero-Day Recall
68ms
Avg Latency per Flow
< 4%
False Positive Rate
Conclusion
This study introduces a Hierarchical Hybrid Framework combining threshold-driven approaches with supervised techniques through a layered design where streams operate both sequentially and simultaneously. The Network Stream achieves strong results: 98% accuracy at Layer 1, reliable zero-day detection at Layer 2 (F1 = 0.87), and 97% multi-class classification at Layer 3. In real-time testing, the system detected 100% of known attacks and 91% of simulated zero-day threats, with latency averaging 68ms per flow.
The Application Stream shows room for improvement (~84% anomaly accuracy, ~74% multi-class accuracy) due to greater complexity of payload-related features. Future work includes addressing class imbalance in application flows, incorporating continuous learning, testing under real-world uneven loads, and exploring advanced sequence-based networks for payload content analysis.