Final Year Project / Group 13

A Hierarchical Hybrid Framework for Intrusion Detection in Network and Application Layers

Department of Computer Engineering · University of Peradeniya

Team

Jegatheesan S.

E/19/174

e19174@eng.pdn.ac.lk

Eniyavan T.

E/20/099

e20099@eng.pdn.ac.lk

Vithushan E.T.L.

E/20/416

e20416@eng.pdn.ac.lk

Supervisor

Mr. Biswajith A.K. Dissanayake

biswajithd@eng.pdn.ac.lk

Abstract

The rapid evolution of cyber-attacks such as zero-day exploits and sophisticated application-layer attacks like SQL injection, XSS, command injection, and enhanced DDoS attacks exposes considerable drawbacks of traditional intrusion detection systems. In this work, we propose a Hierarchical Hybrid Framework for server-side intrusion detection that overcomes these issues with a fully parallel two-stream approach. Our approach analyses 22 network-layer features and 19 application-layer features (41 total), selected using Random Forest importance, SHAP values, mutual information, and correlation. The framework includes a three-layer cascade:

L1 Binary Classifier

LightGBM for Network Stream (98% accuracy), XGBoost for Application Stream (84.55% accuracy)

L2 Known/Unknown

Threshold-based classifier (confidence ≥ 0.7, F1 = 0.87, false-alarm rate < 5%)

L3 Multi-Class

XGBoost with 97% accuracy (Network) and 74.23% (Application)

Evaluated on the balanced CSE-CIC-IDS2018 dataset (4,122,352 benign and 2,748,235 malicious packets), the pipeline achieved an overall accuracy of 97.8% with a macro F1-Score of 0.91. In real-time tests the model achieved 100% attack detection with a false positive rate below 4%.

Keywords: Intrusion Detection · Zero-Day Detection · Hybrid ML · Parallel Processing · Anomaly Detection · Enterprise Security

Problem Statement

Despite major improvements in machine-learning-based IDS, most present-day solutions handle network and application layers sequentially or in isolation. This enables advanced application-layer attacks (obfuscated SQL Injection, Cross-Site Scripting, Command Injection, and Business Logic Abuse) to go undetected because they appear perfectly normal at the network level.

Sequential Processing

Network and application layers analysed in isolation, allowing cross-layer attacks to evade detection.

Poor Zero-Day Handling

Supervised models fail against unseen threats; unsupervised ones suffer from high false positives.

Lack of Cross-Layer Fusion

No robust meta-learner to combine outputs from network and application streams.

High Computational Overhead

Advanced hybrid architectures unsuitable for resource-constrained environments like IoT gateways.

Methodology

The proposed framework operates through five interconnected components arranged in a fully parallel dual-stream architecture. Two separate streams handle network-layer and application-layer inputs simultaneously, enabling real-time cross-layer intrusion detection.
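The dual-stream idea can be illustrated with a minimal sketch. The two scoring functions are hypothetical placeholders, not the project's trained LightGBM/XGBoost models, and the feature keys and cut-off values are invented for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the per-stream cascades (assumption: the real
# models and their feature pipelines are not reproduced here).
def score_network(features):
    # Placeholder rule: treat a very high packet rate as suspicious.
    prob = 0.9 if features["flow_pkts_s"] > 1000 else 0.1
    return {"stream": "network", "attack_prob": prob}

def score_application(features):
    # Placeholder rule: treat very large packets as suspicious.
    prob = 0.8 if features["pkt_len_max"] > 1400 else 0.2
    return {"stream": "application", "attack_prob": prob}

def analyse_flow(net_features, app_features):
    """Submit both stream scorers at once and collect both verdicts."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        net_future = pool.submit(score_network, net_features)
        app_future = pool.submit(score_application, app_features)
        return net_future.result(), app_future.result()

net, app = analyse_flow({"flow_pkts_s": 2500.0}, {"pkt_len_max": 512})
print(net, app)
```

Threads are used here only for brevity; a deployment could equally run the two streams as separate processes or services.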

Figure: Hierarchical Hybrid IDS Architecture

01 Parallel Dual-Stream Input Processing

Two separate streams process network (22 features) and application (19 features) data simultaneously, reducing latency while spotting cross-layer threats.

02 Layer 1: Anomaly Detector (Binary Classifier)

A supervised binary classifier that distinguishes benign from attack traffic. LightGBM is used for the Network Stream (98% accuracy) and XGBoost for the Application Stream (84.55% accuracy). An initial unsupervised approach (Isolation Forest, Autoencoder, One-Class SVM) was abandoned due to unacceptably poor results (accuracy as low as 3%).

03 Layer 2: Known/Unknown Identifier

A threshold-based decision mechanism that reuses Layer 1's confidence scores and therefore adds negligible overhead. High confidence (≥ 0.7) routes flows as "known attacks" to Layer 3; lower confidence flags them as potential zero-day threats for immediate alert. This replaced unsupervised models that had critically low precision (0.26 to 0.27).
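The routing rule can be sketched in a few lines. Only the 0.7 threshold comes from the text; the function name, the label strings, and the assumption that benign flows bypass Layer 2 entirely are illustrative.

```python
KNOWN_THRESHOLD = 0.7  # confidence threshold stated in the text

def route_flow(is_attack: bool, confidence: float) -> str:
    """Route a Layer 1 verdict (hypothetical interface).

    is_attack  -- Layer 1's binary decision
    confidence -- Layer 1's confidence in that decision, in [0, 1]
    """
    if not is_attack:
        return "benign"             # assumption: benign flows skip Layer 2
    if confidence >= KNOWN_THRESHOLD:
        return "known_attack"       # forwarded to the Layer 3 classifier
    return "possible_zero_day"      # raised as an immediate alert

print(route_flow(True, 0.93))  # known_attack
print(route_flow(True, 0.55))  # possible_zero_day
```

Because the rule only compares a score that Layer 1 has already computed, it adds no extra model inference to the pipeline.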

04 Layer 3: Multi-Class Classifier

XGBoost classifies known attacks into 14 distinct attack families, achieving 97% accuracy and 0.92 macro F1-score on the Network Stream. The macro F1-score ensures fair treatment of minority attack classes like Slowloris and SQL Injection.
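A small worked example shows why macro F1 protects minority classes. The per-class scores and flow counts below are hypothetical and are not taken from the project's results.

```python
def macro_f1(per_class_f1):
    """Unweighted mean: every class counts equally."""
    return sum(per_class_f1.values()) / len(per_class_f1)

def weighted_f1(per_class_f1, support):
    """Mean weighted by the number of samples in each class."""
    total = sum(support.values())
    return sum(per_class_f1[c] * support[c] for c in per_class_f1) / total

# Hypothetical per-class F1 scores and class sizes (illustration only).
f1 = {"DDoS": 0.99, "Slowloris": 0.60, "SQL Injection": 0.30}
support = {"DDoS": 9800, "Slowloris": 150, "SQL Injection": 50}

macro = macro_f1(f1)
weighted = weighted_f1(f1, support)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

In this toy case the weighted score stays near 0.98 because the dominant class hides the weak minority classes, while the macro score drops to 0.63 and exposes them.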

05 Meta-Learner Fusion

Results from both streams are combined using a meta-learner that resolves conflicting decisions between network and application streams, producing final alerts.
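The text does not specify the form of the meta-learner, so the following is only one possible sketch: a logistic combination of the two streams' attack probabilities. The weights, bias, and alert threshold are hypothetical values chosen for illustration; in practice they would be fitted on held-out predictions from both streams.

```python
import math

# Hypothetical parameters (assumption: not the project's fitted values).
W_NET, W_APP, BIAS = 3.0, 3.0, -2.5

def fuse(net_prob: float, app_prob: float, threshold: float = 0.5):
    """Combine per-stream attack probabilities into one fused decision."""
    z = W_NET * net_prob + W_APP * app_prob + BIAS
    fused = 1.0 / (1.0 + math.exp(-z))  # logistic squashing to (0, 1)
    return fused, fused >= threshold

# Conflict case: traffic looks normal at the network layer,
# but the application stream is highly suspicious.
score, alert = fuse(net_prob=0.05, app_prob=0.95)
print(f"fused score = {score:.3f}, alert = {alert}")
```

With these illustrative weights a strong signal from either stream is enough to raise an alert, which matches the goal of catching application-layer attacks that look benign at the network level.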

Feature Selection

41 features were selected using a rank aggregation of four methods: Correlation Coefficients, Mutual Information, Random Forest importance, and Recursive Feature Elimination (SVM). An initial 20-feature set resulted in poor performance (F1 = 0.08), prompting expansion to 41 features that improved performance by 10 to 15%.
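Rank aggregation can be sketched as a mean-rank (Borda-style) vote across the four methods. The exact aggregation rule used in the project is not stated, so mean rank is an assumption, and the scores below are invented; the RFE column is expressed as a "higher is better" score purely so that all four methods share one convention.

```python
def ranks(scores):
    """Map feature -> rank (1 = most important) for one method's scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feat: pos for pos, feat in enumerate(ordered, start=1)}

def aggregate(score_tables, top_k):
    """Average each feature's rank across methods and keep the best top_k."""
    per_method = [ranks(table) for table in score_tables.values()]
    features = list(per_method[0])
    mean_rank = {f: sum(r[f] for r in per_method) / len(per_method)
                 for f in features}
    # Sort by mean rank, breaking ties alphabetically for determinism.
    return sorted(mean_rank, key=lambda f: (mean_rank[f], f))[:top_k]

# Hypothetical importance scores for three candidate features.
score_tables = {
    "correlation":   {"Flow Duration": 0.62, "Dst Port": 0.40, "Protocol": 0.05},
    "mutual_info":   {"Flow Duration": 0.30, "Dst Port": 0.45, "Protocol": 0.02},
    "rf_importance": {"Flow Duration": 0.20, "Dst Port": 0.12, "Protocol": 0.01},
    "rfe_svm":       {"Flow Duration": 3.0,  "Dst Port": 2.0,  "Protocol": 1.0},
}

selected = aggregate(score_tables, top_k=2)
print(selected)
```

Aggregating ranks rather than raw scores avoids having to normalise four methods whose outputs are on different scales.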

Network Layer (22 features)

Dst Port, Protocol, Flow Duration, Flow IAT Mean/Max/Min, Fwd IAT Max/Min/Tot, ECE Flag Cnt, Flow Pkts/s, Fwd/Bwd Pkts/s, Tot Fwd Pkts, Subflow Fwd Pkts, Fwd/Bwd Header Len, Init Fwd/Bwd Win Byts, PSH/RST/ACK Flag Cnt

Application Layer (19 features)

Pkt Size Avg, Pkt Len Mean/Max/Min/Std/Var, Fwd Pkt Len Max/Min/Mean/Std, Bwd Pkt Len Max/Min/Mean/Std, Fwd/Bwd Seg Size Min/Avg, TotLen Fwd Pkts, Subflow Fwd Byts

Feature Validation

Log-transformed box plots of the five most discriminative features confirmed statistically significant separation between benign traffic and every attack category.

Figures: distribution plots for Flow Duration, Flow Pkts/s, Flow IAT Mean, Fwd Pkts/s, and ACK Flag Count.

Results and Analysis

NET Layer 1: Anomaly Detector

After the unsupervised approach failed catastrophically, Layer 1 was re-engineered as a supervised binary classifier. Nine algorithms were evaluated; LightGBM emerged as the best performer.

Model                  Precision  Recall  F1-Score  Accuracy  ROC-AUC
Logistic Regression    0.81       0.94    0.87      0.89      0.9228
Decision Tree          0.96       0.95    0.96      0.97      0.9701
Random Forest          0.98       0.95    0.96      0.97      0.9870
AdaBoost               0.97       0.94    0.95      0.96      0.9759
SVM                    0.88       0.94    0.91      0.92      0.9627
Gradient Boosting      0.99       0.94    0.97      0.98      0.9809
MLP                    1.00       0.94    0.97      0.98      0.9852
XGBoost                1.00       0.94    0.97      0.98      0.9900
LightGBM (Selected)    1.00       0.94    0.97      0.98      0.9905

NET Layer 2: Known/Unknown Identifier

The threshold-based approach using Layer 1's confidence scores dramatically outperformed unsupervised methods.

Metric                 Isolation Forest  One-Class SVM  Threshold-Based (Final)
Unknown Recall         0.73              0.78           0.91
Unknown Precision      0.26              0.27           0.84
F1-Score (Unknown)     0.38              0.40           0.87
False Alarm Rate       Very High         Very High      < 5%
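As a consistency check, the F1 scores in the Layer 2 comparison can be recomputed from the reported precision and recall using the standard harmonic-mean definition. The numbers are those reported above; the helper function is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) for the "unknown" class, per method.
reported = {
    "Isolation Forest": (0.26, 0.73, 0.38),
    "One-Class SVM":    (0.27, 0.78, 0.40),
    "Threshold-Based":  (0.84, 0.91, 0.87),
}

recomputed = {name: round(f1_score(p, r), 2)
              for name, (p, r, _) in reported.items()}
for name, (_, _, f1) in reported.items():
    print(f"{name}: reported {f1:.2f}, recomputed {recomputed[name]:.2f}")
```

All three recomputed values agree with the reported F1 scores to two decimal places.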

NET Layer 3: Multi-Class Classifier

XGBoost achieved the highest macro F1-score (0.92) across all 14 attack families.

Model                  Accuracy  Macro F1  Weighted F1
Logistic Regression    0.94      0.70      0.94
Decision Tree          0.97      0.91      0.97
Random Forest          0.97      0.91      0.97
XGBoost (Selected)     0.97      0.92      0.97
MLP                    0.96      0.78      0.96
LightGBM               0.28      0.19      0.22

APP Application Stream Results

Layer                    Best Model       Accuracy  Recall           F1-Score
Layer 1 (Anomaly)        XGBoost          84.55%    0.962            0.833
Layer 2 (Known/Unknown)  Threshold-Based  83.38%    0.588 (unknown)  0.909
Layer 3 (Multi-Class)    XGBoost          74.23%    -                -

LIVE Real-Time Evaluation

A live test environment was created using attacker and victim VMs. The complete pipeline was tested with real attack traffic generated using Slowloris, LOIC/HOIC, SQLMap, Hydra, and other tools.

Figure: Real-Time Deployment Architecture

Known Attack Detection: 97%
Zero-Day Recall: 91%
Avg Latency per Flow: 68 ms
False Positive Rate: < 4%

Conclusion

This study introduces a Hierarchical Hybrid Framework that combines threshold-driven and supervised techniques in a layered design: the network and application streams run in parallel, while the three layers within each stream run in sequence. The Network Stream achieves strong results: 98% accuracy at Layer 1, reliable zero-day detection at Layer 2 (F1 = 0.87), and 97% multi-class classification accuracy at Layer 3. In real-time testing, the system detected 100% of known attacks and 91% of simulated zero-day threats, with latency averaging 68 ms per flow.

The Application Stream shows room for improvement (~84% anomaly accuracy, ~74% multi-class accuracy), which we attribute to the greater complexity of payload-related features. Future work includes addressing class imbalance in application flows, incorporating continuous learning, testing under uneven real-world loads, and exploring advanced sequence-based networks for payload content analysis.