Predicting Cybersecurity Attacks Using Compound Analysis with Classification Algorithms
Table of Contents
- Abstract
- Introduction
- Problem Statement
- Aim and Objectives
- Related works
- Methodology
- Experiment Setup and Implementation
- Results and Analysis
- Conclusion
- Publications
- Links
Abstract
The increasing sophistication of cyber threats targeting web applications poses significant challenges for traditional security mechanisms. Identifying the vulnerable components of a web application during a cyber attack is crucial for mitigating security risks and enhancing system resilience. Attackers exploit weaknesses in web applications and APIs, often bypassing conventional security defenses.
In this work, we propose a framework to detect and analyze vulnerable components by correlating system logs and access logs with identified cyber attacks. First, network traffic is classified to distinguish between normal requests and attack patterns. If an attack is detected, the corresponding application logs are retrieved based on the time interval of the attack. By analyzing these logs, we determine which application function, service, or module is susceptible to exploitation.
The framework employs a three-stage classification approach. Initially, supervised classifiers are used to identify known attack patterns. Next, an anomaly detection model, specifically a One-Class SVM, classifies previously unseen or anomalous traffic. Finally, a root cause analysis component processes application logs within the detected attack timeframe to accurately determine the vulnerable component.
The proposed framework is designed to integrate into existing security infrastructure, such as web application firewalls (WAFs) and API gateways, and is intended to operate without disrupting normal workflows. Performance is to be evaluated in terms of detection accuracy, computational efficiency, and adaptability to evolving cyber threats. The approach aims to enhance application-layer security by providing actionable insight into vulnerable components, supporting proactive threat mitigation during software development and deployment.
Keywords: Cyber Attacks, Web Application Firewalls, Compound Prediction, Neural Networks, Network Security Situation.
Introduction
Background
Rapid digital transformation has led to a significant increase in cyber threats, particularly at the application layer, where web applications and microservices are prime targets for attackers. Traditional Intrusion Detection Systems (IDS) and Web Application Firewalls (WAFs) rely primarily on rule-based or signature-based mechanisms to detect malicious activity. These approaches struggle against modern threats, such as zero-day attacks and sophisticated obfuscation techniques, which bypass conventional security measures.
Research Motivation
The motivation for this research stems from the need for an advanced intrusion detection approach that addresses the limitations of current IDS solutions. The proposed framework integrates multiple detection techniques into a structured, hierarchical architecture, enhancing scalability, accuracy, and adaptability to diverse and evolving cyber threats.
Key Challenges Addressed
- High false positive rates in traditional IDS
- Emergence of adversarial attacks
- Lack of generalization in existing IDS models
- Complexity of application-layer attacks
- Failure to establish cross-layer correlations between security events
Problem Statement
Modern cyber threats increasingly target the application layer of web infrastructures, exploiting vulnerabilities in web applications and APIs while bypassing traditional security measures. A major limitation of current security solutions is their inability to correlate attack indicators across multiple layers, including system logs, application logs, and network activity.
The primary objective of this research is to develop a hierarchical, multi-stage intrusion detection framework that not only detects cyberattacks but also identifies the specific vulnerable component of the application during an attack.
Aim and Objectives
Aim
To design, implement, and evaluate a hierarchical, multistage detection system that identifies both known and unknown web application attacks, as well as the application component that is vulnerable during an attack. The system should achieve high accuracy, scalability, and adaptability, thereby enhancing overall application-layer security.
Objectives
- To design and develop a hierarchical, multi-stage algorithmic framework
- To define measurable security metrics, including classification accuracy, detection precision, computational efficiency, and adaptability
- To explore several algorithmic approaches within the hierarchical framework, combining neural networks with ensemble-based machine learning techniques
- To design scalable algorithms capable of adapting to future application-layer attack innovations and network expansions
- To investigate practical techniques for integrating the proposed framework into real-world application-layer security systems
Related works
Key Research Areas
- Intrusion Detection Systems (IDS): Evolution from signature-based to ML/DL-based approaches
- Network-Layer vs. Application-Layer Detection: Focus on application-layer threats like SQL injection, XSS, and RCE
- Machine Learning in Cybersecurity: SVM, Decision Trees, Random Forests, CNN, LSTM implementations
- Limitations of Traditional Approaches: Static pattern sets, high false positives, lack of adaptability
- Dataset Challenges: Issues with outdated datasets (KDD Cup 1999, DARPA) and need for modern alternatives (NSL-KDD, CICIDS2017)
Methodology
Overview
Our hierarchical, multistage approach combines:
- Supervised Classification for known threats
- Anomaly Detection for suspicious, potentially unknown attacks
- Root Cause Analysis for identifying the specific vulnerable application component using log correlation
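The three stages above can be read as a cascade in which each stage runs only if the previous one did not reach a verdict. The following is a minimal sketch of that control flow, not the project's implementation: the function `detect`, the result keys, and the toy stand-in models with their thresholds are all hypothetical and only illustrate the ordering of the stages.

```python
from typing import Callable, Dict, Optional, Sequence

Features = Sequence[float]


def detect(
    features: Features,
    known_attack_classifier: Callable[[Features], Optional[str]],
    anomaly_detector: Callable[[Features], bool],
    root_cause_analyzer: Callable[[], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Run the three stages in order; root cause analysis runs only on flagged traffic."""
    # Stage 1: the supervised classifier returns an attack label, or None if unrecognised.
    label = known_attack_classifier(features)
    if label is not None:
        verdict = "known_attack"
    # Stage 2: the anomaly detector checks traffic the classifier did not recognise.
    elif anomaly_detector(features):
        verdict = "anomaly"
    else:
        return {"verdict": "normal", "attack_type": None, "component": None}
    # Stage 3: correlate logs to find the affected component.
    return {
        "verdict": verdict,
        "attack_type": label,
        "component": root_cause_analyzer(),
    }


# Toy stand-ins for trained models (hypothetical thresholds, for illustration only).
def toy_classifier(f: Features) -> Optional[str]:
    return "sqli" if f[0] > 0.9 else None


def toy_anomaly_detector(f: Features) -> bool:
    return f[1] > 0.5


def toy_root_cause() -> Optional[str]:
    return "login-service"


print(detect([0.95, 0.0], toy_classifier, toy_anomaly_detector, toy_root_cause))
print(detect([0.10, 0.8], toy_classifier, toy_anomaly_detector, toy_root_cause))
print(detect([0.10, 0.1], toy_classifier, toy_anomaly_detector, toy_root_cause))
```

Passing the stages in as callables keeps the cascade independent of whichever models are eventually chosen for each stage.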
Data Collection
- Set Up Vulnerable Server Environment
  - Deploy intentionally vulnerable web applications (DVWA, OWASP Juice Shop, WebGoat)
  - Host on virtual machines or Docker containers
  - Install logging/monitoring tools
- Traffic Generation and Labelling
  - Normal traffic: Selenium, JMeter, or Locust for user behavior simulation
  - Attack traffic: OWASP ZAP, Burp Suite for known vulnerabilities
  - Custom scripts for additional malicious payloads
- Log Collection and Correlation
  - Web application logs using security tools
  - Database logs (MySQL, MongoDB, PostgreSQL)
  - Centralized logging with ELK Stack or Splunk
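Once logs from several sources are centralized, correlating them with a detected attack comes down to selecting the entries whose timestamps fall within the attack interval. The sketch below shows one way to do that in plain Python. The `LogEntry` fields, the `logs_in_attack_window` helper, the five-second margin, and the sample log messages are all assumptions made for illustration; a real deployment would more likely query ELK or Splunk directly.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    source: str      # e.g. "access", "app", "db" (hypothetical source names)
    component: str   # application function, service, or module
    message: str


def logs_in_attack_window(
    entries: Iterable[LogEntry],
    attack_start: datetime,
    attack_end: datetime,
    margin: timedelta = timedelta(seconds=5),
) -> List[LogEntry]:
    """Return entries within [attack_start - margin, attack_end + margin]."""
    lo, hi = attack_start - margin, attack_end + margin
    return [e for e in entries if lo <= e.timestamp <= hi]


# Made-up sample data, not taken from the experiments.
base = datetime(2025, 3, 1, 12, 0, 0)
sample_entries = [
    LogEntry(base - timedelta(minutes=5), "access", "search", "GET /search 200"),
    LogEntry(base + timedelta(seconds=2), "app", "login", "SQL syntax error"),
    LogEntry(base + timedelta(seconds=12), "db", "login", "slow query"),
    LogEntry(base + timedelta(minutes=3), "app", "cart", "item added"),
]

window = logs_in_attack_window(sample_entries, base, base + timedelta(seconds=10))
for entry in window:
    print(entry.timestamp.isoformat(), entry.source, entry.component, entry.message)
```

The margin allows for clock skew and for log lines that are written slightly after the request that caused them; its value would need tuning against the actual logging setup.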
Model Development
- Supervised Classifier: Logistic Regression, Random Forest, Gradient Boosting, or Neural Networks
- Anomaly Detector: One-Class SVM, Isolation Forest, or Autoencoder-based approaches
- Vulnerability Identification Component: Correlation analysis to identify vulnerable components
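For the vulnerability identification component, one simple heuristic is to compare how often each component appears in the logs during the attack window with how often it appears under normal traffic. The sketch below implements that rate-lift idea as a stand-in for the correlation analysis, which this document does not specify further. The function `rank_components`, the smoothing constant, and the sample rates are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rank_components(
    window_components: Iterable[str],
    baseline_rate: Dict[str, float],
    window_seconds: float,
    smoothing: float = 0.01,
) -> List[Tuple[str, float]]:
    """Rank components by observed event rate in the window relative to baseline.

    baseline_rate maps a component to its usual log events per second.
    smoothing avoids division by zero for components with no baseline.
    """
    counts = Counter(window_components)
    scores = {}
    for component, count in counts.items():
        observed = count / window_seconds
        expected = baseline_rate.get(component, 0.0)
        scores[component] = observed / (expected + smoothing)
    # Highest lift first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up example: component names seen in log entries during a 10-second attack window.
seen = ["auth", "auth", "auth", "search", "auth", "cart"]
baseline = {"auth": 0.05, "search": 0.1, "cart": 0.1}

ranking = rank_components(seen, baseline, window_seconds=10.0)
for component, score in ranking:
    print(f"{component}: lift {score:.2f}")
```

A component that is noisy all the time scores low under this heuristic, while one that suddenly becomes active during the attack scores high, which is the behavior wanted from a first-pass root cause signal.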
Detection Pipeline
The hierarchical detection pipeline includes:
- Feature extraction
- Classifier decision stage
- Anomaly detection stage
- Root cause analysis through log correlation
- Feedback loop for continuous improvement
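As an illustration of the feature extraction stage that feeds the rest of the pipeline, the sketch below turns a raw request path and query string into a small numeric feature dictionary. The feature set, the keyword list, and the name `extract_features` are hypothetical choices made for this example; the features actually used in the project may differ.

```python
import re
from typing import Dict
from urllib.parse import unquote_plus

# Illustrative patterns only; not an exhaustive attack signature list.
SQL_KEYWORDS = re.compile(r"\b(union|select|insert|update|delete|drop)\b", re.IGNORECASE)
SCRIPT_TAG = re.compile(r"<\s*script", re.IGNORECASE)
SPECIAL_CHARS = "'\"<>;()"


def extract_features(path_and_query: str) -> Dict[str, float]:
    """Derive simple lexical features from a request path and query string."""
    # Decode percent-encoding and '+' so obfuscated payloads are inspected in clear text.
    decoded = unquote_plus(path_and_query)
    special = sum(decoded.count(ch) for ch in SPECIAL_CHARS)
    return {
        "length": float(len(decoded)),
        "special_char_count": float(special),
        "sql_keyword_count": float(len(SQL_KEYWORDS.findall(decoded))),
        "has_script_tag": 1.0 if SCRIPT_TAG.search(decoded) else 0.0,
        "percent_encoded_ratio": path_and_query.count("%") / max(len(path_and_query), 1),
    }


benign = extract_features("/products?id=42")
sqli = extract_features("/products?id=1%27+UNION+SELECT+password+FROM+users--")
xss = extract_features("/search?q=%3Cscript%3Ealert(1)%3C/script%3E")

print(benign)
print(sqli)
print(xss)
```

The resulting dictionaries can be converted to fixed-order vectors before being passed to the classifier and anomaly detection stages.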
Experiment Setup and Implementation
Environment Setup
- Vulnerable applications deployed in controlled environments
- Docker containers for isolation
- Custom log collection scripts
Evaluation Metrics
- Performance Metrics: Accuracy, Precision, Recall, F1-score, FPR, FNR
- Root Cause Accuracy: Correctness in identifying affected components
- Scalability Performance: Ability to handle large traffic volumes
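The performance metrics listed above all follow from the four cells of a binary confusion matrix, with "attack" as the positive class. The sketch below computes them from label lists; the function name `binary_metrics` and the sample labels are illustrative, and the numbers are not experimental results.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute detection metrics where 1 = attack and 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # normal traffic wrongly flagged as attack
        "fnr": ratio(fn, fn + tp),  # attacks that were missed
    }


# Made-up labels for illustration.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For intrusion detection, FPR and FNR are worth reporting alongside accuracy, since attack traffic is usually a small share of all requests and accuracy alone can look high even when many attacks are missed.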
Experimental Scenarios
- Known attacks evaluation
- Unknown or mutated attacks testing
- Unknown normal cases assessment
- Adaptive attack resistance testing
Results and Analysis
[To be updated as the project progresses]
Conclusion
This research proposes a hierarchical, multistage intrusion detection framework focused on application-layer threats. By combining supervised classification for known malicious patterns with anomaly detection for novel exploits, the proposed system aims to improve:
- Accuracy: Through targeted use of ML/DL models
- Scalability: Via efficient algorithms and containerized environments
- Adaptability: Through feedback loops and continuous learning
- Component Identification: By pinpointing affected microservices or functions
The framework is designed to integrate into existing WAFs and API gateways with minimal overhead, with the goal of improving security for web applications and APIs.
Publications
Timeline
Key milestones include:
- Literature Review & Proposal (Jan 2025)
- Environment Setup & Data Collection (Feb-Mar 2025)
- Model Development & Training (Mar-May 2025)
- Evaluation & Testing (May-Jun 2025)
- Final Report & Presentation (Jul 2025)