Multi-Modal Data Fusion for Trading Market Price Forecasting
Team
- E/19/278, Perera A.P.T.T., email
- E/19/349, Sandaruwan K.G.S.T., email
- E/19/492, Somawansha M.V.N.L., email
Supervisors
Table of Contents
- Introduction
- Background and Motivation
- Problem Statement
- Research Gap
- Aim
- Proposed Solution
- Research Methodology
- Data Collection
- High Level Design Architecture
- Data Preprocessing & Feature Engineering
- Links
Introduction
What is Trading?
- Buying and selling of financial assets in markets with the aim of making a profit.
Challenges in Traditional Trading
- Technical analysis alone fails to capture sudden market shifts.
- Emotional trading contributes to losses for an estimated 70–90% of retail traders.
Background and Motivation
Stock market forecasting is important for reducing risk and improving trading strategies in a $110 trillion global market influenced by factors such as interest rates, inflation, and employment data. Traditional technical analysis methods often fall short, contributing to high losses among retail traders. Advances in machine learning enable the analysis of large datasets and complex patterns, especially when combined with macroeconomic indicators that can signal market shifts. This study aims to develop a comprehensive prediction framework that combines historical data, market patterns, and macroeconomic factors to improve accuracy and give traders actionable, data-driven insights for better decision-making.
Problem Statement
Accurately predicting tradable moments and entry/exit points in Gold (XAU/USD) is extremely difficult because of its high volatility and the sudden market shocks driven by changing economic conditions.
Research Gap
Current trading price forecasting models face several key limitations: they mostly rely on price data and technical indicators, often ignoring important macroeconomic factors; they lack effective mechanisms to control traders’ emotional biases like FOMO or panic selling; and they suffer from poor interpretability, making it hard for traders to trust AI predictions. Additionally, many models fail to adapt to different market regimes (bull, bear, sideways), reducing their accuracy during volatile or shifting conditions. Addressing these gaps requires multi-modal, emotionally aware, interpretable, and adaptive machine learning models for more reliable trading forecasts.
Aim
The aim of this research is to forecast trading moments, entry prices, and exit prices for gold (XAU/USD) by analyzing the relationship between historical price movement patterns, market trading volume, and key macroeconomic indicators such as CPI, GDP, PPI, PCE, NFP, and interest rates. The research seeks to develop a multi-modal machine learning framework that enhances the predictive accuracy of trading decisions by integrating these factors.
Proposed Solution
This research introduces a multi-modal machine learning framework to improve financial forecasting by integrating historical price data, trading volume, and macroeconomic factors. Using deep learning, the model analyzes price trends and volume to gauge trend strength, while incorporating economic indicators such as inflation and interest rates to capture external influences. It also predicts optimal stop-loss levels to strengthen risk management during market volatility. This approach aims to deliver more accurate, transparent, and data-driven trade entry and exit predictions, addressing the limitations of existing models and supporting better decision-making in complex markets.
Research Objectives
General Objective:
Develop a multi-modal machine learning framework that integrates:
- Historical price data
- Market trading volume
- Macroeconomic factors
- Technical indicators
Specific Objectives:
- Analyze the relationship between key United States macroeconomic factors and fluctuations in the XAU/USD price.
Macroeconomic Factors
Macroeconomic indicators significantly influence markets, providing insights into a country’s economic health and investor sentiment. Key factors include:
- Interest Rate (Federal Reserve Rate)
- Consumer Price Index (CPI)
- Non-Farm Payrolls (NFP)
- Personal Consumption Expenditures (PCE)
- Gross Domestic Product (GDP)
- Producer Price Index (PPI)
Key highlights:
- Data Fusion: Combining historical price data, trading volume, and macroeconomic factors.
- Modeling: Utilizing deep learning models like LSTM or Transformer.
- Goal: Improve breakout classification and support better trading decisions.
We aim to help traders identify high-probability opportunities and reduce risks amid market volatility.
Research Methodology
This multi-stage framework for breakout prediction is a comprehensive system that integrates multiple data-driven approaches to enhance trading accuracy. Each module plays a crucial role in processing and analyzing different aspects of financial data, ensuring a well-rounded predictive model. The structured interaction between trend analysis, volume assessment, macroeconomic influences, and support/resistance classification enables the identification of high-confidence trading signals. By leveraging machine learning and real-time market insights, this system provides traders with a powerful tool to differentiate between real breakouts and fakeouts, improving profitability and risk management.
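To make the breakout-versus-fakeout distinction concrete, the following is a minimal labeling sketch in pandas. The definitions are assumptions for illustration, not the project's actual rules: resistance is taken as the highest high of the previous `lookback` bars, a bar "breaks out" when it closes above that level, and the move counts as a real breakout only if the close `confirm` bars later is still above the level. The function name, column names (`high`, `close`), and both parameters are hypothetical.

```python
import pandas as pd


def label_breakouts(bars: pd.DataFrame, lookback: int = 20, confirm: int = 3) -> pd.Series:
    """Label each bar as 'breakout', 'fakeout', or 'none' (assumed definitions)."""
    # Resistance: highest high over the previous `lookback` bars (current bar excluded).
    resistance = bars["high"].rolling(lookback).max().shift(1)
    # A candidate breakout closes above the resistance level.
    broke = bars["close"] > resistance
    # Close price `confirm` bars into the future, used only to build training labels.
    future_close = bars["close"].shift(-confirm)

    label = pd.Series("none", index=bars.index)
    label[broke & (future_close > resistance)] = "breakout"
    label[broke & (future_close <= resistance)] = "fakeout"
    # Bars without enough history or future data stay 'none' (NaN comparisons are False).
    return label
```

Labels built this way look into the future, so they are suitable only as training targets, never as model inputs.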
Data Collection
Data is collected from multiple sources:
- Macroeconomic Indicators from Federal Reserve Statements, Investing.com (2018-2025)
- XAU/USD price and trading volume through the MetaTrader 5 API (IC Markets broker)
Data Preprocessing & Feature Engineering
- Missing data handling: interpolation, forward filling
- Normalization and scaling
- Outlier detection
- Time series data preprocessing: date and time conversion, resampling
- Feature Engineering
- Correlation & Feature Selection
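The preprocessing steps listed above could be chained roughly as follows. This is a sketch under stated assumptions rather than the project's pipeline: the input is assumed to be an OHLCV frame with hypothetical columns `time`, `open`, `high`, `low`, `close`, and `volume`; the 3-standard-deviation outlier threshold and the z-score scaling are illustrative choices.

```python
import pandas as pd


def preprocess_prices(raw: pd.DataFrame, rule: str = "1D") -> pd.DataFrame:
    """Clean an OHLCV frame and resample it to the bar size given by `rule`."""
    df = raw.copy()
    # Date & time conversion: parse timestamps and use them as a sorted index.
    df["time"] = pd.to_datetime(df["time"])
    df = df.set_index("time").sort_index()

    # Missing data: time-weighted interpolation, then forward fill for any leftovers.
    price_cols = ["open", "high", "low", "close"]
    df[price_cols] = df[price_cols].interpolate(method="time").ffill()
    df["volume"] = df["volume"].fillna(0)

    # Resampling: aggregate bars while preserving OHLCV semantics.
    out = df.resample(rule).agg(
        {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
    ).dropna(subset=["close"])

    # Outlier detection: flag returns more than 3 standard deviations from the mean.
    returns = out["close"].pct_change()
    z = (returns - returns.mean()) / returns.std()
    out["is_outlier"] = z.abs() > 3

    # Normalization: z-score of the close (fit on training data only in practice).
    out["close_z"] = (out["close"] - out["close"].mean()) / out["close"].std()
    return out
```

In a real experiment the scaling statistics would be computed on the training window only, so that information from the test period does not leak into the features.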
• Gold Price Data (XAU/USD): We will collect historical XAU/USD OHLCV (Open, High, Low, Close, Volume) data across multiple timeframes (30-minute, hourly, 4-hourly, daily, weekly, and monthly) using the MetaTrader5 local terminal. This ensures we capture both short-term fluctuations and long-term trends. The data will cover the period from 2015 to 2025.
• Macroeconomic Indicators: These fundamental factors drive long-term price movements and market sentiment. We will focus on:
- Interest Rate (Federal Reserve Rate): Determines monetary policy direction.
- Consumer Price Index (CPI): Reflects inflationary pressures.
- Non-Farm Payrolls (NFP): Measures employment trends and economic health.
- Personal Consumption Expenditures (PCE): Tracks consumer spending behavior.
- Gross Domestic Product (GDP): Indicates overall economic growth.
- Producer Price Index (PPI): Represents inflation at the producer level.
Integration Strategy: Macroeconomic data will be gathered by web scraping Investing.com and by manually extracting Federal Reserve statements. Data will be collected for the 2015-2025 range to cover multiple economic cycles.
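Because macroeconomic releases arrive monthly or quarterly while price bars arrive every few minutes or hours, the two sources have to be aligned in time before they can be fused. One common way to do this is an as-of join, sketched below with `pandas.merge_asof`. The column names `time` and `release_time` and the function name are hypothetical, and this is one possible alignment, not necessarily the one used in the project.

```python
import pandas as pd


def attach_macro(prices: pd.DataFrame, macro: pd.DataFrame) -> pd.DataFrame:
    """Attach to each price bar the latest macro values published at or before it."""
    # merge_asof requires both frames to be sorted by their join keys.
    prices = prices.sort_values("time")
    macro = macro.sort_values("release_time")
    # direction="backward" takes the most recent release not later than the bar,
    # so a bar never sees a figure that had not yet been published.
    return pd.merge_asof(
        prices, macro, left_on="time", right_on="release_time", direction="backward"
    )
```

Bars that precede the first release receive missing values for the macro columns, which can then be dropped or handled by the missing-data step.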
High Level Design Architecture
Models Used
- ARIMA/SARIMA: Statistical model for forecasting based on time series analysis.
- LSTM (Long Short-Term Memory): Deep learning approach for capturing long-term dependencies in sequential data.
- XGBoost: Tree-based ensemble model, capturing complex patterns from structured data.
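ARIMA/SARIMA operate on the series directly, whereas LSTM and XGBoost are typically trained on fixed-length supervised samples. The sketch below shows one way to turn a price series into such samples with NumPy and to split them chronologically. The function names, `lookback`, `horizon`, and the 80/20 split are illustrative assumptions.

```python
import numpy as np


def make_windows(series, lookback: int, horizon: int = 1):
    """Return (X, y): X[i] holds `lookback` consecutive values and
    y[i] is the value `horizon` steps after the end of that window."""
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested lookback and horizon")
    X = np.stack([series[i:i + lookback] for i in range(n)])
    first_target = lookback + horizon - 1
    y = series[first_target:first_target + n]
    return X, y


def chronological_split(X, y, train_frac: float = 0.8):
    """Split without shuffling so the test set lies strictly after the training set."""
    cut = int(len(X) * train_frac)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])
```

For an LSTM, `X` would usually be reshaped to `(samples, lookback, features)`; for XGBoost, each window can be used as a flat feature row.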
Key Findings
- CPI, PPI, and PCE are highly correlated with market trends.
- Macroeconomic Factors: CPI, PCE, and GDP show a negative correlation with the market trend.
- Market Volume: Significant variations observed in global market sessions (Sydney, Tokyo, London, New York).
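The session-level volume comparison can be reproduced with a simple aggregation, sketched below. The session windows are placeholder assumptions in UTC; real session hours shift with daylight saving time and with the broker's server time, so they would need to be adjusted. Sessions overlap, so a bar can count toward more than one session. Column names `time` and `volume` are hypothetical.

```python
import pandas as pd

# Assumed session windows as (start_hour, end_hour) in UTC; placeholders only.
SESSIONS_UTC = {
    "Sydney": (22, 7),
    "Tokyo": (0, 9),
    "London": (8, 17),
    "New York": (13, 22),
}


def in_session(hours: pd.Series, start: int, end: int) -> pd.Series:
    """Boolean mask of hours inside [start, end), handling windows that wrap midnight."""
    if start < end:
        return (hours >= start) & (hours < end)
    return (hours >= start) | (hours < end)


def mean_volume_by_session(bars: pd.DataFrame) -> pd.Series:
    """Average bar volume for each trading session."""
    hours = bars["time"].dt.hour
    return pd.Series({
        name: bars.loc[in_session(hours, start, end), "volume"].mean()
        for name, (start, end) in SESSIONS_UTC.items()
    })
```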
Model Performance
ARIMA vs SARIMA Model Comparison
LSTM Model
XGBoost Model
Insights & Impact
- Regime-Aware Forecasting: Build models to identify different market conditions.
- Session-Dependent Liquidity Patterns: Tune execution algorithms based on market sessions.
- Automated Execution: Leverage real-time data for risk-controlled, smart trading execution.
Future Research Directions
- Regime-Aware Forecasting: Build models that can reliably identify different market conditions.
- Multi-Step, Feature-First Pipelines: Generalize the two-stage approach.
- Combine News Sentiments: Integrate streaming news, social-media sentiment, and order-book signals.
- One-Step LSTM Strengths & Limits: Deploy separate LSTMs per forecast horizon (micro-structure vs. longer term).
- Risk Controls: Build trading systems with stop-loss rules.
Limitations
- Limited data availability for real-time forecasting.
- The complexity of models increases the risk of overfitting.