Machine Learning for Weather Forecasting
Team
- E/18/242, Nimnadi J. A. S., e18242@eng.pdn.ac.lk
- E/18/368, Uduwanage H. U., e18368@eng.pdn.ac.lk
- E/18/398, Wijerathne R. M. N. S., e18398@eng.pdn.ac.lk
Table of Contents
Introduction
Machine learning for climate forecasting addresses the real-world problem of accurately predicting and understanding climate patterns. Climate change has severe consequences for ecosystems, agriculture, and human lives. This project applies machine learning algorithms to improve climate models and produce more precise and reliable forecasts. Better forecasts help policymakers, researchers, and communities make informed decisions about disaster management, agriculture, energy, and urban planning, which in turn supports risk reduction and sustainable development. This project focuses mainly on precipitation forecasting.
Features
- Select Date: Users can enter any date and get the prediction for it.
- Select Region: Users can select a district to get predictions for. At present the built-in dataset covers only Puttalam District. Users who have a relevant dataset for another district can upload it and get results for that district.
- Easy Prediction
Dashboard
Predictions
Solution Architecture
Web-Based Precipitation Forecasting Solution Architecture
Our web-based precipitation forecasting solution combines modern technologies to provide accurate and timely precipitation predictions. The architecture comprises a React.js frontend, a Flask backend, an SQLite database, MLflow for model registration and management, and Jupyter notebooks for model testing and development.
Frontend (React.js):
- Provides a user-friendly interface for users to input location and date.
- Communicates with the backend to retrieve forecasted precipitation data.
- Offers an intuitive visualization of the forecast.
Backend (Flask):
- Handles user requests and communicates with the SQLite database.
- Integrates MLflow for model registration, version control, and deployment.
- Utilizes ML models to generate precipitation forecasts.
Database (SQLite):
- Stores user preferences, historical data, and model parameters.
- Enables quick retrieval of data required for forecasting.
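As a rough illustration of the historical-data store described above, the following sketch uses Python's built-in `sqlite3` module. The table name `precipitation`, its columns, the helper names, and the sample values are all hypothetical and are not taken from the project's actual schema.

```python
import sqlite3


def init_db(conn):
    """Create a hypothetical table of monthly rainfall per district."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS precipitation (
            district    TEXT NOT NULL,
            month       TEXT NOT NULL,   -- ISO year-month, e.g. '2020-01'
            rainfall_mm REAL NOT NULL,
            PRIMARY KEY (district, month)
        )
        """
    )


def add_observation(conn, district, month, rainfall_mm):
    """Insert one monthly observation, overwriting any existing row."""
    conn.execute(
        "INSERT OR REPLACE INTO precipitation VALUES (?, ?, ?)",
        (district, month, rainfall_mm),
    )


def history_for(conn, district):
    """Return (month, rainfall_mm) rows for a district, oldest first."""
    cursor = conn.execute(
        "SELECT month, rainfall_mm FROM precipitation "
        "WHERE district = ? ORDER BY month",
        (district,),
    )
    return cursor.fetchall()


# In-memory database with made-up placeholder values, for illustration only.
conn = sqlite3.connect(":memory:")
init_db(conn)
add_observation(conn, "Puttalam", "2020-02", 35.5)
add_observation(conn, "Puttalam", "2020-01", 80.0)
print(history_for(conn, "Puttalam"))
```

Using `(district, month)` as the primary key is one possible design choice: it keeps a single row per district and month, so re-uploading a dataset replaces existing rows instead of duplicating them.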
MLflow:
- Manages ML model lifecycle, including versioning and tracking.
- Provides a seamless transition from Jupyter notebooks to deployment in the Flask backend.
Jupyter Notebooks:
- Serve as a development and testing environment for ML models.
- Allows for iterative model improvement and experimentation.
This architecture ensures a scalable, robust, and efficient system for accurate precipitation forecasting with easy model management and development.
Data Flow
Predictions for Puttalam District are generated from the dataset already uploaded to the system. Users who want a prediction for another district and have a dataset for it can upload that dataset through the system, and the trained ML model will generate the prediction from it.
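The dataset choice described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `select_dataset`, the dictionary of stored datasets, and the sample rows are hypothetical.

```python
def select_dataset(district, stored_datasets, uploaded_dataset=None):
    """Pick the data a prediction should be based on.

    stored_datasets: dict mapping district name -> dataset already in the system.
    uploaded_dataset: dataset supplied by the user for this request, if any.
    """
    if district in stored_datasets:
        # A district such as Puttalam is served from the built-in dataset.
        return stored_datasets[district]
    if uploaded_dataset is not None:
        # Any other district needs a dataset uploaded by the user.
        return uploaded_dataset
    raise LookupError(
        f"No dataset available for {district!r}; upload one to get a prediction."
    )


# Placeholder rows, for illustration only.
stored = {"Puttalam": [("2020-01", 80.0), ("2020-02", 35.5)]}
print(select_dataset("Puttalam", stored))
print(select_dataset("Kandy", stored, uploaded_dataset=[("2020-01", 150.0)]))
```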
Work Flow
- Understand the problem and the problem domain
- Collect and preprocess the data
- Develop and test machine learning models
- Develop the backend with Flask
- Initialize the SQLite database
- Design and develop the frontend of the web application
- Integrate all components and test
- Deploy
Modeling
Data preprocessing
- Created a target column holding the one-month-ahead value to predict
- Checked for missing values and duplicates
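The preprocessing steps above can be sketched in plain Python as shown below. The column names `month`, `precipitation`, and `target`, the helper names, and the sample values are assumptions made for illustration. With pandas, the same target could be built with `Series.shift(-1)` on a monthly series.

```python
def count_missing(rows):
    """Count cells whose value is None across all rows."""
    return sum(1 for row in rows for value in row.values() if value is None)


def count_duplicates(rows):
    """Count rows that exactly repeat an earlier row."""
    unique_rows = {tuple(sorted(row.items())) for row in rows}
    return len(rows) - len(unique_rows)


def add_next_month_target(rows, value_key="precipitation", target_key="target"):
    """Attach next month's value to each row as the prediction target.

    rows must be sorted by month with one row per month. The last row has
    no following month, so it is dropped.
    """
    labelled = []
    for current, following in zip(rows, rows[1:]):
        new_row = dict(current)
        new_row[target_key] = following[value_key]
        labelled.append(new_row)
    return labelled


# Made-up monthly values, for illustration only.
rows = [
    {"month": "2020-01", "precipitation": 80.0},
    {"month": "2020-02", "precipitation": 35.5},
    {"month": "2020-03", "precipitation": 120.0},
]
print(count_missing(rows), count_duplicates(rows))
for row in add_next_month_target(rows):
    print(row)
```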
Feature selection
- Checked for constant and quasi-constant features
- Checked feature-to-feature correlations and dropped highly correlated, redundant features
- Identified the features most important to the target using a feature importance method
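The first two checks can be illustrated with a small filter like the one below. It is only a sketch: the thresholds (0.99 for quasi-constant, 0.9 for correlation), the feature names, and the sample values are assumptions, not the values used in the project. The feature importance step is not covered here.

```python
from collections import Counter
from math import sqrt


def is_quasi_constant(values, threshold=0.99):
    """True if one value makes up at least `threshold` of the column."""
    most_common_count = Counter(values).most_common(1)[0][1]
    return most_common_count / len(values) >= threshold


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def select_features(columns, const_threshold=0.99, corr_threshold=0.9):
    """Keep features that are neither (quasi-)constant nor redundant.

    columns: dict mapping feature name -> list of numeric values.
    A feature is dropped as redundant if it is highly correlated with a
    feature that was already kept.
    """
    kept = []
    for name, values in columns.items():
        if is_quasi_constant(values, const_threshold):
            continue
        if any(abs(pearson(values, columns[k])) > corr_threshold for k in kept):
            continue
        kept.append(name)
    return kept


# Hypothetical features with made-up values.
columns = {
    "humidity": [70, 75, 80, 85],
    "humidity_fraction": [0.70, 0.75, 0.80, 0.85],  # duplicate of humidity, rescaled
    "station_id": [1, 1, 1, 1],                     # constant
    "wind_speed": [3.0, 1.0, 4.0, 2.0],
}
print(select_features(columns))
```

In this example `station_id` is dropped as constant and `humidity_fraction` is dropped because it is perfectly correlated with `humidity`.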
Model selection
Applied models
- Linear regression model
- Polynomial regression model
- ARMA model / ARIMA model / SARIMA model
- XGBoost regression model
- Random Forest regression model
- Holt-Winters Exponential Smoothing model
- LSTM model
Model evaluation
- Mean Squared Error
- Mean Absolute Error
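For reference, the two metrics above can be computed as in the sketch below. The function names and the sample numbers are placeholders chosen for illustration and are not results from this project.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences, in the same unit as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Placeholder values, not project results.
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 33.0]
print(mean_squared_error(y_true, y_pred))
print(mean_absolute_error(y_true, y_pred))
```

Reporting both is a common choice: MSE highlights occasional large misses, while MAE is easier to read as a typical error size.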
Tuned the hyperparameters of the models to improve accuracy.
Chose the Random Forest regression model as the final model.