Enhancing Multimodal Fusion Techniques for Depression Detection

Overview

This project focuses on advancing multimodal fusion techniques for depression detection by addressing key challenges, such as missing modalities and class imbalance, and by integrating clinical data into the system. The main components are outlined in the workflow below.

System Workflow

  1. Data Processing & Model Training
    • Preprocessing multimodal data (audio, video, text, clinical).
    • Handling missing modalities and class imbalance.
    • Incorporating clinical data extracted from transcripts.
    • Training deep learning models using Python, PyTorch, etc.
  2. Web Application
    • Frontend developed with React.
    • Backend built with Spring Boot.
    • Users can connect with counselors, view depression assessments, and interact with the chatbot.
  3. Deployment & Infrastructure
    • Containerization using Docker and Kubernetes.
    • Cloud deployment on AWS:
      • S3 for storing videos and preprocessed data.
      • Lambda functions for triggering preprocessing tasks.
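A minimal late-fusion sketch of the "handling missing modalities" step above, assuming each modality is reduced to a fixed-size embedding before fusion. The function name, dimensions, and averaging strategy are illustrative assumptions, not the project's actual implementation:

```python
# Hypothetical late-fusion step: average the embeddings of whichever
# modalities are present, skipping missing ones (represented as None).
import numpy as np

def fuse_modalities(embeddings, dim=4):
    """Average available modality embeddings, ignoring missing (None) entries."""
    available = [e for e in embeddings.values() if e is not None]
    if not available:
        return np.zeros(dim)  # no modality available: return a neutral vector
    return np.mean(available, axis=0)

fused = fuse_modalities({
    "audio": np.array([1.0, 0.0, 0.0, 0.0]),
    "video": None,  # simulate a missing modality
    "text":  np.array([0.0, 1.0, 0.0, 0.0]),
})
```

Masking out absent modalities like this lets the same model run on incomplete samples instead of discarding them, which matters when class imbalance already limits the positive examples.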

Web Application High-Level Architecture

System Architecture Overview

The web application follows a microservices architecture pattern with clear separation of concerns between frontend, backend, and AI processing components.

System Architecture Diagram

Component Architecture

Frontend Layer (React.js)

Backend Layer (Spring Boot)

AI Processing Layer (Python)
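The AI processing layer can be pictured as a small Flask service that the Spring Boot backend calls over HTTP. The endpoint path and payload shape below are illustrative assumptions, not the project's documented API:

```python
# Hypothetical Flask microservice wrapping the trained model.
# In the real system the handler would invoke the fusion model;
# here a placeholder score (mean of the features) stands in for it.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(force=True)
    features = payload.get("features", [])
    score = sum(features) / len(features) if features else 0.0
    return jsonify({"depression_score": score, "n_features": len(features)})
```

Keeping the model behind its own HTTP service lets the Python and Java components be deployed and scaled independently, which is the point of the microservices split described above.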

Cloud Infrastructure (AWS)
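The S3-to-Lambda trigger mentioned in the workflow can be sketched as a handler that parses the standard S3 event notification and kicks off a preprocessing job. The event parsing follows the real S3 notification shape; `start_preprocessing` is a hypothetical stub for whatever the project actually invokes:

```python
# Hedged sketch of an S3-triggered AWS Lambda handler. Object keys arrive
# URL-encoded in S3 event notifications, so they must be decoded first.
import urllib.parse

def start_preprocessing(bucket, key):
    # Stub: the real function would enqueue or invoke the preprocessing task.
    return f"preprocess:{bucket}/{key}"

def lambda_handler(event, context):
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        jobs.append(start_preprocessing(bucket, key))
    return {"started": jobs}
```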

Data Preprocessing Methods

| Data Type | Preprocessing Method | Description |
| --- | --- | --- |
| Audio Data Preprocessing | | |
| Video Data Preprocessing | | |
| Text Data Preprocessing | | |
| Clinical Data Preprocessing | | |
| Multimodal Data Integration | | |
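As one concrete illustration of the audio preprocessing row, a waveform is typically split into overlapping frames before features are computed. The frame length and hop size below are common defaults assumed for illustration, not the project's documented settings:

```python
# Illustrative audio preprocessing: frame a 1-D signal and compute
# per-frame log energy, a common first step before richer features.
import numpy as np

def frame_log_energy(signal, frame_len=400, hop=160):
    """Split a signal into overlapping frames and return their log energies."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energies = []
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        energies.append(np.log(np.sum(frame ** 2) + 1e-10))  # avoid log(0)
    return np.array(energies)
```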

Tech Stack

| Component | Technologies |
| --- | --- |
| Model Training & Preprocessing | Python, PyTorch, NumPy, Pandas, … |
| Frontend Development | React.js |
| Backend Development | Spring Boot (Java), Flask |
| Containerization & Deployment | Docker, Kubernetes, AWS S3, AWS Lambda |

Features

Installation & Usage


Research Team

Team Members

Supervisors