SNAP‑V

SNAP‑V: A RISC‑V SoC with Configurable Neuromorphic Acceleration for Small‑Scale Spiking Neural Networks

A configurable, resource‑efficient neuromorphic accelerator integrated within a RISC‑V System‑on‑Chip, optimized for executing small‑scale SNNs (~1,000 neurons) in real‑time, low‑power environments. Addresses the gap between large‑scale neuromorphic processors and conventional SoC architectures for edge computing applications.

View on GitHub
RISC‑V SoC Small‑Scale SNNs LIF Neurons Edge Computing Real‑Time Open Source

SNAP‑V Overview

High‑level view of the SoC, accelerator clusters, NoC and memory.

Architecture & Implementation

SNAP‑V addresses the critical gap in neuromorphic hardware design: efficient support for small‑scale SNNs in edge and embedded contexts. Unlike large‑scale neuromorphic processors that are over‑provisioned for edge applications, SNAP‑V provides a specialized SoC platform optimized for real‑time processing of small‑scale SNNs (~1,000 neurons) in a power‑efficient and scalable manner.

🎯 Research Objectives

1. Configurable Neuromorphic Accelerator

Develop a small‑scale, resource‑efficient accelerator capable of simulating biologically inspired spiking neuron models, such as Leaky Integrate‑and‑Fire (LIF), with tunable parameters to support varied computational behaviors.

2. Power‑Efficient Interconnection Fabric

Implement a hierarchical Network‑on‑Chip (NoC) optimized for spike‑based, asynchronous communication to enable low‑latency and scalable data exchange among neuron clusters and memory modules.

3. RISC‑V SoC Integration

Embed the neuromorphic accelerator within a modular RISC‑V System‑on‑Chip to support seamless coordination of neural processing with control tasks, data pre‑processing, and external interfacing.

4. Complete System Validation

Validate the design through RTL simulation and FPGA prototyping, and benchmark performance and power efficiency against existing neuromorphic platforms, with focus on real‑time responsiveness and scalability.

🧠 Neuron Processing Elements

Complete neuron PEs implementing LIF dynamics with dedicated hardware units for each computational stage of the spiking neuron model.

  • Controller: Decodes incoming spike/data packets and controls processing flow
  • Weight Accumulator: Sums synaptic weights from multiple inputs
  • Potential Decay Unit: Implements LIF decay equations in hardware
  • Adder & Spike Generator: Compares potential against threshold, generates spikes
  • Timing Logic: Ensures correct refractory period handling
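The PE stages above can be sketched behaviorally in Python. This is a minimal model of one synchronous LIF time step under assumed conventions (shift‑based leak, reset‑to‑zero on fire); the parameter names here are illustrative, not the actual RTL signal names.

```python
# Behavioral sketch of one LIF neuron processing element (PE).
# Parameters (threshold, decay_shift, refractory_steps) are illustrative.

class LIFNeuron:
    def __init__(self, threshold=100, decay_shift=4, refractory_steps=2):
        self.v = 0                       # membrane potential (fixed-point int)
        self.threshold = threshold
        self.decay_shift = decay_shift   # leak: v -= v >> decay_shift
        self.refractory_steps = refractory_steps
        self.refractory = 0              # remaining refractory time steps

    def step(self, weighted_input):
        """One synchronous time step: accumulate, decay, compare, fire."""
        if self.refractory > 0:          # timing logic: hold during refractory
            self.refractory -= 1
            return 0
        self.v += weighted_input         # weight accumulator output
        self.v -= self.v >> self.decay_shift   # potential decay unit
        if self.v >= self.threshold:     # adder & spike generator
            self.v = 0                   # reset on fire
            self.refractory = self.refractory_steps
            return 1                     # emit spike
        return 0
```

A shift‑based leak avoids a hardware multiplier, which is one common way small LIF PEs keep the decay unit cheap.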

🏗️ Cluster Architecture

Multiple neuron processing elements grouped into clusters for parallelism, with dedicated controllers and routing infrastructure.

  • Cluster Controller: Manages neuron configuration & schedules spike processing
  • Incoming Forwarder: Routes incoming spikes from NoC to neurons
  • Outgoing Encoder: Encodes neuron output into spike packets for the NoC
  • Scalability: Clusters can be tiled to increase neuron count
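The per‑time‑step flow through a cluster can be sketched as follows: the incoming forwarder accumulates weight per destination neuron, every PE is stepped once, and fired indices are handed to the outgoing encoder. All names and the reset‑to‑zero behavior are illustrative assumptions, not the RTL module interfaces.

```python
# Minimal sketch of one cluster time step.

def cluster_step(potentials, incoming, threshold=60, decay_shift=3):
    """potentials: membrane potential per neuron PE (mutated in place).
    incoming: list of (neuron_index, weight) spike events for this step.
    Returns (potentials, list of neuron indices that fired)."""
    acc = [0] * len(potentials)
    for idx, w in incoming:             # incoming forwarder + accumulation
        acc[idx] += w
    fired = []
    for i, v in enumerate(potentials):
        v += acc[i]
        v -= v >> decay_shift           # potential decay
        if v >= threshold:
            fired.append(i)             # outgoing encoder packs these as packets
            v = 0                       # reset on fire
        potentials[i] = v
    return potentials, fired
```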

🔗 Hierarchical Network‑on‑Chip

Dual‑path hierarchical topology supporting hundreds of neurons with low routing congestion and optimized spike delivery.

  • Spike Packet Path: Minimal payload (neuron ID + timestamp), multicast‑optimized forwarding
  • Data/Control Path: Handles weight initialization, coding mode selection, and configuration
  • Routing Mechanism: Address‑based forwarding with small routing tables
  • Flow Completion: Logic to signal when spike propagation is complete
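The minimal spike‑packet payload described above (neuron ID + timestamp) can be modeled as a simple bit‑field pack/unpack. The 16‑bit field widths are an assumption for illustration, not SNAP‑V's actual NoC packet format.

```python
# Sketch of a minimal spike packet: neuron ID + time-step stamp in one word.
# Field widths are assumed for illustration.

NEURON_ID_BITS = 16
TIMESTAMP_BITS = 16

def pack_spike(neuron_id, timestamp):
    assert 0 <= neuron_id < (1 << NEURON_ID_BITS)
    assert 0 <= timestamp < (1 << TIMESTAMP_BITS)
    return (timestamp << NEURON_ID_BITS) | neuron_id

def unpack_spike(packet):
    neuron_id = packet & ((1 << NEURON_ID_BITS) - 1)
    timestamp = packet >> NEURON_ID_BITS
    return neuron_id, timestamp
```

Keeping the payload to a single word is what makes multicast forwarding cheap: routers only inspect the ID field against their small routing tables.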

⚡ RISC‑V SoC Integration

Integration via RoCC (Rocket Custom Coprocessor) interface in the Chipyard framework, with custom instructions for accelerator control.

  • Accelerator Controller: Accepts custom instructions from RISC‑V CPU
  • Operations Control: Triggers start/stop, load parameters, run inference
  • Coding Hardware Unit: On‑chip input spike generation (TTFS and rate coding)
  • Peripheral Subsystem: UART/SPI for external communication, GPIO for debugging
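Custom instructions reaching the accelerator controller follow the standard RoCC layout (custom‑0 major opcode, `funct7` selecting the operation, `xd`/`xs1`/`xs2` flags marking which registers are live). The sketch below packs that layout in Python; the specific `funct7` codes are hypothetical placeholders, not SNAP‑V's actual encoding.

```python
# Sketch of the 32-bit RoCC custom-instruction layout.
# funct7 operation codes below are hypothetical placeholders.

CUSTOM0 = 0b0001011   # RISC-V custom-0 major opcode

def rocc_encode(funct7, rd, rs1, rs2, xd=1, xs1=1, xs2=1):
    """Pack fields into the RoCC instruction word:
    funct7[31:25] rs2[24:20] rs1[19:15] xd[14] xs1[13] xs2[12] rd[11:7] opcode[6:0]"""
    return ((funct7 & 0x7F) << 25) | ((rs2 & 0x1F) << 20) | \
           ((rs1 & 0x1F) << 15) | ((xd & 1) << 14) | ((xs1 & 1) << 13) | \
           ((xs2 & 1) << 12) | ((rd & 0x1F) << 7) | CUSTOM0

RUN_INFERENCE = 2     # hypothetical operation code
insn = rocc_encode(RUN_INFERENCE, rd=10, rs1=11, rs2=12)
```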

🗄️ Memory & Interfaces

Distributed SRAM near clusters and a controller for memory‑mapped I/O, initialization, and spike formatting.

  • Weight resolver: maps presynaptic IDs to weight addresses
  • Locality: cluster‑group partitioning to cut traffic
  • Init/config: load weights and neuron params at boot
  • Timing: synchronous time‑step execution
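The weight‑resolver lookup can be sketched as an address computation over the local cluster SRAM: a presynaptic spike's ID selects one weight row, which is then streamed to the accumulators. The row‑major layout here is an assumption for illustration.

```python
# Sketch of the weight resolver: presynaptic ID -> weight row in local SRAM.
# Row-major layout is an assumed convention.

def resolve_weights(pre_id, sram, weights_per_row, base=0):
    """Fetch the full weight row for one presynaptic spike."""
    start = base + pre_id * weights_per_row
    return sram[start:start + weights_per_row]
```

Because each cluster resolves weights out of its own SRAM, a spike packet never has to carry weights across the NoC, which is the locality benefit noted above.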

🎥 Vision Pipeline

End‑to‑end flow for vision: encode sensor frames to spike trains, process through accelerator layers, and classify via spike timing/rate.

  • Spike coding: TTFS and rate coding hardware units
  • Multi‑layer processing: clustered neurons and routers
  • Outputs: classification by spike timing or rate
  • Host control: configure, trigger, and read results
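The two input‑coding schemes in the pipeline can be sketched as follows. The scaling choices (8‑bit pixels, a fixed window of time steps, deterministic rather than stochastic rate coding) are illustrative assumptions, not the exact behavior of the on‑chip coding units.

```python
# Sketch of TTFS and rate spike coding for 8-bit pixel inputs.
# Scaling and determinism are illustrative assumptions.

def ttfs_encode(pixel, num_steps, max_val=255):
    """Time-to-First-Spike: brighter pixel -> earlier spike.
    Returns the step index of the single spike, or None for pixel == 0."""
    if pixel == 0:
        return None
    return (max_val - pixel) * (num_steps - 1) // max_val

def rate_encode(pixel, num_steps, max_val=255):
    """Rate coding: brighter pixel -> more spikes across the window.
    Returns a deterministic 0/1 spike train of length num_steps."""
    acc, train = 0, []
    for _ in range(num_steps):
        acc += pixel
        if acc >= max_val:
            acc -= max_val
            train.append(1)
        else:
            train.append(0)
    return train
```

TTFS emits at most one spike per input, trading accuracy headroom for very low spike traffic; rate coding spreads information over the whole window, which is why the experiments sweep time‑step counts.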

🛠️ Implementation & Tools

RTL in Verilog/SystemVerilog with simulation, FPGA prototyping and ASIC flows for power/area/timing analysis.

  • Simulation: ModelSim & Verilator + GTKWave
  • FPGA: prototyping and timing closure
  • ASIC: Synopsys DC/PrimeTime/PrimePower
  • Tooling: Python utilities for encoding/decoding

🔧 Technical Specifications

Neuromorphic Accelerator

  • Scale: Optimized for small‑scale SNNs (~1,000 neurons)
  • Neuron Model: Leaky Integrate‑and‑Fire (LIF) with tunable parameters
  • Organization: Cluster‑based parallel execution
  • Spike Coding: TTFS (Time‑to‑First‑Spike) and rate coding support

System Integration

  • Framework: Chipyard with RoCC (Rocket Custom Coprocessor) interface
  • Target Platform: Xilinx FPGA (ZCU104) for validation
  • RTL: Complete SystemVerilog implementation
  • Verification: ModelSim, Verilator, and post‑synthesis timing validation

Performance Results & Benchmarks

SNAP‑V demonstrates significant performance improvements over software SNN implementations, with power analysis showing memory‑dominated consumption and hardware accuracy closely matching software baselines across MNIST classification tasks.

⚡ Power & Efficiency

Memory‑dominant
>95% of power from memories (normalized)
  • Memory accounts for >95% of power consumption in normalized analysis
  • Synaptic energy efficiency exceeds state‑of‑the‑art designs (normalized to 16nm)
  • Event‑driven computation reduces dynamic switching power
  • Local memory placement minimizes NoC traffic and communication overhead

🎯 Model Accuracy

Near‑baseline
MNIST classification (TTFS / rate)
  • Hardware accuracy closely matches software baseline for MNIST classification
  • Minimal fixed‑point degradation with robust performance
  • Supports both TTFS and rate coding with consistent results
  • Validated across multiple time‑step configurations

🚀 Throughput

Significant speedup
vs. software SNN implementations
  • Significant speedup achieved through parallel cluster processing
  • Multicast spike routing reduces communication latency
  • Two‑core RISC‑V architecture offloads neuromorphic computation
  • On‑chip spike coding eliminates I/O bottlenecks

📊 Experimental Results

FPGA Prototyping

End‑to‑end validation on FPGA with synthesis reports and timing analysis for spike routing, neuron firing, and weight fetch.

  • Multiple cluster/neuron configurations
  • Clock generation and timing validation
  • Resource utilization optimization
  • NoC and memory stress tests
View FPGA Builds →

ASIC Synthesis

Power/area/timing analysis using Synopsys tools with technology normalization for fair comparisons.

  • Design Compiler for synthesis
  • PrimeTime for STA
  • PrimePower for power breakdown
  • Critical path and clock‑domain checks
View ASIC Results →

Vision Workloads

SNN experiments on MNIST and vision pipelines using TTFS and rate coding to evaluate accuracy/latency‑power trade‑offs.

  • Multiple neuron/cluster sizes
  • Variable time steps per inference
  • Latency vs. accuracy studies
  • On‑chip coding performance
View Experiments →

Verification & Simulation

Functional verification with ModelSim/Verilator and post‑synthesis timing checks; waveform inspection via GTKWave.

  • Comprehensive test benches
  • Spike encoding/decoding validation
  • NoC stress and backpressure tests
  • Hardware‑in‑the‑loop demos
View Models →

Team & Acknowledgements

👥 Development Team

Final Year Project Thesis (CO421 & CO425)
Department of Computer Engineering, University of Peradeniya
B.Sc.Eng. in Computer Engineering - July 2025

Kanishka Gunawardana (E/19/129) Hardware Architecture & RTL Design
Sanka Peeris (E/19/275) RISC‑V Integration & SoC Design
Kavishka Rambukwella (E/19/309) SNN Models & Neuromorphic Algorithms

🎓 Supervision

Dr. Isuru Nawinne Primary Supervisor - Computer Architecture & SoC Design
Prof. Roshan G. Ragel Co‑Supervisor - RISC‑V Systems & Embedded Computing

🙏 Acknowledgements

We extend our sincere gratitude to all those who supported and guided us throughout this project. Special thanks to the neuromorphic computing and RISC‑V communities, the Department of Computer Engineering at University of Peradeniya, and the previous project groups (Ms. Saadia Jameel, Mr. Thamish Wanduragala, and Mr. Akila Karunanayake) whose foundational work laid the groundwork for our continuation.

📚 Thesis Information

Title: SNAP‑V: A RISC‑V SoC with Configurable Neuromorphic Acceleration for Small‑Scale Spiking Neural Networks
Submission: July 2025
Dedication: To our loving parents, teachers, mentors, and friends who inspired and guided us throughout our academic journey.

📊 Project Statistics

15,000+
Lines of Verilog
50+
Test Benches
6
Neuron Models
15
Experimental Configs

🔧 Technology Stack

SystemVerilog RTL RISC‑V RV32IM Small‑Scale SNNs Chipyard Framework RoCC Interface Hierarchical NoC LIF Neurons FPGA Prototyping ModelSim/Verilator Synopsys ASIC Flow Edge Computing Open Source