(pronounced: ai·ris | aɪ.rɪs)
"AI That Opens Eyes"
A New Dimension of Awareness
AIris is not merely a tool; it is a paradigm shift in assistive technology for the visually impaired. Our mission is to deliver instantaneous, contextual awareness of the visual world, empowering users with an unprecedented level of freedom and independence. Where other tools offer a glimpse, AIris delivers sight.
Development Team
Rajin Khan (2212708042) & Saumik Saha Kabbya (2211204042)
North South University | CSE 499A/B Senior Capstone Project
Bridging the Visual Gap
Current assistive technologies are a compromise—slow, costly, and tethered to the cloud. They offer fragmented data, not holistic understanding. We identified four critical failures to overcome.
High Latency
5+ second delays and complex interactions break immersion and utility.
Cost Barriers
Proprietary hardware and expensive cloud APIs limit accessibility.
Cloud Dependency
No internet means no functionality, creating a fragile reliance on connectivity.
Context Gap
Static image analysis fails to understand user intent or the dynamics of an environment.
The AIris Solution
An elegant, purpose-built wearable that delivers sub-2-second, offline-first, context-aware descriptions. It is a quiet companion, a real-time narrator, and a bridge to visual freedom.
Instant Analysis
Sub-2-second response from a single button press to audio description. No apps, no menus, just instant awareness.
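As a sketch of the loop we are engineering toward: a single button press triggers capture, description, and speech. This is illustrative rather than final firmware; it assumes a push button wired to BCM pin 17, a placeholder describe_scene() standing in for the vision-language model call, and the espeak CLI for audio output.

```python
import subprocess
import time

import RPi.GPIO as GPIO
from picamera2 import Picamera2

BUTTON_PIN = 17  # assumption: push button on BCM pin 17, active-low

def describe_scene(image_path: str) -> str:
    """Placeholder for the local vision-language model call."""
    return "A sidewalk ahead with a bicycle parked on the left."

def main() -> None:
    GPIO.setmode(GPIO.BCM)
    GPIO.setup(BUTTON_PIN, GPIO.IN, pull_up_down=GPIO.PUD_UP)

    camera = Picamera2()
    camera.configure(camera.create_still_configuration())
    camera.start()  # keep the camera warm so capture latency stays low

    try:
        while True:
            GPIO.wait_for_edge(BUTTON_PIN, GPIO.FALLING)  # block until press
            start = time.monotonic()
            camera.capture_file("/tmp/frame.jpg")
            description = describe_scene("/tmp/frame.jpg")
            subprocess.run(["espeak", description])  # assumes espeak installed
            print(f"press-to-speech latency: {time.monotonic() - start:.2f}s")
    finally:
        GPIO.cleanup()

if __name__ == "__main__":
    main()
```

Timing the whole loop with time.monotonic() is how we hold ourselves to the sub-2-second budget at every stage of development.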
Edge AI Processing
Local-first approach on a Raspberry Pi 5 ensures privacy, low latency, and functionality without an internet connection.
Safety Prioritized
The AI engine is designed to identify potential hazards, such as obstacles, traffic, and steps, and to announce them before anything else.
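Even when the underlying model returns an unordered description, hazard-first delivery can be enforced in software. A minimal sketch, assuming a hypothetical hand-picked HAZARD_TERMS keyword list (a production system would use a tuned classifier or structured model output instead):

```python
import re

# Assumption: an illustrative keyword list, not a vetted safety taxonomy.
HAZARD_TERMS = ("obstacle", "traffic", "car", "step", "stairs", "hole", "bike")

def prioritize_hazards(description: str) -> str:
    """Reorder sentences so any that mention a hazard are spoken first."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", description) if s.strip()]
    hazards = [s for s in sentences if any(t in s.lower() for t in HAZARD_TERMS)]
    rest = [s for s in sentences if s not in hazards]
    return " ".join(hazards + rest)

print(prioritize_hazards(
    "A quiet street with trees. A car is approaching from the right. "
    "There is a curb step ahead."
))
# -> "A car is approaching from the right. There is a curb step ahead.
#     A quiet street with trees."
```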
Human-First Design
A lightweight, comfortable, and discreet form factor designed for all-day wear, with private audio delivery.
Grounding Our Vision in Research
The AIris project is built upon a solid foundation of academic and applied research. Our review of existing literature validates our architectural choices and highlights our key contributions to the field of assistive technology.
Key Research Gaps Addressed
| Research Gap Identified | How AIris Addresses the Gap |
| --- | --- |
| High Latency & Cloud Dependency | An offline-first architecture on a Raspberry Pi 5 ensures sub-2-second response times, eliminating reliance on internet connectivity. |
| Lack of Contextual Understanding | Integration of modern Vision-Language Models (LLaVA, BLIP-2) provides rich, human-like descriptions, moving beyond simple object lists. |
| High Cost & Poor Accessibility | A targeted hardware budget under $160 USD and an open-source philosophy make the technology vastly more accessible than commercial alternatives. |
| On-Device Performance Limitations | Targeted hardware/software co-design, including model quantization and memory management, is a core development phase, not an afterthought. |
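One concrete example of the quantization lever named in the last row: PyTorch's dynamic quantization converts Linear-layer weights to int8, shrinking memory and often improving latency on CPU-only devices like the Pi 5. This is an illustrative sketch, not the project's final optimization path; deployment may instead rely on pre-quantized weights served through Ollama.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model's linear layers; the real target would be the
# transformer blocks of the chosen vision-language model.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 256))

# Dynamic quantization: weights stored as int8, activations quantized on the fly.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.inference_mode():
    out = quantized(torch.randn(1, 512))
print(out.shape)  # torch.Size([1, 256])
```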
References
- Naayini, P., et al. (2025). AI-Powered Assistive Technologies for Visual Impairment.
- Wang, L., & Wong, A. (2019). Enabling Computer Vision Driven Assistive Devices... (foundational work)
- Elmannai, W., & Elleithy, K. (2017). Sensor-Based Assistive Devices for Visually-Impaired People... (foundational work)
- Liu, H., et al. (2023). Visual Instruction Tuning (LLaVA).
- Li, J., et al. (2023). BLIP-2: Bootstrapping Language-Image Pre-training...
Anatomy of Instant Vision
Our modular architecture separates the system into a wearable Spectacle Unit and a powerful Pocket Unit. This core design is flexible, allowing for multiple physical form factors.
Spectacle Unit
Pocket Unit
Conceptual Form Factors


Our Technology Stack
We are leveraging a state-of-the-art technology stack, chosen for performance on edge devices. This is not just a concept; it is an engineered system.
AI Model Evaluation
Benchmarking multiple vision-language models to find the optimal balance of speed, accuracy, and resource usage for local deployment.
- LLaVA-v1.5: Primary for balanced local performance.
- BLIP-2: Used as an accuracy benchmark.
- Groq API: For high-speed cloud fallback.
- Ollama: For flexible local LLM hosting.
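The local-first/cloud-fallback split sketched below shows how these pieces might fit together. It assumes an Ollama server on localhost with a LLaVA model already pulled; the prompt text and the Groq fallback stub are illustrative, not the project's final implementation.

```python
import base64

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def describe_locally(image_path: str, model: str = "llava",
                     timeout_s: float = 10.0) -> str:
    """Ask a local vision-language model (served by Ollama) to describe a frame."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": model,  # assumes e.g. `ollama pull llava` has been run
            "prompt": "Describe this scene for a blind pedestrian. Hazards first.",
            "images": [image_b64],
            "stream": False,
        },
        timeout=timeout_s,
    )
    resp.raise_for_status()
    return resp.json()["response"]

def describe_via_groq(image_path: str) -> str:
    """Hypothetical cloud fallback; wiring up the Groq client is out of scope here."""
    raise NotImplementedError("requires network access and a Groq API key")

def describe_scene(image_path: str) -> str:
    """Local-first: prefer the on-device model, fall back to the cloud on failure."""
    try:
        return describe_locally(image_path)
    except requests.RequestException:
        return describe_via_groq(image_path)
```

Keeping both paths behind one describe_scene() signature lets the rest of the pipeline stay agnostic to where inference actually ran.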
Software Stack
Built on a robust Python foundation, utilizing industry-standard libraries for computer vision, AI, and hardware interfacing.
- Python 3.11+ (Core Language)
- PyTorch 2.0+ (AI Framework)
- OpenCV (Computer Vision)
- RPi.GPIO & picamera2 (Hardware Control)
Current Development Status
We are in the active prototyping and testing phase, using a web interface to rapidly evaluate and optimize different multimodal AI models before hardware integration.
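Latency comparisons of this kind reduce to a small harness. A minimal sketch, assuming candidate models are served locally and that a describe(image_path, model) callable, such as the describe_locally() sketch above, is supplied; the model tags listed are illustrative.

```python
import statistics
import time
from typing import Callable

# Illustrative Ollama model tags; the real shortlist comes out of this evaluation.
CANDIDATES = ["llava:7b", "llava:13b", "bakllava"]

def benchmark(describe: Callable[[str, str], str], model: str,
              image_path: str, runs: int = 5) -> float:
    """Return the median end-to-end latency in seconds for one model on one image."""
    samples = []
    for _ in range(runs):
        start = time.monotonic()
        describe(image_path, model)
        samples.append(time.monotonic() - start)
    return statistics.median(samples)

# Usage with the describe_locally() helper from the earlier sketch:
# for tag in CANDIDATES:
#     print(f"{tag}: {benchmark(describe_locally, tag, 'scenes/street.jpg'):.2f}s")
```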


Budget & Portability
Accessibility includes affordability. We have sourced components so that a baseline build stays under our ৳17,000 target for the Bangladesh market, without sacrificing the core mission of complete portability.
| Component Category | Cost Range (BDT) | Est. Weight |
| --- | --- | --- |
| Core Computing (Pi 5, SD Card) | ৳10,600 - ৳12,600 | ~200 g |
| Portable Power (Power Bank, Cables) | ৳2,350 - ৳3,600 | ~400 g |
| Camera & Audio System | ৳1,980 - ৳3,470 | ~150 g |
| Control & Housing | ৳955 - ৳1,910 | ~180 g |
| TOTAL ESTIMATE (Target < ৳17,000) | ৳15,885 - ৳21,580 | ~930 g |
Two Phases of Innovation
Phase 1 (CSE 499A): Software Foundation & AI Integration. This phase involves deep research into lightweight vision-language models, benchmarking their performance on the Raspberry Pi 5, building the core scene-description engine, and optimizing the entire software pipeline for latency and efficiency.
Phase 2 (CSE 499B): Hardware Integration & User Experience. This phase brings the project into the physical world. We will 3D-model and print the custom enclosures, assemble the complete wearable system, and conduct extensive field testing with users to gather feedback and refine the final product.
Exceeding Course Outcomes
This project is meticulously designed to meet and exceed the learning outcomes for the CSE 499A/B Senior Capstone course.
Problem & Design: We identify a real-world engineering problem and design a complete, constrained hardware/software system to meet desired needs.
Modern Tools: We leverage a modern stack including Python, PyTorch, modern AI models, and embedded systems.
Constraint Validation: Our budget addresses economic constraints, the offline-first design addresses privacy, and the core function prioritizes safety.
Defense & Documentation: This experience, along with our detailed documentation, fulfills all reporting and defense requirements.
AIris
Thank you.