
Airo: On-Device AI + RAG + Function Calling Product Strategy

Status: Phase 0 Complete ✅ Phase 1 Starting 🚀

Executive Summary

Airo is an AI-powered super app that processes PDFs, images, and audio on-device using Gemma 3 1B (quantized to int4) with RAG and function calling. Core workflows require no network access. Target platforms: Android (Pixel 9), iOS (iPhone 13 Pro Max), and Chrome (desktop).

Key Differentiator: All data processing happens locally. No PHI leaves the device unless explicitly opted in.


Success Metrics (Implementable)

| Metric | Target | Rationale |
|---|---|---|
| Offline Success Rate | 90% of workflows | PDF→extraction→action with no network |
| Latency | <3s for 1-page PDF | Pixel 9 + Gemma 1B int4 |
| App Footprint | <1.2GB | Model + app minimal install |
| Extraction F1 | ≥0.9 | Bills, diet plans held-out test set |
| Battery | <5% per workflow | Single full run on Pixel 9 |
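
The Extraction F1 target can be checked per document with a small scoring helper. The following is a minimal sketch, assuming extraction output is a flat mapping of field names to string values and that a field counts as correct only on exact match; the field names and sample values are hypothetical and not taken from the project's test set.

```python
def extraction_f1(predicted: dict, gold: dict) -> float:
    """F1 over exact-match (field, value) pairs for a single document."""
    pred_pairs = set(predicted.items())
    gold_pairs = set(gold.items())
    true_pos = len(pred_pairs & gold_pairs)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_pairs)
    recall = true_pos / len(gold_pairs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical bill fields; names and values are illustrative only.
gold = {"vendor": "Acme Clinic", "total": "42.50", "due_date": "2025-11-15"}
predicted = {"vendor": "Acme", "total": "42.50", "due_date": "2025-11-15"}

score = extraction_f1(predicted, gold)
print(f"F1 = {score:.3f}, meets the 0.9 target: {score >= 0.9}")
```

Exact matching is the strictest choice; a real evaluation might normalize dates and currency amounts before comparing.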

High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                 Flutter UI (Cross-Platform)                 │
│   (Login, File Upload, Results, Notifications, Payments)    │
└────────────────────┬────────────────────────────────────────┘
                     │ Platform Channels
        ┌────────────┼────────────┐
        │            │            │
    ┌───▼───┐    ┌───▼───┐    ┌───▼───┐
    │Android│    │  iOS  │    │Chrome │
    │Native │    │ Swift │    │ WASM  │
    └───┬───┘    └───┬───┘    └───┬───┘
        │            │            │
    ┌───▼────────────▼────────────▼────────────────┐
    │  AI Edge SDK / LiteRT Runtime                │
    │  ├─ Gemma 1B int4 (Inference)                │
    │  ├─ Embedding Model (RAG)                    │
    │  ├─ Function Calling SDK                     │
    │  └─ OCR / PDF Parser                         │
    └───┬──────────────────────────────────────────┘
        │
    ┌───▼──────────────────────────────────────────┐
    │  Local Storage (Encrypted)                   │
    │  ├─ SQLCipher (Extracted Data)               │
    │  ├─ Vector Index (HNSW / AI Edge RAG)        │
    │  └─ Notification Schedule / Payment Requests │
    └──────────────────────────────────────────────┘

Core Use Cases & Functions

1. fill_form(patient) - Healthcare/Diet Forms

2. schedule_notifications(plan_id) - Recurring Reminders

3. split_bill(bill_id) - Expense Sharing
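
One way to expose these three functions to the model is a table of JSON-schema-style declarations plus a dispatcher that routes a model-emitted call to a local handler. The sketch below is illustrative only: the parameter names come from the signatures above, while the parameter types, descriptions, the `{"name", "arguments"}` call format, and the stub handlers are assumptions rather than the project's or the AI Edge SDK's actual API.

```python
import json

# Hypothetical declarations; only the function and parameter names are from the doc.
FUNCTION_DECLARATIONS = {
    "fill_form": {
        "description": "Fill a healthcare or diet form for a patient.",
        "parameters": {
            "type": "object",
            "properties": {"patient": {"type": "object"}},
            "required": ["patient"],
        },
    },
    "schedule_notifications": {
        "description": "Schedule recurring reminders for a plan.",
        "parameters": {
            "type": "object",
            "properties": {"plan_id": {"type": "string"}},
            "required": ["plan_id"],
        },
    },
    "split_bill": {
        "description": "Split a bill among participants.",
        "parameters": {
            "type": "object",
            "properties": {"bill_id": {"type": "string"}},
            "required": ["bill_id"],
        },
    },
}


# Stub handlers that only echo their input; real ones would touch local storage.
def fill_form(patient):
    return {"status": "filled", "fields": sorted(patient)}


def schedule_notifications(plan_id):
    return {"status": "scheduled", "plan_id": plan_id}


def split_bill(bill_id):
    return {"status": "pending_confirmation", "bill_id": bill_id}


HANDLERS = {
    "fill_form": fill_form,
    "schedule_notifications": schedule_notifications,
    "split_bill": split_bill,
}


def dispatch(model_output: str):
    """Parse a JSON function call emitted by the model and run its handler."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in HANDLERS:
        raise ValueError(f"unknown function: {name}")
    return HANDLERS[name](**call["arguments"])


print(dispatch('{"name": "schedule_notifications", "arguments": {"plan_id": "plan-7"}}'))
```

Keeping declarations and handlers in separate tables lets the declarations be serialized into the prompt without exposing any executable code to the model.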


Implementation Roadmap (Phases 0-10)

Phase 0: Foundation ✅ COMPLETE

Phase 1: PoC - On-Device AI (2-3 sprints)

Phase 2: Document Ingestion (2 sprints)

Phase 3: RAG & Vector Store (2 sprints)

Phase 4: Function Calling (1-2 sprints)

Phase 5: iOS & Web Parity (2-3 sprints)

Phase 6: Privacy & Security (1 sprint)

Phase 7: Performance (1 sprint)

Phase 8: Testing & Eval (2 sprints)

Phase 9: Beta (2 sprints)

Phase 10: Production (Ongoing)


Technology Stack

| Component | Technology | Notes |
|---|---|---|
| Model | Gemma 3 1B int4 | LiteRT quantized, <500MB |
| Inference | AI Edge SDK / LiteRT | Android native, iOS CoreML, Web WASM |
| RAG | AI Edge RAG SDK | On-device retrieval + chunking |
| Function Calling | AI Edge Function Calling | JSON schema-based |
| OCR | ML Kit + Tesseract | Local, no cloud |
| Storage | SQLCipher + HNSW | Encrypted vectors + metadata |
| Notifications | WorkManager (Android) / UserNotifications (iOS) | Local scheduling |
| UI | Flutter | Single codebase, platform channels |
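
To make the retrieval step of the RAG row concrete, the sketch below ranks stored chunk embeddings against a query embedding by cosine similarity. It is a brute-force stand-in, not HNSW and not the AI Edge RAG SDK: HNSW approximates the same nearest-neighbor ranking with a graph index so that lookups stay fast as the store grows. The chunk IDs and three-dimensional vectors are made up for illustration; real embeddings would come from the embedding model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, index, k=2):
    """Return the IDs of the k chunks most similar to the query (exact search)."""
    ranked = sorted(index.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk_id for chunk_id, _ in ranked[:k]]


# Hypothetical chunk embeddings keyed by chunk ID.
index = {
    "bill_total": [0.9, 0.1, 0.0],
    "diet_plan": [0.1, 0.9, 0.2],
    "appointment": [0.0, 0.2, 0.9],
}

print(top_k([0.8, 0.2, 0.1], index, k=2))
```

Exact search like this is often adequate for a few thousand chunks on-device; an approximate index becomes worthwhile once per-query latency starts to matter.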

Risk Mitigations

| Risk | Mitigation |
|---|---|
| OOM on older devices | Smaller model, streaming tokens, cloud fallback |
| Function hallucination | Strict JSON schema, deterministic validators, human confirmation for payments |
| Multimodal sync issues | Prefer deterministic OCR for numeric/date data |
| Model accuracy drift | A/B testing, continuous evaluation, model updates |
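
The function-hallucination mitigation can be pictured as a gate that every model-emitted call must pass before anything runs. The sketch below is one possible shape, under several assumptions: calls arrive as JSON with `name` and `arguments` keys, arguments are plain strings, and `split_bill` is treated as payment-related and therefore needs user confirmation. `RejectedCall`, `validate_call`, and the `confirm` callback are hypothetical names.

```python
import json

# Assumed argument schemas: argument name -> required Python type.
SCHEMAS = {
    "schedule_notifications": {"plan_id": str},
    "split_bill": {"bill_id": str},
}
# Assumption: split_bill leads to payment requests, so a human must approve it.
REQUIRES_CONFIRMATION = {"split_bill"}


class RejectedCall(Exception):
    """Raised when a model-emitted call fails validation or is declined."""


def validate_call(raw: str, confirm) -> dict:
    """Return the parsed call only if it passes every deterministic check."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise RejectedCall("output is not valid JSON") from exc
    if not isinstance(call, dict):
        raise RejectedCall("call must be a JSON object")
    name, args = call.get("name"), call.get("arguments")
    if not isinstance(name, str) or name not in SCHEMAS:
        raise RejectedCall(f"unknown function: {name!r}")
    schema = SCHEMAS[name]
    # Strict: no missing and no extra arguments.
    if not isinstance(args, dict) or set(args) != set(schema):
        raise RejectedCall("arguments do not match the schema")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise RejectedCall(f"argument {key!r} has the wrong type")
    if name in REQUIRES_CONFIRMATION and not confirm(name, args):
        raise RejectedCall("user declined the action")
    return {"name": name, "arguments": args}


approved = validate_call(
    '{"name": "split_bill", "arguments": {"bill_id": "b-1"}}',
    confirm=lambda name, args: True,  # stands in for a UI confirmation dialog
)
print(approved)
```

Because every check is deterministic, a rejected call can be logged and retried or surfaced to the user without the model's output ever reaching a handler.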

Next Steps (Week 1)

  1. Download & quantize Gemma 1B → int4 LiteRT model
  2. Set up Android native module → Load model, basic inference
  3. Create Flutter plugin → Platform channel bridge
  4. Register fill_form function → Test with sample form
  5. Build Flutter UI → Text input, model response display


Product Manager: Airo Team
CTO: Architecture & Technology Lead
Last Updated: 2025-10-30