CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

mobile_ocr is a Flutter plugin for on-device OCR on Android and iOS. The Android implementation is a direct port of OnnxOCR and runs PaddleOCR v5 models on ONNX Runtime. The iOS implementation uses Apple’s Vision framework to provide the same API surface without shipping ONNX models.

Critical Constraint: NO OpenCV or large SDKs. To prevent native library bloat, the Android implementation may use only native Android APIs (Bitmap, Canvas, Matrix, Paint) and ONNX Runtime.

Common Commands

Plugin Testing

# Run Flutter tests
flutter test

# Run Android unit tests
cd android && ./gradlew test

Example App

cd example

# Run example app
flutter run

# CRITICAL FOR AI AGENTS: Never use Bash tool directly for flutter run
# Use Task tool with general-purpose agent instead to avoid context pollution

Test Configuration

Example app auto-loads test images with ground truth validation:

First image loads automatically after 3 seconds
Enable auto-cycle: Set AUTO_CYCLE_TEST_IMAGES = true in example/lib/main.dart:14
Test images: example/assets/test_ocr/ with ground_truth.json

Architecture

OCR Pipeline (3 Stages)

Direct port of OnnxOCR's processing pipeline:

1. Text Detection (android/src/main/kotlin/.../TextDetector.kt)
   - DB algorithm, model: det.onnx (4.75 MB)
   - Resize to 960px min side, normalize with mean/std, CHW format
   - Postprocess: threshold=0.3, box_threshold=0.6, unclip_ratio=1.5
2. Angle Classification (TextClassifier.kt)
   - Detects 180° rotation, model: cls.onnx (583 KB)
   - Input: (3, 48, 192), threshold=0.9
3. Text Recognition (TextRecognizer.kt)
   - SVTR_LCNet + CTC decoder, model: rec.onnx (16.5 MB)
   - Input: (3, 48, 320), batch_size=6
   - Dictionary: ppocrv5_dict.txt
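
The CTC decoding step of the recognition stage can be illustrated with greedy decoding: take the highest-scoring class at each time step, collapse consecutive repeats, and drop the blank token. This is a minimal sketch, not the code in TextRecognizer.kt. The function name `ctcGreedyDecode`, the `[timeSteps][numClasses]` score layout, and the blank token at index 0 are all assumptions made for illustration.

```kotlin
// Hypothetical helper: greedy CTC decoding over per-time-step class scores.
// Assumptions (not taken from the plugin source): index 0 is the CTC blank,
// and dictionary[i - 1] holds the character for class index i.
fun ctcGreedyDecode(scores: Array<FloatArray>, dictionary: List<String>): String {
    val text = StringBuilder()
    var previous = -1
    for (step in scores) {
        // Argmax over the class dimension for this time step.
        var best = 0
        for (i in step.indices) {
            if (step[i] > step[best]) best = i
        }
        // Emit only when the class is not blank and differs from the previous step.
        if (best != 0 && best != previous) {
            text.append(dictionary[best - 1])
        }
        previous = best
    }
    return text.toString()
}

fun main() {
    val dictionary = listOf("a", "b")
    // Per-step winners: a, a, blank, a, b
    val scores = arrayOf(
        floatArrayOf(0.1f, 0.8f, 0.1f),
        floatArrayOf(0.1f, 0.7f, 0.2f),
        floatArrayOf(0.9f, 0.05f, 0.05f),
        floatArrayOf(0.2f, 0.6f, 0.2f),
        floatArrayOf(0.1f, 0.2f, 0.7f)
    )
    println(ctcGreedyDecode(scores, dictionary)) // prints "aab"
}
```

The blank between the second and third "a" is what lets a genuinely doubled character survive the repeat-collapsing step.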

Model Delivery

Models are NOT bundled with the plugin:

Hosted: https://models.ente.io/PP-OCRv5/
Managed by: ModelManager.kt (download, verify SHA-256, cache)
Cached: context.filesDir/assets/mobile_ocr/
Triggered: First prepareModels() call
Offline: Works offline after initial download
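
The SHA-256 verification step can be sketched with the JDK's `MessageDigest`. This is an illustration under stated assumptions rather than the code in ModelManager.kt: the helper names `sha256Hex` and `isModelIntact` are hypothetical, and where the expected digests come from is not specified here.

```kotlin
import java.io.File
import java.security.MessageDigest

// Hypothetical helper: streams a file through SHA-256 and returns lowercase hex.
fun sha256Hex(file: File): String {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buffer = ByteArray(8192)
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break
            digest.update(buffer, 0, read)
        }
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// Hypothetical check: true when the file's hash matches the expected hex digest.
fun isModelIntact(file: File, expectedSha256: String): Boolean =
    sha256Hex(file).equals(expectedSha256, ignoreCase = true)
```

Streaming in fixed-size chunks keeps memory use flat, so the 16.5 MB recognition model never has to be loaded into memory just to be hashed.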

Component Structure

Native (Android) - android/src/main/kotlin/io/ente/mobile_ocr/ (PaddleOCR v5 on ONNX Runtime):

MobileOcrPlugin.kt: Flutter method channel interface
OcrProcessor.kt: Pipeline orchestrator
ModelManager.kt: Download/cache manager
TextDetector.kt: Detection stage
TextClassifier.kt: Classification stage
TextRecognizer.kt: Recognition stage
ImageUtils.kt: Pure Kotlin image preprocessing (NO OpenCV)

Native (iOS) - ios/Classes/ (Apple Vision framework):

MobileOcrPlugin.swift: Vision-based text recognition returning the shared result schema
Uses VNRecognizeTextRequest with language auto-detection where available
prepareModels() short-circuits with isReady=true (no downloads required)

Flutter (Dart) - lib/:

mobile_ocr_plugin.dart: Public API (detectText(), prepareModels())
mobile_ocr_plugin_platform_interface.dart: Platform interface
mobile_ocr_plugin_method_channel.dart: Method channel implementation

Models - Downloaded at runtime:

det.onnx, rec.onnx, cls.onnx, ppocrv5_dict.txt

Example - example/:

Full demo app with test images and ground truth validation
Shows text overlay, selection, copying, confidence visualization

Implementation Rules

No OpenCV Rule

When porting Python cv2 operations:

✅ Use Android Bitmap, Canvas, Matrix, Paint
✅ Implement custom algorithms in pure Kotlin
✅ Accept minor output differences when that avoids a dependency
❌ Never use OpenCV or libraries that bundle .so files
❌ Never add large image processing SDKs

Model Parameter Compatibility

Must match OnnxOCR exactly:

Detection: limit_side_len=960, db_thresh=0.3, box_thresh=0.6, unclip_ratio=1.5
Classification: thresh=0.9, shape (3, 48, 192)
Recognition: batch_num=6, shape (3, 48, 320)
Normalization:
- Detection: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
- Rec/Cls: (pixel/255 - 0.5) / 0.5
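
The two normalization formulas above can be written out in pure Kotlin as follows. This is a sketch, not the code in ImageUtils.kt: the function names are hypothetical, and both the R, G, B plane order and the division by 255 before applying the detection mean/std are assumptions that should be checked against OnnxOCR before relying on them.

```kotlin
// Per-channel statistics for detection (values from the list above).
val DET_MEAN = floatArrayOf(0.485f, 0.456f, 0.406f)
val DET_STD = floatArrayOf(0.229f, 0.224f, 0.225f)

// Detection: scale to [0, 1], then subtract the channel mean and divide by the channel std.
fun normalizeDet(pixel: Int, channel: Int): Float =
    (pixel / 255f - DET_MEAN[channel]) / DET_STD[channel]

// Recognition/classification: map [0, 255] to [-1, 1].
fun normalizeRecCls(pixel: Int): Float = (pixel / 255f - 0.5f) / 0.5f

// Hypothetical helper: converts packed ARGB pixels (the layout Bitmap.getPixels returns)
// into a CHW float array using the detection normalization.
// Assumption: planes are written in R, G, B order.
fun toChwDet(argb: IntArray, width: Int, height: Int): FloatArray {
    val plane = width * height
    val out = FloatArray(3 * plane)
    for (i in 0 until plane) {
        val p = argb[i]
        out[i] = normalizeDet((p shr 16) and 0xFF, 0)         // R plane
        out[plane + i] = normalizeDet((p shr 8) and 0xFF, 1)  // G plane
        out[2 * plane + i] = normalizeDet(p and 0xFF, 2)      // B plane
    }
    return out
}
```

Working on a plain `IntArray` keeps the math testable on the JVM without an Android `Bitmap`.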

Memory Management

Always call bitmap.recycle() after use
Close ONNX tensors explicitly
Keep heavy processing in native layer
Use Kotlin coroutines for async work
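
The release-after-use rules above amount to a try/finally discipline. The sketch below uses a stand-in class so that it runs on a plain JVM; `NativeResource` and `withResource` are hypothetical names, not types from the plugin or from ONNX Runtime.

```kotlin
// Stand-in for a native-backed resource such as an ONNX tensor or a Bitmap.
// Hypothetical class for illustration only.
class NativeResource(val name: String) : AutoCloseable {
    var released = false
        private set

    override fun close() {
        released = true
    }
}

// Runs `block` and always releases the resource, even if `block` throws.
fun <T> withResource(resource: NativeResource, block: (NativeResource) -> T): T {
    try {
        return block(resource)
    } finally {
        resource.close()
    }
}

fun main() {
    val tensor = NativeResource("input-tensor")
    val length = withResource(tensor) { it.name.length }
    println("$length ${tensor.released}") // prints "12 true"
}
```

In real code the `finally` branch would call `bitmap.recycle()` for a Bitmap. ONNX Runtime's `OnnxTensor` implements `AutoCloseable`, so Kotlin's `use { }` gives the same guarantee for tensors.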

Dependencies

Android (android/build.gradle):

implementation("com.microsoft.onnxruntime:onnxruntime-android:1.22.0")
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.7.3")

Flutter (pubspec.yaml):

dependencies:
  flutter: {sdk: flutter}
  plugin_platform_interface: ^2.0.2
  path_provider: ^2.1.0

dev_dependencies:
  flutter_test: {sdk: flutter}
  flutter_lints: ^5.0.0

Platform Support

✅ Android (API 24+, ONNX Runtime + PaddleOCR)
✅ iOS (Vision framework)

Reference Documentation

documentation/ONNX_OCR_PLUGIN_CONTEXT.md: Complete context, testing workflow
documentation/OnnxOCR_Implementation_Guide.md: Model specs, algorithms
Original: OnnxOCR
Models: PaddleOCR v5
