This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
mobile_ocr is a Flutter plugin for on-device OCR across Android and iOS. The Android implementation directly ports OnnxOCR using PaddleOCR v5 models on ONNX Runtime, while the iOS implementation uses Apple’s Vision framework to provide the same API surface without shipping ONNX models.
Critical Constraint: NO OpenCV or large SDKs. Only native Android APIs (Bitmap, Canvas, Matrix, Paint) and ONNX Runtime are allowed to prevent native library bloat.
# Run Flutter tests
flutter test
# Run Android unit tests
cd android && ./gradlew test
cd example
# Run example app
flutter run
# CRITICAL FOR AI AGENTS: Never use Bash tool directly for flutter run
# Use Task tool with general-purpose agent instead to avoid context pollution

Example app auto-loads test images with ground truth validation:
- First image loads automatically after 3 seconds
- Enable auto-cycle: Set AUTO_CYCLE_TEST_IMAGES = true in example/lib/main.dart:14
- Test images: example/assets/test_ocr/ with ground_truth.json
Direct port of OnnxOCR's processing pipeline:
- Text Detection (android/src/main/kotlin/.../TextDetector.kt)
  - DB algorithm, model: det.onnx (4.75 MB)
  - Resize to 960px min side, normalize with mean/std, CHW format
  - Postprocess: threshold=0.3, box_threshold=0.6, unclip_ratio=1.5
- Angle Classification (TextClassifier.kt)
  - Detects 180° rotation, model: cls.onnx (583 KB)
  - Input: (3, 48, 192), threshold=0.9
- Text Recognition (TextRecognizer.kt)
  - SVTR_LCNet + CTC decoder, model: rec.onnx (16.5 MB)
  - Input: (3, 48, 320), batch_size=6
  - Dictionary: ppocrv5_dict.txt
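
The CTC decoding step of the recognition stage can be sketched as a greedy decode: take the most likely class at each timestep, collapse consecutive repeats, and drop blanks. This is a minimal illustration, not the code in TextRecognizer.kt. The function name `ctcGreedyDecode`, the `[timesteps][classes]` layout, and the assumption that class 0 is the CTC blank (so class `i` maps to dictionary entry `i - 1`) are all assumptions made for the example.

```kotlin
// Hypothetical helper, not taken from TextRecognizer.kt.
// Assumes class 0 is the CTC blank and class i maps to dictionary[i - 1].
fun ctcGreedyDecode(probs: Array<FloatArray>, dictionary: List<String>): String {
    val text = StringBuilder()
    var previous = -1
    for (step in probs) {
        // Argmax over the class dimension for this timestep.
        var best = 0
        for (c in step.indices) {
            if (step[c] > step[best]) best = c
        }
        // Emit a character only when the class changes and is not the blank.
        if (best != previous && best != 0) {
            text.append(dictionary[best - 1])
        }
        previous = best
    }
    return text.toString()
}
```

For example, with the dictionary [a, b] and per-timestep argmax classes 1, 1, 0, 1, 2, the repeated 1 is collapsed and the blank is dropped, so the decoded text is "aab".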
Models are NOT bundled with the plugin:
- Hosted: https://models.ente.io/PP-OCRv5/
- Managed by: ModelManager.kt (download, verify SHA-256, cache)
- Cached: context.filesDir/assets/mobile_ocr/
- Triggered: First prepareModels() call
- Offline: Works offline after initial download
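
The SHA-256 verification step could look roughly like the sketch below. It is not the actual ModelManager.kt code: `sha256Hex`, `isModelValid`, and the `expectedSha256` parameter are hypothetical names, and where the expected hashes come from is left open. Only standard JVM APIs (`java.security.MessageDigest`, `java.io.File`) are used.

```kotlin
import java.io.File
import java.security.MessageDigest

// Hypothetical helper: streams the file through SHA-256 and returns lowercase hex.
fun sha256Hex(file: File): String {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buffer = ByteArray(8192)
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break
            digest.update(buffer, 0, read)
        }
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// Hypothetical check: a cached model is usable only if it exists and its hash matches.
fun isModelValid(file: File, expectedSha256: String): Boolean =
    file.exists() && sha256Hex(file).equals(expectedSha256, ignoreCase = true)
```

Streaming the file in chunks keeps memory use flat even for the larger recognition model, which matters on low-end devices.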
Native (Android) - android/src/main/kotlin/io/ente/mobile_ocr/ (PaddleOCR v5 on ONNX Runtime):
- MobileOcrPlugin.kt: Flutter method channel interface
- OcrProcessor.kt: Pipeline orchestrator
- ModelManager.kt: Download/cache manager
- TextDetector.kt: Detection stage
- TextClassifier.kt: Classification stage
- TextRecognizer.kt: Recognition stage
- ImageUtils.kt: Pure Kotlin image preprocessing (NO OpenCV)
Native (iOS) - ios/Classes/ (Apple Vision framework):
- MobileOcrPlugin.swift: Vision-based text recognition returning the shared result schema
- Uses VNRecognizeTextRequest with language auto-detection where available
- prepareModels() short-circuits with isReady=true (no downloads required)
Flutter (Dart) - lib/:
- mobile_ocr_plugin.dart: Public API (detectText(), prepareModels())
- mobile_ocr_plugin_platform_interface.dart: Platform interface
- mobile_ocr_plugin_method_channel.dart: Method channel implementation
Models - Downloaded at runtime:
- det.onnx, rec.onnx, cls.onnx, ppocrv5_dict.txt
Example - example/:
- Full demo app with test images and ground truth validation
- Shows text overlay, selection, copying, confidence visualization
When porting Python cv2 operations:
- ✅ Use Android Bitmap, Canvas, Matrix, Paint
- ✅ Implement custom algorithms in pure Kotlin
- ✅ Accept minor differences if it avoids dependencies
- ❌ Never use OpenCV or libraries that bundle .so files
- ❌ Never add large image processing SDKs
Must match OnnxOCR exactly:
- Detection: limit_side_len=960, db_thresh=0.3, box_thresh=0.6, unclip_ratio=1.5
- Classification: thresh=0.9, shape (3, 48, 192)
- Recognition: batch_num=6, shape (3, 48, 320)
- Normalization:
  - Detection: mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
  - Rec/Cls: (pixel/255 - 0.5) / 0.5
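
The two normalization rules above can be sketched in pure Kotlin as follows. This is an illustration, not the code in ImageUtils.kt: the names `toChw`, `normalizeDet`, and `normalizeRecCls` are hypothetical, and the input is assumed to be packed ARGB ints of the kind `Bitmap.getPixels()` returns. The R, G, B plane order is also an assumption; the reference pipeline loads images through cv2, so the channel order should be checked against OnnxOCR before relying on it.

```kotlin
// Constants from the parameter list; channel order (R, G, B) is an assumption.
val DET_MEAN = floatArrayOf(0.485f, 0.456f, 0.406f)
val DET_STD = floatArrayOf(0.229f, 0.224f, 0.225f)

// Detection: scale to [0, 1], then subtract the per-channel mean and divide by std.
fun normalizeDet(value: Int, channel: Int): Float =
    (value / 255f - DET_MEAN[channel]) / DET_STD[channel]

// Recognition / classification: (pixel/255 - 0.5) / 0.5, which maps 0..255 to -1..1.
fun normalizeRecCls(value: Int): Float = (value / 255f - 0.5f) / 0.5f

// Hypothetical helper: converts packed ARGB pixels (row-major) to a CHW float array.
fun toChw(
    pixels: IntArray,
    width: Int,
    height: Int,
    normalize: (value: Int, channel: Int) -> Float
): FloatArray {
    require(pixels.size == width * height) { "pixel count must equal width * height" }
    val plane = width * height
    val out = FloatArray(3 * plane)
    for (i in 0 until plane) {
        val p = pixels[i]
        out[i] = normalize((p shr 16) and 0xFF, 0)         // first plane: red
        out[plane + i] = normalize((p shr 8) and 0xFF, 1)  // second plane: green
        out[2 * plane + i] = normalize(p and 0xFF, 2)      // third plane: blue
    }
    return out
}
```

A detection input would then be built with `toChw(pixels, w, h, ::normalizeDet)`, and a recognition or classification input with `toChw(pixels, w, h) { v, _ -> normalizeRecCls(v) }`.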
- Always call bitmap.recycle() after use
- Close ONNX tensors explicitly
- Keep heavy processing in native layer
- Use Kotlin coroutines for async work
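
One way to make the recycle and close rules hard to forget is to scope each resource to a block. The sketch below uses stand-in types so it runs without Android or ONNX Runtime: `Recyclable`, `useAndRecycle`, `TrackedResource`, and `runAndClose` are all hypothetical names. In the plugin itself, the same pattern would presumably be applied to `Bitmap` (which has `recycle()` but is not `AutoCloseable`) and to ONNX Runtime tensors (which are `AutoCloseable`).

```kotlin
// Stand-in for android.graphics.Bitmap, which exposes recycle() but is not AutoCloseable.
interface Recyclable {
    fun recycle()
}

// Hypothetical extension: runs the block, then recycles even if the block throws.
inline fun <B : Recyclable, T> B.useAndRecycle(block: (B) -> T): T {
    try {
        return block(this)
    } finally {
        recycle()
    }
}

// Stand-in for a native-backed AutoCloseable such as an ONNX Runtime tensor.
class TrackedResource : AutoCloseable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

// Kotlin's standard use() closes the resource when the block exits, normally or by exception.
fun <T> runAndClose(resource: AutoCloseable, block: () -> T): T =
    resource.use { block() }
```

Because cleanup sits in `finally` (or inside `use`), an exception thrown during inference does not leak native memory.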
Android (android/build.gradle):
implementation("com.microsoft.onnxruntime:onnxruntime-android:1.22.0")
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-android:1.7.3")

Flutter (pubspec.yaml):
dependencies:
  flutter: {sdk: flutter}
  plugin_platform_interface: ^2.0.2
  path_provider: ^2.1.0
dev_dependencies:
  flutter_test: {sdk: flutter}
  flutter_lints: ^5.0.0

- ✅ Android (API 24+, ONNX Runtime + PaddleOCR)
- ✅ iOS (Vision framework)
- documentation/ONNX_OCR_PLUGIN_CONTEXT.md: Complete context, testing workflow
- documentation/OnnxOCR_Implementation_Guide.md: Model specs, algorithms
- Original: OnnxOCR
- Models: PaddleOCR v5