In this article we are going to build an application that recognizes text in images and prints it on the screen. To do this we will use the camera plugin to open a video feed directly from the device's camera, and the google_mlkit_text_recognition package to scan the camera frames for text.
Create a new Flutter application:
flutter create --platforms android,ios flutter_text_recognition
Modify the pubspec.yaml file so that it uses the aforementioned plugins as well as permission_handler, which will be used to request permission to use the camera:
name: flutter_text_recognition
description: A new Flutter project.
publish_to: 'none'
version: 1.0.0+1
environment:
sdk: '>=2.18.2 <3.0.0'
dependencies:
camera: ^0.10.0+4
flutter:
sdk: flutter
google_mlkit_text_recognition: ^0.4.0
permission_handler: ^10.2.0
dev_dependencies:
flutter_test:
sdk: flutter
flutter_lints: ^2.0.0
flutter:
uses-material-design: true
If you try to run the app now you may get an error, because the google_mlkit_text_recognition package needs you to target a higher Android SDK. If that is the case, modify the android block of build.gradle (android/app/build.gradle) with the following values:
// Modify only the android block as follows:
android {
compileSdkVersion 33
ndkVersion flutter.ndkVersion
compileOptions {
sourceCompatibility JavaVersion.VERSION_1_8
targetCompatibility JavaVersion.VERSION_1_8
}
kotlinOptions {
jvmTarget = '1.8'
}
sourceSets {
main.java.srcDirs += 'src/main/kotlin'
}
defaultConfig {
// TODO: Specify your own unique Application ID (https://developer.android.com/studio/build/application-id.html).
applicationId "com.example.flutter_text_recognition"
// You can update the following values to match your application needs.
// For more information, see: https://docs.flutter.dev/deployment/android#reviewing-the-build-configuration.
minSdkVersion 22
targetSdkVersion 32
versionCode flutterVersionCode.toInteger()
versionName flutterVersionName
}
buildTypes {
release {
// TODO: Add your own signing config for the release build.
// Signing with the debug keys for now, so `flutter run --release` works.
signingConfig signingConfigs.debug
}
}
}
Notice that I have set targetSdkVersion to 32. The reason is that, at the time of writing, there is a bug that causes problems with version 33.
Camera permission
The first step will be to get permission to open the camera. To do this we must declare said permission, first in the AndroidManifest.xml file (android/app/src/main/AndroidManifest.xml):
<uses-permission android:name="android.permission.CAMERA" />
And also in Info.plist
(ios/Runner/Info.plist):
<key>NSCameraUsageDescription</key>
<string>This app needs access to your camera in order to scan text.</string>
Additionally, if we look at the permission_handler setup instructions for iOS, we must also define the permissions that we are going to use in Podfile (ios/Podfile):
# Modify the last part of this file to look like this:
post_install do |installer|
installer.pods_project.targets.each do |target|
flutter_additional_ios_build_settings(target)
# Add the following block:
target.build_configurations.each do |config|
config.build_settings['IPHONEOS_DEPLOYMENT_TARGET'] = $iOSVersion
config.build_settings['GCC_PREPROCESSOR_DEFINITIONS'] ||= [
'$(inherited)',
## dart: PermissionGroup.camera
'PERMISSION_CAMERA=1',
]
end
end
end
Now modify the lib/main.dart file with the following code:
import 'package:flutter/material.dart';
import 'package:permission_handler/permission_handler.dart';
void main() {
runApp(const App());
}
class App extends StatelessWidget {
const App({super.key});
@override
Widget build(BuildContext context) {
return MaterialApp(
title: 'Flutter Text Recognition',
theme: ThemeData(
primarySwatch: Colors.blue,
),
home: const MainScreen(),
);
}
}
class MainScreen extends StatefulWidget {
const MainScreen({super.key});
@override
State<MainScreen> createState() => _MainScreenState();
}
class _MainScreenState extends State<MainScreen> {
bool _isPermissionGranted = false;
late final Future<void> _future;
@override
void initState() {
super.initState();
_future = _requestCameraPermission();
}
@override
Widget build(BuildContext context) {
return FutureBuilder(
future: _future,
builder: (context, snapshot) {
return Scaffold(
appBar: AppBar(
title: const Text('Text Recognition Sample'),
),
body: Center(
child: Container(
padding: const EdgeInsets.only(left: 24.0, right: 24.0),
child: Text(
_isPermissionGranted
? 'Camera permission granted'
: 'Camera permission denied',
textAlign: TextAlign.center,
),
),
),
);
},
);
}
Future<void> _requestCameraPermission() async {
final status = await Permission.camera.request();
_isPermissionGranted = status == PermissionStatus.granted;
}
}
In this code block we define a Future that is executed by a FutureBuilder. This Future calls the _requestCameraPermission() method, which is responsible for requesting the camera permission through Permission.camera.request() and setting the _isPermissionGranted state variable depending on whether the permission has been granted or not.
The FutureBuilder widget is especially useful here, since requesting a permission is an asynchronous operation that we want to perform once, at the beginning of the app's execution. It is important that the Future used by FutureBuilder is declared as a field of the class: this way, if the widget is rebuilt, the Future keeps its previous result and is not executed again.
This is a very simplified way of asking for a permission; you can see how I ask for permissions and handle all the possible scenarios in this article.
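As a sketch of what a slightly fuller version could look like, the method below also detects a permanent denial and sends the user to the system settings. It uses the isPermanentlyDenied and isGranted getters and the openAppSettings() function from permission_handler; adapt it to your own flow:

```dart
// Sketch only: a fuller permission request that also handles the case where
// the permission dialog can no longer be shown.
Future<void> _requestCameraPermission() async {
  final status = await Permission.camera.request();

  if (status.isPermanentlyDenied) {
    // The user denied the permission permanently ("Don't ask again" on
    // Android); the only option left is the app's settings page.
    await openAppSettings();
    return;
  }

  _isPermissionGranted = status.isGranted;
}
```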
Show the camera preview
Now we are going to take the camera feed and render it on the screen so that the user can point to the text they want to scan.
One option would be to show the preview inside the Scaffold once we have the camera permission, but because of the camera's aspect ratio it might not look very good. For this reason, I prefer to show the camera preview behind the Scaffold, in the following way:
import 'package:camera/camera.dart';
import 'package:flutter/material.dart';
import 'package:permission_handler/permission_handler.dart';
void main() {
runApp(const App());
}
class App extends StatelessWidget {
const App({super.key});
@override
Widget build(BuildContext context) {
return MaterialApp(
title: 'Flutter Text Recognition',
theme: ThemeData(
primarySwatch: Colors.blue,
),
home: const MainScreen(),
);
}
}
class MainScreen extends StatefulWidget {
const MainScreen({super.key});
@override
State<MainScreen> createState() => _MainScreenState();
}
// Add the WidgetsBindingObserver mixin
class _MainScreenState extends State<MainScreen> with WidgetsBindingObserver {
bool _isPermissionGranted = false;
late final Future<void> _future;
// Add this controller to be able to control the camera
CameraController? _cameraController;
@override
void initState() {
super.initState();
WidgetsBinding.instance.addObserver(this);
_future = _requestCameraPermission();
}
// We should stop the camera once this widget is disposed
@override
void dispose() {
WidgetsBinding.instance.removeObserver(this);
_stopCamera();
super.dispose();
}
// Starts and stops the camera according to the lifecycle of the app
@override
void didChangeAppLifecycleState(AppLifecycleState state) {
if (_cameraController == null || !_cameraController!.value.isInitialized) {
return;
}
if (state == AppLifecycleState.inactive) {
_stopCamera();
} else if (state == AppLifecycleState.resumed) {
// The early return above already guarantees a non-null, initialized controller.
_startCamera();
}
}
@override
Widget build(BuildContext context) {
return FutureBuilder(
future: _future,
builder: (context, snapshot) {
return Stack(
children: [
// Show the camera feed behind everything
if (_isPermissionGranted)
FutureBuilder<List<CameraDescription>>(
future: availableCameras(),
builder: (context, snapshot) {
if (snapshot.hasData) {
_initCameraController(snapshot.data!);
return Center(child: CameraPreview(_cameraController!));
} else {
return const LinearProgressIndicator();
}
},
),
Scaffold(
appBar: AppBar(
title: const Text('Text Recognition Sample'),
),
// Set the background to transparent so you can see the camera preview
backgroundColor: _isPermissionGranted ? Colors.transparent : null,
body: _isPermissionGranted
? Column(
children: [
Expanded(
child: Container(),
),
Container(
padding: const EdgeInsets.only(bottom: 30.0),
child: Center(
child: ElevatedButton(
onPressed: null,
child: const Text('Scan text'),
),
),
),
],
)
: Center(
child: Container(
padding: const EdgeInsets.only(left: 24.0, right: 24.0),
child: const Text(
'Camera permission denied',
textAlign: TextAlign.center,
),
),
),
),
],
);
},
);
}
Future<void> _requestCameraPermission() async {
final status = await Permission.camera.request();
_isPermissionGranted = status == PermissionStatus.granted;
}
void _startCamera() {
if (_cameraController != null) {
_cameraSelected(_cameraController!.description);
}
}
void _stopCamera() {
if (_cameraController != null) {
_cameraController?.dispose();
}
}
void _initCameraController(List<CameraDescription> cameras) {
if (_cameraController != null) {
return;
}
// Select the first rear camera.
CameraDescription? camera;
for (var i = 0; i < cameras.length; i++) {
final CameraDescription current = cameras[i];
if (current.lensDirection == CameraLensDirection.back) {
camera = current;
break;
}
}
if (camera != null) {
_cameraSelected(camera);
}
}
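If you prefer, the selection loop in _initCameraController() can be condensed with firstWhere. A sketch, assuming the list is not empty (unlike the loop above, cameras.first would throw on an empty list, so keep the loop if you need to handle that case):

```dart
// Sketch: pick the first back-facing camera, falling back to the first
// camera available (e.g. on devices that only have a front camera).
final camera = cameras.firstWhere(
  (c) => c.lensDirection == CameraLensDirection.back,
  orElse: () => cameras.first,
);
_cameraSelected(camera);
```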
Future<void> _cameraSelected(CameraDescription camera) async {
_cameraController = CameraController(
camera,
ResolutionPreset.max,
enableAudio: false,
);
await _cameraController!.initialize();
if (!mounted) {
return;
}
setState(() {});
}
}
In this code block I have marked the changes that must be made with comments. I have also added several camera-management methods at the end of the class; they are explained in detail in the camera package documentation.
Text Recognition
To finish this tutorial we come to the best part: recognizing text and displaying it on the screen. First, we will create a new lib/result_screen.dart file with a widget that will take care of displaying the scanned text:
import 'package:flutter/material.dart';
class ResultScreen extends StatelessWidget {
final String text;
const ResultScreen({super.key, required this.text});
@override
Widget build(BuildContext context) => Scaffold(
appBar: AppBar(
title: const Text('Result'),
),
body: Container(
padding: const EdgeInsets.all(30.0),
child: Text(text),
),
);
}
Now, back in the lib/main.dart file, create a variable of type TextRecognizer at the top of the class:
final textRecognizer = TextRecognizer();
It is important to close this object in dispose():
@override
void dispose() {
_stopCamera();
textRecognizer.close();
super.dispose();
}
Add the following method at the bottom of the class (it needs dart:io, google_mlkit_text_recognition and result_screen.dart to be imported at the top of the file):
Future<void> _scanImage() async {
  if (_cameraController == null) return;

  final navigator = Navigator.of(context);
  // Capture the messenger here as well, so we don't use the BuildContext
  // across the asynchronous gaps below.
  final messenger = ScaffoldMessenger.of(context);

  try {
    final pictureFile = await _cameraController!.takePicture();

    final file = File(pictureFile.path);
    final inputImage = InputImage.fromFile(file);
    final recognizedText = await textRecognizer.processImage(inputImage);

    await navigator.push(
      MaterialPageRoute(
        builder: (BuildContext context) =>
            ResultScreen(text: recognizedText.text),
      ),
    );
  } catch (e) {
    messenger.showSnackBar(
      const SnackBar(
        content: Text('An error occurred when scanning text'),
      ),
    );
  }
}
And use it as a callback for the button we added earlier:
// [...]
child: ElevatedButton(
onPressed: _scanImage,
child: const Text('Scan text'),
),
// [...]
In the _scanImage() method we first take a picture with the camera, then create an InputImage object from the path of the picture taken, and finally run it through the scanner (the TextRecognizer object defined at the beginning of the class).
If all went well, we push the ResultScreen with the scanned text. If something failed, we show a SnackBar reporting the error.
Conclusion
And in this very simple way it is possible to create a text recognition system in Flutter. There are variations of what we have seen here, for example instead of taking a photo from the camera's live feed we could also upload a local image and run it through the scanner; or even get many frames continuously from the camera and display the text in real-time overlaid on top of the camera preview.
You can find the source code of this project here.
Happy coding :)