Noah Velasco

Flutter + A.I. Text-To-Speech: A Simple Guide

Please like and follow me on GitHub @noahvelasco!

GitHub Code

Overview

  1. Introduction
  2. Get ElevenLabs API Key
  3. Flutter Project Configuration
  4. Basic UI
  5. API Key Setup
  6. ElevenLabs API Code Call
  7. Full Code
  8. Possible Errors
  9. Conclusion

Introduction

Hey there, fellow developers! We're going to dive into the exciting realm of text-to-speech (TTS) integration in Flutter. In today's fast-paced world, multimedia experiences are key to engaging users, and TTS APIs have become our secret weapon. In this tutorial, I'll walk you through harnessing the ElevenLabs API to bring text-to-speech functionality to your Flutter applications, using a simple demo app.

Whether you're building an educational app, adding an accessibility feature, or simply enhancing your user experience, this guide will equip you with all the know-how to get started.


Get ElevenLabs API Key

First things first! Grab your API key from your ElevenLabs profile. Don't worry, the free tier gives you 10,000 characters a month once you sign up. After you're done with this tutorial you're gonna want to pay them - it's REALLY good. Anyways, save the key somewhere safe - we will need it later!
EL API Key Dialogue Box


Flutter Project Configuration

Create a new Flutter project and follow the steps below for your platform. Do not skip them - enabling these permissions and settings is necessary to make TTS possible!

Android

  • Enable multidex support in the android/app/build.gradle file
defaultConfig {
   ...
   multiDexEnabled true
}
  • Enable Internet Connection on Android in android/app/src/main/AndroidManifest.xml
<uses-permission android:name="android.permission.INTERNET"/>

and update the application tag

<application ... android:usesCleartextTraffic="true">

iOS

  • Enable internet connection on iOS in ios/Runner/Info.plist
<dict>
....
<key>NSAppTransportSecurity</key>
<dict>
    <key>NSAllowsArbitraryLoads</key>
    <true/>
</dict>
...
</dict>

Basic UI

Let's code up a simple text form field and a button. When pressed, the button will call the ElevenLabs API and play the input text through the speaker. First, let's set up the front end before any API calls -

import 'package:flutter/material.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'TTS Demo',
      home: MyHomePage(),
    );
  }
}

class MyHomePage extends StatefulWidget {
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  TextEditingController _textFieldController = TextEditingController();

  @override
  void dispose() {
    _textFieldController.dispose();
    super.dispose();
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('EL TTS Demo'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(16.0),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.stretch,
          children: <Widget>[
            TextField(
              controller: _textFieldController,
              decoration: const InputDecoration(
                labelText: 'Enter some text',
              ),
            ),
            const SizedBox(height: 16.0),
            ElevatedButton(
              onPressed: () {
                //Eleven Labs API Call Here
              },
              child: const Icon(Icons.volume_up),
            ),
          ],
        ),
      ),
    );
  }
}


Basic UI


API Key Setup

Let's use the flutter_dotenv package: we'll create a .env file, insert our API key into it, and list the .env file under the assets section of pubspec.yaml, as the package's instructions describe. Follow the steps below -

  • Add package to project

$ flutter pub add flutter_dotenv

Make the following changes

  • Create a .env file in the root directory
  • Add the ElevenLabs API key to the .env file (see the example after the code below)
  • Add the .env file to the pubspec.yaml assets section (also shown below)
  • Add the import to your code (as seen below)
  • Add the .env variable as a global (as seen below)
  • Update the main method (as seen below)
import 'package:flutter_dotenv/flutter_dotenv.dart';

String EL_API_KEY = dotenv.env['EL_API_KEY'] as String;

Future main() async {
  await dotenv.load(fileName: ".env");

  runApp(MyApp());
}
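
For reference, here's roughly what the .env file and the pubspec.yaml entry might look like. The EL_API_KEY name is just what this tutorial's code looks up - use whatever name you like, as long as it matches - and the value shown is a placeholder for your real key:

# .env (in the project root - add it to .gitignore so your key never gets committed)
EL_API_KEY=your-elevenlabs-api-key-here

# pubspec.yaml
flutter:
  assets:
    - .env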

dot env setup


ElevenLabs API Code Call

Now for the fun part! Since we are going to turn the text into speech using a REST API, we need a couple more packages. Follow the steps below -

  • Add the packages to the project

$ flutter pub add http
$ flutter pub add just_audio

  • Add the following imports (we import http with a prefix and dart:convert for json.encode, matching the full code below)
import 'dart:convert';
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http;
  • Create an AudioPlayer object that will be responsible for playing the audio
final player = AudioPlayer(); //audio player obj that will play audio
  • To play the audio, we need to borrow a helper class from the just_audio documentation that feeds raw bytes into the player. Place the following at the top level, outside of main() -
// Feed your own stream of bytes into the player
class MyCustomSource extends StreamAudioSource {
  final List<int> bytes;
  MyCustomSource(this.bytes);

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    start ??= 0;
    end ??= bytes.length;
    return StreamAudioResponse(
      sourceLength: bytes.length,
      contentLength: end - start,
      offset: start,
      stream: Stream.value(bytes.sublist(start, end)),
      contentType: 'audio/mpeg',
    );
  }
}
  • Now we can add the REST API function playTextToSpeech, which fetches the audio from ElevenLabs, inside the _MyHomePageState class. We pass in 'text'; ElevenLabs sends back audio bytes, which our MyCustomSource helper turns into a playable source.
  //For the Text To Speech
  Future<void> playTextToSpeech(String text) async {

    String voiceRachel =
        '21m00Tcm4TlvDq8ikWAM'; //Rachel voice - change if you know another Voice ID

    String url = 'https://api.elevenlabs.io/v1/text-to-speech/$voiceRachel';
    final response = await http.post(
      Uri.parse(url),
      headers: {
        'accept': 'audio/mpeg',
        'xi-api-key': EL_API_KEY,
        'Content-Type': 'application/json',
      },
      body: json.encode({
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {"stability": .15, "similarity_boost": .75}
      }),
    );

    if (response.statusCode == 200) {
      final bytes = response.bodyBytes; //get the bytes ElevenLabs sent back
      await player.setAudioSource(MyCustomSource(
          bytes)); //send the bytes to be read from the JustAudio library
      player.play(); //play the audio
    } else {
      // throw Exception('Failed to load audio');
      return;
    }
  } 
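
To wire this up, call playTextToSpeech from the button we stubbed out in the Basic UI section - the full listing below does exactly this:

            ElevatedButton(
              onPressed: () {
                playTextToSpeech(_textFieldController.text);
              },
              child: const Icon(Icons.volume_up),
            ),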


If you want to tweak the way the voice sounds, you can modify the voice (here we use the Rachel voice ID '21m00Tcm4TlvDq8ikWAM'), the stability, and the similarity boost. You can view the API docs to go more in depth.
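
If you'd like to try a different voice, ElevenLabs also exposes a voices endpoint you can hit with the same API key. Here's a minimal sketch, assuming the GET /v1/voices endpoint from the ElevenLabs API docs, that prints each voice's name and ID so you can swap one into voiceRachel (this helper isn't part of the final app below - drop it into _MyHomePageState if you want to try it):

  //List the voices available to your account (a sketch - check the API docs for the exact response shape)
  Future<void> listVoices() async {
    final response = await http.get(
      Uri.parse('https://api.elevenlabs.io/v1/voices'),
      headers: {
        'accept': 'application/json',
        'xi-api-key': EL_API_KEY,
      },
    );

    if (response.statusCode == 200) {
      final data = json.decode(response.body);
      for (final voice in data['voices']) {
        print('${voice['name']}: ${voice['voice_id']}');
      }
    }
  }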

To make this more user friendly, we can add a linear progress indicator so the user knows whether a request is in progress.

Full Code

import 'package:flutter/material.dart';
import 'dart:convert';

import 'package:flutter_dotenv/flutter_dotenv.dart';
import 'package:just_audio/just_audio.dart';
import 'package:http/http.dart' as http;

String EL_API_KEY = dotenv.env['EL_API_KEY'] as String;

Future main() async {
  await dotenv.load(fileName: ".env");

  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      title: 'TTS Demo',
      home: MyHomePage(),
    );
  }
}

class MyHomePage extends StatefulWidget {
  @override
  _MyHomePageState createState() => _MyHomePageState();
}

class _MyHomePageState extends State<MyHomePage> {
  TextEditingController _textFieldController = TextEditingController();
  final player = AudioPlayer(); //audio player obj that will play audio
  bool _isLoadingVoice = false; //for the progress indicator

  @override
  void dispose() {
    _textFieldController.dispose();
    player.dispose();
    super.dispose();
  }

  //For the Text To Speech
  Future<void> playTextToSpeech(String text) async {
    //display the loading icon while we wait for request
    setState(() {
      _isLoadingVoice = true; //progress indicator turn on now
    });

    String voiceRachel =
        '21m00Tcm4TlvDq8ikWAM'; //Rachel voice - change if you know another Voice ID

    String url = 'https://api.elevenlabs.io/v1/text-to-speech/$voiceRachel';
    final response = await http.post(
      Uri.parse(url),
      headers: {
        'accept': 'audio/mpeg',
        'xi-api-key': EL_API_KEY,
        'Content-Type': 'application/json',
      },
      body: json.encode({
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {"stability": .15, "similarity_boost": .75}
      }),
    );

    setState(() {
      _isLoadingVoice = false; //progress indicator turn off now
    });

    if (response.statusCode == 200) {
      final bytes = response.bodyBytes; //get the bytes ElevenLabs sent back
      await player.setAudioSource(MyCustomSource(
          bytes)); //send the bytes to be read from the JustAudio library
      player.play(); //play the audio
    } else {
      // throw Exception('Failed to load audio');
      return;
    }
  } //getResponse from Eleven Labs

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: const Text('EL TTS Demo'),
      ),
      body: Padding(
        padding: const EdgeInsets.all(16.0),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.stretch,
          children: <Widget>[
            TextField(
              controller: _textFieldController,
              decoration: const InputDecoration(
                labelText: 'Enter some text',
              ),
            ),
            const SizedBox(height: 16.0),
            ElevatedButton(
              onPressed: () {
                playTextToSpeech(_textFieldController.text);
              },
              child: _isLoadingVoice
                  ? const LinearProgressIndicator()
                  : const Icon(Icons.volume_up),
            ),
          ],
        ),
      ),
    );
  }
}

// Feed your own stream of bytes into the player
class MyCustomSource extends StreamAudioSource {
  final List<int> bytes;
  MyCustomSource(this.bytes);

  @override
  Future<StreamAudioResponse> request([int? start, int? end]) async {
    start ??= 0;
    end ??= bytes.length;
    return StreamAudioResponse(
      sourceLength: bytes.length,
      contentLength: end - start,
      offset: start,
      stream: Stream.value(bytes.sublist(start, end)),
      contentType: 'audio/mpeg',
    );
  }
}


Possible Errors

  • Most runtime errors come from skipping platform setup - double-check the 'Flutter Project Configuration' section above, and make sure the .env file is listed under assets in pubspec.yaml.
  • If the request itself fails, log the response instead of ignoring it so you can see what ElevenLabs returned - see the sketch below.
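
For example, here's a minimal sketch of what you might put in the else branch of playTextToSpeech instead of returning silently (debugPrint is already available through the flutter/material.dart import):

    } else {
      //Log what ElevenLabs sent back so the failure isn't silent
      //(an invalid API key or a malformed request body will show up here)
      debugPrint('TTS request failed: ${response.statusCode} ${response.body}');
    }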

Conclusion

Hey, you did it! You've now got the superpower to integrate text-to-speech into your Flutter apps like a pro. By adding this awesome feature, you're taking your users' experience to a whole new level, making your app more accessible and engaging. Don't forget to keep exploring the endless possibilities offered by your chosen API and have fun experimenting with different customization options.


Please like and follow me on GitHub @noahvelasco!

Top comments (2)

Prasanth
How can I read Tamil text and get a Tamil voice ID?

kalimaty
Please, can you explain all these 3 points?